Hi,
First time messing around with PHP internal code and mailing lists, so
sorry if I'm in the wrong place or doing something wrong in general.
I recently noticed some weird performance issues while doing bulk
inserts with prepared statements (single INSERT with a lot of VALUES)
and using RETURNING clause to get back IDs and other columns.
So I wrote a little benchmark to insert 8000 random rows (3 columns
each) into a table and spent some time tracking down why it's slow.
Suprisingly it seems that INSERT itself takes 100-200ms, but
fetch/fetchAll returning id and one of the columns takes 2-3 seconds.
I'm sending the benchmark script and postgres database schema used.
After digging around PHP source code (pulled master branch), the problem
seems to be in PDO calling param_hook with PDO_PARAM_EVT_FETCH_PRE and
again PDO_PARAM_EVT_FETCH_POST for each fetched row, which causes
param_hook to be executed for each row x each param twice. In my little
benchmark inserting 8000 rows with 3 columns and returning 2 columns for
each row that means param_hook is called 8000x3x8000x2 = 384 000 000
times! So I took a look at pgsql_stmt_param_hook in
ext/pdo_pgsql/pgsql_statement.c and it doesn't seem to do anything for
PDO_PARAM_EVT_FETCH_PRE or PDO_PARAM_EVT_FETCH_POST. So if my
understanding is correct, it's calling a function that does nothing
meaningful 384 000 000 times, and this number grows exponentially with
the number of rows and columns.
I'm using postgres, but looking at ext/pdo_mysql code this seems to also
be the case for mysql's drivers, didn't benchmark it though.
As a test I commented out the lines dispatching those events in
ext/pdo/pdo_stmt.c:
|diff --git a/ext/pdo/pdo_stmt.c b/ext/pdo/pdo_stmt.c||
||index 96f7574638..49703a7d68 100644||
||--- a/ext/pdo/pdo_stmt.c||
||+++ b/ext/pdo/pdo_stmt.c||
||@@ -592,9 +592,9 @@ static int do_fetch_common(pdo_stmt_t *stmt, enum
pdo_fetch_orientation ori, zen||
|| return 0;||
|| }||
||||
||- if (!dispatch_param_event(stmt, PDO_PARAM_EVT_FETCH_PRE)) {||
||- return 0;||
||- }||
||+ // if (!dispatch_param_event(stmt, PDO_PARAM_EVT_FETCH_PRE)) {||
||+ // return 0;||
||+ // }||
||||
|| if (!stmt->methods->fetcher(stmt, ori, offset)) {||
|| return 0;||
||@@ -605,9 +605,9 @@ static int do_fetch_common(pdo_stmt_t stmt, enum
pdo_fetch_orientation ori, zen||
|| return 0;||
|| }||
||||
||- if (!dispatch_param_event(stmt, PDO_PARAM_EVT_FETCH_POST)) {||
||- return 0;||
||- }||
||+ // if (!dispatch_param_event(stmt, PDO_PARAM_EVT_FETCH_POST)) {||
||+ // return 0;||
||+ // }||
||||
|| if (do_bind && stmt->bound_columns) {||
|| / update those bound column variables now */|
|
|
After this change, fetching takes ~5ms and nothing seems broken, but the
whole INSERT/RETURNING is now 10 times faster.
Am I understanding this right? Could this be solved by letting each
pdo_* driver set some kind of a flag (bitmask?) telling PDO which hooks
it wants called from dispatch_param_event? For example for pdo_pgsql
that flag would be |PDO_PARAM_EVT_FREE | PDO_PARAM_EVT_NORMALIZE |
PDO_PARAM_EVT_ALLOC | PDO_PARAM_EVT_EXEC_PRE|, as other case statements
seem empty.
Hi Dino,
[snip]
||+ // if (!dispatch_param_event(stmt, PDO_PARAM_EVT_FETCH_POST)) {||
||+ // return 0;||
||+ // }||
|| ||
|| if (do_bind && stmt->bound_columns) {||
|| /* update those bound column variables now */|After this change, fetching takes ~5ms and nothing seems broken, but the
whole INSERT/RETURNING is now 10 times faster.Am I understanding this right? Could this be solved by letting each
pdo_* driver set some kind of a flag (bitmask?) telling PDO which hooks
it wants called from dispatch_param_event? For example for pdo_pgsql
that flag would be |PDO_PARAM_EVT_FREE | PDO_PARAM_EVT_NORMALIZE |
PDO_PARAM_EVT_ALLOC | PDO_PARAM_EVT_EXEC_PRE|, as other case statements
seem empty.
Nice find, thanks.
I'll have a look into it, although I think anything we do in that area
would target 8.1.
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Just that you are aware the emails from both of you went to my spam folder.
Back to the matter, this is the correct list to email Dino, however could
you
open a pull request on GitHub for this change so that more eyes (and
especially eyes from people which know PDO) can have a look?
Please provide the relevant information again in the pull request and
possibly host your files on a gist.
This can still target PHP 8.0 as this seems to be akin to a bug this
might even be applicable to PHP 7.3/7.4. Anyways, good catch.
Best regards
George P. Banyard
Just that you are aware the emails from both of you went to my spam folder.
Thanks for the info. If anyone has any idea why we went to spam, please
let us know.
The reason I didn't create a pull request immediately is that I don't
have a concrete fix. The diff that I sent just shows where I commented
out the lines to confirm this was the problem.
PDO_PARAM_EVT_FETCH_POST seems to be used by pdo_firebird so I can't
just remove the hooks. Also, pdo_pgsql seems to be processing some kind
of INPUT_OUTPUT parameters and parsing postgres bools. This is in a code
path that is maybe still used while fetching:
|if (PDO_PARAM_TYPE(param->param_type) == PDO_PARAM_BOOL &&||
|| ((param->param_type & PDO_PARAM_INPUT_OUTPUT) !=
PDO_PARAM_INPUT_OUTPUT)) {||
|| const char *s = zend_is_true(¶m->parameter) ? "t" : "f";||
|| param->param_type = PDO_PARAM_STR;||
|| zval_ptr_dtor(¶m->parameter);||
|| ZVAL_STRINGL(¶m->parameter, s, 1);||
|| }|
I'm not familiar with what this does so I wanted someone more familiar
with PDO to see it first. I don't think I can create a useful pull
request that doesn't break anything.
Just that you are aware the emails from both of you went to my spam folder.
Back to the matter, this is the correct list to email Dino, however could
you
open a pull request on GitHub for this change so that more eyes (and
especially eyes from people which know PDO) can have a look?Please provide the relevant information again in the pull request and
possibly host your files on a gist.This can still target PHP 8.0 as this seems to be akin to a bug this
might even be applicable to PHP 7.3/7.4. Anyways, good catch.Best regards
George P. Banyard
Hi George,
Just that you are aware the emails from both of you went to my spam folder.
Thanks. Alas, big G doesn't like the fact I'm using my own email server,
despite all DKIM/SPF/whatnot being properly configured, according to them.
Back to the matter, this is the correct list to email Dino, however could
you
open a pull request on GitHub for this change so that more eyes (and
especially eyes from people which know PDO) can have a look?Please provide the relevant information again in the pull request and
possibly host your files on a gist.This can still target PHP 8.0 as this seems to be akin to a bug this
might even be applicable to PHP 7.3/7.4. Anyways, good catch.
You could be right. I've created a draft PR for review: Dino could you
please benchmark it and get back with results?
https://github.com/php/php-src/pull/6047
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hi George,
/cc release managers - pls don't hate me ;-)
This can still target PHP 8.0 as this seems to be akin to a bug this
might even be applicable to PHP 7.3/7.4. Anyways, good catch.You could be right. I've created a draft PR for review: Dino could yo> please benchmark it and get back with results?> >
https://github.com/php/php-src/pull/6047
The PR seems to fix the issue:
https://bugs.php.net/bug.php?id=80027
Dino's bench script takes 3s on vanilla PHP8 and 120ms with the patch,
so it's seems a fairly good win. The fetching part alone goes down from
2.9s to 3ms.
The way it's been fixed should be backwards and forwards compatible with
no real need to bump PDO_VERSION_API: external PDO driver extensions
wanting to use the param_evt_skip flags could simply set them via a
preprocessor macro when compiled for PHP8+.
For now I've optimised the pdo_pgsql/mysql/sqlite extensions, but I will
look into the other bundled ones too.
Would you think it's sensible to treat this as a bug fix and target
7.3+? Or is it better to do PHP8 only? Or?
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hi George,
/cc release managers - pls don't hate me ;-)
This can still target PHP 8.0 as this seems to be akin to a bug this
might even be applicable to PHP 7.3/7.4. Anyways, good catch.You could be right. I've created a draft PR for review: Dino could yo> please benchmark it and get back with results?> >
https://github.com/php/php-src/pull/6047
The PR seems to fix the issue:https://bugs.php.net/bug.php?id=80027
Dino's bench script takes 3s on vanilla PHP8 and 120ms with the patch,
so it's seems a fairly good win. The fetching part alone goes down from
2.9s to 3ms.The way it's been fixed should be backwards and forwards compatible with
no real need to bump PDO_VERSION_API: external PDO driver extensions
wanting to use the param_evt_skip flags could simply set them via a
preprocessor macro when compiled for PHP8+.For now I've optimised the pdo_pgsql/mysql/sqlite extensions, but I will
look into the other bundled ones too.Would you think it's sensible to treat this as a bug fix and target
7.3+? Or is it better to do PHP8 only? Or?
Might be better to do it for PHP 8 only, or maybe for PHP 7.4 as well.
I'm not strictly against doing it for PHP 7.3, though.
--
Christoph M. Becker
Hi George,
/cc release managers - pls don't hate me ;-)
This can still target PHP 8.0 as this seems to be akin to a bug this
might even be applicable to PHP 7.3/7.4. Anyways, good catch.You could be right. I've created a draft PR for review: Dino could yo>
please benchmark it and get back with results?> >
https://github.com/php/php-src/pull/6047
The PR seems to fix the issue:https://bugs.php.net/bug.php?id=80027
Dino's bench script takes 3s on vanilla PHP8 and 120ms with the patch,
so it's seems a fairly good win. The fetching part alone goes down from
2.9s to 3ms.The way it's been fixed should be backwards and forwards compatible with
no real need to bump PDO_VERSION_API: external PDO driver extensions
wanting to use the param_evt_skip flags could simply set them via a
preprocessor macro when compiled for PHP8+.For now I've optimised the pdo_pgsql/mysql/sqlite extensions, but I will
look into the other bundled ones too.Would you think it's sensible to treat this as a bug fix and target
7.3+? Or is it better to do PHP8 only? Or?Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hello Matteo
Let's have this patch merged into PHP 8.0 (master
branch as of today), I
have nothing against it.
About PHPs 7.3 and 7.4: if the performance improvements are this massive as
describe in this email, I'm :+1: to have it on those versions as well.
Kind regards,
The script I used to pinpoint this problem and test my temporary fix,
and later Matteo's PR is linked in the bug report I oppened. Should be
simple to run if anyone is interested in trying. I found this on
pdo_pgsql, but looking at the code it looks like the problem should
happen on other pdo_* drivers too, I can do more tests if it matters.
Hi George,
/cc release managers - pls don't hate me ;-)
This can still target PHP 8.0 as this seems to be akin to a bug this
might even be applicable to PHP 7.3/7.4. Anyways, good catch.
You could be right. I've created a draft PR for review: Dino could yo>
please benchmark it and get back with results?> >
https://github.com/php/php-src/pull/6047
The PR seems to fix the issue:https://bugs.php.net/bug.php?id=80027
Dino's bench script takes 3s on vanilla PHP8 and 120ms with the patch,
so it's seems a fairly good win. The fetching part alone goes down from
2.9s to 3ms.The way it's been fixed should be backwards and forwards compatible with
no real need to bump PDO_VERSION_API: external PDO driver extensions
wanting to use the param_evt_skip flags could simply set them via a
preprocessor macro when compiled for PHP8+.For now I've optimised the pdo_pgsql/mysql/sqlite extensions, but I will
look into the other bundled ones too.Would you think it's sensible to treat this as a bug fix and target
7.3+? Or is it better to do PHP8 only? Or?Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hello Matteo
Let's have this patch merged into PHP 8.0 (
master
branch as of today), I
have nothing against it.About PHPs 7.3 and 7.4: if the performance improvements are this massive as
describe in this email, I'm :+1: to have it on those versions as well.Kind regards,
Hi Gabriel,
Let's have this patch merged into PHP 8.0 (
master
branch as of today), I
have nothing against it.>
About PHPs 7.3 and 7.4: if the performance improvements are this massive as
describe in this email, I'm :+1: to have it on those versions as well.
The mileage may very: running the Doctrine test suite, for example,
shows no difference at all.
On the other side of the spectrum, a SELECT * FROM tbl WHERE id IN (?,
...) and 65535 parameters:
https://gist.github.com/mbeccati/6f2cd094c38b094cb4bfc4c0245a23ab
root@jameson /tmp # /tmp/php-7.4-std bench_pdo3.php
Got rows: 65535
[Timings] prep: 0.018, exec: 0.210, fetch: 35.812
root@jameson /tmp # /tmp/php-7.4-opt bench_pdo3.php
Got rows: 65535
[Timings] prep: 0.019, exec: 0.208, fetch: 0.026
That's almost 36 seconds vs 26ms.
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hi Gabriel,
Let's have this patch merged into PHP 8.0 (
master
branch as of today), I
have nothing against it.>
About PHPs 7.3 and 7.4: if the performance improvements are this massive as
describe in this email, I'm :+1: to have it on those versions as well.The mileage may very: running the Doctrine test suite, for example,
shows no difference at all.On the other side of the spectrum, a SELECT * FROM tbl WHERE id IN (?,
....) and 65535 parameters:https://gist.github.com/mbeccati/6f2cd094c38b094cb4bfc4c0245a23ab
root@jameson /tmp # /tmp/php-7.4-std bench_pdo3.php
Got rows: 65535
[Timings] prep: 0.018, exec: 0.210, fetch: 35.812root@jameson /tmp # /tmp/php-7.4-opt bench_pdo3.php
Got rows: 65535
[Timings] prep: 0.019, exec: 0.208, fetch: 0.026That's almost 36 seconds vs 26ms.
Can we be certain that the relevant bits of the formerly _reserved_flags
are zero-filled for all existing drivers?
--
Christoph M. Becker
Hi Christoph,
Can we be certain that the relevant bits of the formerly _reserved_flags
are zero-filled for all existing drivers?
We can, that's basically the premise of the PR itself:
- The flags are in the pdo_dbh_t struct which is the dbh->inner part
- dbh->inner = ecalloc(1, sizeof(pdo_dbh_t)):
- stmt->dbh is set as dbh->inner, in prepare and query
- drivers can eventually set some bits to 1.
A few quick links:
https://github.com/php/php-src/blob/658f6ff2dae42c4f2b5c6f8d6adc8c8f09b52b47/ext/pdo/php_pdo_driver.h#L514
https://github.com/php/php-src/blob/658f6ff2dae42c4f2b5c6f8d6adc8c8f09b52b47/ext/pdo/pdo_dbh.c#L562
https://github.com/php/php-src/blob/658f6ff2dae42c4f2b5c6f8d6adc8c8f09b52b47/ext/pdo/pdo_dbh.c#L1085
https://github.com/php/php-src/blob/658f6ff2dae42c4f2b5c6f8d6adc8c8f09b52b47/ext/pdo/pdo_dbh.c#L1481
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hi Christoph,
Can we be certain that the relevant bits of the formerly _reserved_flags
are zero-filled for all existing drivers?We can, that's basically the premise of the PR itself:
- The flags are in the pdo_dbh_t struct which is the dbh->inner part
- dbh->inner = ecalloc(1, sizeof(pdo_dbh_t)):
- stmt->dbh is set as dbh->inner, in prepare and query
- drivers can eventually set some bits to 1.
A few quick links:
https://github.com/php/php-src/blob/658f6ff2dae42c4f2b5c6f8d6adc8c8f09b52b47/ext/pdo/php_pdo_driver.h#L514
https://github.com/php/php-src/blob/658f6ff2dae42c4f2b5c6f8d6adc8c8f09b52b47/ext/pdo/pdo_dbh.c#L562
https://github.com/php/php-src/blob/658f6ff2dae42c4f2b5c6f8d6adc8c8f09b52b47/ext/pdo/pdo_dbh.c#L1085
https://github.com/php/php-src/blob/658f6ff2dae42c4f2b5c6f8d6adc8c8f09b52b47/ext/pdo/pdo_dbh.c#L1481
Thanks! So there only would be a problem, if a driver modifies
_reserved_flags (what they're not supposed to do). Looks to be okay
then for the stable release branches.
--
Christoph M. Becker