Hi Internals,
There are bugs #79276 https://bugs.php.net/bug.php?id=79276 and #80340
https://bugs.php.net/bug.php?id=80340 where PDO parses valid PostgreSQL
statements incorrectly because it uses MySQL grammar for all statements
regardless of the underlying platform.
I submitted pull request #6410 https://github.com/php/php-src/pull/6410 which
enables extensions to opt-in to MySQL-specific syntax (since it's not
standard). Currently only pdo_mysql
uses it.
As Christoph Becker suggested, I'm bringing it here for discussion.
Technically, it may be a BC break since there may exist PDO-based
extensions outside the PHP repository that rely on compatibility with the
MySQL syntax.
Are there known extensions like this? Would this change be acceptable in
one of the next minor releases?
Regards,
Sergei Morozov
This is an interesting solution to the problem, but I am unsure if it's the
best one. I didn't look into it in detail yet, but at first glance, it
seems to be a cheap hack to solve a bug. What about backtick character ` ?
What about NO_BACKSLASH_ESCAPES mode? Any question mark within
backtick-escaped column name would still cause a problem. In fact, when I
tested it with NO_BACKSLASH_ESCAPES the PR didn't fix the problem. I
got a different error, but the problem still exists.
This is a naive solution that fixes one tiny bug in a single driver with no
regards for tons of other issues. I am against adding such hacks into the
PHP source code.
What we need is a full SQL parser that is context-aware. This task should
be delegated to the PDO drivers. PDO can't do it reliably at the base
level. The problem is that PDO is never going to be able to reliably parse
the SQL even if we have a separate parser for each driver. PDO is not the
server and it doesn't know how the SQL will be parsed at the end of the
day. At its root, it seems to be an unfixable problem. Emulated prepares
have a lot of issues and this is only one of them. The solution is to use
native prepares whenever they are available on the server.
If you want you can see the effect your PR has on this test case:
$pdo = new \PDO("mysql:host=localhost;dbname=test;charset=utf8mb4", 'user',
'password', [
\PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION,
\PDO::ATTR_EMULATE_PREPARES => true
]);
$pdo->exec("SET sql_mode = NO_BACKSLASH_ESCAPES;");
$stmt = $pdo->prepare("SELECT * FROM foo WHERE text='' and 1=? and
text=''");
$stmt->execute([1]);
I'm sorry but this is a hard pass from me.
Hi,
it seems to be a cheap hack
To me, it seems to be within the existing architecture. I agree that the
existing architecture has its flaws.
What about backtick character ` ? What about NO_BACKSLASH_ESCAPES mode?
These are out of scope. I wasn't aware of the NO_BACKSLASH_ESCAPES but it
indeed seems to be impossible to support on the client side.
Emulated prepares have a lot of issues and this is only one of them.
This specific issue doesn't seem to be related to the emulation. As far as
I understand, PDO parses SQL as well in order to replace positional
placeholders and vise versa depending on the input and the target platform
capabilities.
If you want you can see the effect your PR has on this test case
I tried running the example on the stock PHP 8.0.3, and regardless of the
emulation mode and NO_BACKSLASH_ESCAPES mode I saw the same error:
SQLSTATE[HY093]: Invalid parameter number: number of bound variables does
not match number of tokens in
With my patch, I don't expect to see any changes in parsing MySQL queries
so I don't understand how it's relevant. Am I missing something?
This is an interesting solution to the problem, but I am unsure if it's
the best one. I didn't look into it in detail yet, but at first glance, it
seems to be a cheap hack to solve a bug. What about backtick character ` ?
What about NO_BACKSLASH_ESCAPES mode? Any question mark within
backtick-escaped column name would still cause a problem. In fact, when I
tested it with NO_BACKSLASH_ESCAPES the PR didn't fix the problem. I
got a different error, but the problem still exists.
This is a naive solution that fixes one tiny bug in a single driver with
no regards for tons of other issues. I am against adding such hacks into
the PHP source code.What we need is a full SQL parser that is context-aware. This task should
be delegated to the PDO drivers. PDO can't do it reliably at the base
level. The problem is that PDO is never going to be able to reliably parse
the SQL even if we have a separate parser for each driver. PDO is not the
server and it doesn't know how the SQL will be parsed at the end of the
day. At its root, it seems to be an unfixable problem. Emulated prepares
have a lot of issues and this is only one of them. The solution is to use
native prepares whenever they are available on the server.If you want you can see the effect your PR has on this test case:
$pdo = new \PDO("mysql:host=localhost;dbname=test;charset=utf8mb4",
'user', 'password', [
\PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION,
\PDO::ATTR_EMULATE_PREPARES => true
]);
$pdo->exec("SET sql_mode = NO_BACKSLASH_ESCAPES;");
$stmt = $pdo->prepare("SELECT * FROM foo WHERE text='' and 1=? and
text=''");
$stmt->execute([1]);I'm sorry but this is a hard pass from me.
--
Sergei Morozov
This specific issue doesn't seem to be related to the emulation. As far
as I understand, PDO parses SQL as well in order to replace positional
placeholders and vise versa depending on the input and the target platform
capabilities.
True, PDO has a huge flaw because of its SQL parser. In non-emulation mode
PDO will attempt to only replace named parameters with ?. This leaves a
much smaller vector for bugs. How many people put : in their SQL? With a
context-aware parser, we could minimize this to close to zero.
I tried running the example on the stock PHP 8.0.3, and regardless of the
emulation mode and NO_BACKSLASH_ESCAPES mode I saw the same error:
I don't think this is possible. I executed it with PHP 8.0.3 and it only
fails in emulation mode.
With my patch, I don't expect to see any changes in parsing MySQL queries
so I don't understand how it's relevant. Am I missing something?
Which is kind of my point. You are attempting to fix a small issue in
another driver while the MySQL parser is already broken with similar bugs.
If we are going to have diverging parsers then they should be entirely
split up. Keeping a single parser will lead to more bugs. If we try to
patch like this then it will soon get out of hand.
Kamil,
You are attempting to fix a small issue in another driver while the MySQL
parser is already broken with similar bugs.
Yes, this is what the original idea was given the existing bugs on
bugs.php.net while I wasn't aware of all the other MySQL parser problems
that you pointed out.
Which is kind of my point.
Point taken.
Matteo,
This is great to see this effort happening! As I understand, you also
implement an API for platform-specific parsers but via a much better
extensible API.
I'm curious if instead of keeping the MySQL syntax in the default parser
and creating one for PostgreSQL, it would make sense to create a MySQL
parser and remove the support for backslash escaping from the default one?
Perhaps we could work together and come up with an RFC?
Thanks for the offer but I don't think I'll be able to prioritize this
effort over my other projects. Based on the conversation above, reworking
the parser the right way requires a lot of investment and frankly speaking
it's not the most critical problem in the world to solve (all these bugs
are edge cases).
Regards,
Sergei Morozov
Sergei,
This is great to see this effort happening! As I understand, you also
implement an API for platform-specific parsers but via a much better
extensible API. >
I'm curious if instead of keeping the MySQL syntax in the default parser
and creating one for PostgreSQL, it would make sense to create a MySQL
parser and remove the support for backslash escaping from the default one?
That's a possibility too. I'll think about it.
Perhaps we could work together and come up with an RFC?
Thanks for the offer but I don't think I'll be able to prioritize this
effort over my other projects. Based on the conversation above, reworking
the parser the right way requires a lot of investment and frankly speaking
it's not the most critical problem in the world to solve (all these bugs
are edge cases).
Fair enough. Any other volunteer, I can bribe promising a phpday
elephpant in exchange ;-)
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hi Sergei,
in fact I have been working on a secret plan to allow each driver to
optionally have their own parser. Due to very limited time I haven't
finalised all the details, but the proof of concept looks promising:
https://github.com/php/php-src/pull/6852
Perhaps we could work together and come up with an RFC?
Cheers
Hi Internals,
There are bugs #79276 https://bugs.php.net/bug.php?id=79276 and #80340
https://bugs.php.net/bug.php?id=80340 where PDO parses valid PostgreSQL
statements incorrectly because it uses MySQL grammar for all statements
regardless of the underlying platform.I submitted pull request #6410 https://github.com/php/php-src/pull/6410 which
enables extensions to opt-in to MySQL-specific syntax (since it's not
standard). Currently onlypdo_mysql
uses it.As Christoph Becker suggested, I'm bringing it here for discussion.
Technically, it may be a BC break since there may exist PDO-based
extensions outside the PHP repository that rely on compatibility with the
MySQL syntax.Are there known extensions like this? Would this change be acceptable in
one of the next minor releases?Regards,
Sergei Morozov
--
Matteo Beccati
Development & Consulting - http://www.beccati.com/