Hello everybody,
I'd like to start a discussion on a new RFC about fixing the default PDO
SQL parser and having (optional) driver-specific parsers.
https://wiki.php.net/rfc/pdo_driver_specific_parsers
Thanks GPB and Tom de Wit for their help with initial proof-reading.
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hi Matteo,
2024/04/17 21:24、Matteo Beccati php@beccati.comのメール:
Hello everybody,
I'd like to start a discussion on a new RFC about fixing the default PDO SQL parser and having (optional) driver-specific parsers.
https://wiki.php.net/rfc/pdo_driver_specific_parsers
Thanks GPB and Tom de Wit for their help with initial proof-reading.
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
This is a very interesting topic for me. A few days ago I saw an issue that was largely related to this.
https://github.com/php/php-src/issues/13958
It seems possible to get around this problem using some hacks, but essentially I agree that it makes sense to have a parser for each driver.
Regards,
Saki
Hi Saki,
Il 17/04/2024 16:03, Saki Takamachi ha scritto:
This is a very interesting topic for me. A few days ago I saw an issue that was largely related to this.
https://github.com/php/php-src/issues/13958It seems possible to get around this problem using some hacks, but essentially I agree that it makes sense to have a parser for each driver.
Thanks for the feedback. I will reference this issue as duplicate too in
the RFC. I'm certain if we dig deep enough we'll find a few more.
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hi Matteo,
Thanks for the feedback. I will reference this issue as duplicate too in the RFC.
Thanks for the reference to the issue.
I'm certain if we dig deep enough we'll find a few more.
Agree. Maybe we can find something other than PostgreSQL.
I have read through your RFC. If we change the default scanner from the current, is there a possibility that an unintended BC Break will occur? I don't think there is a problem with MySQL, but I'm a little worried about other drivers.
Regards,
Saki
Hi Saki,
Il 17/04/2024 16:30, Saki Takamachi ha scritto:
Hi Matteo,
Thanks for the feedback. I will reference this issue as duplicate too in the RFC.
Thanks for the reference to the issue.
I'm certain if we dig deep enough we'll find a few more.
Agree. Maybe we can find something other than PostgreSQL.
I have read through your RFC. If we change the default scanner from the current, is there a possibility that an unintended BC Break will occur? I don't think there is a problem with MySQL, but I'm a little worried about other drivers.
I did a quick research and both Oracle and SQL Server seem only to
understand double single quotes.
I agree we need more research, but it's already 4 database drivers we'd
be fixing by switching the default parser to standard SQL quotes.
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hi Saki,
Il 17/04/2024 16:30, Saki Takamachi ha scritto:
Hi Matteo,
Thanks for the feedback. I will reference this issue as duplicate too in the RFC.
Thanks for the reference to the issue.
I'm certain if we dig deep enough we'll find a few more.
Agree. Maybe we can find something other than PostgreSQL.
I have read through your RFC. If we change the default scanner from the current, is there a possibility that an unintended BC Break will occur? I don't think there is a problem with MySQL, but I'm a little worried about other drivers.
I did a quick research and both Oracle and SQL Server seem only to
understand double single quotes.I agree we need more research, but it's already 4 database drivers we'd
be fixing by switching the default parser to standard SQL quotes.
This all seems logical, but having separate parsers would mean that the SQL strings are no longer portable, yes? Eg, many frameworks and CMSes try to (claim to) support multiple DBs transparently. (MySQL and Postgres and SQLite, usually). Some even recommend using SQLite for testing, but MySQL for prod. This change would break that, wouldn't it? Because the escaping would necessarily be different for MySQL and SQLite, and thus the queries would break on one or the other?
--Larry Garfield
Hey Larry,
Il 17/04/2024 16:51, Larry Garfield ha scritto:
This all seems logical, but having separate parsers would mean that the SQL strings are no longer portable, yes? Eg, many frameworks and CMSes try to (claim to) support multiple DBs transparently. (MySQL and Postgres and SQLite, usually). Some even recommend using SQLite for testing, but MySQL for prod. This change would break that, wouldn't it? Because the escaping would necessarily be different for MySQL and SQLite, and thus the queries would break on one or the other?
Nope. If you hardcode strings in your SQL, then it's your responsibility
to write them with the correct syntax.
For example a SELECT "foo"
will work on MySQL, but not on Postgres
already, and this RFC won't change that.
Likewise, when using single quotes, SELECT '\\'
will get you a single
backslash on MySQL right now, but two backslashes on Postgres,
regardless of this RFC.
The only proper way to safely hardcode literals is to use the
PDO::quote
method, which will take care of all the required escaping
(and charset stuff), according to the connected database. But then
again, most likely using parameters would be best in many circumstances.
As for recommending testing on SQLite when production is on MySQL, I've
always found that to be a (huge) foot gun. Of course YMMV ;-)
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
I think the question here was more about what the syntax will be after the
parameters are substituted. But if I recall correctly, the quoting is done
by PDO:: quote so the syntax will remain the same. Only the buggy behavior
would be fixed when it comes to recognizing parameters.
Hi,
I think the question here was more about what the syntax will be after the parameters are substituted. But if I recall correctly, the quoting is done by PDO:: quote so the syntax will remain the same. Only the buggy behavior would be fixed when it comes to recognizing parameters.
yes, that is unchanged and handled through the quoting functionality.
Cheers
Hi Kamil,
Il 17/04/2024 19:25, Kamil Tekiela ha scritto:
I think the question here was more about what the syntax will be after
the parameters are substituted. But if I recall correctly, the quoting
is done by PDO:: quote so the syntax will remain the same. Only the
buggy behavior would be fixed when it comes to recognizing parameters.
Exactly, the parser change is all about properly recognizing parameters.
Parameter substitution is taken care of by the driver.
With emulated prepares (default on pdo_mysql, alas) parameters are
inlined in the query after being properly "quoted".
Otherwise they will be passed according to what the database client
library needs (byte length + string, or any other format).
It's a good point and I'll make sure to add this to the RFC.
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hey Larry,
Il 17/04/2024 16:51, Larry Garfield ha scritto:
This all seems logical, but having separate parsers would mean that the SQL strings are no longer portable, yes? Eg, many frameworks and CMSes try to (claim to) support multiple DBs transparently. (MySQL and Postgres and SQLite, usually). Some even recommend using SQLite for testing, but MySQL for prod. This change would break that, wouldn't it? Because the escaping would necessarily be different for MySQL and SQLite, and thus the queries would break on one or the other?
Nope. If you hardcode strings in your SQL, then it's your responsibility
to write them with the correct syntax.For example a
SELECT "foo"
will work on MySQL, but not on Postgres
already, and this RFC won't change that.Likewise, when using single quotes,
SELECT '\\'
will get you a single
backslash on MySQL right now, but two backslashes on Postgres,
regardless of this RFC.The only proper way to safely hardcode literals is to use the
PDO::quote
method, which will take care of all the required escaping
(and charset stuff), according to the connected database. But then
again, most likely using parameters would be best in many circumstances.As for recommending testing on SQLite when production is on MySQL, I've
always found that to be a (huge) foot gun. Of course YMMV ;-)
I did not say I endorse the idea, just that I have seen it done. :-) And some applications purport to run on your choice of MySQL or Postgres, even if they tend to not test the latter very well.
In any case, good to know that this won't make the situation worse. It's probably a good idea to include some version of that in the RFC for clarity.
--Larry Garfield
Hi Saki, internals,
Il 17/04/2024 16:30, Saki Takamachi ha scritto:
Hi Matteo,
Thanks for the feedback. I will reference this issue as duplicate too in the RFC.
Thanks for the reference to the issue.
I'm certain if we dig deep enough we'll find a few more.
Agree. Maybe we can find something other than PostgreSQL.
I have read through your RFC. If we change the default scanner from the current, is there a possibility that an unintended BC Break will occur? I don't think there is a problem with MySQL, but I'm a little worried about other drivers.
I have updated the RFC quite extensively. I've added the results of the
research I've done for all the supported PDO database types and ended up
slightly modifying the proposal in a way I thought made more sense
according to the results.
I also have a draft implementation ready:
https://github.com/php/php-src/pull/14035
Tests are green. DBAL tests are green for sqlite, mysql and pgsql, which
is what I could test locally.
And as an exercise, an attempt to support PgSQL escape literals, e.g.
e'backslashes 'accepted' here'
https://github.com/mbeccati/php-src/pull/1/commits/79b59d958c43042e54348dfbae0b0c2509563aa7
which I declared as out of scope but could in fact be supported without
too much effort.
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hi internals!
I've updated once again the RFC and implemented most of the 3 major
dialects (mysql, pgsql, sqlite) in the drivers.
https://wiki.php.net/rfc/pdo_driver_specific_parsers
https://github.com/php/php-src/pull/14035
I've tried to keep syntax changes we might not want as separate commits
in the PR.
For example:
- the pdo_pgsql driver now also understands C-style escape strings and
dollar quoted strings. - pdo_sqlite supports Access-style [identifiers].
- pdo_mysql will consider "--" a comment only when followed by whitespace.
The latter has been a particular challenge for me and I've been able to
overcome it by using the re2c:eof feature, which I then discovered being
available only in a later version compared to our requirements (1.2.1,
released Aug 2019). As is, the Windows build fails on GH because the sdk
ships with 1.1.1.
Perhaps someone with better re2c knowledge can get it working with re2c
1.0.3+, or perhaps it's not really worth it.
Looking forward to hearing from you!
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hi,
Il 03/05/2024 11:14, Matteo Beccati ha scritto:
Hi internals!
I've updated once again the RFC and implemented most of the 3 major
dialects (mysql, pgsql, sqlite) in the drivers.https://wiki.php.net/rfc/pdo_driver_specific_parsers
https://github.com/php/php-src/pull/14035
I've tried to keep syntax changes we might not want as separate commits
in the PR.For example:
- the pdo_pgsql driver now also understands C-style escape strings and
dollar quoted strings.
- pdo_sqlite supports Access-style [identifiers].
- pdo_mysql will consider "--" a comment only when followed by
whitespace.The latter has been a particular challenge for me and I've been able to
overcome it by using the re2c:eof feature, which I then discovered being
available only in a later version compared to our requirements (1.2.1,
released Aug 2019). As is, the Windows build fails on GH because the sdk
ships with 1.1.1.Perhaps someone with better re2c knowledge can get it working with re2c
1.0.3+, or perhaps it's not really worth it.Looking forward to hearing from you!
I didn't see this sparking up discussion, so it either went unnoticed,
or everybody is fine with it (even the re2c version bump) ;-)
In any case I'm planning to start the vote in about 2 weeks from now.
Next week I'll be organising phpday in Verona, so feel free to approach
me if you are around and have any questions.
I might even try go get out of my comfort zone and do a lightning talk
on the topic. That would be a first in 20+ years of conference
organisation... we'll see!
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Il 03/05/2024 11:14, Matteo Beccati ha scritto:
I've updated once again the RFC and implemented most of the 3 major
dialects (mysql, pgsql, sqlite) in the drivers.https://wiki.php.net/rfc/pdo_driver_specific_parsers
https://github.com/php/php-src/pull/14035
I've tried to keep syntax changes we might not want as separate commits in
the PR.For example:
- the pdo_pgsql driver now also understands C-style escape strings and
dollar quoted strings.
- pdo_sqlite supports Access-style [identifiers].
- pdo_mysql will consider "--" a comment only when followed by whitespace.The latter has been a particular challenge for me and I've been able to
overcome it by using the re2c:eof feature, which I then discovered being
available only in a later version compared to our requirements (1.2.1,
released Aug 2019). As is, the Windows build fails on GH because the sdk
ships with 1.1.1.Perhaps someone with better re2c knowledge can get it working with re2c
1.0.3+, or perhaps it's not really worth it.Looking forward to hearing from you!
I didn't see this sparking up discussion, so it either went unnoticed, or
everybody is fine with it (even the re2c version bump) ;-)
I left a comment about that:
https://github.com/php/php-src/pull/14035#pullrequestreview-2047992559
In any case I'm planning to start the vote in about 2 weeks from now.
Next week I'll be organising phpday in Verona, so feel free to
approach me if you are around and have any questions.I might even try go get out of my comfort zone and do a lightning talk
on the topic. That would be a first in 20+ years of conference
organisation... we'll see!
That'd be handy. I'll see you there :-)
cheers,
Derick
Hi Internals,
Il 09/05/2024 15:51, Derick Rethans ha scritto:
In any case I'm planning to start the vote in about 2 weeks from now.
Next week I'll be organising phpday in Verona, so feel free to
approach me if you are around and have any questions.I might even try go get out of my comfort zone and do a lightning talk
on the topic. That would be a first in 20+ years of conference
organisation... we'll see!That'd be handy. I'll see you there :-)
Last minute family emergencies was plausible excuse to avoid jumping on
the stage for a lightning talk, but, sadly, I really had to leave the
conference earlier than expected.
Anyway, after some work I had figured out how to make the standard
version of re2c play nice with my own changes. I've explained in person
to Derick that the bump is no longer necessary, which hopefully will
make the RFC less controversial.
I've also addressed one minor BC-break for Postgres users dealing with
dollar-quoting and escaped question marks that came up on github.
If no other topic comes out I plan to start voting on Monday next week.
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
Hello everybody,
I'd like to start a discussion on a new RFC about fixing the default PDO SQL parser and having (optional) driver-specific parsers.
https://wiki.php.net/rfc/pdo_driver_specific_parsers
Thanks GPB and Tom de Wit for their help with initial proof-reading.
Cheers
Matteo Beccati
Development & Consulting - http://www.beccati.com/
FWIW, ODBC would annoyingly complicate things by not exposing a single
dialect, but it handles placeholders directly, so it hasn't needed to
parse queries. Using standard SQL quoting would probably work for most
drivers though if it comes to it.