Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:60845
Mailing-List: contact internals-help@lists.php.net; run by ezmlm
Received-SPF: pass (pb1.pair.com: domain oracle.com designates 148.87.113.117 as permitted sender)
Message-ID: <4FDB5604.5000704@oracle.com>
Date: Fri, 15 Jun 2012 17:34:28 +0200
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:12.0) Gecko/20120428 Thunderbird/12.0.1
MIME-Version: 1.0
To: internals@lists.php.net
References: <CAAyV7nFciZcsL+AW9W4GbhM+PhM3RhWOJ7G0iBQsEHfSWO2QmA@mail.gmail.com>
In-Reply-To: <CAAyV7nFciZcsL+AW9W4GbhM+PhM3RhWOJ7G0iBQsEHfSWO2QmA@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [PHP-DEV] [PATCH - PR] Disable ATTR_EMULATE_PREPARES by default
 for PDO_Mysql
From: ulf.wendel@oracle.com (Ulf Wendel)

Am 15.06.2012 03:01, schrieb Anthony Ferrara:
> I raised this topic on list over a year ago (
> http://marc.info/?l=php-internals&m=130417646507744&w=2 ). It was
> determined that it wasn't time yet to disable prepared statement
> emulation for MySQL yet. However, Rasmus did mention that it was a
> possibility for 5.4 (
> http://marc.info/?l=php-internals&m=130418875017027&w=2 ). Since that
> ship has sailed, I submitted a pull request for trunk to change the
> default value of prepared statement emulation for MySQL.
>
> https://github.com/php/php-src/pull/108
>
> https://bugs.php.net/bug.php?id=54638
>
> Does this need to be an RFC (should I draft one)? Or can it just be
> pulled as-is?

Please, be aware of the consequences of this move and don't break any 
tests.  Also, please, do no break it down to "prepared statement === 
security" as any such _short_ statement draws an incomplete picture and 
bares more risks than it does good.

Back then, Johannes already pointed out some consequences: 
http://marc.info/?l=php-internals&m=130423522001744&w=2 . Here's another 
iteration on the topic. There are assorted others in my blog from the 
days PDO_MySQL has seen its last major update.


*Security*

It is claimed that native prepared statements are more secure because:

  - statement and parameter are send to the server independently
  - the server builds the final statement string

Whereas for a non-prepared statement:

  - the client builds the final statement string
  - the client has to escape parameter values
  - the client sends statement and parameter in one

As long as client-side escaping is done properly, there is no practical 
difference between the two approaches. There are, however, arguments 
that boil down to limiting the possiblity of user errors:

  - bind-style API
  - messing up on charsets

A bind-style API may be convenient and prevent forgetting to use 
escaping but does not prevent to shoot yourself doing something foolish 
such as:

    PDO::query("..." . $_REQUEST['killme']);

When using non-prepared statements it must be ensured to use the correct 
charset for (client-side) escaping. As long as charset changes happen 
through API calls, the client library is always aware of the current 
charset and everything is fine. But, the charset can be changes through 
SQL as well:

   SET NAMES ...

The fact that PDO lacks API calls to set the charset, which enforces the 
use of SET NAMES, is not helpful. However, even with appropriate API 
calls users can always fool themselves using:

   PDO::query(SET NAMES)

If building the final statement is done on the server or SET NAMES is 
disallowed, fooling yourself becomes much harder.


*Security in the context of PDO*

Actually, the change of the default can give you a false feeling of 
"improved security". This is due to the design of PDO and the existance 
of a prepared statement emulation in the core of PDO.

A statement like this will use the prepared statement emulation:

  SELECT something FROM table WHERE cond = :placeholder

MySQL does not support named bind parameter such as ":placeholder". The 
PDO parser kicks in and does replace ":placeholder" with the bound 
value. Which then, is no different from:

  sprintf("SELECT something FROM table WHERE cond = '%s'", escape($param))

Of course, escape() can't be forgotten. It happens inside PHP, inside 
PDO on the C level. Messing up with charsets is still possible. Thus, 
users must be educated as ever since. The message "prepared statement 
=== security" is too short to tell the full story.

BTW, there's the classic of:

   SELECT something FROM table LIMIT :placeholder

What will happen if forget to hint a data type for :placeholer? Search 
the bug database, if you can't answer. As you are at it, try the reverse 
with ? and another database system that does not support ? as a 
placeholder...


*Performance (and impact on server load)*

After switching to native prepared statements users may see an increase 
of MySQL - PHP round trips:

   prepare() client <--> server
   execute() client <--> server

Great, if statements get executed multiple times. Bad, if statements are 
not executed multiple times, such as:

   SELECT title, received FROM news WHERE  =
   SELECT name, price, special_price FROM specials

Those statements will take two round trips, thus become slower. Things 
become slower although there is no other win. Note, that none of the 
example statements has any parameters.

A statement such as:

    INSERT INTO test(col) VALUES (?)

May become faster if multiple rows are inserted. The statement string is 
transferred only once during prepare. Later only the parameter value is 
transferred. But, of course, you'd use MySQL multi-insert syntax or 
other tricks here anyway, wouldn't you?

Back in the 70's, when prepared statements have been introduced, the 
world was a different one. Prepared statements make the assumption that 
its benefitial to "cache" various things, such as query execution plans, 
inside the database server. Furhtermore, its assumed that a statement is 
executed multiple times, making the caching a real benefit. MySQL is 
from some 30 years thereafter:
  - some parsing saved upon re-execution
  - statements kept open for too long take short server resources
    (server is always the hardest piece to scale-out!)

Keep in mind.


*Compatibility / Server support*

Recent versions of MySQL support preparing most statements, 
http://dev.mysql.com/doc/refman/5.6/en/c-api-prepared-statements.html . 
MySQL versions from the old days of when PDO was created did not.

Turning off the default emulation does not mean that statements will 
stop to work. PDO_MySQL will fall back - as ever since - to the 
emulation, if MySQL hints that it cannot prepare a statement. See above 
on "false security". Whatever the default is, it does not free users to 
understand the matter.


*Impact on data types*

Non-prepared and prepared statements use different flavours of the MySQL 
protocol. Non-prepared, used with the PDO emulation, use the "text 
protocol". Everything is returned as a string:

SELECT 1 AS _foo -> (your fetch code...) -> $_foo = "1"

Prepared statements use the "binary protocol". Native column types are 
preserved in many cases:

SELECT 1 AS _foo -> (your fetch code...) -> $_foo = 1

Regarding the amount of wire traffic, there is hardly a difference 
between the two flavours. The actual difference depends on the column 
types. Nothing to bother about unless your website justifies to have a 
dedicated MySQL development team to squeeze out the very last BIT....


Ulf