When serializing binary strings in PHP 6, we have to escape non-ASCII
characters and then unescape them on unserialization. This patch adds
the unescapement support to PHP 5.2, in order to make it easier to
exchange data between PHP 5 and 6. If no one has objections, I will
commit soon.
-Andrei
Before we commit, let's explore how it will affect existing
applications. The last thing I want to do is break the serialization
format BC in a minor release.
<php52_serialization.diff.txt>
Ilia Alshanetsky
We can put it in 5.3 perhaps.
-Andrei
Hello Andrei,
This will be a problematic BC break, because with the patch applied strings
suddenly behave quite differently. Until now our policy has been not to
introduce forward compatibility breaks.
best regards
marcus
(In reply to Andrei's message of Friday, December 1, 2006, 10:43:32 PM.)
As it stands the current code breaks BC on decoding when the
serialized string contains \ characters.
For example:
Input             PHP 5.2    PHP 5.2 w/patch
s:7:"foo\10b";    foo\10b    error (NULL returned)
s:7:"foo\bar";    foo\bar    error (NULL returned)
s:5:"\\";         \\\        error (NULL returned)
Basically, any operation involving strings with \ in them stops
working once the patch is applied.
There is also a performance drawback to consider: a rudimentary test
involving $_SERVER serialization shows that the new code is roughly
two-thirds slower (about 68%).
5.2: 0.388
5.2 w/patch: 0.652
Ilia
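For context, the existing 's' format stores the raw bytes behind a byte-length prefix, so a backslash has no special meaning and round-trips unchanged. A minimal sketch using only the standard serialize() and unserialize() functions; the sample strings are taken from the table above:

```php
<?php
// The 's' format is length-prefixed raw bytes: no escaping is applied.
$samples = array('foo\10b', 'foo\bar');

foreach ($samples as $original) {
    $encoded = serialize($original);   // e.g. s:7:"foo\bar";
    $decoded = unserialize($encoded);
    // A backslash is an ordinary byte here, so the value survives intact.
    echo $encoded, ' => ', var_export($decoded === $original, true), "\n";
}
```

Any change that starts treating \ as an escape character inside 's' strings alters the meaning of data like this that is already stored.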
I don't see a way we can make it work for all the cases. I guess we'll
have to leave this task to PHP_Compat.
-Andrei
From a user's point of view: you must be kidding, eh?
You want to break all the strings which were stored serialized in the
5.x series? I can understand that it will break in PHP 6, but not in some
minor release.
It will cause havoc with a lot of apps which can't read their cached data
or metadata anymore.
Also, the performance drawback will get you a lot of angry users.
Wasn't it just a few versions ago that serialization got a lot slower
and created lots of problems?
thomas
This is a forwards compatibility issue. If we don't care about versions
less than or equal to 5.2 being able to understand serialized data from
future versions, then there shouldn't be a problem with backwards
compatibility. I propose checking the start of the string to be
unserialized for the character v; if it is found, we can parse out a
version number up to the first comma (or some other delimiter). So,
for example, we have
echo serialize( array( 1, 2, 3, 4 ) );
-> a:4:{i:0;i:1;i:1;i:2;i:2;i:3;i:3;i:4;}
The new system could have:
-> v:5.21,a:4:{i:0;i:1;i:1;i:2;i:2;i:3;i:3;i:4;}
Then in newer versions of unserialize the version 5.21 can be
extracted and the appropriate unserialization can occur. If no version
info is found, then fall back to the original semantics.
Cheers,
Rob.
.------------------------------------------------------------.
| InterJinn Application Framework - http://www.interjinn.com |
:------------------------------------------------------------:
| An application and templating framework for PHP. Boasting |
| a powerful, scalable system for accessing system services |
| such as forms, properties, sessions, and caches. InterJinn |
| also provides an extremely flexible architecture for |
| creating re-usable components quickly and easily. |
`------------------------------------------------------------'
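The version-prefix idea above could look roughly like the following in userland. This is only a sketch: unserialize_versioned() is a hypothetical helper, not a PHP function, and the "v:<version>," syntax is the one proposed in the message, not an existing format.

```php
<?php
// Hypothetical helper, not part of PHP. Strips an optional "v:<version>,"
// prefix and returns array(version or null, unserialized value).
function unserialize_versioned($data)
{
    $version = null;
    if (preg_match('/^v:([0-9.]+),/', $data, $m)) {
        $version = $m[1];
        $data = substr($data, strlen($m[0]));
    }
    // A real implementation would branch on $version here; this sketch
    // always falls back to the existing semantics.
    return array($version, unserialize($data));
}

list($version, $value) = unserialize_versioned('v:5.21,a:4:{i:0;i:1;i:1;i:2;i:2;i:3;i:3;i:4;}');
var_dump($version); // string(4) "5.21"

list($version, $value) = unserialize_versioned(serialize(array(1, 2, 3, 4)));
var_dump($version); // NULL
```

Since no existing type tag starts with v, unprefixed data is still recognized and parsed as before.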
Excuse me, I don't quite understand what you're concerned about. The
break would occur if you use PHP 6 serialized content in PHP 5, not
vice versa. I already said that having the same escape format in PHP
5.2 is not tractable for BC reasons and the support, as such, must be
left to PHP_Compat.
-Andrei
Guys,
why can't you simply introduce a new variable type for serialized
strings like uppercase S:
There is absolutely no need for breaking backward compatibility. If you
want to introduce NEW features, then introduce them and do not abuse old ones.
Stefan
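A separate type tag keeps the two decodings apart, which can be sketched as a small dispatch on the tag. Everything here is illustrative: decode_string_token() is a hypothetical function, and the escape syntax used for 'S' (a backslash followed by two hex digits) is an assumption made for this sketch, since the thread does not spell it out.

```php
<?php
// Hypothetical decoder for a single serialized string token.
// Assumed 'S' escape syntax: backslash followed by two hex digits.
function decode_string_token($token)
{
    if (!preg_match('/^([sS]):(\d+):"(.*)";$/s', $token, $m)) {
        return false;
    }
    list(, $tag, $length, $body) = $m;
    if ($tag === 'S') {
        // New tag: undo the escaping before checking the length.
        $body = preg_replace_callback(
            '/\\\\([0-9a-fA-F]{2})/',
            function ($h) { return chr(hexdec($h[1])); },
            $body
        );
    }
    // Old tag 's': the body is taken verbatim, backslashes included.
    return strlen($body) === (int) $length ? $body : false;
}

var_dump(decode_string_token('s:7:"foo\bar";'));   // string(7) "foo\bar"
var_dump(decode_string_token('S:7:"foo\5cbar";')); // string(7) "foo\bar"
```

Existing 's' data never reaches the unescaping branch, so its meaning does not change.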
At 13:35 07/12/2006, Stefan Esser wrote:
Guys,
why can't you simply introduce a new variable type for serialized
strings like uppercase S:
There is absolutely no need for breaking backward compatibility.
I can't think of a single reason not to use a new type like you suggest.
Can anybody think of any drawbacks?
Zeev
Well, older versions of PHP will be totally incapable of parsing it,
creating problems for the many people who use serialized strings as a
means of passing data between PHP applications. This would be an
especially big problem if you wanted to add it to a pre-6.0 release.
Ilia Alshanetsky
Uhmm, you are missing the point. The point was that PHP 5.2.x should be
able to read data serialized with PHP 6. PHP 5.2.x only needs to be able
to read it. Generating it is not necessary.
At the moment it seems that data serialized in PHP 5.2.x is totally
incompatible with PHP 6.
Stefan
As 6.0 is not, to my knowledge, anywhere near release, what about this:
- Stick a version number on the front of PHP6-serialized strings. BC
  can't break inside the 6.x family if there's nothing to be
  compatible with.
- Don't change serialize() at all in PHP < 6.0.0. No version number, no
  extra escaping.
- Add some mechanism to tell PHP6 to generate a PHP5-compatible binary
  string when serializing. I recommend an optional flag (or flags field)
  to serialize().
- Understand the new version number in PHP 5.2.1+.
That would not touch the existing serialization format. It would keep
existing compatibility among the 4.x and 5.x family, and allow future
PHPs to unambiguously determine how to parse the string. With the
serialize() flag, you could explicitly generate old-style data from PHP6
if you really, really wanted to pass it back to PHP 5.2.0 or earlier.
--
Chad Daelhousen
"Television shepherds with living room sheep/ And I pray"
--Temple of the Dog, "Wooden Jesus"
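The optional-flag idea in the list above might be sketched like this. Both the SERIALIZE_LEGACY constant and the serialize_with_version() wrapper are hypothetical names, and the "v:6.0," prefix syntax is borrowed from the earlier proposal in this thread as an assumption; serialize() itself takes no such flag.

```php
<?php
// Hypothetical flag and wrapper; neither exists in PHP.
define('SERIALIZE_LEGACY', 1);

function serialize_with_version($value, $flags = 0)
{
    $payload = serialize($value);
    if ($flags & SERIALIZE_LEGACY) {
        // Old-style output: no version prefix, readable by older releases.
        return $payload;
    }
    // New-style output: version number up front (prefix syntax assumed).
    return 'v:6.0,' . $payload;
}

echo serialize_with_version(array(1, 2)), "\n";                   // v:6.0,a:2:{i:0;i:1;i:1;i:2;}
echo serialize_with_version(array(1, 2), SERIALIZE_LEGACY), "\n"; // a:2:{i:0;i:1;i:1;i:2;}
```

Making the legacy form opt-in keeps the default output unambiguous while still leaving a way to feed data back to releases that know nothing about the prefix.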
I think 'v' with version info makes sense if we foresee multiple
serialization format changes in the future. The steps mentioned below
do not seem particularly practical for that reason. I would rather go
with a new type 'S' for escaped binary strings in both 6.0 and 5.2.1.
-Andrei
This is the patch against 5.2.1 that supports both 's' and 'S' in
serialized strings with no BC breaks. If there are no objections, I
will commit soon.
-Andrei
Andrei, thanks for the patch.
When you initially posted a patch for serialization on this list, I
believe Ilia did some quick, basic benchmarks and saw that the
patched version had a moderate negative performance impact.
Do you believe that this performance impact is still there?
--Mike
<php5-unserialize.diff.txt>
No, because there is no change for 's' format which is all that PHP
5.2 produces. The 'S' format is produced only by PHP 6.
-Andrei
Somebody MIGHT have written their own unserializer/serializer which
made incorrect assumptions about the internal format, such as 's'
being the same as 'S'...
So perhaps, just to be consistent with the data that I think appears in
serialized form, 'u' for 'unicode' would be better than 'S'.
I confess I have no idea what is actually inside of serialized data,
and could be WAY off base, but I stretched far enough to come up with
something that could go wrong, because that's the kind of paranoid guy
I am. :-)
--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some starving artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?
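The kind of third-party parser worried about above might look like the following. This is purely a hypothetical illustration, not code from any known library: naive_type_of() folds the type tag to lower case before dispatching.

```php
<?php
// Hypothetical third-party parser, not PHP's own: it folds the type tag
// to lower case before dispatching, so 's' and 'S' take the same branch.
function naive_type_of($token)
{
    $tags = array('s' => 'string', 'i' => 'integer', 'a' => 'array');
    $tag = strtolower($token[0]);
    return isset($tags[$tag]) ? $tags[$tag] : 'unknown';
}

var_dump(naive_type_of('S:3:"abc";')); // string(6) "string"
var_dump(naive_type_of('u:3:"abc";')); // string(7) "unknown"
```

Such a parser would silently read an escaped 'S' string as a raw 's' string, whereas a previously unused letter such as 'u' would at least be rejected outright.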