Hi internals,
Currently we do not allow (*) creating empty property names on objects, i.e.
$obj->{''} = 42;
is illegal. While empty property names are unlikely to be useful per se,
they are problematic for deserialization of foreign formats like JSON. To
avoid this issue {"": null} will currently decode to a property named
"_empty_" rather than "". Notably, this means that JSON decode and encode
do not round-trip (as we do not convert _empty_ back to an empty name while
encoding).
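A minimal sketch of the round-trip check described above. It is not part of the patch, and the variable names are only illustrative. On versions that still substitute `_empty_` for the empty key, the comparison is expected to fail; on PHP 7.1 and later, where the substitution was removed, it should hold.

```php
<?php
// Decode an object whose only key is the empty string, then re-encode it.
$json = '{"":null}';
$obj  = json_decode($json);

// Property names as PHP sees them after decoding (via an array cast).
$names = array_keys((array) $obj);

// True only if decoding and re-encoding reproduces the input exactly.
$roundTrips = json_encode($obj) === $json;

var_dump($names, $roundTrips);
```

Comparing the encoded string with the input is a deliberately strict test: any renaming of the key during decoding shows up as a mismatch.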
There is no technical reason (that I can see) for keeping this arbitrary
restriction. I believe that the original reason for the restriction was our
use of NUL-prefixed property names for name mangling, combined with the
fact that an empty string at the C level happens to "start" with a NUL byte.
A patch to drop the restriction and allow empty property names:
https://github.com/php/php-src/pull/1836
It does not touch the JSON handling, as there are BC concerns involved
there; I leave that to the ext/json maintainer.
Any objections to changing this?
Regards,
Nikita
(*) There are roundabout ways to create them anyway.
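For readers who want to try the assignment from the message above, here is a small sketch. The try/catch is an assumption about how the failure surfaces: on PHP 7.0 the engine is expected to throw an `Error`, while older versions raise a fatal error that cannot be caught. With the restriction dropped (PHP 7.1 and later), the assignment should simply succeed.

```php
<?php
$obj = new stdClass();

try {
    // The write that the restriction forbids.
    $obj->{''} = 42;
    $result = $obj->{''};
} catch (Error $e) {
    // Reached only on versions that still enforce the restriction.
    $result = $e->getMessage();
}

var_dump($result);
```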
Currently, this succeeds:
$x = json_decode('{"_empty_": "no", "": "foo"}', true);
While this does not:
$x = json_decode('{"_empty_": "no", "": "foo"}');
https://3v4l.org/FJHVV
https://3v4l.org/15Sfm
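The difference between the two calls can be sketched by counting the entries that survive decoding. This assumes the first key is spelled `_empty_`, the substitute name json_decode uses for an empty key before the change. On such versions both keys would presumably land on the same property in object mode, so one value is lost; in array mode, and on PHP 7.1 and later in both modes, two entries should remain.

```php
<?php
$json = '{"_empty_": "no", "": "foo"}';

// Array mode keeps "" as an ordinary array key.
$asArray = json_decode($json, true);

// Object mode has to turn "" into a property name.
$asObject = json_decode($json);

var_dump(count($asArray));
var_dump(count((array) $asObject));
```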
I'm personally 50/50 on it. I think allowing an empty property is kind of
weird, but not the weirdest behavior PHP allows. Overall, it might
(ironically enough!) make working with JSON more consistent, and probably
have other benefits that I can't even imagine at the moment.
Scott Arciszewski
Chief Development Officer
Paragon Initiative Enterprises https://paragonie.com/
Note also that these won't work at the moment:
var_dump((object) ['' => 'foo']);
var_dump((object) ["\0*\0" => 'foo']);
var_dump((object) ["\0Foo\0" => 'foo']);
Seems to work consistently on HHVM though: https://3v4l.org/7MO7E
Marco Pivetta
Hi all,
Allowing null chars would be too much, IMHO. We reject null chars in path
parameters, and they should be rejected here in the same way.
PostgreSQL, for example, decided not to allow the null char as part of a
valid Unicode string in JSONB.
http://www.postgresql.org/docs/9.4/static/datatype-json.html
"The jsonb type also rejects \u0000 (because that cannot be
represented in PostgreSQL's text type)"
PostgreSQL has its own reason why the null char cannot be supported. PHP
may support it, but I'm 30/70 on supporting null chars in property
names because I cannot think of any use case other than attacking
applications. (BTW, null chars in data should stay supported as they are
now.)
Array keys support binary strings. If there is a vote on this change, I
would refrain from voting.
php > var_dump(["abc\0xyz"=>1234]);
php shell code:1:
array(1) {
'abc\0xyz' =>
int(1234)
}
var_dump($o->{'123Foo'}); works. ($o->123Foo is illegal)
Therefore, +1 for making $o->{''} work and removing the automatic '' ->
'_empty_' conversion.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi Yasuo,
The sequence "\0*\0" means "protected property", while the sequence
"\0Foo\0" means "private property of class Foo": that's been the case for a
looooong time :-)
Not suggesting allowing "\0" for property names: the example just shows
creating a public, private and protected property with an empty name.
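The mangling scheme mentioned above can be observed by casting an object to an array. This is only an illustrative sketch: the class `Foo` mirrors the name used in the message, and the properties `$a`, `$b`, `$c` are made up for the example.

```php
<?php
class Foo
{
    public $a = 1;
    protected $b = 2;
    private $c = 3;
}

// The array cast exposes the internal (mangled) property names.
$keys = array_keys((array) new Foo());

// Print the keys with NUL bytes made visible as a literal \0.
foreach ($keys as $key) {
    echo str_replace("\0", '\0', $key), "\n";
}
```

The public property keeps its plain name, the protected one is prefixed with `\0*\0`, and the private one with `\0Foo\0`.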
Marco Pivetta
Hi Marco,
Oh, was it? I've never used or encountered this. Thanks.
I'll avoid null chars as I use PostgreSQL JSONB extensively, though.
Could you show some real world example use cases?
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi Yasuo,
I've been using the "\0<scope>\0" approach in a few codegen-related
projects, specifically in ProxyManager (
https://github.com/Ocramius/ProxyManager/blob/master/docs/lazy-loading-ghost-object.md#lazy-initialization-properties-explained
) and GeneratedHydrator ( https://github.com/Ocramius/GeneratedHydrator ).
No user input values involved, though.
See
https://github.com/php/php-src/blob/d3ed75b9ebc998b8cf325e0e3ab954bd10989918/tests/classes/array_conversion_keys.phpt
for the tests inside php-src: the inverse cast is not covered by tests,
though, as there is really little to no use-case for an stdClass with
private or protected properties, unless the only purpose is casting it back
into an array.
Marco Pivetta
Hi all,
You mean PHP converts private/protected properties to this form, and it is
not currently used in JSON strings.
I understand your point!
Supporting null chars in names would be good for PHP users, then.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
As the discussion seems to be going off on a tangent, I'd like to clarify: I'm
only suggesting to remove the restriction on empty property names. The
restrictions we have on properties starting with NUL bytes stay intact, as
these do serve a purpose with regard to name mangling.
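To illustrate the restriction that stays in place, the sketch below tries to write a property whose name starts with a NUL byte. The property name `hidden` is made up, and catching `Error` assumes PHP 7 or later, where this failure is thrown as an exception rather than raised as a fatal error.

```php
<?php
$obj = new stdClass();

try {
    // A name starting with NUL looks like a mangled private/protected name.
    $obj->{"\0hidden"} = 1;
    $outcome = 'accepted';
} catch (Error $e) {
    $outcome = 'rejected';
}

var_dump($outcome);
```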
To add one further data point, dropping the empty property restriction also
fixes this bug about returning an array with an empty key from
__debugInfo(): https://bugs.php.net/bug.php?id=69537
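A sketch of the __debugInfo() case from that bug report, with a made-up class name `Demo`. With the restriction in place, dumping such an object is expected to fail as the report describes; once empty property names are allowed, the dump should show an entry with an empty key.

```php
<?php
class Demo
{
    // Returns debug data keyed by the empty string.
    public function __debugInfo()
    {
        return ['' => 'value'];
    }
}

// Capture the dump so it can be inspected as a string.
ob_start();
var_dump(new Demo());
$dump = ob_get_clean();

echo $dump;
```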
Nikita
Hi,
I have been thinking about it and I would be ok with removing _empty_ in
7.1. That inconsistency is quite annoying and I never liked it. If we can
get rid of it, that would be great! I realise the BC concern but I see that
more as a limitation and inconsistency between array and object decoding.
It's more an edge case anyway so I don't think there should be a big
concern. I think that it makes sense to hear if anyone would be against it
in json but I think that the limitation could go away already as it has
been discussed more than a month ago and I don't see any real objections.
@Nikita: if you merge it, I will be happy to create a PR for json...
Cheers
Jakub
Sorry for the delay, I've merged the patch now:
https://github.com/php/php-src/commit/674297c7e41013c2c34d770051714518d0586271
Nikita
Thanks. I have just created a PR for json:
https://github.com/php/php-src/pull/1926
I will wait a week or two and if there are no objections, I will merge it
to master.
Cheers
Jakub