Hi everyone,
Two weeks have passed since this RFC was put to discussion here, and no
significant issues have cropped up. Therefore, I'm going to put it to a
vote for inclusion in PHP 7.2.
Voting starts today, 2016-11-05, and ends the Monday after next, 2016-11-14.
The RFC and voting widget can be found here:
https://wiki.php.net/rfc/convert_numeric_keys_in_object_array_casts
It's a normal 2/3 majority required vote.
Please read through the RFC, and vote if you so desire. Thanks!
Andrea Faulds
https://ajf.me/
Hi Andrea,
> Two weeks have passed since this RFC was put to discussion here, and no
> significant issues have cropped up. Therefore, I'm going to put it to a vote
> for inclusion in PHP 7.2.
> Voting starts today, 2016-11-05, and ends the Monday after next, 2016-11-14.
> The RFC and voting widget can be found here:
> https://wiki.php.net/rfc/convert_numeric_keys_in_object_array_casts
> It's a normal 2/3 majority required vote.
In short, an integer array index is converted to a numeric string property
name, and vice versa. Correct?
At first I thought this was a good idea, but it seems we would be better off
allowing "string integer" array key access (array_get/set_var(),
perhaps) and changing other related features accordingly.
Currently, an inaccessible value can also occur in an array because of the
"int-like string to int" conversion.
Line 9:
var_dump($tmp, $tmp[0], $tmp['0']);
outputs
array(4) {
[0]=>
int(5)
[1]=>
int(6)
[2]=>
int(7)
["0"]=>
string(3) "zzz" // <== String '0' indexed element is Inaccessible
}
int(5)
int(5) // <== Only long index 0 can be accessed
Both the current behaviour and the behaviour after the RFC have pros and cons.
For instance, the proposed change will require string casts when iterating
over numeric properties, correct?
for ($i = 0; $i < 100; $i++) {
    $str_i = (string)$i; // <== "int" key to "string" key conversion
                         //     requires a cast.
    echo $obj->{$str_i}; // This kind of expression with an int or int-like
                         // string index is not allowed now, but it
                         // could be valid.
                         // https://3v4l.org/e5L1T
}
Another con after the RFC is a BC break:
$arr[0] = 123;
$obj = (object)$arr;
$obj->{0} becomes $obj->{'0'}
Simply allowing access to both "numeric string index and long index" for
arrays and objects seems a cleaner resolution to the problems we have.
(Objects already allow distinct access to the 0 and '0' indexes, so make
arrays allow access to both 0 and '0' indexed elements.)
e.g.
// There is no way to get string '0' index element now, so add
// array_get_var();
$obj->{0} === $arr[0];
$obj->{'0'} === array_get_var($arr, '0'); // Get string '0' indexed value
// Int and string index can have distinct value
$arr[0] !== $arr['0'];
$obj->{'0'} !== $arr[0];
$obj->{0} !== array_get_var($arr, '0');
$obj->{0} !== $obj->{'0'};
// Currently, we have no way to set a string '0' indexed element except by
// converting an object with a string '0' index (e.g. $obj->{'0'} = 123;)
// to an array. So implement array_set_var() to allow an array to have
// a string '0' indexed element.
array_set_var($arr, '0', 123); // Set string '0' indexed value 123
$obj = (object)$arr;
$obj->{'0'} === array_get_var($arr, '0');
Making arrays accessible by "string int index" and reorganizing other features
accordingly seems to result in a more consistent spec.
What do you think?
I found some interesting behaviors while testing. I don't think I have a good
understanding of this issue yet, so please point out anything I miss or
misunderstand.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi Yasuo,
Yasuo Ohgaki wrote:
> Hi Andrea,
>> Two weeks have passed since this RFC was put to discussion here, and no
>> significant issues have cropped up. Therefore, I'm going to put it to a vote
>> for inclusion in PHP 7.2.
>> Voting starts today, 2016-11-05, and ends the Monday after next, 2016-11-14.
>> The RFC and voting widget can be found here:
>> https://wiki.php.net/rfc/convert_numeric_keys_in_object_array_casts
>> It's a normal 2/3 majority required vote.
> In short, an integer array index is converted to a numeric string property
> name, and vice versa. Correct?
Yes, that describes it succinctly.
> At first I thought this was a good idea, but it seems we would be better off
> allowing "string integer" array key access (array_get/set_var(),
> perhaps) and changing other related features accordingly.
We could do that, theoretically. However, that would mean that now array
indexing would sometimes require looking up two different keys. In
particular, it would make checking for the existence of array keys slower.
But I don't think we should have to do so in the first place. Numeric
string keys existing in arrays are a bug, as are integer property names
existing in objects. Almost all array and object operations assume that
arrays don't have numeric string keys, and objects don't have integer
property names. That ought to be a safe assumption.
> Currently, an inaccessible value can also occur in an array because of the
> "int-like string to int" conversion.
> Line 9:
> var_dump($tmp, $tmp[0], $tmp['0']);
> outputs
> array(4) {
> [0]=>
> int(5)
> [1]=>
> int(6)
> [2]=>
> int(7)
> ["0"]=>
> string(3) "zzz" // <== String '0' indexed element is Inaccessible
> }
> int(5)
> int(5) // <== Only long index 0 can be accessed
You may know this, but your example is fixed by this RFC. Your code is:
<?php
$arr = [5,6,7];
$obj = (object)$arr;
var_dump($obj);
$obj->{'0'} = 'zzz';
var_dump($obj, $obj->{0}, $obj->{'0'});
$tmp = (array)$obj;
var_dump($tmp, $tmp[0], $tmp['0']);
and with the patch the output is:
object(stdClass)#1 (3) {
["0"]=>
int(5)
["1"]=>
int(6)
["2"]=>
int(7)
}
object(stdClass)#1 (3) {
["0"]=>
string(3) "zzz"
["1"]=>
int(6)
["2"]=>
int(7)
}
string(3) "zzz"
string(3) "zzz"
array(3) {
[0]=>
string(3) "zzz"
[1]=>
int(6)
[2]=>
int(7)
}
string(3) "zzz"
string(3) "zzz"
> Both the current behaviour and the behaviour after the RFC have pros and cons.
> For instance, the proposed change will require string casts when iterating
> over numeric properties, correct?
No, it won't. Object property names are usually strings, and you can
usually rely on that when iterating over an object. This RFC fixes three
places where this wasn't the case.
Unless you're referring to your proposal?
> for ($i = 0; $i < 100; $i++) {
>     $str_i = (string)$i; // <== "int" key to "string" key conversion
>                          //     requires a cast.
>     echo $obj->{$str_i}; // This kind of expression with an int or int-like
>                          // string index is not allowed now, but it
>                          // could be valid.
>                          // https://3v4l.org/e5L1T
> }
There's no need for the cast there. When you do $obj->{0}, say, PHP
implicitly converts this to $obj->{'0'}. This isn't changed by my patch,
this is the current behaviour.
Likewise, when you do $arr['0'], PHP implicitly converts this to
$arr[0]. This is also the current behaviour.
It's because of this behaviour that the broken conversion of objects to
arrays, and vice-versa, renders some keys or properties inaccessible.
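The implicit conversions described above can be sketched as follows. This is only a minimal illustration with made-up variable names; it uses ordinary array and property access and does not touch the casts that the RFC changes.

```php
<?php
// Arrays: an integer-like string key is normalised to an int key.
$arr = [];
$arr['0'] = 'a';
$firstKeyIsInt = is_int(key($arr));        // the key is stored as int 0
$sameElement   = ($arr[0] === $arr['0']);  // both forms look up key 0

// Objects: an integer property name is normalised to a string name.
$obj = new stdClass;
$obj->{0} = 'b';
$sameProperty = ($obj->{0} === $obj->{'0'}); // both forms look up property "0"

// Non-canonical numeric strings such as '01' are left as string keys.
$arr['01'] = 'c';
$keys = array_keys($arr); // int 0 and string '01'

var_dump($firstKeyIsInt, $sameElement, $sameProperty, $keys);
```

Because both spellings resolve to one canonical key, a key stored in the "wrong" form by a cast can never be reached through normal access.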
> Another con after the RFC is a BC break:
> $arr[0] = 123;
> $obj = (object)$arr;
> $obj->{0} becomes $obj->{'0'}
No, it doesn't. $obj->{0} and $obj->{'0'} behave the same currently and
will continue to do so with this RFC.
> Simply allowing access to both "numeric string index and long index" for
> arrays and objects seems a cleaner resolution to the problems we have.
I disagree, I think this is the most complex solution I've seen
proposed. This means array key and property lookups have to do two
checks, not one, and could lead to subtle bugs depending on what order
this happens (what if 0 and '0' both exist with different values, which
value do we return?).
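To make the ordering problem concrete, here is a rough sketch. It assumes a hypothetical store in which int 0 and string '0' are distinct entries, modelled as a list of key/value pairs because a normal PHP array cannot hold both; `lookup_both()` and its `$order` parameter are invented for illustration and are not part of any proposal.

```php
<?php
// Hypothetical store where int 0 and string '0' are distinct entries,
// modelled as a list of [key, value] pairs.
$entries = [[0, 'int zero'], ['0', 'string zero']];

// Hypothetical lookup that tries both key forms; $order decides which wins.
function lookup_both(array $entries, $key, array $order): ?string
{
    foreach ($order as $type) {
        $candidate = ($type === 'int') ? (int)$key : (string)$key;
        foreach ($entries as [$k, $v]) {
            if ($k === $candidate) {
                return $v; // first matching form wins
            }
        }
    }
    return null; // neither form exists
}

$intFirst    = lookup_both($entries, '0', ['int', 'string']);
$stringFirst = lookup_both($entries, '0', ['string', 'int']);
var_dump($intFirst, $stringFirst);
```

The same key yields a different value depending on the lookup order, and every miss costs two searches instead of one.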
Unless you're proposing that we consider 0 and '0' to be distinct keys
and make $arr[0] and $arr['0'] behave differently. This seems like a bad
idea in the face of weak typing: PHP usually considers 0 and '0' to be
the same value, and it would be very surprising if we changed this
behaviour of array indexing after more than two decades.
> (Objects already allow distinct access to the 0 and '0' indexes, so make
> arrays allow access to both 0 and '0' indexed elements.)
This is not true. $obj->{0} and $obj->{'0'} both look up the object
property "0", and $arr[0] and $arr['0'] both look up the array key 0.
> e.g.
> // There is no way to get string '0' index element now, so add
> // array_get_var();
> $obj->{0} === $arr[0];
> $obj->{'0'} === array_get_var($arr, '0'); // Get string '0' indexed value
Numeric string keys existing at all is a bug. Why should we have a
function for looking them up?
> // Int and string index can have distinct value
> $arr[0] !== $arr['0'];
> $obj->{'0'} !== $arr[0];
> $obj->{0} !== array_get_var($arr, '0');
> $obj->{0} !== $obj->{'0'};
This will break existing code which relies on PHP's weak typing and
general assumption that 0 and '0' are equivalent.
> // Currently, we have no way to set a string '0' indexed element except by
> // converting an object with a string '0' index (e.g. $obj->{'0'} = 123;)
> // to an array. So implement array_set_var() to allow an array to have
> // a string '0' indexed element.
> array_set_var($arr, '0', 123); // Set string '0' indexed value 123
> $obj = (object)$arr;
> $obj->{'0'} === array_get_var($arr, '0');
Why should we introduce a function to create malformed arrays?
> Making arrays accessible by "string int index" and reorganizing other features
> accordingly seems to result in a more consistent spec.
> What do you think?
I don't see the merit of this proposal. PHP has worked on the principle that
0 and '0' are the same key for a long time. Edge-cases where they aren't
are merely that, edge-cases, and should be considered a bug. I would
rather embrace the intended behaviour than make the bug into a feature.
If we wanted to avoid edge-cases like this entirely, we could make
everything work like arrays (integers and non-numeric string keys only),
or make everything work like objects (only string keys). That's
suggested under Future Scope, and I believe HHVM might do the former.
But it's a larger undertaking and not something I am going to address here.
Thanks for your comments.
--
Andrea Faulds
https://ajf.me/