Hi internals,
I would like to present a draft RFC for allowing object keys in arrays:
https://wiki.php.net/rfc/object_keys_in_arrays
The specification in the RFC is incomplete, and primarily focussed on what
impact this has from an internal perspective.
https://github.com/php/php-src/pull/6588 is a draft implementation that
illustrates the necessary changes.
The primary motivation, and reason why I am looking into this now, is that
the current enum proposal is based on objects, and I consider it somewhat
important that enum values can also work as array keys.
From a technical perspective, this also lays the groundwork for supporting
other key types in the future, e.g. if we wish to switch PHP to use
arbitrary-precision integers.
Regards,
Nikita
On 2021-01-11 at 15:28, Nikita Popov wrote:
> I would like to present a draft RFC for allowing object keys in arrays:
> https://wiki.php.net/rfc/object_keys_in_arrays
>
> The primary motivation, and reason why I am looking into this now, is that
> the current enum proposal is based on objects, and I consider it somewhat
> important that enum values can also work as array keys.
+1 for the feature itself, really nice! It makes the ENUM feature more
complete.
The implementation aspects I can't judge though.
r//Björn L
Hi Nikita,
Overall, this is something I always wanted, but adding it means that the
type of all array functions returning keys widens, which is quite the BC
hell.
The basic assumption is that, when an array is given,
foreach ($array as $key => $value) produces a $key of type string|int.
While this assumption does not hold true for iterable (notably
SplObjectStorage, Generator), it's still quite a jump to break it here,
so it would probably require scheduling such a feature for 9.0.
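A minimal sketch of the assumption described above, assuming PHP 8.0 or later. The helper names keyTypes() and objectKeyed() are made up for illustration and are not part of the RFC:

```php
<?php
declare(strict_types=1);

/**
 * Collects the type name of every key produced by iterating $items.
 * Hypothetical helper, for illustration only.
 */
function keyTypes(iterable $items): array
{
    $types = [];
    foreach ($items as $key => $value) {
        $types[] = get_debug_type($key);
    }
    return $types;
}

/** A generator may yield any value as a key, including an object. */
function objectKeyed(): Generator
{
    yield new stdClass() => 'first';
    yield 'plain' => 'second';
}

// Array keys are always int or string.
$fromArray = keyTypes(['a' => 1, 5 => 2]);

// Other iterables carry no such guarantee.
$fromGenerator = keyTypes(objectKeyed());
```

Code that accepts iterable already has to cope with the second case; code that accepts array has so far been able to rely on the first.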
Hi Marco
> While this assumption does not hold true for iterable (notably
> SplObjectStorage, Generator), it's still quite a jump to break it here,
> so it would probably require scheduling such a feature for 9.0.
Definitely disagree here. Your existing code will continue to work fine
without changes. It is only after passing objects as keys to other
functions with the assumption above that your code breaks. Sure, this will
require code changes for frameworks to handle these cases gracefully but it
won't suddenly break your website that hasn't been touched for years.
Note that even minor PHP versions have historically not followed strict
semantic versioning. If we did, most PHP features would have to be pushed
back years given PHP's relatively slow release cycle. I don't think that's
desirable for either PHP developers or maintainers.
Ilija
Hey Ilija,
> Definitely disagree here. Your existing code will continue to work fine
> without changes.
Code written to deal with array in a generic way will no longer work when
invoked through code paths that produce object keys: this is a general
problem with widening types overall, and is a clear BC break.
This can be verified by expanding the array-key meta-type in tools like
vimeo/psalm to include object, then scanning for issues in existing
libraries. For example, on generic arrays it was safe to cast keys to
string, but that guarantee no longer holds true, and depends on
caller-side values, which will change with the new addition.
So yes, there needs to be a code migration/adaptation in a large number of
downstream infrastructural libraries.
Again: this is a common problem with widening types (function return types,
operator/constructs possible return types), and hence why I bring it up.
Whether the problem can be mitigated is what should be discussed, but the
problem is objectively there.
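To make the string-cast point concrete, here is a rough sketch. The function names are hypothetical, and a generator stands in for the object-keyed array that the draft RFC proposes, since that is the closest thing that runs without the patch:

```php
<?php
declare(strict_types=1);

/**
 * Generic helper of the kind found in many libraries: it assumes every
 * key can be cast to string. Name and shape are hypothetical.
 */
function keysAsStrings(iterable $items): array
{
    $out = [];
    foreach ($items as $key => $value) {
        $out[] = (string) $key; // safe for int|string keys only
    }
    return $out;
}

/** Stand-in for an object-keyed array as proposed in the draft RFC. */
function objectKeyedStandIn(): Generator
{
    yield new stdClass() => 'value';
}

// int and string keys cast cleanly.
$ok = keysAsStrings(['a' => 1, 5 => 2]);

try {
    keysAsStrings(objectKeyedStandIn());
    $failure = null;
} catch (Error $e) {
    // stdClass has no __toString(), so the cast throws.
    $failure = $e->getMessage();
}
```

The helper itself is unchanged between the two calls; only the caller-side values differ, which is the widening problem described above.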
> Code written to deal with array in a generic way will no longer work when
> invoked through code paths that produce object keys: this is a general
> problem with widening types overall, and is a clear BC break.
If you look at levels of BC break, this is on the very low end in my
opinion. No existing code will break when upgrading to this new PHP
version. Only new code written specifically for that PHP version would
be impacted, and frameworks/libraries could, if necessary, add
additional checks without becoming incompatible with older PHP versions.
It helps that currently array keys can be integers or strings, so if the
key type is important, some checking is already necessary.
Of course tools like Psalm would see a lot of potential issues, but
those would only be potential (with new code using this new feature) and
having static analyzers reduces the impact of such a BC break even more,
as it becomes much easier to spot issues. Might be interesting to change
array key type definitions for a few projects/libraries and see if there
are that many potential issues, and what they look like.
> If you look at levels of BC break, this is on the very low end in my
> opinion. No existing code will break when upgrading to this new PHP
> version. Only new code written specifically for that PHP version would
> be impacted, and frameworks/libraries could, if necessary, add
> additional checks without becoming incompatible with older PHP versions.
> It helps that currently array keys can be integers or strings, so if the
> key type is important, some checking is already necessary.
This is untrue. It's quite possible to have code like this today:
function make_list(array $vals) {
    $out = '<select>';
    foreach ($vals as $key => $label) {
        $out .= sprintf('<option value="%s">%s</option>', $key, $label);
    }
    return $out . '</select>';
}
That would work just fine today with string or int keys, but would break on an object-keyed array if the objects did not implement __toString().
Whether the risk of such code breaking in practice is large or small is another question, and I don't have an answer for it. I'm of mixed feelings on this RFC; on the one hand, it would make Enums even easier to work with, which is good. On the other, it pours even more behavior into the kitchen sink data structure whose (mis)use is a known security hole, which is bad. I'd like to see us using naked hash tables less, not more.
Ilija is also correct that minor potential-breaks like this have been introduced in the past, and the world has not ended. Whether that is good or bad, historically, is debatable and likely varies with the specific feature.
I would say that if this RFC moves forward, we would also want to introduce another array_is_*() function to make adding extra guards easier. I'm not sure what we'd call it, though. array_is_keyed? array_is_assoc? array_is_not_objects? Bikeshed as you wish. :-) But we'd want some easy way to tell what key-style an array uses.
... maybe a function that returns an internal enumeration of List, Assoc, or Object for the 3 key types? :-)
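A userland sketch of what such a check could look like, assuming PHP 8.1 for the enum. KeyStyle and keyStyle() are invented names, not an existing or proposed API, and the function accepts iterable only so that the object-keyed case can be exercised with a generator:

```php
<?php
declare(strict_types=1);

/** Hypothetical classification; none of these names exist in PHP. */
enum KeyStyle
{
    case Sequential;  // keys are exactly 0, 1, 2, ...
    case Associative; // any other mix of int and string keys
    case ObjectKeyed; // at least one key is an object
}

function keyStyle(iterable $items): KeyStyle
{
    $next = 0;
    $sequential = true;
    foreach ($items as $key => $value) {
        if (is_object($key)) {
            return KeyStyle::ObjectKeyed;
        }
        if ($key !== $next++) {
            $sequential = false;
        }
    }
    return $sequential ? KeyStyle::Sequential : KeyStyle::Associative;
}
```

A guard such as `if (keyStyle($vals) === KeyStyle::ObjectKeyed) { ... }` could then sit in front of code like make_list() above.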
--Larry Garfield
> Whether the problem can be mitigated is what should be discussed, but the
> problem is objectively there.
Hi all,
Like others, I like the idea of object keys, but worry about its
practical impact.
A few scatter-gun thoughts on possible approaches; I don't think any of
them is the solution, but maybe they'll spark someone else to further ideas:
- Only allow objects that are "stringable" (i.e. implement __toString),
  but don't actually call it. This retains the safety of code using
  "(string)$key", but not code passing to a constraint of "string|int".
  (It also means enums will all have to have an __toString, which might be
  a price worth paying.)
- Create a new type, which is like an array but allows object keys, with
  its own literal syntax. e.g.
      $hash = hash['foo' => 42, $someObject => 69];
      assert(is_array($hash) === false);
      assert(is_iterable($hash) === true);
- Invent a syntax for initialising a custom collection object from a
  literal with arbitrary keys and values, but not actually initialise it
  as an array. Even without generics, this would make it much more
  attractive to replace more arrays with specific collection types.
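As a rough userland approximation of the first and third ideas combined, assuming PHP 8.0 or later: StringableKeyMap is an invented class, not anything in the RFC. It accepts int, string, or Stringable keys, never calls __toString(), and hands the original key objects back on iteration:

```php
<?php
declare(strict_types=1);

/** Hypothetical userland map; not part of the RFC or of PHP itself. */
final class StringableKeyMap implements IteratorAggregate, Countable
{
    /** @var array<string, array{0: mixed, 1: mixed}> slot => [key, value] */
    private array $entries = [];

    public function set(int|string|Stringable $key, mixed $value): void
    {
        $this->entries[self::slot($key)] = [$key, $value];
    }

    public function get(int|string|Stringable $key): mixed
    {
        return $this->entries[self::slot($key)][1] ?? null;
    }

    public function count(): int
    {
        return count($this->entries);
    }

    public function getIterator(): Generator
    {
        foreach ($this->entries as [$key, $value]) {
            yield $key => $value; // object keys come back as objects
        }
    }

    /** Objects are identified by handle, scalars by type and value. */
    private static function slot(int|string|Stringable $key): string
    {
        return is_object($key)
            ? 'o:' . spl_object_id($key)
            : get_debug_type($key) . ':' . $key;
    }
}

$label = new class {
    public function __toString(): string
    {
        return 'label';
    }
};

$map = new StringableKeyMap();
$map->set($label, 42);
$map->set('foo', 69);
```

The map keeps a reference to every key object, so the id from spl_object_id() cannot be reused while the entry exists. What userland cannot provide is the literal syntax, which is the gap the second and third ideas are about.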
Regards,
--
Rowan Tommins
[IMSoP]
On 13/01/2021 at 10:08, Rowan Tommins wrote:
> Only allow objects that are "stringable" (i.e. implement __toString),
> but don't actually call it.
Hello,
Instead of trying to restrict the types that can be used as keys, maybe
this is the time to have enumerables, lists, maps, vectors, sets,
types, dictionaries, ... in the PHP standard library? A third-party tool
such as PHP-DS should be part of the core API instead of being a
third-party extension.
Arrays are arrays, but the more features we add to a single polymorphic
data structure such as this, the less readable and comprehensible the
code becomes. We use arrays everywhere as soon as we need a list, a map,
a vector, a set, a dictionary, a data structure that's not an object,
etc. that isn't built using a generator, and it becomes a serious mess.
It's weakly typed, it's not explicit about the intent, and in some cases
I guess specific data structures would be much faster as well.
I think adding this new feature is really cool, but having a complete
API around iterable with different types for different use cases would
be much better in my opinion.
I know it's not going to happen soon, just saying for the sake of the
conversation, but PHP arrays are really powerful, and insanely
polymorphic, and they do way too much in my opinion.
Regards,
Pierre
> Instead of trying to restrict types that can be used as keys, maybe
> this is the time to have enumerables, lists, maps, vectors,
> sets, types, dictionaries, ... in the PHP standard library? A third-party
> tool such as PHP-DS should be part of the core API instead of being a
> third-party extension.
For sure, that's why I mentioned initialisation syntax: if you can't use
objects as array keys, you can't use an array literal to construct any
other object which uses objects as keys either, so they're always going
to look uglier:
Compare:
$map = new Ds\Map([]);
$map->put($object1, 42);
$map->put($object2, 69);
$map->put($object3, 39);
$map->put($object4, 101);
To:
$map = [
    $object1 => 42,
    $object2 => 69,
    $object3 => 39,
    $object4 => 101,
];
There's a bunch of other gaps, too: arrays are copy-on-write, can be
used in constants and defaults, etc. If we want people to use something
- anything - instead of arrays, these are the things that only the core
language can provide.
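A short sketch of those gaps, using only core features. The constant and function names are made up for illustration:

```php
<?php
declare(strict_types=1);

// Arrays are usable in constant expressions and as parameter defaults.
const DEFAULT_WEIGHTS = ['low' => 1, 'high' => 10];

function weights(array $overrides = ['low' => 1]): array
{
    return $overrides + DEFAULT_WEIGHTS;
}

// Arrays have value semantics (copy-on-write): writing to the copy
// leaves the original untouched.
$original = ['a' => 42];
$copy = $original;
$copy['a'] = 69;

// Objects are handles: both variables refer to the same storage.
$storage = new SplObjectStorage();
$alias = $storage;
$alias[new stdClass()] = 42;
```

After this runs, $original['a'] is still 42, while $storage contains the entry that was written through $alias.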
Regards,
--
Rowan Tommins
[IMSoP]
On Wed, Jan 13, 2021 at 10:08 AM Rowan Tommins rowan.collins@gmail.com
wrote:
> Create a new type, which is like an array but allows object keys, with
> its own literal syntax.
I don't think a separate dictionary type would really help (in the context
of this discussion). If you're working with arrays, you already either
treat it as a vector (in which case the keys are irrelevant) or as a
dictionary (in which case you would need to deal with object keys
regardless of whether you call it "array" or "dict"). I think a dedicated
dictionary type can have other benefits (primarily removing numeric string
-> int canonicalization), but I don't think that distinguishing allowed key
types based on it is a good idea. (If we introduced such a type, I would
expect it to have the same key requirements as normal arrays, because both
types would have to be strongly interoperable, and allow mostly seamless
conversions.)
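For reference, the numeric string -> int canonicalization mentioned above looks like this in current PHP:

```php
<?php
declare(strict_types=1);

$array = [
    '1'   => 'canonical decimal string, stored as int 1',
    '01'  => 'leading zero, stays a string',
    '1.5' => 'not an integer, stays a string',
    'x'   => 'ordinary string key',
];

$keys = array_keys($array);
// $keys is [1, '01', '1.5', 'x']: only the first key changed type.
```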
Regards,
Nikita
Separate response for this one, sorry for the noise:
> Note that even minor PHP versions have historically not followed strict
> semantic versioning. If we did, most PHP features would have to be pushed
> back years given PHP's relatively slow release cycle. I don't think that's
> desirable for either PHP developers or maintainers.
It's perfectly fine to release a new major more frequently, if
required/worth it, but releasing breaking changes in minor versions
(intentionally and being well aware of them) only erodes the quite thin BC
trust.
In practice, no difference for myself personally, as I already pin my own
libraries to minor PHP versions, preventing installation with PHP 8.1
nowadays, until I know it to be "safe", post-RC testing.
That perhaps is a discussion for RMs to take on.
On 12/01/2021 at 17:35, Ilija Tovilo wrote:
> Definitely disagree here. Your existing code will continue to work fine
> without changes. It is only after passing objects as keys to other
> functions with the assumption above that your code breaks.
I do agree with Marco: lots of generic code manipulating array keys will
be prone to failure with such a feature, so it is a breaking change.
Nevertheless, the feature is so appealing that I'm definitely OK with
fixing all the things! Still, lots of major frameworks and
libraries will have tests to write and fixes to make, and the
transition to support will probably be erratic and slow.
--
Pierre
This proposal is interesting, and I see why the enum proposal makes it
useful.
Supporting this will mean a small amount of work for me (assuming it
passes) and other static analysis tools, but I don't want that to factor
into anyone's decision.
I am curious, though, whether the scope of this RFC could be narrowed to
just allowing enum cases as keys? That might avoid issues with objects that
cannot be cast to string.
Hi Matthew
> I am curious, though, whether the scope of this RFC could be narrowed to
> just allowing enum cases as keys? That might avoid issues with objects that
> cannot be cast to string.
The enum RFC currently does not propose auto-implementing __toString()
and actually forbids implementing __toString() manually in case we
want to add some type of coercion at some point. While we could change
that, I do think it introduces some inconsistencies.
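// Hypothetical: a string-backed enum with an auto-implemented __toString()
enum Foo: string {
    case Bar = 'bar';
}
var_dump(Foo::Bar . 'baz'); // barbaz

// ...while an int-backed enum (separate script) would still not coerce
enum Foo: int {
    case Bar = 1;
}
var_dump(Foo::Bar + 2); // Type error
// Will require Foo::Bar->value to work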
We intentionally removed auto-coercion from enums because the rules
are complicated and not clear-cut, and type strictness is what the
language generally seems to be striving towards.
That is not to say narrowing object keys to enums is a bad idea. I
think that might be worth considering for other reasons.
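For comparison, a minimal sketch of what works without object keys, assuming PHP 8.1 enums as accepted. Suit is an invented example enum: a backed enum can be keyed by its ->value, whereas using the case itself as a key throws.

```php
<?php
declare(strict_types=1);

enum Suit: string
{
    case Hearts = 'H';
    case Spades = 'S';
}

$counts = [];

// Works: the backing value is a plain string key.
$counts[Suit::Hearts->value] = 3;

// Recover the cases from the keys when reading back.
$cases = array_map(Suit::from(...), array_keys($counts));

// Fails: an enum case is an object, and objects are not valid array keys.
try {
    $counts[Suit::Spades] = 1;
    $error = null;
} catch (TypeError $e) {
    $error = $e->getMessage();
}
```

This workaround is only available for backed enums; a pure enum has no ->value to fall back on.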
Ilija