Hi internals,
I've created the RFC https://wiki.php.net/rfc/is_list
This adds a new function is_list(mixed $value): bool that will return true
if the type of $value is array and the array keys are 0 .. count($value)-1 in that order.
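As a rough illustration of those semantics, here is a userland sketch. It is not the C implementation from the linked PR, and the name is_list_sketch is hypothetical, chosen to avoid clashing with any built-in. It walks the array once and checks that the keys seen are exactly 0, 1, 2, ... in iteration order.

```php
<?php
// Userland sketch only; is_list_sketch is a hypothetical name.
// The parameter is left untyped so that any value can be passed in.
function is_list_sketch($value): bool
{
    if (!is_array($value)) {
        return false;
    }
    $expected = 0;
    foreach ($value as $key => $_unused) {
        // Keys must be the integers 0, 1, 2, ... in iteration order.
        if ($key !== $expected) {
            return false;
        }
        $expected++;
    }
    return true;
}

var_dump(is_list_sketch([]));              // bool(true): no keys to violate the rule
var_dump(is_list_sketch(['a', 'b']));      // bool(true)
var_dump(is_list_sketch([1 => 'a']));      // bool(false): does not start at 0
var_dump(is_list_sketch('not an array'));  // bool(false)
```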
It's well-known that PHP's array data type is rare among programming languages
in that it supports both integer and string keys
and that iteration order is important and guaranteed.
(It is used for overlapping use cases; in many other languages, vectors/lists/arrays and hash maps are separate types.)
While it is possible to efficiently check that something is an array,
that array may still have string keys, not start from 0, have missing array offsets,
or contain out of order keys.
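To make those four cases concrete, the sketch below runs each through a commonly used userland check. The helper name has_list_keys is hypothetical. It relies on array_values() renumbering keys to 0..n-1 and on === comparing keys, values, and order, so it may build a second array just to answer the question.

```php
<?php
// Hypothetical helper: true when the keys are exactly 0..n-1 in order.
function has_list_keys(array $a): bool
{
    return array_values($a) === $a;
}

// Every value below passes is_array(), but only the last one is a list.
$cases = [
    'string keys'    => ['a' => 1, 'b' => 2],
    'not from 0'     => [1 => 'x', 2 => 'y'],
    'missing offset' => [0 => 'x', 2 => 'y'],
    'out of order'   => [1 => 'y', 0 => 'x'],
    'list'           => ['x', 'y'],
];

foreach ($cases as $label => $array) {
    printf("%-15s %s\n", $label, has_list_keys($array) ? 'list' : 'not a list');
}
```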
It can be useful to verify the assumption that array keys are consecutive integers,
both for data that is being passed into a module and for validating data before returning it from a module.
However, because it's currently inconvenient to do that, this has rarely been done in my experience.
In performance-sensitive serializers or data encoders, it may also be useful to have an efficient check to distinguish lists from associative arrays.
For example, json_encode does this when deciding whether to serialize an array as [0, 1, 2] or as {"0":0,"2":1,"1":1},
depending on the keys and their order.
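The json_encode behaviour can be observed directly. In this small sketch the second array holds the same integer keys as a list would, but in a different order and with a gap, so it is encoded as a JSON object rather than a JSON array.

```php
<?php
// Keys 0, 1, 2 in order: encoded as a JSON array.
echo json_encode([0, 1, 2]), "\n";                 // [0,1,2]

// Keys 0, 2, 1: not a list, so encoded as a JSON object.
echo json_encode([0 => 0, 2 => 1, 1 => 1]), "\n";  // {"0":0,"2":1,"1":1}
```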
Prior email threads/PRs have had others indicate interest in the ability to efficiently check
if a PHP array has sequential ordered keys starting from 0:
https://externals.io/message/109760 “Any interest in a list type?”
https://externals.io/message/111744 “Request for couple memory optimized array improvements”
Implementation: https://github.com/php/php-src/pull/6070 (some discussion is in the linked PR it was based on)
Thanks,
- Tyson
Well, since I'm quoted... :-)
I'm fine with this, but have one question and one correction:
- If we do eventually end up with list/vec types, would the naming here conflict at all? Or would it cause confusion and name collision? (Insert name bikeshedding here.)
- The last quote, from me, has a small error. The last sentence shouldn't be a bullet point but its own paragraph, after the list is complete.
--Larry Garfield
Hi Larry Garfield,
Yes, there's definitely the potential for naming conflicts if the type is called list, but not if it's called vec/vector/varray,
similar to https://docs.hhvm.com/hack/built-in-types/arrays - I'd prefer the latter if there was a viable implementation.
I should note that in the discussion section.
If the type is named list instead of vector and ends up incompatible with arrays,
there'd need to be an is_list_type($val) or $val is list or some other new type check with a less preferable name.
If it's compatible with arrays/lists (e.g. only checked during property assignment, passing in arguments, and returning values), then it wouldn't be an issue.
- is_array_list() or is_array_and_list() or is_values_array() would avoid some of that ambiguity but would be much more verbose
Providing objects with APIs similar to the external PECL https://www.php.net/manual/en/class.ds-vector.php and the SPL may be easier to adopt because it can be polyfilled,
but there's the drawback that there aren't the memory savings from copy-on-write and that there's the performance overhead of method calls to offsetGet(), etc.
I'd expect the addition of a separate/incompatible vec type to be a massive undertaking, and possibly unpopular if it splits the language.
In Hack/HHVM, it was practical for users to adopt because HHVM is bundled with a typechecker that checks that the uses are correct at compile time.
Because PHP has no bundled type checker, a new type would potentially cause a lot of unintuitive behaviors.
I fixed the formatting of the quote.
-- Tyson
On Dec 19, 2020, at 19:43, tyson andre tysonandre775@hotmail.com wrote:
It can be useful to verify that the assumption that array keys are
consecutive integers is correct, both for data that is being passed
into a module or for validating data before returning it from a
module. However, because it's currently inconvenient to do that, this
has rarely been done in my experience.
I think there are some places where is_list() could be unintuitive to
those who don’t understand some of the idiosyncrasies of PHP.
For example, with
$a = ['foo', 'bar', 'baz'];
is_list() will return true, but if you run $a through asort(),
is_list() will return false because the keys are no longer
consecutive integers, but is there any doubt this is still a list?
Maybe in a pure sense, it’s not, but I think this could be confusing.
But now, if we do
$b = array_merge($a, ['qux', 'quux']);
$b is now back to being a list, so is_list($b) returns true.
While I understand the convenience is_list() provides--I myself have
implemented the opposite of this numerous times (e.g., is_dict())--it
comes close to implying a data type that PHP doesn’t
have, and I think this could give a false sense of type-safety-ness
when using this function to check whether something is a 0-indexed
array.
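The sequence described above can be reproduced with a short sketch. Since is_list() is only a proposal at this point, a hypothetical userland stand-in named keys_are_list is used in its place; it assumes the same "keys are 0..n-1 in order" rule.

```php
<?php
// Hypothetical stand-in for the proposed check.
function keys_are_list(array $a): bool
{
    return array_values($a) === $a;
}

$a = ['foo', 'bar', 'baz'];
var_dump(keys_are_list($a));           // bool(true)

// asort() sorts by value but keeps each value's original key.
asort($a);
echo implode(',', array_keys($a)), "\n"; // 1,2,0
var_dump(keys_are_list($a));           // bool(false)

// array_merge() renumbers integer keys, so the result is a list again.
$b = array_merge($a, ['qux', 'quux']);
echo implode(',', $b), "\n";           // bar,baz,foo,qux,quux
var_dump(keys_are_list($b));           // bool(true)
```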
Cheers,
Ben
Would either is_zero_based() or is_zero_indexed() be a reasonable name instead?
-Mike
P.S. See https://en.wikipedia.org/wiki/Zero-based_numbering
I don’t think changing the name changes my concern, but as Tyson
pointed out:
These idiosyncrasies of php and unintuitive behaviors existed prior
to this RFC.
I’m generally +1 on this RFC, but as I think about it, maybe there is a
problem with the name. If we choose to introduce a pure list construct
at some point, then is_list() may cause confusion (or maybe it won’t,
if it can be made to return true for a zero-indexed, standard PHP
array and a list, whatever that might look like in the language).
Cheers,
Ben
Due to concerns that the name could cause confusion with potential future changes to the language,
I've updated https://wiki.php.net/rfc/is_list to use the name is_array_and_list(mixed $value): bool instead.
(e.g. what if PHP used the reserved word list to add an actual list type in the future, and is_list() returned false for that?)
I plan to start voting on the RFC in a few days.
Thanks,
- Tyson
Possible alternative that's less clumsy: is_packed_array?
--Larry Garfield
Hi Tyson,
Thanks for the proposal and implementation. I've wanted a function
that does this on numerous occasions, so I think it will be a good
addition to the standard library.
If I may chime in with Larry's suggestion: I think is_packed_array
would be a better name than is_array_and_list. The latter is potentially
confusing since it sounds like the value has more than one type.
is_packed_array makes it clearer that the function checks whether
the value is an array matching certain criteria. I think it also reads
better when used along with other functions. For example:
function is_associative_array(mixed $value) {
    return is_array($value) && !is_packed_array($value);
    // vs.
    return is_array($value) && !is_array_and_list($value);
}
Best regards,
Theodore
I probably brought this up in a previous thread, but I think it's worth
considering again here, given recent changes to the RFC:
I think it would make more sense to introduce this as function
array_is_list(array $array): bool. That is, a function that only accepts
arrays in the first place and determines whether the given array is a list.
Your RFC does mention this possibility, but I think the argument it makes
against it is not particularly strong, especially given the recent rename.
The argument is that is_array_and_list($array) is shorter than writing out
is_array($array) && array_is_list($array) -- that's still true, but with
the new name, it's really not that much shorter anymore.
On the other hand, is_array($array) && array_is_list($array) cleanly
separates out the two predicates. If we take into account the fact that in
the vast majority of cases we will know a priori that the input is an
array, just not whether it is a list, a function
array_is_list($array) is both clearer and more concise.
function foo(array $array) {
    assert(array_is_list($array)); // Already know it's an array...
}

if (is_array($value)) {
    if (array_is_list($value)) { // Already know it's an array...
        return serialize_as_list($value);
    } else {
        return serialize_as_dict($value);
    }
}
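A runnable sketch of that serializer pattern follows. Everything in it is illustrative: the userland array_is_list() is a stand-in guarded by function_exists() in case the runtime already provides a function of that name, and encode_value(), serialize_as_list(), and serialize_as_dict() are hypothetical helpers that emit JSON-like text.

```php
<?php
// Stand-in for the proposed function, defined only if it is missing.
if (!function_exists('array_is_list')) {
    function array_is_list(array $array): bool
    {
        $expected = 0;
        foreach ($array as $key => $_unused) {
            if ($key !== $expected++) {
                return false;
            }
        }
        return true;
    }
}

// Hypothetical encoder: arrays are dispatched on the list check.
function encode_value($value): string
{
    if (is_array($value)) {
        // Already know it's an array, so only the list check remains.
        return array_is_list($value)
            ? serialize_as_list($value)
            : serialize_as_dict($value);
    }
    return json_encode($value);
}

function serialize_as_list(array $list): string
{
    return '[' . implode(',', array_map('encode_value', $list)) . ']';
}

function serialize_as_dict(array $dict): string
{
    $parts = [];
    foreach ($dict as $key => $item) {
        $parts[] = json_encode((string) $key) . ':' . encode_value($item);
    }
    return '{' . implode(',', $parts) . '}';
}

echo encode_value(['a', ['x' => 1, 'y' => [2, 3]]]), "\n";
// ["a",{"x":1,"y":[2,3]}]
```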
Regards,
Nikita