Hello,
I would like to open a discussion for https://github.com/php/php-src/issues/10791 .
[https://opengraph.githubassets.com/a23cb565cc8acac6a33ecab5d9ee68a46f046a1ffe215501673156e506695430/php/php-src/issues/10791]https://github.com/php/php-src/issues/10791
Array spread append · Issue #10791 · php/php-srchttps://github.com/php/php-src/issues/10791
Description Currently spread operator can be used for almost anything. But not for array append. I propose the following to be supported: <?php $arr = [1, 2]; $arr2 = [3, 4]; $arr[...] = $arr2; // ...
github.com
Appending N elements to an array is quite common language usage pattern and I belive it should be supported natively for shorter syntax, language consistency and performance.
I am unable to implement it, but I am ready to help with RFC.
Michael Vorisek
Hello,
I would like to open a discussion for https://github.com/php/php-src/issues/10791 .
[https://opengraph.githubassets.com/a23cb565cc8acac6a33ecab5d9ee68a46f046a1ffe215501673156e506695430/php/php-src/issues/10791]https://github.com/php/php-src/issues/10791
Array spread append · Issue #10791 · php/php-srchttps://github.com/php/php-src/issues/10791
Description Currently spread operator can be used for almost anything. But not for array append. I propose the following to be supported: <?php $arr = [1, 2]; $arr2 = [3, 4]; $arr[...] = $arr2; // ...
github.com
Appending N elements to an array is quite common language usage pattern and I belive it should be supported natively for shorter syntax, language consistency and performance.I am unable to implement it, but I am ready to help with RFC.
Michael Vorisek
I have the feeling I'm missing something, but how would this compare to
array join (which is already available and is actually shorter than this) ?
$arr += $arr2;
Smile,
Juliette
I have the feeling I'm missing something, but how would this compare to
array join
Array join does neither append nor replace if a numeric index already exists.
But you can do [...$arr0, ...$arr1].
https://3v4l.org/CIinR
print json_encode([
[...['a', 'b'], ...['c', 'd']], // [a, b, c, d]
array_merge(['a', 'b'], ['c', 'd']), // [a, b, c, d]
['a', 'b'] + ['c', 'd'], // [a, b]
array_replace(['a', 'b'], ['c', 'd']), // [c, d]
]);
(I wish we could use yaml for these outputs)
$arr[...] = $arr2;
Using spread in the array index would be different from current usage
of the spread operator.
But it could be a nice alternative to $arr = [...$arr, $arr2]
, as it
does not require to repeat the first variable.
So I am not against it.
--- Andreas
On Thu, 6 Apr 2023 at 00:36, Juliette Reinders Folmer
php-internals_nospam@adviesenzo.nl wrote:
Hello,
I would like to open a discussion for https://github.com/php/php-src/issues/10791 .
[https://opengraph.githubassets.com/a23cb565cc8acac6a33ecab5d9ee68a46f046a1ffe215501673156e506695430/php/php-src/issues/10791]https://github.com/php/php-src/issues/10791
Array spread append · Issue #10791 · php/php-srchttps://github.com/php/php-src/issues/10791
Description Currently spread operator can be used for almost anything. But not for array append. I propose the following to be supported: <?php $arr = [1, 2]; $arr2 = [3, 4]; $arr[...] = $arr2; // ...
github.com
Appending N elements to an array is quite common language usage pattern and I belive it should be supported natively for shorter syntax, language consistency and performance.I am unable to implement it, but I am ready to help with RFC.
Michael Vorisek
I have the feeling I'm missing something, but how would this compare to
array join (which is already available and is actually shorter than this) ?$arr += $arr2;
Smile,
Juliette
Call me sentimental, but are we just trying to choke out the few remaining
reasons to keep array_push()
in the language?
Okay, yeah, sometimes at is a drag that the first parameter has to be
declared in advance and is modified by reference (which means you can't
enjoy null coalescing or short ternary syntax directly in the first
parameter) but...
The whole purpose of array_push()
is to append one or more elements to an
array. Spreading the second parameter makes this task simple/clean.
https://3v4l.org/RcKRD
I feel like PHP already has an extensive toolbelt for merging array data.
I am not yet sold on this new/strange implementation of the spread operator.
Mick Harner
mickmackusa
Hi Mick
czw., 6 kwi 2023 o 10:00 mickmackusa mickmackusa@gmail.com napisał(a):
Call me sentimental, but are we just trying to choke out the few remaining
reasons to keeparray_push()
in the language?Okay, yeah, sometimes at is a drag that the first parameter has to be
declared in advance and is modified by reference (which means you can't
enjoy null coalescing or short ternary syntax directly in the first
parameter) but...The whole purpose of
array_push()
is to append one or more elements to an
array. Spreading the second parameter makes this task simple/clean.
https://3v4l.org/RcKRD
If a parameter of array_push is a dictionary it fails with:
Fatal error: Uncaught ArgumentCountError: array_push()
does not accept
unknown named parameters
See https://3v4l.org/BHPSt
Best regards,
Michał Marcin Brzuchalski
Hi Michael
I would like to open a discussion for https://github.com/php/php-src/issues/10791 .
[https://opengraph.githubassets.com/a23cb565cc8acac6a33ecab5d9ee68a46f046a1ffe215501673156e506695430/php/php-src/issues/10791]https://github.com/php/php-src/issues/10791
Array spread append · Issue #10791 · php/php-srchttps://github.com/php/php-src/issues/10791
Description Currently spread operator can be used for almost anything. But not for array append. I propose the following to be supported: <?php $arr = [1, 2]; $arr2 = [3, 4]; $arr[...] = $arr2; // ...
github.com
Appending N elements to an array is quite common language usage pattern and I belive it should be supported natively for shorter syntax, language consistency and performance.Hi Michael
There are a few questions that come to mind (there may be more).
- Are integer keys preserved? I'm assuming no, as otherwise it would
be the same as$a + $b
. - What is the return value of the expression
$a[...] = $b
? I'm
assuming $a after the additions of $b? - How does it behave in combination with
ArrayAccess
? Throw? Call
offsetSet
for each element? - How does it interact with references? E.g. This is valid PHP code:
assign_by_ref($a[])
(https://3v4l.org/qoJYn) - How does it interact with undefined/null values? E.g.
$a[] = 42;
works without declaring $a first. - Is there a need for this? Given that
+
doesn't work with
sequential lists andarray_push($a, ...$b)
doesn't work with strings
I'd say possibly.[...$a, ...$b]
works but requires duplication of
the array which in loops can be detrimental to performance.
Ilija
Hi Ilija,
- Are integer keys preserved? I'm assuming no, as otherwise it would
be the same as$a + $b
.
$arr[...] = $arr;
should be the same as
foreach ($arr as $v) { $arr[] = $v; }
all other answers should be answered from this and consistency
- What is the return value of the expression
$a[...] = $b
? I'm
assuming $a after the additions of $b?
https://3v4l.org/63IEU - the right side expr to be consistent with regular $arr[] = expr
- How does it behave in combination with
ArrayAccess
? Throw? Call
offsetSet
for each element?
https://3v4l.org/v26Bh - call offsetSet
for each element
- How does it interact with references? E.g. This is valid PHP code:
assign_by_ref($a[])
(https://3v4l.org/qoJYn)
assign_by_ref($a[...])
would be about the same as assign_by_ref($a)
, so compile error
- How does it interact with undefined/null values? E.g.
$a[] = 42;
works without declaring $a first.
https://3v4l.org/JtO7g - same behaviour, based on the foreach definition
- Is there a need for this? Given that
+
doesn't work with
sequential lists andarray_push($a, ...$b)
doesn't work with strings
I'd say possibly.[...$a, ...$b]
works but requires duplication of
the array which in loops can be detrimental to performance.
Yes, +
and/or array_push($a, ...$b)
are not equivalents, and all other syntaxes like
[...$a, ...$b]
or array_merge($a, $b)
are longer to write and slower
- One extra QA about references:
https://3v4l.org/U02Cq - $arr[...] = &$arr
should be compile error
Michael
From: Ilija Tovilo tovilo.ilija@gmail.com
Sent: Thursday, April 6, 2023 12:18 PM
To: internals@lists.php.net internals@lists.php.net
Subject: Re: [PHP-DEV] Array spread append
Hi Michael
I would like to open a discussion for https://github.com/php/php-src/issues/10791 .
[https://opengraph.githubassets.com/a23cb565cc8acac6a33ecab5d9ee68a46f046a1ffe215501673156e506695430/php/php-src/issues/10791]https://github.com/php/php-src/issues/10791
Array spread append · Issue #10791 · php/php-srchttps://github.com/php/php-src/issues/10791
Description Currently spread operator can be used for almost anything. But not for array append. I propose the following to be supported: <?php $arr = [1, 2]; $arr2 = [3, 4]; $arr[...] = $arr2; // ...
github.com
Appending N elements to an array is quite common language usage pattern and I belive it should be supported natively for shorter syntax, language consistency and performance.Hi Michael
There are a few questions that come to mind (there may be more).
- Are integer keys preserved? I'm assuming no, as otherwise it would
be the same as$a + $b
. - What is the return value of the expression
$a[...] = $b
? I'm
assuming $a after the additions of $b? - How does it behave in combination with
ArrayAccess
? Throw? Call
offsetSet
for each element? - How does it interact with references? E.g. This is valid PHP code:
assign_by_ref($a[])
(https://3v4l.org/qoJYn) - How does it interact with undefined/null values? E.g.
$a[] = 42;
works without declaring $a first. - Is there a need for this? Given that
+
doesn't work with
sequential lists andarray_push($a, ...$b)
doesn't work with strings
I'd say possibly.[...$a, ...$b]
works but requires duplication of
the array which in loops can be detrimental to performance.
Ilija
--
To unsubscribe, visit: https://www.php.net/unsub.php
- Are integer keys preserved? I'm assuming no, as otherwise it would
be the same as$a + $b
.$arr[...] = $arr;
should be the same as
foreach ($arr as $v) { $arr[] = $v; }all other answers should be answered from this and consistency
Since $arr1[...] = $arr2;
should be consistent with foreach ($arr2 as $v) { $arr1[] = $v; }
(or more succinctly with a bodiless loop: foreach ($arr2 as $arr1[]);
. In this regard, it appears to destroy $arr2 keys
while appending. So for cases with non-numeric keys that choke the spread
operator within array_push()
, then the bodiless loop isn't horrific.
What I do find unsettling is that the spread operator in its current
implementations always serves to unpack the payload that follows it (on its
right side). However, in this new implementation, the three dots are
entombed in square braces -- it would be inconsistent with the language.
I think it would be more intuitive to implement an array "appending"
functionality with "concatenating" syntax. The .=
syntax already exists,
but is forbidden from use on non-strings. If you want to implement this
preexisting syntax as an array concatenation operator, then it is a blank
slate -- you can decree whatever behaviors you wish for it without blurring
what the combined operator does with strings. This opportunity is similar
to how +=
acts as an addition-assignment operator or an array union
operator depending on context.
I don't think I have an appetite for going down this road, but if so, I
could imagine appending with $all = $arr1 . $arr2 . $arr3;
The reason
that I find this unsavory is probably because it becomes harder to intuit a
data type by merely scanning code functionality. If the variable names
aren't indicative and there is no type hinting, it will be hard to discern
string variables from array variables without more context.
I suppose my parting thought on this discussion is that the spread operator
is just not the best tool for this job -- if this is a job worth doing at
all.
Mick
I think it would be more intuitive to implement an array "appending"
functionality with "concatenating" syntax. The.=
syntax already exists,
but is forbidden from use on non-strings. If you want to implement this
preexisting syntax as an array concatenation operator, then it is a blank
slate -- you can decree whatever behaviors you wish for it without blurring
what the combined operator does with strings.
This would be the idea situation, but this is not correct.
Arrays, still in PHP 8, get automatically cast to strings during
concatenation or echo:
<?php
$a = [1, 2, 3];
$b = [4, 5, 6];
$a .= $b;
var_dump($a);
?>
Results in:
Warning: Array to string conversion
Warning: Array to string conversion
string(10) "ArrayArray"
I'm expecting that in PHP 9 this is going to throw a TypeError similarly to
how non-string castable objects throw a TypeError in PHP 8.
But because this was "only" a notice in PHP 7, this specific change only
got promoted to an E_WARNING.
Thus, this suggestion is not possible until the major version, but I do
prefer this compared to the weird $a[...] syntax.
Bbet regards,
George P. Banyard
czw., 6 kwi 2023, 13:53 użytkownik Vorisek, Michael vorismi3@fjfi.cvut.cz
napisał:
Hi Ilija,
- Are integer keys preserved? I'm assuming no, as otherwise it would
be the same as$a + $b
.$arr[...] = $arr;
should be the same as
foreach ($arr as $v) { $arr[] = $v; }
I'd argue with that because I think the spread operator should be
consistent with preserve keys strategy used in other places.
Cheers,
Michał Marcin Brzuchalski
The goal is fast multi-item array append, which is currently not possible, as +=
and array_push
operators does something else and array_merge
is very slow as the whole array is copied.
Currently the fastest approach is to use the foreach equivalent, which requires much more code and two intermediate variables for k/v.
For better understanding, I have updated the example codes in https://github.com/php/php-src/issues/10791 but sadly all short codes are slow as array is copied.
Michael
From: Michał Marcin Brzuchalski michal.brzuchalski@gmail.com
Sent: Thursday, April 6, 2023 5:01 PM
To: Vorisek, Michael vorismi3@fjfi.cvut.cz
Cc: PHP internals internals@lists.php.net
Subject: Re: [PHP-DEV] Array spread append
czw., 6 kwi 2023, 13:53 użytkownik Vorisek, Michael <vorismi3@fjfi.cvut.czmailto:vorismi3@fjfi.cvut.cz> napisał:
Hi Ilija,
- Are integer keys preserved? I'm assuming no, as otherwise it would
be the same as$a + $b
.
$arr[...] = $arr;
should be the same as
foreach ($arr as $v) { $arr[] = $v; }
I'd argue with that because I think the spread operator should be consistent with preserve keys strategy used in other places.
Cheers,
Michał Marcin Brzuchalski
Hi
I would like to open a discussion for https://github.com/php/php-src/issues/10791 .
For the discussion phase to actually be open (with regard to the minimal
discussion period before a vote on an RFC may be opened), an actual RFC
needs to exist.
I am unable to implement it, but I am ready to help with RFC.
Having followed along this existing mailing list thread, I feel like it
makes sense that you actually write up an RFC, because it would make the
discussion much more organized, as all the questions and edge cases can
be answered in a single place, instead of being scattered around in
multiple emails.
Having an actual well-written and complete RFC text should also make it
easier for you to find some developer that is capable and willing to
assist you with the implementation, because it's then clear what you
desire to be implemented.
Details with regard to "How to RFC" can be found in the Wiki:
https://wiki.php.net/rfc/howto. Step (1) is resolved by this mailing
list thread and "detailed RFC will clarify the proposal" definitely
applies here. However please note the part of step (1) that says
something about top-posting.
Best regards
Tim Düsterhus
Hi
Hello,
I would like to open a discussion for https://github.com/php/php-src/issues/10791 .
[https://opengraph.githubassets.com/a23cb565cc8acac6a33ecab5d9ee68a46f046a1ffe215501673156e506695430/php/php-src/issues/10791]https://github.com/php/php-src/issues/10791
Array spread append · Issue #10791 · php/php-srchttps://github.com/php/php-src/issues/10791
Description Currently spread operator can be used for almost anything. But not for array append. I propose the following to be supported: <?php $arr = [1, 2]; $arr2 = [3, 4]; $arr[...] = $arr2; // ...
github.com
Appending N elements to an array is quite common language usage pattern and I belive it should be supported natively for shorter syntax, language consistency and performance.I am unable to implement it, but I am ready to help with RFC.
Michael Vorisek
As far as I understand, this is somewhat of a continuation of https://github.com/php/php-src/issues/9881, in which the concern about the array_merge performance was posted. This in turn resulted in this feature proposal if I understand correctly.
I think $x = array_merge($x, ...); is probably a common pattern. Creating a new function or a new syntax will make it more difficult for users to know when to use the array_merge approach or when to use the new append approach.
I saw the original comments on the array_merge GitHub thread, and read about the optimization difficulty.
The reason there's a performance penalty is due the first array always being copied because:
- it may be modified (i.e. keys made sequential)
- it may be shared in more places
However, I think it might be worth adding an optimization for the most common $x = array_merge($x, ...) case.
To do this:
- we need to check if the array is sequential. For packed array this can be done with HT_IS_WITHOUT_HOLES().
- we need to detect that coding pattern. This can be done by checking if there's a ZEND_ASSIGN after the call that assigns the result to the first passed array variable.
- array must not be immutable
- array must have refcount 2: once as variable and once as argument
I implemented this optimization in the following commit on my fork: https://github.com/nielsdos/php-src/commit/269d547043c9d3c59d7c9d773eb1ef174dcc5c44
For now it's limited to packed arrays, and it's only a proof-of-concept. I'm also not sure if it's 100% correct, I'd need to do some more debugging / testing to know for sure, but I'm short on time now.
The performance improvement for the array_merge case in https://github.com/php/php-src/issues/10993#issuecomment-1493166391 is big:
1.32s without my patch, vs 0.008s with my patch
I think this could be made more generic, and be cleaned up.
But I don't know if something like this is desired in PHP.
Kinds regards
Niels
Hi
I think this could be made more generic, and be cleaned up.
But I don't know if something like this is desired in PHP.
Yes, please. I believe that “performance” should not influence API
design if it can be avoided. Instead the heavy lifting of optimization
should be done by the engine (see also:
https://externals.io/message/118896#118921).
The suggested optimization of "the input is overwritten with the output"
would then also allow to avoid introducing reference parameters just for
optimization purposes. The sort()
family comes to my mind and also the
shuffle()
function. Randomizer::shuffleArray() already returns a copy
and thus would benefit from the proposed optimization for $a =
$r->shuffleArray($a).
Best regards
Tim Düsterhus
Hey Tim
Hi
I think this could be made more generic, and be cleaned up.
But I don't know if something like this is desired in PHP.Yes, please. I believe that “performance” should not influence API design if it can be avoided. Instead the heavy lifting of optimization should be done by the engine (see also: https://externals.io/message/118896#118921).
I agree.
The suggested optimization of "the input is overwritten with the output" would then also allow to avoid introducing reference parameters just for optimization purposes. The
sort()
family comes to my mind and also theshuffle()
function. Randomizer::shuffleArray() already returns a copy and thus would benefit from the proposed optimization for $a = $r->shuffleArray($a).
I did extend my optimization since the first time I posted it here.
It can handle two cases:
- $x = array_something($x, ...) like I previously showed with array_merge
- $x = array_something(temporary_array_with_refcount_1,...) which is new
There is one caveat with the first optimization: it is only safe if we know for sure no exception can happen during array modification.
Because, if an exception happens during modification, then the changed array is already visible to the exception handler, but this isn't allowed because the assignment didn't happen yet.
This "exception problem" does not happen for the second optimization though.
So I looked if it was possible to do the optimization for shuffleArray.
It is only possible for the second case, because I see some EG(exception) checks inside php_array_data_shuffle().
Unless we can determine upfront which random algorithms have an exception-free range function.
Best regards
Tim Düsterhus
Kind regards
Niels
Hi
The suggested optimization of "the input is overwritten with the output" would then also allow to avoid introducing reference parameters just for optimization purposes. The
sort()
family comes to my mind and also theshuffle()
function. Randomizer::shuffleArray() already returns a copy and thus would benefit from the proposed optimization for $a = $r->shuffleArray($a).I did extend my optimization since the first time I posted it here.
It can handle two cases:
- $x = array_something($x, ...) like I previously showed with array_merge
- $x = array_something(temporary_array_with_refcount_1,...) which is new
There is one caveat with the first optimization: it is only safe if we know for sure no exception can happen during array modification.
Because, if an exception happens during modification, then the changed array is already visible to the exception handler, but this isn't allowed because the assignment didn't happen yet.
This "exception problem" does not happen for the second optimization though.
I see. That makes stuff certainly more complicated, because an exception
can also arise in an error handler (i.e. for any warnings and notices).
It's also not just about visibility to the exception handler, but also
"incomplete modifications" in cases like these:
try {
$foo = something($foo);
} catch (\Exception) {}
// $foo might or might not be fully modified.
So I looked if it was possible to do the optimization for shuffleArray.
It is only possible for the second case, because I see some EG(exception) checks inside php_array_data_shuffle().
Unless we can determine upfront which random algorithms have an exception-free range function.
For the internal engines this is easy (when ignoring extensions):
- Mt19937
- PcgOneseq128XslRr64
- Xoshiro256StarStar
... are all infallible.
- Secure
... is fallible (it doesn't fail in practice on modern OSes, though [1])
Any userland engine (engine_user.c) can do anything it wants, of course.
Unfortunately with Secure being fallible, this optimization is of little
use in practice. The same would likely be true for a non-reference
sorting function, because of incomparable values and userland comparison
handlers that can all kinds of unsafe stuff.
Best regards
Tim Düsterhus
[1] For a future major we might be able to make a working CSPRNG a
hard requirement.
Hi
Hi
The suggested optimization of "the input is overwritten with the output" would then also allow to avoid introducing reference parameters just for optimization purposes. The
sort()
family comes to my mind and also theshuffle()
function. Randomizer::shuffleArray() already returns a copy and thus would benefit from the proposed optimization for $a = $r->shuffleArray($a).I did extend my optimization since the first time I posted it here.
It can handle two cases:
- $x = array_something($x, ...) like I previously showed with array_merge
- $x = array_something(temporary_array_with_refcount_1,...) which is new
There is one caveat with the first optimization: it is only safe if we know for sure no exception can happen during array modification.
Because, if an exception happens during modification, then the changed array is already visible to the exception handler, but this isn't allowed because the assignment didn't happen yet.
This "exception problem" does not happen for the second optimization though.I see. That makes stuff certainly more complicated, because an exception can also arise in an error handler (i.e. for any warnings and notices).
Right. Thanks for pointing this out! I completely forgot about this.
That's unfortunate... So far only the array_unique optimization will have trouble with this. I added a check now that checks if a user handler is installed.
In any case, if anyone is interested, I created a PR against my fork (https://github.com/nielsdos/php-src/pull/5) where I'm tinkering with the optimization idea.
It's also not just about visibility to the exception handler, but also "incomplete modifications" in cases like these:
try {
$foo = something($foo);
} catch (\Exception) {}// $foo might or might not be fully modified.
So I looked if it was possible to do the optimization for shuffleArray.
It is only possible for the second case, because I see some EG(exception) checks inside php_array_data_shuffle().
Unless we can determine upfront which random algorithms have an exception-free range function.For the internal engines this is easy (when ignoring extensions):
- Mt19937
- PcgOneseq128XslRr64
- Xoshiro256StarStar
... are all infallible.
- Secure
... is fallible (it doesn't fail in practice on modern OSes, though [1])
Any userland engine (engine_user.c) can do anything it wants, of course. Unfortunately with Secure being fallible, this optimization is of little use in practice. The same would likely be true for a non-reference sorting function, because of incomparable values and userland comparison handlers that can all kinds of unsafe stuff.
That's unfortunate. Then it does indeed seem like the optimization will not work well for your use case :/
Best regards
Tim Düsterhus[1] For a future major we might be able to make a working CSPRNG a hard requirement.
Kind regards
Niels