I noticed this commit
https://github.com/facebook/hhvm/commit/4167d66fcfa81a1c2a19e853282fac968d9cd454
recently in HHVM, which makes array_values()
return the same array back if
it's packed instead of building a copy.
My understanding is PHP 7 also has packed arrays like HHVM, and my reading
of PHP's array_values()
implementation (here
https://github.com/php/php-src/blob/c6982995504b6e21e8a5ade29cfb16a55196dc43/ext/standard/array.c#L4000)
is that it always creates a copy of the array, even if it's packed.
Would this be a worthwhile optimisation to make in PHP as well?
On Mon, Mar 13, 2017 at 6:34 AM, Jesse Schalken me@jesseschalken.com
wrote:
I noticed this commit
<https://github.com/facebook/hhvm/commit/4167d66fcfa81a1c2a1
9e853282fac968d9cd454>
recently in HHVM, which makesarray_values()
return the same array back if
it's packed instead of building a copy.My understanding is PHP 7 also has packed arrays like HHVM, and my reading
of PHP'sarray_values()
implementation (here
<https://github.com/php/php-src/blob/c6982995504b6e21e8a5ade
29cfb16a55196dc43/ext/standard/array.c#L4000>)
is that it always creates a copy of the array, even if it's packed.Would this be a worthwhile optimisation to make in PHP as well?
That would be nice if array_values()
wouldn't reindex the array.
But it does.
$a = [12=> 'foo', 42 => 'bar']; // packed array
$b = array_values($a);
$b // [0 => 'foo', 42 => 'bar']; // keys have been changed and are not
kept from source
So, array_values()
must allocate a new array, and fill it in with the
values of source, because array_values()
reindexes the source array from
key 0 ; whatever was the original array.
Note that the allocation is very light, in PHP 7, we allocate a full buffer
at once , and not one buffer for each slot. The performance inpact of such
an allocation is really really tiny.
Julien.Pauli
I noticed this commit
<https://github.com/facebook/hhvm/commit/4167d66fcfa81a1c2a1
9e853282fac968d9cd454>
recently in HHVM, which makesarray_values()
return the same array back if
it's packed instead of building a copy.My understanding is PHP 7 also has packed arrays like HHVM, and my reading
of PHP'sarray_values()
implementation (here
<https://github.com/php/php-src/blob/c6982995504b6e21e8a5ade
29cfb16a55196dc43/ext/standard/array.c#L4000>)
is that it always creates a copy of the array, even if it's packed.Would this be a worthwhile optimisation to make in PHP as well?
That would be nice if
array_values()
wouldn't reindex the array.But it does.
$a = [12=> 'foo', 42 => 'bar']; // packed array
$b = array_values($a);$b // [0 => 'foo', 42 => 'bar']; // keys have been changed and are not
kept from sourceSo,
array_values()
must allocate a new array, and fill it in with the
values of source, becausearray_values()
reindexes the source array from
key 0 ; whatever was the original array.Note that the allocation is very light, in PHP 7, we allocate a full buffer
at once , and not one buffer for each slot. The performance inpact of such
an allocation is really really tiny.
Minor nit: [12=>'foo', 42=>'bar'] is not a packed array.
[0=>'foo', 2=>'bar'] is however, so your primary point stands.
However it should be simple enough to detect when a packed array is
also vector-like (indexed from 0 to n-1) and make this minor
optimization.
-Sara
Minor nit: [12=>'foo', 42=>'bar'] is not a packed array.
[0=>'foo', 2=>'bar'] is however, so your primary point stands.
However it should be simple enough to detect when a packed array is
also vector-like (indexed from 0 to n-1) and make this minor
optimization.-Sara
I didn't realise packed arrays could still have gaps. If that's the case,
why can't [12=>'foo', 42=>'bar'] be packed (index 0-11 and 13-41 are
undefined)?
I agree checking if the array is both packed and without gaps would still
make a worthwhile optimisation.
I didn't realise packed arrays could still have gaps. If that's the case,
why can't [12=>'foo', 42=>'bar'] be packed (index 0-11 and 13-41 are
undefined)?
It can be, but in practice isn't likely to be since that's a lot of
empty space for a two element array. It will end up mixed pretty
quickly before you get to that point.
I agree checking if the array is both packed and without gaps would still
make a worthwhile optimisation.
Oh yeah, btw... six hours ago...
https://github.com/php/php-src/commit/c74bc87c74f48bc55541b3bf2fc67d595f58a3b5
-Sara
Oh yeah, btw... six hours ago...
https://github.com/php/php-src/commit/c74bc87c74f48bc55541b3bf2fc67d
595f58a3b5-Sara
Awesome, thanks Sara.
On Wed, Mar 15, 2017 at 12:58 AM, Jesse Schalken me@jesseschalken.com
wrote:
Oh yeah, btw... six hours ago...
https://github.com/php/php-src/commit/c74bc87c74f48bc55541b3
bf2fc67d595f58a3b5-Sara
Awesome, thanks Sara.
Nice Sara.
Yep my example wasn't that good , but such an array with holes could be
packed, thus pretty unlikely.
For interested souls , I detailed that in an article of mine, targeting 7.0
, at http://jpauli.github.io/2016/04/08/hashtables.html
Julien.P