Hi all,
I have a question about the internal implementation of PHP's hashtables. I
did some researches, but I didn't find the answer.
Here is an example of what I would like to understand.
Start by creating an array:
$a = array();
Fill it, using implicit and explicit keys:
$a[] = 'cc';
$a[1] = 'dd';
If we look at what's inside, we get:
Array
(
[0] => cc
[1] => dd
)
OK, everything is obvious. Now, I add a value at the beginning of the array:
array_unshift($a, 'ee');
Do a print_r()
again:
Array
(
[0] => ee
[1] => cc
[2] => dd
)
As you can see, the keys for 'cc' and 'dd' have been recalculated. It works
as espected.
My question is how does it work? Are all numeric keys computed when the
array_shift()
is done? Or is the iterator calculating them on-the-fly?
Thanks.
Amaury
Hi all,
I have a question about the internal implementation of PHP's hashtables. I
did some researches, but I didn't find the answer.Here is an example of what I would like to understand.
Start by creating an array:
$a = array();Fill it, using implicit and explicit keys:
$a[] = 'cc';
$a[1] = 'dd';If we look at what's inside, we get:
Array
(
[0] => cc
[1] => dd
)OK, everything is obvious. Now, I add a value at the beginning of the array:
array_unshift($a, 'ee');Do a
print_r()
again:
Array
(
[0] => ee
[1] => cc
[2] => dd
)As you can see, the keys for 'cc' and 'dd' have been recalculated. It works
as espected.
My question is how does it work? Are all numeric keys computed when the
array_shift()
is done? Or is the iterator calculating them on-the-fly?Thanks.
Amaury
To clarify, this particular functionality you're using as an example
"array_unshift" really isn't specific to the internal implementation
of hashtables in PHP. That is to say that this side-effect you're
describing is specific to that function and not necessarily
hashtables' internal implementation.
Essentially, array_unshift()
just rebuilds the array. From
http://php.net/array-unshift "All numerical array keys will be
modified to start counting from zero while literal keys won't be
touched"
If we were to write a PHP user-land implementation of array_unshift()
it would probably look something like this...
function array_unshift(&$array) {
$new_array = array();
foreach (array_slice(func_get_args(),1) as $e) {
$new_array[] = $e;
}
foreach ($array as $k => $v) {
if (is_numeric($k)) {
$new_array[] = $v;
} else {
$new_array[$k] = $v;
}
}
$array = $new_array;
return count($array);
}
If you're interested in seeing the actual code for array_unshift's
internal implementation you can take a look here:
http://lxr.php.net/xref/PHP_5_4/ext/standard/array.c#2049
As for how the internal implementation of hashtables in PHP, that can
be quite an extensive topic. PHP uses hashtables for pretty much
everything internally. The hashtable struct can be seen here
http://lxr.php.net/xref/PHP_5_4/Zend/zend_hash.h#66
If you're interested in a more detailed explanation of the
implementation of PHP's Array type you might find this article by
Nikic useful: http://nikic.github.com/2012/03/28/Understanding-PHPs-internal-array-implementation.html
2012/9/2 Sherif Ramadan theanomaly.is@gmail.com
To clarify, this particular functionality you're using as an example
"array_unshift" really isn't specific to the internal implementation
of hashtables in PHP. That is to say that this side-effect you're
describing is specific to that function and not necessarily
hashtables' internal implementation.
OK, thanks for the information. It explains why I didn't find anything in
the HashTable structure (unlike iterator pointer or the next free key).
Essentially, array_unshift()
just rebuilds the array. From
http://php.net/array-unshift "All numerical array keys will be
modified to start counting from zero while literal keys won't be
touched"
You're right; I hadn't noticed that.
Side effect: add values at the beginning of an array could be very
time-consuming, according to its size.
If you're interested in a more detailed explanation of the
implementation of PHP's Array type you might find this article by
Nikic useful:
http://nikic.github.com/2012/03/28/Understanding-PHPs-internal-array-implementation.html
Thanks again for this link. I have the book of Sara (huge piece of work,
btw), and I found some good documentation. But more information wouldn't
hurt! :-)
OK, thanks for the information. It explains why I didn't find anything in
the HashTable structure (unlike iterator pointer or the next free key).
Yes, because the hashtable is an ordered map. The only way to
logically prepend to it is to rework the whole thing. That's generally
how you prepend to a stack as well. You'll probably find that in most
C implementations of a stack or a linked list the only way to prepend
is to rebuild the entire stack or list.
That's why the existing implementation is just doing the equivalent of
what I demonstrated in PHP code (it creates a new hashtable).
Line 2053 of ext/standard/array.c
HashTable new_hash; / New hashtable for the stack */
You're right; I hadn't noticed that.
Side effect: add values at the beginning of an array could be very
time-consuming, according to its size.
It's generally more costly to do so, yes.
Thanks again for this link. I have the book of Sara (huge piece of work,
btw), and I found some good documentation. But more information wouldn't
hurt! :-)
Cheers :)