I've been wondering for some time why PHP does not have the ability to
efficiently insert string keys before/after an existing string key.
Let's say I have an array of 10,000 string key/value pairs and I want to
insert five new string key/value pairs at specific positions in the
existing array. Unless I've missed something, despite looking at each
of the array_...() functions, this is not possible in userland without
copying the entire array to insert the new elements (or some real
hackery such as append with some tracking info and use a function like
uksort()
to somehow only move the newly added items). There are several
userland functions in the PHP documentation comments and also on
StackOverflow that end up recreating the array each time they are
called. Several examples:
https://stackoverflow.com/a/7257599/917198
http://us1.php.net/manual/en/function.array-splice.php
As the author of the OrderedHash implementation here:
https://github.com/cubiclesoft/cross-platform-cpp
I provide an ordered hash data structure that is similar-ish to the PHP
array implementation, but it also has support for inserting a node
anywhere in the OrderedHash by making a few pointer adjustments.
Shifting a few pointers around is usually magnitudes faster than
performing memory allocations and copying data.
--
Thomas Hruska
CubicleSoft President
I've got great, time saving software that you will find useful.
And once you find my software useful:
On Fri, Nov 3, 2017 at 9:51 AM, Thomas Hruska thruska@cubiclesoft.com
wrote:
I've been wondering for some time why PHP does not have the ability to
efficiently insert string keys before/after an existing string key.Let's say I have an array of 10,000 string key/value pairs and I want to
insert five new string key/value pairs at specific positions in the
existing array. Unless I've missed something, despite looking at each of
the array_...() functions, this is not possible in userland without copying
the entire array to insert the new elements
It's impossible in user-land because, well, that's just how HashTable works
under the hood. Quoting nikic 1:
The arData array stores the elements in order of insertion. So the first
array element will be stored in arData[0], the second in arData1 etc.
This does not in any way depend on the used key, only the order of
insertion matters here.
That last part is important: the keys from user-land have no bearing on the
order in memory. If one wants to effect the in-memory ordering, one must
create a new user-land array and insert them in the desired order.
As the author of the OrderedHash implementation here:
https://github.com/cubiclesoft/cross-platform-cpp
I provide an ordered hash data structure that is similar-ish to the PHP
array implementation, but it also has support for inserting a node anywhere
in the OrderedHash by making a few pointer adjustments. Shifting a few
pointers around is usually magnitudes faster than performing memory
allocations and copying data.
Shifting elements in a HashTable.arData would not be cheap. But, honestly I
can't think of a scenario where changing the in-memory order would be
necessary. If I want an array that traverses in order, I'll insert in
order. Otherwise, I'll sort on demand. What seems to matter most is
efficiency of lookup by key.
--
Thomas Hruska
CubicleSoft President
bishop
On Fri, Nov 3, 2017 at 9:51 AM, Thomas Hruska thruska@cubiclesoft.com
wrote:http://nikic.github.io/2014/12/22/PHPs-new-hashtable-implementation.html
Ah. Very clever and cool. That memory layout does indeed pose a
problem with random order insertions of string keys. Detachable nodes
are more useful to me in C++ for a general-purpose data structure but I
can see the benefits of doing this for PHP 7, which takes advantage of
the most common array usage patterns (i.e. an array).
After reading that post, I was inspired to implement PackedOrderedHash:
https://github.com/cubiclesoft/cross-platform-cpp
I can definitely see why PHP 7 adopted the structure. My own benchmarks
of PackedOrderedHash show significant performance improvements.
You'll note that in PackedOrderedHash I excluded inserting just anywhere
- likely for the same reason that PHP doesn't offer such a feature. I
still think PHP could natively offer string key-based insertions even
though it'll require moving elements of an array around. That's likely
to be faster than rebuilding the whole array.
Thanks for the long writeup and Nikic for the blog post.
--
Thomas Hruska
CubicleSoft President
I've got great, time saving software that you will find useful.
And once you find my software useful: