Hi,
I think zend_hash_compare could get a healthy speed boost in some
cases if it first checked whether the two variables passed to it
are actually the same (because of "reference counting"). Sorry, my C
skills are way too rusty to write the patch, which is likely to be just
a few lines long.
Regards,
Karoly Negyesi
A quick patch/test seems to agree with you. I'd be interested to know
whether you or others are able to apply this patch and see any real-world
savings with your web applications. The following was done with a
debug build of PHP, so the results are likely exaggerated. I'm also using
a non-recursive array with 256-byte strings below; results with numeric
values are less significant, of course (approx. a 25% gain)...
I'll also look into applying this to some other functions, like
compare_function, where it's likely to yield larger savings in a
general application.
patch against php-5.2 CVS head:
diff --git a/Zend/zend_hash.c b/Zend/zend_hash.c
index a1d7071..d11785f 100644
--- a/Zend/zend_hash.c
+++ b/Zend/zend_hash.c
@@ -1327,16 +1327,14 @@ ZEND_API int zend_hash_compare(HashTable *ht1, HashTable *ht2, compare_func_t co
 	IS_CONSISTENT(ht1);
 	IS_CONSISTENT(ht2);
 
-	HASH_PROTECT_RECURSION(ht1);
-	HASH_PROTECT_RECURSION(ht2);
-
 	result = ht1->nNumOfElements - ht2->nNumOfElements;
-	if (result!=0) {
-		HASH_UNPROTECT_RECURSION(ht1);
-		HASH_UNPROTECT_RECURSION(ht2);
+	if (ht1 == ht2 || result!=0) {
 		return result;
 	}
 
+	HASH_PROTECT_RECURSION(ht1);
+	HASH_PROTECT_RECURSION(ht2);
+
 	p1 = ht1->pListHead;
 	if (ordered) {
 		p2 = ht2->pListHead;
simple test script:
<?php
$arr1 = array();
for ($i = 0; $i < 10000; $i++) {
    $arr1[$i] = str_repeat('x', 256);
}
$arr2 = array();
for ($i = 0; $i < 10000; $i++) {
    $arr2[$i] = str_repeat('x', 256);
}
$arr3 = $arr1;

$count = 0;
$start = microtime(true);
for ($i = 0; $i < 1000; $i++) {
    if ($arr1 == $arr2) {
        $count++;
    }
}
$stop = microtime(true);
echo "different array time: ".($stop-$start)."\n";
echo "count: $count \n";

$count = 0;
$start = microtime(true);
for ($i = 0; $i < 1000; $i++) {
    if ($arr1 == $arr3) {
        $count++;
    }
}
$stop = microtime(true);
echo "identical array time: ".($stop-$start)."\n";
echo "count: $count \n";
(un-patched PHP build)
shire@shirebook:~/data/php/git/php$ ./sapi/cli/php.vanilla test.php
different array time: 4.2019698619843
count: 1000
identical array time: 2.4957029819489
count: 1000
(patched PHP build)
shire@shirebook:~/data/php/git/php$ ./sapi/cli/php test.php
different array time: 4.059928894043
count: 1000
identical array time: 0.00043511390686035
count: 1000
-shire
Hello again,
once again top-posting, since this is fairly old ..
however, for something that sounds so promising, I am wondering why
this hasn't been picked up ..
regards,
Lukas
Hi,
I played with a patched version of compare_function with this optimization
in it. Other than for strings, it didn't seem to make much of a difference. A
quick test showed that in an average PHP application only about 0.15% of
calls to compare_function actually have the same op1 and op2. Considering it
doesn't make much of a difference for anything except string comparison (and
maybe arrays?), it probably isn't worth putting it in compare_function. This
certainly is not a common case, and it would only serve to add an extra
needless comparison and branch to the other 99% of calls to
compare_function. Also, it seems as if in most cases this type of behavior
could be detected and handled appropriately at compile time by a good
optimizer.
However, while the common compare_function call probably wouldn't gain
anything from this, maybe it could be put in the TYPE_PAIR(IS_STRING,
IS_STRING) case, or into zendi_smart_strcmp (which only ever seems to be
called by the TYPE_PAIR(IS_STRING, IS_STRING) case of compare_function).
I haven't played with the patched version of zend_hash_compare to see if it
might provide any real-world savings, though it appears as if that function
might see a significant improvement for these cases.
~ Graham Kelly
Hi,
I started looking at this briefly, and I agree with what you describe
above. I was curious to see if anyone came back with any real-world
applications where they were hitting this scenario, but it seems
unlikely.
Arrays and strings do seem to be the only really worthwhile comparisons
for this type of optimization, but only because the costs of comparing
long strings and large arrays make the savings more significant.
I think it might be hard to see this in a real-world application
though, so it would be nice if someone could justify the change by
showing test results. I'll try to bench Facebook's code base to see
what we can get out of it.
-shire
Hi,
I'd completely agree with that patch - it's very useful for me, since I only
build applications based on arrays.
I'll try to apply some other modifications - and maybe we've got it ;-D
- benchmarking the app core -
Yours,
(c) Kenan Sulayman
Freelance Designer and Programmer
Life's Live Poetry
I just ran a quick benchmark on this and I'm not seeing a significant
real-world change for Facebook's codebase (definitely less than 1%).
This was a pretty small test, without a lot of code execution, so I
could see other applications doing better. I'm pretty neutral on this
one; it's not a really big change, so it might be worth adding if it's
going to give a few applications out there a gain, but I couldn't see
doing this everywhere, of course.
-shire
Hi,
Something that could give arrays a boost would be to:
- pool all string literals in each file
- give each string literal an id (unique in the loaded environment)
- bind early all array accesses that use string literals (a common occurrence), so
they don't need to be resolved with a hashmap lookup, but rather with a direct
string-literal id lookup.
The benefits of this approach are:
- There's no need to generate a hashmap in the first place
- String-literal ids are unique integers and have no conflicts, so there's no need
to do conflict resolution.
That would help with objects as well.
Regards,
Stan Vassilev
This wouldn't really help with the case here of "if ($array1 ==
$array2)..." though, right? (Not to say it's not good, just making
sure I understand ;-) ).
It sounds like this would only work if the array contents were static
though, as you're mapping a constant string to the contents of the
hash (or did I misunderstand, and you'd be mapping string constant values
to hash IDs?).
I am at the point now where my #1 item CPU-wise is my hash function.
I'm working on finding a slightly faster implementation, but I'm also
playing with having a hash value stored in every zval so that the hash
function call can be skipped if we've already run it once. The
problem that comes into play here is that there's no central code to
update or clear the hash if the string or contents of the zval
change. Of course, doing this with constant values would be easier to
accomplish and could be done at compile time, and like you say, this is
a pretty common scenario.
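Roughly, what I'm playing with looks like this. Illustrative stand-ins
only: this is not the real zval layout, and the invalidation hook is of
course the hard part just described.

/* Illustrative stand-in: a string value carrying a cached hash.
 * 0 doubles as the "not yet computed" sentinel for brevity. */
typedef struct {
    char          *val;
    int            len;
    unsigned long  h;    /* cached hash; 0 = not computed yet */
} cached_str;

/* DJBX33A-style hash, the same scheme zend_hash's inline hash uses. */
static unsigned long str_hash(const char *s, int len)
{
    unsigned long h = 5381;

    while (len-- > 0) {
        h = ((h << 5) + h) + (unsigned char)*s++;
    }
    return h;
}

/* Compute the hash once, then reuse it on every later lookup. */
static unsigned long cached_hash(cached_str *z)
{
    if (z->h == 0) {
        z->h = str_hash(z->val, z->len);
    }
    return z->h;
}

/* Every code path that mutates z->val has to clear the cache; this is
 * exactly the "no central place to do it" problem. */
static void invalidate_hash(cached_str *z)
{
    z->h = 0;
}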
-shire
> This wouldn't really help with the case here of "if ($array1 ==
> $array2)..." though, right?
Yes, I'm talking about speeding up scenarios full of hash lookups in general.
> It sounds like this would only work if the array contents were static
> though, as you're mapping a constant string to the contents of the hash
> (or did I misunderstand, and you'd be mapping string constant values to
> hash IDs?)
My point is, replacing this process:
$a['foo'] or $a->foo -> compute hash of 'foo' -> find item for hash 'foo' ->
many items? -> resolve conflict -> return item
With this process:
$a[% string_literal_id_5 %] -> look up item with key 5 in the array -> return item
Notice we skipped hash generation and conflict resolution altogether. We
only have the lookup for the integer id.
If some additional work is done, even this lookup can be eliminated, making
this an O(1) process.
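A minimal sketch of that lookup step. All the names here are hypothetical,
just to illustrate the shape of it; nothing below exists in the engine.

#include <stddef.h>

#define LITERAL_POOL_SIZE 64  /* hypothetical per-file literal pool size */

/* Illustrative array "shape": the compiler has given every string
 * literal in the file a small unique integer id, and this table maps
 * that id straight to a value slot. */
typedef struct {
    int  slot_of_literal[LITERAL_POOL_SIZE]; /* literal id -> slot, -1 if absent */
    long slots[LITERAL_POOL_SIZE];           /* the stored values */
} shaped_array;

/* What $a['foo'] could compile to once 'foo' is bound to, say,
 * literal id 5: a single integer-indexed load, with no hashing and
 * no conflict resolution on the access path. */
static long *fetch_by_literal_id(shaped_array *a, int literal_id)
{
    int slot = a->slot_of_literal[literal_id];
    return (slot < 0) ? NULL : &a->slots[slot];
}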
If instead the coder used a variable: $a[$bar] or $a->$foo (a variable array
lookup and a variable-variable object lookup), then this optimization can't
kick in, and the existing algorithm will be used.
However, "static" access is the predominant usage, especially for objects,
but also for arrays, so this should have a significant impact.
Regards, Stan Vassilev
> Yes, I'm talking about speeding up scenarios full of hash lookups in
> general.
Ok, it still seems like this particular optimization might also
provide additional benefits (albeit perhaps not nearly as significant).
Thanks for the clarification; this is pretty much the same idea as
what I've been interested in working on next. I think I was more
inclined to store an extra hash value within the zvals themselves,
with the hope that this could be expanded to non-constant values. I
believe Ruby implements its lookups this way (noted just for
reference, not as an argument to copy another language ;-) ). Any
thoughts on reasons not to do this (other than increasing the size of
the zval struct)? It's pretty simple to implement this for static values, I
believe; dynamic values are a lot more difficult, obviously...
-shire
Since nobody else has chimed in with the obvious (to me, anyways):
I've worked with some code that uses disgustingly huge (>512MB)
arrays; the largest implementation was a single 2.5GB array (before we
took the offending programmer into a room and had a... chat).
I'd be interested in seeing some metrics on the extra CPU ticks
needed to determine whether an array (or array sub-element) is static or
dynamic under this scheme, as well as the extra memory for storing an
(or many?) extra value(s). It sounds like it might be totally livable
if done right... done wrong, we could be looking at millions of CPU
hits for checking millions of single-element static arrays... (and
yes, storing millions of values as single-element arrays is "doing it
wrong", but I've learned not to underestimate the creativity of
people who write software).
Oh, and while we're at it, what about "re-assigning" "static" arrays?
The idea sounds good; the corner cases on mis-implementations are
where it always becomes amusing.
-Bop
That's not exactly what I had in mind: I don't split whole arrays into
static and dynamic (that could be an entirely orthogonal optimization to
what I'm saying, similar to what Chrome's V8 engine currently does).
We just know a lot more at compile time about member lookups that are
defined by a string literal:
$obj->foo(), $obj->foo and $arr['foo']
as opposed to fully dynamic accesses which we don't know at parse time:
$obj->$bar(), $obj->$bar and $arr[$bar]
And we can optimize based on that. I've not spent significant time with the
PHP source code yet, to be honest, but if no one else is willing to look at it, I
could start tinkering with it in my free time and post patches if I arrive
at anything. Then we can test how it impacts real-world code.
Regards, Stan Vassilev