I'd like to show you my recent work on a jumptable optimization for switches.
https://github.com/php/php-src/pull/1048 https://github.com/php/php-src/pull/1048
It is a fully transparent optimization of switches, by putting a new ZEND_SWITCH opcode at the top of the switch in case we can build a jumptable, which is the case if the cases are only scalars (no doubles) or evaluate to them at compile-time.
Switches tend to be big sometimes with a lot of static (literals or constants usually) cases, which was a comparison for each case, and now just a simple hashtable lookup is needed, which greatly improves performance. In case where the result of the switch can be determined at compile-time, it even replaces the comparisons with a simple ZEND_JMP opcode.
In synthetic benchmarks the results are mind blowing, but in practice, benefits mostly are stateful userland parsers (often called with sometimes big switches), where it makes it a few percent faster. (For more concrete numbers see https://github.com/php/php-src/pull/1048#issuecomment-73032647)
As the optimization is only done if there are 3 or more cases, the lookup is always faster than the single comparisons. So there's no negative performance impact.
It already works, except with opcache where the CFG optimization pass still lacks support. Dmitry or Laruence volunteered to add support for opcache.
I'd like to know if there are any technical objections to the PR. If not, they will add opcache support and merge it in.
Thanks,
Bob
I'd like to show you my recent work on a jumptable optimization for switches.
https://github.com/php/php-src/pull/1048 https://github.com/php/php-src/pull/1048
It is a fully transparent optimization of switches, by putting a new ZEND_SWITCH opcode at the top of the switch in case we can build a jumptable, which is the case if the cases are only scalars (no doubles) or evaluate to them at compile-time.
Switches tend to be big sometimes with a lot of static (literals or constants usually) cases, which was a comparison for each case, and now just a simple hashtable lookup is needed, which greatly improves performance. In case where the result of the switch can be determined at compile-time, it even replaces the comparisons with a simple ZEND_JMP opcode.
In synthetic benchmarks the results are mind blowing, but in practice, benefits mostly are stateful userland parsers (often called with sometimes big switches), where it makes it a few percent faster. (For more concrete numbers see https://github.com/php/php-src/pull/1048#issuecomment-73032647)
As the optimization is only done if there are 3 or more cases, the lookup is always faster than the single comparisons. So there's no negative performance impact.
It already works, except with opcache where the CFG optimization pass still lacks support. Dmitry or Laruence volunteered to add support for opcache.
I'd like to know if there are any technical objections to the PR. If not, they will add opcache support and merge it in.
It looks ok to me if Xinchen or Dmitry gets the opcache support working
for it and the feature is stabilized in time. I gave it a try and it is
segfaulting for me (no opcache):
==5612== Command: sapi/cli/php -n jump.php
==5612==
==5612== Invalid read of size 4
==5612== at 0xA64637: _zend_is_inconsistent (zend_hash.c:44)
==5612== by 0xA65749: _zend_hash_add_or_update_i (zend_hash.c:427)
==5612== by 0xA65AF9: _zend_hash_add (zend_hash.c:489)
==5612== by 0xA37E28: zend_hash_add_ptr (zend_hash.h:458)
==5612== by 0xA39B0A: zend_hash_add_constant (zend_constants.c:486)
==5612== by 0xA39D01: zend_register_constant (zend_constants.c:523)
==5612== by 0xA389E9: zend_register_long_constant (zend_constants.c:194)
==5612== by 0xA3855C: zend_register_standard_constants
(zend_constants.c:116)
==5612== by 0xA52D6E: zend_startup (zend.c:671)
==5612== by 0x9C7FAA: php_module_startup (main.c:2099)
==5612== by 0xAFA460: php_cli_startup (php_cli.c:421)
==5612== by 0xAFC4D6: main (php_cli.c:1335)
==5612== Address 0x8 is not stack'd, malloc'd or (recently) free'd
And jump.php is:
<?php
$a = 1;
switch($a) {
case 0: echo 0; break;
case 1: echo 1; break;
case 2: echo 1; break;
case 3: echo 1; break;
case 4: echo 1; break;
case 5: echo 1; break;
case 6: echo 1; break;
case 7: echo 1; break;
case 8: echo 1; break;
case 9: echo 1; break;
}
-Rasmus
Am 17.02.2015 um 00:30 schrieb Rasmus Lerdorf rasmus@lerdorf.com:
I'd like to show you my recent work on a jumptable optimization for switches.
https://github.com/php/php-src/pull/1048 https://github.com/php/php-src/pull/1048
It is a fully transparent optimization of switches, by putting a new ZEND_SWITCH opcode at the top of the switch in case we can build a jumptable, which is the case if the cases are only scalars (no doubles) or evaluate to them at compile-time.
Switches tend to be big sometimes with a lot of static (literals or constants usually) cases, which was a comparison for each case, and now just a simple hashtable lookup is needed, which greatly improves performance. In case where the result of the switch can be determined at compile-time, it even replaces the comparisons with a simple ZEND_JMP opcode.
In synthetic benchmarks the results are mind blowing, but in practice, benefits mostly are stateful userland parsers (often called with sometimes big switches), where it makes it a few percent faster. (For more concrete numbers see https://github.com/php/php-src/pull/1048#issuecomment-73032647)
As the optimization is only done if there are 3 or more cases, the lookup is always faster than the single comparisons. So there's no negative performance impact.
It already works, except with opcache where the CFG optimization pass still lacks support. Dmitry or Laruence volunteered to add support for opcache.
I'd like to know if there are any technical objections to the PR. If not, they will add opcache support and merge it in.
It looks ok to me if Xinchen or Dmitry gets the opcache support working
for it and the feature is stabilized in time. I gave it a try and it is
segfaulting for me (no opcache):==5612== Command: sapi/cli/php -n jump.php
==5612==
==5612== Invalid read of size 4
==5612== at 0xA64637: _zend_is_inconsistent (zend_hash.c:44)
==5612== by 0xA65749: _zend_hash_add_or_update_i (zend_hash.c:427)
==5612== by 0xA65AF9: _zend_hash_add (zend_hash.c:489)
==5612== by 0xA37E28: zend_hash_add_ptr (zend_hash.h:458)
==5612== by 0xA39B0A: zend_hash_add_constant (zend_constants.c:486)
==5612== by 0xA39D01: zend_register_constant (zend_constants.c:523)
==5612== by 0xA389E9: zend_register_long_constant (zend_constants.c:194)
==5612== by 0xA3855C: zend_register_standard_constants
(zend_constants.c:116)
==5612== by 0xA52D6E: zend_startup (zend.c:671)
==5612== by 0x9C7FAA: php_module_startup (main.c:2099)
==5612== by 0xAFA460: php_cli_startup (php_cli.c:421)
==5612== by 0xAFC4D6: main (php_cli.c:1335)
==5612== Address 0x8 is not stack'd, malloc'd or (recently) free'dAnd jump.php is:
<?php
$a = 1;
switch($a) {
case 0: echo 0; break;
case 1: echo 1; break;
case 2: echo 1; break;
case 3: echo 1; break;
case 4: echo 1; break;
case 5: echo 1; break;
case 6: echo 1; break;
case 7: echo 1; break;
case 8: echo 1; break;
case 9: echo 1; break;
}-Rasmus
Hey,
this doesn't really look related to my patch. Did you do something wrong when cloning my branch? Or forget a make clean or similar?
Locally it works for me and echoes 1 as expected.
Bob
Am 17.02.2015 um 00:30 schrieb Rasmus Lerdorf <rasmus@lerdorf.com
mailto:rasmus@lerdorf.com>:I'd like to show you my recent work on a jumptable optimization for
switches.https://github.com/php/php-src/pull/1048
https://github.com/php/php-src/pull/1048It is a fully transparent optimization of switches, by putting a new
ZEND_SWITCH opcode at the top of the switch in case we can build a
jumptable, which is the case if the cases are only scalars (no
doubles) or evaluate to them at compile-time.Switches tend to be big sometimes with a lot of static (literals or
constants usually) cases, which was a comparison for each case, and
now just a simple hashtable lookup is needed, which greatly improves
performance. In case where the result of the switch can be determined
at compile-time, it even replaces the comparisons with a simple
ZEND_JMP opcode.In synthetic benchmarks the results are mind blowing, but in
practice, benefits mostly are stateful userland parsers (often called
with sometimes big switches), where it makes it a few percent faster.
(For more concrete numbers see
https://github.com/php/php-src/pull/1048#issuecomment-73032647)As the optimization is only done if there are 3 or more cases, the
lookup is always faster than the single comparisons. So there's no
negative performance impact.It already works, except with opcache where the CFG optimization pass
still lacks support. Dmitry or Laruence volunteered to add support
for opcache.I'd like to know if there are any technical objections to the PR. If
not, they will add opcache support and merge it in.It looks ok to me if Xinchen or Dmitry gets the opcache support working
for it and the feature is stabilized in time. I gave it a try and it is
segfaulting for me (no opcache):==5612== Command: sapi/cli/php -n jump.php
==5612==
==5612== Invalid read of size 4
==5612== at 0xA64637: _zend_is_inconsistent (zend_hash.c:44)
==5612== by 0xA65749: _zend_hash_add_or_update_i (zend_hash.c:427)
==5612== by 0xA65AF9: _zend_hash_add (zend_hash.c:489)
==5612== by 0xA37E28: zend_hash_add_ptr (zend_hash.h:458)
==5612== by 0xA39B0A: zend_hash_add_constant (zend_constants.c:486)
==5612== by 0xA39D01: zend_register_constant (zend_constants.c:523)
==5612== by 0xA389E9: zend_register_long_constant
(zend_constants.c:194)
==5612== by 0xA3855C: zend_register_standard_constants
(zend_constants.c:116)
==5612== by 0xA52D6E: zend_startup (zend.c:671)
==5612== by 0x9C7FAA: php_module_startup (main.c:2099)
==5612== by 0xAFA460: php_cli_startup (php_cli.c:421)
==5612== by 0xAFC4D6: main (php_cli.c:1335)
==5612== Address 0x8 is not stack'd, malloc'd or (recently) free'dAnd jump.php is:
<?php
$a = 1;
switch($a) {
case 0: echo 0; break;
case 1: echo 1; break;
case 2: echo 1; break;
case 3: echo 1; break;
case 4: echo 1; break;
case 5: echo 1; break;
case 6: echo 1; break;
case 7: echo 1; break;
case 8: echo 1; break;
case 9: echo 1; break;
}-Rasmus
Hey,
this doesn't really look related to my patch. Did you do something wrong
when cloning my branch? Or forget a make clean or similar?Locally it works for me and echoes 1 as expected.
Ah, looks like you are right. A full distclean cleared it up. I had just
done a distclean, but I had built once before applying your patch.
Crappy Makefile deps..
-Rasmus
this doesn't really look related to my patch. Did you do something wrong
when cloning my branch? Or forget a make clean or similar?
Locally it works for me and echoes 1 as expected.
Ah, looks like you are right. A full distclean cleared it up. I had just
done a distclean, but I had built once before applying your patch.
Crappy Makefile deps..
It seems not even running distclean clears down everything that needs
removing for some of the changes currently going through. ./vcsclean
does not work for me since I'm running Hg locally, but an hg purge /all
did clean everything up, so obviously there is something in the
.gitignore file that should also be in the distclean list? Or should
distclean actually be using .gitignore as it's 'clean' list?
Anyway I now have a working build of PHP7 again having wasted a couple
more days trying to track down what was wrong.
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Am 17.02.2015 um 10:51 schrieb Lester Caine:
It seems not even running distclean clears down everything that needs
removing for some of the changes currently going through. ./vcsclean
I stopped using distclean / vcsclean a while ago. "git clean -xdf"
does the trick for me.
It seems not even running distclean clears down everything that needs
removing for some of the changes currently going through. ./vcsclean
I stopped using distclean / vcsclean a while ago. "git clean -xdf"
does the trick for me.
hg purge /all needed a little help to hide the bits that I need for thg
and eclipse, but I seem to have that under control now. So it's just a
right click function in thg.
All that remains is adding back in extensions like imagick which I'd
like in the ext folder but which git/hg want from a different repo.
Ideally ext would just be a list of sub-repo's one can switch on and off
rather than having to list each in the ./configure string. I'm using a
local process via hg to merge the different git elements so that the IDE
sees all the cross references in the one code tree.
On that side, I don't enable the mysql stuff normally and while mysqlnd
is compiling happily as a shared extension it looks like I can't load it
as it is not actually a valid extension? Is that a bug or is the fact
it's allowed to select 'shared' in ./configure the bug?
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
I'd like to show you my recent work on a jumptable optimization for
switches.https://github.com/php/php-src/pull/1048 <
https://github.com/php/php-src/pull/1048>It is a fully transparent optimization of switches, by putting a new
ZEND_SWITCH opcode at the top of the switch in case we can build a
jumptable, which is the case if the cases are only scalars (no doubles) or
evaluate to them at compile-time.Switches tend to be big sometimes with a lot of static (literals or
constants usually) cases, which was a comparison for each case, and now
just a simple hashtable lookup is needed, which greatly improves
performance. In case where the result of the switch can be determined at
compile-time, it even replaces the comparisons with a simple ZEND_JMP
opcode.In synthetic benchmarks the results are mind blowing, but in practice,
benefits mostly are stateful userland parsers (often called with sometimes
big switches), where it makes it a few percent faster. (For more concrete
numbers see https://github.com/php/php-src/pull/1048#issuecomment-73032647
)As the optimization is only done if there are 3 or more cases, the lookup
is always faster than the single comparisons. So there's no negative
performance impact.It already works, except with opcache where the CFG optimization pass
still lacks support. Dmitry or Laruence volunteered to add support for
opcache.I'd like to know if there are any technical objections to the PR. If not,
they will add opcache support and merge it in.Thanks,
Bob
The patch looks interesting.
We will implement support for opcache and then analyze its impact on
performance.
I see it as a self-containing feature that won't affect anyone (except
opcache and may be debuggers).
I'm not going to work on it in nearest days or weeks, because of more
critical tasks.
Thanks. Dmitry.
Very clever ... thanks bob :)
Cheers
Joe
I'd like to show you my recent work on a jumptable optimization for
switches.https://github.com/php/php-src/pull/1048 <
https://github.com/php/php-src/pull/1048>It is a fully transparent optimization of switches, by putting a new
ZEND_SWITCH opcode at the top of the switch in case we can build a
jumptable, which is the case if the cases are only scalars (no doubles)
or
evaluate to them at compile-time.Switches tend to be big sometimes with a lot of static (literals or
constants usually) cases, which was a comparison for each case, and now
just a simple hashtable lookup is needed, which greatly improves
performance. In case where the result of the switch can be determined at
compile-time, it even replaces the comparisons with a simple ZEND_JMP
opcode.In synthetic benchmarks the results are mind blowing, but in practice,
benefits mostly are stateful userland parsers (often called with
sometimes
big switches), where it makes it a few percent faster. (For more concrete
numbers see
https://github.com/php/php-src/pull/1048#issuecomment-73032647
)As the optimization is only done if there are 3 or more cases, the lookup
is always faster than the single comparisons. So there's no negative
performance impact.It already works, except with opcache where the CFG optimization pass
still lacks support. Dmitry or Laruence volunteered to add support for
opcache.I'd like to know if there are any technical objections to the PR. If not,
they will add opcache support and merge it in.Thanks,
BobThe patch looks interesting.
We will implement support for opcache and then analyze its impact on
performance.
I see it as a self-containing feature that won't affect anyone (except
opcache and may be debuggers).
I'm not going to work on it in nearest days or weeks, because of more
critical tasks.Thanks. Dmitry.