The goal of this message is to encourage and motivate a few people to
give me a hand with tracking down APC bugs. There are still a few
outstanding bugs that is slowing PHP 5.4 adoption and it would be really
nice to plow through these instead of wasting cycles on some of the
other discussions going on here.
The example bug I will use is one I just fixed. Bug http://php.net/62398
which was that with apc.stat=0 the second request (eg. the cached
request) to a certain script would have include filenames replaced with
an empty string.
Clue #1 here is that we are dealing with string literals and those
obviously changed a lot in 5.4.
Clue #2 I got lucky with. I was able to reproduce it locally which makes
everything much simpler because then Valgrind can do most of the
thinking for us.
So, knowing it has to do with string literals and taking my reproduce
case, I first eliminated all non-php code by using the built-in web
server in 5.4. I have a little memcheck shell script that runs Valgrind
with my preferred options. It looks like this:
#!/bin/bash
USE_ZEND_ALLOC=0 valgrind --tool=memcheck
--suppressions=/home/rasmus/.suppressions --leak-check=yes
--track-origins=yes --num-callers
=30 --show-reachable=yes "$@"
And my .suppressions file can be found here:
http://lerdorf.com/suppressions.txt
Then I did:
cd /var/www/docroot
memcheck php -S localhost:8888
and hit port 8888 with my browser.
Valgrind essentially told me exactly what was wrong:
==30123== Conditional jump or move depends on uninitialised value(s)
==30123== at 0x9B35B5: ZEND_INCLUDE_OR_EVAL_SPEC_CONST_HANDLER
(zend_vm_execute.h:2535)
==30123== by 0x9AA5E1: execute (zend_vm_execute.h:410)
==30123== by 0x96F115: zend_execute_scripts (zend.c:1279)
==30123== by 0x8E050F: php_execute_script (main.c:2473)
==30123== by 0xAAFC6E: php_cli_server_dispatch_script
(php_cli_server.c:1847)
==30123== by 0xAB0DA8: php_cli_server_dispatch (php_cli_server.c:2008)
==30123== by 0xAB14CE: php_cli_server_recv_event_read_request
(php_cli_server.c:2189)
==30123== by 0xAB17EF: php_cli_server_do_event_for_each_fd_callback
(php_cli_server.c:2275)
==30123== by 0xAACDBE: php_cli_server_poller_iter_on_active
(php_cli_server.c:793)
==30123== by 0xAB18AF: php_cli_server_do_event_for_each_fd
(php_cli_server.c:2301)
==30123== by 0xAB1903: php_cli_server_do_event_loop
(php_cli_server.c:2311)
==30123== by 0xAB1C7D: do_cli_server (php_cli_server.c:2412)
==30123== by 0xAA68C7: main (php_cli.c:1364)
==30123== Uninitialised value was created by a client request
==30123== at 0xD750FC7: apc_realpool_alloc (apc_pool.c:307)
==30123== by 0xD74B9D6: apc_copy_op_array (apc_compile.c:1271)
==30123== by 0xD74DAF1: apc_compile_cache_entry (apc_main.c:450)
==30123== by 0xD74E414: my_compile_file (apc_main.c:610)
==30123== by 0x6FAC25: phar_compile_file (phar.c:3391)
==30123== by 0x91768D: compile_filename (zend_language_scanner.l:625)
==30123== by 0x9B394C: ZEND_INCLUDE_OR_EVAL_SPEC_CONST_HANDLER
(zend_vm_execute.h:2592)
==30123== by 0x9AA5E1: execute (zend_vm_execute.h:410)
==30123== by 0x96F115: zend_execute_scripts (zend.c:1279)
==30123== by 0x8E050F: php_execute_script (main.c:2473)
==30123== by 0xAAFC6E: php_cli_server_dispatch_script
(php_cli_server.c:1847)
==30123== by 0xAB0DA8: php_cli_server_dispatch (php_cli_server.c:2008)
==30123== by 0xAB14CE: php_cli_server_recv_event_read_request
(php_cli_server.c:2189)
==30123== by 0xAB17EF: php_cli_server_do_event_for_each_fd_callback
(php_cli_server.c:2275)
==30123== by 0xAACDBE: php_cli_server_poller_iter_on_active
(php_cli_server.c:793)
==30123== by 0xAB18AF: php_cli_server_do_event_for_each_fd
(php_cli_server.c:2301)
==30123== by 0xAB1903: php_cli_server_do_event_loop
(php_cli_server.c:2311)
==30123== by 0xAB1C7D: do_cli_server (php_cli_server.c:2412)
==30123== by 0xAA68C7: main (php_cli.c:1364)
zend_vm_execute.h:2535 looks like this:
if (inc_filename->type!=IS_STRING) {
and Valgrind is saying that the conditional depends on an uninitialised
value. Ok, that means inc_filename->type is somehow not set here.
Looking just above this line we see that inc_filename is just op1:
inc_filename = _get_zval_ptr_tmp(opline->op1.var, EX_Ts(), &free_op1
TSRMLS_CC);
Next, Valgrind even tells us where this unitialized value is coming from:
==30123== Uninitialised value was created by a client request
==30123== at 0xD750FC7: apc_realpool_alloc (apc_pool.c:307)
==30123== by 0xD74B9D6: apc_copy_op_array (apc_compile.c:1271)
Ok, so it is in memory allocated in apc_pool.c. That in itself is not
that interesting. Everything is in allocated pool memory. What is more
interesting is what happens to this allocated pool right after it is
allocated and we see the caller is apc_compile.c line 1271. Starting at
line 1270 that block of code looks like this:
#ifdef ZEND_ENGINE_2_4
dzo->op1.literal = (zend_literal*)
apc_pool_alloc(pool, sizeof(zend_literal));
Z_STRLEN_P(dzo->op1.zv) = strlen(fullpath);
Z_STRVAL_P(dzo->op1.zv) =
apc_pstrdup(fullpath, pool TSRMLS_CC);
Z_SET_REFCOUNT_P(dzo->op1.zv, 2);
Z_SET_ISREF_P(dzo->op1.zv);
dzo->op1.literal->hash_value =
zend_hash_func(Z_STRVAL_P(dzo->op1.zv), Z_STRLEN_P(dzo->op1.zv)+1);
#else
dzo->op1.u.constant.value.str.len =
strlen(fullpath);
dzo->op1.u.constant.value.str.val =
apc_pstrdup(fullpath, pool TSRMLS_CC);
#endif
So we can see from the ifdef that what used to just be a simple op1
string initialization before the new string literal optimization in 5.4
is now much more complicated. We generate op1 and calculate the hash.
But since we are now working with a naked zv instead of a string
constant, we need to set the type as Valgrind illustrated:
Z_TYPE_P(dzo->op1.zv) = IS_STRING;
and that indeed fixes the bug. Running memcheck again there are no
complaints from Valgrind (except for on server-shutdown, but those are
normal).
Usually these APC bugs are trickier than this. I got lucky on this one.
But the steps are similar, you may need to spend some time in gdb as
well to figure out why a certain variable ends up in a certain state or
why a section of code is or isn't being run.
-Rasmus
Forgive my ignorance, what is APC?
The goal of this message is to encourage and motivate a few people to
give me a hand with tracking down APC bugs. There are still a few
outstanding bugs that is slowing PHP 5.4 adoption and it would be really
nice to plow through these instead of wasting cycles on some of the
other discussions going on here.The example bug I will use is one I just fixed. Bug http://php.net/62398
which was that with apc.stat=0 the second request (eg. the cached
request) to a certain script would have include filenames replaced with
an empty string.Clue #1 here is that we are dealing with string literals and those
obviously changed a lot in 5.4.Clue #2 I got lucky with. I was able to reproduce it locally which makes
everything much simpler because then Valgrind can do most of the
thinking for us.So, knowing it has to do with string literals and taking my reproduce
case, I first eliminated all non-php code by using the built-in web
server in 5.4. I have a little memcheck shell script that runs Valgrind
with my preferred options. It looks like this:#!/bin/bash
USE_ZEND_ALLOC=0 valgrind --tool=memcheck
--suppressions=/home/rasmus/.suppressions --leak-check=yes
--track-origins=yes --num-callers
=30 --show-reachable=yes "$@"And my .suppressions file can be found here:
http://lerdorf.com/suppressions.txtThen I did:
cd /var/www/docroot
memcheck php -S localhost:8888and hit port 8888 with my browser.
Valgrind essentially told me exactly what was wrong:
==30123== Conditional jump or move depends on uninitialised value(s)
==30123== at 0x9B35B5: ZEND_INCLUDE_OR_EVAL_SPEC_CONST_HANDLER
(zend_vm_execute.h:2535)
==30123== by 0x9AA5E1: execute (zend_vm_execute.h:410)
==30123== by 0x96F115: zend_execute_scripts (zend.c:1279)
==30123== by 0x8E050F: php_execute_script (main.c:2473)
==30123== by 0xAAFC6E: php_cli_server_dispatch_script
(php_cli_server.c:1847)
==30123== by 0xAB0DA8: php_cli_server_dispatch (php_cli_server.c:2008)
==30123== by 0xAB14CE: php_cli_server_recv_event_read_request
(php_cli_server.c:2189)
==30123== by 0xAB17EF: php_cli_server_do_event_for_each_fd_callback
(php_cli_server.c:2275)
==30123== by 0xAACDBE: php_cli_server_poller_iter_on_active
(php_cli_server.c:793)
==30123== by 0xAB18AF: php_cli_server_do_event_for_each_fd
(php_cli_server.c:2301)
==30123== by 0xAB1903: php_cli_server_do_event_loop
(php_cli_server.c:2311)
==30123== by 0xAB1C7D: do_cli_server (php_cli_server.c:2412)
==30123== by 0xAA68C7: main (php_cli.c:1364)==30123== Uninitialised value was created by a client request
==30123== at 0xD750FC7: apc_realpool_alloc (apc_pool.c:307)
==30123== by 0xD74B9D6: apc_copy_op_array (apc_compile.c:1271)
==30123== by 0xD74DAF1: apc_compile_cache_entry (apc_main.c:450)
==30123== by 0xD74E414: my_compile_file (apc_main.c:610)
==30123== by 0x6FAC25: phar_compile_file (phar.c:3391)
==30123== by 0x91768D: compile_filename (zend_language_scanner.l:625)
==30123== by 0x9B394C: ZEND_INCLUDE_OR_EVAL_SPEC_CONST_HANDLER
(zend_vm_execute.h:2592)
==30123== by 0x9AA5E1: execute (zend_vm_execute.h:410)
==30123== by 0x96F115: zend_execute_scripts (zend.c:1279)
==30123== by 0x8E050F: php_execute_script (main.c:2473)
==30123== by 0xAAFC6E: php_cli_server_dispatch_script
(php_cli_server.c:1847)
==30123== by 0xAB0DA8: php_cli_server_dispatch (php_cli_server.c:2008)
==30123== by 0xAB14CE: php_cli_server_recv_event_read_request
(php_cli_server.c:2189)
==30123== by 0xAB17EF: php_cli_server_do_event_for_each_fd_callback
(php_cli_server.c:2275)
==30123== by 0xAACDBE: php_cli_server_poller_iter_on_active
(php_cli_server.c:793)
==30123== by 0xAB18AF: php_cli_server_do_event_for_each_fd
(php_cli_server.c:2301)
==30123== by 0xAB1903: php_cli_server_do_event_loop
(php_cli_server.c:2311)
==30123== by 0xAB1C7D: do_cli_server (php_cli_server.c:2412)
==30123== by 0xAA68C7: main (php_cli.c:1364)zend_vm_execute.h:2535 looks like this:
if (inc_filename->type!=IS_STRING) {
and Valgrind is saying that the conditional depends on an uninitialised
value. Ok, that means inc_filename->type is somehow not set here.
Looking just above this line we see that inc_filename is just op1:inc_filename = _get_zval_ptr_tmp(opline->op1.var, EX_Ts(), &free_op1
TSRMLS_CC);Next, Valgrind even tells us where this unitialized value is coming from:
==30123== Uninitialised value was created by a client request
==30123== at 0xD750FC7: apc_realpool_alloc (apc_pool.c:307)
==30123== by 0xD74B9D6: apc_copy_op_array (apc_compile.c:1271)Ok, so it is in memory allocated in apc_pool.c. That in itself is not
that interesting. Everything is in allocated pool memory. What is more
interesting is what happens to this allocated pool right after it is
allocated and we see the caller is apc_compile.c line 1271. Starting at
line 1270 that block of code looks like this:#ifdef ZEND_ENGINE_2_4
dzo->op1.literal = (zend_literal*)
apc_pool_alloc(pool, sizeof(zend_literal));
Z_STRLEN_P(dzo->op1.zv) = strlen(fullpath);
Z_STRVAL_P(dzo->op1.zv) =
apc_pstrdup(fullpath, pool TSRMLS_CC);
Z_SET_REFCOUNT_P(dzo->op1.zv, 2);
Z_SET_ISREF_P(dzo->op1.zv);
dzo->op1.literal->hash_value =
zend_hash_func(Z_STRVAL_P(dzo->op1.zv), Z_STRLEN_P(dzo->op1.zv)+1);
#else
dzo->op1.u.constant.value.str.len =
strlen(fullpath);
dzo->op1.u.constant.value.str.val =
apc_pstrdup(fullpath, pool TSRMLS_CC);
#endifSo we can see from the ifdef that what used to just be a simple op1
string initialization before the new string literal optimization in 5.4
is now much more complicated. We generate op1 and calculate the hash.
But since we are now working with a naked zv instead of a string
constant, we need to set the type as Valgrind illustrated:Z_TYPE_P(dzo->op1.zv) = IS_STRING;
and that indeed fixes the bug. Running memcheck again there are no
complaints from Valgrind (except for on server-shutdown, but those are
normal).Usually these APC bugs are trickier than this. I got lucky on this one.
But the steps are similar, you may need to spend some time in gdb as
well to figure out why a certain variable ends up in a certain state or
why a section of code is or isn't being run.-Rasmus
2012/7/19 Andrew Faulds ajfweb@googlemail.com:
Forgive my ignorance, what is APC?
Alternative PHP Cache
http://pecl.php.net/apc
http://php.net/apc
--
regards,
Kalle Sommer Nielsen
kalle@php.net
Ah, thanks. My mobile internet is horribly slow: But somehow Gmail works
pretty fast (constant connection??). Hence not Googling first.
2012/7/19 Andrew Faulds ajfweb@googlemail.com:
Forgive my ignorance, what is APC?
Alternative PHP Cache
http://pecl.php.net/apc
http://php.net/apc--
regards,Kalle Sommer Nielsen
kalle@php.net
The goal of this message is to encourage and motivate a few people to
give me a hand with tracking down APC bugs. There are still a few
outstanding bugs that is slowing PHP 5.4 adoption and it would be really
nice to plow through these instead of wasting cycles on some of the
other discussions going on here.
Heh Rasmus, that's an interesting step you take.
Thanks for explaining/sharing us your strategy, that in fact may help
ppl with APC bug fixing, as APC is all but an easy extension, and
requires deep Zend VM understanding.
Julien.P
The goal of this message is to encourage and motivate a few people to
give me a hand with tracking down APC bugs. There are still a few
outstanding bugs that is slowing PHP 5.4 adoption and it would be really
nice to plow through these instead of wasting cycles on some of the
other discussions going on here.
This type of content is exactly why I'm signed up on the list. I'll
not make any promises about APC work specifically (it seems a bit more
complex than the other extensions), but this information generalizes
to bug hunting in other extensions, too.
Thank you very much!
Adam