Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:61506 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 38337 invoked from network); 19 Jul 2012 18:14:53 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 19 Jul 2012 18:14:53 -0000 Authentication-Results: pb1.pair.com header.from=rasmus@lerdorf.com; sender-id=unknown Authentication-Results: pb1.pair.com smtp.mail=rasmus@lerdorf.com; spf=permerror; sender-id=unknown Received-SPF: error (pb1.pair.com: domain lerdorf.com from 209.85.216.42 cause and error) X-PHP-List-Original-Sender: rasmus@lerdorf.com X-Host-Fingerprint: 209.85.216.42 mail-qa0-f42.google.com Received: from [209.85.216.42] ([209.85.216.42:55285] helo=mail-qa0-f42.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 12/9B-18983-D9E48005 for ; Thu, 19 Jul 2012 14:14:53 -0400 Received: by qafi31 with SMTP id i31so4323970qaf.8 for ; Thu, 19 Jul 2012 11:14:50 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:subject :x-enigmail-version:content-type:content-transfer-encoding :x-gm-message-state; bh=tyGUXIb4o5iuHAnoGHRuy6EMO9pniOz4sEANZqtANXg=; b=btkm78kbNf7Oda+c3ywIph89hPR4Ly4d8LEuvjHsrmvUuuAZkIIZ296mKCp6IWUMIL iiv3E2D4wuDmfWAq7+ZIzpWSxP0BuqCN5A4pTeLVhpTxxEkW1msFwfk0QY2D8V4Js68u q7m8WbZOLjWJiVpfO1xpKMnxAOQIVMnO+8u0gCm/NOYzgGfTO6VDHbq/SvFhepBbmX1h 10RiZYd7/e+0ovAi2jl9VzVXE04vCr6gplc999I8fFDvGMkxH1cOYGS3IfdtKsNIm74B +QQDmK8q6LVZoO1CNcbwhGFWKFFlpgrhv8OalaH0lnfiXULPzEhoSZwbwuyyLl4xysfS bFLg== Received: by 10.224.195.199 with SMTP id ed7mr5477753qab.22.1342721690374; Thu, 19 Jul 2012 11:14:50 -0700 (PDT) Received: from [172.16.26.30] ([38.106.64.245]) by mx.google.com with ESMTPS id bo5sm3510633qab.1.2012.07.19.11.14.48 (version=SSLv3 cipher=OTHER); Thu, 19 Jul 2012 11:14:49 -0700 (PDT) Message-ID: <50084E97.9030606@lerdorf.com> Date: Thu, 19 Jul 2012 11:14:47 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: PHP internals X-Enigmail-Version: 1.4.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Gm-Message-State: ALoCoQltm9z7Tp/MNRjGcj9Iwxb4WQwHl9HQ0B6gxkLXGRklmurTrvvPuzP/RTONKi0/q8tWe64O Subject: How to track down an APC bug From: rasmus@lerdorf.com (Rasmus Lerdorf) The goal of this message is to encourage and motivate a few people to give me a hand with tracking down APC bugs. There are still a few outstanding bugs that is slowing PHP 5.4 adoption and it would be really nice to plow through these instead of wasting cycles on some of the other discussions going on here. The example bug I will use is one I just fixed. Bug http://php.net/62398 which was that with apc.stat=0 the second request (eg. the cached request) to a certain script would have include filenames replaced with an empty string. Clue #1 here is that we are dealing with string literals and those obviously changed a lot in 5.4. Clue #2 I got lucky with. I was able to reproduce it locally which makes everything much simpler because then Valgrind can do most of the thinking for us. So, knowing it has to do with string literals and taking my reproduce case, I first eliminated all non-php code by using the built-in web server in 5.4. I have a little memcheck shell script that runs Valgrind with my preferred options. It looks like this: #!/bin/bash USE_ZEND_ALLOC=0 valgrind --tool=memcheck --suppressions=/home/rasmus/.suppressions --leak-check=yes --track-origins=yes --num-callers =30 --show-reachable=yes "$@" And my .suppressions file can be found here: http://lerdorf.com/suppressions.txt Then I did: cd /var/www/docroot memcheck php -S localhost:8888 and hit port 8888 with my browser. Valgrind essentially told me exactly what was wrong: ==30123== Conditional jump or move depends on uninitialised value(s) ==30123== at 0x9B35B5: ZEND_INCLUDE_OR_EVAL_SPEC_CONST_HANDLER (zend_vm_execute.h:2535) ==30123== by 0x9AA5E1: execute (zend_vm_execute.h:410) ==30123== by 0x96F115: zend_execute_scripts (zend.c:1279) ==30123== by 0x8E050F: php_execute_script (main.c:2473) ==30123== by 0xAAFC6E: php_cli_server_dispatch_script (php_cli_server.c:1847) ==30123== by 0xAB0DA8: php_cli_server_dispatch (php_cli_server.c:2008) ==30123== by 0xAB14CE: php_cli_server_recv_event_read_request (php_cli_server.c:2189) ==30123== by 0xAB17EF: php_cli_server_do_event_for_each_fd_callback (php_cli_server.c:2275) ==30123== by 0xAACDBE: php_cli_server_poller_iter_on_active (php_cli_server.c:793) ==30123== by 0xAB18AF: php_cli_server_do_event_for_each_fd (php_cli_server.c:2301) ==30123== by 0xAB1903: php_cli_server_do_event_loop (php_cli_server.c:2311) ==30123== by 0xAB1C7D: do_cli_server (php_cli_server.c:2412) ==30123== by 0xAA68C7: main (php_cli.c:1364) ==30123== Uninitialised value was created by a client request ==30123== at 0xD750FC7: apc_realpool_alloc (apc_pool.c:307) ==30123== by 0xD74B9D6: apc_copy_op_array (apc_compile.c:1271) ==30123== by 0xD74DAF1: apc_compile_cache_entry (apc_main.c:450) ==30123== by 0xD74E414: my_compile_file (apc_main.c:610) ==30123== by 0x6FAC25: phar_compile_file (phar.c:3391) ==30123== by 0x91768D: compile_filename (zend_language_scanner.l:625) ==30123== by 0x9B394C: ZEND_INCLUDE_OR_EVAL_SPEC_CONST_HANDLER (zend_vm_execute.h:2592) ==30123== by 0x9AA5E1: execute (zend_vm_execute.h:410) ==30123== by 0x96F115: zend_execute_scripts (zend.c:1279) ==30123== by 0x8E050F: php_execute_script (main.c:2473) ==30123== by 0xAAFC6E: php_cli_server_dispatch_script (php_cli_server.c:1847) ==30123== by 0xAB0DA8: php_cli_server_dispatch (php_cli_server.c:2008) ==30123== by 0xAB14CE: php_cli_server_recv_event_read_request (php_cli_server.c:2189) ==30123== by 0xAB17EF: php_cli_server_do_event_for_each_fd_callback (php_cli_server.c:2275) ==30123== by 0xAACDBE: php_cli_server_poller_iter_on_active (php_cli_server.c:793) ==30123== by 0xAB18AF: php_cli_server_do_event_for_each_fd (php_cli_server.c:2301) ==30123== by 0xAB1903: php_cli_server_do_event_loop (php_cli_server.c:2311) ==30123== by 0xAB1C7D: do_cli_server (php_cli_server.c:2412) ==30123== by 0xAA68C7: main (php_cli.c:1364) zend_vm_execute.h:2535 looks like this: if (inc_filename->type!=IS_STRING) { and Valgrind is saying that the conditional depends on an uninitialised value. Ok, that means inc_filename->type is somehow not set here. Looking just above this line we see that inc_filename is just op1: inc_filename = _get_zval_ptr_tmp(opline->op1.var, EX_Ts(), &free_op1 TSRMLS_CC); Next, Valgrind even tells us where this unitialized value is coming from: ==30123== Uninitialised value was created by a client request ==30123== at 0xD750FC7: apc_realpool_alloc (apc_pool.c:307) ==30123== by 0xD74B9D6: apc_copy_op_array (apc_compile.c:1271) Ok, so it is in memory allocated in apc_pool.c. That in itself is not that interesting. Everything is in allocated pool memory. What is more interesting is what happens to this allocated pool right after it is allocated and we see the caller is apc_compile.c line 1271. Starting at line 1270 that block of code looks like this: #ifdef ZEND_ENGINE_2_4 dzo->op1.literal = (zend_literal*) apc_pool_alloc(pool, sizeof(zend_literal)); Z_STRLEN_P(dzo->op1.zv) = strlen(fullpath); Z_STRVAL_P(dzo->op1.zv) = apc_pstrdup(fullpath, pool TSRMLS_CC); Z_SET_REFCOUNT_P(dzo->op1.zv, 2); Z_SET_ISREF_P(dzo->op1.zv); dzo->op1.literal->hash_value = zend_hash_func(Z_STRVAL_P(dzo->op1.zv), Z_STRLEN_P(dzo->op1.zv)+1); #else dzo->op1.u.constant.value.str.len = strlen(fullpath); dzo->op1.u.constant.value.str.val = apc_pstrdup(fullpath, pool TSRMLS_CC); #endif So we can see from the ifdef that what used to just be a simple op1 string initialization before the new string literal optimization in 5.4 is now much more complicated. We generate op1 and calculate the hash. But since we are now working with a naked zv instead of a string constant, we need to set the type as Valgrind illustrated: Z_TYPE_P(dzo->op1.zv) = IS_STRING; and that indeed fixes the bug. Running memcheck again there are no complaints from Valgrind (except for on server-shutdown, but those are normal). Usually these APC bugs are trickier than this. I got lucky on this one. But the steps are similar, you may need to spend some time in gdb as well to figure out why a certain variable ends up in a certain state or why a section of code is or isn't being run. -Rasmus