Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:63241 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 35669 invoked from network); 28 Sep 2012 16:31:55 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 28 Sep 2012 16:31:55 -0000 Authentication-Results: pb1.pair.com smtp.mail=vesselin.atanasov@gmail.com; spf=pass; sender-id=pass Authentication-Results: pb1.pair.com header.from=vesselin.atanasov@gmail.com; sender-id=pass Received-SPF: pass (pb1.pair.com: domain gmail.com designates 209.85.212.182 as permitted sender) X-PHP-List-Original-Sender: vesselin.atanasov@gmail.com X-Host-Fingerprint: 209.85.212.182 mail-wi0-f182.google.com Received: from [209.85.212.182] ([209.85.212.182:47211] helo=mail-wi0-f182.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 48/F2-08845-AF0D5605 for ; Fri, 28 Sep 2012 12:31:55 -0400 Received: by wibhm2 with SMTP id hm2so20579wib.11 for ; Fri, 28 Sep 2012 09:31:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:subject:date:message-id:user-agent:mime-version :content-type:content-transfer-encoding; bh=PeC8of6qSzCOoevqF5u5opCCEF8b2DaORTh12vlNvoA=; b=FGGF+gIyvxJ7J/4U1cOM0nJpFCb9cpGSILIHny/qPqetwc99rkFknrWd5QGnj+KlRY V0js3UfEOkk04KrdxYcon0Fg4/YnYJ8yt7D/wEh5klRl4Hw6HFeGYEnLAu40agru6oTa XoYrvvJTt/lxusffP/vMBbkJDtcOUxmULxWlacPWYesqfuvNFesLygbz3smi9AZsvf7R qyXQ/WGOT2/bU6fnjaU/26X9QIwoeYXd9/bs0pNl5DYtadA/WKfdpLOAh8Kd7gHLVuqA Df9Yb8vvamxHGobxW0yWGkiGTZ5Y6KuaufV/tfZajNpS2qN39TbNZp9LolPINyuY+8ge kQtA== Received: by 10.180.104.200 with SMTP id gg8mr4910736wib.14.1348849911839; Fri, 28 Sep 2012 09:31:51 -0700 (PDT) Received: from dragon.shopnetwide.com (95-42-156-203.btc-net.bg. [95.42.156.203]) by mx.google.com with ESMTPS id b7sm154939wiz.3.2012.09.28.09.31.50 (version=TLSv1/SSLv3 cipher=OTHER); Fri, 28 Sep 2012 09:31:51 -0700 (PDT) To: internals@lists.php.net Date: Fri, 28 Sep 2012 19:31:03 +0300 Message-ID: <2651996.kcl60pNQ8o@dragon.shopnetwide.com> User-Agent: KMail/4.8.5 (Linux/3.5.2-3.fc17.i686.PAE; KDE/4.8.5; i686; ; ) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="nextPart2024741.LOBhEVL9ib" Content-Transfer-Encoding: 7Bit Subject: Corruption of hash tables (bugfix attached) From: vesselin.atanasov@gmail.com (Vesselin Atanasov) --nextPart2024741.LOBhEVL9ib Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" Hello, I reported this issue as bug #63180, but I am also posting it here. We are going to use the attached patch on our production servers, so I would like to know if any of the developers sees any problems with the attached patch. The text of the bug report follows below: PHP 5.4 has a bug in the handling if interned (literal) strings. Each time an HTTP request terminates, the interned strings are restored to the snapshot that was taken at the end of the startup sequence, which effectively removes any interned strings that have been added after the snapshot has been made. Usually when a hash item is added, the code in zend_hash.c allocates extra space after the end of the Bucket structure, copies the key there and then points Bucket::arKey to the key copy. However when dealing with interned hash keys the code tries to optimize the algorithm by not allocating extra space after the end of the Bucket structure, but just pointing Bucket::arKey to the passed arKey parameter. if (IS_INTERNED(arKey)) { p = (Bucket *) pemalloc(sizeof(Bucket), ht->persistent); if (!p) { return FAILURE; } p->arKey = arKey; } else { p = (Bucket *) pemalloc(sizeof(Bucket) + nKeyLength, ht->persistent); if (!p) { return FAILURE; } p->arKey = (const char*)(p + 1); memcpy((char*)p->arKey, arKey, nKeyLength); } The problem happens when a persistent hash table gets an interned key as a parameter. What happens is that the table and its bucket are persistent, i.e. they remain even after the current HTTP request has been terminated, but the bucket key is removed from the interned keys table and its memory will be reused by other interned keys upon the next request. This leads to corruption of the array keys in the persistent hash table. One such case is with the PCRE cache. It is initialized in: php_pcre.c:PHP_GINIT_FUNCTION(pcre) by the following code: zend_hash_init(&pcre_globals->pcre_cache, 0, NULL, php_free_pcre_cache, 1); The last parameter (1) means that it is a persistent hash table. However the code in php_pcre.c:pcre_get_compiled_regex_cache(char *regex, int regex_len TSRMLS_DC) just passes the regex parameter to zend_hash_update: zend_hash_update(&PCRE_G(pcre_cache), regex, regex_len+1, (void *)&new_entry, sizeof(pcre_cache_entry), (void**)&pce); Given that in most cases the regex parameter is created from string literals in the compiled code, this means that in most cases we end up with interned, non-persistent keys in the persistent PCRE cache table. So when the next HTTP request comes and we create different interned strings they will overwrite the previous ones in the PCRE cache table. The suggested solution to this bug is to modify the code in zend_hash.c and change it in such a way that copying of keys is skipped only when the key is interned and the hash table is persistent. When either the key is non- interned or the hash table is persistent, the key is copied after the Bucket structure, so that its maximum lifetime will be the same as the lifetime of the hash table. I am attaching a patch which uses the suggested solution. I tested this patch and it solves the problems with PCRE cache corruption that we observed in our PHP code. The patch is for PHP 5.4.5 but it also applies cleanly to the latest git version of PHP 5.4 Regards, Vesselin Atanasov --nextPart2024741.LOBhEVL9ib Content-Disposition: attachment; filename="php-5.4.5-interned_keys.patch" Content-Transfer-Encoding: 7Bit Content-Type: text/x-patch; charset="UTF-8"; name="php-5.4.5-interned_keys.patch" --- php-5.4.5/Zend/zend_hash.c.interned_keys 2012-07-18 09:19:16.000000000 +0300 +++ php-5.4.5/Zend/zend_hash.c 2012-09-28 06:42:14.000000000 +0300 @@ -243,8 +243,8 @@ p = p->pNext; } - if (IS_INTERNED(arKey)) { - p = (Bucket *) pemalloc(sizeof(Bucket), ht->persistent); + if (!ht->persistent && IS_INTERNED(arKey)) { + p = (Bucket *) emalloc(sizeof(Bucket)); if (!p) { return FAILURE; } @@ -320,8 +320,8 @@ p = p->pNext; } - if (IS_INTERNED(arKey)) { - p = (Bucket *) pemalloc(sizeof(Bucket), ht->persistent); + if (!ht->persistent && IS_INTERNED(arKey)) { + p = (Bucket *) emalloc(sizeof(Bucket)); if (!p) { return FAILURE; } @@ -1360,8 +1360,8 @@ (!IS_INTERNED(p->arKey) && p->nKeyLength != str_length)) { Bucket *q; - if (IS_INTERNED(str_index)) { - q = (Bucket *) pemalloc(sizeof(Bucket), ht->persistent); + if (!ht->persistent && IS_INTERNED(str_index)) { + q = (Bucket *) emalloc(sizeof(Bucket)); } else { q = (Bucket *) pemalloc(sizeof(Bucket) + str_length, ht->persistent); } @@ -1400,7 +1400,7 @@ } else { p->h = h; p->nKeyLength = str_length; - if (IS_INTERNED(str_index)) { + if (!ht->persistent && IS_INTERNED(str_index)) { p->arKey = str_index; } else { p->arKey = (const char*)(p+1); --nextPart2024741.LOBhEVL9ib--