Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:26340 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 35548 invoked by uid 1010); 5 Nov 2006 12:30:34 -0000 Delivered-To: ezmlm-scan-internals@lists.php.net Delivered-To: ezmlm-internals@lists.php.net Received: (qmail 35532 invoked from network); 5 Nov 2006 12:30:34 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 5 Nov 2006 12:30:34 -0000 Authentication-Results: pb1.pair.com header.from=php_lists@realplain.com; sender-id=unknown Authentication-Results: pb1.pair.com smtp.mail=php_lists@realplain.com; spf=permerror; sender-id=unknown Received-SPF: error (pb1.pair.com: domain realplain.com from 209.142.136.132 cause and error) X-PHP-List-Original-Sender: php_lists@realplain.com X-Host-Fingerprint: 209.142.136.132 msa2-mx.centurytel.net Linux 2.4/2.6 Received: from [209.142.136.132] ([209.142.136.132:53989] helo=msa2-mx.centurytel.net) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 96/06-24806-869DD454 for ; Sun, 05 Nov 2006 07:30:33 -0500 Received: from pc1 (69-29-172-34.dyn.centurytel.net [69.29.172.34]) by msa2-mx.centurytel.net (8.13.6/8.13.6) with SMTP id kA5CU8eB013437; Sun, 5 Nov 2006 06:30:09 -0600 Message-ID: <00db01c700d6$221c7480$0201a8c0@pc1> To: , "Nuno Lopes" References: <00dc01c6e231$3d091c80$0201a8c0@pc1> <016801c6e263$be69c400$0100a8c0@pc07653> Date: Sun, 5 Nov 2006 06:30:09 -0600 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1807 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1807 Subject: Re: [PHP-DEV] array/HashTable filling optimization From: php_lists@realplain.com ("Matt Wilmas") Hi Nuno, Late reply, but I'm glad the idea was able to be used. :-) To the other hash people: While checking how Ilia made implode() faster in 5.2, I think I found another optimization that can be done. In zend_hash_copy/merge, it seems the zend_hash_quick_* functions can be used instead for associative entries, thus saving the key from being hashed again, right? I just checked, and it's definitely faster, how much so being dependent on the key length of course... Patches for this simple additional optimization (hopefully): http://realplain.com/php/hash_quick_copy.diff http://realplain.com/php/hash_quick_copy_5_2.diff Thanks, Matt ----- Original Message ----- From: "Nuno Lopes" Sent: Wednesday, September 27, 2006 > Aaahh nice catch ;) > > >From a quick lookup in PHP sources I also found these functions were the > same optimization can be applied: > zend_default_exception_new_ex() - trivial > convert_to_array() - not as easy as others > zend_object_std_init() - possibly > clone_wrapper_hash() - trivial (php-src/main/streams/streams.c) > compiler_globals_ctor() > ... > > > Nuno > > > ----- Original Message ----- > > Hi all, > > > > I noticed awhile ago how most every use of zend[_u]_hash_init has nSize as > > 0. Of course it isn't always known how many elements will be added to the > > array (nTableSize), but there are places where an accurate value could be > > used instead of getting the minimum default (8). For anyone who doesn't > > know, HashTable sizes are a power of 2, and when there are more than that > > many elements, zend_hash_do_resize realloc's more memory to double the > > table > > size, and then zend_hash_rehash basically goes through and "reinserts" all > > the elements again. > > > > I ran some tests to see the speed improvement from specifying the *actual* > > number of elements in hash_init, thus saving that extra work (_rehash > > mostly). The "n+1" is with one more element (9, 17, 33...) that triggers > > another rehash when the actual number wasn't specified (most extreme). (I > > used add_next_index_zval, so the direct zend_hash* functions would give > > slightly higher % I guess, being less overhead, right?) Results with > > 5.2.0-RC4: > > > > Elements > > n | n n+1 > > --------------------- > > 8 | 0% 14.2% > > 16 | 11.0% 20.9% > > 32 | 13.5% 22.1% > > 64 | 12.6% 21.3% > > 128 | 11.7% 18.5% > > 256 | 9.3% 16.4% > > 512 | 8.6% 17.0% > > 1024 | 7.9% 15.8% > > 2048 | 4.8% 33.3% > > 4096 | 7.8% 28.3% > > 8192 | 10.2% 58.4% > > 16384 | 24.1% 70.5% > > 32768 | 34.5% 80.4% > > 65536 | 34.8% 68.6% > > > > I haven't looked thoroughly, but the only place I've seen that uses an > > non-zero nSize in hash_init (besides some in the core engine) is the > > unserialize() function. :-/ It seems the most simple and obvious place > > that > > could be changed is in Zend/zend_variables.c:_zval_copy_ctor_func? Just > > replace > > > > zend[_u]_hash_init(tmp_ht, 0, ... > > with > > zend[_u]_hash_init(tmp_ht, original_ht->nNumOfElements, ... > > > > Other than that, some of PHP's array functions and array-returning > > functions > > (database fetch row functions, etc.) are ones I thought of that could be > > optimized like this. Maybe create an array_init_size() macro to be used > > in > > places where the number of elements can be easily determined? Thoughts? > > I'd volunteer to look for places that could be updated and modify them. > > :-) > > > > > > Thanks, > > Matt > > > > P.S. I guess the array stuff applies for objects also? But I don't know > > much about their internals...