Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:51164 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 91571 invoked from network); 31 Dec 2010 18:16:08 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 31 Dec 2010 18:16:08 -0000 Authentication-Results: pb1.pair.com header.from=pierre.php@gmail.com; sender-id=pass; domainkeys=bad Authentication-Results: pb1.pair.com smtp.mail=pierre.php@gmail.com; spf=pass; sender-id=pass Received-SPF: pass (pb1.pair.com: domain gmail.com designates 209.85.214.42 as permitted sender) DomainKey-Status: bad X-DomainKeys: Ecelerity dk_validate implementing draft-delany-domainkeys-base-01 X-PHP-List-Original-Sender: pierre.php@gmail.com X-Host-Fingerprint: 209.85.214.42 mail-bw0-f42.google.com Received: from [209.85.214.42] ([209.85.214.42:32975] helo=mail-bw0-f42.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 81/F0-21194-3ED1E1D4 for ; Fri, 31 Dec 2010 13:16:03 -0500 Received: by bwz13 with SMTP id 13so11917467bwz.29 for ; Fri, 31 Dec 2010 10:16:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=lu6ELAne0iryCl1b6Ti1BwBaG6QgEnzlI6/2QdCwG5s=; b=uca7sfFoolfHqlERBLn26YKf+0xYMgEE+oMb7Sd0L4Z5oNn4O5gIp0Eb7onZ3MI56n LCenJguSAQjhEQ0nzVHpirXJo0qiTUVWorLSoJJlSaqsG1dlBUoYt/MibTiEM4GPCPWG zw6JBijutSGNYjh6Pa4PzAfTY9kVkjhNJNMzY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=MvuzFmNhJm+iHReOW2rXvOpHCivrVYfTF2eMP0bc89ZIhd4bVrPi0PCP8Iw6bL9qLI 3YyzivoCcdPVP+0zpMq7snsM69N+4Y9hQAhB+SsQCIgbemYYvh/yArGwc6lo9UV1o4pP +aaLQNETkc/7lILwxlq9CliVWUMCvDYcZpbc4= MIME-Version: 1.0 Received: by 10.204.54.147 with SMTP id q19mr1869894bkg.145.1293819359578; Fri, 31 Dec 2010 10:15:59 -0800 (PST) Received: by 10.204.52.129 with HTTP; Fri, 31 Dec 2010 10:15:59 -0800 (PST) In-Reply-To: References: Date: Fri, 31 Dec 2010 19:15:59 +0100 Message-ID: To: Marcin Babij Cc: internals@lists.php.net Content-Type: text/plain; charset=ISO-8859-1 Subject: Re: [PHP-DEV] Re: Zend engine's hashtable performance tweaks From: pierre.php@gmail.com (Pierre Joye) hi, Thanks for the patches :) Can you open a bug report please (and attach the patches to it)? I'm sure this patch will be updated a couple of times before it reaches the repository. Cheers, On Fri, Dec 31, 2010 at 4:58 PM, Marcin Babij wrote: > Sorry for no attachments in previous message, I think my attachments weren't > redirected with message by lists.php.net email confirmation system. I send > them again, and for sure I attach links to public copy of them over HTTP: > https://gist.github.com/761094 - php-5.3.4-hashtable-optimization.patch > https://gist.github.com/761096 - apc-3.1.6-hashtable-optimization.patch > >> Hello, >> I work for social network company, where we were running optimization >> project. One of it's results is patch to Zend engine's Hashtable, which we >> want to share and ask you for comments and improvements. >> >> Why we do this? >> We run profiling on our production servers and found out that zend_hash_* >> functions take 10-20% CPU time of request. So there is some room for easy >> improvements. >> >> What was done? >> - Hash function in zend_hash.h was rebuilt and became much faster, without >> losing the most important properties. >> - Hashtable implementation was changed from Simple chaining to Open >> addressing with linear probing, but with linked bucket, not included in >> hash array, which causes: >> -- Bucket structure to lose 2 pointers. >> -- Searching works similar, but don't have to jump with pointers stored in >> different memory locations, inserting, deleting and rehashing don't need >> to update linked list, but must search for first empty bucket, which is >> fast, because it scans continuous memory. >> -- Load factor decreases from 1.0 to 0.5-0.75 to make less collisions and >> faster hashtable, which in turn increases memory footprint a little. >> - Open addressing doesn't change significantly performance, but next thing >> was to create new array (arEmpty), which is of size nTableSize bytes, >> which keeps track of used/empty buckets and makes inserting and rehashing >> much faster. In future it can be tested as bit-array with size of >> nTableSize/8 bytes. >> - More macros were added to replace repetitive constructs. >> - New constants were added to allow: >> -- Creating new hashtables of size at least X (where 4 and 8 are >> reasonable), which makes no rehashing and reallocing memory while changing >> size to 2 and then to 4. >> -- For small tables it's better to extend them by a factor of 4 times, not >> 2, to make rehashing cost smaller for most hashtables, of cost of little >> higher memory consumption. >> -- For large tables it's better to have other load factor, closer to 1, >> while for small tables it's better to use load factor closer to 0.5. >> - APC was patched to take changes in Bucket structure into account. >> >> How was it tested? >> It was tested with make test, where one more (comparing to original >> sources) test fails, but it's most probably because >> http://bugs.php.net/bug.php?id=48858 - IMO test is badly constructed (is >> too simple) and any change of hashing function makes it fail. Also it was >> tested on our testing environment and production servers against >30mln >> requests to our site, with 120requests/s at peak on Xeon @ 2.50GHz with >> 8GB RAM running Debian Linux. >> >> What is the gain? >> After tests CPU usage dropped by about 4% -6%. >> Memory footprint goes up just by few percent. >> >> What can be done in future? >> - Make new constants configurable by php.ini. >> - Test if changing arEmpty from byte-array to bit-array helps on >> performance. >> - Tweak default constants' values using some real-live benchmarks. >> - Prove (or modify and prove) hash function to have property, that it has >> no collisions if two keys don't differ on no more than 6 bytes, which will >> lead to memcmp omit first (or last) 6 bytes of key. Also simpler thing may >> be proven, that is it has no collisions if two keys are not longer than 6 >> bytes, which will make most string keys omit memcpy at all. >> >> The patch was created and tested against php-5.3.0, apc-3.1.3p1, then >> merged with php-5.3.4, apc-3.1.6 without conflicts, and for these last >> versions patches are attached. Also, it shouldn't conflict with >> http://wiki.php.net/rfc/performanceimprovements . > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: http://www.php.net/unsub.php > -- Pierre @pierrejoye | http://blog.thepimp.net | http://www.libgd.org