Newsgroups: php.internals
From: zeev@zend.com (Zeev Suraski)
Date: Wed, 24 Mar 2010 16:12:17 +0200
To: internals@lists.php.net
Subject: Performance improvements

Hi,

Over the last few weeks we've been working on several ideas we had for performance enhancements. We've managed to make some good progress. Our initial tests show roughly 10% speed improvement on real-world apps.
On pure OO code we're seeing as much as 25% improvement (!)

While this is still a work in progress (and not production-quality code yet), we want to get feedback sooner rather than later. The diff (available at http://bit.ly/aDPTmv) applies cleanly to trunk. We'd be happy for people to try it out and send comments.

What does it contain?

1) Constant operands have been moved from being embedded within the opcodes into a separate literal table. In addition to the zval, the table holds pre-calculated hash values for string literals. As a result, PHP uses less memory and doesn't have to recalculate hash values for constants at run-time.

2) Lazy HashTable buckets allocation - we now only allocate the buckets array when we actually insert data into the hash for the first time. This saves both memory and time, as many hash tables do not have any data in them.

3) Interned strings (see http://en.wikipedia.org/wiki/String_interning). Most strings known at compile-time are allocated in a single copy with some additional information (pre-calculated hash value, etc.). We try to make most incarnations of a given string point to that same single version, allowing us to save memory, but more importantly - to run comparisons by comparing pointers instead of comparing strings, and to avoid redundant hash value calculations.

A couple of notes:

a. Not all of the strings are interned - which means that if a pointer comparison fails, we still go through a string comparison; but if it succeeds - it's good enough.

b. We'd need to add support for this in the bytecode caches. We'd be happy to work with the various bytecode cache teams to guide how to implement support so that you do not have to intern on each request.
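To make the pointer-comparison idea in 3) and note (a) concrete, here is a minimal sketch in C. It is not taken from the patch: the names interned, intern(), hash_str() and str_equal() are hypothetical, the fixed-size linear pool is only a stand-in for whatever structure the engine really uses, and the "times 33" hash is an assumption chosen for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative only: a tiny fixed-size intern pool (hypothetical names). */
#define POOL_MAX 64

typedef struct {
    const char   *str;   /* the single shared copy of the string */
    size_t        len;
    unsigned long hash;  /* computed once, when the string is interned */
} interned;

static interned pool[POOL_MAX];
static size_t   pool_used = 0;

/* Simple "times 33" string hash; an assumption, not the engine's function. */
unsigned long hash_str(const char *s, size_t len)
{
    unsigned long h = 5381;
    for (size_t i = 0; i < len; i++) {
        h = h * 33 + (unsigned char)s[i];
    }
    return h;
}

/* Return the shared entry for s, adding it to the pool on first sight.
 * Returns NULL if the pool is full or memory allocation fails. */
const interned *intern(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    unsigned long h = hash_str(s, len);

    for (size_t i = 0; i < pool_used; i++) {
        if (pool[i].hash == h && pool[i].len == len &&
            memcmp(pool[i].str, s, len) == 0) {
            return &pool[i];            /* already interned: reuse it */
        }
    }
    if (pool_used == POOL_MAX) {
        return NULL;
    }
    char *copy = malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, s, len + 1);           /* copy includes the trailing NUL */
    pool[pool_used].str  = copy;
    pool[pool_used].len  = len;
    pool[pool_used].hash = h;
    return &pool[pool_used++];
}

/* Equality check: identical pointers are "good enough"; otherwise fall
 * back to a byte comparison, since not every string is interned. */
int str_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    if (alen != blen) {
        return 0;
    }
    if (a == b) {
        return 1;                       /* fast path: pointer comparison */
    }
    return memcmp(a, b, alen) == 0;     /* slow path: string comparison */
}
```

With this in place, two calls to intern("foo") return the same entry, so comparing them is a single pointer check and the hash never has to be recomputed, while a string built at run time (and therefore not interned) still compares equal through the memcmp() fallback.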
To get a better feel for what interning actually does, consider the following examples:

    // Lookup for $arr will not calculate a hash value, and will only
    // require a pointer comparison in most cases
    // Lookup for "foo" in $arr will not calculate a hash value, and will
    // only require a pointer comparison
    // The string "foo" will not have to be allocated as a key in the Bucket
    // "blah" when assigned doesn't have to be duplicated
    $arr["foo"] = "blah";

    $a = "b";
    if ($a == "b") { // pointer comparison only
        ...
    }

Comments welcome!

Zeev

Patch available at: http://bit.ly/aDPTmv