Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:64834 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 49277 invoked from network); 10 Jan 2013 23:32:10 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 10 Jan 2013 23:32:10 -0000 Authentication-Results: pb1.pair.com smtp.mail=christopher.jones@oracle.com; spf=pass; sender-id=pass Authentication-Results: pb1.pair.com header.from=christopher.jones@oracle.com; sender-id=pass Received-SPF: pass (pb1.pair.com: domain oracle.com designates 141.146.126.69 as permitted sender) X-PHP-List-Original-Sender: christopher.jones@oracle.com X-Host-Fingerprint: 141.146.126.69 aserp1040.oracle.com Received: from [141.146.126.69] ([141.146.126.69:44055] helo=aserp1040.oracle.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id CF/41-02684-87F4FE05 for ; Thu, 10 Jan 2013 18:32:09 -0500 Received: from ucsinet22.oracle.com (ucsinet22.oracle.com [156.151.31.94]) by aserp1040.oracle.com (Sentrion-MTA-4.2.2/Sentrion-MTA-4.2.2) with ESMTP id r0ANW4dM008849 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 10 Jan 2013 23:32:05 GMT Received: from acsmt357.oracle.com (acsmt357.oracle.com [141.146.40.157]) by ucsinet22.oracle.com (8.14.4+Sun/8.14.4) with ESMTP id r0ANW42V018416 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 10 Jan 2013 23:32:04 GMT Received: from abhmt115.oracle.com (abhmt115.oracle.com [141.146.116.67]) by acsmt357.oracle.com (8.12.11.20060308/8.12.11) with ESMTP id r0ANW4QB032634 for ; Thu, 10 Jan 2013 17:32:04 -0600 Received: from [10.187.91.52] (/10.187.91.52) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Thu, 10 Jan 2013 15:32:03 -0800 Message-ID: <50EF4F71.9080701@oracle.com> Date: Thu, 10 Jan 2013 15:32:01 -0800 User-Agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: internals@lists.php.net References: In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Source-IP: ucsinet22.oracle.com [156.151.31.94] Subject: Re: [PHP-DEV] strtr vs. str_replace runtime From: christopher.jones@oracle.com (Christopher Jones) On 01/09/2013 02:45 PM, Gustavo Lopes wrote: > On Thu, 03 Jan 2013 11:40:31 +0100, Gustavo Lopes wrote: > >> The algorithm behaves very poorly in this case because at each position of the text, all the substrings starting there and with size between m and n (where m is the size of the smallest pattern and n is the largest) are checked, even if there are only >> two patterns with size m and n. We could fix this easily by building a set of the pattern sizes found and try only with those. The hashing of the substrings could also be improved; we don't have to recalculate everything when we advance in the text. >> > > Both optimizations (the hash rolling and limiting the substrings hashed on each iteration) worked quite well. > > But I got much better results with another algorithm [1], so I'm going to merge the branch with it [2] instead. I get these results with a 1.7 MB string and 13 replacement strings, the smallest with 6 characters and 30 iterations (x86-64, gcc -O3): > > strtr: 0.1387 > str_replace: 0.4471 > > The algorithm doesn't perform as well when the replacement strings are small. Adding a replacement for the pattern '_' (1 character) yields: > > strtr: 0.6157 > str_replace: 0.6230 How does this compare with your baseline results? > > But even in this case, it works better than my optimized version of the current algorithm. > > I plan on merging to 5.4 and 5.5; you may want to review it as introducing completely new code carries some risk. Depending on the improvement, it might be tempting to merge to 5.4 but I would prefer to see it in 5.5+. Let's keep 5.4 stable. Chris > > [1] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.2927 > [2] https://github.com/cataphract/php-src/compare/strtr_wu94 > -- christopher.jones@oracle.com http://twitter.com/ghrd Newly updated, free PHP & Oracle book: http://www.oracle.com/technetwork/topics/php/underground-php-oracle-manual-098250.html