Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:38119 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 47957 invoked from network); 9 Jun 2008 13:58:16 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 9 Jun 2008 13:58:16 -0000 Authentication-Results: pb1.pair.com smtp.mail=scott@macvicar.net; spf=permerror; sender-id=unknown Authentication-Results: pb1.pair.com header.from=scott@macvicar.net; sender-id=unknown Received-SPF: error (pb1.pair.com: domain macvicar.net from 193.227.246.108 cause and error) X-PHP-List-Original-Sender: scott@macvicar.net X-Host-Fingerprint: 193.227.246.108 ip246-108-v193.static.x-ip.net Received: from [193.227.246.108] ([193.227.246.108:55049] helo=lovelace.midden.org.uk) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 3C/EB-21959-7F63D484 for ; Mon, 09 Jun 2008 09:58:16 -0400 Received: from office.vbulletin.com ([217.155.246.60] helo=[10.0.0.116]) by lovelace.midden.org.uk with esmtpsa (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.69) (envelope-from ) id 1K5hsv-0002rn-UX; Mon, 09 Jun 2008 14:58:12 +0100 Message-ID: <484D36EB.9080202@macvicar.net> Date: Mon, 09 Jun 2008 14:58:03 +0100 User-Agent: Thunderbird 2.0.0.14 (Windows/20080421) MIME-Version: 1.0 To: Nuno Lopes CC: internals@lists.php.net, Michal Dziemianko References: <7E62CA6E-83F4-4F9C-86FB-75EBE7D489C9@gmail.com> In-Reply-To: X-Enigmail-Version: 0.95.6 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -4.2 X-Spam_Report: Spam detection software, running on the system "lovelace.midden.org.uk", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: There is rabin-karp too but its worse case is O(nm) so that might not be ideal, perhaps we should try to compare all of them. Scott Nuno Lopes wrote: > Hi, > > So some comments: > - you have some problems with the indentation. We only use tabs, so > please stick to that. Also, there are some lines that are not indented > correctly > - Have you considered the Boyer-Moore algorithm? I think it's a little > faster than KMP (take a look at e.g. > http://www.cs.utexas.edu/users/moore/best-ideas/string-searching/) > - please remove the //TUTAJ SKONCZYLEM comment > - revert this change (as well as a few other that are similar): > - for (r_end = r + Z_STRLEN_P(return_value) - 1; r < r_end; ) { > + for ( r_end = r + Z_STRLEN_P( return_value ) - 1; r < r_end; ){ > (we like small diffs, not long diffs with changes that also break our > coding standards. e.g. we don't use space after the '(' char. Philip > wrote a nice article about diffs at > http://wiki.php.net/doc/articles/whitespace) > - in strrpos_reverse_kmp() I think you allocate 4 bytes less that you want > - I think you've too many comments.. We don't need 1 comment per line :) > > After fixing all these points and after running the test suite (with > valgrind) and make sure there are no regressions, I think it's safe for > you to commit. Still, I would like to see some performance figures > comparing the KMP and the Boyer-Moore (or point me some papers about the > subject). > > Thanks for your work and good luck for the rest of the SoC project :) > > Nuno > > > ----- Original Message ----- From: "Michal Dziemianko" > > To: > Sent: Monday, June 09, 2008 12:39 PM > Subject: [PHP-DEV] Algorithm Optimizations - string search > > >> Hello, >> Here: http://212.85.117.53/DIFF.txt is small patch that will speed >> up following functions: >> strpos, >> stripos, >> strrpos >> strripos, >> and probably some others (all that use zend_memnstr/php_memnstr >> function) >> >> The speedup of zend_memnstr is about 8% on average (about 30% in case >> of artificial strings). >> Functions strrpos and strripos are about 25% faster on average. >> >> The only [...] Content analysis details: (-4.2 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.8 ALL_TRUSTED Passed through trusted hosts only via SMTP 0.1 TW_RR BODY: Odd Letter Triples with RR 0.0 NORMAL_HTTP_TO_IP URI: Uses a dotted-decimal IP address in URL -2.6 BAYES_00 BODY: Bayesian spam probability is 0 to 1% [score: 0.0000] 0.1 AWL AWL: From: address is in the auto white-list Subject: Re: [PHP-DEV] Algorithm Optimizations - string search From: scott@macvicar.net (Scott MacVicar) There is rabin-karp too but its worse case is O(nm) so that might not be ideal, perhaps we should try to compare all of them. Scott Nuno Lopes wrote: > Hi, > > So some comments: > - you have some problems with the indentation. We only use tabs, so > please stick to that. Also, there are some lines that are not indented > correctly > - Have you considered the Boyer-Moore algorithm? I think it's a little > faster than KMP (take a look at e.g. > http://www.cs.utexas.edu/users/moore/best-ideas/string-searching/) > - please remove the //TUTAJ SKONCZYLEM comment > - revert this change (as well as a few other that are similar): > - for (r_end = r + Z_STRLEN_P(return_value) - 1; r < r_end; ) { > + for ( r_end = r + Z_STRLEN_P( return_value ) - 1; r < r_end; ){ > (we like small diffs, not long diffs with changes that also break our > coding standards. e.g. we don't use space after the '(' char. Philip > wrote a nice article about diffs at > http://wiki.php.net/doc/articles/whitespace) > - in strrpos_reverse_kmp() I think you allocate 4 bytes less that you want > - I think you've too many comments.. We don't need 1 comment per line :) > > After fixing all these points and after running the test suite (with > valgrind) and make sure there are no regressions, I think it's safe for > you to commit. Still, I would like to see some performance figures > comparing the KMP and the Boyer-Moore (or point me some papers about the > subject). > > Thanks for your work and good luck for the rest of the SoC project :) > > Nuno > > > ----- Original Message ----- From: "Michal Dziemianko" > > To: > Sent: Monday, June 09, 2008 12:39 PM > Subject: [PHP-DEV] Algorithm Optimizations - string search > > >> Hello, >> Here: http://212.85.117.53/DIFF.txt is small patch that will speed >> up following functions: >> strpos, >> stripos, >> strrpos >> strripos, >> and probably some others (all that use zend_memnstr/php_memnstr >> function) >> >> The speedup of zend_memnstr is about 8% on average (about 30% in case >> of artificial strings). >> Functions strrpos and strripos are about 25% faster on average. >> >> The only drawback - it needs some additional space (size of the needle). >> >> All functions pass all the tests. >> >> If it looks fine than I will apply for cvs account. >> >> Cheers, >> Michal > >