Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:38159 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 16529 invoked from network); 11 Jun 2008 08:02:38 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 11 Jun 2008 08:02:38 -0000 Authentication-Results: pb1.pair.com smtp.mail=Tex.Texin@netapp.com; spf=pass; sender-id=pass Authentication-Results: pb1.pair.com header.from=Tex.Texin@netapp.com; sender-id=pass Received-SPF: pass (pb1.pair.com: domain netapp.com designates 216.240.18.37 as permitted sender) X-PHP-List-Original-Sender: Tex.Texin@netapp.com X-Host-Fingerprint: 216.240.18.37 mx2.netapp.com Received: from [216.240.18.37] ([216.240.18.37:47598] helo=mx2.netapp.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id E6/EC-26183-A968F484 for ; Wed, 11 Jun 2008 04:02:36 -0400 X-IronPort-AV: E=Sophos;i="4.27,623,1204531200"; d="scan'208";a="91532126" Received: from smtp1.corp.netapp.com ([10.57.156.124]) by mx2-out.netapp.com with ESMTP; 11 Jun 2008 01:02:31 -0700 Received: from svlexrs02.hq.netapp.com (svlexrs02.corp.netapp.com [10.57.156.154]) by smtp1.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id m5B826Tt016534; Wed, 11 Jun 2008 01:02:31 -0700 (PDT) Received: from SACEXMV02.hq.netapp.com ([10.99.190.109]) by svlexrs02.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.1830); Wed, 11 Jun 2008 01:02:22 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Date: Wed, 11 Jun 2008 01:01:28 -0700 Message-ID: <819912BDAE6BCB4097883B226DA473B10B0AC8B4@SACEXMV02.hq.netapp.com> In-Reply-To: <484D36EB.9080202@macvicar.net> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [PHP-DEV] Algorithm Optimizations - string search Thread-Index: AcjKOPJGwG017Kh3TfiUT5ak+bTldQBX+sKg References: <7E62CA6E-83F4-4F9C-86FB-75EBE7D489C9@gmail.com> <484D36EB.9080202@macvicar.net> To: "Scott MacVicar" , "Nuno Lopes" Cc: , "Michal Dziemianko" X-OriginalArrivalTime: 11 Jun 2008 08:02:22.0534 (UTC) FILETIME=[7A7A7660:01C8CB99] Subject: RE: [PHP-DEV] Algorithm Optimizations - string search From: Tex.Texin@netapp.com ("Texin, Tex") When I looked at the code, I assumed that it wasn't intended for = international use I'll have to go back and look to give you details, but it doesn't work = for international use or unicode. It would be ok for 8859-1.=20 > -----Original Message----- > From: Scott MacVicar [mailto:scott@macvicar.net]=20 > Sent: Monday, June 09, 2008 6:58 AM > To: Nuno Lopes > Cc: internals@lists.php.net; Michal Dziemianko > Subject: Re: [PHP-DEV] Algorithm Optimizations - string search >=20 > There is rabin-karp too but its worse case is O(nm) so that=20 > might not be ideal, perhaps we should try to compare all of them. >=20 > Scott >=20 > Nuno Lopes wrote: > > Hi, > >=20 > > So some comments: > > - you have some problems with the indentation. We only use tabs, so=20 > > please stick to that. Also, there are some lines that are=20 > not indented=20 > > correctly > > - Have you considered the Boyer-Moore algorithm? I think=20 > it's a little=20 > > faster than KMP (take a look at e.g. > > http://www.cs.utexas.edu/users/moore/best-ideas/string-searching/) > > - please remove the //TUTAJ SKONCZYLEM comment > > - revert this change (as well as a few other that are similar): > > - for (r_end =3D r + Z_STRLEN_P(return_value) - 1; r <=20 > r_end; ) { > > + for ( r_end =3D r + Z_STRLEN_P( return_value ) - 1; r=20 > < r_end;=20 > > ){ (we like small diffs, not long diffs with changes that=20 > also break=20 > > our coding standards. e.g. we don't use space after the '(' char.=20 > > Philip wrote a nice article about diffs at > > http://wiki.php.net/doc/articles/whitespace) > > - in strrpos_reverse_kmp() I think you allocate 4 bytes=20 > less that you=20 > > want > > - I think you've too many comments.. We don't need 1=20 > comment per line=20 > > :) > >=20 > > After fixing all these points and after running the test suite (with > > valgrind) and make sure there are no regressions, I think it's safe=20 > > for you to commit. Still, I would like to see some=20 > performance figures=20 > > comparing the KMP and the Boyer-Moore (or point me some=20 > papers about=20 > > the subject). > >=20 > > Thanks for your work and good luck for the rest of the SoC=20 > project :) > >=20 > > Nuno > >=20 > >=20 > > ----- Original Message ----- From: "Michal Dziemianko"=20 > > > > To: > > Sent: Monday, June 09, 2008 12:39 PM > > Subject: [PHP-DEV] Algorithm Optimizations - string search > >=20 > >=20 > >> Hello, > >> Here: http://212.85.117.53/DIFF.txt is small patch that=20 > will speed=20 > >> up following functions: > >> strpos, > >> stripos, > >> strrpos > >> strripos, > >> and probably some others (all that use zend_memnstr/php_memnstr > >> function) > >> > >> The speedup of zend_memnstr is about 8% on average (about=20 > 30% in case=20 > >> of artificial strings). > >> Functions strrpos and strripos are about 25% faster on average. > >> > >> The only drawback - it needs some additional space (size=20 > of the needle). > >> > >> All functions pass all the tests. > >> > >> If it looks fine than I will apply for cvs account. > >> > >> Cheers, > >> Michal > >=20 > >=20 >=20 > -- > PHP Internals - PHP Runtime Development Mailing List To=20 > unsubscribe, visit: http://www.php.net/unsub.php >=20 >=20