Newsgroups: php.internals
Date: Fri, 3 Feb 2006 09:12:33 +0100
From: helly@php.net (Marcus Boerger)
To: Christian Schneider
Cc: Andrei Zmievski, PHP Developers Mailing List
Subject: Re: [PHP-DEV] Re: Unicode string iterator performance
Message-ID: <134017377.20060203091233@marcus-boerger.de>
In-Reply-To: <43E2AF9F.60607@cschneid.com>
References: <43E2AF9F.60607@cschneid.com>

Hello Christian,

caching? There is nothing to cache. And even if we did that, we would
have to make every string an object, since we would need to invalidate
the position cache on write operations. I also agree with the others that
the most common usage would be accessing a few chars and probably changing
them. And *I* have never had code where I used the same position twice,
apart from the all-time favorite: searching for backslash and forward slash.
But that can be done better using the right search functions anyway. Also,
looking for backslashes and changing them to forward slashes can be done
with iterators. Checking whether the second char is a ':' (a common use
case under Windows) is best done with [], but that's a one-time read access.
The one place where I still see caching having an optimization effect is
sequential scanning. But for all of that, iterators and functions are much
better. So I am convinced that the cache would only blow up the code, make
everything much more complex and in the end slow down PHP.

best regards
marcus

Friday, February 3, 2006, 2:19:27 AM, you wrote:

> Andrei Zmievski wrote:
>> I am not sure how we can optimize [] to be faster than the iterator
>> approach. Food for thought?

> You could cache the last position (PHP- and Unicode string index) and
> start from there. This assumes that most accesses are (more or less)
> sequential. If you can step backward as well as forward you could use
> the cached version for both directions but even if you can only go
> forward it would cover the most common case I guess.

> Very simple idea but maybe it helps,
> - Chris

Best regards,
 Marcus
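
The trade-off argued in this thread can be illustrated with a minimal sketch. It is not PHP's implementation: it assumes the string is stored as UTF-16 code units, and every name in it (decode_at, iter_code_points, CachedString, the steps counter) is hypothetical. It shows the forward-only position cache Chris proposes, the per-string state and write invalidation Marcus objects to, and the plain sequential iterator that needs neither.

```python
# Hypothetical sketch only; none of these names come from the PHP source.

def to_units(text):
    """Encode a Python str as a list of UTF-16 code units (assumed storage)."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]

def decode_at(units, offset):
    """Return (code point, width in units) for the character at a unit offset."""
    unit = units[offset]  # raises IndexError past the end
    if 0xD800 <= unit <= 0xDBFF and offset + 1 < len(units):
        low = units[offset + 1]
        if 0xDC00 <= low <= 0xDFFF:
            return 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00), 2
    return unit, 1

def iter_code_points(units):
    """Iterator approach: one sequential pass, no state kept on the string."""
    offset = 0
    while offset < len(units):
        code_point, width = decode_at(units, offset)
        yield code_point
        offset += width

class CachedString:
    """[] access with a cache of the last (code point index, unit offset) pair."""

    def __init__(self, text):
        self.units = to_units(text)
        self._cache = (0, 0)
        self.steps = 0  # characters walked so far, to make the cost visible

    def code_point_at(self, index):
        cp_index, offset = self._cache
        if index < cp_index:
            # Forward-only cache: a backward access restarts from the beginning.
            cp_index, offset = 0, 0
        while cp_index < index:
            offset += decode_at(self.units, offset)[1]
            cp_index += 1
            self.steps += 1
        code_point = decode_at(self.units, offset)[0]
        self._cache = (cp_index, offset)
        return code_point

    def replace_at(self, index, char):
        """Write access: unit offsets may shift, so the cache is dropped."""
        self.code_point_at(index)
        offset = self._cache[1]
        width = decode_at(self.units, offset)[1]
        self.units[offset:offset + width] = to_units(char)
        self._cache = (0, 0)  # conservative invalidation on every write

s = CachedString("a\U0001D11Eb/c")
s.code_point_at(2)  # walks 2 characters from the start
s.code_point_at(3)  # walks 1 more, thanks to the cache
s.code_point_at(1)  # backward access: cache is useless, walks from 0 again
```

The cache only pays off for the sequential pattern, which iter_code_points already handles in a single pass, while the cache and its invalidation have to live on every string.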