Newsgroups: php.internals
Date: Fri, 03 Feb 2006 11:44:10 +0100
Message-ID: <43E333FA.1030309@cschneid.com>
To: Marcus Boerger
Cc: Andrei Zmievski, PHP Developers Mailing List
References: <43E2AF9F.60607@cschneid.com> <134017377.20060203091233@marcus-boerger.de>
In-Reply-To:
 <134017377.20060203091233@marcus-boerger.de>
Subject: Re: [PHP-DEV] Re: Unicode string iterator performance
From: cschneid@cschneid.com (Christian Schneider)

First of all, I was deliberately proposing a very generic concept without
bothering about the implementation. If it's not feasible then simply
ignore it.

Marcus Boerger wrote:
> caching? There is nothing to cache. And even if we would do that we would
> make every string an object since we would need to invalidate the position
> cache on write operations. Also i agree with the others that most common

Tracking changes to the string could be tricky, agreed. I don't know
enough about the internal handling of strings. From the user's perspective
PHP strings look somewhat immutable, but that could be very wrong
internally; you know better than me. Changing Unicode strings in place
sounds kinda tricky to me too, so I'd have expected that to be
encapsulated somewhere.

> And *I* never had code where I used the same position twice. Besides the all

You don't need to access the exact same position. If you know the last
array index plus the Unicode offset then you can step by Unicode
characters from there, which would result in one single Unicode step per
iteration when iterating over a string. But it would also work for
$a[$i += 2], as opposed to the originally proposed TextIterator. And if
there's a way to step backwards then $a[$i -= 2] could work too.

> So i am convinced that the cache would only blow up the code, make everything
> much more complex and in the end slow down php.

Could well be. It was just an idea, feel free to ignore it ;-)

- Chris
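
A minimal sketch of the stepping idea described above, in C. It assumes (the thread does not say so) that the string is stored as well-formed UTF-16 code units. The names cp_cursor, cp_cursor_init, cp_cursor_seek and cp_at are hypothetical and are not PHP internals APIs. The sketch remembers the last character index together with its code unit offset and steps forward or backward from there. It does not attempt to invalidate the cached position on writes, which is the concern Marcus raises.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor: caches the last code point index that was looked up
   and the UTF-16 code unit offset at which that code point starts. */
typedef struct {
    const uint16_t *units; /* UTF-16 code units, assumed well-formed */
    size_t len;            /* number of code units */
    size_t cp_index;       /* cached code point index */
    size_t unit_offset;    /* code unit offset of that code point */
} cp_cursor;

static int is_high(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

void cp_cursor_init(cp_cursor *c, const uint16_t *units, size_t len) {
    c->units = units;
    c->len = len;
    c->cp_index = 0;
    c->unit_offset = 0;
}

/* Move to code point `target` by stepping from the cached position.
   Returns 0 on success, -1 if `target` is out of range (cache unchanged). */
int cp_cursor_seek(cp_cursor *c, size_t target) {
    size_t idx = c->cp_index, off = c->unit_offset;

    while (idx < target) {            /* step forward one code point */
        if (off >= c->len) return -1;
        if (is_high(c->units[off]) && off + 1 < c->len && is_low(c->units[off + 1]))
            off += 2;                 /* surrogate pair */
        else
            off += 1;
        idx++;
    }
    while (idx > target) {            /* step backward one code point */
        assert(off > 0);              /* holds because idx > 0 here */
        off--;
        if (off > 0 && is_low(c->units[off]) && is_high(c->units[off - 1]))
            off--;                    /* landed on the low half of a pair */
        idx--;
    }
    if (off >= c->len) return -1;

    c->cp_index = idx;
    c->unit_offset = off;
    return 0;
}

/* Return the code point at index `target`, or -1 if out of range. */
int32_t cp_at(cp_cursor *c, size_t target) {
    if (cp_cursor_seek(c, target) != 0) return -1;
    size_t off = c->unit_offset;
    uint16_t u = c->units[off];
    if (is_high(u) && off + 1 < c->len && is_low(c->units[off + 1]))
        return 0x10000 + (((int32_t)u - 0xD800) << 10)
                       + ((int32_t)c->units[off + 1] - 0xDC00);
    return u;
}
```

With a cursor like this, a loop that calls cp_at(&c, i) for i = 0, 1, 2, ... takes one step per call, and jumps such as i += 2 or i -= 2 take two steps from the cached position instead of a rescan from the start of the string.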