Subject: Re: [PHP-DEV] Unicode Implementation
From: derick@php.net (Derick Rethans)
Date: Fri, 7 Oct 2005 22:35:00 +0200 (CEST)
To: Andrei Zmievski
Cc: PHP Developers Mailing List

On Thu, 6 Oct 2005, Andrei Zmievski wrote:

> On Oct 6, 2005, at 10:56 AM, Derick Rethans wrote:
> >
> > I think I would prefer an IS_UNICODE/unicode=on only PHP.
> >
> > This would mean that:
> > - no duplicate functionality for tons of functions that will make
> >   maintaining the thing very hard
>
> This is true.
>
> > - a cleaner (and a bit faster) Unicode implementation
>
> This is true too.
>
> > - we have a bit less BC.
>
> "A bit less"? I'd say it would break BC in a major way. People who want to
> upgrade to PHP 6 would need to rewrite a lot of their scripts.

Can you please specify which things you think will break? I've given it
some thought but couldn't really think of anything serious...

> > This is something I find quite not acceptable, and we need to figure
> > out a way on how to optimize this - for substr the penalty is
> > probably what we are using an iterator and not a direct memcpy
> > (because of surrogates), I am not so sure about the others.
>
> We can try switching to _UNSAFE versions of the iterator macros - they
> assume well-formed UTF-16, so they will be somewhat faster.

That's worth a try - I'll put that on my todo list somewhere.

Derick

-- 
Derick Rethans
http://derickrethans.nl | http://ez.no | http://xdebug.org
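
To illustrate the substr point quoted above (an iterator instead of a direct memcpy, because of surrogates), here is a minimal sketch of the difference between a checked and an unchecked step over UTF-16. The helper names are hypothetical and this is not the actual PHP or ICU code; the "_UNSAFE" macros mentioned presumably correspond to ICU's U16_FWD_1_UNSAFE / U16_NEXT_UNSAFE family in unicode/utf16.h, which likewise skip the bounds and pairing checks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not ICU's or PHP's code. */

static int is_lead(uint16_t u)  { return (u & 0xFC00) == 0xD800; }
static int is_trail(uint16_t u) { return (u & 0xFC00) == 0xDC00; }

/* Checked step: treats two units as one code point only when a lead
 * surrogate is followed, within bounds, by a trail surrogate. */
static size_t fwd_1_safe(const uint16_t *s, size_t i, size_t len)
{
    if (is_lead(s[i]) && i + 1 < len && is_trail(s[i + 1])) {
        return i + 2;
    }
    return i + 1;
}

/* Unchecked step: assumes well-formed UTF-16, so a lead surrogate is
 * always taken to start a pair. Fewer branches, but wrong (and possibly
 * past the end by one unit) on malformed input. */
static size_t fwd_1_unsafe(const uint16_t *s, size_t i)
{
    return is_lead(s[i]) ? i + 2 : i + 1;
}

/* Map a code point offset to a code unit offset. A substr() counted in
 * code points needs this walk before it can copy anything, which is why
 * it cannot be a plain memcpy from a fixed offset. */
size_t cp_to_unit_offset_safe(const uint16_t *s, size_t len, size_t n_cp)
{
    size_t i = 0;
    while (n_cp > 0 && i < len) {
        i = fwd_1_safe(s, i, len);
        n_cp--;
    }
    return i;
}

size_t cp_to_unit_offset_unsafe(const uint16_t *s, size_t len, size_t n_cp)
{
    size_t i = 0;
    while (n_cp > 0 && i < len) {
        i = fwd_1_unsafe(s, i);
        n_cp--;
    }
    return i;
}
```

On well-formed input both variants return the same offsets; they differ only when a lead surrogate is not followed by a trail surrogate, which is the case the unchecked version gives up on in exchange for less work per step.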