Newsgroups: php.internals
Date: Tue, 16 Mar 2010 20:10:56 +0100
Subject: Re: [PHP-DEV] Where are we ACTUALLY on Unicode?
From: pierre.php@gmail.com (Pierre Joye)
To: Rasmus Lerdorf
Cc: dreamcat four, Lester Caine, PHP internals

On Tue, Mar 16, 2010 at 7:32 PM, Rasmus Lerdorf wrote:

> Well, the obvious original reason is that ICU uses UTF-16 internally and
> the logic was that we would be going in and out of ICU to do all the
> various Unicode operations many more times than we would be interfacing
> with external things like MySQL or files on disk. You generally only
> read or write a string once from an external source, but you may perform
> multiple Unicode operations on that same string so avoiding a conversion
> for each operation seems logical.

Exactly, that's why I was not so definite about using UTF-8 over
UTF-16. I would like to evaluate both solutions with a small set of
PHP features (say some file ops, 1-2 DBs and part of the core string
functions) and see the impact of using UTF-8 or UTF-16. But it is
definitely not a small decision.

-- 
Pierre

@pierrejoye | http://blog.thepimp.net | http://www.libgd.org