Newsgroups: php.internals
From: dreamcat4@gmail.com (dreamcat four)
Date: Tue, 16 Mar 2010 11:48:00 +0000
Subject: Re: [PHP-DEV] Where are we ACTUALLY on Unicode?
To: Lester Caine
Cc: PHP internals
Message-ID: <99cf22521003160448k5028ae61y70e1e61428d13280@mail.gmail.com>
In-Reply-To: <4B9F4196.9030404@lsces.co.uk>
References: <4B9C9007.1080802@lsces.co.uk> <4B9EC3B2.7070901@zend.com> <4B9F4196.9030404@lsces.co.uk>
Content-Type: text/plain; charset=UTF-8

On Tue, Mar 16, 2010 at 8:30 AM, Lester Caine wrote:
> '3' is not a very processor friendly number, so working with 4 even though
> wasteful on memory, does make perfect sense. How long is it since we had a
> 640k limit on working memory? SERVERS should have a good amount of memory
> for caching information anyway. SO is UTF-16 the right approach for
> processing wide strings? It needs special code to handle everything wider
> than 16 bits, but at what gain really? If all core functionality is handled
> as 32 bit characters is there that much of an overhead over the additional
> processing to get around strings of dissimilar sizes in UTF-16 ?

Just to reinforce some of Lester's points above.

4 bytes per character is never slower than 2 bytes per character... it's
faster if anything. Bear in mind that 4 bytes has been the de facto register
size on all modern CPUs / 32-bit microarchitectures since... like... forever.
Give a C compiler 4 bytes of data and it'll say: thank you very much, and
more of the same please! It keeps 'em happy ;)

Sure, UTF-16 can make sense, but only if your external representations are
also in UTF-16. So what are the default Unicode settings for MySQL,
PostgreSQL, etc.? Are they always set to UTF-8, or to UTF-16? Just do the
same as them.
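
To put rough numbers on the trade-off, here is a small Python sketch (not PHP internals code, just an illustration; the helper names encoded_sizes and utf16_code_units are made up for this example). It shows how many bytes the same text takes in UTF-8, UTF-16 and UTF-32, and that a character outside the 16-bit range needs two UTF-16 code units, which is the "special code" case Lester mentions.

```python
# Compare the storage cost of the same text in the three Unicode encodings.
# The "-le" variants are used so that no byte-order mark is counted.

def encoded_sizes(text):
    """Return a dict mapping encoding name to the byte count for text."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(text.encode(enc)) for enc in encodings}


def utf16_code_units(text):
    """Return how many 16-bit code units UTF-16 needs for text."""
    return len(text.encode("utf-16-le")) // 2


if __name__ == "__main__":
    samples = [
        "hello",       # plain ASCII
        "日本語",       # three characters inside the 16-bit range
        "\U0001D11E",  # MUSICAL SYMBOL G CLEF, outside the 16-bit range
    ]
    for s in samples:
        print(repr(s),
              "code points:", len(s),
              "UTF-16 units:", utf16_code_units(s),
              encoded_sizes(s))
```

With UTF-32 the byte count is always 4 times the number of code points, so indexing is trivial. With UTF-16 the last sample is one code point but two code units, so any code that assumes "one unit = one character" breaks on it.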