Newsgroups: php.internals
Message-ID: <4F4BF855.8090305@lsces.co.uk>
Date: Mon, 27 Feb 2012 21:40:37 +0000
To: PHP internals
References: <4F49E737.9050002@sugarcrm.com>
Subject: Re: [PHP-DEV] bugs.php.net & php 6
From: lester@lsces.co.uk (Lester Caine)

John Crenshaw wrote:
> Wait, is the default going to be "Unicode" (wide, always 2 bytes per
> char, i.e. more memory consumption) or "UTF-8" (1 byte for the first
> 127, more bytes for wider text, mostly unchanged memory consumption)?
> I thought it was originally a conversion to Unicode, but that was
> scrapped? Can someone clarify?

I seem to recall that the original plan was along the lines of the
Windows wide string (16-bit code units). But Unicode is wider than
16 bits, so that approach was simply wrong. Trying to shoehorn things
into the wrong structure was just not working and was making the job
more difficult.

Keeping things simple really requires 4 bytes per character, even if one
of those bytes is never used, and that does make sense when manipulating
strings. However, most of the time UTF-8 works happily and only becomes
a problem when a multibyte character gets cropped because the processing
does not know about it.

-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php
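
To make the size trade-off and the cropping problem from the message above concrete, here is a minimal sketch. It is written in Python purely for illustration and is not PHP's implementation or any proposed API. The names encoded_sizes and truncate_utf8 are hypothetical, and the sketch assumes the input is valid UTF-8 and that max_bytes is not negative.

```python
def encoded_sizes(text):
    """Return the byte length of text in three Unicode encodings (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


def truncate_utf8(data, max_bytes):
    """Cut valid UTF-8 bytes to at most max_bytes without splitting a character.

    Hypothetical helper: assumes data is well-formed UTF-8 and max_bytes >= 0.
    """
    if len(data) <= max_bytes:
        return data
    end = max_bytes
    # Continuation bytes have the bit pattern 10xxxxxx. If the first byte we
    # are about to drop is one of those, the cut falls inside a character,
    # so step back until the cut lands on the start of a character.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


if __name__ == "__main__":
    sample = "na\u00efve \u20ac5 \U0001F600"  # mixes 1, 2, 3 and 4 byte characters
    print(encoded_sizes(sample))

    euro = "\u20ac5".encode("utf-8")  # euro sign (3 bytes) followed by "5"
    naive = euro[:2]                  # byte-oblivious crop splits the euro sign
    try:
        naive.decode("utf-8")
    except UnicodeDecodeError:
        print("naive crop produced invalid UTF-8")
    print(truncate_utf8(euro, 3).decode("utf-8"))
```

The fixed 4-byte form (UTF-32) never needs the boundary check, because every cut at a multiple of 4 is a code point boundary. That is the simplicity the message refers to, and it is paid for in memory. The 16-bit form still needs a similar check for characters outside the Basic Multilingual Plane, which are stored as surrogate pairs.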