Newsgroups: php.internals
From: rowan.collins@gmail.com (Rowan Collins)
To: internals@lists.php.net
Date: Sun, 01 Mar 2015 21:27:49 +0000
Subject: Re: [PHP-DEV] [RFC] UString
Message-ID: <54F38455.5010704@gmail.com>

On 01/03/2015 20:59, Yasuo Ohgaki wrote:
> However, I don't mind too much allowing any encoding stored in "Text"/
> "UString" object. IIRC, Ruby does this and have not much problem.

As I understand it, Ruby's string type is actually a whole bunch of
overloaded types, each responsible for re-implementing the various
methods available. This leads to a long list of "partially supported"
encodings/codepages, which is a big pile of "leaky abstraction" for the
small benefit of avoiding re-encoding operations in a few scenarios.

Unicode is explicitly designed to supersede all previous encodings, so
it makes perfect sense to me to use it to internally represent what the
user just wants to think of as "text". That internal representation
still needs some byte-level encoding, which leads to the optimisation of
choosing the byte-level encoding the user is most likely to use for
input and output, i.e. UTF-8.

Regards,

-- 
Rowan Collins
[IMSoP]