Newsgroups: php.internals
From: yohgaki@ohgaki.net (Yasuo Ohgaki)
To: Rowan Collins
Cc: "internals@lists.php.net"
Date: Mon, 2 Mar 2015 07:14:42 +0900
Subject: Re: [PHP-DEV] [RFC] UString

Hi Joe and Rowan,

On Mon, Mar 2, 2015 at 6:37 AM, Rowan Collins wrote:

> On 01/03/2015 20:34, Lester Caine wrote:
>
>> On 28/02/15 06:48, Joe Watkins wrote:
>>
>>> This is just a quick note to announce my intention to ready this RFC
>>> for voting next week.
>>>
>> Since there is nothing in this which needs any changes to the core then
>> surely it simply needs to exist in pecl until such time as a proper
>> replacement for unicode in core strings has been addressed? Since it
>> will still require intl to provide those areas it does not support, and
>> I question if we really need to provide yet another encoding converter.
>>
>> A unicode string handler that just handles UTF8 strings may be yet
>> another stepping stone, but it still falls short of being able to
>> handle all of the internationalization problems and is simply an
>> alternate to mbstring so one either runs both, or sit down and convert
>> all the third party libraries to eliminate mbstring.
>>
>> Like http extension, it's not essential that it's loaded by default, and
>> leaving it in pecl allows development outside that of the core?
>>
>>
> I think this is probably a good idea at this stage.
> It will give people a
> chance to play around with it in an "experimental" state before committing
> to maintaining a particular API.
>
> Since there's no real BC break here, there's no reason it couldn't be
> bundled into 7.1 if it was deemed ready by then, so it seems unwise to rush
> into including it in 7.0 straight from what feels like a prototype
> implementation.

Sounds reasonable.

Joe, I don't have much time to spare, but I'm willing to help with
UString development.

I think it's better to keep it simple. Having a unified internal encoding
(NFC-normalized UTF-8 without a BOM) for the internal string representation
would be much simpler than supporting multiple encodings.

We may consider various issues and ideas like these over the relatively
long term.

http://websec.github.io/unicode-security-guide/character-transformations/
http://docs.parrot.org/parrot/latest/html/docs/pdds/pdd28_strings.pod.html

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net