Newsgroups: php.internals
From: pierre.php@gmail.com (Pierre Joye)
To: Lester Caine
Cc: PHP internals
Date: Fri, 21 Feb 2014 13:30:14 +0100
Subject: Re: [PHP-DEV] [php6] Unicode support, options?
In-Reply-To: <530740B9.5000509@lsces.co.uk>

On Fri, Feb 21, 2014 at 1:04 PM, Lester Caine wrote:
> Pierre Joye wrote:
>>>
>>> What do you understand by "storage"?
>>
>> To have string stored as UTF-8 only, no conversion required for 99% of our
>> use.
>
>
> I think that the first thing that needs to be agreed on is if there will be
> support for UTF-8 in the core? As has already been said, in many places this
> currently just works and so blocking that may be more of a problem now? The
> question surely is "What is the 1% that needs some extra work?"

I think we pretty much agree already that we need UTF-8 as the base,
meaning strings are stored in UTF-8. Conversions may be needed for advanced
usages provided by ICU (or maybe not, I just do not know for sure now).

> A light library would be most appropriate for filling the gaps currently
> created by use of UTF-8 strings in the core? It is not until one starts
> adding the mbstring level of string processing that a more powerful library
> is required. Something that simply ensures UTF-8 strings are valid and can
> carry out comparisons as required?

It is more than only comparison. If we only needed comparison, additions
and the like, utf8proc would be enough, or librope with some additions.

> The black hole is still 'case sensitivity' and it is perhaps laying down a
> 'light' set of rules for this which would allow a path forward? As I have
> indicated, I'd prefer simply dropping case insensitivity, but a compromise
> might be to retain it where a string length does not change, and a clean
> reverse transform exists? So a library that provides that comparison as part
> of the core package?

I do not care much about language support for UTF-8 names for methods,
functions, variables etc.
My take on it is that we should stick to ASCII for those and be done
with it. But that's only my opinion :)

We may end up writing our own library for the core operations... But I
would prefer to avoid that, as it is really not a trivial task.

Cheers,
-- 
Pierre

@pierrejoye | http://www.libgd.org
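
For illustration only: the "light" checks discussed in this thread (validating UTF-8, and keeping case insensitivity only where the string length does not change and the mapping reverses cleanly) could be sketched roughly as below. This is Python rather than C, it relies on Python's built-in Unicode tables instead of utf8proc or ICU, and the function names are hypothetical. It is a sketch of the idea under those assumptions, not a proposal for the core API.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string is well-formed UTF-8."""
    try:
        data.decode("utf-8")  # strict decoding rejects malformed sequences
    except UnicodeDecodeError:
        return False
    return True


def case_insensitive_key(name: str) -> str:
    """Hypothetical helper: a lookup key for an identifier.

    The name is lowercased only when that is "safe" in the sense
    suggested in the thread:
      * the UTF-8 byte length does not change, and
      * lower -> upper -> lower gives the same lowercase form back.
    Otherwise the name is returned unchanged, i.e. it stays
    case-sensitive.
    """
    lowered = name.lower()
    same_length = len(lowered.encode("utf-8")) == len(name.encode("utf-8"))
    round_trips = lowered.upper().lower() == lowered
    if same_length and round_trips:
        return lowered
    return name


if __name__ == "__main__":
    for sample in ("Foo", "Straße", "İstanbul"):
        print(repr(sample), "->", repr(case_insensitive_key(sample)))
```

Under these rules plain ASCII names fold as usual, while a name containing "ß" (its uppercase form is "SS", so the round trip fails) or "İ" (its lowercase form is longer in UTF-8) stays case-sensitive.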