Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:72732
Mailing-List: contact internals-help@lists.php.net; run by ezmlm
Received-SPF: error (pb1.pair.com: domain lsces.co.uk from 217.147.176.204 cause and error)
Message-ID: <530740B9.5000509@lsces.co.uk>
Date: Fri, 21 Feb 2014 12:04:09 +0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:27.0) Gecko/20100101 Firefox/27.0 SeaMonkey/2.24
MIME-Version: 1.0
To: internals@lists.php.net
References: <CAEZPtU4zzN=Xxs03XbBGEM2UTVe3jXH5TAaXUqjsVkzm3kOOyg@mail.gmail.com>	<53061982.2050901@googlemail.com>	<CAEZPtU5qnhwcq1BCkqG99dXJ8F=p6K8ABALacCiAOvPBQ=huYQ@mail.gmail.com>	<53066DE9.4090809@googlemail.com> <CAEZPtU6b+aLuma-nxy84BYVZdOOTr7+ZGZhOVvWMNfbZ4RdPNg@mail.gmail.com>
In-Reply-To: <CAEZPtU6b+aLuma-nxy84BYVZdOOTr7+ZGZhOVvWMNfbZ4RdPNg@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [PHP-DEV] [php6] Unicode support, options?
From: lester@lsces.co.uk (Lester Caine)

Pierre Joye wrote:
>> What do you understand by "storage"?
> To have string stored as UTF-8 only, no conversion required for 99% of our use.

I think that the first thing that needs to be agreed on is whether there will be 
support for UTF-8 in the core. As has already been said, in many places this 
currently just works, so blocking that may be more of a problem now. The 
question surely is "What is the 1% that needs some extra work?"

A light library would be most appropriate for filling the gaps currently created 
by use of UTF-8 strings in the core. It is not until one starts adding the 
mbstring level of string processing that a more powerful library is required. 
Something that simply ensures UTF-8 strings are valid and can carry out 
comparisons as required?
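
To give a feel for how small that "light" layer could be, here is a minimal 
sketch in Python (chosen only for brevity; it is not PHP or C and not a proposed 
API). The names is_valid_utf8 and utf8_compare are hypothetical, and the sketch 
assumes strings are held as raw bytes, relying on Python's strict UTF-8 decoder 
for the validity check.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8 (hypothetical helper)."""
    try:
        # Python's strict decoder rejects overlong forms, lone surrogates
        # and truncated sequences.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def utf8_compare(a: bytes, b: bytes) -> int:
    """Compare two UTF-8 strings; returns -1, 0 or 1 (hypothetical helper).

    Byte-wise ordering of valid UTF-8 matches code point ordering, so no
    decoding is needed once both inputs are known to be valid.
    """
    if not (is_valid_utf8(a) and is_valid_utf8(b)):
        raise ValueError("input is not valid UTF-8")
    return (a > b) - (a < b)
```

Nothing here needs locale data or conversion tables; that only becomes 
necessary once collation or case handling is wanted.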

The black hole is still 'case sensitivity', and perhaps laying down a 
'light' set of rules for it would allow a path forward? As I have 
indicated, I'd prefer simply dropping case insensitivity, but a compromise might 
be to retain it only where the string length does not change and a clean reverse 
transform exists. So a library that provides that comparison as part of the core 
package?
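
One possible reading of that compromise, again as a Python sketch rather than a 
proposal: fold a character to lower case only when the result is a single 
character of the same UTF-8 byte length and upper-casing it gives back the 
original. The names light_fold_char, light_fold and light_equal are 
hypothetical, and using Python's built-in str.lower()/str.upper() as the case 
tables is an assumption made for illustration.

```python
def light_fold_char(ch: str) -> str:
    """Fold one character only if the mapping is length-preserving and reversible."""
    lower = ch.lower()
    if lower == ch:
        return ch  # already lower case, or has no case
    if len(lower) != 1:
        return ch  # expands to several characters (e.g. U+0130)
    if len(lower.encode("utf-8")) != len(ch.encode("utf-8")):
        return ch  # byte length would change (e.g. Kelvin sign U+212A)
    if lower.upper() != ch:
        return ch  # no clean reverse transform
    return lower


def light_fold(s: str) -> str:
    """Apply the restricted fold to every character of s."""
    return "".join(light_fold_char(c) for c in s)


def light_equal(a: str, b: str) -> bool:
    """Case-insensitive equality under the restricted rules above."""
    return light_fold(a) == light_fold(b)
```

Under these rules "ÉCOLE" and "école" compare equal, while "Straße" and 
"STRASSE" do not, because that mapping changes the string length.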

I think that moving forward, ICU support is essential, but it is difficult while 
the 'wrong' defaults are applied, and I am seeing private builds being used in 
other projects to get around that hurdle. Hence my question as to whether people 
are taking that approach.

-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk