PHP 6, part 2 ... Unicode

11 years ago by Lester Caine

1/ Use of a fast and light UTF-8 processing library for all core string operations
Once you start looking at ICU's UTF-8 handling, it does provide a stable
base. It is conversion to UTF-8 which seems to get in the way. But now that I
have had time to dig deeper, ICU's UCONFIG_NO_CONVERSION build switch is the
key. How many people today ACTUALLY handle anything other than UTF-8 in their
content? Conversion is only required when one finds material that is not
already UTF-8 ...
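
To illustrate the point that core string operations need no conversion step when the content is already UTF-8, here is a minimal sketch in plain C. It is not ICU code and not anything from the PHP source; the function name utf8_count is hypothetical. It validates a byte buffer as UTF-8 and counts code points directly on the bytes.

```c
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not an ICU or PHP API):
 * count the code points in a UTF-8 buffer without converting it to any other
 * encoding. Returns (size_t)-1 if the bytes are not well-formed UTF-8. */
size_t utf8_count(const unsigned char *s, size_t len)
{
    size_t i = 0, count = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need;          /* number of continuation bytes expected */
        unsigned long cp;     /* code point being assembled */

        if (c < 0x80)                     { need = 0; cp = c; }
        else if (c >= 0xC2 && c <= 0xDF)  { need = 1; cp = c & 0x1F; }
        else if (c >= 0xE0 && c <= 0xEF)  { need = 2; cp = c & 0x0F; }
        else if (c >= 0xF0 && c <= 0xF4)  { need = 3; cp = c & 0x07; }
        else return (size_t)-1;           /* stray continuation or bad lead byte */

        if (len - i - 1 < need)
            return (size_t)-1;            /* sequence truncated at end of buffer */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return (size_t)-1;        /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        /* Reject overlong forms, surrogates and values above U+10FFFF. */
        if (need == 2 && cp < 0x800)
            return (size_t)-1;
        if (need == 3 && (cp < 0x10000 || cp > 0x10FFFF))
            return (size_t)-1;
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return (size_t)-1;

        i += need + 1;
        count++;
    }
    return count;
}
```

Content that fails this kind of check is the only material that would need to go near a converter at all.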

2/ Use of intl for any advanced operations, localization or conversion?
Simply follows on from 1/
Certainly localizations like currency, timezone and calendar management should
use a common base, but I get the impression that 'locale' is still somewhat
messy when it comes to collations? This is one of those areas where the
language returned in a browser header may or may not give the right information,
just as we currently can't guarantee to guess the timezone correctly. But in a
way I view this as secondary to the UTF-8 debate. Even just full default support
for UTF-8 in the core would be better than the current piecemeal support.

3/ Support of UTF-8 for the language itself: as PHP currently allows non-ASCII
encoding in scripts, I would recommend that we stop supporting it, except in comments.
Million dollar question.
If ALL one is processing in UTF-8 is the content, then are we not already there?
Just use mbstring to manage the content and, as Pierre says, stop using UTF-8 in
identifiers and the like. While that would not affect me one bit, as I don't need
anything other than ASCII for my own identifiers, I think it is exactly this area
that other users are looking to upgrade, so that they can make scripts more
readable in their own languages.
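
For what an ASCII-only rule for identifiers would mean in practice, here is a rough sketch in C. The function name is_ascii_identifier is hypothetical and this is not PHP's actual lexer; it simply assumes the rule would be "a letter or underscore first, then letters, digits or underscores, all ASCII".

```c
#include <stddef.h>

/* Hypothetical check (an assumption for illustration, not PHP's lexer):
 * returns 1 if name consists only of ASCII letters, digits and underscores
 * and does not start with a digit, otherwise 0. */
int is_ascii_identifier(const char *name, size_t len)
{
    if (len == 0)
        return 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];
        int alpha = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
        int digit = (c >= '0' && c <= '9');

        /* Digits are allowed anywhere except the first position;
         * any byte >= 0x80 (part of a UTF-8 sequence) is rejected. */
        if (!(alpha || (digit && i > 0)))
            return 0;
    }
    return 1;
}
```

An identifier written in another language, such as one containing an accented letter, fails this check because every non-ASCII character is encoded in UTF-8 as bytes of 0x80 and above, which is exactly the readability cost raised above.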

I really am getting fed up with the website structure ... having to manually
change the domain in the address bar so you can search documentation when working
through the wiki or bugs is hopeless. There should be one search engine covering
the whole site, and NOT Google, since that does not work at all when you want
results in one language! Another example of 'locale' simply not working?

--
Lester Caine - G8HFL

Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

11 years ago by Crypto Compress

Hi,

"C++ Usage" discussion is somewhat of a dependency for "Unicode" as it
may add some more varieties of "awesome Unicode lib".

cryptocompress

11 years ago by Lester Caine

Crypto Compress wrote:

"C++ Usage" discussion is somewhat of a dependency for "Unicode" as it may add
some more varieties of "awesome Unicode lib".

I don't see that the two go together, and I'm with Pierre that introducing
C++ now would only delay progress. It would, however, be worth serious
consideration on a longer timeline. I still see Unicode as a stepping stone
which has already been incorporated into PHP by the back door and now simply
needs a few loose ends tying up. Making it a major version upgrade is more
about setting the ground rules for other areas than about Unicode per se.
Formalising, say, UTF-8 usage by introducing a few BC breaks opens the debate,
whereas trying to simply shoehorn the missing UTF-8 elements into PHP 5 may not
be so practical.

--
Lester Caine - G8HFL