Newsgroups: php.internals
Date: Thu, 21 Jun 2007 08:04:22 +0300 (EEST)
To: "Andrei Zmievski"
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] What is the use of "unicode.semantics" in PHP 6?
From: tokul@users.sourceforge.net ("Tomas Kuliavas")

> No one is going to write code in their own native language and
> distribute it worldwide.
>
> How can you say that "PHP6 Unicode support is not designed for
> international environment"? Have you even tried it?

OK. International environment. Do you have strtoupper|strtolower|strcasecmp functions that operate in LC_CTYPE=C without switching the locale? If I remember correctly, PHP does not use such functions even internally, and developers keep triggering the same Turkish|Kurdish|Azerbaijani bug in different functions.

If I want case conversion or case-insensitive comparison functions to follow C rules and not LC_CTYPE=some_translation, I am forced to use my own functions, because strtoupper|strtolower are definitely locale-aware in PHP.

PHP6 unicode.semantics=on reduces my options and forces me to recode all 8-bit string operations. After recoding, the functions are not backwards compatible with anything lower than 5.2.1. Your slides show that Unicode characters are defined with \u, yet you mess with octals (\300) and hexadecimals (\xC0).

It is possible that I am wrong and that I will be able to do everything more efficiently in PHP6. But for now I have broken password encryption handling, broken handling of binary strings, and overly noisy stream functions. And I still haven't checked how it will handle streams with data encoded in different character sets. I will be forced to rewrite the code if PHP6 forces me to work with unicode.semantics=on. Don't expect me to praise PHP6 for that.
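To illustrate the case-conversion point above, here is a minimal sketch in C of helpers that follow C-locale (ASCII-only) rules no matter what LC_CTYPE is set to. The names ascii_toupper, ascii_strtoupper and ascii_strcasecmp are hypothetical and are not part of PHP or the C library; the sketch only shows the kind of function one ends up writing by hand.

```c
/* Hypothetical helper: upper-case one byte using ASCII rules only.
 * Bytes outside 'a'..'z', including every 8-bit value, pass through
 * unchanged, whatever the current LC_CTYPE setting is. */
unsigned char ascii_toupper(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? (unsigned char)(c - 'a' + 'A') : c;
}

/* Hypothetical helper: upper-case a NUL-terminated string in place. */
void ascii_strtoupper(char *s)
{
    for (; *s != '\0'; ++s) {
        *s = (char)ascii_toupper((unsigned char)*s);
    }
}

/* Hypothetical helper: case-insensitive comparison with ASCII folding
 * only. Returns <0, 0 or >0, in the same way as strcmp(). */
int ascii_strcasecmp(const char *a, const char *b)
{
    for (;; ++a, ++b) {
        unsigned char ca = ascii_toupper((unsigned char)*a);
        unsigned char cb = ascii_toupper((unsigned char)*b);
        if (ca != cb || ca == '\0') {
            return (int)ca - (int)cb;
        }
    }
}
```

Because these never consult the locale, 'i' always maps to 'I', which is exactly what a locale-aware conversion does not guarantee under a Turkish locale, and bytes such as \xC0 in an 8-bit string are left alone.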
You are helping the same people who ask others to turn on mbstring.func_overload in php.ini in order to get Unicode support. You are not helping people who already have code working with 8-bit strings in different character sets.

-- 
Tomas