Newsgroups: php.internals
From: tokul@users.sourceforge.net ("Tomas Kuliavas")
To: internals@lists.php.net
Date: Mon, 19 Apr 2010 09:35:51 +0300 (EEST)
Subject: Re: [PHP-DEV] Turkish/Azeri locale support
Message-ID: <43869.c316b4e9.1271658951.nsm@avilys.eik.lt>
In-Reply-To: <6BF24C36133C4E919CFAA5A66278D531@pc>
2010.04.19 07:59 Stan Vassilev wrote:
>> As at least some of you would already be aware, there's a
>> long-standing issue with using PHP in a Turkish or Azeri locale,
>> namely that case-insensitive lookups within the Zend engine (method
>> names, for example) fail on lookups involving upper-case I characters,
>> since lower-case I in those languages is ı instead of i (note the lack
>> of a dot).
>
>> The long term plan for this, per bug #35050 and any number of
>> duplicates, was to deal with it in PHP 6. Since PHP 6 isn't going to
>> happen in its original form, I think we're going to need to revisit
>> how we want to deal with this. There's a patch linked in the bug from
>> Tomas Kuliavas and Marcus that fixes the problem by simply redefining
>> zend_tolower() to a simple locale-insensitive ASCII tolower()
>> function, which does fix the Turkish and Azeri locales.
>
> As you illustrated in your post, fixing it for locales becomes...
> complicated.
>
> If you ask me, there's only one way to fix this, which is how most other
> languages fixed it: make the next major version of PHP case-sensitive for
> all identifiers. For less bugs, less locale problems and more performance.
>
> It was somewhat-the-plan, even before the Turkish locale issue was brought
> up.

Fixing the issue is not complicated. I could do it without any C coding
background. Your (@php.net) developers only have to learn that they should
not use locale-sensitive functions while assuming that English case rules
apply. This is the main lesson Turkey teaches any coder. If you continue to
ignore it, you will continue to trigger PHP bugs in Turkey.

For n years PHP has used only locale-sensitive functions for
case-insensitive operations. You never bothered to fix that. Deferring it to
some distant PHP 6 feature does not help Turks. The patch for #35050 is not
something that should break things. You reviewed the patch and commented on
it, I updated it based on your style comments, and you continued to ignore
the problem.
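To make the approach described in the quoted text concrete, here is a minimal C sketch of a locale-insensitive ASCII lowercase helper and a name comparison built on it. It is an illustration only, not the patch attached to bug #35050; the names ascii_tolower and ascii_names_equal are hypothetical and do not exist in the Zend engine.

```c
/* Hypothetical helper: lowercase only ASCII 'A'..'Z'.
 * It never consults the current locale, so 'I' always maps to 'i',
 * whatever LC_CTYPE is set to. All other values pass through unchanged. */
int ascii_tolower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical helper: case-insensitive equality for identifiers,
 * using ASCII rules only. Returns 1 if equal, 0 otherwise. */
int ascii_names_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (ascii_tolower((unsigned char)*a) != ascii_tolower((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == *b;
}
```

With helpers like these, a lookup for "GetInfo" matches "getinfo" regardless of the process locale, because the mapping of I to i no longer depends on LC_CTYPE.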
The excuse that the patch breaks something is funny, because Win32 builds
are set to use internal (not for public use) Microsoft C library calls that
are locale-insensitive. If some PHP code breaks when string functions are
locale-insensitive, please show that code. I would like to see what
i18n-unportable PHP 5 programming looks like.

If users want to use the Turkish locale here and now, they must set LC_CTYPE
to C. This workaround disables all locale-specific quirks, and only gettext
must be set to use the correct charset for all translations. The other fix
is more complex: PHP scripts must replace all locale-sensitive native
functions with their own locale-insensitive replacements and pray that
supported PHP versions don't trigger bugs when LC_CTYPE is not C.

--
Tomas