Newsgroups: php.internals
Message-ID: <33413.4e3f6321.1272993612.nsm@avilys.eik.lt>
Date: Tue, 4 May 2010 20:20:12 +0300 (EEST)
To: internals@lists.php.net
Subject: Re: [PHP-DEV] Re: Turkish/Azeri locale support
From: tokul@users.sourceforge.net ("Tomas Kuliavas")

2010.05.04 17:56 Derick Rethans wrote:
> On Tue, 4 May 2010, Adam Harvey wrote:
>
>> The options are:
>>
>> 1. Apply Tomas's patch to make case-insensitive lookups
>> locale-ignorant. Pros: fixes immediate problem. Cons: breaks BC for
>> case-insensitive function/method name lookups for high-bit characters
>> in single-byte encodings. (Not that we've ever advertised or
>> documented that.)
>
> People *do* do this though.
>
> I'm for option 2.

Changing to 100% case-insensitive function names has a bigger
probability of a BC break. I think I have seen code that called
functions in a way that depended on case-insensitive lookups, and the
same code had problems with Turkish, because the case-insensitive
dependency was only on Latin I.

Option 1 maintains BC for ASCII names. High-bit characters keep working
only in some locales. You will be lucky until you hit something in the
0xC0-0xDF range that does not have a direct match in the 0xE0-0xFF
range. You will enter a minefield if you use 0x80-0xBF, and the code
will be hosed when the locale does not support any of the usual
ISO-8859-1 high-bit character matching.

-- 
Tomas
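
To make the byte ranges above concrete, here is a rough sketch in Python (not PHP's actual C code, and not the patch itself). It assumes Python's Unicode case tables are a fair stand-in for a C locale's tolower(). The names ascii_lower, turkish_lower and unmatched_upper_bytes are made up for this illustration.

```python
# Sketch only: Python's Unicode case tables stand in for a C locale's
# tolower(). All function names here are hypothetical.


def ascii_lower(name: bytes) -> bytes:
    """Locale-ignorant lowering: fold only A-Z, leave every other byte alone."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in name)


def turkish_lower(name: bytes) -> bytes:
    """Model of tolower() under a Turkish ISO-8859-9 locale (assumption):
    ASCII 'I' (0x49) becomes dotless i (0xFD) instead of 'i'."""
    out = []
    for b in name:
        if b == 0x49:
            out.append(0xFD)
        elif 0x41 <= b <= 0x5A:
            out.append(b + 0x20)
        else:
            out.append(b)
    return bytes(out)


def unmatched_upper_bytes(encoding: str) -> list:
    """Bytes in 0xC0-0xDF whose lowercase form is not the byte 0x20 above."""
    unmatched = []
    for b in range(0xC0, 0xE0):
        upper = bytes([b]).decode(encoding)
        partner = bytes([b + 0x20]).decode(encoding)
        if upper.lower() != partner:
            unmatched.append(b)
    return unmatched


if __name__ == "__main__":
    # A function defined as "info" and called as "INFO":
    print(ascii_lower(b"INFO"))    # b'info'      -> lookup succeeds
    print(turkish_lower(b"INFO"))  # b'\xfdnfo'   -> lookup fails
    print([hex(b) for b in unmatched_upper_bytes("iso-8859-1")])
    print([hex(b) for b in unmatched_upper_bytes("iso8859-9")])
```

Under these assumptions ISO-8859-1 has two bytes in 0xC0-0xDF without a direct partner (0xD7 and 0xDF), and ISO-8859-9 adds 0xDD, the dotted capital I. The ASCII-only fold never touches any of them, which is why ASCII names stay safe under option 1.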