Newsgroups: php.internals
Xref: news.php.net php.internals:60267
Date: Tue, 24 Apr 2012 09:59:23 +0200
To: Galen Wright-Watson
Cc: "C.Koy", internals@lists.php.net
Content-Type:
 multipart/alternative
Subject: Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
From: tyra3l@gmail.com (Ferenc Kovacs)

On Tue, Apr 24, 2012 at 1:06 AM, Galen Wright-Watson wrote:

> On Mon, Apr 23, 2012 at 3:22 AM, C.Koy wrote:
>
> > On 4/22/2012 11:32 PM, Galen Wright-Watson wrote:
> >
> >> 2012/4/22 C.Koy
> >>
> >>> On 4/21/2012 4:37 AM, Galen Wright-Watson wrote:
> >>>
> >>> But, I did not start this thread to discuss such bug fix, because:
> >>>
> >>> 1. It does not take a genius to figure it out, and should take
> >>> minutes to implement for someone experienced in the internals.
> >>> Given the 10 year span and dozens of comments/complaints on the
> >>> bug's entry, it's hard to say this issue went unnoticed. So I had
> >>> to conclude that such fix has quietly been overruled for
> >>> performance and/or other undisclosed reasons.
> >>>
> >>
> >> Why does it matter if a solution is simple?
> >>
> >
> > It doesn't matter, you've misunderstood.
> >
> > You've misunderstood me. While you may have set out with the goal of
> discussing making PHP completely case-sensitive, that doesn't preclude
> others from suggesting fixes for the specific bug you mention. Indeed,
> some of the first e-mails were around the bug, and not just in the
> context of case-sensitive PHP.
>
> I didn't introduce the custom case conversion solution as a
> counter-argument to case-sensitive PHP, and I wasn't asking for feedback
> on that solution in the context of case-sensitive PHP; I was asking for
> reasons why it wouldn't be a suitable solution for the bug.
> The only place case-sensitive PHP enters into it was your statement
> that:
>
> > As the recent comments on that page indicate, there's not a
> > deterministic way to resolve this issue, apart from eliminating
> > tolower() calls for function/class names during lookup. Hence totally
> > case-sensitive PHP.
>
> My proposition shows this isn't entirely true, and branches off from the
> original discussion at that point. I'm focusing on fixing the bug, which
> is a smaller issue than case-sensitivity. Discussion of case-sensitivity
> can continue without regard to the custom conversion solution. As such,
> I've changed the subject of this e-mail.
>
> Furthermore, going back to your original e-mail, you explicitly stated
> it was about the bug, making case sensitivity subordinate to it.
>
> > This post is about bug #18556
> > (https://bugs.php.net/bug.php?id=18556)
> > which is a decade old.
>
> I hope you can see why others might take the bug to be the context for
> case-sensitivity, rather than the other way around.
>
> > And that's what makes me curious and confused about why this bug still
> > exists. See, I'm drawing a conclusion with what little information I
> > have, and stating the reasonings it's based on (first two statements).
> > Overall, that and the item following it were an explanation of "why
> > I'm suggesting a major feature change in solution to a specific bug",
> > although no one directly asked me to.
> >
> > In other words, you jumped to a conclusion. I wasn't asking about
> possible reasons why custom conversion hasn't been accepted as the
> solution to this bug. Neither was I asking why you didn't suggest it. I
> was (and still am) asking for explicit, justifiable reasons as to
> whether or not it's a suitable solution to the bug.
>
> >> If it's already been rejected privately, it's time to bring the
> >> reasons into the open (which is why I asked).
> >> If not, it should be considered publicly.
> >>
> >
> > A comment dated 2002-09-26 on bug's page states the bug is fixed. The
> > next comment dated 2006-02-17 states it reappeared.
> > I don't know who did what 10, 6 years ago but it's been revoked. Why?
> > That was the main reason I deemed this bug not fixable, hence suggest
> > other ways to resolve.
> >
> > I don't know either, but I'm not about to disregard potential fixes if
> they haven't been publicly discussed. The regression could just as
> easily have been a mistake. From looking at the original fix (revision
> 97040, http://svn.php.net/viewvc?view=revision&revision=97040, authored
> by iliaa) and the bug comments, something along the lines of what I'm
> suggesting has been suggested and even implemented before, but there's
> no real discussion of it. The original fix (zend_str_tolower_nlc)
> assumed ASCII, which isn't entirely suitable as there are uppercase
> characters that it doesn't convert, which suggests yet another reason
> for the regression, namely that using zend_str_tolower would convert the
> characters that zend_str_tolower_nlc missed.
>
> As for the real reason why the bug reappeared, we can continue on in our
> historical examination. Revision 99001
> (http://svn.php.net/viewvc?view=revision&revision=99001, also authored
> by iliaa) replaced zend_str_tolower with zend_str_tolower_nlc, making
> all internal Zend case conversion use ASCII. iliaa had this to say about
> the change (http://news.php.net/php.zend-engine.cvs/478):
>
> > It appears that there no reason to keep both zend_str_tolower_nlc and
> > zend_str_tolower. zend_str_tolower_nlc can be safely renamed to
> > zend_str_tolower. The places it is used in, do not appear to depend on
> > locale. For people who do need it there is an alternative php function
> > php_strtolower, which they can use, which does respect the locale.
> > So, if there are no objections I'll prepare a patch that will change
> > zend_str_tolower_nlc to zend_str_tolower.
>
> Revision 128057
> (http://svn.php.net/viewvc?view=revision&revision=128057, authored by
> sterling) adds zend_str_tolower for use in fast_call_user_function,
> which makes use of tolower rather than a custom conversion. Revision
> 128060 (http://svn.php.net/viewvc?view=revision&revision=128060, same
> author) then changes zend_str_tolower to use tolower instead of its
> custom ASCII-based conversion. The commit message is: "make this faster
> and sexier". Within these revisions, zend_lookup_class is case
> sensitive. This change, in combination with 99001, masks the reason for
> the custom conversion.
>
> zend_tolower and the use of tolower_l were introduced by revision 224372
> (http://svn.php.net/viewvc?view=revision&revision=224372, authored by
> stas (hi, Stas!)). The commit message is: "Improve tolower()-related
> functions on Windows and VC2005 by caching locale and using tolower_l
> function."
>
> There are plenty of other edits to Zend functions affecting case
> handling (look over the commit messages listed in
> http://svn.php.net/viewvc/php/php-src/trunk/Zend/zend_operators.c?view=log&pathrev=225000
> ) that make similar tweaks involving case conversion and the character
> encoding. What are we to conclude from all this? That the custom
> conversion was a bug fix was lost as the file was edited and different
> people worked on it. In other words, the fix was not lost due to a
> conscious decision made by anyone, but rather the typical reason for
> regression (in the original sense of the word): there's too much for
> anyone to keep all of it in mind at once, so someone can easily
> re-introduce a bug without being aware of it.
>
> I trust this demonstrates that "there must be an undisclosed reason"
> isn't a justifiable reason not to implement my proposed solution.
>
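
For readers following along, here is a minimal sketch of the kind of locale-independent ("custom") case conversion discussed above. It is not the actual Zend code: the names ascii_tolower, ascii_str_tolower and ascii_names_equal are made up for illustration, and the sketch assumes an ASCII-compatible character set. Only the bytes A-Z are converted, so the result does not change with setlocale(), whereas the C library's tolower() is locale-dependent and, in a Turkish locale, need not map 'I' to 'i'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: lowercase a single byte if it is an ASCII
 * uppercase letter; every other byte (including bytes >= 0x80) is
 * returned unchanged. Never consults the current locale. */
static char ascii_tolower(char ch)
{
    return (ch >= 'A' && ch <= 'Z') ? (char)(ch - 'A' + 'a') : ch;
}

/* Hypothetical helper: lowercase the first len bytes of str in place,
 * using the ASCII-only rule above. */
static void ascii_str_tolower(char *str, size_t len)
{
    assert(str != NULL);
    for (size_t i = 0; i < len; i++) {
        str[i] = ascii_tolower(str[i]);
    }
}

/* Hypothetical helper: compare two NUL-terminated identifiers,
 * ignoring ASCII case only. Returns 1 if they match, 0 otherwise. */
static int ascii_names_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *b != '\0') {
        if (ascii_tolower(*a) != ascii_tolower(*b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == *b; /* equal only if both strings ended together */
}
```

With helpers like these, a name such as "InfoClass" would normalize to "infoclass" whatever locale the script has set. The trade-off is the one noted above: uppercase characters outside A-Z are left unconverted.
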
> >> The abstract property that makes a locale problematic is obvious. I
> >> was looking for specific locales, as they need to be identified for a
> >> complete solution.
> >
> > I'm not locale expert. Given the public complaints/bugs we can, in
> > practice, assume this affects Turkish and Azerbaijani only. (I don't
> > know about Kurdish)
> >
> > Kurdish is mentioned by Mike and Tokul in the comments for the bug. I
> could easily have come to the same conclusion, but I want an answer from
> someone who knows without needing to make any assumptions. Are there any
> locale experts (or someone willing to put in the leg-work) reading this
> with a conclusive answer to my question about problematic locales?
>

Thanks for digging this out.

PS: you had a few extra > at the end of the first lines of your
sentences. I experienced similar problems with Gmail; the solution for me
was to always put an extra new line after the quoted text.

-- 
Ferenc Kovács
@Tyr43l - http://tyrael.hu