Newsgroups: php.internals
Date: Tue, 1 May 2012 11:11:44 -0700
Subject: Re: [PHP-DEV] Fixing bug #18556 (was: Complete case-sensitivity in PHP)
From: ww.galen@gmail.com (Galen Wright-Watson)
To: "C.Koy"
Cc: internals@lists.php.net

On Thu, Apr 26, 2012 at 3:45 AM, C.Koy wrote:

> As of 5.3.0 this bug does not exist for function names. Only classes and
> interfaces.
>

Turns out, if you cause a function to be called dynamically (e.g. by using a variable function), the bug will surface.

Could this be a clue for how to fix it for those as well?

Function names are generally resolved at compile time, before the call to setlocale in the script has been executed. Dynamic function names are resolved at run time, which is why the bug surfaces for them. Class name resolution is put off until execution time for autoloading and possibly other purposes.

Converting class names to lowercase at compile time may work. A quick glance at the source shows that class_name, fully_qualified_class_name and class_name_reference all depend on namespace_name, which is the rule responsible for parsing the class name.

    namespace_name:
            T_STRING { $$ = $1; }
        |   namespace_name T_NS_SEPARATOR T_STRING { zend_do_build_namespace_name(&$$, &$1, &$3 TSRMLS_CC); }
    ;

However, static_scalar also depends on namespace_name, and I don't believe that symbol should be made case-insensitive. Creating an additional symbol for case-insensitive names would allow a more targeted approach. The various class symbols would then rely on this new symbol, rather than on namespace_name.

    lc_namespace_name:
            T_STRING { zend_str_tolower($1); $$ = $1; }
        |   lc_namespace_name T_NS_SEPARATOR T_STRING { zend_str_tolower($3); zend_do_build_namespace_name(&$$, &$1, &$3 TSRMLS_CC); }
    ;

Converting class names to lowercase early may have additional consequences. It may affect class names in error messages, for example (I didn't dig deep enough to determine this).

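To make the lc_namespace_name idea concrete outside the parser, here is a minimal C sketch of what its two actions would amount to: lowercase each segment as it is parsed, then join it to the name built so far. The names ascii_lower_byte and lc_append_segment are hypothetical (they are not Zend functions), and the fixed ASCII mapping is an assumption standing in for whatever conversion the real rule would call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Lowercase one byte with a fixed ASCII mapping (no locale lookup). */
char ascii_lower_byte(char c)
{
    return (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c;
}

/*
 * Hypothetical stand-in for the lc_namespace_name actions: append one
 * name segment to `out`, lowercased, preceded by a backslash when `out`
 * already holds earlier segments.
 * Returns 0 on success, -1 if the result would not fit in `cap` bytes.
 */
int lc_append_segment(char *out, size_t cap, const char *segment)
{
    assert(out != NULL && segment != NULL);

    size_t used = strlen(out);
    size_t seg_len = strlen(segment);
    size_t sep = used > 0 ? 1 : 0;

    /* Room for existing text, separator, segment and terminator. */
    if (used + sep + seg_len + 1 > cap) {
        return -1;
    }
    if (sep) {
        out[used++] = '\\';
    }
    for (size_t i = 0; i < seg_len; i++) {
        out[used + i] = ascii_lower_byte(segment[i]);
    }
    out[used + seg_len] = '\0';
    return 0;
}
```

Starting from an empty buffer, appending "Foo" and then "BarINFO" leaves "foo\barinfo" in the buffer. Because the name is normalized once, up front, a later setlocale call has no say in how it is spelled.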
__CLASS__ should be unaffected by early lowercasing (when defining a class, the class name is parsed as a T_STRING; the value for __CLASS__ comes from this symbol). Lowercasing at compile time also won't resolve the bug for dynamic names.

I suspect that altering variable_class_name and dynamic_class_name_reference in the manner described previously (use a custom lowercase conversion or temporarily switch locale) to convert the name would resolve the bug in the dynamic case for class names. Changing a number of the production rules for function_call in a similar manner should resolve the bug for dynamic function calls. Again, there will likely be unintended consequences.

Alternatively, updating zend_do_begin_dynamic_function_call() and zend_do_fetch_class() to use a custom conversion should resolve the bug in the dynamic case.

I like the idea of using the system default locale for name conversion (making name resolution independent of the current locale), but am concerned that it will make name lookup slow. Instead, a second set of locale-independent, Unicode-aware conversion functions (basically, iliaa's original solution, but Unicode compatible) to be used for identifiers would make name resolution independent of the current locale. Any time an identifier needs to be converted, it would use one of these functions. As a run-time optimization, non-dynamic class names could use the system locale conversion, but that would be a separate thing from resolving this bug.
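For illustration, a locale-independent conversion for identifiers could look roughly like the C sketch below. It is not iliaa's patch and not part of the Zend API: ident_lower_byte, ident_str_tolower and ident_equals are hypothetical names. It also makes one simplifying assumption: only ASCII A-Z is folded, and every other byte is left as-is. That keeps UTF-8 sequences intact, but it does not case-fold non-ASCII letters, so it is only a first step toward being Unicode-aware.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fixed ASCII mapping; never calls tolower(), so setlocale() cannot affect it. */
unsigned char ident_lower_byte(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c - 'A' + 'a') : c;
}

/* Lowercase an identifier in place; bytes outside A-Z pass through unchanged. */
void ident_str_tolower(char *str, size_t length)
{
    assert(str != NULL || length == 0);
    for (size_t i = 0; i < length; i++) {
        str[i] = (char)ident_lower_byte((unsigned char)str[i]);
    }
}

/* Case-insensitive identifier comparison built on the same mapping. */
int ident_equals(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t len = strlen(a);
    if (len != strlen(b)) {
        return 0;
    }
    for (size_t i = 0; i < len; i++) {
        if (ident_lower_byte((unsigned char)a[i]) !=
            ident_lower_byte((unsigned char)b[i])) {
            return 0;
        }
    }
    return 1;
}
```

With this mapping, "INFO" and "info" compare equal whatever locale the script has set, since an uppercase I always becomes an ASCII i. The cost per byte is a range check and an addition, which should ease the concern about slowing down name lookup.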