Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:39394 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 37593 invoked from network); 28 Jul 2008 06:56:05 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 28 Jul 2008 06:56:05 -0000 Authentication-Results: pb1.pair.com header.from=dmitry@zend.com; sender-id=pass Authentication-Results: pb1.pair.com smtp.mail=dmitry@zend.com; spf=pass; sender-id=pass Received-SPF: pass (pb1.pair.com: domain zend.com designates 212.25.124.163 as permitted sender) X-PHP-List-Original-Sender: dmitry@zend.com X-Host-Fingerprint: 212.25.124.163 il-gw1.zend.com Windows 2000 SP4, XP SP1 Received: from [212.25.124.163] ([212.25.124.163:11901] helo=il-gw1.zend.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 50/F5-31471-38D6D884 for ; Mon, 28 Jul 2008 02:56:04 -0400 Received: from [10.1.10.10] ([10.1.10.10]) by il-gw1.zend.com with Microsoft SMTPSVC(6.0.3790.3959); Mon, 28 Jul 2008 09:56:39 +0300 Message-ID: <488D6D77.6000300@zend.com> Date: Mon, 28 Jul 2008 10:55:51 +0400 User-Agent: Thunderbird 2.0.0.16 (Windows/20080708) MIME-Version: 1.0 To: Matt Wilmas CC: internals@lists.php.net, Andi Gutmans , Stanislav Malyshev References: <00ea01c8a160$2edd8160$0201a8c0@pc1> <016c01c8eccd$e28cfac0$0201a8c0@pc1> <488835D0.1040005@zend.com> <00e501c8ed77$3498cf20$0201a8c0@pc1> <48889901.8000008@zend.com> <022c01c8ee74$5b539340$0201a8c0@pc1> <488C5464.1030706@zend.com> <003c01c8efed$98d23620$0201a8c0@pc1> In-Reply-To: <003c01c8efed$98d23620$0201a8c0@pc1> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 28 Jul 2008 06:56:40.0047 (UTC) FILETIME=[15FCEFF0:01C8F07F] Subject: Re: [PHP-DEV] New optimization idea; was: No runtime fetching of built-in global constants From: dmitry@zend.com (Dmitry Stogov) Hi Matt, It would be nice if scanner check for "#!" at start of file and skip the whole first line if found. Nothing else. Of curse it won't save syscall itself. Now shebang line is checked (in CGI SAPI code) before call to compiler and in case of scanner modification this code might be removed. So with opcode caches, which override the compiler routine, this syscall will be saved. I'll review your constant notes/patches later today. Thanks. Dmitry. Matt Wilmas wrote: > Hi Dmitry, > > ----- Original Message ----- > From: "Dmitry Stogov" > Sent: Sunday, July 27, 2008 > >> Hi Matt, >> >> At first as you are a scanner expert, I would like you to look into >> another optimization idea. >> >> Probably for historical reason PHP supports shebang lines >> (#! /usr/bin/php) on top of php files. Especially to handle them PHP >> (CGI/FastCGI/CLI) opens file and check for it. So even with opcode >> caches FastCGI PHP does open syscall for the requested script, however >> with opcode caches it's absolutely useless. >> >> In case PHP scanner will handle shebang lines itself, we will able to >> save this syscall. > > Sorry, but now I'm the one who's confused here, since I have no idea what > I'm supposed to look into exactly. :-/ I know about shebang lines, and I > know there's a check in the scanner now to skip over it (must have been > somewhere else with flex...). PHP itself doesn't "use" the shebang, does > it? I don't have much knowledge of *nix system stuff, but I thought the > shebang is for the OS to use when asked to run an executable script... > > So I don't understand what could be changed in the scanner to save the open > syscall -- since if the scanner is called, the file has already been opened, > right? Again, sorry, but I'll need more explanation. :-) > >> I never had time and enough flex/re2c knowledge to implement this idea >> myself. May be you'll able to look into the problem. In case you find a >> simple solution we will able to do it in php-5.3. >> >> Most PHP hosters and large sites use FastCGI with opcode caches (it is >> also the primary way for MS Windows users), so this optimization is >> really important. >> >> [see below] > > Yes, more reply below... > >> Matt Wilmas wrote: >>> Hi Dmitry, >>> >>> I saw that you commited this patch, with the addition of only replacing >>> persistent constants (just mentioning for others on the list). The > attached >>> patches have a few tweaks: >>> >>> The main thing I noticed is that if "something" creates a persistent, >>> case-INsensitive constant (any of those in "standard" PHP, besides >>> TRUE/FALSE/NULL?), it could still wrongly be substituted. My change >>> eliminates that possibility. >> After yesterday's subbotnik I'm so stupid so cannot understand simple >> tings. :) >> Could you point me into the exact part of the patch. > > Sure, I'll explain more. If something (extension, etc...) creates a > persistent constant "foo" and it does NOT have the CONST_CS flag, and > capital "FOO" or whatever is used in a script (and a case-sensitive version > doesn't exist, of course), it will be lowercased, as you know, and "foo" > will be found, but that can't be used since a case-sensitive version may be > defined later (TRUE/FALSE/NULL are exceptions, but OK to do since they have > CONST_CT_SUBST). Current zend_get_ct_const() behavior after finding > case-insensitive "foo": > > if ((c->flags & CONST_CS) && ...) { > efree(lookup_name); > return NULL; /* Doesn't fail for "foo" since CONST_CS isn't > set... */ > ..... > if (c->flags & CONST_CT_SUBST) { > return c; /* This doesn't happen since CONST_CT_SUBST isn't set... */ > } > if ((c->flags & CONST_PERSISTENT) && > ...) { > return c; /* This DOES happen since CONST_PERSISTENT is set */ > } > > The last part is the problem, of course. Substitution should only be done > for ones that have: > > CONST_CT_SUBST > or > CONST_PERSISTENT and whose case matches that used in the script (CONST_CS or > lowercase is used, e.g. "foo" in script is OK, not "FOO" or "Foo") > > (BTW, that also implies that CT_SUBST is only useful for case-insensitive > constants (like T/F/N) and if something sets it with (CS | PERSISTENT), it > doesn't really do anything -- except not allow it to be used in "const ..." > declarations. :-P Though that could be desired, by something, and/or to > also have substitution even in a namespace.) > > With my change, after finding case-insensitive "foo": > > if ((c->flags & CONST_CT_SUBST) && !(c->flags & CONST_CS)) { > efree(lookup_name); > return c; /* Happens for TRUE/FALSE/NULL */ > } > } > efree(lookup_name); > return NULL; /* Happens for "foo" before persistent check */ > > Hope that wasn't TOO verbose. :-O I just swapped the logic (succeed vs > fail) of that top part so the else { } block could be removed. > >>> Checking Z_TYPE(c->value) != IS_CONSTANT[_ARRAY] isn't needed with the >>> persistent check now, is it? >> I think you are right, but it's better to have this checks, because >> nobody prohibit to create array constants in extensions. > > OK. :-) > >>> From orginal patch (zend_constants.c part), the memcmp(...) != 0 isn't >>> needed as it will always be true. If it wasn't, the *first* hash lookup >>> wouldn't have failed. :-) Like I said in the original message, it's old >>> code from a long time ago, but was never removed... >> I'll check it with more clear head. > > I looked through the CVS logs/versions for zend_constants.c again, and the > memcmp() was needed before v1.40 (6 years ago). That's when all constants > were lowercased first, and then case was checked. 1.40 added case-sensitive > lookup first: memcmp() no longer necessary, as I said. And this old code > was then copied to zend_get_ct_const(), etc. That code area shouldn't get > hit very often or anything, but might as well remove since it's redundant! > :-) > >> Thanks. Dmitry. > > - Matt >