Newsgroups: php.internals
From: rasmus@lerdorf.com (Rasmus Lerdorf)
Date: Wed, 16 Jul 2008 09:22:41 -0700
Subject: Re: [PHP-DEV] lstat call on each directory level
Message-ID: <487E2051.70000@lerdorf.com>
In-Reply-To: <1216224862.20625.33.camel@amirh>
To: Amir Hardon
CC: Arvids Godjuks, Oleg Grenrus, PHP Internals List

Amir Hardon wrote:
> On Wed, 2008-07-16 at 06:45 -0700, Rasmus Lerdorf wrote:
>> Arvids Godjuks wrote:
>> > Hello.
>> >
>> > I think this should be optimized.
>> > I'm not an expert of course, but as I understood there is only one case
>> > which needs special treatment - require_once/include_once when a file
>> > with equal contents is included from different directories.
>> > You can make a switch which controls whether lstat is called or not in
>> > these cases. People who know what they are doing will switch it off and
>> > make sure their includes don't result in a Fatal error (anyway, in my
>> > opinion it is bad design if such a thing happens).
>> > Of course open_basedir users won't get any benefit from it, but
>> > that's their choice.
>> > So I think you should think it through and make this optimization for
>> > the 5.3 release. It would be a great optimization, IMHO.
>>
>> But all these lstats should be getting cached, so I don't see how it
>> would affect performance very much. If you are blowing your realpath
>> cache, you need to take a look at why that is happening.
>>
>> We probably should disconnect clearstatcache() from the realpath_cache,
>> and we could perhaps look at doing partial path caches through our own
>> realpath implementation. The other thing that can suck is when you have
>> an include_path miss. We don't cache misses like this, so if you are
>> relying on include_path to find your files and you don't hit it on the
>> first try, you are going to see a bunch of stats. But that is again
>> something that is easily fixed by not writing crappy code.
>>
>> I think that breaking code that looks like this:
>>
>>    require_once './a.inc';
>>    require_once 'a.inc';
>>    require_once '../a.inc';
>>    require_once 'includes/a.inc';
>>
>> when these all refer to the same a.inc file depending on where the
>> parent file is and what the coder had for breakfast that morning would
>> be a very bad idea.
>>
>> -Rasmus
>>
>
> Since the realpath cache is only relevant for a single request (right?),
> removing these lstat calls will be a major benefit.

No, of course not. It would be a useless cache if that was the case.
The cache spans requests.

> Before moving our portal dir to the / dir, ~40% of our page requests
> were slow on the server side (I'm not sure if my company policies allow
> me to expose exactly what is considered slow),
> after moving it ~20% of the page requests were slow! This is significant.
> And there are still many lstat calls made inside our portal's directory
> tree.

Yes, you need to figure out why you are not hitting the cache, or why you
are seeing so many lstat calls. There are 3 main possibilities:

1. You have a crapload of files and they simply won't fit in the cache.
   Increase realpath_cache_size to address this.

2. Something somewhere is calling clearstatcache() often. Don't do that.

3. You are relying on include_path to find your files and you are always
   missing the file on the first couple of tries.

Hint: it is a good idea to get rid of "." from the beginning of your
include_path and always use "./foo.inc" to include something from the
current directory instead of having include_path do this for you. That
lets you put some other path first in the include_path, and you can then
include "path/file.inc" and have that be relative to the first path in
your include_path. And make sure all your include_path includes are
relative to that first path. Never include something that will hit the
second component of include_path.

-Rasmus
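
A minimal sketch of the include_path hint above, not taken from the thread. The directory layout and file names are hypothetical and are built in a temp directory so the script runs on its own. It assumes a PHP version that provides realpath_cache_size() (5.3.2 or later, so newer than the release discussed here).

```php
<?php
// Hypothetical layout, created under the system temp dir for illustration only.
$root = sys_get_temp_dir() . '/incpath_demo_' . getmypid();
@mkdir($root . '/lib/path', 0777, true);
file_put_contents($root . '/lib/path/file.inc', '<?php return "from lib";');
file_put_contents($root . '/foo.inc', '<?php return "from cwd";');

// Library root goes first; "." is left out of include_path entirely.
set_include_path($root . '/lib');
chdir($root);

// Found in the first (and only) include_path entry, so there is no
// miss and no extra round of stat calls on later entries.
$lib = include 'path/file.inc';

// A leading "./" bypasses include_path; the file is resolved against
// the current working directory.
$local = include './foo.inc';

echo $lib, "\n", $local, "\n";

// Rough view of realpath cache pressure: bytes in use vs. configured limit.
printf(
    "realpath cache: %d bytes used, limit %s\n",
    realpath_cache_size(),
    ini_get('realpath_cache_size')
);
```

If the bytes in use sit close to the limit on a busy server, that would point at possibility 1 in the list above, and raising realpath_cache_size in php.ini is the first thing to try.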