Newsgroups: php.internals
From: amirh@metacafe.com (Amir Hardon)
To: Rasmus Lerdorf
Cc: PHP Internals List
Subject: Re: [PHP-DEV] lstat call on each directory level
Date: Tue, 15 Jul 2008 22:44:15 +0300
Message-ID: <1216151055.17692.9.camel@amirh>
In-Reply-To: <487CF99F.8090701@lerdorf.com>
References: <1216133436.6875.35.camel@amirh> <487CEF26.7030802@lerdorf.com> <1216149220.16451.10.camel@amirh> <487CF99F.8090701@lerdorf.com>
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="=-Mpk7/1tZ6LPyrkSp5HR2"

--=-Mpk7/1tZ6LPyrkSp5HR2
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

On Tue, 2008-07-15 at 12:25 -0700, Rasmus
Lerdorf wrote:
> Amir Hardon wrote:
> > On Tue, 2008-07-15 at 11:40 -0700, Rasmus Lerdorf wrote:
> >> Amir Hardon wrote:
> >> > I've noticed a weird behavior when doing file access from PHP:
> >> > PHP seems to make an lstat call on each of the parent directories of the
> >> > accessed file, for example see this script:
> >> >
> >> > <?php
> >> > $fp=fopen("/var/www/metacafe/test","r");
> >> > fclose($fp);
> >> > ?>
> >> >
> >> > When running with strace -e lstat I see this:
> >> > lstat("/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> >> > lstat("/var/www", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0
> >> > lstat("/var/www/metacafe", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> >> > lstat("/var/www/metacafe/test", 0x7fbfff9b10) = -1 ENOENT (No such file or directory)
> >> >
> >> > Measuring total syscalls time for an apache process on a production
> >> > server, I found out
> >> > that ~33% of the time it spends in syscalls is spent on lstat.
> >> >
> >> > I did a pretty deep web search on the issue and came out with nothing.
> >> > I'll also note that I did a small experiment - moving our root portal
> >> > folder to /,
> >> > this gave an amazing performance improvement!
> >> >
> >> > So my questions are:
> >> > What is the reason for doing these lstat calls?
> >> > How can it be disabled? if not by configuration, maybe by patching php
> >> > (can you direct me to where is this being done in php's source?)
> >>
> >> That's a realpath() call and it should be getting cached by the realpath
> >> cache, so if you are seeing these on every request, try increasing your
> >> realpath_cache size in your .ini. Without checking the realpath, you
> >> would be able to circumvent open_basedir checking really easily with a
> >> symlink.
> >>
> >> -Rasmus
> >
> > I've already increased the realpath_cache to the point it didn't give
> > any more benefit(And it did give benefit),
> > but there are still many lstat calls, and still placing our portal dir
> > in the root directory gave a huge performance benefit(After fine-tuning
> > realpath_cache).
> > We don't use open_basedir.
> >
> > I think it might be wise to make this dir check configurable, as the
> > performance impact is major.
> > Anyway - can you please direct me to the place where this check is made
> > in php's source, so I'll be able to disable it manually?
>
> Well, it is used in other places too, like in figuring out _once paths.
> Including the same file using different paths still needs to be caught.
>
> Are you calling clearstatcache() manually anywhere? That blows away the
> entire realpath cache and completely destroys your performance, so you
> will want to avoid doing that very often.
>
> -Rasmus
>

About clearstatcache() - we are not using it at all.

Correct me if I'm wrong, but this realpath cache is a per-request cache
(when using PHP as an Apache module). If so, the performance benefit I'm
getting from moving the portal to the / dir is to be expected: our code is
split into many files, and each file that is required generates a few
lstat calls.

About the issue with _once: does the patch Derick offered handle it? (I
haven't examined it yet.) If not, I just need to make sure that the same
file isn't referenced by two different paths and I'm safe, right? (I mean,
assuming I adjust the patch to PHP 5.)

Thanks again to both of you!

-Amir.

--=-Mpk7/1tZ6LPyrkSp5HR2--
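
For reference, the per-level lstat pattern seen in the strace output quoted in this thread can be illustrated with a small sketch. This is not PHP's internal realpath code: path_prefixes() and count_lstat_calls() are hypothetical helpers that only mimic the observed pattern (one lstat per path component, stopping at the first component that does not exist). The sketch assumes a Unix-style absolute path and PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of PHP): list every path prefix that a
// component-by-component resolver would have to lstat() for an absolute path.
function path_prefixes(string $path): array
{
    $prefixes = [];
    $current = '';
    foreach (explode('/', $path) as $part) {
        if ($part === '') {
            continue; // skip the empty piece before the leading "/" and any "//"
        }
        $current .= '/' . $part;
        $prefixes[] = $current;
    }
    return $prefixes;
}

// Hypothetical helper: lstat() each prefix in order and stop at the first
// one that does not exist, mirroring the ENOENT at the end of the trace.
function count_lstat_calls(string $path): int
{
    $calls = 0;
    foreach (path_prefixes($path) as $prefix) {
        $calls++;
        if (@lstat($prefix) === false) {
            break; // nothing deeper can exist, so stop walking
        }
    }
    return $calls;
}

$path = '/var/www/metacafe/test';
print_r(path_prefixes($path));
printf("lstat calls for one uncached lookup: %d\n", count_lstat_calls($path));

// The realpath cache settings discussed in the thread, as seen by this process.
printf(
    "realpath_cache_size=%s realpath_cache_ttl=%s bytes_in_use=%d\n",
    ini_get('realpath_cache_size'),
    ini_get('realpath_cache_ttl'),
    realpath_cache_size()
);
```

For /var/www/metacafe/test the helper lists four prefixes, while for /test it lists only one, which is consistent with the observation that a shallower portal directory means fewer lstat calls for every lookup that misses the realpath cache.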