Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:12590
Mailing-List: contact internals-help@lists.php.net; run by ezmlm
Message-ID: <5.1.0.14.2.20040905231035.086788e0@localhost>
Date: Sun, 05 Sep 2004 23:30:46 +0300
To: Russ Garrett <russ@last.fm>
Cc: internals@lists.php.net
In-Reply-To: <20040905193319.91641.qmail@pb1.pair.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Subject: Re: [PHP-DEV] Really odd PHP problem
From: zeev@zend.com (Zeev Suraski)
References: <20040905193319.91641.qmail@pb1.pair.com>

Is your server really unusable w/o a compiled code cache?  If not, try 
removing it and see if the problem persists.  One of the problems of most 
opcode caches is that a crash bug in PHP or one of its modules can end up 
resulting in a full server crash.

I have to say, though, that it doesn't look that way to me.  At first 
glance, it appears to be the standard 'spiraling crash'.  What that 
basically means is:

1.  For whatever reason, the number of Apache processes rises (typically 
due to increased end user load, but sometimes also because some 
administration script is being run, database slowdown, etc.).
2.  The machine hits the swap threshold, which causes it to slow down much 
more (typically by an order of magnitude, at least).
3.  Because of the slowdown, the increased number of Apache processes 
quickly becomes saturated (it takes each one more time to serve the 
request), and with new requests flowing in, the number of Apache processes 
increases even more.
4.  More swap is necessary for the increased number of Apache processes, 
and a hopeless spiral begins, typically ending only when the server dies.
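
The four steps above can be sketched as a toy feedback model.  Every number 
in it (per-process memory, the 10x swap penalty, the arrival rates) is a 
made-up assumption for illustration, not a measurement of any real server:

```python
def simulate_spiral(ram_mb, per_proc_mb, max_clients, arrivals,
                    swap_penalty=10):
    """Toy model of the spiral; returns the process count after each tick.

    All parameters are hypothetical. `arrivals` is the number of new
    requests per tick. Once the processes no longer fit in RAM, each
    request is assumed to take `swap_penalty` times longer to serve.
    """
    procs = 0
    history = []
    for new_requests in arrivals:
        # Step 2: are we past the swap threshold?
        swapping = procs * per_proc_mb > ram_mb
        # Step 3: when swapping, only a fraction of processes finish per tick.
        service_ticks = swap_penalty if swapping else 1
        finished = procs // service_ticks
        # Steps 1 and 4: new requests need new processes, capped by MaxClients.
        procs = min(max_clients, procs - finished + new_requests)
        history.append(procs)
    return history


calm = [40] * 20                      # steady load that fits in memory
burst = [40, 40, 60] + [40] * 17      # one short burst, then back to normal

print(simulate_spiral(1000, 20, 256, calm)[-1])
print(simulate_spiral(1000, 20, 256, burst)[-1])
print(simulate_spiral(1000, 20, 45, burst)[-1])
```

With these invented numbers, the calm run stays at 40 processes.  The single 
burst with a cap of 256 pushes the model past the swap threshold and it 
never recovers, ending pinned at the cap.  The same burst with a cap of 45 
(which still fits in memory) falls back to 40 on the next tick.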

If that's indeed what happens on your system (and it happens to almost 
everyone, sooner or later) - then it means your system has a value of 
MaxClients that's not backed by its CPU power and, more importantly, its 
available memory.  You need to either decrease that number or add more memory.
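
As a rough sketch of that sizing rule, the arithmetic looks like the 
following.  The function name and the example figures (3072 MB of RAM, 
512 MB reserved for the OS and database, 25 MB per Apache process) are 
assumptions for illustration only; measure the real per-process footprint 
on your own machine, and keep in mind that shared pages make a naive 
per-process figure pessimistic:

```python
def max_clients_for_memory(total_ram_mb, reserved_mb, per_process_mb):
    """Largest number of Apache processes that fits in RAM without swap.

    Hypothetical helper: `reserved_mb` is memory kept aside for the
    kernel, database, caches and so on; `per_process_mb` is the measured
    memory footprint of one httpd process running PHP.
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    # Never return a negative count if the reservation exceeds total RAM.
    return max(0, int(usable_mb // per_process_mb))


# Example figures are assumptions, not measurements.
print(max_clients_for_memory(3072, 512, 25))
```

If the result is lower than your current MaxClients, the configured limit 
is promising more processes than the memory can hold.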

Generally, your machine should have enough memory to run Apache when it 
reaches MaxClients without hitting swap.  You can test it by setting 
StartServers to the same number as MaxClients, and then hitting some of 
your PHP-based pages with a high-concurrency ab.
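
A minimal sketch of that test run, assuming ab is on the PATH.  The URL and 
the requests-per-client figure are placeholders; -n (total requests) and 
-c (concurrency) are standard ab options:

```python
def ab_command(url, max_clients, requests_per_client=20):
    """Build an ab command line that drives `max_clients` concurrent requests.

    Hypothetical helper: `requests_per_client` is an arbitrary choice to
    keep the load going long enough to watch memory and swap usage.
    """
    concurrency = max_clients
    total_requests = concurrency * requests_per_client
    return ["ab", "-n", str(total_requests), "-c", str(concurrency), url]


# Placeholder URL; point it at one of your own PHP pages.
cmd = ab_command("http://localhost/index.php", 150)
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

While it runs, watch free memory and swap activity (for example with 
vmstat); if the machine starts swapping, MaxClients is too high.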

The crash may still be related, especially if it corrupts the compiled code 
cache and makes Apache processes crash frequently.  Apache would then keep 
forking replacement processes, and that could be the initial slowdown 
trigger.  Even so, a properly configured server should not run out of 
memory and die because of that.

Zeev


At 22:35 05/09/2004, Russ Garrett wrote:
>OK, first of all I apologise for not posting this in the "right place", 
>but this is an unreproducible bug (the worst kind...), and I need some 
>educated guesses as to what is causing it. This thing has me at my wits' end...
>
>The situation is this: Apache on our main dynamic web server keeps on 
>suddenly eating all 3+GB of available virtual memory. When this happens, 
>the server stops responding to all requests, and basically freezes until 
>the kernel OOM killer gets around to killing enough httpd processes so we 
>can get in to kill the rest and restart it. This happens every couple of 
>minutes to couple of hours - there's no pattern to it. You can see an 
>attractive graph of the occurrence here:
>
>http://static.last.fm/phpbug/mem.gif
>
>As you can see, the memory usage shoots up suddenly - it doesn't appear to 
>be a conventional memory leak. This is accompanied by a similar spike in 
>the number of apache processes - right up to the MaxClients limit.
>
>We've been running PHP with debug support enabled for the last couple of 
>days, and we've noticed that a series of errors is always logged just 
>before the spike in memory usage. A log snippet is available here (note 
>that these errors carry on for several pages - although I suspect the 
>first one is the only relevant one - this is only the first page or so):
>
>http://static.last.fm/phpbug/log.txt
>
>The error doesn't just happen with that script, however the initial error 
>always occurs in the same place (zend_variables.c:44).
>
>This machine serves around 500,000 hits daily, and 99% of them are 
>PHP-parsed. It's running Debian 3.0 with backported kernel 2.6.7. The bug 
>manifests itself with both Apache 1.3 and 2, and both PHP 4.3.8 and 
>4.3.9RC2. Compile options are as follows:
>
>'./configure' '--with-apxs2=/web/apache2/bin/apxs' '--without-mysql' 
>'--with-zlib-dir=/usr' '--enable-gd-native-ttf' '--with-gettext' 
>'--enable-mbstring' '--with-pgsql=/usr/local/pgsql' '--enable-sysvmsg' 
>'--with-gd' '--with-jpeg-dir=/usr' '--enable-debug'
>
>The only third-party module we're using is Turck mmcache - removing it is 
>kind of difficult since running without any cache brings the machine to 
>its knees :).
>
>Sorry about the length of this message, I had to fit all the details in... 
>I'd appreciate it if you have any suggestions at all, this is really 
>annoying me now. It's problems like this with PHP which make me consider 
>moving to Java ;)... Anyhow.
>
>Thanks in advance,
>
>Russ Garrett
>russ@last.fm