Newsgroups: php.internals
Xref: news.php.net php.internals:12635
Date: Tue, 7 Sep 2004 11:24:24 -0700 (PDT)
From: rasmus@php.net (Rasmus Lerdorf)
To: Xuefer
Cc: internals@lists.php.net
References: <20040905193319.91641.qmail@pb1.pair.com>
Subject: Re: [PHP-DEV] Really odd PHP problem

On Wed, 8 Sep 2004, Xuefer wrote:

> >> both mmcache and apc does not have "crash recover"
> >
> > The concept of a crash recover is somewhat flawed in my opinion. The only
> > way to really do this is to catch SIGSEGV, SIGBUS and other such fatal
> > signals and twiddle a knob somewhere in shared memory that tells other
> > processes to flush the cache. The problem with doing this is that once
> > you get a SEGV, it really isn't safe to do anything like that. You run a
> > very serious risk of ending up in an infinite crash loop where you catch
> > the crash, try to set the crash-recover flag, crash trying to do that,
> > catch the crash, etc.
> >
> > -Rasmus
> >
>
> without crash recover, corrupted share mem will trigger the crash too in another process.
>
> sorry for my low experience on C and sharemem
> but IMHO, it not that hard
> it easy to make a reset_flag at top or bottom of sharemem
> flag is just an int, not pointer
> it won't crash unless the sharemem is unavailable, or the pointer to the share mem is corrupted, maybe possible?
> after all we can reset the signalhandler when we're going to operate on the flag
> remember to log something when share mem is going to reset(no matter can or cannot obtain write lock to reset)

There are different ways of doing it, but using a signal handler to catch
a SEGV is not a good idea. You can't count on any code of any sort
working after a SEGV.

You could turn it around and have processes check in and out as they
handle requests and if a process doesn't check out within some allotted
time, assume a crash and reset. Or you could have an external mechanism
monitor for crashes and do the reset externally. But having the process
itself that crashed do anything is just asking for trouble.

It doesn't matter if the flag is an int or what it is. Any code at all
executed after a SEGV is unsafe.

-Rasmus
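
To make the check-in/check-out idea above a bit more concrete, here is a rough sketch in Python. It is only an illustration of the logic, not how mmcache or APC work: the names CheckInTable, check_in, check_out, sweep and on_reset are all made up for this example, and an ordinary dict stands in for what would really be a slot table in shared memory. The point it tries to show is that the crashed process never runs any recovery code; some other, healthy process notices the missing check-out and triggers the reset.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CheckInTable:
    """Hypothetical stand-in for a shared-memory table of in-flight requests."""

    timeout: float                        # seconds a request may run before a crash is assumed
    on_reset: Callable[[], None]          # hypothetical hook that flushes/resets the cache
    clock: Callable[[], float] = time.monotonic
    _started: Dict[int, float] = field(default_factory=dict)

    def check_in(self, worker_id: int) -> None:
        # Called by a worker just before it starts handling a request.
        self._started[worker_id] = self.clock()

    def check_out(self, worker_id: int) -> None:
        # Called by a worker after the request finished normally.
        self._started.pop(worker_id, None)

    def sweep(self) -> List[int]:
        # Called by a healthy process (or an external monitor), never by
        # the worker that may have crashed. Returns the overdue worker ids.
        now = self.clock()
        overdue = [w for w, t in self._started.items() if now - t > self.timeout]
        if overdue:
            for w in overdue:
                del self._started[w]
            self.on_reset()               # assume a crash and reset once
        return overdue


if __name__ == "__main__":
    fake_now = [0.0]                      # controllable clock for the demo
    table = CheckInTable(
        timeout=30.0,
        on_reset=lambda: print("cache reset"),
        clock=lambda: fake_now[0],
    )
    table.check_in(1)
    table.check_in(2)
    table.check_out(1)                    # worker 1 finishes normally
    fake_now[0] = 31.0                    # worker 2 never checks out
    print("overdue workers:", table.sweep())
```

The timeout is a guess by design: a slow but healthy request that runs past it would also cause a reset, so the cost of an unnecessary flush has to be weighed against how long a corrupted cache may be left in place.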