Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:12635
Mailing-List: contact internals-help@lists.php.net; run by ezmlm
Date: Tue, 7 Sep 2004 11:24:24 -0700 (PDT)
To: Xuefer <Xuefer@hotmail.com>
cc: internals@lists.php.net
In-Reply-To: <BAY24-DAV12lGOkljKs0004ff44@hotmail.com>
Message-ID: <Pine.LNX.4.58.0409071119460.1483@t42p>
References: <20040905193319.91641.qmail@pb1.pair.com> <BAY24-DAV18JqlUm7iC0004cbb2@hotmail.com>
 <Pine.LNX.4.58.0409062057490.1473@t42p> <BAY24-DAV12lGOkljKs0004ff44@hotmail.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: Re: [PHP-DEV] Really odd PHP problem
From: rasmus@php.net (Rasmus Lerdorf)

On Wed, 8 Sep 2004, Xuefer wrote:
> >> both mmcache and apc do not have "crash recover"
> >
> > The concept of a crash recover is somewhat flawed in my opinion.  The only
> > way to really do this is to catch SIGSEGV, SIGBUS and other such fatal
> > signals and twiddle a knob somewhere in shared memory that tells other
> > processes to flush the cache.  The problem with doing this is that once
> > you get a SEGV, it really isn't safe to do anything like that.  You run a
> > very serious risk of ending up in an infinite crash loop where you catch
> > the crash, try to set the crash-recover flag, crash trying to do that,
> > catch the crash, etc.
> >
> > -Rasmus
> >
>
> Without crash recover, corrupted shared memory will trigger the crash in another process too.
>
> Sorry for my limited experience with C and shared memory,
> but IMHO it is not that hard.
> It is easy to put a reset_flag at the top or bottom of the shared memory.
> The flag is just an int, not a pointer.
> It won't crash unless the shared memory is unavailable or the pointer to the shared memory is corrupted (maybe that is possible?).
> After all, we can reset the signal handler when we are about to operate on the flag.
> Remember to log something when the shared memory is going to be reset (whether or not a write lock can be obtained for the reset).

There are different ways of doing it, but using a signal handler to catch
a SEGV is not a good idea.  You can't count on any code of any sort
working after a SEGV.  You could turn it around and have processes check
in and out as they handle requests and if a process doesn't check out
within some allotted time, assume a crash and reset.  Or you could have an
external mechanism monitor for crashes and do the reset externally.  But
having the crashed process itself do anything is just asking for
trouble.  It doesn't matter whether the flag is an int or something else.  Any code
at all executed after a SEGV is unsafe.
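
As a rough sketch of the check-in/check-out idea, assuming a fixed table of
per-process slots at the top of the shared segment.  All of the names here
(cache_header, check_in, check_out, reset_if_crashed, MAX_SLOTS,
REQUEST_TIMEOUT) are made up for illustration and are not taken from APC or
mmcache, and the locking a real shared segment would need is left out:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define MAX_SLOTS 64        /* hypothetical: one slot per worker process */
#define REQUEST_TIMEOUT 30  /* hypothetical: seconds allowed per request */

/* Assumed to sit at the start of the shared memory segment. */
typedef struct {
    time_t checked_in_at[MAX_SLOTS]; /* 0 = idle, else time of check-in */
    unsigned long generation;        /* bumped on every cache reset */
} cache_header;

/* A worker calls this when it starts handling a request. */
static void check_in(cache_header *h, size_t slot, time_t now)
{
    assert(slot < MAX_SLOTS);
    h->checked_in_at[slot] = now;
}

/* A worker calls this when the request finished normally. */
static void check_out(cache_header *h, size_t slot)
{
    assert(slot < MAX_SLOTS);
    h->checked_in_at[slot] = 0;
}

/* Returns 1 if some worker checked in and is overdue to check out. */
static int has_stale_worker(const cache_header *h, time_t now)
{
    for (size_t i = 0; i < MAX_SLOTS; i++) {
        time_t t = h->checked_in_at[i];
        if (t != 0 && difftime(now, t) > REQUEST_TIMEOUT)
            return 1;
    }
    return 0;
}

/*
 * Run by a healthy process or an external monitor, never by the process
 * that crashed.  Returns 1 if a reset was performed.
 */
static int reset_if_crashed(cache_header *h, time_t now)
{
    if (!has_stale_worker(h, now))
        return 0;
    memset(h->checked_in_at, 0, sizeof h->checked_in_at);
    h->generation++; /* real code would also throw away the cached entries */
    return 1;
}
```

The point of the layout is that the crashed process never runs any recovery
code.  It simply fails to check out, and somebody else notices.  A slow but
healthy request would also trip the timeout, so the limit has to be chosen
with that in mind.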

-Rasmus