Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:46430 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 92651 invoked from network); 15 Dec 2009 19:55:17 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 15 Dec 2009 19:55:17 -0000 Authentication-Results: pb1.pair.com header.from=jerome@loyet.net; sender-id=unknown Authentication-Results: pb1.pair.com smtp.mail=jerome@loyet.net; spf=permerror; sender-id=unknown Received-SPF: error (pb1.pair.com: domain loyet.net from 209.85.223.195 cause and error) X-PHP-List-Original-Sender: jerome@loyet.net X-Host-Fingerprint: 209.85.223.195 mail-iw0-f195.google.com Received: from [209.85.223.195] ([209.85.223.195:59362] helo=mail-iw0-f195.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 17/12-16228-5A9E72B4 for ; Tue, 15 Dec 2009 14:55:17 -0500 Received: by iwn33 with SMTP id 33so170567iwn.29 for ; Tue, 15 Dec 2009 11:55:15 -0800 (PST) MIME-Version: 1.0 Received: by 10.231.146.2 with SMTP id f2mr448348ibv.23.1260906914583; Tue, 15 Dec 2009 11:55:14 -0800 (PST) In-Reply-To: <3bea96c40912151125k7120524qc06810880a57369@mail.gmail.com> References: <3bea96c40912150808n18cd3333idc644823e5a4cae@mail.gmail.com> <3bea96c40912151125k7120524qc06810880a57369@mail.gmail.com> Date: Tue, 15 Dec 2009 20:55:14 +0100 Message-ID: <3bea96c40912151155p47000c13j8bd5162405ced9e8@mail.gmail.com> To: php-dev Cc: Antony Dovgal Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Subject: Re: [BUG] fpm : bugs with epoll From: jerome@loyet.net (=?ISO-8859-1?B?Suly9G1lIExveWV0?=) I found a solution: Just after the fork() call to create a child, the first thing to do is to call event_init() to create a fresh libevent environment for the new child. In fact epoll shares the same fd between process and having not libevent reinit make things not working as excepted. It's not the case for others event libraries. There is no need to make this for the main fork because the event_init is called after this fork. So here is the patch. As it's very simple I put it here and not as an attached file ++ Jerome Index: sapi/fpm/fpm/fpm_children.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- sapi/fpm/fpm/fpm_children.c (r=E9vision 292175) +++ sapi/fpm/fpm/fpm_children.c (copie de travail) @@ -373,6 +373,7 @@ switch (pid) { case 0 : + event_init(); fpm_child_resources_use(child); fpm_globals.is_child =3D 1; if (in_event_loop) { Le 15 d=E9cembre 2009 20:25, J=E9r=F4me Loyet a =E9crit = : > Le 15 d=E9cembre 2009 17:08, J=E9r=F4me Loyet a =E9cri= t : >> Hi >> >> I just figured out a bug with epoll on the last php-fpm version from svn= . >> >> It seems that with epoll only (no problem with poll or select -- don't >> have a freebsd ready to check kqueue), the catch_worker_output is >> buggy. > > I just tested on freebsd with libevent 1.4.12. When using kqueue the > problems does not occur. It seems to be a an epoll problem.I'll check > on another linux like fedora (I tested only on ubuntu) > >> >> With catch_workers_output set to "no", all output to stdout or stderr >> from children are sent to /dev/null. >> >> With catch_workers_output set to "yes", output to stdout or stderr >> from children are sent through 2 pipes to the parent process. The >> parent process has an event on reading on those pipes. When a child >> write to one of the pipes, the parent write to the log_file something >> like: >> >> Dec 15 16:48:10.670668 [WARNING] child 24035 (pool default) said into >> stderr: "Connection from disallowed IP address '127.0.0.1' is >> dropped." >> >> To check it in action, simply change the allowed_clients to something >> else than 127.0.0.1 (127.0.0.2). >> Restart php-fpm and try connect to php-fpm (with "telnet 127.0.0.1 >> 9000" or through a webserver in front). You should see the previous >> log in your error_log each time you try to connect to fpm, it should >> be written in (pseudo) realtime. >> >> Here is a normal error_log in this situation with 2 children: >> >> # EVENT_NOEPOLL=3D ./sapi/fpm/php-fpm >> Dec 15 16:58:45.931952 [NOTICE] getrlimit(nofile): max:1024, cur:1024 >> Dec 15 16:58:45.932489 [NOTICE] libevent: using poll >> Dec 15 16:58:45.932797 [NOTICE] fpm is running, pid 24200 >> Dec 15 16:58:45.934172 [NOTICE] child 24201 (pool default) started >> Dec 15 16:58:45.935521 [NOTICE] child 24202 (pool default) started >> Dec 15 16:58:45.935782 [NOTICE] libevent: entering main loop >> Dec 15 16:58:50.667584 [WARNING] child 24201 (pool default) said into >> stderr: "Connection from disallowed IP address '127.0.0.1' is >> dropped." >> Dec 15 16:58:51.610799 [WARNING] child 24202 (pool default) said into >> stderr: "Connection from disallowed IP address '127.0.0.1' is >> dropped." >> Dec 15 16:58:54.231871 [WARNING] child 24201 (pool default) said into >> stderr: "Connection from disallowed IP address '127.0.0.1' is >> dropped." >> Dec 15 16:58:54.432654 [WARNING] child 24202 (pool default) said into >> stderr: "Connection from disallowed IP address '127.0.0.1' is >> dropped." >> ^CDec 15 16:58:56.836278 [NOTICE] received SIGINT >> Dec 15 16:58:56.836324 [NOTICE] switching to 'terminating' state >> Dec 15 16:58:56.836342 [NOTICE] sending signal 15 SIGTERM to child >> 24202 (pool default) >> Dec 15 16:58:56.836355 [NOTICE] sending signal 15 SIGTERM to child >> 24201 (pool default) >> Dec 15 16:58:56.836363 [NOTICE] 2 children are still alive >> Dec 15 16:58:56.836822 [NOTICE] received SIGCHLD >> Dec 15 16:58:56.836901 [WARNING] child 24202 (pool default) exited on >> signal 2 SIGINT after 10.901391 seconds from start >> Dec 15 16:58:56.837290 [NOTICE] received SIGCHLD >> Dec 15 16:58:56.837342 [WARNING] child 24201 (pool default) exited on >> signal 2 SIGINT after 10.903209 seconds from start >> Dec 15 16:58:56.837357 [NOTICE] exiting, bye-bye! >> >> I made 4 connections and in real time I have in the error_log the >> corresponding error messages. Note that I'm using POLL here. We can >> see that child 24201 handled the 1st and 3rd connection and the child >> 24202 handled the 2nd and 4th connection. >> >> Now with EPOLL: >> >> # ./sapi/fpm/php-fpm >> Dec 15 17:01:34.721411 [NOTICE] getrlimit(nofile): max:1024, cur:1024 >> Dec 15 17:01:34.721973 [NOTICE] libevent: using epoll >> Dec 15 17:01:34.722286 [NOTICE] fpm is running, pid 24375 >> Dec 15 17:01:34.723695 [NOTICE] child 24376 (pool default) started >> Dec 15 17:01:34.725099 [NOTICE] child 24377 (pool default) started >> Dec 15 17:01:34.725383 [NOTICE] libevent: entering main loop >> Dec 15 17:01:37.044002 [WARNING] child 24377 (pool default) said into >> stderr: "Connection from disallowed IP address '127.0.0.1' is >> dropped." >> Dec 15 17:01:38.215927 [WARNING] child 24377 (pool default) said into >> stderr: "Connection from disallowed IP address '127.0.0.1' is >> dropped." >> ^CDec 15 17:01:41.237395 [NOTICE] received SIGINT >> Dec 15 17:01:41.237443 [NOTICE] switching to 'terminating' state >> Dec 15 17:01:41.237462 [NOTICE] sending signal 15 SIGTERM to child >> 24377 (pool default) >> Dec 15 17:01:41.237473 [NOTICE] sending signal 15 SIGTERM to child >> 24376 (pool default) >> Dec 15 17:01:41.237481 [NOTICE] 2 children are still alive >> Dec 15 17:01:41.237938 [NOTICE] received SIGCHLD >> Dec 15 17:01:41.238015 [WARNING] child 24377 (pool default) exited on >> signal 2 SIGINT after 6.512935 seconds from start >> Dec 15 17:01:41.238389 [NOTICE] received SIGCHLD >> Dec 15 17:01:41.238441 [WARNING] child 24376 (pool default) exited on >> signal 2 SIGINT after 6.514792 seconds from start >> Dec 15 17:01:41.238473 [WARNING] child 24376 (pool default) said into >> stderr: "Connection from disallowed IP address '127.0.0.1' is >> dropped." >> Dec 15 17:01:41.238490 [WARNING] child 24376 (pool default) said into >> stderr: "Connection from disallowed IP address '127.0.0.1' is >> dropped.", pipe is closed >> Dec 15 17:01:41.238502 [NOTICE] exiting, bye-bye! >> >> I made the same number of connection and only the ones handled by the >> last created child has been written on real time. Messages from the >> other child have been written when the program terminated (so much >> much later in real production env). >> >> To change the polling system used by libevent you can play with the >> env variables EVENT_NOEPOLL, EVENT_NOKQUEUE, EVENT_NODEVPOLL, >> EVENT_NOPOLL or EVENT_NOSELECT. >> >> Can someone reproduce this bug ? I want to be sure it's a bug on FPM >> and not a mis-configured system. >> I'll try to find out an explanation and a solution >> >> ++ jerome >> >