Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:46425 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 51284 invoked from network); 15 Dec 2009 16:08:46 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 15 Dec 2009 16:08:46 -0000 Authentication-Results: pb1.pair.com smtp.mail=jerome@loyet.net; spf=permerror; sender-id=unknown Authentication-Results: pb1.pair.com header.from=jerome@loyet.net; sender-id=unknown Received-SPF: error (pb1.pair.com: domain loyet.net from 209.85.223.195 cause and error) X-PHP-List-Original-Sender: jerome@loyet.net X-Host-Fingerprint: 209.85.223.195 mail-iw0-f195.google.com Received: from [209.85.223.195] ([209.85.223.195:33370] helo=mail-iw0-f195.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 6E/03-36405-C84B72B4 for ; Tue, 15 Dec 2009 11:08:46 -0500 Received: by iwn33 with SMTP id 33so4581iwn.29 for ; Tue, 15 Dec 2009 08:08:40 -0800 (PST) MIME-Version: 1.0 Received: by 10.231.156.85 with SMTP id v21mr2081052ibw.38.1260893320790; Tue, 15 Dec 2009 08:08:40 -0800 (PST) Date: Tue, 15 Dec 2009 17:08:40 +0100 Message-ID: <3bea96c40912150808n18cd3333idc644823e5a4cae@mail.gmail.com> To: php-dev Cc: Antony Dovgal Content-Type: text/plain; charset=ISO-8859-1 Subject: [BUG] fpm : bugs with epoll From: jerome@loyet.net (=?ISO-8859-1?B?Suly9G1lIExveWV0?=) Hi I just figured out a bug with epoll on the last php-fpm version from svn. It seems that with epoll only (no problem with poll or select -- don't have a freebsd ready to check kqueue), the catch_worker_output is buggy. With catch_workers_output set to "no", all output to stdout or stderr from children are sent to /dev/null. With catch_workers_output set to "yes", output to stdout or stderr from children are sent through 2 pipes to the parent process. The parent process has an event on reading on those pipes. When a child write to one of the pipes, the parent write to the log_file something like: Dec 15 16:48:10.670668 [WARNING] child 24035 (pool default) said into stderr: "Connection from disallowed IP address '127.0.0.1' is dropped." To check it in action, simply change the allowed_clients to something else than 127.0.0.1 (127.0.0.2). Restart php-fpm and try connect to php-fpm (with "telnet 127.0.0.1 9000" or through a webserver in front). You should see the previous log in your error_log each time you try to connect to fpm, it should be written in (pseudo) realtime. Here is a normal error_log in this situation with 2 children: # EVENT_NOEPOLL= ./sapi/fpm/php-fpm Dec 15 16:58:45.931952 [NOTICE] getrlimit(nofile): max:1024, cur:1024 Dec 15 16:58:45.932489 [NOTICE] libevent: using poll Dec 15 16:58:45.932797 [NOTICE] fpm is running, pid 24200 Dec 15 16:58:45.934172 [NOTICE] child 24201 (pool default) started Dec 15 16:58:45.935521 [NOTICE] child 24202 (pool default) started Dec 15 16:58:45.935782 [NOTICE] libevent: entering main loop Dec 15 16:58:50.667584 [WARNING] child 24201 (pool default) said into stderr: "Connection from disallowed IP address '127.0.0.1' is dropped." Dec 15 16:58:51.610799 [WARNING] child 24202 (pool default) said into stderr: "Connection from disallowed IP address '127.0.0.1' is dropped." Dec 15 16:58:54.231871 [WARNING] child 24201 (pool default) said into stderr: "Connection from disallowed IP address '127.0.0.1' is dropped." Dec 15 16:58:54.432654 [WARNING] child 24202 (pool default) said into stderr: "Connection from disallowed IP address '127.0.0.1' is dropped." ^CDec 15 16:58:56.836278 [NOTICE] received SIGINT Dec 15 16:58:56.836324 [NOTICE] switching to 'terminating' state Dec 15 16:58:56.836342 [NOTICE] sending signal 15 SIGTERM to child 24202 (pool default) Dec 15 16:58:56.836355 [NOTICE] sending signal 15 SIGTERM to child 24201 (pool default) Dec 15 16:58:56.836363 [NOTICE] 2 children are still alive Dec 15 16:58:56.836822 [NOTICE] received SIGCHLD Dec 15 16:58:56.836901 [WARNING] child 24202 (pool default) exited on signal 2 SIGINT after 10.901391 seconds from start Dec 15 16:58:56.837290 [NOTICE] received SIGCHLD Dec 15 16:58:56.837342 [WARNING] child 24201 (pool default) exited on signal 2 SIGINT after 10.903209 seconds from start Dec 15 16:58:56.837357 [NOTICE] exiting, bye-bye! I made 4 connections and in real time I have in the error_log the corresponding error messages. Note that I'm using POLL here. We can see that child 24201 handled the 1st and 3rd connection and the child 24202 handled the 2nd and 4th connection. Now with EPOLL: # ./sapi/fpm/php-fpm Dec 15 17:01:34.721411 [NOTICE] getrlimit(nofile): max:1024, cur:1024 Dec 15 17:01:34.721973 [NOTICE] libevent: using epoll Dec 15 17:01:34.722286 [NOTICE] fpm is running, pid 24375 Dec 15 17:01:34.723695 [NOTICE] child 24376 (pool default) started Dec 15 17:01:34.725099 [NOTICE] child 24377 (pool default) started Dec 15 17:01:34.725383 [NOTICE] libevent: entering main loop Dec 15 17:01:37.044002 [WARNING] child 24377 (pool default) said into stderr: "Connection from disallowed IP address '127.0.0.1' is dropped." Dec 15 17:01:38.215927 [WARNING] child 24377 (pool default) said into stderr: "Connection from disallowed IP address '127.0.0.1' is dropped." ^CDec 15 17:01:41.237395 [NOTICE] received SIGINT Dec 15 17:01:41.237443 [NOTICE] switching to 'terminating' state Dec 15 17:01:41.237462 [NOTICE] sending signal 15 SIGTERM to child 24377 (pool default) Dec 15 17:01:41.237473 [NOTICE] sending signal 15 SIGTERM to child 24376 (pool default) Dec 15 17:01:41.237481 [NOTICE] 2 children are still alive Dec 15 17:01:41.237938 [NOTICE] received SIGCHLD Dec 15 17:01:41.238015 [WARNING] child 24377 (pool default) exited on signal 2 SIGINT after 6.512935 seconds from start Dec 15 17:01:41.238389 [NOTICE] received SIGCHLD Dec 15 17:01:41.238441 [WARNING] child 24376 (pool default) exited on signal 2 SIGINT after 6.514792 seconds from start Dec 15 17:01:41.238473 [WARNING] child 24376 (pool default) said into stderr: "Connection from disallowed IP address '127.0.0.1' is dropped." Dec 15 17:01:41.238490 [WARNING] child 24376 (pool default) said into stderr: "Connection from disallowed IP address '127.0.0.1' is dropped.", pipe is closed Dec 15 17:01:41.238502 [NOTICE] exiting, bye-bye! I made the same number of connection and only the ones handled by the last created child has been written on real time. Messages from the other child have been written when the program terminated (so much much later in real production env). To change the polling system used by libevent you can play with the env variables EVENT_NOEPOLL, EVENT_NOKQUEUE, EVENT_NODEVPOLL, EVENT_NOPOLL or EVENT_NOSELECT. Can someone reproduce this bug ? I want to be sure it's a bug on FPM and not a mis-configured system. I'll try to find out an explanation and a solution ++ jerome