Hi
I just figured out a bug with epoll on the last php-fpm version from svn.
It seems that with epoll only (no problem with poll or select -- don't
have a freebsd ready to check kqueue), the catch_worker_output is
buggy.
With catch_workers_output set to "no", all output to stdout or stderr
from children are sent to /dev/null.
With catch_workers_output set to "yes", output to stdout or stderr
from children are sent through 2 pipes to the parent process. The
parent process has an event on reading on those pipes. When a child
write to one of the pipes, the parent write to the log_file something
like:
Dec 15 16:48:10.670668 [WARNING] child 24035 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
To check it in action, simply change the allowed_clients to something
else than 127.0.0.1 (<value name="allowed_clients">127.0.0.2</value>).
Restart php-fpm and try connect to php-fpm (with "telnet 127.0.0.1
9000" or through a webserver in front). You should see the previous
log in your error_log each time you try to connect to fpm, it should
be written in (pseudo) realtime.
Here is a normal error_log in this situation with 2 children:
EVENT_NOEPOLL= ./sapi/fpm/php-fpm
Dec 15 16:58:45.931952 [NOTICE] getrlimit(nofile): max:1024, cur:1024
Dec 15 16:58:45.932489 [NOTICE] libevent: using poll
Dec 15 16:58:45.932797 [NOTICE] fpm is running, pid 24200
Dec 15 16:58:45.934172 [NOTICE] child 24201 (pool default) started
Dec 15 16:58:45.935521 [NOTICE] child 24202 (pool default) started
Dec 15 16:58:45.935782 [NOTICE] libevent: entering main loop
Dec 15 16:58:50.667584 [WARNING] child 24201 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 16:58:51.610799 [WARNING] child 24202 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 16:58:54.231871 [WARNING] child 24201 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 16:58:54.432654 [WARNING] child 24202 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
^CDec 15 16:58:56.836278 [NOTICE] received SIGINT
Dec 15 16:58:56.836324 [NOTICE] switching to 'terminating' state
Dec 15 16:58:56.836342 [NOTICE] sending signal 15 SIGTERM
to child
24202 (pool default)
Dec 15 16:58:56.836355 [NOTICE] sending signal 15 SIGTERM
to child
24201 (pool default)
Dec 15 16:58:56.836363 [NOTICE] 2 children are still alive
Dec 15 16:58:56.836822 [NOTICE] received SIGCHLD
Dec 15 16:58:56.836901 [WARNING] child 24202 (pool default) exited on
signal 2 SIGINT
after 10.901391 seconds from start
Dec 15 16:58:56.837290 [NOTICE] received SIGCHLD
Dec 15 16:58:56.837342 [WARNING] child 24201 (pool default) exited on
signal 2 SIGINT
after 10.903209 seconds from start
Dec 15 16:58:56.837357 [NOTICE] exiting, bye-bye!
I made 4 connections and in real time I have in the error_log the
corresponding error messages. Note that I'm using POLL here. We can
see that child 24201 handled the 1st and 3rd connection and the child
24202 handled the 2nd and 4th connection.
Now with EPOLL:
./sapi/fpm/php-fpm
Dec 15 17:01:34.721411 [NOTICE] getrlimit(nofile): max:1024, cur:1024
Dec 15 17:01:34.721973 [NOTICE] libevent: using epoll
Dec 15 17:01:34.722286 [NOTICE] fpm is running, pid 24375
Dec 15 17:01:34.723695 [NOTICE] child 24376 (pool default) started
Dec 15 17:01:34.725099 [NOTICE] child 24377 (pool default) started
Dec 15 17:01:34.725383 [NOTICE] libevent: entering main loop
Dec 15 17:01:37.044002 [WARNING] child 24377 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 17:01:38.215927 [WARNING] child 24377 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
^CDec 15 17:01:41.237395 [NOTICE] received SIGINT
Dec 15 17:01:41.237443 [NOTICE] switching to 'terminating' state
Dec 15 17:01:41.237462 [NOTICE] sending signal 15 SIGTERM
to child
24377 (pool default)
Dec 15 17:01:41.237473 [NOTICE] sending signal 15 SIGTERM
to child
24376 (pool default)
Dec 15 17:01:41.237481 [NOTICE] 2 children are still alive
Dec 15 17:01:41.237938 [NOTICE] received SIGCHLD
Dec 15 17:01:41.238015 [WARNING] child 24377 (pool default) exited on
signal 2 SIGINT
after 6.512935 seconds from start
Dec 15 17:01:41.238389 [NOTICE] received SIGCHLD
Dec 15 17:01:41.238441 [WARNING] child 24376 (pool default) exited on
signal 2 SIGINT
after 6.514792 seconds from start
Dec 15 17:01:41.238473 [WARNING] child 24376 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 17:01:41.238490 [WARNING] child 24376 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped.", pipe is closed
Dec 15 17:01:41.238502 [NOTICE] exiting, bye-bye!
I made the same number of connection and only the ones handled by the
last created child has been written on real time. Messages from the
other child have been written when the program terminated (so much
much later in real production env).
To change the polling system used by libevent you can play with the
env variables EVENT_NOEPOLL, EVENT_NOKQUEUE, EVENT_NODEVPOLL,
EVENT_NOPOLL or EVENT_NOSELECT.
Can someone reproduce this bug ? I want to be sure it's a bug on FPM
and not a mis-configured system.
I'll try to find out an explanation and a solution
++ jerome
Le 15 décembre 2009 17:08, Jérôme Loyet jerome@loyet.net a écrit :
Hi
I just figured out a bug with epoll on the last php-fpm version from svn.
It seems that with epoll only (no problem with poll or select -- don't
have a freebsd ready to check kqueue), the catch_worker_output is
buggy.
I just tested on freebsd with libevent 1.4.12. When using kqueue the
problems does not occur. It seems to be a an epoll problem.I'll check
on another linux like fedora (I tested only on ubuntu)
With catch_workers_output set to "no", all output to stdout or stderr
from children are sent to /dev/null.With catch_workers_output set to "yes", output to stdout or stderr
from children are sent through 2 pipes to the parent process. The
parent process has an event on reading on those pipes. When a child
write to one of the pipes, the parent write to the log_file something
like:Dec 15 16:48:10.670668 [WARNING] child 24035 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."To check it in action, simply change the allowed_clients to something
else than 127.0.0.1 (<value name="allowed_clients">127.0.0.2</value>).
Restart php-fpm and try connect to php-fpm (with "telnet 127.0.0.1
9000" or through a webserver in front). You should see the previous
log in your error_log each time you try to connect to fpm, it should
be written in (pseudo) realtime.Here is a normal error_log in this situation with 2 children:
EVENT_NOEPOLL= ./sapi/fpm/php-fpm
Dec 15 16:58:45.931952 [NOTICE] getrlimit(nofile): max:1024, cur:1024
Dec 15 16:58:45.932489 [NOTICE] libevent: using poll
Dec 15 16:58:45.932797 [NOTICE] fpm is running, pid 24200
Dec 15 16:58:45.934172 [NOTICE] child 24201 (pool default) started
Dec 15 16:58:45.935521 [NOTICE] child 24202 (pool default) started
Dec 15 16:58:45.935782 [NOTICE] libevent: entering main loop
Dec 15 16:58:50.667584 [WARNING] child 24201 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 16:58:51.610799 [WARNING] child 24202 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 16:58:54.231871 [WARNING] child 24201 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 16:58:54.432654 [WARNING] child 24202 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
^CDec 15 16:58:56.836278 [NOTICE] receivedSIGINT
Dec 15 16:58:56.836324 [NOTICE] switching to 'terminating' state
Dec 15 16:58:56.836342 [NOTICE] sending signal 15SIGTERM
to child
24202 (pool default)
Dec 15 16:58:56.836355 [NOTICE] sending signal 15SIGTERM
to child
24201 (pool default)
Dec 15 16:58:56.836363 [NOTICE] 2 children are still alive
Dec 15 16:58:56.836822 [NOTICE] receivedSIGCHLD
Dec 15 16:58:56.836901 [WARNING] child 24202 (pool default) exited on
signal 2SIGINT
after 10.901391 seconds from start
Dec 15 16:58:56.837290 [NOTICE] receivedSIGCHLD
Dec 15 16:58:56.837342 [WARNING] child 24201 (pool default) exited on
signal 2SIGINT
after 10.903209 seconds from start
Dec 15 16:58:56.837357 [NOTICE] exiting, bye-bye!I made 4 connections and in real time I have in the error_log the
corresponding error messages. Note that I'm using POLL here. We can
see that child 24201 handled the 1st and 3rd connection and the child
24202 handled the 2nd and 4th connection.Now with EPOLL:
./sapi/fpm/php-fpm
Dec 15 17:01:34.721411 [NOTICE] getrlimit(nofile): max:1024, cur:1024
Dec 15 17:01:34.721973 [NOTICE] libevent: using epoll
Dec 15 17:01:34.722286 [NOTICE] fpm is running, pid 24375
Dec 15 17:01:34.723695 [NOTICE] child 24376 (pool default) started
Dec 15 17:01:34.725099 [NOTICE] child 24377 (pool default) started
Dec 15 17:01:34.725383 [NOTICE] libevent: entering main loop
Dec 15 17:01:37.044002 [WARNING] child 24377 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 17:01:38.215927 [WARNING] child 24377 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
^CDec 15 17:01:41.237395 [NOTICE] receivedSIGINT
Dec 15 17:01:41.237443 [NOTICE] switching to 'terminating' state
Dec 15 17:01:41.237462 [NOTICE] sending signal 15SIGTERM
to child
24377 (pool default)
Dec 15 17:01:41.237473 [NOTICE] sending signal 15SIGTERM
to child
24376 (pool default)
Dec 15 17:01:41.237481 [NOTICE] 2 children are still alive
Dec 15 17:01:41.237938 [NOTICE] receivedSIGCHLD
Dec 15 17:01:41.238015 [WARNING] child 24377 (pool default) exited on
signal 2SIGINT
after 6.512935 seconds from start
Dec 15 17:01:41.238389 [NOTICE] receivedSIGCHLD
Dec 15 17:01:41.238441 [WARNING] child 24376 (pool default) exited on
signal 2SIGINT
after 6.514792 seconds from start
Dec 15 17:01:41.238473 [WARNING] child 24376 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 17:01:41.238490 [WARNING] child 24376 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped.", pipe is closed
Dec 15 17:01:41.238502 [NOTICE] exiting, bye-bye!I made the same number of connection and only the ones handled by the
last created child has been written on real time. Messages from the
other child have been written when the program terminated (so much
much later in real production env).To change the polling system used by libevent you can play with the
env variables EVENT_NOEPOLL, EVENT_NOKQUEUE, EVENT_NODEVPOLL,
EVENT_NOPOLL or EVENT_NOSELECT.Can someone reproduce this bug ? I want to be sure it's a bug on FPM
and not a mis-configured system.
I'll try to find out an explanation and a solution++ jerome
I found a solution:
Just after the fork() call to create a child, the first thing to do is
to call event_init() to create a fresh libevent environment for the
new child. In fact epoll shares the same fd between process and having
not libevent reinit make things not working as excepted. It's not the
case for others event libraries.
There is no need to make this for the main fork because the event_init
is called after this fork.
So here is the patch. As it's very simple I put it here and not as an
attached file
++ Jerome
Index: sapi/fpm/fpm/fpm_children.c
--- sapi/fpm/fpm/fpm_children.c (révision 292175)
+++ sapi/fpm/fpm/fpm_children.c (copie de travail)
@@ -373,6 +373,7 @@
switch (pid) {
case 0 :
-
event_init(); fpm_child_resources_use(child); fpm_globals.is_child = 1; if (in_event_loop) {
Le 15 décembre 2009 20:25, Jérôme Loyet jerome@loyet.net a écrit :
Le 15 décembre 2009 17:08, Jérôme Loyet jerome@loyet.net a écrit :
Hi
I just figured out a bug with epoll on the last php-fpm version from svn.
It seems that with epoll only (no problem with poll or select -- don't
have a freebsd ready to check kqueue), the catch_worker_output is
buggy.I just tested on freebsd with libevent 1.4.12. When using kqueue the
problems does not occur. It seems to be a an epoll problem.I'll check
on another linux like fedora (I tested only on ubuntu)With catch_workers_output set to "no", all output to stdout or stderr
from children are sent to /dev/null.With catch_workers_output set to "yes", output to stdout or stderr
from children are sent through 2 pipes to the parent process. The
parent process has an event on reading on those pipes. When a child
write to one of the pipes, the parent write to the log_file something
like:Dec 15 16:48:10.670668 [WARNING] child 24035 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."To check it in action, simply change the allowed_clients to something
else than 127.0.0.1 (<value name="allowed_clients">127.0.0.2</value>).
Restart php-fpm and try connect to php-fpm (with "telnet 127.0.0.1
9000" or through a webserver in front). You should see the previous
log in your error_log each time you try to connect to fpm, it should
be written in (pseudo) realtime.Here is a normal error_log in this situation with 2 children:
EVENT_NOEPOLL= ./sapi/fpm/php-fpm
Dec 15 16:58:45.931952 [NOTICE] getrlimit(nofile): max:1024, cur:1024
Dec 15 16:58:45.932489 [NOTICE] libevent: using poll
Dec 15 16:58:45.932797 [NOTICE] fpm is running, pid 24200
Dec 15 16:58:45.934172 [NOTICE] child 24201 (pool default) started
Dec 15 16:58:45.935521 [NOTICE] child 24202 (pool default) started
Dec 15 16:58:45.935782 [NOTICE] libevent: entering main loop
Dec 15 16:58:50.667584 [WARNING] child 24201 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 16:58:51.610799 [WARNING] child 24202 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 16:58:54.231871 [WARNING] child 24201 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 16:58:54.432654 [WARNING] child 24202 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
^CDec 15 16:58:56.836278 [NOTICE] receivedSIGINT
Dec 15 16:58:56.836324 [NOTICE] switching to 'terminating' state
Dec 15 16:58:56.836342 [NOTICE] sending signal 15SIGTERM
to child
24202 (pool default)
Dec 15 16:58:56.836355 [NOTICE] sending signal 15SIGTERM
to child
24201 (pool default)
Dec 15 16:58:56.836363 [NOTICE] 2 children are still alive
Dec 15 16:58:56.836822 [NOTICE] receivedSIGCHLD
Dec 15 16:58:56.836901 [WARNING] child 24202 (pool default) exited on
signal 2SIGINT
after 10.901391 seconds from start
Dec 15 16:58:56.837290 [NOTICE] receivedSIGCHLD
Dec 15 16:58:56.837342 [WARNING] child 24201 (pool default) exited on
signal 2SIGINT
after 10.903209 seconds from start
Dec 15 16:58:56.837357 [NOTICE] exiting, bye-bye!I made 4 connections and in real time I have in the error_log the
corresponding error messages. Note that I'm using POLL here. We can
see that child 24201 handled the 1st and 3rd connection and the child
24202 handled the 2nd and 4th connection.Now with EPOLL:
./sapi/fpm/php-fpm
Dec 15 17:01:34.721411 [NOTICE] getrlimit(nofile): max:1024, cur:1024
Dec 15 17:01:34.721973 [NOTICE] libevent: using epoll
Dec 15 17:01:34.722286 [NOTICE] fpm is running, pid 24375
Dec 15 17:01:34.723695 [NOTICE] child 24376 (pool default) started
Dec 15 17:01:34.725099 [NOTICE] child 24377 (pool default) started
Dec 15 17:01:34.725383 [NOTICE] libevent: entering main loop
Dec 15 17:01:37.044002 [WARNING] child 24377 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 17:01:38.215927 [WARNING] child 24377 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
^CDec 15 17:01:41.237395 [NOTICE] receivedSIGINT
Dec 15 17:01:41.237443 [NOTICE] switching to 'terminating' state
Dec 15 17:01:41.237462 [NOTICE] sending signal 15SIGTERM
to child
24377 (pool default)
Dec 15 17:01:41.237473 [NOTICE] sending signal 15SIGTERM
to child
24376 (pool default)
Dec 15 17:01:41.237481 [NOTICE] 2 children are still alive
Dec 15 17:01:41.237938 [NOTICE] receivedSIGCHLD
Dec 15 17:01:41.238015 [WARNING] child 24377 (pool default) exited on
signal 2SIGINT
after 6.512935 seconds from start
Dec 15 17:01:41.238389 [NOTICE] receivedSIGCHLD
Dec 15 17:01:41.238441 [WARNING] child 24376 (pool default) exited on
signal 2SIGINT
after 6.514792 seconds from start
Dec 15 17:01:41.238473 [WARNING] child 24376 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped."
Dec 15 17:01:41.238490 [WARNING] child 24376 (pool default) said into
stderr: "Connection from disallowed IP address '127.0.0.1' is
dropped.", pipe is closed
Dec 15 17:01:41.238502 [NOTICE] exiting, bye-bye!I made the same number of connection and only the ones handled by the
last created child has been written on real time. Messages from the
other child have been written when the program terminated (so much
much later in real production env).To change the polling system used by libevent you can play with the
env variables EVENT_NOEPOLL, EVENT_NOKQUEUE, EVENT_NODEVPOLL,
EVENT_NOPOLL or EVENT_NOSELECT.Can someone reproduce this bug ? I want to be sure it's a bug on FPM
and not a mis-configured system.
I'll try to find out an explanation and a solution++ jerome