Hi,
I was working on a patch to support interned strings in thread safe builds
of PHP. ATM there seems to be a good progress on this. The proposed patch
unifies TS/NTS builds in regard to the interned strings handling and fixes
issues in TS builds.
https://github.com/php/php-src/pull/2390
The basic idea is like this
- CG(interned_strings) is removed
- there are two tables
- one table is persistent only, can be filled only at startup and is
maintained per process - second table is allocated per thread in TS builds and per process in NTS
builds - second table is filled/emptied at every request, separate for every
process/thread - in the startup phase, any interned string is saved into the first table
- once a snapshot was made, that table stays unchanged until the process
exit - after startup, any call to zend_new_interned_string saves into the second
table, which is cleaned up per request
There was also another idea we discussed with Dmitry, to have only one
interned strings table, initially filled with permanent strings, which would
be copied into every new thread. In my early implementation I had to abandon
it, as a multiple tables idea seems to have more perspective. Especially in
TS builds it means the long living strings reside in memory only once. The
second table overhead is neglecting, the API can also be extended to
override the storages, thus giving the full control over where and how
things are stored.
Additionally, I've defined three stages, where the interned string tables
init/dtor happens. Those are
- SAPI
- TSRM
- Zend
The functions for init/dtor will record the stage, and will call the dtor
accordingly. For the TS Apache variant it is required, because the main
thread holds all the globals and the final interned strings destruction
needs to happen after the main thread globals destruction. In general, it
also useful in any build kind, as the permanent storage can be then used
even before the engine is actually initialized. To do so, the SAPI can
define init/dtor calls at the outermost places. The TSRM stage will be ATM
missing in NTS builds, and if nothing else is defined, the old way of
init/dtor in Zend is taken. The init function can be called at any stage,
but the subsequent calls will be ignored and the dtor will be called only
when the matching stage argument is passed.
From what I can tell so far, at least the bugs here are fixed
https://bugs.php.net/bug.php?id=74015
https://bugs.php.net/bug.php?id=72451
These are likely to have been fixed, too, but some harder to test
https://bugs.php.net/bug.php?id=71115
https://bugs.php.net/bug.php?id=74011
On my side, I tested quite hard with several small snippets from the bug
reports and with WP and Symfony and could observe a huge stability increase.
Unfortunately it is not really sensible to tell a performance diff, as same
snippets were crashing quite quickly without the patch. The test setup I
have is Apache worker/winnt penetrated by 8/16 simultaneous clients on a
machine with 4 (Linux) or 8 (Windows) real processor cores.
Regarding Opcache, this solution seems to work out of the box. As many the
strings are overridden, there seems to be no impact. Though this should be
double checked.
I have to mention, that there might be also other issues regarding TS, fe
with resources. I for now was only concentrated on the interned strings, as
this is the most urgent part we still miss. Some general refactoring to the
TS layer could be targeted later. Possible improvements to strings could be
- intern more strings on startup - the constants, the ini directive names,
etc. - introduce static interned strings, those that are already hardcoded into
the text/data segment - introduce fixed size interned zend_string for short strings
- introduce a special zend_string to hold path, thus saving strlen at many
places, maybe also fixed size
Thanks
Anatol
Hi Anatol,
Great progress!
I didn't look into the code careful yet, but the explanation looks reasonable.
For now, I see one missing part - opcache can't work completely out of the box, because its interned string related functions are wrapped with #ifndef ZTS
Thanks. Dmitry.
Hi Dmitry,
-----Original Message-----
From: Dmitry Stogov [mailto:dmitry@zend.com]
Sent: Friday, February 17, 2017 1:46 PM
To: Anatol Belski anatol.php@belski.net; 'Nikita Popov' nikic@php.net;
'Xinchen Hui' laruence@php.net; 'Joe Watkins' krakjoe@php.net
Cc: 'PHP internals list' internals@lists.php.net
Subject: [PHP-DEV] Re: Thread safe interned stringsHi Anatol,
Great progress!
I didn't look into the code careful yet, but the explanation looks
reasonable.For now, I see one missing part - opcache can't work completely out of the
box,
because its interned string related functions are wrapped with #ifndef ZTS
Oh, indeed. Getting to this part now then :)
Thanks
Anatol
Thanks. Dmitry.
From: Anatol Belski anatol.php@belski.net
Sent: Friday, February 17, 2017 3:17:05 PM
To: Dmitry Stogov; 'Nikita Popov'; 'Xinchen Hui'; 'Joe Watkins'
Cc: 'PHP internals list'
Subject: Thread safe interned stringsHi,
I was working on a patch to support interned strings in thread safe builds
of PHP.
ATM there seems to be a good progress on this. The proposed patch unifies
TS/NTS builds in regard to the interned strings handling and fixes issues
in TS
builds.https://github.com/php/php-src/pull/2390
The basic idea is like this
- CG(interned_strings) is removed
- there are two tables
- one table is persistent only, can be filled only at startup and is
maintained per
process- second table is allocated per thread in TS builds and per process in NTS
builds- second table is filled/emptied at every request, separate for every
process/thread- in the startup phase, any interned string is saved into the first table
- once a snapshot was made, that table stays unchanged until the process
exit- after startup, any call to zend_new_interned_string saves into the
second table,
which is cleaned up per requestThere was also another idea we discussed with Dmitry, to have only one
interned strings table, initially filled with permanent strings, which
would be
copied into every new thread. In my early implementation I had to abandon
it,
as a multiple tables idea seems to have more perspective. Especially in TS
builds
it means the long living strings reside in memory only once. The second
table
overhead is neglecting, the API can also be extended to override the
storages,
thus giving the full control over where and how things are stored.Additionally, I've defined three stages, where the interned string tables
init/dtor
happens. Those are
- SAPI
- TSRM
- Zend
The functions for init/dtor will record the stage, and will call the dtor
accordingly.
For the TS Apache variant it is required, because the main thread holds
all the
globals and the final interned strings destruction needs to happen after
the main
thread globals destruction. In general, it also useful in any build kind,
as the
permanent storage can be then used even before the engine is actually
initialized. To do so, the SAPI can define init/dtor calls at the
outermost places.
The TSRM stage will be ATM missing in NTS builds, and if nothing else is
defined,
the old way of init/dtor in Zend is taken. The init function can be called
at any
stage, but the subsequent calls will be ignored and the dtor will be
called only
when the matching stage argument is passed.From what I can tell so far, at least the bugs here are fixed
https://bugs.php.net/bug.php?id=74015
https://bugs.php.net/bug.php?id=72451These are likely to have been fixed, too, but some harder to test
https://bugs.php.net/bug.php?id=71115
https://bugs.php.net/bug.php?id=74011On my side, I tested quite hard with several small snippets from the bug
reports
and with WP and Symfony and could observe a huge stability increase.
Unfortunately it is not really sensible to tell a performance diff, as
same snippets
were crashing quite quickly without the patch. The test setup I have is
Apache
worker/winnt penetrated by 8/16 simultaneous clients on a machine with 4
(Linux) or 8 (Windows) real processor cores.Regarding Opcache, this solution seems to work out of the box. As many the
strings are overridden, there seems to be no impact. Though this should be
double checked.I have to mention, that there might be also other issues regarding TS, fe
with
resources. I for now was only concentrated on the interned strings, as
this is the
most urgent part we still miss. Some general refactoring to the TS layer
could be
targeted later. Possible improvements to strings could be
- intern more strings on startup - the constants, the ini directive
names, etc.- introduce static interned strings, those that are already hardcoded into
the
text/data segment- introduce fixed size interned zend_string for short strings
- introduce a special zend_string to hold path, thus saving strlen at many
places,
maybe also fixed sizeThanks
Anatol
Hi,
For now, I see one missing part - opcache can't work completely out of
the
box,
because its interned string related functions are wrapped with #ifndef
ZTSOh, indeed. Getting to this part now then :)
The Opcache support is done now. Additional things that are done
- some startup/shutdown sequence refactoring was required in zend.c, to
mitigate shutdown issues with Opcache - CG(zend_string) was removed, instead there is a global variable
- fixed phpdbg integration
With this, looks like it's ready for the next review round. I'm doing some
load tests in between.
Thanks
Anatol
Hi,
-----Original Message-----
From: Anatol Belski [mailto:anatol.php@belski.net]
Sent: Monday, February 20, 2017 4:50 PM
To: 'Dmitry Stogov' dmitry@zend.com; 'Nikita Popov' nikic@php.net;
'Xinchen Hui' laruence@php.net; 'Joe Watkins' krakjoe@php.net
Cc: 'PHP internals list' internals@lists.php.net
Subject: RE: [PHP-DEV] Re: Thread safe interned stringsHi,
For now, I see one missing part - opcache can't work completely out
of the
box,
because its interned string related functions are wrapped with
#ifndef ZTSOh, indeed. Getting to this part now then :)
The Opcache support is done now. Additional things that are done
- some startup/shutdown sequence refactoring was required in zend.c, to
mitigate shutdown issues with Opcache- CG(zend_string) was removed, instead there is a global variable
- fixed phpdbg integration
With this, looks like it's ready for the next review round. I'm doing some
load
tests in between.
I'm glad to inform, that the interned strings unification patch has reached
the stable state now. It can be found under
https://github.com/php/php-src/pull/2390 . The tests are successful so far
with FCGI and Apache under Linux and Windows. The intention is to merge the
patch into master next days. Additional testing would be great to have, of
course.
Huge thanks to Dmitry for the valuable support!
Regards
Anatol