Newsgroups: php.internals
To: "'Dmitry Stogov'", "'Nikita Popov'", "'Xinchen Hui'", "'Joe Watkins'"
Cc: "'PHP internals list'"
Date: Fri, 17 Feb 2017 13:17:05 +0100
Message-ID: <005501d28917$c472d9c0$4d588d40$@belski.net>
Subject: Thread safe interned strings
From: anatol.php@belski.net ("Anatol Belski")

Hi,

I have been working on a patch to support interned strings in thread safe builds of PHP.
ATM there seems to be good progress on this. The proposed patch unifies TS and NTS builds with regard to interned strings handling and fixes issues in TS builds.

https://github.com/php/php-src/pull/2390

The basic idea is as follows:

- CG(interned_strings) is removed
- there are two tables
  - the first table is persistent only, can be filled only at startup, and is maintained per process
  - the second table is allocated per thread in TS builds and per process in NTS builds
  - the second table is filled and emptied on every request, separately for every process/thread
- in the startup phase, any interned string is saved into the first table
- once a snapshot has been made, that table stays unchanged until process exit
- after startup, any call to zend_new_interned_string saves into the second table, which is cleaned up per request

There was also another idea we discussed with Dmitry: to have only one interned strings table, initially filled with the permanent strings, which would be copied into every new thread. In my early implementation I had to abandon it, as the multiple tables idea seems more promising. Especially in TS builds, it means the long-living strings reside in memory only once. The overhead of the second table is negligible, and the API can also be extended to override the storages, thus giving full control over where and how things are stored.

Additionally, I've defined three stages at which the interned string tables init/dtor can happen:

- SAPI
- TSRM
- Zend

The init/dtor functions record the stage and call the dtor accordingly. This is required for the TS Apache variant, because the main thread holds all the globals and the final interned strings destruction needs to happen after the main thread globals destruction. In general, it is also useful in any kind of build, as the permanent storage can then be used even before the engine is actually initialized. To do so, the SAPI can define init/dtor calls at the outermost places.
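
To make the scheme a bit more concrete, here is a rough sketch in plain C of the two tables and the stage recording. This is not the code from the PR: all names (str_table, intern, interned_init, interned_dtor, interned_snapshot, request_shutdown) are made up for illustration, a small linear array stands in for the real hash table, and the rule assumed for the stages is that the first init call wins and the dtor only acts when it is called with that same stage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is not the php-src API. */
#define MAX_STRINGS 64

typedef struct {
    char  *items[MAX_STRINGS];
    size_t count;
} str_table;

typedef enum { STAGE_NONE, STAGE_SAPI, STAGE_TSRM, STAGE_ZEND } init_stage;

/* Table 1: one per process, filled at startup only, frozen after the snapshot. */
static str_table  permanent;
static int        snapshot_taken = 0;
static init_stage recorded_stage = STAGE_NONE;

/* Table 2: one per thread (C11 thread-local storage), emptied per request. */
static _Thread_local str_table per_request;

static const char *table_find(const str_table *t, const char *s)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->items[i], s) == 0) return t->items[i];
    }
    return NULL;
}

static const char *table_add(str_table *t, const char *s)
{
    if (t->count == MAX_STRINGS) return NULL;
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy == NULL) return NULL;
    memcpy(copy, s, len);
    t->items[t->count++] = copy;
    return copy;
}

static void table_clear(str_table *t)
{
    for (size_t i = 0; i < t->count; i++) free(t->items[i]);
    t->count = 0;
}

/* The first call records its stage; later calls are ignored (return 0). */
int interned_init(init_stage stage)
{
    assert(stage != STAGE_NONE);
    if (recorded_stage != STAGE_NONE) return 0;
    recorded_stage = stage;
    return 1;
}

/* Destroys the permanent table only when the stage matches the recorded one. */
int interned_dtor(init_stage stage)
{
    if (stage == STAGE_NONE || stage != recorded_stage) return 0;
    table_clear(&permanent);
    snapshot_taken = 0;
    recorded_stage = STAGE_NONE;
    return 1;
}

/* End of startup: from here on the permanent table is read-only. */
void interned_snapshot(void) { snapshot_taken = 1; }

/* Look in the permanent table first, then fall back to the per-request one. */
const char *intern(const char *s)
{
    const char *hit = table_find(&permanent, s);
    if (hit != NULL) return hit;
    if (!snapshot_taken) return table_add(&permanent, s);
    hit = table_find(&per_request, s);
    return hit != NULL ? hit : table_add(&per_request, s);
}

/* Called at the end of every request, in every thread. */
void request_shutdown(void) { table_clear(&per_request); }
```

In this sketch a SAPI would call interned_init(STAGE_SAPI) at its outermost entry point, intern the startup strings, and call interned_snapshot(). A later interned_init(STAGE_ZEND) from the engine is then ignored, and interned_dtor(STAGE_ZEND) leaves the permanent table alone until the SAPI itself calls interned_dtor(STAGE_SAPI). Since the permanent table is never written after the snapshot, threads can read it without locking, and the per-request table is thread-local.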

The TSRM stage is currently missing in NTS builds, and if nothing else is defined, the old way of init/dtor in Zend is used. The init function can be called at any stage, but subsequent calls are ignored, and the dtor runs only when the matching stage argument is passed.

From what I can tell so far, at least these bugs are fixed:

https://bugs.php.net/bug.php?id=74015
https://bugs.php.net/bug.php?id=72451

These are likely to have been fixed too, but are harder to test:

https://bugs.php.net/bug.php?id=71115
https://bugs.php.net/bug.php?id=74011

On my side, I tested quite hard with several small snippets from the bug reports and with WP and Symfony, and could observe a huge stability increase. Unfortunately, it is not really sensible to give a performance diff, as the same snippets were crashing quite quickly without the patch. The test setup I have is Apache worker/winnt stressed by 8/16 simultaneous clients on a machine with 4 (Linux) or 8 (Windows) real processor cores.

Regarding Opcache, this solution seems to work out of the box. As many of the strings are overridden, there seems to be no impact, though this should be double checked.

I have to mention that there might also be other issues regarding TS, e.g. with resources. For now I concentrated only on the interned strings, as this is the most urgent part we are still missing. Some general refactoring of the TS layer could be targeted later.

Possible improvements to strings could be:

- intern more strings on startup: the constants, the ini directive names, etc.
- introduce static interned strings, i.e. those that are already hardcoded into the text/data segment
- introduce a fixed size interned zend_string for short strings
- introduce a special zend_string to hold paths, thus saving strlen calls in many places, maybe also fixed size

Thanks

Anatol