Newsgroups: php.internals
To: Rasmus Schultz, PHP internals
Date: Wed, 1 Mar 2017 16:42:38 +0100
Subject: Re: [PHP-DEV] PCRE caching
From: ivan.enderlin@hoa-project.net (Ivan Enderlin)

Hello,

We can also use an LRU caching strategy with a pre-defined (or
user-defined) number of expressions to keep in the cache. It would also
be a good idea to track the number of times a regular expression is
used. If this number reaches a certain threshold, we could automatically
re-compile it with the `S` modifier.

I agree there is a missed optimisation here. I share your feeling.

Regards.

On 01.03.17 16:35, Rasmus Schultz wrote:
> Hey internals,
>
> I was wondering whether or how PCRE regular expressions get parsed and
> cached, and I found this answer on Stack Overflow:
>
> http://stackoverflow.com/questions/209906/compile-regex-in-php
>
> Do I understand this correctly, that:
>
> 1. All regular expressions are hashed and the compiled expression is cached
> internally between calls.
>
> 2. The /S modifier applies more optimizations during compile, but caching
> works the same way.
>
> 3. Compiled expressions are not cached between requests.
>
> If so, this seems far from optimal.
>
> Every unique regular expression needs to be compiled during every request,
> right?
>
> So with FPM, or with long-running apps, we're missing an opportunity to
> optimize by caching between requests.
>
> And with long-running apps, we're caching every dynamic regular expression,
> which could harm (memory overhead) more than help.
>
> Ideally, shouldn't we have (like some engines/languages) a switch to enable
> caching?
>
> The run-time can't know if a given regular expression is dynamic or static,
> can it? It's just a string either way - so without a switch, you're either
> committing compiled dynamic expressions to the cache unnecessarily, and/or
> missing an opportunity to cache between requests in long-running apps or
> under FPM.
>
> I think most apps use quite a lot of regular expressions for validation etc.
> so maybe there's a missed optimization opportunity here?
>
> Cheers,
> Rasmus Schultz
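
To make the LRU-plus-counter suggestion from the reply concrete, here is a rough sketch of what such a cache could look like. It is written in Python only to keep it short and runnable; it is not how PHP's PCRE extension is implemented. The class name `RegexCache`, the `capacity` and `study_threshold` parameters, and their default values are all hypothetical. Python's `re` module has no counterpart to the `S` modifier, so the "re-compile with `S`" step is reduced to setting a flag.

```python
import re
from collections import OrderedDict


class RegexCache:
    """Bounded LRU cache of compiled patterns with per-entry use counts."""

    def __init__(self, capacity=128, study_threshold=16):
        # Both limits are illustrative values, not taken from PHP.
        self.capacity = capacity
        self.study_threshold = study_threshold
        # Maps pattern string -> [compiled pattern, use count, studied flag].
        # Insertion order doubles as recency order (oldest first).
        self._entries = OrderedDict()

    def get(self, pattern):
        entry = self._entries.get(pattern)
        if entry is None:
            # Cache miss: compile once and remember the result.
            entry = [re.compile(pattern), 0, False]
            self._entries[pattern] = entry
            if len(self._entries) > self.capacity:
                # Evict the least recently used pattern.
                self._entries.popitem(last=False)
        else:
            # Cache hit: mark this pattern as most recently used.
            self._entries.move_to_end(pattern)
        entry[1] += 1
        if not entry[2] and entry[1] >= self.study_threshold:
            # Stand-in for re-compiling with the `S` modifier: a real
            # implementation would spend extra compile time here.
            entry[2] = True
        return entry[0]

    def is_studied(self, pattern):
        entry = self._entries.get(pattern)
        return entry is not None and entry[2]

    def __contains__(self, pattern):
        return pattern in self._entries


cache = RegexCache(capacity=2, study_threshold=3)
for p in (r"\d+", r"[a-z]+", r"\d+", r"\s+"):
    cache.get(p)  # the fourth call evicts r"[a-z]+"
```

The bound on the number of entries addresses the memory concern for long-running apps with many dynamic expressions, while the counter means only patterns that are demonstrably hot pay for the extra optimisation pass.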