Newsgroups: php.internals
Date: Wed, 1 Mar 2017 16:35:38 +0100
To: PHP internals
Subject: PCRE caching
From: rasmus@mindplay.dk (Rasmus Schultz)

Hey internals,

I was wondering whether and how PCRE regular expressions get parsed and cached, and I found this answer on Stack Overflow:

http://stackoverflow.com/questions/209906/compile-regex-in-php

Do I understand this correctly, that:

1. All regular expressions are hashed, and the compiled expression is cached internally between calls.
2. The /S modifier applies more optimizations during compilation, but caching works the same way.
3. Compiled expressions are not cached between requests.

If so, this seems far from optimal. Every unique regular expression needs to be compiled during every request, right?

So with FPM, or with long-running apps, we're missing an opportunity to optimize by caching between requests. And with long-running apps, we're caching every dynamic regular expression, which could do more harm (memory overhead) than good.

Ideally, shouldn't we have (like some engines/languages) a switch to enable caching? The run-time can't know if a given regular expression is dynamic or static, can it?
It's just a string either way - so without a switch, you're committing compiled dynamic expressions to the cache unnecessarily, and/or missing an opportunity to cache between requests in long-running apps or under FPM.

I think most apps use quite a lot of regular expressions for validation etc., so maybe there's a missed optimization opportunity here?

Cheers,
Rasmus Schultz
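
To make the idea of a caching switch concrete, here is a minimal sketch in Python (not PHP, and not how PHP's PCRE extension is actually implemented). It assumes a per-process cache keyed by the pattern string. The `PatternCache` class, the `cache` flag, and the `max_entries` limit are all hypothetical names and values chosen for illustration.

```python
import re
from collections import OrderedDict


class PatternCache:
    """Illustrative per-process cache of compiled patterns, keyed by pattern string."""

    def __init__(self, max_entries=1000):
        # max_entries is an arbitrary illustrative cap on memory use.
        self.max_entries = max_entries
        self._compiled = OrderedDict()
        self.compile_count = 0  # number of times this class called re.compile

    def get(self, pattern, cache=True):
        # Hypothetical switch: pass cache=False for dynamic, one-off patterns
        # so they are compiled but never stored.
        if cache and pattern in self._compiled:
            self._compiled.move_to_end(pattern)  # mark as recently used
            return self._compiled[pattern]

        self.compile_count += 1
        compiled = re.compile(pattern)

        if cache:
            self._compiled[pattern] = compiled
            if len(self._compiled) > self.max_entries:
                self._compiled.popitem(last=False)  # evict least recently used
        return compiled

    def __len__(self):
        return len(self._compiled)


cache = PatternCache(max_entries=2)
cache.get(r"^\d+$")                    # static pattern: compiled and stored
cache.get(r"^\d+$")                    # same string again: served from the cache
cache.get(r"^user-42$", cache=False)   # dynamic pattern: compiled, not stored
```

With a flag like this, the caller (who knows whether a pattern is static or built at run time) decides what gets cached, and the size cap bounds the memory overhead in a long-running process.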