Newsgroups: php.internals
Xref: news.php.net php.internals:98398
Date: Fri, 3 Mar 2017 22:28:03 +0100
To: Rasmus Schultz
Cc: PHP internals
Subject: Re: [PHP-DEV] PCRE caching
From: nikita.ppv@gmail.com (Nikita Popov)

On Wed, Mar 1, 2017 at 4:35 PM, Rasmus Schultz wrote:

> Hey internals,
>
> I was wondering whether or how PCRE regular expression get parsed and
> cached, and I found this answer on Stack Overflow:
>
> http://stackoverflow.com/questions/209906/compile-regex-in-php
>
> Do I understand this correctly, that:
>
> 1. All regular expressions are hashed and the compiled expression is cached
> internally between calls.
>

Correct.

> 2. The /S modifier applies more optimizations during compile, but caching
> works the same way.
>

Yes. Additionally, if PCRE JIT is enabled (which it usually is on PHP 7) we
always study, independently of whether /S was specified.

> 3. Compiled expressions are not cached between requests.
>

Compiled expressions are cached between requests. However, they are not
shared between processes (I'm not even sure if that's possible.)

The cache invalidation strategy is FIFO. More specifically, whenever the
cache fills up, we discard the first 1/8 cached regular expressions.

Nikita

> If so, this seems far from optimal.
>
> Every unique regular expression needs to be compiled during every request,
> right?
>
> So with FPM, or with long-running apps, we're missing an opportunity to
> optimize by caching between requests.
>
> And with long-running apps, we're caching every dynamic regular expression,
> which could harm (memory overhead) more than help.
>
> Ideally, shouldn't we have (like some engines/languages) a switch to enable
> caching?
>
> The run-time can't know if a given regular expression is dynamic or static,
> can it? It's just a string either way - so without a switch, you're either
> committing compiled dynamic expressions to the cache unnecessarily, and/or
> missing an opportunity to cache between requests in long-running apps or
> under FPM.
>
> I think most apps use quite a lot of regular expression for validation etc.
> so maybe there's a missed optimization opportunity here?
>
> Cheers,
> Rasmus Schultz
>
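
For illustration only, the eviction behaviour described in the reply above
(a per-process cache keyed by the pattern, FIFO order, dropping the first 1/8
of the entries whenever the cache is full) could be sketched roughly as
follows. This is a toy model in Python, not the actual PHP implementation:
the class name RegexCache, the capacity parameter, and the use of Python's
re module in place of PCRE are all assumptions made for the sketch.

```python
import re
from itertools import islice


class RegexCache:
    """Toy FIFO cache of compiled patterns (illustrative, not PHP's code)."""

    def __init__(self, capacity=8):
        # capacity is a hypothetical value chosen for this sketch
        self.capacity = capacity
        # dicts preserve insertion order, so the oldest entries come first
        self._entries = {}

    def get(self, pattern, flags=0):
        key = (pattern, flags)
        compiled = self._entries.get(key)
        if compiled is not None:
            # Cache hit: reuse the compiled pattern. The entry keeps its
            # original position, which is what makes this FIFO, not LRU.
            return compiled
        if len(self._entries) >= self.capacity:
            self._evict_oldest()
        compiled = re.compile(pattern, flags)
        self._entries[key] = compiled
        return compiled

    def _evict_oldest(self):
        # Discard the first 1/8 of the entries (at least one), oldest first.
        count = max(1, self.capacity // 8)
        for key in list(islice(self._entries, count)):
            del self._entries[key]

    def __len__(self):
        return len(self._entries)


if __name__ == "__main__":
    cache = RegexCache(capacity=8)
    for i in range(8):
        cache.get(f"^item{i}$")
    # A repeated pattern returns the same compiled object.
    assert cache.get("^item1$") is cache.get("^item1$")
    # The cache is full, so adding a new pattern evicts the oldest entry.
    cache.get("^new$")
    print(len(cache))
```

Because hits do not move an entry, a frequently used pattern that was
compiled early can still be discarded once enough new patterns arrive, which
is relevant to the concern about dynamic expressions in long-running
processes.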