Newsgroups: php.internals
Date: Sat, 28 Sep 2013 15:15:13 -0400
To: PHP Internals
Subject: Remove requirement to escape delimiters in regular expression in PCRE functions
From: theanomaly.is@gmail.com (Sherif Ramadan)

Hi,

Someone pointed out to me recently that since delimiters are not a requirement of PCRE, and thus should not be considered part of the regular expression, there really is no need to escape them inside the regular expression passed to preg_match and similar functions.

I propose removing this requirement by searching for the closing delimiter from the end of the regex string: skip over any trailing modifiers, then work backwards until the delimiter is found. Currently the implementation finds the closing delimiter by iterating through the entire regex string from start to end and stopping at the first unescaped delimiter, which is why we have to escape the delimiter inside the expression.

The change introduces no BC breakage as far as I can tell and offers a number of advantages.

1) Regular expressions supplied to preg_match functions become portable across PCRE implementations, since escaping the delimiter inside the regular expression really isn't required. (That is, a pattern may work when moved from PHP to Perl, but may not work when moved from Perl to PHP.) This makes things a lot easier to port between PHP and other PCRE implementations.

2) The regular expression is easier for a human to read if we remove the requirement to escape delimiters.

3) It becomes clearer that the delimiters are not part of the regular expression, both in the implementation and in userspace code.

It's not a big change, so I propose merging it into 5.5, or into 5.4 and then up into 5.5.

Thoughts, objections, etc?
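
P.S. To make the difference between the two scanning strategies concrete, here is a rough sketch written in Python purely for illustration. It is not the actual PHP implementation. The function names (closing_delimiter, split_forward, split_backward) are hypothetical, and the sketch assumes that modifiers are plain ASCII letters, that the pattern has no leading whitespace, and that nesting of bracket-style delimiters can be ignored.

```python
import string

# Bracket-style opening delimiters close with their counterpart;
# any other delimiter closes with the same character.
BRACKET_PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}


def closing_delimiter(pattern):
    """Return the closing delimiter implied by the first character."""
    if not pattern:
        raise ValueError("empty regular expression")
    opening = pattern[0]
    if opening.isalnum() or opening == "\\" or opening.isspace():
        raise ValueError("invalid delimiter: %r" % opening)
    return BRACKET_PAIRS.get(opening, opening)


def split_forward(pattern):
    """Forward scan: stop at the first unescaped closing delimiter.

    Everything after that delimiter is treated as modifiers, so a
    delimiter inside the expression has to be escaped.
    (Simplification: bracket nesting is not tracked here.)
    """
    close = closing_delimiter(pattern)
    i = 1
    while i < len(pattern):
        if pattern[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if pattern[i] == close:
            return pattern[1:i], pattern[i + 1:]
        i += 1
    raise ValueError("no ending delimiter found")


def split_backward(pattern):
    """Backward scan: skip trailing modifier letters, then expect the
    closing delimiter. Delimiters inside the expression need no escape.
    """
    close = closing_delimiter(pattern)
    end = len(pattern)
    # Assumption: modifiers are ASCII letters only.
    while end > 1 and pattern[end - 1] in string.ascii_letters:
        end -= 1
    if end <= 1 or pattern[end - 1] != close:
        raise ValueError("no ending delimiter found")
    return pattern[1:end - 1], pattern[end:]
```

With an unescaped delimiter such as "/a/b/i", the forward scan splits after "a" and hands "b/i" to the modifier parser, while the backward scan yields the expression "a/b" with modifier "i". For a pattern that already escapes its delimiter, both scans return the same split, which is the basis for expecting existing valid patterns to keep working.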