To: internals@lists.php.net
Date: Wed, 7 Jun 2017 21:13:17 +0100
Subject: Re: [PHP-DEV] preg_match() option for anchored offset?
From: rowan.collins@gmail.com (Rowan Collins)

On 07/06/2017 21:03, Rasmus Schultz wrote:
> What do you think about adding another option to preg_match() to allow the
> $offset parameter to be treated as the start anchor?
>
> The manual proposes to do this:
>
> $subject = "abcdef";
> $pattern = '/^def/';
> $offset = 3;
> preg_match($pattern, substr($subject, $offset), $matches);
>
> In other words, use substr() to copy the entire remainder of the string.
>
> I just wrote a simple SQL parser tonight, and had to use this approach,
> which (I imagine) must be pretty inefficient?
>
> I'd like to be able to do the following:
>
> $subject = "abcdef";
> $pattern = '/^def/';
> $offset = 3;
> preg_match($pattern, $subject, $matches, PREG_ANCHOR_OFFSET, $offset);
>
> This new option would make the ^ anchor work from the given $offset, which
> allows me to parse the entire $subject without copying anything.
>
> Thoughts?
>

How would you propose to implement this? Is it something PCRE already
supports? Or would you manipulate the subject string pointer at the point
it's passed down?

The latter approach seems doable, but I'm not sure if there are any subtle
gotchas.

Regards,

-- 
Rowan Collins
[IMSoP]
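
On the question of whether PCRE already supports this: as far as I know, the \G assertion and the A (PCRE_ANCHORED) pattern modifier both tie a match to the offset passed to preg_match(). The following is only a minimal sketch comparing them with the substr() approach quoted above. It reuses the sample $subject and $offset from the quoted message; the result variable names are made up for illustration, and it does not use the proposed PREG_ANCHOR_OFFSET flag, which does not exist.

```php
<?php
$subject = "abcdef";
$offset  = 3;

// Approach quoted from the manual: copy the tail of the string, then anchor with ^.
$viaSubstr = preg_match('/^def/', substr($subject, $offset), $m1);

// \G asserts that the match starts exactly at the offset given to preg_match(),
// so no copy of the subject is needed.
$viaG = preg_match('/\Gdef/', $subject, $m2, 0, $offset);

// The A modifier (PCRE_ANCHORED) anchors the whole pattern at the start offset.
$viaA = preg_match('/def/A', $subject, $m3, 0, $offset);

// With an offset of 2 the text at that position is "cdef", so the anchored
// pattern fails instead of scanning ahead to find "def".
$miss = preg_match('/\Gdef/', $subject, $m4, 0, 2);
```

Note that ^ by itself would not work here, because it still refers to the real start of $subject regardless of the offset.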