From: rob@bottled.codes ("Rob Landers")
To: internals@lists.php.net
Date: Fri, 15 Aug 2025 00:12:21 +0200
Subject: Re: [PHP-DEV] Pipe precedence challenges


On Thu, Aug 14, 2025, at 21:30, Larry Garfield wrote:
Hi folks. We have discovered a subtle issue with the pipes implementation that needs to be addressed, although we're not entirely sure how. Specifically, Derick found it while trying to ensure Xdebug plays nice with pipes.

The problem has to do with the operator precedence of short closures vs pipes. For example:

$result = $arr
  |> fn($x) => array_map(strtoupper(...), $x)
  |> fn($x) => array_filter($x, fn($v) => $v != 'O');
  
It's logical to assume that would be parsed as

$result = $arr
  |> (fn($x) => array_map(strtoupper(...), $x))
  |> (fn($x) => array_filter($x, fn($v) => $v != 'O'));

Which then compiles into, essentially:

$temp = (fn($x) => array_map(strtoupper(...), $x))($arr);
$temp = (fn($x) => array_filter($x, fn($v) => $v != 'O'))($temp);
$result = $temp;

Or

$result = (fn($x) => array_filter($x, fn($v) => $v != 'O'))((fn($x) => array_map(strtoupper(...), $x))($arr));

(depending on how you want to visualize it.)

That was the intent of the RFC.

However, because short closures are "greedy" and assume anything before the next semicolon is part of the closure body, it's actually getting parsed like this:

$result = $arr
    |> fn($x) => (
        array_map(strtoupper(...), $x)
            |> fn($x) => (
                array_filter($x, fn($v) => $v != 'O')
            )
    )
;

Which would compile into something like:

$result = (fn($x) => (fn($x) => array_filter($x, fn($v) => $v != 'O'))(array_map(strtoupper(...), $x)))($arr);

Which is not the intent.

Humorously, if all the functions and closures involved are pure, this parsing difference *usually* doesn't matter. The result is still computed as intended. That's why it wasn't caught during earlier reviews or by automated tests. However, there are cases where it matters. For example:

42
    |> fn($x) => $x < 42
    |> fn($x) => var_dump($x);

One would expect that to evaluate to false in the first segment, then var_dump() false. But it actually var_dump()s 42, because it gets interpreted as (42 |> fn($x) => var_dump($x)) first.
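
A minimal sketch of the two groupings, with every pipe step written as an explicit closure call so it runs without the `|>` operator (it needs PHP 8.1+ for `strtoupper(...)`). The names `$map`, `$filter`, `$dump`, and `$seen` are illustrative only, and `$dump` is a stand-in that records its argument instead of printing it. With the pure array steps both groupings agree; with the `42` chain they do not.

```php
<?php
// Illustrative names only; each pipe step is written as a plain call.
$map    = fn($x) => array_map(strtoupper(...), $x);
$filter = fn($x) => array_filter($x, fn($v) => $v != 'O');
$arr    = ['a', 'o', 'b'];

// Intended grouping: ($arr |> $map) |> $filter
$intended = $filter($map($arr));
// Greedy grouping: $arr |> fn($x) => (array_map(...) |> $filter)
$greedy = (fn($x) => $filter(array_map(strtoupper(...), $x)))($arr);
// Pure steps: both groupings produce the same array.
$sameForPureSteps = ($intended === $greedy);

// Stand-in for var_dump(): records its argument and returns null.
$seen = [];
$dump = function ($x) use (&$seen) {
    $seen[] = $x;
    return null;
};

// Intended grouping: (42 |> fn($x) => $x < 42) |> $dump
$dump((fn($x) => $x < 42)(42));
// Greedy grouping: 42 |> fn($x) => ($x < (42 |> $dump))
(fn($x) => $x < $dump(42))(42);

// $seen[0] is what the intended grouping dumps, $seen[1] the greedy one.
var_dump($sameForPureSteps, $seen);
```

Here `$seen` ends up as `[false, 42]`: the intended grouping hands `false` to the last step, while the greedy grouping hands it the literal `42`.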

The incorrect wrapping also makes debugging vastly harder, even if the computed result is the same: the expression is broken up "wrong" into multiple nested closures, stack traces are different than one would expect, etc.

The challenge is conflicting requirements. Closures have extremely low precedence right now, specifically so that they will grab everything that comes after them as a single expression. However, we also want pipes to allow a step to be a closure; that would typically mean making pipe bind even lower than closures, but that's not viable because it would result in

$result = 'x' |> fn ($x) => strtoupper($x)

being interpreted as

($result = 'x') |> (fn ($x) => strtoupper($x))

Which would be rather pointless.
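
To see why that grouping is pointless, here is a small sketch with the pipe written as a plain call. The names `$step`, `$wanted`, `$assigned`, and `$discarded` are illustrative only.

```php
<?php
$step = fn($x) => strtoupper($x);

// Wanted grouping: $result = ('x' |> $step)
$wanted = $step('x');

// Grouping if pipe bound lower than assignment: ($result = 'x') |> $step
// The assignment happens first; the piped value is computed and then dropped.
$discarded = $step($assigned = 'x');

var_dump($wanted, $assigned);
```

The assigned variable keeps the raw `'x'`, and the uppercased value only exists in the expression result that nothing stores.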

So far, the best suggestion that's been put forward (though we've not tried implementing it yet) is to disallow a pipe inside a short-closure body, unless the body is surrounded by (). So this:

fn($x) => $x
  |> fn($x) => array_map(strtoupper(...), $x)
  |> fn($x) => array_filter($x, fn($v) => $v != 'O');

Today, that would run somewhat by accident, as the outer-most closure would claim everything after it. With the new approach, it would be interpreted as passing `fn($x) => $x` as the argument to the first pipe segment, which would then be mapped over, which would fail. You'd instead need to do this:

fn($x) => ($x
  |> (fn($x) => array_map(strtoupper(...), $x))
  |> (fn($x) => array_filter($x, fn($v) => $v != 'O'))
);

Which is not wonderful, but it's not too bad, either. That's probably the only case where pipes inside a short-closure body would be useful anyway. And if PFA (https://wiki.php.net/rfc/partial_function_application_v2) and similar closure improvements pass, it will greatly reduce the need for mixing short closures and pipes together in either precedence, so it won't come up very often.

There are a few other operators that bind lower than pipe (see https://github.com/php/php-src/blob/fd8dfe1bfda62a3bd9dd1ff7c0577da75db02fcf/Zend/zend_language_parser.y#L56-L73), which would therefore need wrapping parentheses. For many of them we do want them to be lower than pipe, so just moving pipe's priority down isn't viable. However, most of those are unlikely to be used inside a pipe segment, so are less likely to come up. The most likely would be a bail-out exception:

$value = null;
$value
    |> fn ($x) => $x ?? throw new Exception('Value may not be null')
    |> fn ($x) => var_dump($x);

Which would currently be interpreted something like:

$c = function ($x) {
  $c = function ($x) {
    return var_dump($x);
  };
  return $x ?? throw $c(new Exception('Value may not be null'));
};
$c(null);

This would not throw the exception as expected, unless parentheses are added. It would var_dump() an exception and then try to throw the return of var_dump(), which would fatal.
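
A sketch of both outcomes using explicit calls instead of `|>` (it needs PHP 8.0+ for `throw` as an expression). `$first`, `$last`, `$intended`, `$current`, and `$outcomes` are illustrative names, and `$last` is a stand-in for the `var_dump()` step that returns null without printing.

```php
<?php
// Stand-in for the var_dump() step: returns null, as var_dump() does.
$last = fn($x) => null;

// Intended grouping: the first step throws before the last step runs.
$first    = fn($x) => $x ?? throw new Exception('Value may not be null');
$intended = fn($x) => $last($first($x));

// Current grouping: the last step is applied to the exception object,
// and its return value becomes the operand of throw.
$current = fn($x) => $x ?? throw $last(new Exception('Value may not be null'));

$outcomes = [];
foreach (['intended' => $intended, 'current' => $current] as $name => $chain) {
    try {
        $chain(null);
        $outcomes[$name] = 'no error';
    } catch (Throwable $e) {
        // Record only the class of whatever was thrown.
        $outcomes[$name] = get_class($e);
    }
}

var_dump($outcomes);
```

The intended grouping raises the `Exception` itself; the current grouping ends in an `Error`, because the operand of `throw` is null rather than an object.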

RM approval to address this during the 8.5 beta phase has been given, but we still want to have some discussion to make sure we have a good solution.

So, the question:

1. Does this seem like a good solution, or is there a problem we've not spotted yet?
2. Does anyone have a better solution to suggest?

Thanks to Derick, Ilija, and Tim for tracking down this annoying edge case.

-- 
  Larry Garfield
  larry@garfieldtech.com


Hi Larry,

What would happen to this:
$x = fn($y) => $y |> strtoupper(...) |> var_dump(...);

with this change? I would expect $x to be a chain, which is what it currently is. But if I understand correctly, it will become:

$x = (fn($y) => $y) |> strtoupper(...) |> var_dump(...);

which would be interesting... but this, which you shared as the current "correct" way:

fn($x) => ($x
  |> (fn($x) => array_map(strtoupper(...), $x))
  |> (fn($x) => array_filter($x, fn($v) => $v != 'O'))
);

This looks correct to me… but I've been using pipes since they were merged, so I might be biased. I would have written the above like this though:

fn($x) => $x
  |> __(array_map(...))(strtoupper(...), __)
  |> __(array_filter(...))(__, fn($v) => $v != 'O');

or if I was feeling wordy:

$array_map = __(array_map(...));
$array_filter = __(array_filter(...));
fn($x) => $x |> $array_map(strtoupper(...), __) |> $array_filter(__, fn($v) => $v != 'O');

This suggested change would completely break that if I'm understanding correctly. Then PFA would re-allow this syntax? I dunno ... changing it makes it sound inconsistent to me.

FWIW, I like it how it is for writing reusable pipe chains.

PS. The code above is implemented like so:

function __(...$steps): Closure {
    // The first argument is the callable to wrap.
    $first = array_shift($steps) ?? throw new LogicException();

    // Collect the argument list; the constant __ marks the piped slot.
    return function(...$steps) use ($first) {
        return function($x) use ($first, $steps) {
            // Substitute the piped value for every placeholder.
            foreach($steps as &$val) {
                if ($val === __) {
                    $val = $x;
                }
            }
            return $first(...$steps);
        };
    };
}
// The placeholder constant holds a closure of the function itself.
define('__', __(...));

ish.
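
Going back to the question at the top of this reply, here is a sketch of the two readings with the pipes written as plain calls (it needs PHP 7.4+ for arrow functions). It assumes the proposal is read as described above; `$chain`, `$result`, and `$error` are illustrative names, and the final `var_dump(...)` step is left out to keep the output quiet.

```php
<?php
// Current reading: the whole pipe chain is the closure body,
// so $chain is a reusable function.
$chain  = fn($y) => strtoupper($y);
$result = $chain('pipes');

// Reading under the proposed rule: (fn($y) => $y) |> strtoupper(...)
// The closure itself is piped into strtoupper().
$error = null;
try {
    strtoupper(fn($y) => $y);
} catch (TypeError $e) {
    // strtoupper() rejects a Closure argument.
    $error = get_class($e);
}

var_dump($result, $error);
```

Under the first reading `$result` is `'PIPES'`; under the second the chain runs immediately and fails with a `TypeError`, since a closure is not a string.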

-- Rob