Newsgroups: php.internals
Xref: news.php.net php.internals:126343
Date: Fri, 07 Feb 2025 23:54:47 +0100
From: rob@bottled.codes ("Rob Landers")
To: internals@lists.php.net
Subject: Re: [PHP-DEV] [RFC] Pipe Operator (again)
Message-ID: <040da4e2-2595-42ad-ab94-a0e87aed1a79@app.fastmail.com>
References: <5a584219f120385e7e30f6d0a46cc108@bastelstu.be>
List-Id: internals.lists.php.net
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=d236eb7ceedd475db39514d2b95b7db2

--d236eb7ceedd475db39514d2b95b7db2
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

On Fri, Feb 7, 2025, at 22:04, Larry Garfield wrote:
> Merging a few replies together here, since they overlap.  Also reordering a few of Tim's comments...
>
> On Fri, Feb 7, 2025, at 7:32 AM, Tim Düsterhus wrote:
> > Hi
> >
> > On 2025-02-07 05:57, Larry Garfield wrote:
> >> It is now back with a better implementation (many thanks to Ilija for
> >> his help and guidance in that), and it's nowhere close to freeze, so
> >> here we go again:
> >>
> >> https://wiki.php.net/rfc/pipe-operator-v3
> >
> > There are some editorial issues:
> >
> > 1. Status: Draft needs to be updated.
> > 2. The RFC needs to be added to the overview page.
> > 3. List formatting issues in “Future Scope” and “Patches and Tests”.
> >
> > It would also help to have a closed voting widget in the “Proposed Voting
> > Choices” section to be crystal clear on what is being voted on (see
> > below the next quote).
>
> I split pipes off from the Composition RFC late last night right before posting; I guess I missed a few things while doing so. :-/  Most notably, the Compose section is now removed from pipes, as it is not in scope for this RFC.  (As noted, it's going to be more work so has its own RFC.)  Sorry for the confusion.  I think it should all be handled now.
>
> > 5. The “References” (as in reference variables) section would do well
> > with an example of what doesn't work.
>
> Example block added.
>
> > 9. In the “Why in the engine?” section: The RFC makes a claim about
> > performance.
> >
> > Do you have any numbers?
>
> Not currently.  The statements here are based on simply counting the number of function calls necessary, and PHP function calls are sadly non-cheap.  In previous benchmarks of my own libraries using my Crell/fp library, I did find that the number of function calls involved in some tight pipe operations was both a performance and debugging concern, but I don't have any hard numbers lying about at present to share.
>
> If you think that's critical, please advise on how to best get meaningful numbers here.
>
> Regarding the equivalency of pipes:
>
> Tim Düsterhus wrote:
> > 4. “That is, the following two code fragments are also exactly
> > equivalent:”.
> >
> > I do not believe this is true (specifically referring to the “exactly”
> > word in there), since the second code fragment does not have the short
> > closures, which likely results in an observable behavioral difference
> > when throwing Exceptions (in the stack trace) and also for debuggers. Or
> > is the implementation able to elide the extra closure? (Of course
> > there's also the difference between the temporary variable existing,
> > which would be observable for `get_defined_vars()` and possibly
> > destructors / object lifetimes).
>
> Thomas Hruska wrote:
> > The repeated assignment to $temp in your second example is _not_
> > actually equal to the earlier example as you claim.  The second example
> > with all of the $temp variables should, IMO, just be:
> >
> > $temp = "Hello World";
> > $result = array_filter(array_map('strtoupper',
> > str_split(htmlentities($temp))), fn($v) => $v != 'O');
>
> Juris Evertovskis wrote:
> > 3. Does the implementation actually turn `1 |> f(...) |> g(...)` into
> > `$π = f(1); g($π)`? Is `g(f(1))` not more performant? Or is the engine
> > clever enough with the var reuse anyways?
>
> There's some subtlety here on these points.  The v2 RFC used the lexer to mutate $a |> $b |> $c into the same AST as $c($b($a)), which would then compile as though that had been written in the first place.  However, that made addressing references much harder, and there's an important caveat around order of operations. (See below.)  The v3 RFC instead uses a compile function to take the AST of $a |> $b |> $c and produce opcodes that are effectively equivalent to $t = $b($a); $t = $c($t);  I have not compared to see if they are the precise same opcodes, but the net effect is the same.  So "effectively equivalent" may be a more accurate statement.
>
> In particular, Tim is correct that, technically, the short lambdas would be used as-is, so you'd end up with the equivalent of:
>
> $temp = (fn($x) => array_map(strtoupper(...), $x))($temp);
>
> I'm not sure if there's a good way to automatically unwrap the closure there.  (If someone knows of one, please share; I'm fine with including it.)  However, the intent is that it would be largely unnecessary in the future with a revised PFA implementation, which would obviate the need for the explicit wrapping closure.  You would instead write
>
> $a |> array_map(strtoupper(...), ?);
>
> Alternatively, one can use higher order user-space functions already.  In trivial cases:
>
> function amap(Closure $fn): Closure {
>   return fn(array $x) => array_map($fn, $x);
> }
>
> $a |> amap(strtoupper(...));
>
> Which I am already using in Crell/fp and several libraries that leverage it, and it's quite ergonomic.
>
> There's a whole bunch of such simple higher order functions here:
> https://github.com/Crell/fp/blob/master/src/array.php
> https://github.com/Crell/fp/blob/master/src/string.php
>
> Which leads to the subtle difference between that and the v2 implementation, and why Thomas' statement is incorrect.  If the expression on the right side that produces a Closure has side effects (output, DB interaction, etc.), then the order in which those side effects happen may change with the different restructuring.  With all pure functions, that won't make a practical difference, and normally one should be using pure functions, but that's not something PHP can enforce.
>
> I don't think there would be an appreciable performance difference between the two compiled versions, either way, but using the temp-var approach makes dealing with references easier, so it's what we're doing.
>
> Juris Evertovskis wrote:
> > 1. Do you think it would be hard to add some shorthand for `|>
> > $condition ? $callable : fn($😐) => $😐`?
>
> I'm not sure I follow here.  Assuming you're talking about "branch in the next step", the standard way of doing that is with a higher order user-space function.  Something like:
>
> function cond(bool $cond, Closure $t, Closure $f): Closure {
>   return $cond ? $t : $f;
> }
>
> $a |> cond($config > 10, bigval(...), smallval(...)) |> otherstuff(...);
>
> I think it's premature to try and bake that logic into the language, especially when I don't know of any other function-composition-having language that does so at the language level rather than the standard library level.  (There are a number of fun operations people build into pipelines, but they are all generally done in user space.)
>
> --Larry Garfield
>

Put another way, what is the order of operations (precedence and associativity) for this new operator?

For example, what is the output of

$x ? $y |> strlen(...) : $z

$x + $y |> sqrt(...) . EOL

Etc.

I noticed this seems to be missing from the RFC. Since this is a new operator, I think it is important to specify that.

- Rob
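
To make the precedence question concrete, here is a minimal sketch of two candidate readings of `$x + $y |> sqrt(...)`, written as the plain function calls each reading would amount to. It does not use `|>` itself and does not assume which reading the RFC intends; the values 9 and 16 and the variable names `$looser` and `$tighter` are hypothetical, picked only so that the two readings give different results.

```php
<?php
// Hypothetical inputs, chosen so the two groupings disagree.
$x = 9;
$y = 16;

// Reading A: `|>` binds looser than `+`,
// i.e. ($x + $y) |> sqrt(...), which amounts to sqrt($x + $y).
$looser = sqrt($x + $y);   // sqrt(25)

// Reading B: `|>` binds tighter than `+`,
// i.e. $x + ($y |> sqrt(...)), which amounts to $x + sqrt($y).
$tighter = $x + sqrt($y);  // 9 + sqrt(16)

// The two readings produce different values, so the RFC has to pick one.
var_dump($looser, $tighter);
```

The trailing `. EOL` in the original example adds one more grouping to settle, since it depends on how `|>` ranks relative to string concatenation as well.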

--d236eb7ceedd475db39514d2b95b7db2--