Date: Thu, 03 Apr 2025 02:22:42 -0500
To: "php internals"
Message-ID: <5efa2f02-dd1d-4d59-ae07-c75f193b4096@app.fastmail.com>
Subject: Re: [PHP-DEV] [RFC] Pipe Operator (again)
From: larry@garfieldtech.com ("Larry Garfield")

On Thu, Mar 27, 2025, at 9:30 AM, Ilija Tovilo wrote:
> Hi Larry
>
> Sorry for the late response.
>
> On Fri, Feb 7, 2025 at 5:58 AM Larry Garfield wrote:
>>
>> https://wiki.php.net/rfc/pipe-operator-v3
>
> We have already discussed this topic extensively off-list, so let me
> bring the list up-to-date.
>
> The current pipes proposal is elegantly simple. This has many upsides,
> but it comes with an obvious limitation:
> It only works well when the called function takes only a single argument.
>
> $sourceCode |> lexer(...) |> parser(...) |> compiler(...) |> vm(...)
>
> Such code is nice, but is also quite niche. I have argued off-list
> that the predominant use-case for pipes are arrays and iterators
> (including strings immediately split into chunks), and it seems most
> agree. However, most array/iterator functions (e.g. filter, map,
> reduce, first, all, etc.) don't fall into the one-parameter category.
>
> A slightly simplified example from the RFC:
>
> $result = "Hello World"
>     |> str_split(...)
>     |> fn($x) => array_map(strtoupper(...), $x)
>     |> fn($x) => array_filter($x, fn($v) => $v != 'O');
>
> IMO, this is harder to understand than the alternative of using
> multiple statements with a temporary variable.
>
> $tmp = "Hello World";
> $tmp = str_split($tmp);
> $tmp = array_map(strtoupper(...), $tmp);
> $result = array_filter($tmp, fn($v) => $v != 'O');
>
> The RFC has a solution for this: Partial function application [1].
>
> $result = "Hello World"
>     |> str_split(...)
>     |> array_map(strtoupper(...), ?)
>     |> array_filter(?, fn($v) => $v != 'O');
>
> This still causes more cognitive overhead than it should, at least to me.
>
> * The placement of ? is hard to detect, especially when it's not the
> first argument.
> * The user now has to think about immediately-invoked closures that
> exist solely for argument-reordering. The closure can be elided
> through the optimizer, but we cannot elide the additional cognitive
> overhead in the user.
> * The implementation of ? is significantly more complex than that of
> pipes, making the supposed simplicity of pipes somewhat misleading.
>
> If my assumption is correct that the primary use-case for pipes are
> arrays, it might be worth investigating the possibility of introducing
> a new iterator API, which has been proposed before [2], optimized for
> pipes. Specifically, this API would ensure consistent placement of the
> subject, i.e. the iterable in this case, as the first argument. Pipes
> would no longer have the form of expr |> expr, where the
> right-hand-side is expected to return a callable. Instead, it would
> have the form of expr |> function_call, where the left-hand-side is
> implicitly inserted as the first parameter of the call.
>
> namespace Iter {
>     function map(iterable $iterable, \Closure $callback): \Iterator;
>     function filter(iterable $iterable, \Closure $callback): \Iterator;
> }
>
> namespace {
>     use function Iter\{map, filter};
>
>     $result = "Hello World"
>         |> str_split()
>         |> map(strtoupper(...))
>         |> filter(fn($v) => $v != 'O');
> }
>
> This is the same approach taken by Elixir [3]. It has a few benefits:
>
> * We don't need to think about closures that are immediately invoked,
> because there are none. The code is exactly the same as if you had
> written it through nested function calls. This simplifies things
> significantly for both the engine and the user.
> * It closely resembles code that would be written in an
> object-oriented manner, making it more familiar.
> * It is the shortest and most readable of all the proposed options.
>
> As with everything, there are downsides.
>
> * It only works well for subject-first APIs. There are not an
> insignificant number of existing functions that do not follow this
> convention (e.g. explode(), preg_match(), etc.). That said, explode('
> ', $s) |> filter($c1) |> map($c2) still composes well, given explode()
> is usually first in the chain, while preg_match() is rarely
> chained at all.
> * People have voiced concerns for potential confusion regarding the
> right-hand-side. It may not be any arbitrary expression, but is
> restricted to a function call. Hence, `$param |> $myClosure` is not
> valid code, requiring additional braces: `$param |> $myClosure()`.
> This approach resembles the -> operator, where at least conceptually,
> the left-hand-side is implicitly passed as a $this parameter. However,
> the spaces between |> do not signal this fact as well, making it look
> like the right-hand-side is evaluated separately. Potentially, a
> different symbol might work better.
>
> Internal reactions to this idea were mixed, so I'm interested to hear
> what the community thinks about it.
>
> Ilija
>
> [1] https://wiki.php.net/rfc/partial_function_application
> [2] https://externals.io/message/118896
> [3] https://elixirschool.com/en/lessons/basics/pipe_operator

To clarify my stance on the above: I am open to this, and I agree with
Ilija that in the typical case it would be more convenient. The argument
that it would be confusing to have a "hidden" first param is valid, but
as with any new feature I think it's obvious once you know it, so that's
a small issue. I didn't propose it originally as I suspected folks would
balk at the added complexity, but I do like the concept.

Part of Ilija's proposal does include offering $val |> ($expr) (or
similar) to allow arbitrary expressions on the right, which would need
to return a unary function. Basically the () would make it the same as
what the RFC is doing now.

However, it also received significant pushback off-list from folks who
felt it was too much magic. I don't want to torpedo pipes by
over-reaching. But without feedback from other voters, I don't know if
this is over-reaching. Is it? Please, someone tell me which approach
you'd be more willing to vote for. :-)

One concern with this approach is that it gets even closer to "real"
extension functions. But real extension functions (which let you write
code that looks like you're adding arbitrary methods to arbitrary
objects, even though under the hood it's just a plain function that
takes an object as a parameter) also run into a lot of additional
complexity. Chief among the problems: they don't handle name collisions,
so you can have only one "map" function rather than one per class. You
could add an alternate syntax for extension functions to specify the
type they work on (which is what Kotlin does), but then you run into
questions around inheritance and polymorphism that are hard to resolve
in a runtime-centric environment. I haven't fully thought through all of
these details.
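
To make the two options concrete, here is a minimal sketch in plain PHP
8.1 that emulates both semantics with ordinary functions, since neither
syntax can be assumed to exist. The helpers pipe_unary(), pipe_first(),
map_first(), and filter_first() are hypothetical names invented for this
illustration; they are not part of the RFC, of Ilija's proposal, or of
PHP itself, and the subject-first helpers work on arrays rather than
iterators to keep the sketch short.

```php
<?php
declare(strict_types=1);

// Hypothetical: models the current RFC, where every right-hand side
// must evaluate to a callable taking exactly one argument.
function pipe_unary(mixed $value, callable ...$steps): mixed
{
    foreach ($steps as $step) {
        $value = $step($value);
    }
    return $value;
}

// Hypothetical: models the first-argument variant. Each step is a
// [callable, extraArgs] pair and the piped value is inserted first.
function pipe_first(mixed $value, array ...$steps): mixed
{
    foreach ($steps as [$fn, $args]) {
        $value = $fn($value, ...$args);
    }
    return $value;
}

// Hypothetical subject-first stand-ins for Iter\map and Iter\filter.
function map_first(array $items, callable $callback): array
{
    return array_map($callback, $items);
}

function filter_first(array $items, callable $callback): array
{
    return array_filter($items, $callback);
}

// RFC semantics: closures exist only to reorder arguments.
$viaUnary = pipe_unary(
    "Hello World",
    str_split(...),
    fn($x) => array_map(strtoupper(...), $x),
    fn($x) => array_filter($x, fn($v) => $v != 'O'),
);

// First-argument semantics: no wrapper closures are needed.
$viaFirstArg = pipe_first(
    "Hello World",
    [str_split(...), []],
    [map_first(...), [strtoupper(...)]],
    [filter_first(...), [fn($v) => $v != 'O']],
);

echo implode('', $viaFirstArg), "\n"; // prints "HELL WRLD"
```

Both calls produce the same array. The difference is only in what the
caller has to write: the unary form needs a wrapper closure for every
multi-argument function, while the first-argument form depends on each
function taking its subject first.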

It's also been proposed to use +> as an operator for extension functions
and/or first-param pipes like Elixir. I'm not sure how I feel about
that; my main concern is which one it would apply to, since as noted
above full extension functions introduce a lot of extra considerations.

But I really don't want to hold up pipes on speculation on multiple
future maybe-features. As the RFC notes, there are a number of
follow-ups that I want to try and get at least some of into the same
release.

So, consider this me begging for voters to actually speak up on this
issue and give feedback on a way forward, because right now I have no
idea what to do with it.

--Larry Garfield