Newsgroups: php.internals
References: <6d3be21f-d63a-3fc6-94ee-0bde8e313d66@xs4all.nl>
Date: Wed, 13 Mar 2019 23:21:19 -0400
To: internals@lists.php.net
Content-Type: text/plain
Subject: Re: [PHP-DEV] RFC Draft: Comprehensions
From: larry@garfieldtech.com ("Larry Garfield")

On Wed, Mar 13, 2019, at 6:30 PM, Rowan Collins wrote:
> On 13/03/2019 21:10, Dik Takken wrote:
> > So in practice, I expect that
> > using comprehensions as proposed in the new RFC will also require doing
> > a lot of iterator_to_array(). A dual comprehension syntax could fix that.
>
> At risk of complicating things further, might the solution to that be to
> have a shorter syntax for iterator_to_array in general?
>
> It's a shame array-casts are defined for arbitrary objects, else we
> could have (array)$iterator - and therefore (array)[foreach ($users as
> $user) yield $user->firstName]

I am again going to reply to a bunch of people at once here...

If I can summarize the responses so far, they seem to fall into one of two categories:

1) Love the idea, but wouldn't short-closures be close enough?

2) Love the idea, but hate the particular syntax proposed.

On the plus side, it seems almost everyone is on board in concept, so yay. That of course just leaves the syntax bikeshedding, which is always the fun part.

As an aside, someone up-thread said that comprehensions were "an easier way to write foreach loops", which is only true by accident. Comprehensions are more formally a way of defining one set in relation to another set. That is, they are a declarative relationship between one set and another. While in PHP that ends up effectively being a short-hand for foreach loops, that's more an accidental implementation detail. The syntax used by many other languages to achieve the same thing doesn't look at all like loop syntax.

To the question of having both a generator and array version, I would have to say no.
As noted in the RFC, most cases where you'd want to use a comprehension are not places where you'd be feeding the result into an array function. On the off chance that you are, converting the iterable into an array is trivial enough that supporting, documenting, and learning two slightly different syntaxes seems a net negative.

To Rowan's point, I would be fully in favor of an easier syntax alternative to iterator_to_array(). I think that's rather similar (although not identical) to the "run out an iterator" add-on mentioned in the RFC. I would support that, but I think it's a bit orthogonal and should not be a blocker for short-closures or for comprehensions.

As for the specific syntax, I see a couple of options.

1) Assuming that short-lambdas get adopted and they can transparently support generators, the following syntax becomes automatically possible:

    $gen = (fn() => foreach($arr as $k => $v) if ($k % 2) yield $v;)();

While that does work, there's an awful lot of symbol salad there: "(fn() =>" and ";)();" are both just gross and hard to type. I would consider that not a full solution for comprehensions because of how clumsy it is.

2) We could include an even-shorter-lambda syntax, potentially, or perhaps a short-lambda-based comprehension syntax. For example (and this may not be parser friendly but it's just to demonstrate the idea):

    $gen = fn{ foreach($arr as $k => $v) if ($k % 2) yield $v };

That would be a short-hand for a short-closure that has no parameters, and we could detect the yield and self-execute. The language inside the function body would still be a bit verbose, but it would technically be any legal single statement, which would offer some potentially interesting (scary?) options. I would consider this an acceptable solution for comprehensions-ish in PHP.

3) The specific syntax proposed in the RFC is Python-inspired and PHP-ified, but there's no reason we need to stick to that.
There are a myriad of other syntaxes for comprehensions in other languages that we could steal if they fit better, some of which wouldn't at all resemble foreach loops and thus avoid the for/foreach confusion. Wikipedia of course has a large index of them we can mine:

https://en.wikipedia.org/wiki/Comparison_of_programming_languages_(list_comprehension)

It appears that the most common syntax involves [] of some variety, which poses parsing problems for PHP, but a few other options jump out at me as possible syntaxes to pilfer:

C# has this SQL-esque syntax (which may involve too many additional language keywords):

    var ns = from x in Enumerable.Range(0,100) where x*x > 3 select x*2;

Elixir, Erlang, and Haskell use the <- symbol, which... I don't think we use anywhere else currently? In Elixir:

    for x <- 0..100, x * x > 3, do: x * 2

Java 8, Ruby, Rust, and Swift are very very similar, and use a fluent syntax. The Rust example:

    (0..100).filter(|x| x * x > 3).map(|x| 2 * x).collect();

While that could not be taken as-is, of course, it does propose an interesting alternative approach, if we limit comprehensions to Traversable objects rather than any iterable (that is, exclude arrays):

    $t->filter(fn($v) => expression)->filter(fn($k, $v) => expression)->map(fn($v) => expression);

Each of those calls would, in turn, produce a generator that either reduces the set or finally yields. I am not sure I fully like this one, to be honest, as the multiple inline short closures make it rather verbose and harder to follow with the proposed short-closure syntax (and it would involve more function calls internally), but it's an option. (collect() in these languages seems like it's the equivalent of iterator_to_array(); maybe that's another alternative there as well?)

Nemerle, which I've never heard of before, has this:

    $[x*2 | x in [0 .. 100], x*x > 3]

While $ is obviously already used, that does suggest using one of the other not-yet-used sigils that Nikita identified, which would let us reorder the parameters to put the expression first if we wanted. For example:

    ^[$v * 2 | $k => $v in $arr if $k % 2]

In general, I see two alternatives:

1) Pass short closures and then include a special case of that special case that effectively gives us comprehensions over foreach, if, and yield, but with fewer seemingly-stray characters.

2) Steal a completely different syntax from some other language that is still terse but less confusing. The main alternatives to "square brackets around a for loop" syntax seem to be:

A) Chained filter() and map() methods
B) SQL-like keywords
C) Use <- somehow.
D) Use a different starting character before the [] so that the parser knows some new funky order of stuff is coming.

I am open to both options, of course contingent on someone willing and able to code it.

--Larry Garfield
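
For reference, the "odd keys only" filter used in the examples above can already be written in current PHP as a self-executing closure that contains a yield. This is only a minimal sketch of the long-hand form the proposed syntaxes would shorten, not the RFC's implementation; $arr and its contents are made-up sample data.

```php
<?php
// Hypothetical sample data; the filter below looks only at the keys.
$arr = ['a', 'b', 'c', 'd', 'e'];

// A closure whose body contains yield is a generator function,
// so calling it immediately returns a Generator.
$gen = (function () use ($arr) {
    foreach ($arr as $k => $v) {
        if ($k % 2) {   // keep entries with odd keys
            yield $v;
        }
    }
})();

// Convert to an array only where an array is really needed.
// The second argument (false) discards the generator's keys.
$odd = iterator_to_array($gen, false);

var_dump($odd === ['b', 'd']); // bool(true)
```

The explicit use ($arr) and the wrapping (function () { ... })() are the boilerplate that options 1 and 2 above aim to remove.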