From: larry@garfieldtech.com ("Larry Garfield")
To: internals@lists.php.net
Date: Thu, 04 Apr 2019 21:54:51 -0400
Subject: Re: [PHP-DEV] RFC Draft: Comprehensions

On Wed, Mar 13, 2019, at 10:22 PM, Larry Garfield wrote:
> On Wed, Mar 13, 2019, at 6:30 PM, Rowan Collins wrote:
> > On 13/03/2019 21:10, Dik Takken wrote:
> If I can summarize the responses so far, they seem to fall into one of
> two categories:
>
> 1) Love the idea, but wouldn't short-closures be close enough?
>
> 2) Love the idea, but hate the particular syntax proposed.
>
> On the plus side, it seems almost everyone is on board in concept, so
> yay. That of course just leaves the syntax bikeshedding, which is
> always the fun part.

Bumping this thread again. Thinking on it further, I see two possible syntactic approaches, given that short lambdas as currently written would not give us a viable comprehension syntax.

1) [foreach ($list as $x => $y) if (condition) yield expression]

That is, essentially the same syntax you would write inside a generator function, just in a more compact form. The above would be effectively identical to:

$gen = (function () {
  foreach ($list as $x => $y)
    if (condition)
      yield expression;
})();

(But with auto-capture.)

I am personally not at all a fan of the extra verbosity (foreach, parens, etc.) but it seems most respondents in the thread want it for familiarity.

Advantages:

* Very compact.
* Works for both arrays and traversables.
* Would play very nicely with the proposed spread operator for iterables (https://wiki.php.net/rfc/spread_operator_for_array).

Disadvantages:

* New syntax.
* If you need to do multiple filter or map operations it gets potentially ugly and unwieldy.
* Not super extensible.
* Doesn't have a natural way to enforce the types produced. (Although one could add it easily.)

This approach has the advantage of being compact and working for both arrays and traversables, but is new syntax.

2) Allow comprehensions to work only on traversable objects, which lets us chain methods. Specifically:

$new = $anyTraversable->filter(fn($x) => $x < 0);

Would return a new traversable that filters $anyTraversable, using a callable. It would effectively be identical to:

$new = new CallbackFilterIterator($anyTraversable, fn($x) => $x < 0);

Similarly:

$new = $anyTraversable->map(fn($x) => $x * 2);

Would produce a new traversable that lazily applies a function to the items as they're returned. Equivalent to:

$new = (function () {
  foreach ($anyTraversable as $x)
    yield $x * 2;
})();

Both would also need to support keys as well as values. Probably, if the callable takes two parameters it receives $key, $value; if it takes just one parameter it receives only $value.

This approach has a few advantages:

* It piggy-backs on existing traversable behavior; essentially, rather than short syntax for generators it's short syntax for wrapping a bunch of iterator objects around each other.
* More elaborate cases (multiple filters, multiple maps) become somewhat nicer; you can easily call filter() or map() multiple times and it's still entirely obvious what's going on.
* Has a natural (if verbose) way to enforce types: filter(fn($x) => $x instanceof Foo || throw new \TypeError);
* Actually, since short lambdas would already support return type declarations, there's another alternative: filter(fn($x) : Foo => $x); (Although you'd probably just fit that into a filter function you're using for something else.)
* next() is already a useful method that works for the any() case discussed in the RFC, and it flows very naturally. I don't see a nice equivalent of all(), however.

But also some disadvantages:

* It only works for traversable objects, not arrays. (Workaround: new ArrayObject($arr).)
* It is more verbose than the other syntax option.
* Adding special-meaning methods to Traversable objects is weird, and I don't think we've done that anywhere before. I have no idea if there are engine implications.
* The short lambda RFC becomes effectively a prerequisite, as it's way too verbose to do with an anon function as we have now.
* My gut feeling is it would be slower, as it would likely mean more function calls internally, but I've zero data to back that up.

And before someone else mentions it, it also opens up some interesting possible extensions that are not all that relevant to the current target, but would fit naturally:

* A ->limit(0, 3) method that is functionally equivalent to \LimitIterator.
* Potentially RegexIterator could also become a regex() method, as a special case of filter()?
* Languages like Rust have a method to "run out" the comprehension (.collect() in the case of Rust). We could easily do the same to produce a resultant array, similar to the spread operator. (That said, that should in no way detract from the spread operator proposal, which I also like on its own merits.)
* Possibly other stuff that slowly turns iterables into "collection objects" (sort of).

Discussion:

For me, the inability to work with arrays is the big problem with the second approach. I very often type declare my returns and parameters as `iterable`, which means I may have an array and not know it. Using approach 2 means I suddenly really need to care which kind of iterable it is, which defeats the purpose of `iterable`. Calling methods on arrays, though, I'm pretty sure is out of scope.

Frankly, were it not for that limitation I'd say I favor the chained method style: while it is more verbose, it is also more self-documenting. Given that limitation, I'm torn but would probably lean toward option 1.

And of course there's the "methods that apply to all traversable objects" thing, which is its own can of worms I know nothing about.
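
To make the array limitation concrete, here is a rough userland sketch of the trade-off, assuming PHP 7.4 or later for arrow functions. The Lazy class and its from(), filter(), map() and collect() methods are made up for illustration; they are not part of SPL or of any RFC, and they use plain generators rather than CallbackFilterIterator. The negativesDoubledByHand() function is only what option 1 might abbreviate, written out by hand.

```php
<?php
// Hypothetical wrapper emulating the chained style of option 2 in userland.
final class Lazy implements IteratorAggregate
{
    /** @var Traversable */
    private $inner;

    private function __construct(Traversable $inner)
    {
        $this->inner = $inner;
    }

    // The explicit normalization step: arrays get wrapped in ArrayIterator,
    // Traversables pass through unchanged.
    public static function from(iterable $source): self
    {
        return new self(is_array($source) ? new ArrayIterator($source) : $source);
    }

    // Lazily keep only the items for which $keep returns a truthy value.
    public function filter(callable $keep): self
    {
        return new self((function (Traversable $items) use ($keep) {
            foreach ($items as $key => $value) {
                if ($keep($value)) {
                    yield $key => $value;
                }
            }
        })($this->inner));
    }

    // Lazily apply $fn to every item, preserving keys.
    public function map(callable $fn): self
    {
        return new self((function (Traversable $items) use ($fn) {
            foreach ($items as $key => $value) {
                yield $key => $fn($value);
            }
        })($this->inner));
    }

    public function getIterator(): Traversable
    {
        return $this->inner;
    }

    // "Run out" the pipeline into a re-indexed array.
    public function collect(): array
    {
        return iterator_to_array($this->inner, false);
    }
}

// Chained style: works for any iterable only because of Lazy::from().
function negativesDoubled(iterable $numbers): array
{
    return Lazy::from($numbers)
        ->filter(fn($x) => $x < 0)
        ->map(fn($x) => $x * 2)
        ->collect();
}

// Option 1 written out by hand: foreach accepts arrays and Traversables alike.
function negativesDoubledByHand(iterable $numbers): Generator
{
    foreach ($numbers as $x) {
        if ($x < 0) {
            yield $x * 2;
        }
    }
}

$fromArray     = negativesDoubled([3, -1, 4, -5]);                                   // [-2, -10]
$fromGenerator = negativesDoubled((function () { yield from [3, -1, 4, -5]; })());   // [-2, -10]
$byHand        = iterator_to_array(negativesDoubledByHand([3, -1, 4, -5]), false);   // [-2, -10]
```

The hand-written generator takes an array as-is, while the chained version needs the explicit from() wrapping step before any method can be called. That wrapping step is the extra care about "which kind of iterable" that the second approach forces on the caller.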
(If someone has a suggestion for how to resolve that disadvantage, I'd love to hear it.)

Those seem like the potential options. Any further thoughts? Or volunteers? :-)

--Larry Garfield