Newsgroups: php.internals
From: php-lists@koalephant.com (Stephen Reay)
To: Larry Garfield
Cc: PHP internals
Subject: Re: [PHP-DEV] RFC Draft: Comprehensions
Date: Fri, 5 Apr 2019 10:46:41 +0700
Message-ID: <76D63EA5-7F6D-43E6-ACC2-E55B5816BEF5@koalephant.com>
In-Reply-To: <37bb451f-a1b2-4c69-91e2-f2ab6c4798e7@www.fastmail.com>
References: <6d3be21f-d63a-3fc6-94ee-0bde8e313d66@xs4all.nl> <37bb451f-a1b2-4c69-91e2-f2ab6c4798e7@www.fastmail.com>

> On 5 Apr 2019, at 08:54, Larry Garfield wrote:
>
> On Wed, Mar 13, 2019, at 10:22 PM, Larry Garfield wrote:
>> On Wed, Mar 13, 2019, at 6:30 PM, Rowan Collins wrote:
>>> On 13/03/2019 21:10, Dik Takken wrote:
>
>> If I can summarize the responses so far, they seem to fall into one of
>> two categories:
>>
>> 1) Love the idea, but wouldn't short-closures be close enough?
>>
>> 2) Love the idea, but hate the particular syntax proposed.
>>
>> On the plus side, it seems almost everyone is on board in concept, so
>> yay.  That of course just leaves the syntax bikeshedding, which is
>> always the fun part.
>
> Bumping this thread again.
>
> Thinking on it further, I see two possible syntactic approaches, given that short lambdas as currently written would not give us a viable comprehension syntax.
>
> 1) [foreach ($list as $x => $y) if (condition) yield expression]
>
> That is, essentially the same syntax as the list would be if wrapped in a function, but with a more compact way of writing it.  The above would be effectively identical to:
>
> $gen = function () {
>   foreach ($list as $x => $y)
>     if ($condition)
>       yield expression;
> }();
>
>
> (But with auto-capture.)  I am personally not at all a fan of the extra verbosity (foreach, parens, etc.) but it seems most respondents in the thread want it for familiarity.
>
> Advantages:
>
> * Very compact.
> * Works for both arrays and traversables
> * Would play very nicely with the proposed spread operator for iterables (https://wiki.php.net/rfc/spread_operator_for_array).
>
> Disadvantages:
>
> * New syntax
> * If you need to do multiple filter or map operations it gets potentially ugly and unwieldy.
> * Not super extensible.
> * Doesn't have a natural way to enforce the types produced.  (Although one could add it easily.)
>
> This approach has the advantage of being compact and working for both arrays and traversables, but is new syntax.
>
> 2) Allow comprehensions to work only on traversable objects, which lets us chain methods.  Specifically:
>
> $new = $anyTraversable->filter(fn($x) => $x < 0);
>
> Would return a new traversable that filters $anyTraversable, using a callable.  It would effectively be identical to
>
> $new = new CallbackFilterIterator($anyTraversable, fn($x) => $x < 0);
>
> Similarly:
>
> $new = $anyTraversable->map(fn($x) => $x * 2);
>
> Would produce a new traversable that lazily produces a function over the items as they're returned.
> Equivalent to:
>
> $new = function () {
>   foreach ($list as $x)
>     yield expression;
> }();
>
> And both would also need to support a key/value as well, probably if the callable takes 2 parameters then it's $key, $value, if just one parameter then it's just $value.
>
> This approach has a few advantages:
>
> * It piggy-backs on existing traversable behavior; essentially, rather than short-syntax for generators it's short syntax for wrapping a bunch of iterator objects around each other.
> * More elaborate cases (multiple filters, multiple maps) become somewhat nicer; you can easily call filter() or map() multiple times and it's still entirely obvious what's going on.
> * Has a natural (if verbose) way to enforce types: filter(fn($x) => $x instanceof Foo || throw new \TypeError);
> * Actually, since short-lambdas already would support return type declaration, there's another alternative: filter(fn($x) : Foo => $x);  (Although you'd probably just fit that into a filter function you're using for something else.)
> * next() is already a useful method that works for the an() case discussed in the RFC, and it flows very naturally.  I don't see a nice equivalent of all(), however.
>
> But also some disadvantages:
>
> * It only works for traversable objects, not arrays.  (Workaround: new ArrayObject($arr).)
> * It is more verbose than the other syntax option.
> * Adding special-meaning methods to Traversable objects is weird, and I don't think we've done that anywhere before.  I have no idea if there are engine implications.
> * The short lambda RFC becomes effectively a prerequisite, as it's way too verbose to do with an anon function as we have now.
> * My gut feeling is it would be slower as it would likely mean more function calls internally, but I've zero data to back that up.
>
> And before someone else mentions it, it also poses some interesting possible extensions that are not all that relevant to the current target, but would fit naturally:
>
> * a ->limit(0, 3) method, that is functionally equivalent to \LimitIterator.
> * Potentially RegexIterator() could also become a regex() method, that's a special case of filter()?
> * Languages like Rust have a method to "run out" the comprehension ( .collect() in the case of Rust).  We could easily do the same to produce a resultant array, similar to the spread operator.  (That said, that should in no way detract from the spread operator proposal, which I also like on its own merits.)
> * Possibly other stuff that slowly turns iterables into "collection objects" (sort of).
>
>
> Discussion:
>
> For me, the inability to work with arrays is the big problem with the second approach.  I very very often am type declaring my returns and parameters as `iterable`, which means I may have an array and not know it.  Using approach 2 means I suddenly really really need to care which kind of iterable it is, which defeats the purpose of `iterable`.  Calling methods on arrays, though, I'm pretty sure is out of scope.
>
> Frankly were it not for that limitation I'd say I favor the chained method style, as while it is more verbose it is also more self-documenting.  Given that limitation, I'm torn but would probably lean toward option 1.  And of course there's the "methods that apply to all traversable objects" thing which is its own can of worms I know nothing about.
>
> (If someone has a suggestion for how to resolve that disadvantage, I'd love to hear it.)
>
> Those seem like the potential options.  Any further thoughts?  Or volunteers? :-)
>
> --Larry Garfield
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>

(Sorry, sent from wrong address, sending again!)

Hi Larry,

I’ve mostly ignored this thread until now - I find a lot of the “shorter syntax” proposals (i.e. the short closures RFC) sound a lot like the “I don’t like semicolons/it has to be ‘pretty’” arguments that happen in other language communities.

But with the first example you give here, I can see the logical approach - as you say, it’s a currently-valid foreach statement, wrapped in square brackets. Would it have to be a single line to parse, or could it be wrapped when the condition gets longer? (Yes, I know it could just become a regular generator then; I’m just wondering what happens when someone adds a new line in there, in a language that historically doesn’t care about newlines.)

I like the second concept a lot too, but how would this cope with, for example, a userland class that implements Iterator but *also* defines a `filter(callable $fn): self` method for the exact same purposes we’re discussing? How is that handled?

Cheers

Stephen
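
P.S. To make that second question concrete, here is a minimal sketch in today's PHP (7.4 or later, for `fn` and typed properties). `NumberList` is a made-up class for illustration, not anything from the RFC. It only shows a userland Traversable whose own `filter()` is eager and returns `self`, next to the lazy `CallbackFilterIterator` wrapping that option 2 was described as equivalent to. How an engine-provided `filter()` would interact with the userland one is exactly the open question, so nothing here should be read as proposed behaviour.

```php
<?php
declare(strict_types=1);

// Hypothetical userland collection that already ships its own filter().
final class NumberList implements IteratorAggregate
{
    /** @var int[] */
    private array $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    // Eager and returns self: different semantics from a lazy wrapper.
    public function filter(callable $fn): self
    {
        return new self(array_values(array_filter($this->items, $fn)));
    }

    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->items);
    }
}

$list = new NumberList([3, -1, 4, -5]);

// The userland method: evaluated immediately, result is another NumberList.
$own = $list->filter(fn($x) => $x < 0);

// The existing SPL wrapper that option 2 was described as equivalent to:
// nothing is evaluated until the iterator is consumed.
$lazy = new CallbackFilterIterator($list->getIterator(), fn($x) => $x < 0);

// Consume both, discarding keys, so the selected values can be compared.
$ownValues  = iterator_to_array($own, false);
$lazyValues = iterator_to_array($lazy, false);
```

Both calls select the same values here, but one hands back a `NumberList` and the other a lazy iterator, so a built-in method with the same name would change what existing callers of the userland `filter()` receive.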