Newsgroups: php.internals
Date: Wed, 9 Jun 2021 14:12:21 +0000
To: internals@lists.php.net
Subject: Re: [PHP-DEV] Re: RFC: CachedIterable (rewindable, allows any key&repeating keys)
From: tysonandre775@hotmail.com (tyson andre)

Hi Levi Morrison,

> > > Hi internals,
> > >
> > > > I've created a new RFC https://wiki.php.net/rfc/cachediterable adding CachedIterable,
> > > > which eagerly evaluates any iterable and contains an immutable copy of the keys and values of the iterable it was constructed from
> > > >
> > > > This has the proposed signature:
> > > >
> > > > ```
> > > > final class CachedIterable implements IteratorAggregate, Countable, JsonSerializable
> > > > {
> > > >     public function __construct(iterable $iterator) {}
> > > >     public function getIterator(): InternalIterator {}
> > > >     public function count(): int {}
> > > >     // [[$key1, $value1], [$key2, $value2]]
> > > >     public static function fromPairs(array $pairs): CachedIterable {}
> > > >     // [[$key1, $value1], [$key2, $value2]]
> > > >     public function toPairs(): array {}
> > > >     public function __serialize(): array {}  // [$k1, $v1, $k2, $v2,...]
> > > >     public function __unserialize(array $data): void {}
> > > >
> > > >     // useful for converting iterables back to arrays for further processing
> > > >     public function keys(): array {}  // [$k1, $k2, ...]
> > > >     public function values(): array {}  // [$v1, $v2, ...]
> > > >     // useful to efficiently get offsets at the middle/end of a long iterable
> > > >     public function keyAt(int $offset): mixed {}
> > > >     public function valueAt(int $offset): mixed {}
> > > >
> > > >     // '[["key1","value1"],["key2","value2"]]' instead of '{...}'
> > > >     public function jsonSerialize(): array {}
> > > >     // dynamic properties are forbidden
> > > > }
> > > > ```
> > > >
> > > > Currently, PHP does not provide a built-in way to store the state of an arbitrary iterable for reuse later
> > > > (when the iterable has arbitrary keys, or when keys might be repeated). It would be useful to do so for many use cases, such as:
> > > >
> > > > 1. Creating a rewindable copy of a non-rewindable Traversable
> > > > 2. Generating an IteratorAggregate from a class still implementing Iterator
> > > > 3. In the future, providing internal or userland helpers such as iterable_flip(iterable $input), iterable_take(iterable $input, int $limit),
> > > >     iterable_chunk(iterable $input, int $chunk_size), iterable_reverse(), etc (these are not part of the RFC)
> > > > 4. Providing memory-efficient random access to both keys and values of arbitrary key-value sequences
> > > >
> > > > Having this implemented as an internal class would also allow it to be much more efficient than a userland solution
> > > > (in terms of time to create, time to iterate over the result, and total memory usage).
> > > > See https://wiki.php.net/rfc/cachediterable#benchmarks
> > > >
> > > > After some consideration, this is being created as a standalone RFC, and going in the global namespace:
> > > >
> > > > - Based on early feedback on https://wiki.php.net/rfc/any_all_on_iterable#straw_poll (on the namespace preferred in previous polls)
> > > >   It seems like it's way too early for me to be proposing namespaces in any RFCs for PHP adding to modules that already exist, when there is no consensus.
> > > >
> > > >   An earlier attempt by others on creating a policy for namespaces in general (https://wiki.php.net/rfc/php_namespace_policy#vote) also did not pass.
> > > >
> > > >   Having even 40% of voters opposed to introducing a given namespace (in pre-existing modules)
> > > >   makes it an impractical choice when RFCs require a 2/3 majority to pass.
> > > > - While some may argue that a different namespace might pass,
> > > >   https://wiki.php.net/rfc/any_all_on_iterable_straw_poll_namespace#vote had a sharp dropoff in feedback after the 3rd form.
> > > >   I don't know how to interpret that - e.g. are unranked namespaces preferred even less than the options that were ranked or just not seen as affecting the final result.
> > >
> > > A heads up - I will probably start voting on https://wiki.php.net/rfc/cachediterable this weekend after https://wiki.php.net/rfc/cachediterable_straw_poll is finished.
> > >
> > > Any other feedback on CachedIterable?
> > >
> > > Thanks,
> > > Tyson
> > >
> > > --
> > > PHP Internals - PHP Runtime Development Mailing List
> > > To unsubscribe, visit: https://www.php.net/unsub.php
> > >
> >
> > Based on a recent comment you made on GitHub, it seems like
> > `CachedIterable` eagerly creates the datastore instead of doing so
> > on-demand.
> > Is this correct?
>
> Sorry, yes, that's correct and pointed out in the RFC.
>
> I think that's a significant implementation flaw. I don't see why we'd
> balloon memory usage unnecessarily by being eager -- if an operation
> needs to fetch more data then it can go ahead and do so.

First, PHP's standard library accommodates a wide variety of use cases, of which I believe eager evaluation is the most common.
There is no reason that an eagerly evaluated CachedIterable and a lazily evaluated LazyCachedIterable couldn't both be added at some point
if both had passing RFCs.

(This is referring to https://en.wikipedia.org/wiki/Lazy_evaluation and https://en.wikipedia.org/wiki/Eager_evaluation)

As was stated in that GitHub discussion,

1) If a CachedIterable were to be used in the standard library or a user-defined library,
   many end users would want the standard library to return something that could be iterated over multiple times.
   The limit of a single iteration was a source of bugs in SPL classes
   such as https://www.php.net/arrayobject prior to them being switched to IteratorAggregate.

   (This concerns whether functions such as `*filter` and `*map` should evaluate the result eagerly or lazily if they do get added.
   It is possible for a LazyCachedIterable to be implemented that computes values on demand, but see the points below.)

   ```
   $foo = map(...);
   foreach ($foo as $i => $v1) {
       foreach ($foo as $i => $v2) {
           if (some_pair_predicate($v1, $v2)) {
               // do something
           }
       }
   }
   ```

2) Userland library/application authors that are interested in lazy generators could use or implement something
   such as https://github.com/nikic/iter instead.
   My opinion is that the standard library should provide
   something that is easy to understand, debug, serialize or represent, etc.
   I expect the inner iterable of a LazyCachedIterable may be hidden entirely from var_dump as an implementation detail.

3) It would be harder to understand why SomeFrameworkException is thrown in code unrelated to that framework
   when a lazy (instead of eager) iterable is passed to some function that accepts a generic iterable,
   and harder to write correct exception handling for it if done in a lazy generation style.

   Many RFCs have been rejected due to being perceived as likely to be misused in userland or
   to make code harder to understand.

4) It is possible to implement a lazy alternative to CachedIterable that only loads values as needed.
   However, I hadn't proposed it due to doubts that 2/3 of voters would consider it widely useful
   enough to be included in PHP rather than as a userland or PECL library.

Additionally,

CachedIterables are much more memory efficient than existing options such as arrays
https://wiki.php.net/rfc/cachediterable#cachediterables_are_memory-efficient
(The only thing more efficient in PHP's core modules is SplFixedArray,
and that only allows keys `0..n-1`)

Regards,
Tyson
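
To illustrate point 1 above, here is a minimal userland sketch of the eager approach. It is not the RFC's implementation: `EagerPairs` and `repeatingKeys()` are hypothetical names invented for this example, and the class only assumes the behaviour described in the RFC summary (consume the source once, keep key/value pairs, allow repeated iteration).

```php
<?php
// Hypothetical userland stand-in for an eagerly evaluated iterable.
// This is an assumption-laden sketch, not the proposed internal class.
final class EagerPairs implements IteratorAggregate, Countable
{
    /** @var array<int, array{0: mixed, 1: mixed}> stored [key, value] pairs */
    private array $pairs = [];

    public function __construct(iterable $source)
    {
        // Eager: the source is consumed exactly once, here.
        // Storing pairs keeps repeated keys and non-scalar keys intact.
        foreach ($source as $key => $value) {
            $this->pairs[] = [$key, $value];
        }
    }

    public function getIterator(): Iterator
    {
        // A fresh generator per call, so nested and repeated loops are independent.
        return (function () {
            foreach ($this->pairs as [$key, $value]) {
                yield $key => $value;
            }
        })();
    }

    public function count(): int
    {
        return count($this->pairs);
    }
}

// Hypothetical source with a repeated key, which a plain array could not hold.
function repeatingKeys(): Generator
{
    yield 'a' => 1;
    yield 'a' => 2;
    yield 'b' => 3;
}

// Nested iteration over the eager copy visits every pairing of values.
$cached = new EagerPairs(repeatingKeys());
$combinations = 0;
foreach ($cached as $v1) {
    foreach ($cached as $v2) {
        $combinations++;
    }
}

// A bare generator, by contrast, cannot be traversed a second time.
$gen = repeatingKeys();
foreach ($gen as $v) {
    // first pass consumes the generator
}
try {
    foreach ($gen as $v) {
        // never reached
    }
    $secondPass = 'ok';
} catch (Exception $e) {
    $secondPass = 'failed';
}
```

With three stored pairs the nested loops run nine times, while the second pass over the bare generator throws because the generator is already closed. A lazy variant would have to add bookkeeping on top of this to fetch from the inner iterable on demand.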