Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:116966 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 70496 invoked from network); 1 Feb 2022 14:52:55 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 1 Feb 2022 14:52:55 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id D09CA180544 for ; Tue, 1 Feb 2022 08:07:09 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.2 X-Spam-ASN: AS36024 206.123.114.0/23 X-Spam-Virus: No X-Envelope-From: Received: from mail1.25mail.st (mail1.25mail.st [206.123.115.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Tue, 1 Feb 2022 08:07:09 -0800 (PST) Received: from smtpclient.apple (unknown [49.48.219.182]) by mail1.25mail.st (Postfix) with ESMTPSA id D46F060411; Tue, 1 Feb 2022 16:07:02 +0000 (UTC) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.120.0.1.13\)) In-Reply-To: Date: Tue, 1 Feb 2022 23:06:59 +0700 Cc: "internals@lists.php.net" Content-Transfer-Encoding: quoted-printable Message-ID: References: To: tyson andre X-Mailer: Apple Mail (2.3654.120.0.1.13) Subject: Re: [PHP-DEV] Adding `final class Deque` to PHP From: php-lists@koalephant.com (Stephen Reay) > On 1 Feb 2022, at 21:46, tyson andre = wrote: >=20 > Hi internals, >=20 >> I've created a new RFC https://wiki.php.net/rfc/deque to add a `final = class Deque` >>=20 >> This is based on the `Teds\Deque` implementation I've worked on >> for the https://github.com/TysonAndre/pecl-teds PECL. >>=20 >> While `SplDoublyLinkedList` and its subclass `SplQueue`/`SplStack` = exist in the SPL, they have several drawbacks >> that are addressed by this RFC to add a `Deque` class (to use instead = of those): >>=20 >> 1. `SplDoublyLinkedList` is internally represented by a doubly linked = list, >> making it use roughly twice as much memory as the proposed `Deque` >> 2. `push`/`pop`/`unshift`/`shift` from `SplDoublyLinkedList` are = slower due to >> needing to allocate or free the linked list nodes. >> 3. Reading values in the middle of the `SplDoublyLinkedList` is = proportional to the length of the list, >> due to needing to traverse the linked list nodes. >> 4. `foreach` Iteration behavior cannot be understood without knowing = what constructed the >> `SplDoublyLinkedList` instance or set the flags. >>=20 >> It would be useful to have an efficient `Deque` container in the = standard library >> to provide an alternative without those drawbacks, >> as well as for the following reasons: >>=20 >> 1. To save memory in applications or libraries that may need to store = many lists of values or run for long periods of time. >> Notably, PHP's `array` type will never release allocated capacity. >> See = https://www.npopov.com/2014/12/22/PHPs-new-hashtable-implementation.html >> 2. To provide a better alternative to `SplDoublyLinkedList`, = `SplStack`, and `SplQueue` >> for use cases that require stacks or queues. >> 3. As a more efficient option than `array` and `SplDoublyLinkedList` >> as a queue or `Deque`, especially for `unshift`. >>=20 >> A `Deque` is more efficient than an `array` when used as a queue, = more readable, and easier to use correctly. >> While it is possible to efficiently remove elements from the start of = an `array` (in terms of insertion order) (though this makes = reset()/array_key_first() inefficient), >> it is very inefficient to prepend elements to the start of a large = `array` due to needing to either copy the array >> or move all elements in the internal array representation, >> and an `array` would use much more memory than a `Deque` when used = that way (and be slower). >>=20 >> There are also several pitfalls to using an array as a queue for = larger queue sizes, >> some of which are not obvious and discovered while writing the = benchmarks. >> (Having a better (double-ended) queue datastructure (`Deque`) than = the `SplDoublyLinkedList` >> would save users from needing to write code with these pitfalls): >>=20 >> 1. `array_key_first()` and reset()`takes time proportional to the = number of elements `unset` from the start of an array, >> causing it to unexpectedly be extremely slow (quadratic time) = after unsetting many elements at the start of the queue. >> (when the array infrequently runs out of capacity, buckets are = moved to the front) >> 2. `reset()` or `end()` will convert a variable to a reference, >> and php is less efficient at reading or writing to reference. >> Opcache is also less efficient at optimizing uses of variables = using references. >> 3. More obviously, `array_unshift` and `array_shift` will take time = proportional to the number of elements in the array >> (to reindex and move existing/remaining elements). >=20 > I plan to start voting on https://wiki.php.net/rfc/deque on Friday, = February 4th. >=20 > Several changes have been made to = https://wiki.php.net/rfc/deque#changelog > after the feedback in https://externals.io/message/116100 >=20 > - The class is now named `Collections\Deque` > - The api documentation in https://wiki.php.net/rfc/deque#proposal was = expanded for methods. > - Benchmarks were updated. > - Like other standard datastructures, iteration over the deque is now = over the original object (instead of creating a copy),=20 > and mutating the deque will be reflected in `$iterator->current()` = (and moving the end with push()/pop() will affect where iteration ends). > - Iteration will account for calls to shift/unshift moving the start = of the deque. > the offsets will be corrected and values won't be skipped or iterated = over multiple times. > (no matter how many iterators were created by `Deque->getIterator()`) > See https://wiki.php.net/rfc/deque#iteration_behavior > - The get()/set() methods were removed, after feedback in = https://externals.io/message/116100#116214 >=20 > A WebAssembly demo is available at = https://tysonandre.github.io/php-rfc-demo/deque/ >=20 > Thanks, > Tyson >=20 > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php >=20 Hi Tyson, As a userland dev & library author it=E2=80=99s nice to see some = progression on basic data structures, so thank you for your efforts on = this! Two little things in the RFC: The proposed API switches between terms `front`, `back`, `start` and = `end` in comments - is there meant to be a conceptual difference between = front/start and end/back ? In the "Why use this instead of array?=E2=80=9D Section, the 3rd point = seems cut off: > Note that starting in php 8.2, array Cheers Stephen=20