Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:116223 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 78477 invoked from network); 5 Oct 2021 07:14:43 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 5 Oct 2021 07:14:43 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id 8AAE51804C6 for ; Tue, 5 Oct 2021 00:59:07 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,HTML_MESSAGE, RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.2 X-Spam-ASN: AS15169 209.85.128.0/17 X-Spam-Virus: No X-Envelope-From: Received: from mail-ed1-f49.google.com (mail-ed1-f49.google.com [209.85.208.49]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Tue, 5 Oct 2021 00:59:07 -0700 (PDT) Received: by mail-ed1-f49.google.com with SMTP id g8so75170483edt.7 for ; Tue, 05 Oct 2021 00:59:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=glbaPSv2KE6Cqj0Iv+DmzytQCxpbq6X7W8hvCofxumY=; b=RxfB3FhXe8/PcCpU+atcPruwdOgMiZ9lqWaipKo6ipEbC1bOvXqPTuRDsMnMz8BNoz R0P3E1SxtLGckawcyOn/9aDFfbM7t+m0e6iH82s0bc0C6KnueaO4e27THMQNaEMcXve5 74MAR9uuRTt/VZSCnNyLT+5Bfy/F+kFW2ktVYY9ZNcgKXuU05VoQxLm+fTYW+AftV/lf pL/8R5ixrZrZrrqABPYsqkYP36MEtZNS/1lZsWur+0PDgYQlPUGYFrM2lWNjtFrM24EV Igp7b11Om9Q6lgp4DW9RyL0A2rPSlLSWJMiU+fNxmwxPsy3G6kcucV4vABtPqxTWWKQy UZkA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=glbaPSv2KE6Cqj0Iv+DmzytQCxpbq6X7W8hvCofxumY=; b=TSkL4mcHJf4J8raEd9GCTlVdow0j4DM6vqNR9IVbK99tAGJvF22t4YBNKqYdkWwIrP u4/C2r7miw/TcMIO83UxArbJqKqONLwANyUI2+gTIgAS+MBxgoRtQYU2ff8DbZKqoO/t coypEHkuOUUdD1atA/U5StkJ4ektVFGboODE7ewcfhmW2VZFmBUWkmtLJJfrXaEW2XR5 xCSPa0/DmzvPdrVkFEqSCqLO5qKWwHWx/JDLRi1bKO4lwwIkxLOKy9gO6bffhBbNqRgI sFEpyEAHtfIj0DFN/+Z91jWBlEtddiKlA+yPbV9GZ+KBLqqoAev6t2Ln+iVKLzoBP66K lTYA== X-Gm-Message-State: AOAM532azyYtBWl4u/D7wMp75+e03ctpj/tErO9R0eUvUt9enxKrgHxp FXXIpdKegb8/+eKWm+lem7f6Gb8fgvRNY7V50ovSTVFw X-Google-Smtp-Source: ABdhPJyni8v5eLenJn4n0N0OYeFlqMu33xm8dLApzzkvCNp8YEDLgHkAaVwFynvctoFmBoPp1yRqZ22K5bv59adN9lA= X-Received: by 2002:aa7:ce1a:: with SMTP id d26mr11352697edv.362.1633420742869; Tue, 05 Oct 2021 00:59:02 -0700 (PDT) MIME-Version: 1.0 References: <67BFE5C6-3C3F-4309-97A2-14CA0D8C3B3B@newclarity.net> <7BEE0406-3CAE-4E83-A196-F9F608BE8D17@newclarity.net> In-Reply-To: Date: Tue, 5 Oct 2021 09:58:46 +0200 Message-ID: To: tyson andre Cc: "internals@lists.php.net" Content-Type: multipart/alternative; boundary="00000000000025c6aa05cd966764" Subject: Re: [PHP-DEV] Adding `final class Deque` to PHP From: nikita.ppv@gmail.com (Nikita Popov) --00000000000025c6aa05cd966764 Content-Type: text/plain; charset="UTF-8" On Tue, Oct 5, 2021 at 12:47 AM tyson andre wrote: > Hi Nikita Popov, > > > 1. There would be the possibility of having an interface Deque that is > > backed by a VecDeque/ArrayDeque implementation. I'm not convinced this > > would actually make sense, but I wanted to throw it out there, given that > > the class is final (which is without any doubt the correct decision). > > Do you mean ArrayDeque for a hardcoded max capacity? > I'm also not convinced there's a use case. > > > 2. How do set() / offsetSet() work with an out of range index? Is it > > possible to push an element with $deque[count($deque)] = x? I would > assume > > that throws, but want to make sure. On a related note, why do we need > both > > get()/set() and offsetGet()/offsetSet()? These APIs seem redundant. > > It isn't possible to set an out of range index - both throws > `\OutOfBoundsException` > > In order to support the ArrayAccess `$deque[$offset] = $value;` or > `$deque[$offset]` shorthand, > the offsetGet/offsetSet needed to be implemented to follow conventions. > (because of ArrayAccess, offsetGet must accept mixed offsets) > > Those aren't type safe, though, so get()/set() are provided for a type > safe alternative > that will throw a TypeError for use cases that would benefit from runtime > type safety. > offsetGet() and offsetSet() can (and should) still throw if the offset is not an integer. We just can't actually declare that type. This is a weakness of the PHP type system, so nominally "violating" LSP is fine here. > > 3. I believe it's pretty standard for Deque implementations to provide > > insert() / remove() methods to insert at any position (these would be > O(n)). > > https://www.php.net/manual/en/class.splqueue.php didn't offer that > functionality. > > https://www.php.net/manual/en/class.ds-deque.php did, apparently. > > > 4. The design of getIterator() here is rather unusual, and deserves some > > additional justification. Normally, we let getIterator() see concurrent > > modifications to the structure, though the precise behavior is > unspecified. > > I would also like to know how this will look like on a technical level > (it > > doesn't seem to be implemented yet?) This seems like something that will > > require a check on every write operation, to potentially separate the > > structure in some form. > > The original plan was to copy the entire array on iterator creation, > to imitate the immediate copy nature of foreach on arrays. > There is no immediate copy with arrays though. The copy is just a refcount increment and no actual copy will happen unless the array is modified during iteration, which is a rare case. I'd expect the Deque implementation (with those semantics) to do the same -- but we don't have a convenient mechanism for it. Adding an indirection and a refcount to the underlying storage would be possible, but also feels like overkill. The other approach I see would be to track all the registered iterators, so we can explicitly separate them on write. > This was assuming that `foreach` over a Deque without removing elements > would be a rare use case. > > That may have been a mistake since `foreach (clone $deque as $key => > $value)` can be done to explicitly do that. > There're around 4 approaches I could take with different tradeoffs > > 1. Iterate over $offset = 0 and increment offset in calls to > InternalIterator->next() until exceeding the size of the deque, not copying > the deque. > > That's the **actual** current implementation, but would be unintuitive > with shift()/unshift() > > This would repeat elements on unshift(), or skip over elements when > shift() is called. > > The current implementation of `Ds\Deque` seems to do the same thing, > but there's a similar discussion in its issue tracker in > https://github.com/php-ds/ext-ds/issues/17 > > > 2. Similar iteration behavior, but also have it relative to a uint64 > indicating the number of times elements were added to the front of the > deque minus the number of elements were removed from the back of the deque. > > (e.g. if the current Deque internalOffset is 123, the InternalIterator > returned by getOffset would start with an offset of 123 and increase it > when advancing. > Calls to shift would raise the Deque internalOffset, calls to unshift > would decrease it (by the number of elements). > This would be independent of the circular buffer offset. > And the InternalIterator would increase the internalOffset to catch up > if the signed offset difference became negative, e.g. by calling shift() > twice in a foreach block) > Something to keep in mind here is that there might be multiple iterators at the same time, so I don't think the suggestion from the last sentence would work. Though I think the general idea does work, as long as we're working with a delta between the offset in the iterator and the deque. > 3. Behave as if the entire Deque was copied when iteration starts (either > by actually copying everything or by copy on write). > > That was the **planned** implementation documented in the RFC, but > would be inefficient for iterations that end early and use a lot of memory > for large Deques. > > 4. Throw if attempting to change the size of the `Deque` while a foreach > is in progress. > > The semantics of that would be annoying, e.g. handling `clear()` or > `shift()`, and the exception getting thrown would be unexpected. > > > 5. The shift/unshift terminology is already used by our array API, but > I'm > > not sure it's the most intuitive choice in the context of a deque. > SplQueue > > has enqueue() and dequeue(). Another popular option from other languages > > (which I would personally favor) is pushFront() and popFront(). > > My original name idea was pushBack/popBack/pushFront/popFront but I had > decided against proposing brand new terms > for functionality that already had terms in php-src. > It would also be inconsistent with proposed names for `Vector` if that was > also added. > > https://www.php.net/manual/en/class.ds-deque.php and SplDoublyLinkedList > had done the same thing > > enqueue and dequeue don't have a good term for enqueueing on the opposite > side. > > - It would make it easier to learn the language to have fewer terms and > migrate code using SplDoublyLinkedList or arrays > - push()/shift() are also shorter names than pushBack/popFront() > > Thanks, > Tyson > > > From: Nikita Popov > Sent: October 4, 2021 8:04 AM > To: Levi Morrison > Cc: PHP internals > Subject: Re: [PHP-DEV] Adding `final class Deque` to PHP > > On Tue, Sep 21, 2021 at 11:05 PM Levi Morrison via internals < > internals@lists.php.net> wrote: > > > The name "deque" is used in the standard library of these languages: > > > > - C++: std::deque > > - Java: java.util.Deque (interface) > > - Python: collections.deque > > - Swift: Collections.Deque (not standard lib, apparently, but Apple > > official? Don't know Swift) > > - Rust: std::collections::VecDeque > > > > And these don't have it in the standard library: > > - Go > > - C# > > - C > > - JavaScript > > > > Anyway, it's pretty decent evidence that: > > 1. This functionality is pretty widely used across languages. > > 2. This functionality should have "deque" be in the name, or be the > > complete name. > > > > Discussion naming for "vector" I can understand, as it's less widely > > used or sometimes means something specific to numbers, but there's > > little point in discussing the name "deque." It's well established in > > programming, whether PHP programmers are aware of it or not. > > > > As I see it, the discussion should be centered around: > > 1. The API Deque provides. > > 2. The package that provides it. > > 3. Whether Deque's API is consistent with other structures in the same > > package. > > 4. Whether this should be included in php-src or left to extensions. > > > > I suggest that we try to make PHP as homogenous in each bullet as we > > can with other languages: > > 1. Name it "Deque." > > 2. Put it in the namespace "Collections" (and if included in core, > > then "ext/collections"). > > 3. Support common operations on Deque (pushing and popping items to > > both front and back, subscript operator works, iteration, size, etc) > > and add little else (don't be novel here). To me, the biggest thing > > left to discuss is a notion of a maximum size, which in my own > > experience is common for circular buffers (the implementation chosen > > for Deque) but not all languages have this. > > 4. This is less clear, but I'm in favor as long as we can provide a > > few other data structures at the same time. Obviously, this a voter > > judgement call on the yes/no. > > > > Given that I've suggested the most common options for these across > > many languages, it shouldn't be very controversial. The worst bit > > seems like picking the namespace "Collections" as this will break at > > least one package: https://github.com/danielgsims/php-collections. We > > should suggest that they vendor it anyway, as "collections" is common > > e.g. "Ramsey\Collections", "Doctrine\Common\Collections". I don't see > > a good alternative here to "Collections", as we haven't agreed on very > > much on past namespace proposals. > > > > Anyway, that's what I suggest. > > > > I generally agree with everything Levi has said here. I think that adding a > deque structure generally makes sense, and that putting it into a new > ext/collections extension under the corresponding namespace would be > appropriate. I wouldn't insist on introducing it together with other > structures (I'm a lot more sceptical about your Vector proposal), but do > think there should be at least some overarching plan here, even if we only > realize a small part of it in this version. > > A few questions for the particular API proposed here: > > 1. There would be the possibility of having an interface Deque that is > backed by a VecDeque/ArrayDeque implementation. I'm not convinced this > would actually make sense, but I wanted to throw it out there, given that > the class is final (which is without any doubt the correct decision). > > 2. How do set() / offsetSet() work with an out of range index? Is it > possible to push an element with $deque[count($deque)] = x? I would assume > that throws, but want to make sure. On a related note, why do we need both > get()/set() and offsetGet()/offsetSet()? These APIs seem redundant. > > 3. I believe it's pretty standard for Deque implementations to provide > insert() / remove() methods to insert at any position (these would be > O(n)). > > 4. The design of getIterator() here is rather unusual, and deserves some > additional justification. Normally, we let getIterator() see concurrent > modifications to the structure, though the precise behavior is unspecified. > I would also like to know how this will look like on a technical level (it > doesn't seem to be implemented yet?) This seems like something that will > require a check on every write operation, to potentially separate the > structure in some form. > > 5. The shift/unshift terminology is already used by our array API, but I'm > not sure it's the most intuitive choice in the context of a deque. SplQueue > has enqueue() and dequeue(). Another popular option from other languages > (which I would personally favor) is pushFront() and popFront(). > > Regards, > Nikita > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > > --00000000000025c6aa05cd966764--