Newsgroups: php.internals
Date: Tue, 12 Mar 2024 01:36:10 +0000
From: internals@gpb.moe ("Gina P. Banyard")
To: Larry Garfield
Cc: php internals
Subject: Re: [PHP-DEV] [Pre-RFC] Improve language coherence for the behaviour of offsets and containers

On Monday, 11 March 2024 at 16:11, Larry Garfield wrote:

> Woof. That's my kind of RFC. :-) The extensive background helps a lot, thank you.
>
> I am generally in favor of this, and have wanted more fine-grained ArrayAccess interfaces for a long time.
>
> Thoughts in no particular order:
>
> * "Dimension" is clearly based on the existing engine hooks, but as a user-space dev, it's a very confusing term. Possibly documentable if there's an obvious logic for it we could present, but something more self-evident is probably better.

I am open to naming changes, but "dimension" is very much the standard and ubiquitous term for this: see multidimensional arrays, n-dimensional arrays, etc.

> * I am not sure combining get and exists into a single interface is right. I'm not certain it's wrong, either, though. What is the argument for combining them?
>
> * Do we have some guidance for what offsetGet() should do when offsetExists() is false?
> I know there isn't really any now, but it seems like the sort of thing to think about here.

I will answer both points together.

The reason that checking whether an offset exists is combined with being able to read it is that the two naturally belong together. If only the "read" operation were supported, how would you know that an offset actually exists before accessing it? Would you just access it and get an Exception that you need to catch? Combining them is also required for the null coalesce operator to function properly.

What offsetGet() does if the offset doesn't exist is up to the implementation: you could return null as a default value or throw an exception. The only expectation is that *if* the offset exists, then reading the value should be possible.

If you can only write to a container, then being able to check that something exists is somewhat meaningless.

> * You sort of skirt around but don't directly address the impact of this change on the long-standing desire to make array functions accept arrayified objects. What if any impact would this have there? Eg, could some array functions be made to accept array|DimensionRead without breaking the world?
>
> * How, if at all, does this relate to iterable? I think it has no impact, but since it's in the same problem space it's worth confirming.

The former is actually also more related to iterable, as most array functions need to be able to traverse the container to do anything meaningful with it. Otherwise there would need to be a way to provide a "manifest" of which offsets are set for things like array_search or array_flip.

> * You mention at one point applying shim code ArrayAccess to make it work like the new interfaces. Please expand on that. Do you mean the engine would automatically insert some shims, like how `__toString()` now magically implies `implements Stringable`?
> Or some trait that objects could use (either a user-space trait or an engine trait) that would provide such shims in an overridable fashion? I don't fully follow here.
>
> * If I read correctly, an internal object that implements one of the dimension handlers will automagically appear to user-land as having the corresponding interface, much like `__toString()`/`Stringable`. Is that correct? It seemed implied but not fully stated. If so, a brief code example would help to make it clear in such a long RFC.

I moved a later point up here to respond to both together.

Internal objects don't actually magically implement Stringable; this is something internal objects must do themselves. Moreover, internal objects can support string casts without implementing __toString(): see the GMP object, which is the only example of this in php-src (and I should fix this, or if someone else wants an easy PR to do, feel free).

To understand how the shim works, I first need to explain how implementing an interface on a class works. Internal interfaces can have a special handler called interface_gets_implemented, which gets called during compilation of classes. This is the mechanism used by the Throwable and DateTimeInterface interfaces to prevent userland classes from implementing them.

The ArrayAccess interface has such a handler (as do all the other Dimension ones, actually), and the "shim" is to set the append, fetch, and fetch-append dimension handlers on the CE to magically support those operations on the class for the time being. No methods are created on the class, and the new interfaces are not implemented. To override the behaviour of append/fetch/fetch-append, the relevant new interface should be implemented.

This is conceived as a temporary measure to ease migration for userland classes that support those operations already.

> * Big +1 to removing the magic semi-silent casting when using weird key types.
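
For anyone following along who has not run into it, here is a minimal sketch of the kind of key casting meant here, using a plain array. It is an illustration only and is not taken from the RFC text; the variable names are made up.

```php
<?php
// Three differently typed offsets that all end up as the same integer key.
$a = [];
$a[true] = 'bool';   // true is cast to the integer key 1
$a["1"]  = 'string'; // the numeric string "1" is also cast to 1
$a[1.0]  = 'float';  // 1.0 has no fractional part and becomes 1 as well

// Only one element remains, because each write targeted key 1.
$keys = array_keys($a);
var_dump($keys, $a[1]);
```

Each assignment overwrites the previous one without any diagnostic, which is easy to miss when the offset comes from user input.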
>
> * I feel like some of the sections could benefit from more short code examples. Eg, What the heck does fetch-append on a null even look like? That would help illustrate why the current behavior is weird, or why some things need to stay non-obvious because they're used in odd cases. (Like how $a[1][2] is a by-ref fetch internally, something most people don't think about.)

I find having too many examples makes RFCs difficult to read and parse, and I prefer to use them sparingly, for when they are really needed.

Just for clarity, $a[1][2] is only a by-ref fetch for write operations; if it is a read operation, the fetches are performed in sequence.

Fetch-append on null is not really weird: it just appends the given value (or null if no assignment happens) and provides a reference to it. See https://3v4l.org/UPg3P

> * What would the changes to ArrayObject mean for a backing object that uses hooks on some properties? It says __get/__set would be bypassed. Would hooks also be bypassed? (I have no clue what the preferred logic here is, honestly. Just thinking aloud and hoping hooks pass so that we have to think about this. :-) )

The current behaviour bypasses __set() and __get() because it works with the property HashTable directly. My current PR to fix all those weird behaviours uses the write_property object handler with the ZEND_GUARD_PROPERTY_SET guard set.

I don't know how hooks are implemented, so Ilija should be able to determine how this works exactly, but my guess is that it should "just work", with set hooks being called.

> Again, thank you for your work on this, and I hope it passes easily.
>
> --Larry Garfield

Hope I've answered the questions you had.

Best regards,

Gina P. Banyard
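
P.S. On the point about exists and read belonging together for the null coalesce operator, a minimal sketch with today's ArrayAccess may help. LoggingBag is a hypothetical class invented for this illustration (it is not part of the RFC or of php-src); it only records which methods the engine calls.

```php
<?php
// Hypothetical container used only for illustration; not part of the RFC.
final class LoggingBag implements ArrayAccess
{
    /** @var string[] Names of the ArrayAccess methods called, in order. */
    public array $calls = [];

    public function __construct(private array $data = []) {}

    public function offsetExists(mixed $offset): bool
    {
        $this->calls[] = 'offsetExists';
        return array_key_exists($offset, $this->data);
    }

    public function offsetGet(mixed $offset): mixed
    {
        $this->calls[] = 'offsetGet';
        // What to do for a missing offset is the implementation's choice;
        // this sketch throws rather than returning null.
        if (!array_key_exists($offset, $this->data)) {
            throw new OutOfBoundsException("No such offset: $offset");
        }
        return $this->data[$offset];
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        if ($offset === null) {
            $this->data[] = $value; // $bag[] = $value
        } else {
            $this->data[$offset] = $value;
        }
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->data[$offset]);
    }
}

$bag = new LoggingBag(['a' => 1]);

// The engine asks offsetExists() first; it is false, so offsetGet() is skipped.
$missing = $bag['nope'] ?? 'default';

// offsetExists() is true here, so offsetGet() is then called for the value.
$present = $bag['a'] ?? 'default';
```

Because `??` consults offsetExists() before reading, the throwing offsetGet() above is never reached for the missing offset, which is why a read-only interface without an existence check would not be enough for that operator.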