Newsgroups: php.internals
Xref: news.php.net php.internals:121049
Message-ID: <094b11b6-bbc3-43de-9c81-485a6cb50c3b@app.fastmail.com>
Date: Tue, 12 Sep 2023 19:10:45 +0000
To: "php internals"
Subject: Re: [PHP-DEV] RFC Proposal: Readonly Structs in PHP
From: larry@garfieldtech.com ("Larry Garfield")

On Fri, Sep 8, 2023, at 1:12 PM, Lanre Waju wrote:
> Dear PHP Internals,
>
> I am writing to propose a new feature for PHP that introduces the
> concept of structs. This feature aims to provide a more concise and
> expressive way to define and work with immutable data structures. Below
> is a detailed description of the proposed syntax, usage, and behavior.
>
> Syntax
>
> struct Data
> {
>     string $title;
>     Status $status;
>     ?DateTimeImmutable $publishedAt = null;
> }
> The Data struct is essentially represented as a readonly class with a
> constructor as follows:
>
>
> readonly class Data
> {
>     public function __construct(
>         public string $title,
>         public Status $status,
>         public ?DateTimeImmutable $publishedAt = null,
>     ) {}
> }
> Assertions
> The Data struct will always be readonly.
> It has no methods besides the constructor.
> Constructors
> The Data struct can be constructed in three different ways, each of
> which allows for named or positional arguments, which can be mixed:
>
> 1.1 Class like
> $data = new Data('title', Status::PUBLISHED, new DateTimeImmutable());
>
> 1.2 Class like (Named Syntax)
> $data = new Data(title: 'title', status: Status::PUBLISHED, publishedAt:
> new DateTimeImmutable());
>
> 2.1 Proposed struct initialization syntax (Positional Arguments)
> $data = Data{'title', Status::PUBLISHED, new DateTimeImmutable()};
>
> 2.2 Proposed struct initialization syntax (Named Syntax)
> $data = Data{title: 'title', status: Status::PUBLISHED, publishedAt: new
> DateTimeImmutable()};
>
> 3.1 Anonymous Struct (Named Arguments)
>
> $data = struct {
>     string $title;
>     Status $status;
>     ?DateTimeImmutable $publishedAt = null;
> }('title', Status::PUBLISHED, new DateTimeImmutable());
> 3.2 Anonymous Struct (Named Arguments - Named Syntax)
>
> $data = struct {
>     string $title;
>     Status $status;
>     ?DateTimeImmutable $publishedAt = null;
> }(title: 'title', status: Status::PUBLISHED, publishedAt: new
> DateTimeImmutable());
> Nesting
> The proposed feature also supports nesting of structs.
> For example:
>
>
> final class HasNestedStruct
> {
>     NestedStruct {
>         string $title;
>         Status $status;
>         ?DateTimeImmutable $publishedAt = null;
>     };
>
>     public function __construct(
>         public string $string,
>         public Data $normalStruct,
>         public NestedStruct $nestedStruct = NestedStruct{'title',
> Status::PUBLISHED, new DateTimeImmutable()},
>         public struct InlineNamed { int $x} $inlineNamed = {x: 1},
>         public { int $x, int $y} $inlineAnonymous = {x: 1, y: 2},
>     ) {}
> }
> This proposal aims to enhance the readability and maintainability of
> code by providing a more concise and expressive way to work with
> immutable data structures in PHP.
> I believe this feature will be a valuable addition to the language as it
> not only opens the door for future enhancements (eg. typed json
> deserialization, etc.), but should also help reduce reliance on arrays
> by providing a more expressive alternative.
>
> Your feedback and suggestions are highly appreciated, and we look
> forward to discussing this proposal further within the PHP internals
> community.
>
> Sincerely
> Lanre

As I have stated in the past, I am firmly opposed to anemic structs. They offer no benefit, much confusion, and more work for future RFCs.

The core concept -- that service objects and data objects are separate creatures that should not be comingled -- I fully agree with and advocate for.
If I were writing PHP from scratch today, I would probably design it with separate constructs, or take a cue from Go/Rust and eliminate classes per se altogether, as they just confuse matters. However, we are dealing with PHP as it exists today, and an entirely separate limited construct just doesn't make sense.

I also want to make clear that I am 1000% in favor of structured, typed data. The use of associative arrays as a pseudo data structure is the weakest part of PHP, and the more we can move people away from that towards more formally typed data, the better. For that reason, making it trivial to cast between an associative array and a structured object (as a few others in the thread have suggested) is a *bad* feature, because it further reinforces the idea that an associative array is "just as good" as making a defined type. This is simply flat out false, and we should avoid language features that pretend that it is true.

That said, as of PHP 8, we already have a perfectly good struct-ish data structure: Classes with promoted properties and named arguments. As of PHP 8.2, the entire class can be declared readonly with a single keyword. For 95% of use cases, this is completely adequate as a struct-like structure:

readonly class Person
{
    public function __construct(
        public string $first,
        public string $last,
        public ?DateTimeImmutable $birthday = null,
    ) {}
}

$p = new Person(first: 'Larry', last: 'Garfield');

So where does it fall short?

1. The proposal above suggests the shortcoming is that it allows methods. Why is that an issue? Why are methods a problem on a data-centric object? This is never explained, and I don't believe it to be true.
While methods that call out to other service objects would definitely be bad juju, a fullname() method on the above class poses no problems whatsoever. There is no theoretical purity being protected by disallowing methods. By the same logic, would structs also forbid property hooks, assuming those pass? I would hope not, as data objects are where those are most useful. In fact, I'd go a step further and note that a readonly struct that disallows methods *precludes* the "with-er" style of evolving an object, so if you want "the same thing but with this one change", you have to completely recreate a new struct from scratch. This is a worse experience in every way.

2. The proposal above suggests that it's because the `new` keyword is needed, and proposes both positional and named function-esque syntax, making it look more like Kotlin or Rust where there is no `new` keyword and the class name is itself the constructor. I will agree that `new` is clumsy in many cases, particularly in compound expressions, but that's not an issue unique to data-centric objects. If we were to come up with some alternate constructor invocation to make it easier for data-centric objects, it would be equally useful on non-data objects as well. There's no reason to make it specific to just data structs. (I am also not certain if the parser could even handle that, since functions and classes are in a different keyspace currently so if both a class foo and function foo are defined, `foo()` is ambiguous.)

3. The proposal suggests the ability to have nested struct definitions. I can see where this is useful, certainly. However... I can also see where it's useful on service objects, too. Many languages have such a feature, often called "inner classes," and it works just fine on service objects as well as data objects.
Inner classes would be an interesting feature in itself that would be worth its own RFC (I won't guarantee that I'd support it, depending on the details, but I am quite open to considering it), but there's no good reason to limit that functionality to just data classes.

4. The proposal implies that structs should be always readonly. As noted above, a readonly class is trivial to define now. Moreover, while I am an outspoken proponent of immutable data structures, they are not appropriate in all cases. PSR-14 events, for example, are deliberately mutable because, given PHP's design, making them immutable would have required a lot of extra work from anyone writing a listener for very little benefit. Entities are another example of a data-object that logically needs to be mutable. Mutable data-centric product types have their place, and this approach would preclude that.

5. MWOP suggested in a reply that allowing a struct to conform to a struct definition structurally by the properties it has would be useful. Potentially yes. However, interface properties, part of the hooks RFC, would get us to almost the same place without the weird world-splitting between structs and classes.

6. When dealing with a mutable data value, objects pass by handle (feels like reference even if it's not), but data feels like it should pass by value. Valid! This is a long-standing gripe, and the growth of with-er style value objects (PSR-7 et al) is in large part to avoid that risk of "spooky action at a distance." However... the above proposal does not address this at all!

I would argue that point 6 is the only valid argument for a separate construct from classes as they already exist. But there is no need to create a whole other construct (a very significant implementation lift) to achieve that. All that would need is a flag/marker on the class to indicate that it should use data-like passing semantics.
Kotlin has a good example here, where you can declare a class a "data" class by just adding the `data` keyword. That has a number of implications in Kotlin (many of which are not relevant for us), but in our case it would mean either to pass the object by value, or to automatically clone it every time it is passed. (The two would be almost the same to the end user, but likely have different implementation challenges. I cannot speak to what those would be personally, but it's an implementation detail not relevant for now.)

That very small change, when combined with all of the other improvements to the language in recent years, gives us all the benefits of data-centric structures without any of the downsides of a completely new construct. It would also allow the developer to opt-in to the class being readonly or not, as the situation requires. The downsides of a new construct include:

1. It would either have to be a new zval type, which is a ton of work, or built on classes, in which case you're fighting against all of the stuff classes do.

2. Which stuff that classes do should be supported by the very-similar syntax? Methods? Attributes? Can you clone it? Do you get a __clone() override if you do? Are traits supported? How does equality work? I can see an argument for where product types (which is what we're talking about) would benefit from all of the above. So we either cripple structs without useful features, or it becomes a lot of work to end up with "objects that pass funny." We can get "objects that pass funny" with a lot less effort with just a `data` keyword flag.

3. If structs are entirely separate from objects, then any time we add a new feature to objects we'll have to debate, again, if that feature should be added to structs as well. And if so, we're looking at more work for the RFC implementer for very little gain.
Or, we'll add a feature to structs (like inner classes) and then ask for it on classes, too, and again have double the work and double the debate.

4. The Reflection API is complicated enough as is, without having to deal with a whole other type of type. As someone who maintains a serializer, that would be a lot of work for me to support, with no actual benefit.

In short, there are only two versions of structs that could realistically end up existing: Crippled in some way, and "objects that pass funny."

So if what we really want are objects that pass by value, let's just implement by-value opt-in objects and be done with it. It's much less work, much more powerful, and avoids many more debates in the future.

--Larry Garfield
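
To make points 1 and 6 above concrete, here is a minimal sketch, assuming PHP 8.2 or later. The fullName() and withLast() methods, the Counter class, and the increment() function are hypothetical names chosen for illustration; they do not come from the proposal or from any library. The first part shows a readonly data class that keeps a derived-value method and a with-er. The second part shows by-handle passing, with an explicit `clone` standing in for the by-value semantics that a `data` flag might provide; no such flag exists in PHP today.

```php
<?php
declare(strict_types=1);

// Part 1: a readonly data class with methods (hypothetical example).
final readonly class Person
{
    public function __construct(
        public string $first,
        public string $last,
        public ?DateTimeImmutable $birthday = null,
    ) {}

    // Derives a value purely from the object's own data.
    public function fullName(): string
    {
        return "{$this->first} {$this->last}";
    }

    // "With-er": returns a changed copy and leaves the original untouched.
    public function withLast(string $last): self
    {
        return new self($this->first, $last, $this->birthday);
    }
}

$p = new Person(first: 'Larry', last: 'Garfield');
$q = $p->withLast('Smith');

echo $p->fullName(), PHP_EOL; // Larry Garfield
echo $q->fullName(), PHP_EOL; // Larry Smith

// Part 2: a mutable data object is passed by handle.
final class Counter
{
    public function __construct(public int $count = 0) {}
}

function increment(Counter $c): void
{
    $c->count++;
}

$c = new Counter(1);
increment($c);        // the caller's object changes: $c->count is now 2
increment(clone $c);  // only the copy changes: $c->count is still 2

echo $c->count, PHP_EOL; // 2
```

The explicit `clone` at the call site only emulates what the email describes as automatic cloning on pass. How an opt-in by-value object would actually be implemented is left open in the discussion above.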