Newsgroups: php.internals
List-Id: internals.lists.php.net
Date: Sat, 23 Nov 2024 14:35:45 -0600
To: "php internals"
Message-ID: <66b25296-ca81-4029-b6f6-064cfd810635@app.fastmail.com>
In-Reply-To: <18b85ba5-5f1c-489c-9096-3ae203977fbe@app.fastmail.com>
Subject: Re: [PHP-DEV] [RFC] Data Classes
From: larry@garfieldtech.com ("Larry Garfield")

On Sat, Nov 23, 2024, at 7:11 AM, Rob Landers wrote:
> Hello internals,
>
> Born from the Records RFC (https://wiki.php.net/rfc/records)
> discussion, I would like to introduce to you a competing RFC: Data
> Classes (https://wiki.php.net/rfc/dataclass).
>
> This adds a new class modifier: data. This modifier drastically changes
> how classes work, making them comparable by value instead of reference,
> and any mutations behave more like arrays than objects (by value). If
> desired, it can be combined with other modifiers, such as readonly, to
> enforce immutability.
>
> I've been playing with this feature for a few days now, and it is
> surprisingly intuitive to use.
> There is a (mostly) working
> implementation available on GitHub
> (https://github.com/php/php-src/pull/16904) if you want to have a go at
> it.
>
> Example:
>
> data class UserId { public function __construct(public int $id) {} }
>
> $user = new UserId(12);
> // later
> $admin = new UserId(12);
> if ($admin === $user) { /* do something */ } // true
>
> Data classes are true value objects, with full copy-on-write optimizations:
>
> data class Point {
>     public function __construct(public int $x, public int $y) {}
>     public function add(Point $other): Point {
>         // illustrating value semantics, no copy yet
>         $previous = $this;
>         // a copy happens on the next line
>         $this->x = $this->x + $other->x;
>         $this->y = $this->y + $other->y;
>         assert($this !== $previous); // passes
>         return $this;
>     }
> }
>
> I think this would be an amazing addition to PHP.
>
> Sincerely,
>
> -- Rob

Oh boy. Again, I think there's too much going on here, but I think that's because different people are operating under different definitions of what "value semantics" means. Let me try to break down what I think are the constituent parts.

1. Pass-by-value. This is what arrays, ints, strings, etc. do. When you pass a value to a function, what you get is logically a new value. It may be equal to the old one, it may be the same memory location as the old one, but that's hidden from you. Logically, it's a new value. (And if there's a shared memory location, CoW hides that from you, too.) The intent here is to avoid "spooky action at a distance" (SAAAD); that is, changing a value inside a function is guaranteed not to have any effect on the function that called it.
2. Logical equality. This only applies to compound values (arrays and objects), but would imply checking equality by recursively checking equality on sub-elements. (Properties in the case of objects, keys in the case of arrays.)
3. Physical equality. This is what === does for objects: it checks that two variables refer to the same instance (the same memory location). Physical equality implies logical equality, but not vice versa.
4. Immutability. A given variable's value cannot change.
5. Product types. A type that is based on two or more other types. (E.g., Point is a product of int and int.)

These are all circling around the same problem space, but are all different things. For instance, rigidly immutable values make pass-by-value irrelevant, while pass-by-value avoids SAAAD without needing immutability.

I think that's the key place where Rob's approach and Ilija's approach differ. Rob's proposals (records and dataclass) are trying to solve SAAAD through immutability, one way or another. Ilija's approach is trying to solve SAAAD through pass-by-value semantics.

By-value semantics would be really easy to implement by just auto-cloning an object at a function boundary. However, that's also very wasteful, as the object probably won't be modified, making the clone just a memory hog. The issue is that detecting a modification on nested objects is not particularly easy, which is how Ilija ended up with an explicit syntax to mark such modification. (I personally dislike it, from a DX perspective, but I don't have any suggestions on how to avoid it. If someone else does, please speak up.)

Immutability semantics, as we've seen, seem easy but are actually quite logically complex once you get past the bare minimum. (The bare minimum is already provided by readonly classes. Problem solved.)

So I'm not sure we're all talking about solving the same problem, or solving it in the same way.

Moreover, I don't think we all agree on the use cases we're solving. Let me offer a few examples.

1. Fancy typed values

readonly class UserID {
    public function __construct(public int $id) {}
}

This is already mostly supported, as above, just a bit verbose.
In this case, it makes sense that two equivalent objects are ==, and if we can make them === then that's a nice memory optimization, but not a requirement. In this case, we're really just providing additional typing, and the immutability is trivial (and already supported).

2. Product types (part 1)

class Point {
    public function __construct(public int $x, public int $y) {}
}

Now here's the interesting part. Should Point be immutable? Should modifications to Point inside a function affect values outside the function? MAYBE! It depends on the context. In most cases, probably not. However, consider a "registration/collection" use case of an event dispatcher:

class RegisterPluginsEvent {
    public function __construct(public array $pluginsToRegister) {}
}

This is a "data" class in that it is carrying data, and is not a service. However, we very clearly DO want SAAAD in this case. That's the whole reason it exists. Currently this case is solved by conventional classes, so I don't think there's anything to do here.

3. Product types (part 2)

Where it gets interesting is when you do need to modify an object, and propagate those changes, but NOT propagate the ability to change it. Consider:

class Circle {
    // Properties are promoted so that $c->center is accessible below.
    public function __construct(public Point $center, public int $radius) {}
}

$c = new Circle(new Point(1, 2), 5);

if ($some_user_data) {
    $c->center->x = 10;
}

draw($c);

Here, *we do want the ability to modify $c after construction*. However, we do NOT want to allow draw() to modify our $c. This case is currently unsolved in PHP.

As above, there are two approaches to solving it: making $c immutable generally, or making a copy (immediately or delayed) when passing to draw(). Making $c immutable generally would, in this case, be bad, because we do want the ability to modify $c before passing it. Modifying it in place is just much more convenient than needing to compute everything ahead of time and pass it to the constructor like it's just a function.

4. Aggregate types

One of the main places that Ilija and I have discussed his structs proposal is collections[1]. In many languages, collections have both an in-place modifier and a clone-along-the-way modifier. For instance, sort() and sorted(), reverse() and reversed(), etc. (Details vary a little by language.) Some languages also have both mutable and immutable versions of each collection type (Seq, Set, Map), with the in-place methods only available on the mutable variant. There are then also methods to convert a mutable collection into an immutable one and vice versa, which (I believe) implies making a copy. Kotlin does both of the above, and is the model that I have been planning to pursue in PHP, eventually.

Ilija has argued that if we can flag collection classes as pass-by-value, then we don't need the immutable versions at all. The only reason for the immutable versions to exist is to prevent SAAAD. If that's already prevented by the passing semantics, then we don't need an explicitly immutable collection. So that would mean:

$c = new List();
$c->add(1); // in-place mutation.
$c->add(3); // in-place mutation.
$c->add(2); // in-place mutation.

function doStuff(List $l) {
    $l->sort(); // in-place mutation of a value-passed value.
    // do stuff with $l.
}

doStuff($c);

var_dump($c); // Still ordered 1, 3, 2

So a sorted() method or an ImmutableList class wouldn't be necessary. (I can see a use for sorted() anyway, to make it chainable, just like another recent RFC proposed for the existing sort() function. That's related but a separate question.)

This approach would not be possible if data/record/struct/whatever classes have *any* built-in immutability to them. They just become super cumbersome to work with. One way or another, you end up back at the withX() methods that we already have and use.

$c = new List();
$c = $c->add(1);
$c = $c->add(3);
$c = $c->add(2);
// ...

Eew. I can do that already today, and I don't want to.
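
For comparison, here is roughly how the pass-by-value behavior above can be imitated in today's PHP. This is only a sketch under assumptions: IntList and smallest() are hypothetical names (a class cannot be called List in current PHP, since list is a reserved word), and the explicit clone at the call site stands in for the copy that a pass-by-value class would make automatically. It is not the syntax of either proposal.

```php
<?php
declare(strict_types=1);

// Hypothetical mutable list; not part of any RFC or library.
final class IntList
{
    /** @var list<int> */
    private array $items = [];

    public function add(int $value): void
    {
        $this->items[] = $value; // in-place mutation
    }

    public function sort(): void
    {
        sort($this->items); // in-place mutation
    }

    /** @return list<int> */
    public function toArray(): array
    {
        return $this->items;
    }
}

// Sorts its argument in place, then returns the first element.
function smallest(IntList $l): int
{
    $l->sort();
    return $l->toArray()[0];
}

$c = new IntList();
$c->add(1);
$c->add(3);
$c->add(2);

// Assumption: the manual clone stands in for automatic by-value passing.
$min = smallest(clone $c);

var_dump($min);          // int(1)
var_dump($c->toArray()); // still ordered 1, 3, 2
```

The manual clone is eager, so it pays the copy cost that a copy-on-write implementation would defer, and it only protects callers who remember to write it. The array property is copied by value, which is why the sort does not leak out; a nested object property would still be shared after a shallow clone.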

Here's the important observation: Speaking as the leading functional-programming-in-PHP fanboy, I don't really see much value at all to intra-function immutability. It's just... not useful in PHP. Immutability at function boundaries, that's super useful. But solving the problem at the object-immutability level is the wrong place in PHP. (It is arguably the right place in Haskell or ML, but PHP is not Haskell or ML.)

So IMO, the focus should be on just the function boundary semantics. The main issue is how to make that work without wonky new syntax. Again, I don't have a good answer, but would kindly request one. :-)

Finally, there's the question of equality. Be aware, PHP *already does value equality for objects*:

https://3v4l.org/67ho1

The issue isn't that it's not there, it's that it cannot be controlled. I am not convinced that overriding === to mean logical equality rather than physical equality, but only for data objects, is wise. And we already have == handled. (I use that fact in my PHPUnit tests all the time.) What is missing is the ability to control how that == comparison is made.

class Rect {
    private int $area; // lazily computed cache

    public function __construct(public readonly int $h, public readonly int $w) {}

    public function area(): int {
        return $this->area ??= $this->h * $this->w;
    }
}

$r1 = new Rect(4, 5);
$r2 = new Rect(4, 5);

print $r1->area(); // populates the cache on $r1 only

var_dump($r1 == $r2); // What happens here?

Presumably, we'd want those to be equal without having to compute $area on $r2. Right now, that's impossible, and those objects would not be equal. Fixing that has... nothing to do with value semantics at all. It has to do with operator overloading, and I'm already on record that I am very in favor of addressing that.

I hope that gives a better lay of the land for everyone in this thread.

--Larry Garfield

[1] https://thephp.foundation/blog/2024/08/19/state-of-generics-and-collections/#collections
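
To make the equality discussion concrete, the following sketch shows what == and === do on plain objects in current PHP, and why a lazily cached property gets in the way. The equals() method is a hypothetical userland stopgap, not something PHP consults for ==; it only illustrates the kind of control that is missing.

```php
<?php
declare(strict_types=1);

final class Rect
{
    private int $area; // lazily computed cache, uninitialized at first

    public function __construct(
        public readonly int $h,
        public readonly int $w,
    ) {}

    public function area(): int
    {
        return $this->area ??= $this->h * $this->w;
    }

    // Hypothetical helper: compares only the defining fields, ignoring the cache.
    public function equals(Rect $other): bool
    {
        return $this->h === $other->h && $this->w === $other->w;
    }
}

$r1 = new Rect(4, 5);
$r2 = new Rect(4, 5);

var_dump($r1 == $r2);  // bool(true): same class, same property values
var_dump($r1 === $r2); // bool(false): two distinct instances

$r1->area();           // populates the cache on $r1 only

var_dump($r1 == $r2);       // bool(false): $area is set on one side only
var_dump($r1->equals($r2)); // bool(true)
```

Callers have to remember to use equals() instead of ==, which is exactly the gap that a controllable comparison would close.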