Newsgroups: php.internals
From: landers.robert@gmail.com (Robert Landers)
Date: Tue, 2 Apr 2024 18:02:28 +0200
Subject: Re: [PHP-DEV] [RFC][Concept] Data classes (a.k.a. structs)
To: Ilija Tovilo
Cc: PHP internals

On Tue, Apr 2, 2024 at 2:20 AM Ilija Tovilo wrote:
>
> Hi everyone!
>
> I'd like to introduce an idea I've played around with for a couple of
> weeks: Data classes, sometimes called structs in other languages (e.g.
> Swift and C#).
>
> In a nutshell, data classes are classes with value semantics.
> Instances of data classes are implicitly copied when assigned to a
> variable, or when passed to a function. When the new instance is
> modified, the original instance remains untouched. This might sound
> familiar: It's exactly how arrays work in PHP.
>
> ```php
> $a = [1, 2, 3];
> $b = $a;
> $b[] = 4;
> var_dump($a); // [1, 2, 3]
> var_dump($b); // [1, 2, 3, 4]
> ```
>
> You may think that copying the array on each assignment is expensive,
> and you would be right. PHP uses a trick called copy-on-write, or CoW
> for short. `$a` and `$b` actually share the same array until `$b[] =
> 4;` modifies it. It's only at this point that the array is copied and
> replaced in `$b`, so that the modification doesn't affect `$a`. As
> long as a variable is the sole owner of a value, or none of the
> variables modify the value, no copy is needed. Data classes use the
> same mechanism.
>
> But why value semantics in the first place? There are two major flaws
> with by-reference semantics for data structures:
>
> 1. It's very easy to forget cloning data that is referenced somewhere
> else before modifying it. This will lead to "spooky actions at a
> distance". Having recently used JavaScript (where all data structures
> have by-reference semantics) for an educational IR optimizer,
> accidental mutations of shared arrays/maps/sets were my primary source
> of bugs.
> 2. Defensive cloning (to avoid issue 1) will lead to useless work when
> the value is not referenced anywhere else.
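
For anyone who wants to see the two issues above in today's PHP, here is a minimal sketch. `Box` is a made-up class for illustration only and is not part of the proposal. The sketch contrasts ordinary object handles (issue 1), a defensive `clone` (issue 2), and the array behaviour that data classes are meant to mirror.

```php
<?php
// Hypothetical class, used only to illustrate the two issues.
final class Box
{
    /** @var int[] */
    public array $items = [];
}

// Ordinary objects: assignment copies the handle, not the object.
$a = new Box();
$a->items = [1, 2, 3];
$b = $a;                          // $a and $b refer to the same Box
$b->items[] = 4;                  // issue 1: $a observes this change too
$sharedCount = count($a->items);  // 4

// Defensive clone: avoids issue 1, but the object is copied even if
// nobody else holds a reference to it (issue 2).
$c = clone $a;
$c->items[] = 5;
$afterClone = count($a->items);   // still 4

// Arrays: value semantics, with the copy deferred until the write.
$x = [1, 2, 3];
$y = $x;
$y[] = 4;
$arrayCount = count($x);          // 3
```

With the proposed data classes, the `$b = $a` case would presumably behave like the array case at the end, without the explicit `clone`.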
>
> PHP offers readonly properties and classes to address issue 1.
> However, they further promote issue 2 by making it impossible to
> modify values without cloning them first, even if we know they are not
> referenced anywhere else. Some APIs further exacerbate the issue by
> requiring multiple copies for multiple modifications (e.g.
> `$response->withStatus(200)->withHeader('X-foo', 'foo');`).
>
> As you may have noticed, arrays already solve both of these issues
> through CoW. Data classes allow implementing arbitrary data structures
> with the same value semantics in core, extensions or userland. For
> example, a `Vector` data class may look something like the following:
>
> ```php
> data class Vector {
>     private $values;
>
>     public function __construct(...$values) {
>         $this->values = $values;
>     }
>
>     public mutating function append($value) {
>         $this->values[] = $value;
>     }
> }
>
> $a = new Vector(1, 2, 3);
> $b = $a;
> $b->append!(4);
> var_dump($a); // Vector(1, 2, 3)
> var_dump($b); // Vector(1, 2, 3, 4)
> ```
>
> An internal Vector implementation might offer a faster and stricter
> alternative to arrays (e.g. Vector from php-ds).
>
> Some other things to note about data classes:
>
> * Data classes are ordinary classes, and as such may implement
> interfaces, methods and more. I have not decided whether they should
> support inheritance.
> * Mutating method calls on data classes use a slightly different
> syntax: `$vector->append!(42)`. All methods mutating `$this` must be
> marked as `mutating`. The reason for this is twofold: 1. It signals to
> the caller that the value is modified. 2. It allows `$vector` to be
> cloned before knowing whether the method `append` is modifying, which
> hugely reduces implementation complexity in the engine.
> * Data classes customize identity (`===`) comparison, in the same way
> arrays do. Two data objects are identical if all their properties are
> identical (including order for dynamic properties).
> * Sharing data classes by-reference is possible using references, as
> you would for arrays.
> * We may decide to auto-implement `__toString` for data classes,
> amongst other things. I am still undecided whether this is useful for
> PHP.
> * Data classes protect from interior mutability. More concretely,
> mutating nested data objects stored in a `readonly` property is not
> legal, whereas it would be if they were ordinary objects.
> * In the future, it should be possible to allow using data classes in
> `SplObjectStorage`. However, because hashing is complex, this will be
> postponed to a separate RFC.
>
> One known gotcha is that we cannot trivially enforce placement of
> `mutating` on methods without a performance hit. It is the
> responsibility of the user to correctly mark such methods.
>
> Here's a fully functional PoC, excluding JIT:
> https://github.com/php/php-src/pull/13800
>
> Let me know what you think. I will start working on an RFC draft once
> work on property hooks concludes.
>
> Ilija

Neat! I've been playing around with "value-like" objects for a while now:

https://github.com/withinboredom/time

Supporting inheritance would be useful. For example, consider an ID type:

data class Id {
    public function __construct(public string $id) {}
}

Maybe you want to extend it to a UserId:

data class UserId extends Id {}

Now you can't accidentally pass a VideoId as a UserId, but underlying
ORMs can still use both as an Id.

Robert Landers
Software Engineer
Utrecht NL
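
P.S. A rough sketch of the Id example in current PHP (8.2 or later), using `readonly` classes as a stand-in for the proposed `data class` syntax, which no released PHP version supports. `VideoId`, `loadUser()` and `toColumn()` are hypothetical names chosen for illustration, with `VideoId` assumed to mirror `UserId`.

```php
<?php
// Stand-in for the proposed `data class Id`.
readonly class Id
{
    public function __construct(public string $id) {}
}

final readonly class UserId extends Id {}
final readonly class VideoId extends Id {}

// Hypothetical consumer that only makes sense for users.
function loadUser(UserId $id): string
{
    return "user:{$id->id}";
}

// Hypothetical ORM-style helper that accepts any Id.
function toColumn(Id $id): string
{
    return $id->id;
}

$user  = new UserId('42');
$video = new VideoId('42');

$loaded = loadUser($user);    // "user:42"
$column = toColumn($video);   // "42"

// Passing a VideoId where a UserId is required is rejected by the engine.
try {
    loadUser($video);
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

// Ordinary objects: `==` compares class and properties, `===` compares identity.
$sameValue  = (new UserId('42') == $user);   // true
$sameObject = (new UserId('42') === $user);  // false
```

The last two lines are where the proposal would differ: as described in the quoted mail, `===` on data objects would compare properties, so two `UserId` values holding the same string would be identical.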