There's been a number of discussions of late around property visibility and how to make objects more immutable. Since it seems to have been well-received in the past, I decided to do a complete analysis and context of the various things that have been floated about recently.
The full writeup is here:
https://peakd.com/hive-168588/@crell/object-properties-and-immutability
I hope it proves stimulating, at least of discussion and not naps.
--
Larry Garfield
larry@garfieldtech.com
Thanks for the nice write up Larry!
Is there a reason you didn't mention the proposal for immutable classes?
(probably because it never went into a final RFC)
2020-12-29 8:26 GMT, Marc marc@mabe.berlin:
There's been a number of discussions of late around property visibility
and how to make objects more immutable. Since it seems to have been
well-received in the past, I decided to do a complete analysis and context
of the various things that have been floated about recently.
The full writeup is here:
https://peakd.com/hive-168588/@crell/object-properties-and-immutability
I hope it proves stimulating, at least of discussion and not naps.
Thanks for the nice write up Larry!
Is there a reason you didn't mention the proposal for immutable classes?
(probably because it never went into a final RFC)
https://externals.io/message/94913#94913
https://externals.io/message/79180#79180
I just want to mention that immutability might be applied too
liberally in the current discourse, and in some cases, what you really
want is non-aliasing, that is, uniqueness, to solve problems related
to immutability. I think methods like withX are an anti-pattern, in
fact, and a symptom that you do not really want immutability, but
rather uniqueness, at least in some cases.
Olle
Hi Olle,
I'm afraid I don't follow what you mean by "non-aliasing" and
"uniqueness" here. Could you clarify, perhaps with some examples?
Cheers,
--
Rowan Tommins
[IMSoP]
2020-12-29 21:36 GMT, Rowan Tommins rowan.collins@gmail.com:
I just want to mention that immutability might be applied too
liberally in the current discourse, and in some cases, what you really
want is non-aliasing, that is, uniqueness, to solve problems related
to immutability. I think methods like withX are an anti-pattern, in
fact, and a symptom that you do not really want immutability, but
rather uniqueness, at least in some cases.
Hi Olle,
I'm afraid I don't follow what you mean by "non-aliasing" and
"uniqueness" here. Could you clarify, perhaps with some examples?
Cheers,
--
Rowan Tommins
[IMSoP]
Wikipedia has an article about aliasing:
https://en.wikipedia.org/wiki/Aliasing_(computing)
Also relevant: https://en.wikipedia.org/wiki/Uniqueness_type
Uniqueness is when you only allow one reference to an object (or
bucket of memory).
$a = new A();
$b = $a; // Both $a and $b point to the same place in memory, so you have an alias
Uniqueness and immutability solve similar problems (at least in a GC
language like PHP): spooky action at a distance, fragile composition,
representation exposure.
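As a rough illustration of the aliasing problem described above, here is a minimal sketch. The Settings class and the tweak() function are made up for this example, and typed properties assume PHP 7.4 or later.

```php
<?php
// Hypothetical class, used only to show aliasing.
class Settings
{
    public string $mode = 'safe';
}

// Hypothetical function that mutates whatever object it is given.
function tweak(Settings $s): void
{
    $s->mode = 'fast';
}

$a = new Settings();
$b = $a;             // $b is an alias: both variables hold a handle to the same object
tweak($b);           // a mutation made through one handle...
echo $a->mode, "\n"; // ...is visible through the other: prints "fast"

$c = clone $a;       // a clone is a separate object, not an alias
$c->mode = 'safe';
echo $a->mode, "\n"; // still prints "fast"
```

The first echo is the "spooky action at a distance": nothing in the code touching $a changed it, yet its state is different.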
There are more advanced systems of ownership than just uniqueness, e.g.
Universe Types, but let's ignore that for now.
https://www.researchgate.net/publication/221321963_Ownership_transfer_in_Universe_Types
Uniqueness has the benefit of being more performant than immutability,
since it leads to less memory copying (but of course it's not certain
this performance gain matters in PHP programs).
You can compare a builder pattern with immutability vs non-aliasing
(uniqueness):
// Immutable
$b = new Builder();
$b = $b->withFoo()->withBar()->withBaz();
myfun($b); // $b is immutable, so $b cannot be modified by myfun()
return $b;
// Uniqueness
$b = new Builder(); // class Builder is annotated as non-aliasing/unique
$b->addFoo();
$b->addBar();
$b->addBaz();
myfun(clone $b); // HAVE TO CLONE TO NOT THROW EXCEPTION.
return $b;
The guarantee in both above snippets is that myfun() DOES NOT modify
$b before returning it. BUT with immutability, you have to copy $b
three times, with uniqueness only once. That's why I think ownership
systems merit some attention, not ONLY immutability. :) Unfortunately,
it can take a looong time to force new concepts like these into common
discourse. Rust helps, obviously. The language Clean has opt-in
uniqueness (I need to read up more on it, tho). Linear types in e.g.
Haskell are also related (but are both more complex and more powerful).
I would like to add an annotation to Psalm, like @psalm-no-alias,
because this is also needed to make type-state sane. But, didn't do
much yet, only some brain farting. :)
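To make the "three copies versus one" count concrete, here is a small sketch that counts clones for each style. Both builder classes and the $copies counter are hypothetical, and PHP has no way to enforce uniqueness today, so the single clone in the mutable case is a convention here rather than something the engine checks.

```php
<?php
// Hypothetical immutable builder: every with() call clones.
final class ImmutableBuilder
{
    public static int $copies = 0;
    private array $parts = [];

    public function with(string $part): self
    {
        $new = clone $this;
        $new->parts[] = $part;
        return $new;
    }

    public function __clone()
    {
        self::$copies++; // count each clone for the comparison below
    }
}

// Hypothetical mutable builder: add() changes the object in place.
final class MutableBuilder
{
    public static int $copies = 0;
    private array $parts = [];

    public function add(string $part): void
    {
        $this->parts[] = $part;
    }

    public function __clone()
    {
        self::$copies++;
    }
}

$a = (new ImmutableBuilder())->with('foo')->with('bar')->with('baz');

$b = new MutableBuilder();
$b->add('foo');
$b->add('bar');
$b->add('baz');
$forCallee = clone $b; // the one defensive copy that would be handed to myfun()

echo ImmutableBuilder::$copies, "\n"; // prints 3
echo MutableBuilder::$copies, "\n";   // prints 1
```

This only counts copies; it says nothing about how expensive each copy is.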
Olle
Uniqueness is when you only allow one reference to an object (or
bucket of memory).
[...]
You can compare a builder pattern with immutability vs non-aliasing
(uniqueness):
// Immutable
$b = new Builder();
$b = $b->withFoo()->withBar()->withBaz();
myfun($b); // $b is immutable, so $b cannot be modified by myfun()
return $b;
// Uniqueness
$b = new Builder(); // class Builder is annotated as non-aliasing/unique
$b->addFoo();
$b->addBar();
$b->addBaz();
myfun(clone $b); // HAVE TO CLONE TO NOT THROW EXCEPTION.
return $b;
Thanks, I can see how that solves a lot of the same problems, in a very
robustly analysable way.
However, from a high-level user-friendliness point of view, I think
"withX" methods are actually more natural than explicitly cloning
mutable objects.
Consider the case of defining a range: firstly, with plain integers and
familiar operators:
$start = 1;
$end = $start + 5;
This models integers as immutable values, and + as an operator which
returns a new instance. If integers were mutable but not aliasable, we
would instead write something like this:
$start = 1;
$end = clone $start;
$end += 5; // where += would be an in-place modification, not a short-hand for assignment
I think the first more naturally expresses the desired algorithm. It's
therefore natural to want the same for a range of dates:
$start = MyDate::today();
$end = $start->withAddedDays(5);
vs
$start = MyDate::today();
$end = clone $start;
$end->addDays(5);
To put it a different way, value types naturally form expressions,
which mutable objects model clumsily. It would be very tedious if we had
to avoid accidentally mutating the speed of light:
$e = (clone $m) * ((clone $c) ** 2);
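For reference, a minimal sketch of what the immutable MyDate in the example above could look like. The class body is an assumption (the thread only shows its usage); it wraps PHP's built-in DateTimeImmutable, and the format() helper is added purely so the result can be inspected.

```php
<?php
// Hypothetical value object; only today() and withAddedDays() appear in the thread.
final class MyDate
{
    private DateTimeImmutable $date;

    private function __construct(DateTimeImmutable $date)
    {
        $this->date = $date;
    }

    public static function today(): self
    {
        return new self(new DateTimeImmutable('today'));
    }

    // Returns a new MyDate and leaves $this untouched.
    public function withAddedDays(int $days): self
    {
        // sprintf('%+d days', 5) yields "+5 days"; modify() returns a new object.
        return new self($this->date->modify(sprintf('%+d days', $days)));
    }

    // Helper for this sketch only.
    public function format(string $format): string
    {
        return $this->date->format($format);
    }
}

$start = MyDate::today();
$end = $start->withAddedDays(5);

echo $start->format('Y-m-d'), ' to ', $end->format('Y-m-d'), "\n";
```

Because withAddedDays() is an expression that yields a value, it can be nested or chained the same way $start + 5 can.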
The guarantee in both above snippets is that myfun() DOES NOT modify
$b before returning it. BUT with immutability, you have to copy $b
three times, with uniqueness only one.
I wonder if that difference can be optimised out by the
compiler/OpCache: detect clones that immediately replace their original,
and optimise it to an in-place modification. In other words, compile
$foo = clone $foo with { x: 42 } to $foo->x = 42, even if the clone is
actually in a "withX" method.
Regards,
--
Rowan Tommins
[IMSoP]
That's a good summary of why immutability and with-er methods (or some equivalent) are more ergonomic.
Another point to remember: Because of PHP's copy-on-write behavior, full on immutability doesn't actually waste that much memory. It does use up some, but far less than you think. (Again, based on the tests MWOP ran for PSR-7 a ways back.)
I wonder if that difference can be optimised out by the
compiler/OpCache: detect clones that immediately replace their original,
and optimise it to an in-place modification. In other words, compile
$foo = clone $foo with { x: 42 } to $foo->x = 42, even if the clone is
actually in a "withX" method.
In concept, maybe? That's well above my pay grade. :-)
--Larry Garfield
2020-12-30 18:31 GMT, Larry Garfield larry@garfieldtech.com:
The guarantee in both above snippets is that myfun() DOES NOT modify
$b before returning it. BUT with immutability, you have to copy $b
three times, with uniqueness only one.
That's a good summary of why immutability and with-er methods (or some
equivalent) are more ergonomic.
Another point to remember: Because of PHP's copy-on-write behavior, full on
immutability doesn't actually waste that much memory. It does use up some,
but far less than you think. (Again, based on the tests MWOP ran for PSR-7
a ways back.)
I thought copy-on-write was only for arrays, not objects?
Olle
Copy on write applies to all values; the caveat is that with objects, the value being copied is the handle that points to an object in memory, rather than the object itself. That means passing an object by reference can do some seriously unexpected things, which is why you basically never do so.
The point here is that if you have an object with 15 internal properties, its memory usage is 15 zvals plus one zval for the object, plus one zval for the variable that points to it. (I'm over-simplifying here. A lot.) If you pass it to a function, only the one zval for the handle is duplicated, which is the same as for an integer.
If you clone the object, you don't duplicate 15+1 zvals. You duplicate just the one zval for the object itself, which reuses the existing 15 internal property entries. If in the new object you then update just the third one, PHP then duplicates just that one internal zval and modifies the new one. So you still are using only 18 zvals, not 36 zvals. (Engine people: Yes, I am very over-simplifying. I know.)
Basically, what in most languages would require manually implementing "immutable data structures" we get for free in PHP, which is seriously sweet.
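One way to observe the sharing described above is to clone an object that holds a large string and watch memory_get_usage(). This is a sketch, not a rigorous measurement: the Payload class is hypothetical, typed properties assume PHP 7.4+, and exact byte counts depend on the PHP version and build, so none are shown.

```php
<?php
// Hypothetical class: one large string property plus a small integer.
class Payload
{
    public string $blob = '';
    public int $counter = 0;
}

$original = new Payload();
$original->blob = str_repeat('x', 5 * 1024 * 1024); // roughly 5 MiB of string data

$before = memory_get_usage();
$copy = clone $original;  // new object, but the string data is shared, not duplicated
$copy->counter = 1;       // touches only the clone's own integer slot
$afterClone = memory_get_usage();

$copy->blob .= 'y';       // writing to the shared string forces a real copy
$afterWrite = memory_get_usage();

printf("cost of clone: %d bytes\n", $afterClone - $before);
printf("cost of write: %d bytes\n", $afterWrite - $afterClone);
```

The clone should cost far less than the size of the string, and the full cost only appears once the clone's copy of that property is actually modified.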
The net result is that a with-er chain like this:
$foo2 = $foo->withBar('x')->withBaz('y')->withBeep('z');
is way, way less expensive than it looks, both on memory and CPU. It is more expensive than setters, but not by much.
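For readers who have not written one, a with-er chain like the one above is typically backed by methods of the following shape. The Foo class and its three properties are assumptions made for this sketch; the thread does not show the class itself.

```php
<?php
// Hypothetical class behind the with-er chain in the message above.
final class Foo
{
    private string $bar = '';
    private string $baz = '';
    private string $beep = '';

    public function withBar(string $bar): self
    {
        $new = clone $this; // shallow copy of the object
        $new->bar = $bar;   // change one property on the copy only
        return $new;
    }

    public function withBaz(string $baz): self
    {
        $new = clone $this;
        $new->baz = $baz;
        return $new;
    }

    public function withBeep(string $beep): self
    {
        $new = clone $this;
        $new->beep = $beep;
        return $new;
    }

    // Helper for this sketch only, to inspect the state.
    public function toArray(): array
    {
        return ['bar' => $this->bar, 'baz' => $this->baz, 'beep' => $this->beep];
    }
}

$foo = new Foo();
$foo2 = $foo->withBar('x')->withBaz('y')->withBeep('z');

print_r($foo->toArray());  // all three values are still empty strings
print_r($foo2->toArray()); // bar = x, baz = y, beep = z
```

The two intermediate objects created by the chain become unreachable as soon as the next call returns, so they can be freed immediately.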
That's why I don't think the distinction between unique and immutable mentioned up-thread is that big of a deal in PHP, specifically. Yes, they're different things, but the cost of them is not all that different because of CoW, so considering them separately is not as important as it would be in a language that doesn't automatically do CoW in the background for us.
(Whoever in the 90s decided to bake CoW into the engine, thank you. It's an incredibly nice foundational feature.)
--Larry Garfield
If you clone the object, you don't duplicate 15+1 zvals. You duplicate just the one zval for the object itself, which reuses the existing 15 internal property entries. If in the new object you then update just the third one, PHP then duplicates just that one internal zval and modifies the new one. So you still are using only 18 zvals, not 36 zvals. (Engine people: Yes, I am very over-simplifying. I know.)
I've pondered hacking in something like perl's bless() to turn arrays
into value objects, but according to this it looks like an object with
clone-on-write behavior would be better, as I'm assuming arrays do a
full shallow copy: given an array of 15 entries, pass it to a
function, change one member, now you're using 15 more zvals, as
opposed to just one with an object.
Am I reading that right?
--c
The net result is that a with-er chain like this:
$foo2 = $foo->withBar('x')->withBaz('y')->withBeep('z');
is way, way less expensive than it looks, both on memory and CPU. It is
more expensive than setters, but not by much.
Ok. You have a benchmark for this? I can make one otherwise, for the query
example.
It worries me a little that immutability is pushed into the ecosystem as a
silver bullet. The main reason functional languages use it is that
ownership is a newer concept, so it hasn't been adopted as much.
2020-12-30 19:50 GMT, Olle Härstedt olleharstedt@gmail.com:
Ok. You have a benchmark for this? I can make one otherwise, for the query
example.
It worries me a little that immutability is pushed into the ecosystem as a
silver bullet. Main reason functional languages are using it is because
ownership is a newer concept, so it hasn't been adapted as much.
Tiny benchmark here:
https://gist.github.com/olleharstedt/eaaf1dd40be541f84aa0f3954a0ea54a
Running this on my ARM machine with PHP 7.2 gives ~1.2s for the
immutable loop, ~0.35s for the mutable one, meaning immutability is
~3x as expensive performance wise. And this is for a SMALL object - I
suspect the performance hit will grow the bigger the class gets (more
properties to clone). Correct me if I'm wrong. :/
Olle
That's not a valid benchmark; it's comparing cloning and method invocation against just property sets. The method calls are chewing up most of the time there.
Here's a fairer comparison, run on my laptop: https://gist.github.com/Crell/848568124e25c8c83fc4da5455063bab
In that version the with-er approach is only ~20% slower. And that's when dealing with very small numbers, so in most cases you're unlikely to notice a difference unless you really are iterating over something a million times.
I also tossed some big string properties into the class, and while the total time went up a bit the ratio between the two stayed about the same.
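The idea of a like-for-like comparison can be sketched as follows: both loops go through a method call, so the only difference is the clone. This is not the code from either gist; the Point class, timeIt() helper, and iteration count are assumptions, hrtime() needs PHP 7.3+, and no timings are quoted because they vary by machine and PHP version.

```php
<?php
// Hypothetical class offering both styles of update.
final class Point
{
    public int $x = 0;

    public function withX(int $x): self
    {
        $new = clone $this;
        $new->x = $x;
        return $new;
    }

    public function setX(int $x): void
    {
        $this->x = $x;
    }
}

// Runs $fn once and returns the elapsed wall-clock time in seconds.
function timeIt(callable $fn): float
{
    $start = hrtime(true);
    $fn();
    return (hrtime(true) - $start) / 1e9;
}

$iterations = 1000000;

$p = new Point();
$withTime = timeIt(function () use (&$p, $iterations) {
    for ($i = 0; $i < $iterations; $i++) {
        $p = $p->withX($i); // method call plus clone
    }
});

$q = new Point();
$setTime = timeIt(function () use ($q, $iterations) {
    for ($i = 0; $i < $iterations; $i++) {
        $q->setX($i);       // method call only
    }
});

printf("with-er: %.3fs, setter: %.3fs\n", $withTime, $setTime);
```

A single run like this is noisy; repeating it several times and comparing the ratios is more informative than any one number.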
--Larry Garfield
2020-12-30 20:37 GMT, Larry Garfield larry@garfieldtech.com:
That's not a valid benchmark; it's comparing cloning and method invocation
against just property sets. The method calls are chewing up most of the
time there.
Here's a more fair comparison on my laptop:
https://gist.github.com/Crell/848568124e25c8c83fc4da5455063bab
Which is only ~20% slower. And that's when dealing with very small numbers,
so in most cases you're unlikely to notice a difference unless you really
are iterating over something a million times.
I also tossed some big string properties into the class, and while the total
time went up a bit the ratio between the two stayed about the same.
--Larry Garfield
Oh yeah. Huh. Didn't realize method calling was expensive. :) Another
good reason to not use setters, lol. On my machine it's ~70% slower
with "with" now (running multiple runs, using bash "time").
The performance might matter when frameworks start to adopt patterns
like this. It's already in PSR, with the HTTP message interface etc.
Now I'm curious how OCaml and JS perform with a similar benchmark.
Since they both have tracing GC, and PHP uses ref counting...
2020-12-30 21:27 GMT, Olle Härstedt olleharstedt@gmail.com:
Oh yeah. Huh. Didn't realize method calling was expensive. :) Another
good reason to not use setters, lol. On my machine it's ~70% slower
with "with" now (running multiple runs, using bash "time").
The performance might matter when frameworks start to adopt patterns
like this. It's already in PSR, with the HTTP message interface etc.
Now I'm curious how OCaml and JS perform with a similar benchmark.
Since they both have tracing GC, and PHP uses ref counting...
Benchmarks for JS and OCaml:
https://gist.github.com/olleharstedt/eaaf1dd40be541f84aa0f3954a0ea54a
In JS, it's 5x more expensive to use spread operator vs mutate the
object fields directly (or maybe node is just not optimized well on
the ARM CPU?). In OCaml, it's a ~10% difference between destructive
update (mutable fields) and cloning. No idea what to take from that.
:)
2020-12-30 18:15 GMT, Rowan Tommins rowan.collins@gmail.com:
Uniqueness is when you only allow one reference to an object (or
bucket of memory).
[...]
You can compare a builder pattern with immutability vs non-aliasing
(uniqueness):
// Immutable
$b = new Builder();
$b = $b->withFoo()->withBar()->withBaz();
myfun($b); // $b is immutable, so $b cannot be modified by myfun()
return $b;
// Uniqueness
$b = new Builder(); // class Builder is annotated as non-aliasing/unique
$b->addFoo();
$b->addBar();
$b->addBaz();
myfun(clone $b); // HAVE TO CLONE TO NOT THROW EXCEPTION.
return $b;
Thanks, I can see how that solves a lot of the same problems, in a very
robustly analysable way.
However, from a high-level user-friendliness point of view, I think
"withX" methods are actually more natural than explicitly cloning
mutable objects.
Consider the case of defining a range: firstly, with plain integers and
familiar operators:
$start = 1;
$end = $start + 5;
This models integers as immutable values, and + as an operator which
returns a new instance. If integers were mutable but not aliasable, we
would instead write something like this:
$start = 1;
$end = clone $start;
$end += 5; // where += would be an in-place modification, not a short-hand for assignment
I think the first more naturally expresses the desired algorithm. It's
therefore natural to want the same for a range of dates:
$start = MyDate::today();
$end = $start->withAddedDays(5);
vs
$start = MyDate::today();
$end = clone $start;
$end->addDays(5);
Sure, this is a good use-case for immutability. :)
To put it a different way, value types naturally form expressions,
which mutable objects model clumsily. It would be very tedious if we had
to avoid accidentally mutating the speed of light:
$e = (clone $m) * ((clone $c) ** 2);
Using a variable on right-hand side does not automatically create an
alias, so in the above case you don't have to use clone.
A more motivating example for uniqueness is perhaps a query builder.
$query = (new Query())
->select(1)
->from('foo')
->where(...)
->orderBy(..)
->limit();
doSomething($query);
doSomethingElse($query);
In the above snippet, we don't know if doSomething() will change
$query and cause a bug. The issue can be solved with an immutable
builder, using withSelect(), withWhere(), etc, OR it's solved with
uniqueness, forcing a clone to avoid creating a new alias (passing
$query to a function creates an alias inside that function). The
optimisation is the same as in my previous example, avoiding copying
$query multiple times during build-up.
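A small sketch of the bug being described, using a deliberately tiny mutable builder. The Query class here and the body of doSomething() are made up for illustration; the point is only that the callee receives an alias unless the caller clones.

```php
<?php
// Hypothetical mutable query builder with fluent setters.
final class Query
{
    private array $clauses = [];

    public function from(string $table): self
    {
        $this->clauses['from'] = $table;
        return $this;
    }

    public function limit(int $n): self
    {
        $this->clauses['limit'] = $n;
        return $this;
    }

    public function clauses(): array
    {
        return $this->clauses;
    }
}

// Hypothetical callee that quietly changes the query it was given.
function doSomething(Query $q): void
{
    $q->limit(1);
}

$query = (new Query())->from('foo')->limit(10);

doSomething(clone $query);             // callee works on its own copy
echo $query->clauses()['limit'], "\n"; // prints 10

doSomething($query);                   // callee receives an alias of $query
echo $query->clauses()['limit'], "\n"; // prints 1: the caller's query was changed
```

With an immutable builder the second call could not have this effect; with a uniqueness rule the second call would be rejected and the caller would be pushed toward the first form.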
The guarantee in both above snippets is that myfun() DOES NOT modify
$b before returning it. BUT with immutability, you have to copy $b
three times, with uniqueness only one.
I wonder if that difference can be optimised out by the
compiler/OpCache: detect clones that immediately replace their original,
and optimise it to an in-place modification. In other words, compile
$foo = clone $foo with { x: 42 } to $foo->x = 42, even if the clone is
actually in a "withX" method.
I guess OCaml/Haskell does stuff like this, since everything is
immutable by default there. Let's ask them? Unless someone here
already knows? :)
Olle
A more motivating example for uniqueness is perhaps a query builder.
$query = (new Query())
->select(1)
->from('foo')
->where(...)
->orderBy(..)
->limit();
doSomething($query);
doSomethingElse($query);
In the above snippet, we don't know if doSomething() will change
$query and cause a bug. The issue can be solved with an immutable
builder, using withSelect(), withWhere(), etc, OR it's solved with
uniqueness, forcing a clone to avoid creating a new alias (passing
$query to a function creates an alias inside that function). The
optimisation is the same as in my previous example, avoiding copying
$query multiple times during build-up.
For a query builder, I probably wouldn't make it immutable anyway, myself. If you really want to force that doSomething() cannot modify the object that is otherwise mutable, calling doSomething(clone $query) already works today and gets that net effect, provided that Query is safe to clone. (Viz., it has no service dependencies, and if it has any dependent value objects then it has a __clone() method that deep clones them.)
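The "safe to clone" caveat matters because clone is shallow: nested objects are shared unless __clone() copies them. A minimal sketch, with made-up FilteredQuery and Conditions classes and a made-up addFilter() callee:

```php
<?php
// Hypothetical nested value object held by the query.
final class Conditions
{
    public array $items = [];
}

// Hypothetical query that owns a Conditions object.
final class FilteredQuery
{
    public Conditions $where;

    public function __construct()
    {
        $this->where = new Conditions();
    }

    // Without this method, a clone would share the same Conditions object.
    public function __clone()
    {
        $this->where = clone $this->where;
    }
}

// Hypothetical callee that modifies the nested object.
function addFilter(FilteredQuery $q): void
{
    $q->where->items[] = 'id = 1';
}

$query = new FilteredQuery();
addFilter(clone $query);                // callee gets a deep enough copy

echo count($query->where->items), "\n"; // prints 0: the original is untouched
```

If __clone() were removed from FilteredQuery, the same call would add the filter to the caller's query as well, since both objects would point at one Conditions instance.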
The guarantee in both above snippets is that myfun() DOES NOT modify
$b before returning it. BUT with immutability, you have to copy $b
three times, with uniqueness only one.
Yes, but with CoW those 3 copies are not that expensive, so we can most of the time ignore them except as a very micro-optimization. (See previous email.)
--Larry Garfield
Uniqueness is when you only allow one reference to an object (or
bucket of memory).
[...]
You can compare a builder pattern with immutability vs non-aliasing
(uniqueness):
// Immutable
$b = new Builder();
$b = $b->withFoo()->withBar()->withBaz();
myfun($b); // $b is immutable, so $b cannot be modified by myfun()
return $b;
// Uniqueness
$b = new Builder(); // class Builder is annotated as non-aliasing/unique
$b->addFoo();
$b->addBar();
$b->addBaz();
myfun(clone $b); // HAVE TO CLONE TO NOT THROW EXCEPTION.
return $b;
Thanks, I can see how that solves a lot of the same problems, in a very robustly analysable way.
However, from a high-level user-friendliness point of view, I think "withX" methods are actually more natural than explicitly cloning mutable objects.
"User-friendliness" of this nature is in the eye of the beholder.
A different perspective is that "withX" methods require a mental translation where "addX" methods do not, much like how a person whose native language is English will find it a challenge to (or cannot) "think" in French.
Consider the case of defining a range: firstly, with plain integers and familiar operators:
$start = 1;
$end = $start + 5;
This models integers as immutable values, and + as an operator which returns a new instance. If integers were mutable but not aliasable, we would instead write something like this:
$start = 1;
$end = clone $start;
$end += 5; // where += would be an in-place modification, not a short-hand for assignment
I think the first more naturally expresses the desired algorithm. It's therefore natural to want the same for a range of dates:
$start = MyDate::today();
$end = $start->withAddedDays(5);
vs
$start = MyDate::today();
$end = clone $start;
$end->addDays(5);
Ignoring that you are comparing apples and oranges (scalars to objects,) the latter is easier to reason about IMO.
To put it a different way, value types naturally form expressions, which mutable objects model clumsily. It would be very tedious if we had to avoid accidentally mutating the speed of light:
$e = (clone $m) * ((clone $c) ** 2);
The guarantee in both above snippets is that myfun() DOES NOT modify
$b before returning it. BUT with immutability, you have to copy $b
three times, with uniqueness only one.
I wonder if that difference can be optimised out by the compiler/OpCache: detect clones that immediately replace their original, and optimise it to an in-place modification. In other words, compile $foo = clone $foo with { x: 42 } to $foo->x = 42, even if the clone is actually in a "withX" method.
-Mike
Hi Mike and Olle,
A different perspective is that "withX" methods require a mental translation where "addX" methods do not, much like how a person whose native language is English will find it a challenge to (or cannot) "think" in French.
I wonder if that's just about the choice of names, rather than the
mutability/immutability itself?
$start = MyDate::today();
$end = $start->withAddedDays(5);
vs
$start = MyDate::today();
$end = clone $start;
$end->addDays(5);
Ignoring that you are comparing apples and oranges (scalars to objects,) the latter is easier to reason about IMO.
Ignoring the distinction between "scalar" and "object" was kind of the
point: they are both "values", and are more naturally treated the same
than differently.
To take a different example, consider writing a new number type (for
arbitrary precision, or complex numbers, or whatever), with an "add" method.
The mutable version looks something like this:
public function add($other) {
    $this->value = $this->value + $other;
}
and has to be used like this:
$start = new MyNumber(1);
$end = clone $start;
$end->add(5);
The immutable version might look more like this:
public function add($other) {
    return clone $this with { value: $this->value + $other };
}
and is used like this:
$start = new MyNumber(1);
$end = $start->add(5);
That's much closer to the "$end = $start + 5;" we're used to.
To put it a different way, value types naturally form expressions,
which mutable objects model clumsily. It would be very tedious if we had
to avoid accidentally mutating the speed of light:
$e = (clone $m) * ((clone $c) ** 2);
Using a variable on right-hand side does not automatically create an
alias, so in the above case you don't have to use clone.
Whether or not the type system forced you to, you'd have to use clone if
the values were implemented as mutable. Switching to methods again may
make that clearer:
$c = new MyNumber(299_792_458);
$m = new MyNumber(10);
$e = $m->multiply( $c->square() );
If multiply() and square() are mutating state, rather than returning new
instances, $c is now 89875517873681764, which is going to totally mess
up the universe...
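
To make the difference concrete, here is a minimal sketch of both variants in current PHP (7.4 or later). The class names MutableNumber and ImmutableNumber and the value() accessor are invented for this illustration, and the immutable variant uses plain "new" rather than the proposed clone-with syntax, which is not part of the language.

```php
<?php
declare(strict_types=1);

// Hypothetical mutable number: every operation changes $this in place.
final class MutableNumber
{
    private int $value;

    public function __construct(int $value)
    {
        $this->value = $value;
    }

    public function value(): int
    {
        return $this->value;
    }

    public function square(): self
    {
        $this->value *= $this->value;
        return $this;
    }

    public function multiply(MutableNumber $other): self
    {
        $this->value *= $other->value;
        return $this;
    }
}

// Hypothetical immutable number: every operation returns a new instance.
final class ImmutableNumber
{
    private int $value;

    public function __construct(int $value)
    {
        $this->value = $value;
    }

    public function value(): int
    {
        return $this->value;
    }

    public function square(): self
    {
        return new self($this->value * $this->value);
    }

    public function multiply(ImmutableNumber $other): self
    {
        return new self($this->value * $other->value);
    }
}

// Mutable: $c is squared in place, and $e is the same object as $m.
$c = new MutableNumber(299_792_458);
$m = new MutableNumber(10);
$e = $m->multiply($c->square());

// Immutable: $c2 and $m2 keep their original values.
$c2 = new ImmutableNumber(299_792_458);
$m2 = new ImmutableNumber(10);
$e2 = $m2->multiply($c2->square());
```

With the mutable variant the caller would have to write $m->multiply((clone $c)->square()) on a clone of $m to keep both inputs intact, which is the clumsiness being described.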
Regards,
--
Rowan Tommins
[IMSoP]
2020-12-31 12:37 GMT, Rowan Tommins rowan.collins@gmail.com:
[...]
Yes, of course you can find use-cases where immutability is a better
choice, just like I can find use-cases where (constrained) mutability
is better. The point is not to replace one tool with another, but
rather adding another tool to the toolbox. The web dev discourse is
one-sided with regard to immutability, I think. Wish I had time to
implement a PR to Psalm to show something more concrete... Again, if
you only have a hammer, everything looks like a nail. :)
Olle
The web dev discourse is
one-sided with regard to immutability,
Yes, if you've heard any of the regular whining about PSR-7 being an immutable object you'd think it's one-sided in favor of mutability. ;-)
As you say, the point here is to add tools. Right now, doing immutability in PHP is syntactically clumsy and ugly. We want to fix that, and that has to include some means of "give me a new value based on this existing value but with some difference." (aka, exactly what with-er methods do, although I agree entirely that if you have the option of less generic names, use them).
So, can we get back to the original post, which is proposing specifics of the tools to make that happen? :-) (Asymmetric visibility and clone-with, specifically.)
--Larry Garfield
The web dev discourse is one-sided with regard to immutability,
Yes, if you've heard any of the regular whining about PSR-7 being an immutable object you'd think it's one-sided in favor of mutability.
To characterize it as "whining" is pretentious. The complaints against the incomplete and inconsistent immutability of PSR-7 have merit.
As one of the sponsors of PSR-7, I have come to regard its quasi-immutability as one of its main weaknesses, one that we of FIG should have predicted (but failed to) would be more trouble than it was worth. The fewer people who use it, the better.
--
Paul M. Jones
pmjones@pmjones.io
http://paul-m-jones.com
Modernizing Legacy Applications in PHP
https://leanpub.com/mlaphp
Solving the N+1 Problem in PHP
https://leanpub.com/sn1php
The complaints against the incomplete and inconsistent immutability of PSR-7 have merit.
The big mistake of PSR-7, in my view, is mixing immutable objects with
streams, which are inherently mutable/stateful. I wonder if there are
any lessons to be learned there in terms of what kinds of immutability
the language should encourage / make easy.
For instance, is it better to constrain entire objects to be immutable
rather than individual properties? And should there be restrictions on
what you can declare as immutable, since "immutable resource" is largely
nonsensical?
Or is it rather a reflection that building purely immutable
implementations is hard, and the language needs other facilities
(monads? ownership?) to mix in those parts that don't fit?
Regards,
--
Rowan Tommins
[IMSoP]
I rarely hear that called out as a PSR-7 complaint specifically, in practice, but moving on...
IMO, it's better to put the focus on immutable properties. There are use cases where you want only some properties to be immutable, but not the whole class. If you do want the whole class, then marking all the properties immutable is effectively the same thing.
Though, again, in practice, at least in PHP, I don't think immutable properties should be the goal. Asymmetric visibility lets us build safely immutable-from-the-outside objects that are "up to you" on the inside. I think that gives us a better end result given the nature of PHP. In other languages that wouldn't make as much sense, but PHP is what it is.
Copy on write makes "immutable data structures" not something we need to build explicitly; we get "close enough" for free. If you really wanted to micro-optimize the memory and CPU usage further than that... go build it in Rust instead.
Wrapping immutable behavior around IO is... ugly, gross, and disgusting. :-) Even supposedly Twue Pure languages like Haskell don't actually do that; it just hides the IO mutability in the engine and whistles innocently while muttering "monad" under its breath. :-)
At least in the abstract, immutability and IO would require IO primitives that returned both a read value and a new stream pointer, possibly consuming and destroying the old one. If a stream is seekable, you could conceptually do something like:
$fp = file_open('foo.txt');
[$line, $fp2] = read_line($fp);
In which $fp2 and $fp refer to the same file stream on disk, but their stream pointers are different. In practice you'd likely reuse $fp as the second element of the destructuring assignment and lose the old reference, which is good enough. That's a bit clumsy, though, and is where one might use something monad-based to make the code a bit simpler. I don't know offhand what that would look like, though.
[$line1, $fp2] = read_line($fp);
[$line2, $fp2] = read_line($fp);
[$line3, $fp2] = read_line($fp);
[$line4, $fp2] = read_line($fp);
In the above example, since $fp is never overwritten, all 4 $line variables are the same thing.
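
A minimal sketch of that idea, assuming PHP 8.0 and using an in-memory list of lines in place of a real file: the LineCursor class is hypothetical, and read_line() here only mirrors the name used above rather than any existing function.

```php
<?php
declare(strict_types=1);

// Hypothetical immutable cursor: reading never changes the cursor itself,
// it returns the line together with a new cursor one position further on.
final class LineCursor
{
    /** @param list<string> $lines */
    public function __construct(
        private array $lines,
        private int $position = 0,
    ) {}

    /** @return array{0: ?string, 1: LineCursor} */
    public function readLine(): array
    {
        if (!isset($this->lines[$this->position])) {
            // End of input: no line, and the cursor stays where it is.
            return [null, $this];
        }

        return [
            $this->lines[$this->position],
            new self($this->lines, $this->position + 1),
        ];
    }
}

/** @return array{0: ?string, 1: LineCursor} */
function read_line(LineCursor $fp): array
{
    return $fp->readLine();
}

$fp = new LineCursor(['first', 'second', 'third']);

[$line1, $fp2] = read_line($fp);
[$line2, $fp2] = read_line($fp);  // same $fp again, so the same line again
[$line3, $fp3] = read_line($fp2); // threading the new cursor advances
```

Only the last call moves forward, because it is the only one that passes on the cursor returned by an earlier read.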
If a stream is not seekable, then it would have to consume and destroy $fp in the process (unset it). So:
[$line1, $fp2] = read_line($fp);
[$line2, $fp2] = read_line($fp);
The second line would throw an error that $fp "has been consumed" or something like that. But even that still creates potential for spooky-action-at-a-distance if $fp was passed into a function, gets read in that function, and then the parent call scope has a broken $fp lying around.
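
The consuming variant, and the action-at-a-distance problem it leaves behind, can be sketched the same way. OneShotReader and peek() are hypothetical names, the input is again an in-memory list, and PHP 8.0 is assumed.

```php
<?php
declare(strict_types=1);

// Hypothetical single-use handle over a non-seekable source.
final class OneShotReader
{
    private bool $consumed = false;

    /** @param list<string> $lines */
    public function __construct(private array $lines) {}

    /** @return array{0: ?string, 1: OneShotReader} */
    public function readLine(): array
    {
        if ($this->consumed) {
            throw new LogicException('This handle has been consumed.');
        }
        $this->consumed = true;

        $rest = $this->lines;
        $line = array_shift($rest); // null when there is nothing left

        return [$line, new self($rest)];
    }
}

// A helper that reads from whatever handle it is given.
function peek(OneShotReader $fp): ?string
{
    [$line] = $fp->readLine();
    return $line;
}

$fp = new OneShotReader(['first', 'second']);
[$line1, $fp2] = $fp->readLine();

try {
    $fp->readLine(); // second read on the old handle
    $error = null;
} catch (LogicException $e) {
    $error = $e->getMessage();
}

// Action at a distance: the callee consumes the caller's handle.
$shared = new OneShotReader(['only line']);
$peeked = peek($shared);

try {
    $shared->readLine();
    $callerError = null;
} catch (LogicException $e) {
    $callerError = $e->getMessage();
}
```

Nothing in the caller's own code touched $shared between creating it and reading it, yet the read fails, which is the hazard described above.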
IO is inherently impure and mutable, and always will be. I don't think it's realistic for us to fix that, certainly not in PHP. Instead we should be making it easy to encapsulate the IO into safe corners where you can guard it carefully and keep everything else as pure as reasonable.
All of which is why the scope I'm looking at is not how to make PHP Haskell-esque pure. It's how do we make it more ergonomic for developers to build stateless code in those places where it's reasonable to do so. It's a much less ambitious, but therefore more achievable, scope.
--Larry Garfield
2021-01-02 21:25 GMT, Larry Garfield larry@garfieldtech.com:
[...]
Copy on write makes "immutable data structures" not something we need to
build explicitly; we get "close enough" for free. If you really wanted to
micro-optimize the memory and CPU usage further than that... go build it in
Rust instead.
Correct me if I'm wrong, but copy-on-write is only beneficial with
values, not references to values (objects)? When you clone with a
"with" method, you always write, so you always have to copy. And
objects are already free to pass around without any copying happening
automatically (as is the case with arrays, which have value semantics,
which is why copy-on-write was implemented to not copy needlessly when
an array is only read from).
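
The distinction being drawn can be seen in a few lines of plain PHP. This is only an illustration of the observable semantics; the variable names are arbitrary.

```php
<?php
declare(strict_types=1);

// Arrays have value semantics: assignment is a logical copy, and the engine
// defers the physical copy until one side is written to.
$a = [1, 2, 3];
$b = $a;    // nothing is duplicated yet
$b[] = 4;   // the write separates $b from $a

// Objects are handles: assignment shares the same instance.
$o1 = new stdClass();
$o1->x = 1;
$o2 = $o1;
$o2->x = 2; // visible through $o1 as well

// A with-er style change therefore needs an explicit clone.
$o3 = clone $o1;
$o3->x = 3; // $o1 and $o2 still see 2
```

Note that clone is shallow: it duplicates the object itself, while array and string property values stay shared with the original until one of them is written to.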
Olle
If a stream is not seekable, then it would have to consume and destroy
$fp in the process (unset it). So:
[$line1, $fp2] = read_line($fp);
[$line2, $fp2] = read_line($fp);
The second line would throw an error that $fp "has been consumed" or
something like that. But even that still creates potential for
spooky-action-at-a-distance if $fp was passed into a function, gets
read in that function, and then the parent call scope has a broken $fp
lying around.
Yes, that is where "uniqueness attributes" come in: in Clean, that's basically how I/O looks, but either of those scenarios would produce an error at compile time. The type system includes the constraint that the file handle must not be reachable from anywhere else when passed to the read_line function, whether that's use of the same variable after the call, assignment to an extra variable, capture by some other function, or storage in an array or record.
The same constraint can be added to custom functions, allowing the compiler to reuse the memory for, say, a large array that you're adding an item to. So you still write the code as though it was immutable, and can reason about it that way, but can also prove that it's safe to actually mutate it in place.
Similar things can be done, in a slightly different way, with Rust's ownership/lifetime system: the "borrow checker" proves that the manipulations you're doing are free of "action at a distance" by prohibiting anything that would create ambiguous "ownership".
Regards,
--
Rowan Tommins
[IMSoP]
Hi Rowan and all,
I apologize in advance for the wall-of-text; the short questions lead to long answers.
The complaints against the incomplete and inconsistent immutability of PSR-7 have merit.
The big mistake of PSR-7, in my view, is mixing immutable objects with streams, which are inherently mutable/stateful.
(/me nods along)
Streams were noted as the one explicit exception to immutability in PSR-7. Looking back on it, that should have been a sign to us that immutability should not have been a goal.
(Re: PSR-7 but not re: immutability, there is at least one more mistake born of the original intent, that seemed reasonable at the time but in hindsight was another poor decision: it combines the concerns of the HTTP message with the concerns of middleware communication. This observation is attributable to [deleted] at Reddit ...
https://www.reddit.com/r/PHP/comments/5ojqr0/q_how_many_psr7_implementations_exist_a_zero/dcjxtxl/
... stating "[T]he true intent of PSR-7 [is] not to be an HTTP message standard, but to be middleware IO standard, which happens to be mostly (but not only) an HTTP message." It's an accurate assessment.)
I wonder if there are any lessons to be learned there in terms of what kinds of immutability the language should encourage / make easy.
I think there are; I wrote about at least some of them here ...
https://paul-m-jones.com/post/2016/09/06/avoiding-quasi-immutable-objects-in-php/
... in which I conclude that, if you want to build a truly immutable object in PHP, it appears the best approach is the following:
- Default to using only scalars and nulls as properties.
- Avoid streams as properties; if a property must be a stream, make sure that it is read-only, and that its state is restored each time it is used.
- Avoid objects as properties; if a property must be an object, make sure that object is itself immutable.
- Especially avoid arrays as properties; use only with extreme caution and care.
- Implement __set() to disallow setting of undefined properties.
- Possibly implement __unset() to warn that the object is immutable.
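
The rules above could be sketched roughly as follows. This is an illustrative assumption, not the code of the package mentioned in this message: the Money class, its properties, and withAmount() are invented for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical value object following the rules listed above.
final class Money
{
    // Scalars only, and private so that outside writes go through __set().
    private int $amount;
    private string $currency;

    public function __construct(int $amount, string $currency)
    {
        $this->amount = $amount;
        $this->currency = $currency;
    }

    // Read access from the outside.
    public function __get(string $name)
    {
        if (!property_exists($this, $name)) {
            throw new LogicException("Undefined property: $name");
        }
        return $this->$name;
    }

    // Any outside write, to a defined or undefined property, is refused.
    public function __set(string $name, $value): void
    {
        throw new LogicException(static::class . ' is immutable.');
    }

    public function __unset(string $name): void
    {
        throw new LogicException(static::class . ' is immutable.');
    }

    // Changes produce a modified copy instead.
    public function withAmount(int $amount): self
    {
        $copy = clone $this;
        $copy->amount = $amount;
        return $copy;
    }
}

$price = new Money(100, 'EUR');
$sale = $price->withAmount(80);

try {
    $price->amount = 0; // routed to __set() because the property is private
    $blocked = false;
} catch (LogicException $e) {
    $blocked = true;
}
```

The amount of magic-method boilerplate needed for two scalar properties is a fair picture of why this is called syntactically clumsy elsewhere in the thread.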
I put those conclusions (and others resulting from that article) into an implementation described here:
http://paul-m-jones.com/post/2019/02/04/immutability-package-for-php/
is it better to constrain entire objects to be immutable rather than individual properties?
I would say that the entire object ought to be immutable, for reasons that are difficult for me to articulate.
And should there be restrictions on what you can declare as immutable, since "immutable resource" is largely nonsensical?
I think so, and I note some of those restrictions above. In particular, the immutability package ValueObject further inspects arrays to restrict their values to immutables as well.
--
Paul M. Jones
2021-01-01 19:14 GMT, Larry Garfield larry@garfieldtech.com:
[...]
So, can we get back to the original post, which is proposing specifics of
the tools to make that happen? :-) (Asymmetric visibility and clone-with,
specifically.)
OK!
I like that you connect higher level design patterns with language
design. This is the way to go, IMO. Personally, I'd prefer support for
the Psalm notation @psalm-readonly, which is the same as your
initonly. Clone-with makes sense too, as this construct is already
supported in multiple languages. The exact notation doesn't matter
that much - my personal choice is OCaml {record with x = 10} over JS
spread operator, but OCaml is pretty "wordy" in notation in contrast
to the C tradition that PHP is part of.
Reintroducing "objects that pass by value" is a hard pass from me. The
way forward is immutability and constrained mutability (ownership,
escape analysis, etc). Psalm also supports array shapes - maybe this
can be investigated as an alternative? Since PHP has no tuples.
I'm not convinced the added complexity of asymmetric visibility is
powerful enough to motivate its existence. Feel free to prove me
wrong. :) My choice here would be namespace "internal" (also supported
by Psalm already), but this requires implementation of namespace
visibility, a PR that was abandoned.
And also, happy new year!
Olle
Happy New Year!
I agree that "objects, but passing by value" would not be the right solution. I used to think that would be a good part of the solution, but eventually concluded that it would introduce more complexity, not less. Eventually, everything people wanted to do with objects they'd want to do with "Records" (for lack of a better term), and if they pass by value but are still mutable then you have a weird situation where sometimes changes propagate and some don't (depending on if you have a record or object). Making it easier to use objects in a value-esque way will get us closer to the desired end state.
I think the tldr of my post is this: A single "immutable" flag (whatever it's called) on a class or property would require having lots of holes poked in it in order to make it useful in practice (mostly what "initonly" would do), but those holes would introduce other holes we don't want (cloning an object from the outside when you shouldn't).
Asymmetric visibility, however, doesn't give us true immutability but allows a class author to more easily emulate it "close enough", while also supporting various other use cases. Its gap is the class author, not the entire rest of the programming industry. That makes it much safer, and gets us to objects that are effectively immutable from the outside, which is what's important. (If they're mutable from the inside, either there are use cases for that or we rely on the good behavior of just the class author, who is in the best position to know if the object should be internally immutable or not.)
Pairing that with an easy clone-with-change syntax would allow class authors to construct something that is immutable-in-practice fairly easily, even if it's not Twue Immutability(tm).
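
For comparison, this is roughly what has to be written by hand in PHP 8.0 to get "public read, private write" plus a clone-with-change. The Endpoint class and its private with() helper are hypothetical stand-ins for the proposed features, not their actual syntax.

```php
<?php
declare(strict_types=1);

// Hypothetical class emulating "public read, private write".
final class Endpoint
{
    public function __construct(
        private string $host,
        private int $port = 443,
    ) {}

    // Public read access has to be spelled out as getters.
    public function host(): string
    {
        return $this->host;
    }

    public function port(): int
    {
        return $this->port;
    }

    // A named with-er built on the private helper below.
    public function withPort(int $port): self
    {
        return $this->with(['port' => $port]);
    }

    // Private stand-in for a clone-with expression: only the class itself
    // decides which changes are allowed.
    private function with(array $changes): self
    {
        $copy = clone $this;
        foreach ($changes as $name => $value) {
            $copy->$name = $value;
        }
        return $copy;
    }
}

$https = new Endpoint('example.com');
$alt = $https->withPort(8443);
```

Because with() is private, outside code can only change the object through the with-ers the author chose to expose, which is the "effectively immutable from the outside" property described above.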
--Larry Garfield
2021-01-02 16:06 GMT, Larry Garfield larry@garfieldtech.com:
[...]
A new language feature needs to be both simple and powerful - it's not
enough to be only powerful. A second problem I see is how asymmetric
visibility would affect the readability of a class, putting extra
strain on understanding it. Thirdly, how does PHP differ from FP
languages like OCaml and Haskell in this regard, neither of which uses
visibility in this way? What's acceptable in those languages that
would be unacceptable in PHP?
Olle
I'll disagree slightly. A language feature should introduce more power than it does complexity. Not everything can be made absolutely simple, but sometimes the power a feature offers is worth the complexity. I'd say it should minimize introduced complexity, relative to the power offered. Complexity ideally is super low, but it's never zero simply by virtue of being "one more thing" that developers need to know how to read.
So in this case, we need to compare the power/complexity of asymmetric visibility vs the power/complexity of "immutable... except in these situations." I would argue that asymmetric visibility is more self-documenting, because it states explicitly what those situations are.
The other point is that, as noted, "initonly" creates a gap if you have properties that are inter-dependent. Those then cannot be made public-read, because that would also mean public-clone-with, and thus allow callers to violate property relationships. Asymmetric visibility does not have that problem.
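To make the inter-dependency concern concrete, here is a minimal sketch in PHP 8.0 syntax. IntRange, withMin() and withMax() are hypothetical names chosen for illustration, not taken from the writeup. The two properties must satisfy min <= max, so every change is routed through a method that re-checks the pair. A public clone-with on the same properties would let a caller skip that check.

```php
<?php

// Hypothetical value object whose two properties depend on each other.
final class IntRange
{
    public function __construct(private int $min, private int $max)
    {
        if ($min > $max) {
            throw new InvalidArgumentException("min ($min) must not exceed max ($max)");
        }
    }

    // Reads are open to everyone.
    public function min(): int { return $this->min; }
    public function max(): int { return $this->max; }

    // Changes go back through the constructor, so the invariant is re-checked.
    public function withMin(int $min): self
    {
        return new self($min, $this->max);
    }

    public function withMax(int $max): self
    {
        return new self($this->min, $max);
    }
}

$range = new IntRange(1, 10);
$wider = $range->withMax(20);

$rejected = false;
try {
    $range->withMin(50); // would produce min > max
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

The class itself decides which combinations are legal; callers can read both values but cannot assemble an invalid pair.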
As far as other language comparisons, I've never written in OCaml and can only barely read Haskell. :-) However, the relevant points as I understand them are:
- In strictly functional languages (Haskell, etc.), immutability is assumed by default. So the rest of the syntax, runtime behavior, and community standards are built on that assumption. That's not true in PHP.
- Haskell at least (and I presume other strictly functional languages, although I've not dug into them in any detail at all) knows you're going to be calling a bazillion functions, often recursively, and so the engine can reorder things, execute lazily, skip having a stack entirely, or do other things to make a deeply recursive function design highly performant. That's not the case in PHP, so usually an iterative algorithm is going to be more performant but requires mutating variables. So the engine is optimized for that by default.
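Compare the idealized functional/immutable Fibonacci with its mutable-iterative version:
// Recursive version: no variable is ever reassigned.
function fp_fib(int $n): int {
    return match($n) {
        0, 1 => 1,
        default => fp_fib($n - 1) + fp_fib($n - 2),
    };
}
// Iterative version: same results, but it works by mutating local variables.
function fibonacci_iterative(int $n): int
{
    $previous = 1;
    $current = 1;
    $next = 1;
    // Start at 2 so both versions agree: fib(0) = fib(1) = 1, fib(2) = 2.
    for ($i = 2; $i <= $n; ++$i) {
        $next = $current + $previous;
        $previous = $current;
        $current = $next;
    }
    return $next;
}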
That means you'll often have optimizations where you want to write mutable code in the small in order to create effectively-immutable code at a higher level. We do that now for with-er methods, which mutate the $new object by necessity before returning it, resulting in an object that seems immutable from the outside.
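A minimal sketch of such a with-er method, in PHP 8.0 syntax. The Money class and its method names are hypothetical, used only for illustration. The copy is mutated while it is still private to the method, so outside code only ever sees finished objects.

```php
<?php

// Hypothetical example class, used only to illustrate the with-er pattern.
final class Money
{
    public function __construct(private int $amount, private string $currency) {}

    public function amount(): int { return $this->amount; }
    public function currency(): string { return $this->currency; }

    public function withAmount(int $amount): self
    {
        $new = clone $this;      // duplicate the object
        $new->amount = $amount;  // mutate the copy before anyone else can see it
        return $new;             // callers receive a new, finished object
    }
}

$five = new Money(5, 'EUR');
$ten = $five->withAmount(10);
```

From the outside $five never changes, even though the implementation relies on an ordinary property write.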
Rust is able to do some of the same kind of thing with zero cost abstraction because its variables are mostly-immutable.
- I've read very good arguments that class-level visibility control is a mistake and always was. Visibility should be at the package level, not the object level, which is what Go, Rust, and many other newer languages do. PHP doesn't have packages and it does have class-based visibility, for better or worse, and neither of those is changing any time soon. Package-level visibility provides a different level at which to have the "immutable on the outside, but internally we can optimize things" barrier that is, arguably, better. In PHP, we're stuck with class-level visibility so that's what we've got to work with.
That's all background on why I think, in PHP specifically, letting developers emulate immutability at the level they need rather than forcing it tightly at the language level is going to be a better strategy.
(But, of course, I could be convinced otherwise with sufficient demonstrated use cases, but they'd have to address the drawbacks of "immutable except for" I noted previously.)
It sounds like no one is against clone-with, though? :-) Anyone want to argue for clone-arguments?
--Larry Garfield
2021-01-03 16:55 GMT, Larry Garfield larry@garfieldtech.com:
The other point is that, as noted, "initonly" creates a gap if you have
properties that are inter-dependent. Those then cannot be made public-read,
because that would also mean public-clone-with, and thus allow callers to
violate property relationships. Asymmetric visibility does not have that
problem.
Can you perhaps be a bit more clear on why initonly/readonly would be
a deal breaker? Seems to me like readonly would cover 80% of
use-cases? Which is to make data-value objects humane (and fast, since
you don't need getters anymore) to work with. Seems like you're
focusing too much on an edge case here. Maybe we should list the
possible use-cases? Or at least the main target use-case.
If an object has invariants that need to hold, just throw an exception
in __clone to force use of withX() instead? Or, as you suggested,
improve __clone by giving it arguments?
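A minimal sketch of the "throw in __clone" idea, in PHP 8.0 syntax. Percentage and withValue() are hypothetical names. Under the proposal the property would be a public initonly one; here it is private with a getter, since that syntax does not exist. Note one consequence: once __clone() throws, withValue() can no longer use clone itself and has to go through the constructor.

```php
<?php

// Hypothetical class: plain cloning is forbidden, changes must use withValue().
final class Percentage
{
    public function __construct(private int $value)
    {
        if ($value < 0 || $value > 100) {
            throw new InvalidArgumentException('Value must be between 0 and 100');
        }
    }

    public function value(): int { return $this->value; }

    // Runs on the copy right after duplication; throwing here aborts the clone.
    public function __clone()
    {
        throw new LogicException('Use withValue() instead of clone.');
    }

    // Cannot use clone any more, so it builds the new object via the constructor.
    public function withValue(int $value): self
    {
        return new self($value);
    }
}

$p = new Percentage(40);
$q = $p->withValue(60);

$cloneBlocked = false;
try {
    $copy = clone $p;
} catch (LogicException $e) {
    $cloneBlocked = true;
}
```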
Olle
It took a few days, but I am back with some more concrete examples. I decided to try and convert PSR-7 to the various options considered in my previous post. Here are the results:
https://peakd.com/hive-168588/@crell/object-properties-part-2-examples
Along with an analysis of the pros/cons of each. As shown there, initonly
creates backdoors that make any but the most basic cases untenable.
--Larry Garfield
Thanks for delving into this.
However, one can already have asymmetric visibility in PHP, just declare a
__get() handler.
Sure it is slow due to the VM -> User code -> VM jumps but it is possible.
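For reference, a minimal sketch of the __get() approach, in PHP 8.0 syntax. The User class is a hypothetical example. The property is private, so outside reads are routed through __get(), while outside writes fail because no __set() is defined.

```php
<?php

// Hypothetical class emulating "public read, private write" with __get().
final class User
{
    private string $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    // Called when code outside the class reads an inaccessible property.
    public function __get(string $property): mixed
    {
        if (!property_exists($this, $property)) {
            throw new OutOfRangeException("No property named $property");
        }
        return $this->$property;
    }
}

$user = new User('Ada');
$read = $user->name; // goes through __get()

$writeRejected = false;
try {
    $user->name = 'Grace'; // no __set(), so the private property stays protected
} catch (Error $e) {
    $writeRejected = true;
}
```

Every outside read pays for a userland method call, which is the overhead mentioned above.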
Moreover, asymmetric visibility does not prevent mutating an object by
calling the constructor once again as follows:
$obj->__construct(...$args);
This is IMHO the main reason why we want immutability/init only, not to
reduce getter methods or wither methods, even if this makes some of them
redundant.
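A minimal sketch of the constructor re-invocation, in PHP 8.0 syntax. The Token class is hypothetical. It has no setters and its property is private, yet calling __construct() a second time changes its state.

```php
<?php

// Hypothetical class with no setters and a private property.
final class Token
{
    public function __construct(private string $value) {}

    public function value(): string { return $this->value; }
}

$token = new Token('abc');
$before = $token->value();

// Legal today: a public __construct() can be called like any other method.
$token->__construct('xyz');
$after = $token->value();
```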
Also clone-with {} and clone "arguments" could very well be combined by
having the props list being passed to the clone-with instruction as a
$cloneContext array only available in __clone(), similar to how
$http_response_header is populated. [1]
The advantage I see in such a construct is that clone-with can handle any
type concerns (single/union, enums, literals, typed arrays, generics
if/when we get them) for the properties before passing them even to
__clone().
If no __clone() handler is defined then it can just assign them but if
there needs to be one to handle extra validation, such as the type not
being sufficient or a property being dependent on another you are already
guaranteed that the property only needs minimal extra validation.
As such I still believe immutability and asymmetric visibility are
orthogonal features which might be related but fundamentally solve
different problems.
One is about data integrity, the other is about removing getters/setters.
Best regards,
George P. Banyard
[1] https://www.php.net/manual/en/reserved.variables.httpresponseheader.php
Moreover, asymmetric visibility does not prevent mutating an object by
calling the constructor once again as follows:
$obj->__construct(...$args);
That's pretty trivial to work around: mark the constructor private and provide one or more public static methods that call it. That's actually a pretty common and useful design in its own right.
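A minimal sketch of that workaround, in PHP 8.0 syntax. SealedToken and fromString() are hypothetical names. With a private constructor, outside code can no longer call __construct() on an existing instance.

```php
<?php

// Hypothetical class: private constructor plus a public static factory.
final class SealedToken
{
    private function __construct(private string $value) {}

    public static function fromString(string $value): self
    {
        return new self($value);
    }

    public function value(): string { return $this->value; }
}

$token = SealedToken::fromString('abc');

$reinitialised = true;
try {
    $token->__construct('xyz'); // private method, not callable from here
} catch (Error $e) {
    $reinitialised = false;
}
```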
It's also something that could probably be banned at the language level. I seem to remember it being discussed before, but the details aren't quite trivial because you need to allow parent::__construct etc. I'd rather spend the time to work out those details than design the rest of the language around it being possible, if it's really that much of an issue.
Regards,
--
Rowan Tommins
[IMSoP]
Thanks for delving into this.
However, one can already have asymmetric visibility in PHP, just declare a
__get() handler.
Sure it is slow due to the VM -> User code -> VM jumps but it is possible.
Many things are technically possible, but only in lame ways. The __get() callback is one such lame way of doing many things. The existence of a lame workaround for something hasn't stopped us from improving the developer experience of the language before nor should it now.
Moreover, asymmetric visibility does not prevent mutating an object by
calling the constructor once again as follows:
$obj->__construct(...$args);
I agree with Rowan's point here. This is a bug in the language. I've never actually seen that bug exploited in the wild, but the answer here is to fix that bug, not to use it as justification to not improve the language.
This is IMHO the main reason why we want immutability/init only, not to
reduce getter methods or wither methods, even if this makes some of them
redundant.
The "main reason" for immutability depends on who you ask. :-) My original post laid out some of the main arguments I've seen. Which one is the "main reason" is subjective and I don't think there's any clear consensus on it. Fortunately, if we do it right we can all get what we want out of it and it doesn't matter which benefit was more important in hindsight.
Also clone-with {} and clone "arguments" could very well be combined by
having the props list being passed to the clone-with instruction as a
$cloneContext array only available in __clone(), similar to how
$http_response_header is populated. [1]
The advantage I see in such a construct is that clone-with can handle any
type concerns (single/union, enums, literals, typed arrays, generics
if/when we get them) for the properties before passing them even to
__clone().
If no __clone() handler is defined then it can just assign them but if
there needs to be one to handle extra validation, such as the type not
being sufficient or a property being dependent on another you are already
guaranteed that the property only needs minimal extra validation.
That would entail assigning the properties first, then allowing __clone() to override if desired, if I understand you correctly. That means the object is in an invalid state at least for a time. I'm not wild about that. It would also change the logic of when __clone() happens, which right now is immediately after the object is duplicated. What you're suggesting is changing it to:
- Duplicate object
- Assign with'ed properties
- Call __clone(), which could throw
To be fair, I didn't consider where the with'ed properties would be assigned relative to __clone() in my writeup. (I should perhaps have done so.)
But that still doesn't resolve the issue of all the validation being shoved into one big method. It only removes the "just assign it blindly" default case. All of the other validation is still needed, and still just as fugly.
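The three-step ordering can be approximated in userland today. This is only a sketch in PHP 8.0 syntax: Version, cloneWith() and validate() are hypothetical names standing in for "clone ... with" and a validating __clone(), which do not exist in this form.

```php
<?php

// Hypothetical class emulating "clone with, then validate" in userland.
final class Version
{
    public function __construct(private int $major, private int $minor)
    {
        $this->validate();
    }

    public function major(): int { return $this->major; }
    public function minor(): int { return $this->minor; }

    public function cloneWith(array $changes): self
    {
        $new = clone $this;                       // 1. duplicate object
        foreach ($changes as $name => $value) {
            if (!property_exists($new, $name)) {
                throw new InvalidArgumentException("Unknown property $name");
            }
            $new->$name = $value;                 // 2. assign with'ed properties
        }
        $new->validate();                         // 3. validate, which could throw
        return $new;
    }

    // All cross-property checks end up in this one method.
    private function validate(): void
    {
        if ($this->major < 0 || $this->minor < 0) {
            throw new InvalidArgumentException('Version parts must be non-negative');
        }
    }
}

$v1 = new Version(1, 2);
$v2 = $v1->cloneWith(['minor' => 3]);

$rejected = false;
try {
    $v1->cloneWith(['major' => -1]);
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

Between steps 2 and 3 the copy can briefly hold invalid values, and every rule lives in the single validate() method, which are the two drawbacks described above.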
As such I still believe immutability and asymmetric visibility are
orthogonal features which might be related but fundamentally solve
different problems.
One is about data integrity, the other is about removing getters/setters.
Disagree. Asymmetric visibility achieves nearly all the same end results as initonly, but without introducing data integrity problems or fugly hacks (__clone()) to resolve them. The main takeaway from my experimentation, as I see it, is that initonly offers very little in practice in the way of data integrity guarantees beyond what asymmetric visibility does. Only in the trivial case where a property is fully validated by the type system automatically and has no inter-dependencies does it have any benefit. And the benefit is, actually, only moving the clone with statement from inside a single-expression method (which I'm hoping to make simpler, as noted) to the calling code. I'm not sure that's always a net win.
--Larry Garfield
Yes, of course you can find use-cases where immutability is a better
choice, just like I can find use-cases where (constrained) mutability
is better. The point is not to replace one tool with another, but
rather adding another tool to the toolbox. The web dev discourse is
one-sided with regard to immutability, I think. Wish I had time to
implement a PR to Psalm to show something more concrete... Again, if
you only have a hammer, everything looks like a nail. :)
Certainly, I didn't mean to say that immutability was always the perfect
choice. I think it's popular because it's an easy hammer to borrow from
the fashionable Functional Programming toolbox - you can get a lot of
its advantages without much support from the language, and it genuinely
fits a lot of use cases encountered in high-level programming.
Where ownership concepts seem to shine is where immutability is either
impossible (e.g. consuming from a network stream or an event queue) or
otherwise undesirable (e.g. working with large amounts of data, or
tightly optimised code).
I read a bit about Uniqueness Attributes in Clean [1] and it seems they
are implemented there so that the user can treat everything as
immutable, but the compiler can safely mutate underlying structures.
So in that implementation at least, a "mutable record" would in fact be
implemented with the equivalent of "clone ... with", so that it appeared
from the outside to return a new instance each time.
It's certainly an interesting concept, particularly for the I/O case
(where immutability is genuinely not an option) but how easy it would be
to retro-fit to a dynamic language like PHP I'm not sure.
[1] https://cloogle.org/doc/#_9
Regards,
--
Rowan Tommins
[IMSoP]
Thanks for the nice write up Larry!
Is there a reason you didn't mention the proposal for immutable classes?
(probably because it never went into a final RFC)
Two main reasons:
- It's not been discussed recently (see how old the dates are on those messages), so I wasn't thinking about it.
- An immutable class would in all practicality be the same as a class where all the properties are initonly (or writeonce, but that was already rejected). So any arguments for/against initonly apply in aggregate to an immutable class.
--Larry Garfield
2020-12-29 15:38 GMT, Larry Garfield larry@garfieldtech.com:
An immutable class would in all practicality be the same as a class where
all the properties are initonly (or writeonce, but that was already
rejected). So any arguments for/against initonly apply in aggregate to an
immutable class.
Instead of shoe-horning everything into the PHP object system, did
anyone consider adding support for records instead, which would always
be immutable, and could support the spread operator for cloning-with
similar as in JavaScript or OCaml? They could be based on PHP arrays
and thus be passed by value.
Olle
While we could create a brand new "record" or "struct" type, I think
there are a few reasons to think it would end up looking more like
objects than arrays:
- we have an established syntax for declaring types of object (class Foo {...}), and none for declaring types of array
- the 'bar' in $foo['bar'] is an expression, implying dynamic options; the bar in $foo->bar is a bare identifier, implying statically defined options
- similarly, we have a syntax for creating object instances, with statically analysable members: new Foo(bar: 42)
The spread operator could be made to work with either style, if we
preferred it to using "clone ... with ...":
- ['bar'=>69, ...$existingFoo]
- new Foo(bar: 69, ...$existingFoo)
However, arrays arguably already have a clone-with syntax, more normally
thought of as "copy-on-write". Rather than "mutable with special logic
to pass and assign by value", I think you can model their behaviour as
"immutable with special logic to clone with modifications":
$foo = ['bar'=>42, 'baz'=>101];
// lazy assignment by value is indistinguishable from assignment by pointer
$newFoo = $foo;
// $newFoo is a modified clone of $foo
$newFoo['bar'] = 69;
// mutating $newFoo in place is indistinguishable from creating and assigning another modified clone
$newFoo['bar'] = 72;
In theory, "records" could have this ability with object-like syntax:
$foo = new Foo(bar: 42, baz: 101);
$newFoo = $foo;
// $newFoo is a modified clone
$newFoo->bar = 69;
// can be optimised as in-place modification, but conceptually cloning again
$newFoo->bar = 72;
In the simple case, that's equivalent to a clone-with:
$foo = new Foo(bar: 42, baz: 101);
$newFoo = clone $foo with { bar: 69 };
// can probably be optimised the same way as the above examples
$newFoo = clone $newFoo with { bar: 72 };
It would allow more complex modifications, though, such as deep
modification:
$foo = new Foo(bar: new Bar(name: 'Bob'));
$newFoo = $foo;
$newFoo->bar->name = 'Robert';
That last line would do the same as this:
$newFoo = clone $newFoo with { bar: clone $newFoo->bar with { name: 'Robert' }};
How desirable that is, and how it fits with the use cases in Larry's
post, I'm not sure.
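For comparison, a sketch of how the same deep modification has to be spelled with today's objects, in PHP 8.0 syntax. The Foo and Bar classes and their with-er methods are hypothetical stand-ins for the record types in the examples.

```php
<?php

// Hypothetical nested value objects with hand-written with-er methods.
final class Bar
{
    public function __construct(private string $name) {}

    public function name(): string { return $this->name; }

    public function withName(string $name): self
    {
        $new = clone $this;
        $new->name = $name;
        return $new;
    }
}

final class Foo
{
    public function __construct(private Bar $bar) {}

    public function bar(): Bar { return $this->bar; }

    public function withBar(Bar $bar): self
    {
        $new = clone $this;
        $new->bar = $bar;
        return $new;
    }
}

$foo = new Foo(new Bar('Bob'));

// The nesting has to be unwound and rebuilt by hand, one level at a time.
$newFoo = $foo->withBar($foo->bar()->withName('Robert'));
```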
Regards,
--
Rowan Tommins
[IMSoP]
2020-12-29 22:43 GMT, Rowan Tommins rowan.collins@gmail.com:
How desirable that is, and how it fits with the use cases in Larry's
post, I'm not sure.
Good breakdown. One benefit of records is that they can be
structurally typed (instead of nominally, as classes are), but that's
probably never going to happen in PHP. :) Perhaps a readonly
attribute is best for now? Compare with the annotation supported by
Psalm: https://psalm.dev/docs/annotating_code/supported_annotations/#psalm-readonly-and-readonly
Olle
On 28.12.20 at 21:23, Larry Garfield wrote:
The full writeup is here:
https://peakd.com/hive-168588/@crell/object-properties-and-immutability
A really nice writeup and interesting to read.
But I have a question:
We then end up with the following combinations:
- public read, private write
- public read, private read, init write
- public none, private write
- public none, private read
- public none, private read, init write
What is the difference between
(a) "public none, private read" and
(b) "public none, private read, init write"?
When will (a) be initialized?
And if there is really a useful case for (a) why is there no "public
read, private read"?
Regards
Thomas
On Mon, Dec 28, 2020 at 9:24 PM Larry Garfield larry@garfieldtech.com
wrote:
The full writeup is here:
https://peakd.com/hive-168588/@crell/object-properties-and-immutability
Thanks for the analysis Larry! I want to add a couple of thoughts from my
side.
First of all, I think it's pretty clear that "asymmetric visibility" is the
approach that gives us most of what we want for the least amount of effort.
Asymmetric visibility has clear semantics, is (presumably) trivial to
implement, and gives immutability guarantees that are "good enough" for
most practical purposes. It's the pragmatic choice, and PHP is all about
pragmatism...
That said, I don't think that asymmetric visibility is the correct solution
to this problem space -- I don't think asymmetric visibility is ever (or
only very rarely) what we actually want, it's just a good enough
approximation. Unfortunately, the alternatives are more complex, and we
have a limited budget on complexity.
Here are the pieces that I think would make up a proper solution to this
space:
- initonly properties. This is in the sense of the previous "write once
properties" proposal, though initonly is certainly the better name for the
concept. Initonly properties represent complete immutability both inside
and outside the class, and I do believe that this is the most common form
of immutability needed (if it is needed at all).
Of course, as you correctly point out, initonly properties are incompatible
with wither patterns that rely on clone-then-modify implementations. I
think that ultimately, the "wither pattern" is an artifact of the fact that
PHP only supports objects with by-handle semantics. The "wither pattern"
emulates objects with by-value semantics, in a way that is verbose and
inefficient.
I do want to point out that your presentation of copy-on-write when it
comes to withers is not entirely correct: When you clone an object, this
will always result in a full copy of the object, including all its
properties. If you call a sequence of 5 wither methods, then this will
create five objects and perform a copy of all properties every time. There
is really no copy-on-write involved here, apart from the fact that property
values (though not the property storage) can still be shared.
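To make the cost concrete, here is a minimal sketch of a wither chain. The WitherMessage class and its static clone counter are hypothetical names made up for illustration, not PSR-7 code; the counter simply records how often __clone runs.

```php
<?php
declare(strict_types=1);

// Hypothetical PSR-7-style message. The class name and the clone counter are
// illustrative assumptions, not part of PSR-7 or of any proposal in this thread.
final class WitherMessage
{
    /** Number of clone operations performed so far. */
    public static int $clones = 0;

    public function __construct(
        private string $method = 'GET',
        private string $version = '1.1',
        private array $headers = [],
    ) {}

    public function __clone()
    {
        // Called once per clone, after every property slot has been copied.
        self::$clones++;
    }

    public function withMethod(string $method): static
    {
        $new = clone $this;
        $new->method = $method;
        return $new;
    }

    public function withVersion(string $version): static
    {
        $new = clone $this;
        $new->version = $version;
        return $new;
    }

    public function withHeader(string $name, string $value): static
    {
        $new = clone $this;
        $new->headers[$name] = $value;
        return $new;
    }

    public function getMethod(): string
    {
        return $this->method;
    }
}

$original = new WitherMessage();

// Five wither calls: five clones, four of which are discarded immediately.
$final = $original
    ->withMethod('POST')
    ->withVersion('2.0')
    ->withHeader('Accept', 'text/html')
    ->withHeader('Host', 'example.com')
    ->withHeader('X-Trace', 'abc');
```

After the chain, WitherMessage::$clones is 5 and $original is untouched. Only the values behind the properties (for example the headers array) can be shared between the copies; the property table itself is duplicated on every step.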
- This brings us to: Objects with by-value semantics. This was discussed
in the thread, but I felt like it was dismissed a bit prematurely.
Ultimately, by-value semantics for objects is what withers are emulating.
PSR-7 isn't "immutable", it's "mutable by-value". "Immutable + withers" is
just a clumsy way to emulate that. If by-value objects were supported, then
there would be no need for wither methods, and the "clone-then-modify"
incompatibility of initonly properties would not be a problem in practice.
You just write $request->method = 'POST' and this will either efficiently
modify the request in-place (if you own it) or clone it and then modify it
(if it is shared).
Another area where by-value objects are useful is data structures. PHP's
by-value array type is probably one of those few instances where PHP got
something right in a major way, that many other languages got wrong. But
arrays have their own issues, in particular in how they try to service both
lists and dictionaries at the same time, and fail where those intersect
(dictionaries with integer keys or numeric string keys). People regularly
suggest that we should be adding dedicated vector and dictionary objects,
and one of the issues with that is that the resulting objects would follow
the usual by-handle semantics, and would not serve as a mostly drop-in
replacement for arrays. It is notable that while HHVM/Hack initially had
vec and dict object types, they later created dedicated by-value types for
these instead.
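A small sketch of the difference, using only built-in types. ArrayObject stands in here for a hypothetical dedicated vector object; the function and variable names are illustrative assumptions.

```php
<?php
declare(strict_types=1);

// Arrays are passed by value: the callee works on its own (lazily made) copy.
function appendToArray(array $list): array
{
    $list[] = 'added';
    return $list;
}

// Objects are passed by handle: caller and callee share one object.
function appendToObject(ArrayObject $list): void
{
    $list[] = 'added';
}

$array = ['a', 'b'];
$longer = appendToArray($array);   // $array itself still has two elements

$object = new ArrayObject(['a', 'b']);
appendToObject($object);           // $object now has three elements

// The list/dictionary overlap mentioned above: a numeric string key such as
// '1' is silently cast to the integer 1, while '01' stays a string.
$dict = ['1' => 'one', '01' => 'zero-one'];
$keys = array_keys($dict);
```

This is why an object-based vector is not a drop-in replacement for an array: code that relied on the callee being unable to change the caller's data would silently change behaviour.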
- Property accessors, or specifically for your PSR-7 examples, guards. The
__clone related issues you're mostly dealing with in your examples are
there because you need to replicate the validation logic in multiple
places. If instead you could write something like
    public string $version {
        guard($version) {
            if (!in_array($version, ['1.1', '1.0', '2.0'])) {
                throw new InvalidArgumentException;
            }
        }
    }
then this would ensure consistent enforcement of the property invariants
regardless of how it is set.
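For comparison, the closest thing available today is to route every write through one private setter. The sketch below assumes a hypothetical VersionedRequest class; it is not taken from PSR-7 or from the accessors draft, and it only shows where the validation has to be wired in by hand.

```php
<?php
declare(strict_types=1);

// Hypothetical class: emulates a guard with a single private setter that the
// constructor and every wither must remember to call.
final class VersionedRequest
{
    private string $version;

    public function __construct(string $version = '1.1')
    {
        $this->setVersion($version);
    }

    public function withVersion(string $version): static
    {
        $new = clone $this;
        $new->setVersion($version);
        return $new;
    }

    public function getVersion(): string
    {
        return $this->version;
    }

    private function setVersion(string $version): void
    {
        // The one place where the invariant lives.
        if (!in_array($version, ['1.0', '1.1', '2.0'], true)) {
            throw new InvalidArgumentException("Unsupported version: $version");
        }
        $this->version = $version;
    }
}

$request = (new VersionedRequest())->withVersion('2.0');
```

The invariant holds only as long as no method inside the class assigns $this->version directly. A guard attached to the property itself would remove that caveat.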
Circling back, while I think that a combination of these features would be
the "proper" solution to the problem, they also add quite a bit of
complexity. Despite what I say above, I'm very much not convinced that
adding support for by-value objects is a good idea, due to the confusion
that two different object semantics could cause, especially if writing
operations on them are not syntactically distinct.
I've written up an initial draft for property accessors at
https://wiki.php.net/rfc/property_accessors, but once again I get the
distinct impression that this is adding a lot of language complexity, that
is possibly not justified (and it will be more complex once inheritance is
fully considered).
Overall, I'm still completely unsure what we should be doing :)
Regards,
Nikita
Le 03/02/2021 à 15:14, Nikita Popov a écrit :
I've written up an initial draft for property accessors at
https://wiki.php.net/rfc/property_accessors, but once again I get the
distinct impression that this is adding a lot of language complexity, that
is possibly not justified (and it will be more complex once inheritance is
fully considered).
Overall, I'm still completely unsure what we should be doing :)
Regards,
Nikita
Hello,
I love pretty much everything in this draft; it will allow writing
value types in a very concise manner.
A few notes, though:
- The visibility modifier (public, protected, private) is redundant for
properties with asymmetric visibility directives and could be dropped
entirely (I don't like var, but if it's necessary to keep it, OK). I
don't know if the current parser will let you do that easily, but
that would be a huge win for developers (even more concise code).
- I love the fact that it can be combined with constructor promotion.
- I love the guard and lazy features as proposed.
Regarding inheritance, obviously the most important point is that
interface or class contracts should not be changed, so you may open a
closed property for reading, but you may not close a readable property,
for example. This is true for writing as well, of course. You're saying,
basically, that a property with a get accessor would be passed by value,
and thus it would forbid indirect access such as adding values to an
array? But what if the compiler could detect that get; is just a plain
get with no function behind it, compile opcodes as if it were a normal
property (I don't know Zend internals at all, just guessing here), and
consider any more complex getter to be incompatible? I guess that in
languages such as C# that implement such an asymmetric visibility
mechanism, they always return object references, so this kind of problem
just doesn't exist.
Thank you so much for this draft, I love the path it follows.
Regards,
--
Pierre
Thanks for the feedback, Nikita. And yes, on the larger scale I'm not sure what the perfect solution is either. :-)
Regarding your comments first:
I've thought about "record" types in the past (by-value formal structures), which would go back to by-value semantics. However, every time I think about what features we'd want them to have, I always end up back at "every possible feature of classes someone will want on records," at which point we're just double-implementing classes on a new zval type. That seems ungood. (Imagine figuring out how to do generics, and then needing to do them twice.) The alternative would be some kind of "by-value-passing" flag on class definitions, something like "byval class Foo { ... }", but I have absolutely no idea if that's even possible (at the engine level), much less desirable (at the API predictability level). To some extent you want to be able to predict in advance whether a variable will pass by value or by reference or by handle so you know what it's safe to do to it.
It's also trivial to bypass by-value by passing a value by reference, thus losing all the safety that would give you. See also: Any Drupal version in the last 15 years, that loves passing around enormous arrays by reference so they can be modified.
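A minimal sketch of that bypass with plain arrays; the function names and the settings array are made up for illustration.

```php
<?php
declare(strict_types=1);

// By-value parameter: the caller's array cannot be affected.
function withDebugFlag(array $settings): array
{
    $settings['debug'] = true;
    return $settings;
}

// By-reference parameter: the "&" hands the callee the caller's variable,
// so the by-value guarantee is gone.
function addDebugFlag(array &$settings): void
{
    $settings['debug'] = true;
}

$config = ['cache' => true];

$copy = withDebugFlag($config);   // $config is unchanged here
$snapshot = $config;              // remember the state before the by-ref call

addDebugFlag($config);            // $config now contains 'debug' as well
```

Nothing at the call site of addDebugFlag() signals that $config may be rewritten, which is the predictability concern.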
The question, though, is whether we want immutable values or passing-safe values, which are not quite the same thing. You assert above that what we really want are passing-safe values. I'm... not actually sure myself which one is the true desire since they've been coupled for so long, other than that modify-in-place structures don't always have good ergonomics. (I personally prefer chaining set or with methods over repeating an object name over and over again to set a value. I'm sure others will disagree.)
I will note that even if we were to have a record type of some kind, initonly values still pose a challenge if they're derived from some other value that may change, or in cases where an object still can and should be cloned for reasons other than emulating immutability. Also, there are other reasons to implement vec and dict in the engine beyond just enforcing immutability, although I would want to do that even if they were done with a record type.
Regarding your property accessor proposal:
I've always said that initonly, asymmetric visibility, etc. are all stepping stones toward full property accessors. My understanding was that they failed before for performance reasons. If you believe those are solvable in a way that would let us skip the intermediary steps and go straight toward the full package (which would effectively let us emulate all of these other features we've been discussing), I am so totally here for it.
I adore the idea of a guard method on properties. That would be useful in a huge number of places, even if nothing else makes it in. I have only 2 concerns about them:
- I can see them being used a ton on promoted properties, so guard clauses being incompatible with promoted properties would be extremely sad. We should spend some time exploring ways to make them play nice together.
- There are likely a huge number of cases that can be reduced to a declarative syntax, which could then be parsed, extracted, and used for creating tests, creating JS equivalents for automated form validation, and so on. A method wouldn't support that, but offers more flexibility.
Which... Those two together just gave me an idea. Make it an attribute.
    class Foo {
        public function __construct(
            #[GuardMethod('startsWithNumber')]
            #[GuardRegex('[0-9]')]
            public string $bar
        ) { }
        public function startsWithNumber($val): bool { ... }
    }
That would allow some validation to be baked in, in a declarative form, support arbitrary method guards, and move the code away so that it's compatible with constructor promotion. It could potentially be implemented in a way that is user-space extensible, too. I think this is worth investigating further, even independently of everything else we're discussing.
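As a rough user-space approximation of the idea, something like the following already works with PHP 8.0 attributes and reflection. Everything here is an assumption for illustration: the GuardRegex attribute class, the enforceGuards() helper, and the GuardedFoo class are hypothetical, the pattern is a full PCRE pattern with delimiters, only public properties are checked, and the check runs only when the constructor calls the helper, not on later writes.

```php
<?php
declare(strict_types=1);

// Hypothetical attribute: carries a PCRE pattern the property value must match.
#[Attribute(Attribute::TARGET_PROPERTY | Attribute::TARGET_PARAMETER)]
final class GuardRegex
{
    public function __construct(public string $pattern) {}
}

// Hypothetical helper: checks every public property that carries a GuardRegex.
function enforceGuards(object $object): void
{
    $reflection = new ReflectionObject($object);
    foreach ($reflection->getProperties(ReflectionProperty::IS_PUBLIC) as $property) {
        foreach ($property->getAttributes(GuardRegex::class) as $attribute) {
            $guard = $attribute->newInstance();
            $value = (string) $property->getValue($object);
            if (preg_match($guard->pattern, $value) !== 1) {
                throw new InvalidArgumentException(sprintf(
                    'Property $%s does not match %s',
                    $property->getName(),
                    $guard->pattern
                ));
            }
        }
    }
}

final class GuardedFoo
{
    public function __construct(
        // On a promoted parameter the attribute applies to the property too.
        #[GuardRegex('/^[0-9]/')]
        public string $bar,
    ) {
        // Promoted properties are already assigned when the body runs.
        enforceGuards($this);
    }
}

$foo = new GuardedFoo('1abc');
```

Because the rule lives in metadata, a tool could read the same attribute to generate tests or client-side validation. An engine-level version would presumably also cover writes after construction, which this sketch does not.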
I also love lazy/init/whatever properties, as that gives us the self-memoization that no other option discussed has managed. As noted above, my only concern is what happens to it if some other value it is computed from changes, or the object is cloned, etc. One viable answer is "if it's not safe to memoize then just don't do that, dummy," which may be the answer, but as we all know PHP developers in the wild do not always think such things through. (And it's such a tempting feature that it may get overused, and get people into trouble.) Again, possibly not something we could realistically resolve but worth calling out.
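To illustrate the stale-cache concern with today's PHP, here is a hand-rolled memoized value. The MemoRectangle class and its resetCacheOnClone switch are hypothetical and exist only so both behaviours can be shown side by side; this is not how lazy properties in the draft would be written.

```php
<?php
declare(strict_types=1);

// Hypothetical class with a manually memoized derived value.
final class MemoRectangle
{
    private ?int $area = null;

    public function __construct(
        private int $width,
        private int $height,
        private bool $resetCacheOnClone,
    ) {}

    public function area(): int
    {
        // Compute once, then reuse the cached result.
        return $this->area ??= $this->width * $this->height;
    }

    public function withWidth(int $width): static
    {
        $new = clone $this;
        $new->width = $width;
        return $new;
    }

    public function __clone()
    {
        // Without this reset the clone inherits the cached area.
        if ($this->resetCacheOnClone) {
            $this->area = null;
        }
    }
}

$stale = new MemoRectangle(2, 3, false);
$stale->area();                        // caches 2 * 3
$staleWider = $stale->withWidth(10);   // carries the old cached area along

$fresh = new MemoRectangle(2, 3, true);
$fresh->area();
$freshWider = $fresh->withWidth(10);   // cache cleared in __clone
```

Here $staleWider->area() still reports the area of the original rectangle, while $freshWider->area() is recomputed. An engine-level lazy property would need some answer for the first case.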
The descriptions around the backing property are a bit clunky. I think I follow/agree with what you're describing, but the way it's described with the underscore property is a bit misleading.
I also think that making $value a magic name is not a good approach. That's not at all self-evident from context; you just have to know that is a magic value now. I would alternatively propose using the property name itself. So:
    class Test {
        public string $prop {
            get { return $this->prop; }
            set { $this->prop = $prop; }
        }
    }
That way the name is predictable and logical.
I don't have any good ideas on the inheritance or references front at the moment.
--Larry Garfield