Hello, internals!
Today I was looking at PSR-7 and discovered this part of code:
$body = new StringStream(json_encode(['tasks' => [
    'Code',
    'Coffee',
]]));
$request = $baseRequest
    ->withUri($uri->withPath('/tasks/user/' . $userId))
    ->withMethod('POST')
    ->withHeader('Content-Type', 'application/json')
    ->withBody($body);
$response = $client->send($request);
What is wrong here? Emulated immutability. All methods will create a
separate instance of the request, so
$baseRequest->withUri()->withMethod()->withHeader()->withBody() will create
a total of 5 object instances. It's a memory overhead and time consumption
for each method call.
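To make the instance count concrete, here is a minimal sketch of the clone-per-call pattern. The Message class is a hypothetical stand-in and not the real PSR-7 interface; the static counter exists only for the demonstration.

```php
<?php
// Hypothetical stand-in for a PSR-7 style message (assumption, not the real interface).
final class Message
{
    public static $instances = 0; // counts every object created, including clones

    private $method = 'GET';
    private $headers = [];

    public function __construct()
    {
        self::$instances++;
    }

    public function __clone()
    {
        self::$instances++;
    }

    public function withMethod($method)
    {
        $new = clone $this;      // leave the receiver untouched
        $new->method = $method;
        return $new;
    }

    public function withHeader($name, $value)
    {
        $new = clone $this;
        $new->headers[$name] = [$value];
        return $new;
    }

    public function getMethod()
    {
        return $this->method;
    }
}

$base = new Message();
$request = $base
    ->withMethod('POST')
    ->withHeader('Content-Type', 'application/json');

echo Message::$instances, "\n"; // 3: the base object plus one clone per with*() call
echo $base->getMethod(), "\n";  // GET: the base object was not modified
```

Each with*() call in the chain adds exactly one object, so the cost grows linearly with the length of the chain.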
What I want to discuss is a true immutability flag for variables and
parameters. There are a lot of languages that use "final" or "const"
keywords to prevent modification of variables. We can use this approach by
extending the language syntax as follows:
// Defines an immutable variable that cannot be modified.
const $text = "Some message";
$text .= "foo"; // Exception: can not change the immutable variable

class Test {
    // The "final" keyword defines an immutable property. It isn't
    // initialized yet, so exactly one assignment is allowed; after that
    // it is immutable.
    public final $value;

    // The argument is declared as immutable with the "const" keyword and
    // can not be changed in the function body.
    public function __construct(const $value)
    {
        $this->value = $value;
        // $value = 456; // will throw an exception
    }
}

$obj = new Test(42);
echo $obj->value; // Direct access to the property, result is 42
$obj->value = 100; // Exception: can not change the immutable (final) property
An immutable variable can not be made mutable, but it can be assigned to a
separate mutable variable (copy on write). Example:
const $text = "Some message";
$anotherText = $text;
$anotherText .= "modified";
On the engine level these immutable variables and parameters can be
effectively optimized by using IS_TYPE_IMMUTABLE for zval. This can result
in achieving better performance and various JIT-optimizations.
Thoughts?
Hi Alexander,
> What is wrong here? Emulated immutability. All methods will create a
> separate instance of the request, so
> $baseRequest->withUri()->withMethod()->withHeader()->withBody() will create
> a total of 5 object instances. It's a memory overhead and time consumption
> for each method call.
Yes, I also think this is unfortunate.
> What I want to discuss is a true immutability flag for variables and
> parameters. There are a lot of languages that use "final" or "const"
> keywords to prevent modification of variables. We can use this approach by
> extending the language syntax as follows:
This approach wouldn’t solve the problem you’re describing. You still need to produce a new request object, because the request object is immutable. The mutability of its properties isn’t the issue.
If you want to avoid creating five different objects, you’d need to implement value-type objects that are passed by value and use copy-on-write. Basically, you’d need to re-add PHP 4 style classes.
Thanks.
Andrea Faulds
http://ajf.me/
2015-01-30 18:07 GMT+03:00 Andrea Faulds ajf@ajf.me:
> This approach wouldn’t solve the problem you’re describing. You still
> need to produce a new request object, because the request object is
> immutable. The mutability of its properties isn’t the issue.
Thank you for this answer. However, please pay attention to the last part
of my suggestion. To create a mutable copy of an immutable object, it would
be enough to assign it to a local mutable variable (this would create a
copy/clone) and adjust it for your needs.
Example:
public function handleRequest(const Request $request) { // Mark $request as immutable for the body
    // We can not change the $request variable itself,
    // but if needed, we can initialize a copy of it.
    // $request->uri = 'https://evil.com'; // An exception would be thrown here
    $mutableSubRequest = $request; // Copy of the immutable object, or maybe a direct copy-on-write?
    $mutableSubRequest->uri = 'https://example.com'; // It's OK to modify our local copy of the object
    $this->kernel->handleRequest($mutableSubRequest);
}
> What is wrong here? Emulated immutability. All methods will create a
> separate instance of the request, so
> $baseRequest->withUri()->withMethod()->withHeader()->withBody() will create
> a total of 5 object instances. It's a memory overhead and time consumption
> for each method call.
Like Andrea, I don't see how immutable variables/objects solve this
problem. The overhead is not from emulating the immutability, it is a
consequence of the design pattern choosing immutability. In fact, from
the code shown, there is no way to know that immutability is in any way
relevant, since the definition of withUri() could be a mutator which
ends "return $this", a pattern I've frequently used to create such
"fluent" interfaces. Copy-on-write doesn't help, either, since all 5
calls are writes, so will still make 5 copies.
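A short sketch of that point: the two hypothetical classes below (FluentRequest and ImmutableRequest are illustrative names only) support the identical call `->withUri(...)`, and only the method body decides whether a new object appears.

```php
<?php
// Both classes are hypothetical; only the bodies of withUri() differ.
class FluentRequest
{
    public $uri = '/';

    public function withUri($uri)
    {
        $this->uri = $uri; // mutates the receiver
        return $this;      // returns the same instance ("fluent" mutator)
    }
}

class ImmutableRequest
{
    private $uri = '/';

    public function withUri($uri)
    {
        $new = clone $this; // copies first, then changes the copy
        $new->uri = $uri;
        return $new;
    }

    public function getUri()
    {
        return $this->uri;
    }
}

$a = new FluentRequest();
$b = $a->withUri('/tasks');
var_dump($a === $b); // bool(true): same object, modified in place

$c = new ImmutableRequest();
$d = $c->withUri('/tasks');
var_dump($c === $d); // bool(false): a new object was created
```

Nothing at the call site distinguishes the two, which is why the behaviour has to come from documentation or convention.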
> What I want to discuss is a true immutability flag for variables and
> parameters. There are a lot of languages that use "final" or "const"
> keywords to prevent modification of variables.
On the concrete suggestion, though, I do think there are possible
advantages to this. I've long felt the confusion people have over
pass-by-value for primitives, pass-by-value-which-is-actually-a-pointer
for objects, and pass-by-reference for a variable which might or might
not be an object pointer, is a failure not of the programmer but of the
programming language. In a blog post about it a few years ago [1], I
suggested that deep immutability (along with deep cloning) could provide
a better framework than by-value vs by-reference in modern OO languages.
This is rather different from defining a type that is immutable, since
it implies temporary immutability of a particular instance; but it seems
to be what at least some of your examples are hinting at.
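The three passing modes mentioned above can be illustrated with a small sketch. Box and the three functions are hypothetical names chosen for the example.

```php
<?php
class Box
{
    public $value = 1;
}

// The parameter holds a copy of the object handle: rebinding it is local.
function reassign(Box $b)
{
    $b = new Box();
    $b->value = 99;
}

// The copied handle still points at the caller's object: mutation is visible.
function mutate(Box $b)
{
    $b->value = 2;
}

// A by-reference parameter aliases the caller's variable itself.
function reassignByRef(Box &$b)
{
    $b = new Box();
    $b->value = 3;
}

$box = new Box();

reassign($box);
echo $box->value, "\n"; // 1: the caller's variable still refers to the old object

mutate($box);
echo $box->value, "\n"; // 2: the shared object was changed

reassignByRef($box);
echo $box->value, "\n"; // 3: the caller's variable now refers to a new object
```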
The problem is that deep cloning and deep immutability are non-trivial;
PHP notably doesn't support deep cloning of objects, requiring each
class to define what to do with any non-primitive members, since some
may represent resources which can't be meaningfully cloned just by
copying data in memory.
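As a minimal illustration of why each class has to define its own cloning rules, the sketch below compares a default (shallow) clone with one that implements __clone(). Engine, ShallowCar and DeepCar are hypothetical classes.

```php
<?php
class Engine
{
    public $power = 100;
}

class ShallowCar
{
    public $engine;

    public function __construct()
    {
        $this->engine = new Engine();
    }
}

class DeepCar
{
    public $engine;

    public function __construct()
    {
        $this->engine = new Engine();
    }

    public function __clone()
    {
        // The class decides how its object-typed members are copied.
        $this->engine = clone $this->engine;
    }
}

$s1 = new ShallowCar();
$s2 = clone $s1;
$s2->engine->power = 200;
echo $s1->engine->power, "\n"; // 200: both cars share one Engine

$d1 = new DeepCar();
$d2 = clone $d1;
$d2->engine->power = 200;
echo $d1->engine->power, "\n"; // 100: __clone() gave $d2 its own Engine
```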
In the same way, in order to make an instance deeply immutable, you need
to nail down the details of which actions are actually allowed while
it's in that state, and that gets complicated:
- Directly assigning to a public property is clearly invalid, as is
  calling a method on that property which mutates it.
- If the property is itself an object, assigning it to a temporary
  variable must retain the immutability - that is, we don't want
  "$foo->bar->setValue(42);" to behave differently from "$x = $foo->bar;
  $x->setValue(42);"
- Calling a method which internally performs such an action is also
  invalid (like the setValue() in the example above). This could be
  achieved by executing the function but marking $this as immutable, but
  that means that the method may have other effects up to the point where
  the violation occurs, so it would be preferable to somehow invalidate
  the method call.
- For objects which represent proxies for external resources, there may
  be methods which mutate that external state, and properties which exist
  only as virtual getters. So calling $db->insertRow(...) should probably
  be invalid, since it changes the value of $db->lastInsertedId.
I think the only way to do it would be for every method to be invalid
unless it is explicitly marked as "immutable-safe", e.g.
class Foo {
    private $delegatedObject;
    public function __construct() { $this->delegatedObject = new SomethingElse; }
    immutable public function getValue() { return $this->delegatedObject->getX(); }
    public function setValue($newValue) { echo "Setting!"; $this->delegatedObject->setX($newValue); }
}
const $foo = new Foo;
echo $foo->getValue();
// Fails immediately, without echoing "Setting!", because the method is
// illegal in an immutable context
$foo->setValue(42);
Copy-on-write, at the user object level, requires both this and deep
cloning:
$bar = clone-on-write $foo;
// Needs to implicitly clone $foo before calling setValue(), but must also
// clone $delegatedObject for that to be meaningful
$bar->setValue(42);
So class Foo needs an implementation of __clone() as well before any of
this can be used, and if it is missing, the resulting behaviour may be
rather non-obvious.
As ever, the devil's in the detail. The only language I know of that's
meaningfully tackled this is Rust, with its concepts of "boxes" and
"lending", although I'm hazy on the details.
Regards,
--
Rowan Collins
[IMSoP]
>> What is wrong here? Emulated immutability. All methods will create a
>> separate instance of the request, so
>> $baseRequest->withUri()->withMethod()->withHeader()->withBody() will
>> create a total of 5 object instances. It's a memory overhead and time
>> consumption for each method call.
>
> Like Andrea, I don't see how immutable variables/objects solve this
> problem. The overhead is not from emulating the immutability, it is a
> consequence of the design pattern choosing immutability. In fact, from
> the code shown, there is no way to know that immutability is in any
> way relevant, since the definition of withUri() could be a mutator
> which ends "return $this", a pattern I've frequently used to create
> such "fluent" interfaces. Copy-on-write doesn't help, either, since
> all 5 calls are writes, so will still make 5 copies.
The with*() methods in PSR-7 are documented to return a new instance,
not modify the existing instance. Yes, there's no way in PHP itself to
force that syntactically, which is why documentation exists. :-)
Also, in the benchmarks we've run the performance cost of all those new
objects is measured in nanoseconds, ie, small enough that we're not
worried about it. (Hats off to the PHP Internals folks for making that
fast!)
>> What I want to discuss is a true immutability flag for variables and
>> parameters. There are a lot of languages that use "final" or "const"
>> keywords to prevent modification of variables.
>
> On the concrete suggestion, though, I do think there are possible
> advantages to this. I've long felt the confusion people have over
> pass-by-value for primitives,
> pass-by-value-which-is-actually-a-pointer for objects, and
> pass-by-reference for a variable which might or might not be an object
> pointer, is a failure not of the programmer but of the programming
> language. In a blog post about it a few years ago [1], I suggested
> that deep immutability (along with deep cloning) could provide a
> better framework than by-value vs by-reference in modern OO languages.
>
> This is rather different from defining a type that is immutable,
> since it implies temporary immutability of a particular instance; but
> it seems to be what at least some of your examples are hinting at.
>
> The problem is that deep cloning and deep immutability are
> non-trivial; PHP notably doesn't support deep cloning of objects,
> requiring each class to define what to do with any non-primitive
> members, since some may represent resources which can't be
> meaningfully cloned just by copying data in memory.
snip
Immutability, generally, offers two advantages:
- It makes it easier for humans to reason about code.
- It makes it easier for compilers/runtimes to reason about code.
For the former, good programming practices/standards can often suffice
as there are cases where immutability makes code uglier, not better.
For the latter, it allows the compiler/runtime to do two things: Catch
code errors early and optimize based on assumptions.
In practice, I don't see much value in a language allowing a variable to
be flagged as mutable or immutable unless the default is immutable, as
in F# or Rust. In those cases it encourages the developer to take more
functional, immutable approaches most of the time, which (it is argued)
lead to better code, and more optimizable code (because the compiler can
make more assumptions). Switching PHP to default-immutable variables is
clearly off the table, so allowing individual variables to be explicitly
marked as immutable, particularly scalars as Stanislav points out,
doesn't sound like it would offer much.
What could be valuable, however, is flagging parameters. Ie:
function foo(const MyClass $c, const $s) {
    $s = 'abc'; // Compiler error
    $c = new MyClass(); // Compiler error.
    $c->foo = 'abc'; // Some kind of error?
}
With PHP's existing copy-on-write there's not much memory-savings to be
had with "const reference" parameters, as in C. What the above would
allow is for the compiler to catch certain errors, especially if the
parameters are on a method in an interface. Ie, the following:
interface Foo {
public function setFromThing(const Thing $t);
}
Makes it explicitly clear that an implementer is not allowed to modify
$t in their setFromThing() implementation. Of course, as Rowan notes
object encapsulation makes enforcing that quite difficult, which is
arguably by design.
So I suppose the more general question is, are there code annotations
(immutable values, marking a function as pure, explicit scalar types,
etc.) that developers could insert that would:
- Provide guidance for developers reasoning about the code.
- Allow the compiler to catch more bugs.
- Allow the compiler to better optimize the code by safely making more
  assumptions.
- Some combination of the above.
Eg, if the compiler knows a given function is pure (explicit input and
output, no side-effects, immutability of the parameters) then it could
auto-memoize it, or inline safely (eliminating the function call
entirely), or other such things. Is there a way that a developer could
syntactically tell the compiler "you can safely make these assumptions
and optimize/error-check accordingly"? And could Zend Engine support
such optimizations in a meaningful way? (I have no idea on the latter.)
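PHP has no way to declare a function pure, so the following is only a userland sketch of what auto-memoization would amount to. The memoize() helper is hypothetical and is not an engine feature; the call counter is there purely to show that the second call is served from the cache.

```php
<?php
// memoize() is a hypothetical userland helper (assumption), not part of PHP.
// It is only safe for functions whose result depends solely on their arguments.
function memoize(callable $fn)
{
    $cache = [];

    return function (...$args) use (&$cache, $fn) {
        $key = serialize($args); // arguments must be serializable in this sketch
        if (!array_key_exists($key, $cache)) {
            $cache[$key] = $fn(...$args);
        }
        return $cache[$key];
    };
}

$calls = 0;
$square = memoize(function ($n) use (&$calls) {
    $calls++; // bookkeeping for the demo only; a truly pure function would not do this
    return $n * $n;
});

echo $square(12), "\n"; // 144
echo $square(12), "\n"; // 144, served from the cache
echo $calls, "\n";      // 1: the wrapped function ran only once
```

An engine-level version would need the purity guarantee from the developer or from analysis; here the programmer simply promises it by choosing to wrap the function.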
--Larry Garfield
Hi!
> The with*() methods in PSR-7 are documented to return a new instance,
> not modify the existing instance. Yes, there's no way in PHP itself to
> force that syntactically, which is why documentation exists. :-)
>
> Also, in the benchmarks we've run the performance cost of all those new
> objects is measured in nanoseconds, ie, small enough that we're not
> worried about it. (Hats off to the PHP Internals folks for making that
> fast!)
It is great that this is fast, but I wonder (maybe off-topic?) why do
it? I.e. it is clear that in something like:
$a = (new Request)->withHeaders(...)->withBody(...)
    ->withEncoding(...)->withETag(...)
the intermediate objects are useless and nobody needs 5 new objects when
you do it. Am I missing something here?
Stas Malyshev
smalyshev@gmail.com
Hi Stas,
>> The with*() methods in PSR-7 are documented to return a new instance,
>> not modify the existing instance. Yes, there's no way in PHP itself to
>> force that syntactically, which is why documentation exists. :-)
>>
>> Also, in the benchmarks we've run the performance cost of all those new
>> objects is measured in nanoseconds, ie, small enough that we're not
>> worried about it. (Hats off to the PHP Internals folks for making that
>> fast!)
>
> It is great that this is fast, but I wonder (maybe off-topic?) why do
> it? I.e. it is clear that in something like:
>
> $a = (new Request)->withHeaders(...)->withBody(...)
>     ->withEncoding(...)->withETag(...)
>
> the intermediate objects are useless and nobody needs 5 new objects when
> you do it. Am I missing something here?
I assume the reason for doing this is so you can’t ever modify the object from a distance, you must always create a new one to avoid messing up anything with an existing handle on it.
As you mention, though, this means that you get useless intermediate objects. This use case could be solved much better if we had copy-on-write/value-type classes like PHP 4 had. If Request was a value-type class, then you could do this:
$a = new Request();
$a->addHeaders(…);
$a->setBody(…);
$a->setEncoding(…);
$a->setETag(…);
Here, there are no redundant objects made, but if you pass $a on, it’d be automatically copied by PHP, so you don’t need to worry about it being modified.
Would that make sense? It’s no different than how our existing value types like scalars and arrays work.
Andrea Faulds
http://ajf.me/
Hi!
> Here, there are no redundant objects made, but if you pass $a on, it’d
> be automatically copied by PHP, so you don’t need to worry about it
> being modified.
I don't think it's a particularly good solution in this case, as in many
cases (especially DI setups, many design patterns, etc.) the whole point
of creating the object is to pass it around. Just pointlessly copying it
out of fear somebody somewhere could modify it doesn't sound the best
way. I'd rather just have a clear separation between mutating and
non-mutating APIs, and instruct people to use the right ones in right
situation - i.e. if you created the object or own it, use mutating ones,
if you got object from outside and do not have full ownership of it, use
non-mutating ones.
> Would that make sense? It’s no different than how our existing value
> types like scalars and arrays work.
Scalars don't have this problem as, except for string offsets (IMHO not
the best idea to have mutable strings too) scalars can not really be
changed, just replaced with other scalars. But implementing value
objects in PHP is not hard right now - if you don't provide any methods
that allow changing state, you've got an immutable object. It's just not
always what people using it would want, especially with something as
complex as HTTP message.
Stas Malyshev
smalyshev@gmail.com
Hi,
>> Here, there are no redundant objects made, but if you pass $a on, it’d
>> be automatically copied by PHP, so you don’t need to worry about it
>> being modified.
>
> I don't think it's a particularly good solution in this case, as in many
> cases (especially DI setups, many design patterns, etc.) the whole point
> of creating the object is to pass it around. Just pointlessly copying it
> out of fear somebody somewhere could modify it doesn't sound the best
> way.
Well, that’s the advantage of copy-on-write: you can avoid needless manual copies, and instead have automatic copying done for you by the language, where needed.
Although implementing copy-on-write for object methods might be a challenge, given PHP doesn’t track which functions have side effects.
> I'd rather just have a clear separation between mutating and
> non-mutating APIs, and instruct people to use the right ones in right
> situation - i.e. if you created the object or own it, use mutating ones,
> if you got object from outside and do not have full ownership of it, use
> non-mutating ones.
This isn’t very nice in practice, though. Mutating APIs are easy to use and performant, non-mutating APIs are neither of these things. What you want is the benefits of the first without the disadvantages of the second.
>> Would that make sense? It’s no different than how our existing value
>> types like scalars and arrays work.
>
> Scalars don't have this problem as, except for string offsets (IMHO not
> the best idea to have mutable strings too) scalars can not really be
> changed, just replaced with other scalars. But implementing value
> objects in PHP is not hard right now - if you don't provide any methods
> that allow changing state, you've got an immutable object. It's just not
> always what people using it would want, especially with something as
> complex as HTTP message.
You’re ignoring arrays, which also have our copy-on-write behaviour. Also, the bigint RFC (if passed) would be another kind of mutable scalar. It’s not mutable in very many cases, but it is in some places as a performance optimisation.
We have value objects, sure, but they’re not efficient. Every mutation requires the creation of a new object, because you can’t do copy-on-write. Compare that to arrays: mutations there only create new arrays if the refcount is > 1.
The following code using immutable value objects requires the creation of five new objects:
$a = $somefoo
->withBar(…)
->withBaz(…)
->withQux(…)
->withoutFooBar();
Yet the following code using arrays, which are passed by value in PHP, requires the creation of only one new array, and modifies in-place:
$a = $somefoo;
$a['bar'] = …;
$a['baz'] = …;
$a['qux'] = …;
unset($a['foobar']);
Is that not superior?
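The value semantics of arrays described here can be checked with a small sketch. The keys, values and the addQux() helper are made up for the example.

```php
<?php
$original = ['bar' => 1, 'foobar' => true];

$copy = $original;      // assignment by value; the engine defers the real copy
$copy['bar'] = 2;       // the first write separates $copy from $original
$copy['baz'] = 3;       // later writes change $copy only
unset($copy['foobar']);

// Hypothetical helper: the array is received by value, so the caller's
// array is never affected by changes made inside.
function addQux(array $a)
{
    $a['qux'] = 4;
    return $a;
}

$withQux = addQux($copy);

var_dump($original === ['bar' => 1, 'foobar' => true]);        // bool(true)
var_dump($copy === ['bar' => 2, 'baz' => 3]);                  // bool(true)
var_dump($withQux === ['bar' => 2, 'baz' => 3, 'qux' => 4]);   // bool(true)
```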
--
Andrea Faulds
http://ajf.me/
> It is great that this is fast, but I wonder (maybe off-topic?) why do
> it? I.e. it is clear that in something like:
>
> $a = (new Request)->withHeaders(...)->withBody(...)
>     ->withEncoding(...)->withETag(...)
>
> the intermediate objects are useless and nobody needs 5 new objects when
> you do it. Am I missing something here?
Primarily for the reasons Andrea listed: Avoid
spooky-action-at-a-distance. In the particular case of a request (or
ServerRequest), it's very tempting to use as an over-engineered global
with all of the problems that causes. When doing sub requests, or
recursive requests, any code that relies on a global request object is
now invalid and introduces all sorts of weirdness. By not allowing that
object to change without the system knowing about it explicitly you
eliminate a lot of sources of weird bugs.
(There were several very long threads on the FIG list about mutability
and I'm greatly over-simplifying them here. There are also still
dissenters who would favor a mutable object, still, although right now
the group is leaning immutable.)
> Hi,
>
>> I'd rather just have a clear separation between mutating and
>> non-mutating APIs, and instruct people to use the right ones in right
>> situation - i.e. if you created the object or own it, use mutating ones,
>> if you got object from outside and do not have full ownership of it, use
>> non-mutating ones.
>
> This isn’t very nice in practice, though. Mutating APIs are easy to use
> and performant, non-mutating APIs are neither of these things. What you
> want is the benefits of the first without the disadvantages of the second.
I disagree here. In the micro-sense, mutating APIs may be easier and
more performant at first. At scale, though, immutable APIs tend to be
much more predictable and less prone to mysterious behavior.
Drupal 7's render API is an excellent demonstration of how mutable data
structures can result in completely incomprehensible, almost
non-deterministic code. :-)
>>> Would that make sense? It’s no different than how our existing value
>>> types like scalars and arrays work.
>>
>> Scalars don't have this problem as, except for string offsets (IMHO not
>> the best idea to have mutable strings too) scalars can not really be
>> changed, just replaced with other scalars. But implementing value
>> objects in PHP is not hard right now - if you don't provide any methods
>> that allow changing state, you've got an immutable object. It's just not
>> always what people using it would want, especially with something as
>> complex as HTTP message.
>
> You’re ignoring arrays, which also have our copy-on-write behaviour.
> Also, the bigint RFC (if passed) would be another kind of mutable scalar.
> It’s not mutable in very many cases, but it is in some places as a
> performance optimisation.
>
> We have value objects, sure, but they’re not efficient. Every mutation
> requires the creation of a new object, because you can’t do
> copy-on-write. Compare that to arrays: mutations there only create new
> arrays if the refcount is > 1.
>
> The following code using immutable value objects requires the creation
> of five new objects:
>
> $a = $somefoo
> ->withBar(…)
> ->withBaz(…)
> ->withQux(…)
> ->withoutFooBar();
>
> Yet the following code using arrays, which are passed by value in PHP,
> requires the creation of only one new array, and modifies in-place:
>
> $a = $somefoo;
> $a['bar'] = …;
> $a['baz'] = …;
> $a['qux'] = …;
> unset($a['foobar']);
>
> Is that not superior?
Depends what it is you're prioritizing. Arrays-as-junior-structs only
scale so far, and in my experience they break down surprisingly fast.
(See above regarding Drupal's Array-Oriented APIs.) They also don't
have any natural mechanism to do anything but CRUD, so methods that
derive information from provided data, or provide defaults for various
values, or do anything to provide a nice developer experience (DX) are
impossible. They're also completely non-self-documenting, whereas a
class with defined properties and methods, mutable or not, is much
easier to learn how to use.
Having some objects that are pass-by-value and others that are
pass-by-handle/reference... honestly scares me. The potential for
confusion there is huge. Last I recall there was discussion of trying
to revamp arrays to be more like objects to finally resolve the
"array_*() functions and iterators are incompatible and the world sucks"
problem, so that would seem to go the other direction. (Did anyone end
up working on that for PHP 7? Please?)
If you really wanted a compound variable (like objects and arrays) that
was just for data and passed by value but had a nicer DX than
undocumentable array keys... now you're talking about actual structs,
and letting them have actor functions a la Go. But I should probably
stop talking now before someone shoots me. :-)
--Larry Garfield
>> the intermediate objects are useless and nobody needs 5 new objects when
>> you do it. Am I missing something here?
>
> I assume the reason for doing this is so you can’t ever modify the object
> from a distance, you must always create a new one to avoid messing up
> anything with an existing handle on it.
An alternative, which I'm guessing FIG considered and rejected, is
having a mutable "builder" object, and require the user to specifically
request an immutable object when they're ready to use it:
$builder = new RequestBuilder;
$a = $builder->withHeaders(...)->withBody(...)->getRequest();
$a_prime = $builder->withEncoding(...)->withETag(...)->getRequest();
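A rough sketch of how such a builder could look, under the assumption that the request is a plain immutable object with getters only. Request and RequestBuilder are hypothetical classes here; PSR-7 defines no such builder.

```php
<?php
// Hypothetical immutable request: state is set once in the constructor.
final class Request
{
    private $headers;
    private $body;

    public function __construct(array $headers, $body)
    {
        $this->headers = $headers;
        $this->body = $body;
    }

    public function getHeaders()
    {
        return $this->headers;
    }

    public function getBody()
    {
        return $this->body;
    }
}

// Hypothetical mutable builder: every with*() call changes the builder itself.
final class RequestBuilder
{
    private $headers = [];
    private $body = '';

    public function withHeader($name, $value)
    {
        $this->headers[$name] = $value; // no clone, the builder is mutated
        return $this;
    }

    public function withBody($body)
    {
        $this->body = $body;
        return $this;
    }

    public function getRequest()
    {
        // Arrays are copied by value, so the snapshot is independent of the builder.
        return new Request($this->headers, $this->body);
    }
}

$builder = new RequestBuilder();
$a = $builder->withHeader('Accept', 'application/json')->withBody('{}')->getRequest();
$aPrime = $builder->withHeader('ETag', '"abc"')->getRequest();

var_dump(count($a->getHeaders()));      // int(1): unaffected by later builder calls
var_dump(count($aPrime->getHeaders())); // int(2)
```

Only getRequest() allocates a request object, so a chain of any length produces one immutable object per snapshot.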
--
Rowan Collins
[IMSoP]
>> The with*() methods in PSR-7 are documented to return a new instance,
>> not modify the existing instance. Yes, there's no way in PHP itself to
>> force that syntactically, which is why documentation exists. :-)
>>
>> Also, in the benchmarks we've run the performance cost of all those new
>> objects is measured in nanoseconds, ie, small enough that we're not
>> worried about it. (Hats off to the PHP Internals folks for making that
>> fast!)
>
> It is great that this is fast, but I wonder (maybe off-topic?) why do
> it? I.e. it is clear that in something like:
>
> $a = (new Request)->withHeaders(...)->withBody(...)
>     ->withEncoding(...)->withETag(...)
>
> the intermediate objects are useless and nobody needs 5 new objects when
> you do it. Am I missing something here?
Hi,
my assumptions after some testing:
- This is only true in case of reassigning ($a = clone $a) the old
  variable, as the refcount for underlying values remains 1 for all
  values in clones, so COW doesn't need to copy anything.
- If the old object is not thrown away, then memory consumption is
  doubled and the "fast" argument is wrong.
(Performance, of cloning an object without copying values and of some
method calls, is negligible.)
Omitting mutator methods (with*/set*) is a simple and logical way to
achieve immutability in PHP. No need for const/final/...
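A minimal sketch of that approach, assuming PHP 7 or later (where writing to an inaccessible property throws an Error). Money is a hypothetical class used only for illustration.

```php
<?php
// Hypothetical value object: private state, getters only, no with*/set* methods.
final class Money
{
    private $amount;
    private $currency;

    public function __construct($amount, $currency)
    {
        $this->amount = $amount;
        $this->currency = $currency;
    }

    public function getAmount()
    {
        return $this->amount;
    }

    public function getCurrency()
    {
        return $this->currency;
    }
}

$price = new Money(500, 'EUR');

try {
    $price->amount = 1; // private property: not writable from outside the class
    $changed = true;
} catch (Error $e) {
    $changed = false;   // PHP 7+ raises an Error for this write
}

var_dump($changed);            // bool(false)
var_dump($price->getAmount()); // int(500)
```

The object can only be "changed" by constructing a new one, which is the usability cost discussed in the replies.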
> my assumptions after some testing:
> - This is only true in case of reassigning ($a = clone $a) the old
>   variable, as the refcount for underlying values remains 1 for all
>   values in clones, so COW doesn't need to copy anything.
> - If the old object is not thrown away, then memory consumption is
>   doubled and the "fast" argument is wrong.
> (Performance, of cloning an object without copying values and of some
> method calls, is negligible.)
I'm not sure that anyone has benchmarked the memory impact of the
immutable approach yet. I will pass that along and see what data comes out.
> Omitting mutator methods (with*/set*) is a simple and logical way to
> achieve immutability in PHP. No need for const/final/...
(Someone feel free to declare this thread off topic now, as we're
basically just rehashing discussions had weeks ago on the FIG list.)
The original calls for immutable HTTP message objects had no with*()
methods. The objects were set from their constructor and that was the
end of it. However, many people (myself included) pointed out that such
objects, in actual practice, would be completely unusable. Middlewares
need to be able to pass information from one layer to another; the most
straightforward way to do that is (as Symfony does and PSR-7 recommends)
an attributes collection on the request. But if you can't set that
without going through a new constructor manually it becomes horribly
difficult to do. Similarly, middlewares that want to set cache
information for a whole application, encrypt/decrypt headers, or do
anything else actually useful have to manually replicate the object
themselves.
An object with a complex constructor with lots of parameters offers,
really, no value over a big-ass array. It's almost worse in my
experience. (And yes, I've used such APIs and yes, I detest them.)
Too, the constructor is obviously not part of the interface, which means
every implementation would have its own constructor, thus eliminating
the benefit of having a standard at all.
The with*() methods solved that problem entirely, and enabled a nice
fluent style for those who are into such things.
--Larry Garfield
Hi Larry,
On 01/02/15 10:38, Larry Garfield wrote:
>> - If the old object is not thrown away, then memory consumption is
>>   doubled and the "fast" argument is wrong.
>> (Performance, of cloning an object without copying values and of some
>> method calls, is negligible.)
>
> I'm not sure that anyone has benchmarked the memory impact of the
> immutable approach yet. I will pass that along and see what data comes
> out.
I think I did while checking its speed. Maybe I didn't publish the
result, but going from memory there was no significant footprint
increase. Just the expected speed decrease.
Cheers
Matteo
(Someone feel free to declare this thread off topic now, as we're
basically just rehashing discussions had weeks ago on the FIG list.)
Just as HHVM is not PHP, neither is FIG ... so any discussion on either
list is not relevant here, since many people will not have seen it.
I have no intention of adopting FIG, simply because it fails to follow the
coding guidelines used within PHP itself, so any argument as to how PHP7
should evolve has to take place in this discussion space.
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Hey Larry,
Immutability, generally, offers two advantages:
It makes it easier for humans to reason about code.
It makes it easier for compilers/runtimes to reason about code.
For the former, good programming practices/standards can often suffice as there are cases where immutability makes code uglier, not better.
For the latter, it allows the compiler/runtime to do two things: Catch code errors early and optimize based on assumptions.
PHP doesn’t have immutable classes, though, so the PHP runtime can’t reason about code using value objects and such, and they’re less performant than mutable objects with manual copying.
I think having some means to create value type classes (i.e. PHP 4-style classes) would be beneficial. These classes would have the same always-copy or copy-on-write behaviour that PHP’s scalar types and arrays have. They’d be performant compared to immutable classes like PSR-7’s, because operations mutate the value in-place if possible, rather than creating a new class. They’d also be nicer to use, because you can change the value imperatively rather than having to chain together return values. But you keep the main advantages of immutable types: no spooky action at a distance, no explicit copying needed.
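The value behaviour that arrays already have, and that objects lack, can be seen in current PHP. A small sketch (variable names are only illustrative):

```php
<?php
// Arrays are value types: assignment behaves like a copy.
$a = ['count' => 1];
$b = $a;
$b['count'] = 2;        // only $b changes; it is separated from $a on write
var_dump($a['count']);  // int(1)

// Objects are handles: assignment shares the same instance.
$x = new stdClass;
$x->count = 1;
$y = $x;
$y->count = 2;          // visible through $x as well
var_dump($x->count);    // int(2)
```

A value type class, as described above, would presumably make the second half behave like the first.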
What could be valuable, however, is flagging parameters. Ie:
function foo(const MyClass $c, const $s) {
$s = 'abc'; // Compiler error
$c = new MyClass(); // Compiler error.
$c->foo = 'abc'; // Some kind of error?
}
const parameters are a cool feature in C. However, the main use for them in C doesn’t really exist in PHP. In C, most of the time you need to pass a pointer to a value to a function, rather than a value directly. Having a pointer to a value means you can modify that value. So, there’s a need for a “const” modifier to mark parameters as being for values that will be taken as input and not modified, rather than as values that will be used as output and modified.
PHP, on the other hand, doesn’t have pointers. Parameters are usually by-value (objects are by-reference, but still). If you want to mutate some value, you need to explicitly mark it as such… most parameters are already “constant”.
Though, I suppose there’s some usefulness in that, since all objects are by-reference, you might want to say you won’t touch that object. Hmm. Given that PHP is dynamic, not compiled, and function calls can have side effects, though, this would be difficult to enforce. You’d need to check that all calls made within the function do not have any side effects on that value…
I’m not sure how workable this is.
Thanks.
Andrea Faulds
http://ajf.me/
Given that PHP is dynamic, not compiled, and function calls can have side effects, though, this would be difficult to enforce. You’d need to check that all calls made within the function do not have any side effects on that value…
I’m not sure how workable this is.
If you can implement copy-on-write, you can implement immutability, and
vice versa. However, any context where it's possible to mix mutable and
immutable (or mutable-reference and copy-on-write) would get confusing
very quickly, I think.
Consider some possible behaviours of this set of objects:
$a->b = $b;
$a2 = $a;
$a2->b->value = 42;
If every object is immutable or copy-on-write, this is easy to reason
about: $b and $a->b remain the same, while $a2->b is created as a new
clone; in doing so, $a is separated from $a2 to reference this new copy.
But what if $a/$a2 are immutable, or copy-on-write, but $b is not - do
$b, $a->b and $a2->b all reflect the change, with no error? That would
certainly make no sense for const parameters, and would be a massive
gotcha for "value objects" defined as such at the class (rather than
instance) level.
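For comparison, this is what the same three lines do in PHP today, where neither object is copy-on-write. A small sketch using stdClass (the initial value 1 and the extra $a3 clone are additions for illustration):

```php
<?php
$b = new stdClass;
$b->value = 1;

$a = new stdClass;
$a->b = $b;

$a2 = $a;                 // same object as $a, not a copy
$a2->b->value = 42;
var_dump($b->value);      // int(42)
var_dump($a->b->value);   // int(42)

$a3 = clone $a;           // shallow clone: a new outer object...
$a3->b->value = 7;        // ...but $a3->b is still the same inner object
var_dump($a3 === $a);     // bool(false)
var_dump($a3->b === $b);  // bool(true)
var_dump($b->value);      // int(7)
```

So even an explicit clone only separates the outer object, which is the mixed case described above.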
Regards,
--
Rowan Collins
[IMSoP]
Hello, internals!
2015-02-01 4:01 GMT+03:00 Andrea Faulds ajf@ajf.me:
I think having some means to create value type classes (i.e. PHP 4-style
classes) would be beneficial. These classes would have the same always-copy
or copy-on-write behaviour that PHP’s scalar types and arrays have. They’d
be performant compared to immutable classes like PSR-7’s, because
operations mutate the value in-place if possible, rather than creating a
new class. They’d also be nicer to use, because you can change the value
imperatively rather than having to chain together return values. But you
keep the main advantages of immutable types: no spooky action at a
distance, no explicit copying needed.
I agree with you on that point; this could be a good instrument if implemented
in PHP. I see a good use for it with interfaces and a "const"
modifier for parameters:
class Foo {
public function bar(const Baz $object); // Require an instance to be passed as copy
}
This should be used as weak immutability for places where changes to the
object are not expected. However, we can still call methods on $object, e.g.
setters.
The same would be useful for simple object variables:
const $object = new stdClass;
$object = 123; // Exception, can not change const variable
$object->field = 456; // Exception, can not change public properties for const objects
$anotherObject = $object; // Copy on write
$anotherObject->field = 123; // ok to change local copy
However, we should not worry about setters (if present):
class MutableObject {
public $field = 42;
public function setField(const integer $newValue) {
$this->field = $newValue;
}
}
// let's create immutable object instance for that
const $instance = new MutableObject();
$instance->field = 56; // Exception, can not change public properties for const objects
// BUT, no checks for setters
$instance->setField(56);
echo $instance->field; // outputs 56
I think temporary immutability for objects can remove the need for a lot of
extra checks. For example, my previous code allows dropping extra getters
($instance->getField()) for simple classes, when we just want to work on
our local copy of an object in read-only mode. With this feature,
DateTimeImmutable functionality could be implemented like this:
const $now = new DateTime();
$future = $now; // Copy on write
$future->modify('+1 day'); // only $future variable is modified
echo $now->format(DATE_RFC822); // unchanged
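For reference, the same result can be had in current PHP either with an explicit clone of a DateTime or with DateTimeImmutable, whose modify() returns a new instance. A short sketch (the fixed date and UTC timezone are assumptions chosen to make the output predictable):

```php
<?php
$utc = new DateTimeZone('UTC');

// Mutable DateTime: the copy has to be made explicitly.
$now = new DateTime('2015-02-01 12:00:00', $utc);
$future = clone $now;
$future->modify('+1 day');            // only $future is modified
echo $now->format('Y-m-d'), "\n";     // 2015-02-01
echo $future->format('Y-m-d'), "\n";  // 2015-02-02

// DateTimeImmutable: modify() leaves the original alone and returns a new object.
$nowImmutable = new DateTimeImmutable('2015-02-01 12:00:00', $utc);
$futureImmutable = $nowImmutable->modify('+1 day');
echo $nowImmutable->format('Y-m-d'), "\n";     // 2015-02-01
echo $futureImmutable->format('Y-m-d'), "\n";  // 2015-02-02
```

The proposed "const" form would remove the explicit clone in the first half.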
For all this, I'd like to ask - why? Immutable objects are very useful in
concurrent applications, since they avoid expensive
synchronization costs and make it much easier to reason about system's
state (which is arguably impossible to predict with mutable objects in a
parallel system). But PHP has no parallelism. So which benefit would
that provide? If you want an immutable object, just make all properties
protected and provide getters but not setters. This is easily doable
without any keywords.
Yes, parallelism is not present in PHP (except for the pthreads extension), so
the need for temporary immutability of variables is smaller than in
multi-threaded applications with IPC. But I think the main advantage of
this modifier (immutable) is extra protection, and a guarantee that lets
developers skip extra getters (sorry, Doctrine) and write safer code.
Passing a copy of an object (should this be a deep clone or not?) as an
argument to a function is a nice feature too, IMO. We explicitly mark an
argument as immutable for the method body, so no direct changes will be
performed on this instance and there is no overhead for calling getters.
Should I try to write an RFC for that?
Agree in that point with you, this can be a good instrument if implemented
in PHP. I see a good point in it by using this with interfaces and "const"
modifier for parameters:
class Foo {
public function bar(const Baz $object); // Require an instance to be
passed as copy
}
OK as usual I am missing something again ...
This is probably because I still don't understand objects, as I still
just consider them as arrays with a few more complex elements. I STILL
work on the basis that if I pass by reference, I can make modifications
to the data while if I don't I get a copy and have to pass the copy back
if I need the original changed. This used to be all very simple, so when
did it stop working? This may explain why I get into trouble with stuff
that has been working for years but after 'modernising' starts throwing
problems :(
--
Lester Caine - G8HFL
Lester Caine wrote on 02/02/2015 09:19:
This is probably because I still don't understand objects, as I still
just consider them as arrays with a few more complex elements. I STILL
work on the basis that if I pass by reference, I can make modifications
to the data while if I don't I get a copy and have to pass the copy back
if I need the original changed. This used to be all very simple, so when
did it stop working? This may explain why I get into trouble with stuff
that has been working for years but after 'modernising' starts throwing
problems :(
Since PHP 5, and in most other languages, objects are passed with an
extra level of indirection, almost but not quite the same as
pass-by-reference. So there are actually three ways a variable can be
passed (or assigned):
- scalar by value - an assignment to the variable changes only that variable
- scalar by reference, using & - both variable names point to the same
variable, and assigning a value to one assigns it to both
- object by value - an assignment to the variable still only changes
that variable, BUT two variables that point at the same object can both
modify its internal state. So if you have $current_user =
getCurrentUserObject(); then saying $current_user = null; is changing
its value, so would only change that variable, but
$current_user->setPassword('1234'); is not, so the change would be
visible from all variables pointing at that object.
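The three cases can be tried out in current PHP. A small sketch, where the User class and the function names are made up for illustration:

```php
<?php
// Hypothetical class and functions, only to demonstrate the three cases.
class User
{
    public $password = 'secret';
}

function byValue($n)       { $n = 42; }               // changes the local copy only
function byReference(&$n)  { $n = 42; }               // changes the caller's variable
function reassign(User $u) { $u = null; }             // changes the local variable only
function mutate(User $u)   { $u->password = '1234'; } // changes the shared object

$n = 1;
byValue($n);
var_dump($n);               // int(1)
byReference($n);
var_dump($n);               // int(42)

$user = new User;
reassign($user);
var_dump($user instanceof User);  // bool(true)
mutate($user);
var_dump($user->password);        // string(4) "1234"
```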
Immutable parameters for scalars and arrays are simple: currently you
can use a by-value parameter as a local variable in your function, but
you might want to avoid the confusion of this:
function foo( const $bar ) {
$bar = 42; // ERROR
}
or
function foo( const array $bar ) {
$bar[] = 42; // ERROR
}
Immutable parameters for objects are complicated, because
function foo( const $bar ) {
$bar->value = 42;
}
is not technically changing the value of $bar (the variable still points
at the same object), but it would be useful to have an enforcement that it
wouldn't happen.
Hope that clarifies things a bit.
Regards,
Rowan Collins
[IMSoP]
I want to add a link here to the Java article about Value Types, this
information is quite interesting:
http://cr.openjdk.java.net/~jrose/values/values-0.html
Immutable value objects will probably arrive in the Java world soon,
according to this part of the article:
Conclusion
Along with the questions, we believe we have enough answers and insights
to begin prototyping value types, and verifying our thesis. That is, we
think it quite likely that a restricted class-like type can be made to look
enough like a primitive to be worth adding to the language and VM.
I want to add a link here to the Java article about Value Types, this
information is quite interesting:
http://cr.openjdk.java.net/~jrose/values/values-0.html
Probably, immutable value objects will be soon in Java world according
to this information:
Conclusion
Along with the questions, we believe we have enough answers and insights
to begin prototyping value types, and verifying our thesis. That is, we
think it quite likely that a restricted class-like type can be made to look
enough like a primitive to be worth adding to the language and VM.
Ooh, that does look like an interesting read. I'm particularly intrigued by their notion of "custom basic types". I'm beginning to dream about such types being a way of introducing various features:
- immutability and/or COW; if they can't have normal objects as members, then the deep cloning/immutability problem basically goes away
- operator overloading (without allowing it on normal objects); perhaps further limited to cases where both operands are of the same type (e.g. money + money) to avoid complicated dispatch algorithms
- range and domain types as sub-classes of scalars, with some kind of cast declaration; dare I say it, this could be more useful than scalar type hinting, since you're able to check arbitrary limits
- "scalar methods" / autoboxing; standard PHP could include basic methods (e.g. those array_* and str_* functions which have a natural "subject"), and you could add more by casting to a subtype, with much less overhead than an object wrapper
Maybe if we keep experimenting, we can come up with something really awesome for PHP 8...
Rowan Collins
[IMSoP]
Hi!
What I want to discuss is true immutability flag for variables and
parameters. There are a lot of languages that use "final" or "const"
keywords to prevent modification of variables. We can use this approach by
extending language syntax as following:
Most of the languages that use "final" and "const" do much more than
"prevent modification of variables" - they make these "variables"
constants, which behave differently. Also, many of them fail to achieve
the actual immutability - "const" declaration just means you can't
assign/mutate this name, but says nothing about the object the name
points to - it can very well still be mutated by other means. E.g. if
you declare "const $foo = new MyObject();" you have no control over what
any method called on $foo does. You'd need to introduce const methods
and have checks to ensure they are actually not changing the object
(which may have serious performance impacts).
For all this, I'd like to ask - why? Immutable objects are very useful in
concurrent applications, since they avoid expensive
synchronization costs and make it much easier to reason about system's
state (which is arguably impossible to predict with mutable objects in a
parallel system). But PHP has no parallelism. So which benefit would
that provide? If you want an immutable object, just make all properties
protected and provide getters but not setters. This is easily doable
without any keywords. Having immutable scalar variable seems a bit
useless since the scope of the scalar variable is very small and if you
can't track what happens inside a single function you probably should
refactor this function, not the language.
There might be optimizations available for immutable scalars, but we
already have constants for that.
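The getters-without-setters approach mentioned above might look like the following. This is only a sketch: the Money class is made up, and the try/catch assumes PHP 7 or later, where writing to an inaccessible property throws an Error (in PHP 5 it is a fatal error instead).

```php
<?php
// Hypothetical value class: protected properties, getters, no setters.
final class Money
{
    protected $amount;
    protected $currency;

    public function __construct($amount, $currency)
    {
        $this->amount = $amount;
        $this->currency = $currency;
    }

    public function getAmount()   { return $this->amount; }
    public function getCurrency() { return $this->currency; }
}

$price = new Money(100, 'EUR');

$blocked = false;
try {
    $price->amount = 200;   // not allowed from outside the class
} catch (Error $e) {        // assumption: PHP 7+, where this is catchable
    $blocked = true;
}

var_dump($price->getAmount());  // int(100)
var_dump($blocked);             // bool(true)
```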
const $text = "Some message"; // defines an immutable variable that can not
be modified.
$text .= "foo"; // Exception: can not change the immutable variable
class Test {
public final $value; // final keyword defines immutable property, but
it isn't initialized yet so allows only one assignment, then will be
immutable
So each property would have a counter which tracks if it's assigned or
not? That would be rather weird - what if the ctor doesn't initialize
it? Then random method that tries to assign it would work once but not
twice, which would be very weird.
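To make the concern concrete, write-once behaviour can be emulated today with __set, and the same oddity shows up: whoever assigns first wins. A rough sketch, where the WriteOnce class is hypothetical and is not how the proposal would be implemented in the engine:

```php
<?php
// Hypothetical emulation of a write-once ("final") property using magic methods.
class WriteOnce
{
    private $values = [];

    public function __set($name, $value)
    {
        // This lookup plays the role of the per-property "already assigned" flag.
        if (array_key_exists($name, $this->values)) {
            throw new LogicException("Property $name is already assigned");
        }
        $this->values[$name] = $value;
    }

    public function __get($name)
    {
        return array_key_exists($name, $this->values) ? $this->values[$name] : null;
    }
}

$obj = new WriteOnce();  // the constructor assigns nothing
$obj->value = 1;         // the first assignment, from anywhere, succeeds

$second = 'allowed';
try {
    $obj->value = 2;     // the second assignment is rejected
} catch (LogicException $e) {
    $second = 'rejected';
}

echo $obj->value, ' ', $second, "\n";  // 1 rejected
```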
On the engine level this immutable variables and parameters can be
effectively optimized by using IS_TYPE_IMMUTABLE for zval. This can result
in achieving better performance and various JIT-optimizations.
I don't see how they can be optimized unless they are actually constants
(yours are not, since they can be assigned with arbitrary value in
runtime, albeit just once) - and we already have constants.
--
Stas Malyshev
smalyshev@gmail.com