Hello internals and a happy new year.
I've been meaning to ask this for some time now: why aren't closures
serializable? The only on-list discussion about this that I've found is:
http://marc.info/?l=php-internals&m=119837318407795&w=2
Personally I'd find serialization of closures very useful (think of
serializing callback arrays, or objects that acquired a closure
somewhere during execution, etc.).
Also, I find that being unable to serialize a closure introduces a slight
inconsistency. Before closures, any "is callable" value (a plain
array or string) could be serialized, whereas now that's not always the
case[1], so programmers have to be aware of it when using some kind of
"register callback" behavior.
So to my question: why aren't closures serializable? Is it a design
choice or implementation issue?
I've looked at both zend_closures.c (the closure implementation) and
var.c (serialization) but didn't see anything that would suggest
either way.
Now for some wild guessing: the internals post mentioned above states
that an anonymous function's name is along the lines of
hashfunction(FILE) plus a per_file_counter. I understand that it's
deterministic, so if I were to serialize just this id (as a string), it
should be possible to track back to the actual code after
deserialization? I imagine that it would require (in simple terms)
getting the original file and "executing/assigning" the n-th closure in
that file. It should be even simpler when done in the same request and/or
with opcode caches, where just the id should be sufficient to get the
original opcodes and execute them.
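
The id-based idea above can be sketched in plain C. This is only an illustration, not Zend code: the registry, the hash (djb2), the "<hash>:<counter>" id format, and every name below are assumptions made for the example. It shows that a deterministic id is enough to find the code again, provided the defining file has already been loaded and has registered its closures.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a compiled closure body. */
typedef int (*closure_fn)(int);

typedef struct {
    char id[64];      /* "<file hash>:<per-file counter>" (assumed format) */
    closure_fn fn;
} closure_entry;

#define MAX_CLOSURES 16
static closure_entry registry[MAX_CLOSURES];
static size_t registry_len = 0;

/* djb2 string hash; any deterministic hash of the file name would do. */
static unsigned long hash_file(const char *file)
{
    unsigned long h = 5381;
    while (*file) {
        h = h * 33 + (unsigned char)*file++;
    }
    return h;
}

/* Build the deterministic id for the n-th closure declared in `file`. */
static void closure_id(const char *file, unsigned counter,
                       char *out, size_t out_len)
{
    snprintf(out, out_len, "%lx:%u", hash_file(file), counter);
}

/* Called while "compiling" a file: remember the closure under its id.
 * Returns 0 on success, -1 if the registry is full. */
static int register_closure(const char *file, unsigned counter, closure_fn fn)
{
    if (registry_len == MAX_CLOSURES) {
        return -1;
    }
    closure_id(file, counter, registry[registry_len].id,
               sizeof registry[registry_len].id);
    registry[registry_len].fn = fn;
    registry_len++;
    return 0;
}

/* "Unserialize": map an id string back to code, or NULL when the
 * defining file has not been loaded (or the id is unknown). */
static closure_fn lookup_closure(const char *id)
{
    for (size_t i = 0; i < registry_len; i++) {
        if (strcmp(registry[i].id, id) == 0) {
            return registry[i].fn;
        }
    }
    return NULL;
}

/* Two example "closures" that a file named callbacks.php might declare. */
static int add_one(int x) { return x + 1; }
static int twice(int x) { return x * 2; }
```

Registering add_one and twice as closures 0 and 1 of "callbacks.php", then building the id for counter 1 with closure_id() and passing it to lookup_closure(), returns twice. Note that such an id only identifies the code; it carries nothing about variables bound to the closure, and it goes stale if the file is edited so that the counters shift.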
I'll look into it some more myself, but I'm not fluent enough in C (not
to mention the Zend API) to provide a quality patch. I'd like to know
your opinion on whether this could be implemented, though.
TIA
Marcin Kurzyna
[1] While it was possible to serialize an array of callbacks before, it's
impossible once closures are among them; see the example below:
http://aquarion.hq.crystalpoint.pl/public/php-internals/closure-serialization.php
Hello Marcin,
Friday, January 2, 2009, 8:56:41 PM, you wrote:
> So to my question: why aren't closures serializable? Is it a design
> choice or implementation issue?
You would need to provide a C-level serialization that stores the $this
pointer (the easy part) and the zend_function member (not easy). The
second part requires storing all static variables, which again is
pretty easy, but it also requires storing the function itself. In the
case of a user function that probably only means the source string, but
at the time of serialization you only have the compiled opcodes. So you
would need APC to serialize the opcodes, which is not yet implemented.
Alternatively, you could save the implementation string somewhere,
serialize that, and compile it upon unserialization. If you're
interested in more, C-level serialization with multiple serialization
inputs is used for SplObjectStorage::serialize() / unserialize().
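
As a rough illustration of the "save the implementation string" alternative, here is a self-contained C sketch that packs two inputs (the closure's source text and its already-serialized static variables) into one buffer and reads them back. It does not use the Zend API; the function names and the two-field layout are assumptions for the example, and only the s:<len>:"<bytes>"; field shape is borrowed from PHP's serialize format. The $this pointer and the recompile step are left out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Append one length-prefixed string field: s:<len>:"<bytes>";
 * Returns the number of bytes written, or -1 if it does not fit. */
static int write_field(char *buf, size_t cap, const char *value)
{
    int n = snprintf(buf, cap, "s:%zu:\"%s\";", strlen(value), value);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Read one field produced by write_field() into `out`.
 * Returns the number of bytes consumed, or -1 on malformed input. */
static int read_field(const char *buf, char *out, size_t out_cap)
{
    if (buf[0] != 's' || buf[1] != ':') {
        return -1;
    }
    char *end;
    unsigned long len = strtoul(buf + 2, &end, 10);
    if (end == buf + 2 || end[0] != ':' || end[1] != '"') {
        return -1;
    }
    const char *data = end + 2;
    /* The declared length must fit the output and the remaining input. */
    if (len >= out_cap || strlen(data) < len + 2) {
        return -1;
    }
    if (data[len] != '"' || data[len + 1] != ';') {
        return -1;
    }
    memcpy(out, data, len);
    out[len] = '\0';
    return (int)(data + len + 2 - buf);
}

/* Serialize both inputs back to back. Returns total bytes written or -1. */
static int serialize_closure(char *buf, size_t cap,
                             const char *source, const char *statics)
{
    int a = write_field(buf, cap, source);
    if (a < 0) {
        return -1;
    }
    int b = write_field(buf + a, cap - (size_t)a, statics);
    return b < 0 ? -1 : a + b;
}

/* Recover both inputs. Returns 0 on success, -1 on malformed input.
 * Recompiling `source` is left to the engine and is not shown here. */
static int unserialize_closure(const char *buf,
                               char *source, size_t source_cap,
                               char *statics, size_t statics_cap)
{
    int a = read_field(buf, source, source_cap);
    if (a < 0) {
        return -1;
    }
    int b = read_field(buf + a, statics, statics_cap);
    return b < 0 ? -1 : 0;
}
```

Because each field is length-prefixed, quotes and semicolons inside the source text need no escaping. The sketch relies on strlen(), so it would not handle embedded NUL bytes.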
marcus
Best regards,
Marcus
Hi Marcus,
Thanks for your insight. I've looked at SplObjectStorage::serialize() and
it seems pretty straightforward; this shouldn't be a problem.
As you said, it's saving the zend_function member that is tricky. I
thought I'd go with saving the function string and recompiling it as you
suggested (I don't want to mess with APC, although I find the idea of
opcode serialization much more appealing). Anyway, it's proving to be,
well... tricky ;-)
I'm thinking about hooking into
zend_do_begin_lambda_function_declaration and saving everything until
zend_do_end_function_declaration, but I haven't tried that in practice
yet (I'm not even sure how to do this at the moment; I'll get to it
tonight).
thnx
marcin
Two thoughts:
- It can be quite tricky to keep the binding of "use"d variables correct.
- Serialized data is often used as a data exchange format (for instance
  http://developer.yahoo.com/common/phpserial.html). The default
  unserialize() shouldn't introduce executable code; even though it's not
  executed by default, we should be careful there. And yes, I can
  understand the need to serialize object structures, including closures,
  to sessions...
johannes