Hello to the php developer community - :)
This is my first post so I hope I'm not too out of line coming here to ask a
question.
My question is - With PHP could a C extension be written that takes the
binary contents of a variable in memory and writes it to a file or a stream?
Like:
#include <stdio.h>
struct abc { int a; } X;
FILE *fp = fopen("abc.dat", "wb");
fwrite(&X, sizeof(X), 1, fp);  /* writes the raw bytes of X */
fclose(fp);
Basically I'm interested in using DBUS with PHP - Someone is already working
on an implementation ...
My question is could you take any arbitrary variable and send it over DBUS
in its C binary form.
That way 'the other side' of the DBUS connection could implement a similar
construct and interpret the binary data to its language-specific format.
This could go a long way to provide various forms of
inter-process-communication in a binary fashion to other Languages (such as
Python).
To start out with I would like to experiment with serialization of a
variable to a binary file - and see if I can get that working.
I realize there are some limitations here - recursive structures,
references, object references, pointers to pointers, etc. I just don't know how
PHP implements its data structures well enough to know how hard retrieving
simple arrays, integers, and strings from the application's memory would be.
Binary serialization is ideal because it eliminates any problems associated
with multi-byte strings, parsing overhead, etc.
It would be great if a PHP array's structure could be taken from memory,
written to a file or stream, and opened in Python as a dictionary without
any database or serialization in between.
If this overlaps with any current development please let me know.
Thanks,
Ben DeMott
I realize there are some limitations here - recursive structures,
references, object references, pointers to pointers, etc. I just don't
know how PHP implements its data structures well enough to know how
hard retrieving simple arrays, integers, and strings from the
application's memory would be.
Well, you'll have to serialize the contents of the variables anyway as
they're not one coherent piece of memory. All of the serializers do
this. You just need a compatible deserializer on the other side.
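To illustrate the point about needing a matching serializer and deserializer, here is a minimal Python sketch of a tagged, length-prefixed binary encoding. The one-byte tags and the layout are made up for this example; they are not the format used by PHP, DBus, or any existing extension.

```python
import struct

# Hypothetical one-byte type tags, invented for this sketch.
TAG_INT, TAG_STR, TAG_MAP = b"i", b"s", b"m"

def encode(value):
    """Encode an int, str, or dict into a tagged, length-prefixed byte string."""
    if isinstance(value, int) and not isinstance(value, bool):
        return TAG_INT + struct.pack("<q", value)  # 64-bit little-endian
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return TAG_STR + struct.pack("<I", len(raw)) + raw
    if isinstance(value, dict):
        body = b"".join(encode(k) + encode(v) for k, v in value.items())
        return TAG_MAP + struct.pack("<I", len(value)) + body
    raise TypeError(f"unsupported type: {type(value).__name__}")

def decode(buf, pos=0):
    """Return (value, next_position) for the item starting at buf[pos]."""
    tag = buf[pos:pos + 1]
    pos += 1
    if tag == TAG_INT:
        (number,) = struct.unpack_from("<q", buf, pos)
        return number, pos + 8
    if tag == TAG_STR:
        (length,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        return buf[pos:pos + length].decode("utf-8"), pos + length
    if tag == TAG_MAP:
        (count,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        out = {}
        for _ in range(count):
            key, pos = decode(buf, pos)
            val, pos = decode(buf, pos)
            out[key] = val
        return out, pos
    raise ValueError(f"unknown tag {tag!r} at offset {pos - 1}")
```

The encoder walks the value piece by piece instead of copying one block of memory, which is why both sides only have to agree on the byte layout, not on how either language stores its variables internally.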
It would be great if a PHP arrays structure could be taken from
memory, written to a file or stream, and opened in Python as a
Dictionary without any database or serialization inbetween.
The DBus extension in PECL (http://pecl.php.net/package/DBus) implements
a serializer and deserializer for most data types as supported by DBUS,
and there is also a python module available to do the same.
with kind regards,
Derick
http://derickrethans.nl | http://xdebug.org
Like Xdebug? Consider a donation: http://xdebug.org/donate.php
twitter: @derickr and @xdebug
Thanks for the response Derick,
So does the data structure that's being serialized remain in some form
that reflects its binary encoding on the system?
Or is there some byte-encoding that goes on to generalize the binary structure?
I guess ... Does it find the values in memory of the distinct values,
and then create a serialized set of indexes in a c-compatible
data-type?
or... is this just another form of string serialization, where
variables are represented by strings encoded in ASCII?
If this is a binary serialization, two additional questions:
1.) Does this mean that you can, to some accuracy, determine the size of
an in-memory variable by counting the sizeof() on the components that
make up the data-structure in C?
2.) Can the PHP sessions serialization be made to use this type of
serialization - so the serialized values could be shared between
multiple applications.. (let me give you an example)
Would I be better off just re-implementing the PHP C code for its
session serializer/deserializer in Python than attempting to keep it
binary?
For PHP sessions, let's say you use memcached to store a serialized
representation of the session.
Let's also say that you would like to read that session data
occasionally from Python to offload work that your webserver would
normally have to perform.
If you were to write a binary serialized form to memory - the C
implementation of the serialization could be very easily ported to
Python as an extension and both could be maintained with ease
(as both PHP and Python are written in C; a Java variant wouldn't be
too hard either using JNI).
I guess this brings up the question of - How hard would it be to
expose the binary object that dbus sends over-the-wire so it could be
written to file instead?
Overall it would be nice if PHP had functionality similar to Python's
pickle exposed to the user, so they could store binary
representations of objects as they saw fit.
It seems to me this would be vastly more efficient than an
ASCII-encoded serialization as well.
If my logic is poorly formed because of a lack of understanding please
feel free to educate me!
Thanks Again Derick - Cheers!
So does the data structure that's being serialized remain in some form
that reflects its binary encoding on the system? Or is there
some byte-encoding that goes on to generalize the binary structure?
PHP only knows about binary text, and then numbers and floats. Our
current serialize format is just text based.
I guess ... Does it find the values in memory of the distinct values,
and then create a serialized set of indexes in a c-compatible
data-type?
or... is this just another form of string serialization, where
variables are represented by strings encoded in ASCII?
The current serializer does do the latter. pecl/dbus does... DBUS
serialization of course.
If this is a binary serialization, two additional questions:
1.) Does this mean that you can, to some accuracy, determine the size of
an in-memory variable by counting the sizeof() on the components that
make up the data-structure in C?
Nope, that won't work: a string occupies at least two separate blocks of
memory, and arrays and objects occupy many more.
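A small Python sketch using ctypes shows why sizeof() on the top-level structure says little about the real footprint. The Record layout is a hypothetical stand-in, not PHP's actual internal structure: it holds an int plus a pointer to a string that lives in a separate block.

```python
import ctypes

class Record(ctypes.Structure):
    # Hypothetical C-style struct: an int plus a pointer to string data
    # that is allocated somewhere else.
    _fields_ = [("id", ctypes.c_int), ("name", ctypes.c_char_p)]

short = Record(1, b"x")
long_ = Record(2, b"x" * 10_000)

# sizeof() counts the int and the pointer, never the bytes pointed to.
size_short = ctypes.sizeof(short)
size_long = ctypes.sizeof(long_)

# The bytes a plain fwrite(&X, sizeof(X), 1, fp) would put in the file:
# the int, padding, and a memory address that is meaningless elsewhere.
raw = ctypes.string_at(ctypes.addressof(long_), ctypes.sizeof(long_))
```

Both records report the same size even though one of them references 10,000 bytes of string data, and the raw dump contains an address rather than the string itself.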
2.) Can the PHP sessions serialization be made to use this type of
serialization - so the serialized values could be shared between
multiple applications.. (let me give you an example)
There is an igbinary extension/serializer.
Would I be better off just re-implementing the PHP C code for its
session serializer/deserializer in Python than attempting to keep it
binary?
What you want... both work.
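For the "re-implement it in Python" route, a minimal sketch of a reader for the text format produced by PHP's serialize() might look like the following. It is an illustration under stated assumptions, not a port of the PHP C code: it handles only null, booleans, integers, floats, strings, and arrays, and ignores objects and references. The function names are made up for this example. Session data written by PHP's default session handler wraps values differently from plain serialize() output, and that wrapping is not handled here.

```python
def php_unserialize(data: bytes):
    """Parse a subset of PHP serialize() output: N, b, i, d, s, a."""
    value, pos = _parse(data, 0)
    if pos != len(data):
        raise ValueError("trailing data after value")
    return value

def _parse(data, pos):
    kind = data[pos:pos + 1]
    if kind == b"N":                        # N;
        return None, pos + 2
    if kind in (b"b", b"i", b"d"):          # b:1;  i:42;  d:0.5;
        end = data.index(b";", pos)
        text = data[pos + 2:end].decode("ascii")
        if kind == b"b":
            return text == "1", end + 1
        if kind == b"i":
            return int(text), end + 1
        return float(text), end + 1
    if kind == b"s":                        # s:<byte length>:"<bytes>";
        colon = data.index(b":", pos + 2)
        length = int(data[pos + 2:colon])
        start = colon + 2                   # skip the colon and opening quote
        end = start + length
        return data[start:end], end + 2     # skip the closing quote and ;
    if kind == b"a":                        # a:<count>:{<key><value>...}
        colon = data.index(b":", pos + 2)
        count = int(data[pos + 2:colon])
        pos = colon + 2                     # skip the colon and {
        result = {}
        for _ in range(count):
            key, pos = _parse(data, pos)
            val, pos = _parse(data, pos)
            result[key] = val
        return result, pos + 1              # skip }
    raise ValueError(f"unsupported type tag {kind!r} at offset {pos}")
```

Strings come back as Python bytes because the length in the format is a byte count and PHP strings carry no encoding; decoding them is left to the caller.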
For PHP sessions, let's say you use memcached to store a serialized
representation of the session.
Let's also say that you would like to read that session data
occasionally from Python to offload work that your webserver would
normally have to perform.
If you were to write a binary serialized form to memory - the C
implementation of the serialization could be very easily ported to
Python as an extension and both could be maintained with ease
(as both PHP and Python are written in C; a Java variant wouldn't be
too hard either using JNI).
The problem is that PHP and Python have different data types, and DBus
has yet again others. That is always going to be a problem.
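One concrete case of that mismatch: a PHP array is a single ordered map type, while Python separates lists from dicts. The sketch below shows one possible convention for mapping a decoded PHP array (represented here as a Python dict with int or string keys) onto Python types. The function name and the rule it applies are assumptions for illustration; other conventions are equally defensible.

```python
def php_array_to_python(arr: dict):
    """Map a decoded PHP array (ordered int/str keys) onto a list or dict.

    Assumed rule: keys that are exactly 0, 1, ..., n-1 in order become a
    list; anything else stays a dict. Nested arrays are converted too.
    """
    converted = {
        key: php_array_to_python(val) if isinstance(val, dict) else val
        for key, val in arr.items()
    }
    if list(converted.keys()) == list(range(len(converted))):
        return list(converted.values())
    return converted
```

Note that an empty PHP array becomes an empty list under this rule, so the information "this was meant to be a map" is lost. That kind of ambiguity is what any cross-language format has to settle explicitly.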
I guess this brings up the question of - How hard would it be to
expose the binary object that dbus sends over-the-wire so it could be
written to file instead?
You need a tool such as dbus-monitor, but you need to write the
deserialization from DBUS into your "fileformat" yet again.
Overall it would be nice if PHP had functionality similar to Python's
pickle exposed to the user, so they could store binary
representations of objects as they saw fit.
It seems to me this would be vastly more efficient than an
ASCII-encoded serialization as well.
That's no different from what serialization does really. There is
http://opensource.dynamoid.com/ for binary serializations.
If my logic is poorly formed because of a lack of understanding please
feel free to educate me!
You write too much text :P
regards,
Derick
I'll keep it short and sweet Derick ! =]
http://opensource.dynamoid.com/ -> Is exactly what I was thinking of,
glad I asked so I am not re-inventing the wheel :)
I have a feeling I can get educated really fast on php data-type
internals by taking a look at the source.
Thanks again!