Hello,
I use PEAR Cache_Lite to save calculated values. The package uses
serialize() to transform the content of a variable into a writable form.
This transformed value is then saved to disk.
This could be improved! Would it be possible to rewrite the serialize
function so that it can write directly to disk? If you have a value
that needs 10MB, you also need about 10MB for the serialized copy before
you can write the data to disk. So you need 20MB. If serialize() (and of
course unserialize()) could write directly to disk (or read directly
from disk), you would only need 10MB.
I think that would perform much better for every file cache than the
current implementation does. I have a scenario in which a variable
needs 100MB. Because calculating this value takes a lot of time, I
wanted to save it to a cache (with the help of (un)serialize). But
then I need 200MB, and that doesn't look nice, because it is not really
necessary.
As I understand the serialize structure, it would be no problem to
extend it to save directly to disk. But my C knowledge is not as good as
it should be. I would be very happy if this idea could be realized.
Mathias
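The doubling described above can be seen with today's API: serialize() must build the whole string in memory before it is written out, and reading back needs the whole file as a string before unserialize() rebuilds the value. A minimal sketch (the cache file path is made up for illustration):

```php
<?php
// Today: serialize() builds the full serialized string in memory,
// so peak usage is roughly the data plus its serialized copy.
$data = ['key' => str_repeat('x', 1024)]; // stand-in for a large value

$blob = serialize($data);                 // full copy in memory
file_put_contents('/tmp/cache_demo.bin', $blob, LOCK_EX);

// Reading back doubles memory again: the whole file is read into
// a string before unserialize() reconstructs the value.
$restored = unserialize(file_get_contents('/tmp/cache_demo.bin'));
var_dump($restored === $data); // bool(true)
```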
can write the data to disk. So you need 20MB. If serialize() (and of
course unserialize()) could write directly to disk (or read directly
from disk), you would only need 10MB.
Actually, having serialize/unserialize able to write directly to a
stream and read directly from a stream might be interesting; it would
probably improve working with things like large sessions or caching
large data substantially.
--
Stanislav Malyshev, Zend Products Engineer
stas@zend.com http://www.zend.com/
Stanislav Malyshev wrote:
can write the data to disk. So you need 20MB. If serialize() (and of
course unserialize()) could write directly to disk (or read directly
from disk), you would only need 10MB.
Actually, having serialize/unserialize able to write directly to a
stream and read directly from a stream might be interesting; it would
probably improve working with things like large sessions or caching
large data substantially.
Indeed, especially since this is the most common use case. Maybe it
should optionally also return an MD5 of the written data.
regards,
Lukas
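The suggested checksum would not require buffering either: PHP's incremental hashing API can fold each chunk into the digest as it is written. A sketch of the idea only, not the proposed serialize() change itself:

```php
<?php
// Sketch: stream data to disk in chunks while computing an MD5
// incrementally, so neither the file contents nor the hash input
// ever has to sit in memory all at once.
$src = str_repeat('payload-', 4096); // stand-in for serializer output
$ctx = hash_init('md5');
$fp  = fopen('/tmp/hash_demo.bin', 'wb');

foreach (str_split($src, 8192) as $chunk) {
    fwrite($fp, $chunk);             // write this chunk to disk
    hash_update($ctx, $chunk);       // and fold it into the digest
}
fclose($fp);

$digest = hash_final($ctx);
var_dump($digest === md5_file('/tmp/hash_demo.bin')); // bool(true)
```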
Stanislav Malyshev wrote:
can write the data to disk. So you need 20MB. If serialize() (and of
course unserialize()) could write directly to disk (or read directly
from disk), you would only need 10MB.
Actually, having serialize/unserialize able to write directly to a
stream and read directly from a stream might be interesting; it would
probably improve working with things like large sessions or caching
large data substantially.
Indeed, especially since this is the most common use case. Maybe it
should optionally also return an MD5 of the written data.
If we're to add this, make sure writes to the files are atomic.
regards,
Derick
--
Derick Rethans
http://derickrethans.nl | http://ez.no | http://xdebug.org
If we're to add this, make sure writes to the files are atomic.
Does PHP now ensure fwrite is atomic? If it doesn't, then writing
directly from serialize doesn't change a thing.
Stanislav Malyshev, Zend Products Engineer
stas@zend.com http://www.zend.com/
If we're to add this, make sure writes to the files are atomic.
Does PHP now ensure fwrite is atomic? If it doesn't, then writing
directly from serialize doesn't change a thing.
Only "a" (append) mode is atomic, per write call; normal fwrites are
not. However, you'd need to write the whole file to disk atomically,
not on every fwrite. And you cannot first cache it in memory, as you
then lose the whole advantage of this idea.
regards,
Derick
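A common way to get whole-file atomicity without buffering everything in memory is to stream into a temporary file in the target directory and then rename() it into place; on POSIX filesystems the rename is atomic. A sketch of that pattern (the helper name and paths are made up), not part of any proposed serialize() API:

```php
<?php
// Sketch: stream chunks into a temp file on the same filesystem,
// then atomically rename() it over the final path. Readers see
// either the old complete file or the new one, never a half-write.
function atomic_write_chunks(string $dest, iterable $chunks): void
{
    $tmp = tempnam(dirname($dest), 'cache'); // same filesystem as $dest
    $fp  = fopen($tmp, 'wb');
    foreach ($chunks as $chunk) {
        fwrite($fp, $chunk);                 // stream, never buffer it all
    }
    fclose($fp);
    rename($tmp, $dest);                     // atomic on POSIX filesystems
}

atomic_write_chunks('/tmp/atomic_demo.bin', ['part1-', 'part2-', 'part3']);
var_dump(file_get_contents('/tmp/atomic_demo.bin')); // string(17) "part1-part2-part3"
```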
on every fwrite. And you cannot first cache it in memory, as you then
lose the whole advantage of this idea.
IIRC sessions are locked by PHP anyway, and for other uses, if locking
is important, it is already implemented anyway, so we shouldn't really
try to solve all the world's problems with this one.
Stanislav Malyshev, Zend Products Engineer
stas@zend.com http://www.zend.com/
on every fwrite. And you cannot first cache it in memory, as you then
lose the whole advantage of this idea.
IIRC sessions are locked by PHP anyway, and for other uses, if locking
is important, it is already implemented anyway, so we shouldn't really
try to solve all the world's problems with this one.
? Nobody is talking about sessions here, just about the serialize()
function that is also used for a myriad of other things...
regards,
Derick
? Nobody is talking about sessions here, just about the
serialize()
You mean you are not talking about sessions. I, however, do. Sessions
are one of the obvious examples where such functionality could improve
performance.
function that is also used for a myriad of other things...
Oh really? I guess that's why we talk about adding stuff to it in
order to make it more efficient in certain scenarios, and not replacing
it with a new one. Precisely the scenarios where external locking
would be happening anyway - when either using sessions or caching.
--
Stanislav Malyshev, Zend Products Engineer
stas@zend.com http://www.zend.com/
Stanislav Malyshev wrote:
? Nobody is talking about sessions here, just about the
serialize()
You mean you are not talking about sessions. I, however, do. Sessions
are one of the obvious examples where such functionality could improve
performance.
Well, as the topic implies, I am quite sure that the user request was
about caching into a custom file and not inside the session. Both are
frequent use cases.
regards,
Lukas
Well, as the topic implies, I am quite sure that the user request was
about caching into a custom file and not inside the session. Both are
frequent use cases.
If his cache had no locking before, what changed?
Stanislav Malyshev, Zend Products Engineer
stas@zend.com http://www.zend.com/
If his cache had no locking before, what changed?
Well, I have been using several cache classes. A good one is PEAR
Cache_Lite.
This cache serializes your data and writes it to a file - of course
with file locking.
I could imagine that an improved serialize function could work like this:

$fp = @fopen($filename, "wb");
if ($fp) {
    @flock($fp, LOCK_EX);
    serialize($data, $fp);
    @flock($fp, LOCK_UN);
    @fclose($fp);
}

This makes it possible to decide whether you want to use a locking
function or not.
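With today's API the same locking pattern can only be written by serializing to a string first; a sketch (the extra in-memory copy produced by serialize() here is exactly what the hypothetical two-argument serialize() above would avoid):

```php
<?php
// Today's equivalent: serialize into a string (the extra memory
// copy), then write it under an exclusive advisory lock.
$data = ['answer' => 42];
$fp = fopen('/tmp/flock_demo.bin', 'wb');
if ($fp) {
    flock($fp, LOCK_EX);               // exclusive lock while writing
    fwrite($fp, serialize($data));     // full serialized copy in memory
    flock($fp, LOCK_UN);
    fclose($fp);
}
var_dump(unserialize(file_get_contents('/tmp/flock_demo.bin'))['answer']); // int(42)
```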
The same technique could be applied to imagepng() (and others):
instead of writing

imagepng($img, "/path/to/file");

it would be better to write

$fp = @fopen($filename, "wb");
if ($fp) {
    @flock($fp, LOCK_EX);
    imagepng($img, $fp);
    @flock($fp, LOCK_UN);
    @fclose($fp);
}

This way you can decide whether to use locking functions or not. I
think this could be a big improvement (until now you have to use
imagepng() in combination with ob_start / ob_get_contents / ob_end_clean
to cache images in a secure way).
Mathias
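The output-buffering workaround mentioned above can be sketched like this (assumes the GD extension is loaded; the file path is made up for illustration):

```php
<?php
// Current workaround: capture imagepng()'s output with output
// buffering, then write the captured bytes under a lock.
// Requires the GD extension.
$img = imagecreatetruecolor(8, 8);   // tiny placeholder image

ob_start();
imagepng($img);                      // no filename: PNG bytes go to the buffer
$png = ob_get_contents();            // grab the buffered PNG data
ob_end_clean();

file_put_contents('/tmp/cached_demo.png', $png, LOCK_EX);
imagedestroy($img);

var_dump(substr($png, 1, 3));        // string(3) "PNG" (PNG signature)
```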
Stanislav Malyshev wrote:
can write the data to disk. So you need 20MB. If serialize() (and of
course unserialize()) could write directly to disk (or read directly
from disk), you would only need 10MB.
Actually, having serialize/unserialize able to write directly to a
stream and read directly from a stream might be interesting; it would
probably improve working with things like large sessions or caching
large data substantially.
Indeed, especially since this is the most common use case. Maybe it
should optionally also return an MD5 of the written data.
If we're to add this, make sure writes to the files are atomic.
Is this suggesting that the entire 80M upload has to be done in a
single operation?...
Or is the md5/sha1 computed chunk by chunk, in parallel, with writing
buffered data to the disk?
Cuz if it's the former, I don't see that working out too well for
ginormous uploaded files...
Which people probably shouldn't be doing over HTTP anyway, but they
do, and that's the reality one has to deal with...
Apologies if I'm being alarmist and totally mis-reading this through
my ignorance.
--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?
Stanislav Malyshev wrote:
can write the data to disk. So you need 20MB. If serialize() (and of
course unserialize()) could write directly to disk (or read directly
from disk), you would only need 10MB.
Actually, having serialize/unserialize able to write directly to a
stream and read directly from a stream might be interesting; it would
probably improve working with things like large sessions or caching
large data substantially.
Indeed, especially since this is the most common use case. Maybe it
should optionally also return an MD5 of the written data.
If we're to add this, make sure writes to the files are atomic.
Is this suggesting that the entire 80M upload has to be done in a
single operation?...
Wrong thread ;-) This is on serialize, not on hashes.
regards,
Derick
--
Derick Rethans
http://derickrethans.nl | http://ez.no | http://xdebug.org