Hi,
Before I say anything, I have just concluded 20 hours of travel and feel
kinda funny, so please take that into consideration :).
Regarding API changes, let's first be clear:
- Any changes to file creation need to be understood in terms of what
they means. For instance, let's look at:
$phar['full/path/to/file'] = 'hi';
This creates 1 entry "full/path/to/file"
$phar['full']['path']['to']['file'] = 'hi';
This would create 4 entries
full
full/path
full/path/to
full/path/to/file
In addition, this would mean that offsetGet() would have to create a
directory. Why?
$phar['full']['path']['to']['file'] = 'hi' calls offsetGet() with
'full', 'path', and 'to', and offsetSet() only with 'file'.
In other words, the proposed change is much more complex, introduces
magic and counterintuitive behavior inside the extension. For instance,
let's say we have this code:
<?php
$a = $phar['dir'];
$b = $phar['dri']; // typo
?>
if phar.readonly=1, the second line would throw an exception for
attempting to create the directory "dri"
What about this?
<?php
unset($phar['dri']); // dri doesn't exist
?>
The above code would first create the dri directory, and then erase it.
In other words, the proposed syntax is not a good idea.
- Names must be very carefully considered in relation to existing APIs
and the pre-existing methods.
Phar::create(), for instance, should be named buildFromDirectory() to
match buildFromIterator().
what benefit is there in adding complexity of remembering the Phar::GZ
constant over setCompressedGZ()? We have to think hard about changes
like this.
- I would like to suggest these changes, and these alone:
- simplify all compression methods, with more consistency of naming
that clearly differentiates between compressing files within an archive,
and compressing the entire archive. This abstract idea is more
important to me than the specific naming, but it should be very easy to do. - add a zip-style API for adding files/directories, addFile,
addEmptyDir, addFileFromString with the exact method names - add buildFromDirectory()
- any further suggestions have off-list vetting with the maintainers
of phar for technical issues prior to publicly posting them, so we can
truly work together and present a more polished API suggestion to those
interested in phar on the first try without confusing them. I am NOT
proposing we take discussion to private, only that technical issues be
worked out offlist prior to presenting an idea.
Thank you for your patience, I know I am a bit of a hard-ass at times,
but I am genuinely excited about the discussion and creativity that is
going into this.
Greg
Hi Greg,
what benefit is there in adding complexity of remembering the Phar::GZ
constant over setCompressedGZ()? We have to think hard about changes like
this.
We did think hard about them. In fact I thought you'd implemented
compress(Phar::GZ) to replace the long-named one now :) Need to go play I
guess.
- I would like to suggest these changes, and these alone:
- simplify all compression methods, with more consistency of naming that
clearly differentiates between compressing files within an archive, and
compressing the entire archive. This abstract idea is more important to
me than the specific naming, but it should be very easy to do.
Yep.
- add a zip-style API for adding files/directories, addFile, addEmptyDir,
addFileFromString with the exact method names
OK...
- add buildFromDirectory()
Cool...
- any further suggestions have off-list vetting with the maintainers of
phar for technical issues prior to publicly posting them, so we can truly
work together and present a more polished API suggestion to those
interested in phar on the first try without confusing them. I am NOT
proposing we take discussion to private, only that technical issues be
worked out offlist prior to presenting an idea.Thank you for your patience, I know I am a bit of a hard-ass at times, but
I am genuinely excited about the discussion and creativity that is going
into this.Greg
Hi Greg,
Am Freitag, den 28.03.2008, 17:13 -0500 schrieb Greg Beaver:
[...]
In addition, this would mean that offsetGet() would have to create a
directory. Why?$phar['full']['path']['to']['file'] = 'hi' calls offsetGet() with
'full', 'path', and 'to', and offsetSet() only with 'file'.
I know that and I discussed that with Marcus beforehand. I don't think
that implicitly creating the directory is much an issue. If somebody
does not want that, he can just use isset() before.
In other words, the proposed change is much more complex, introduces
magic and counterintuitive behavior inside the extension. For instance,
let's say we have this code:<?php
$a = $phar['dir'];
$b = $phar['dri']; // typo
?>if phar.readonly=1, the second line would throw an exception for
attempting to create the directory "dri"
Wouldn't the second line throw an exeception anyway because the index is
not defined? And: can't offsetGet() check weither Phar::canWrite() is
true?
What about this?
<?php
unset($phar['dri']); // dri doesn't exist
?>The above code would first create the dri directory, and then erase it.
No, it would not. Please take a look at
http://lars.schokokeks.org/php/phar.phps and
http://lars.schokokeks.org/php/phar.php
In other words, the proposed syntax is not a good idea.
Two of three assumptions were not correct or at least arguable, so I
can't agree that easily.
- Names must be very carefully considered in relation to existing APIs
and the pre-existing methods.Phar::create(), for instance, should be named buildFromDirectory() to
match buildFromIterator().
I agree, that this name is much better.
what benefit is there in adding complexity of remembering the Phar::GZ
constant over setCompressedGZ()? We have to think hard about changes
like this.
setCompressedGZ() is just not the way object oriented APIs should
behave, as it taints the API with concrete implementation details
("there can be gzip compression and every compression algorithm is
exposed" opposed to "there can be compression, one variant is gzip").
Phar::compress() indicates that "the archive can be compressed" with the
concrete algorithm as a parameter. What about in 5 years when a new,
free compression algorithm rises, which fixes the problems of bzip2.
Would you like to add another setCompressedFooBar() or just add the
algorithm and the class constant and be fine?
If you insist on setCompressedFoo(), I would at least propose to not
implement that as a setter but to describe the attached behaviour, which
would be Phar::compressWithGzip() or something like that. Nevertheless,
please seriously consider to do it with a single compress()-method.
- I would like to suggest these changes, and these alone:
- simplify all compression methods, with more consistency of naming
that clearly differentiates between compressing files within an archive,
and compressing the entire archive. This abstract idea is more
important to me than the specific naming, but it should be very easy to do.
What would be your concrete proposal regarding that?
[...]
cu, Lars
Lars Strojny wrote:
Hi Greg,
Am Freitag, den 28.03.2008, 17:13 -0500 schrieb Greg Beaver:
[...]In addition, this would mean that offsetGet() would have to create a
directory. Why?$phar['full']['path']['to']['file'] = 'hi' calls offsetGet() with
'full', 'path', and 'to', and offsetSet() only with 'file'.I know that and I discussed that with Marcus beforehand. I don't think
that implicitly creating the directory is much an issue. If somebody
does not want that, he can just use isset() before.
Hi Lars,
If one uses file_put_contents('/path/to/this/file', 'hi') and
'/path/to/this' does not exist, there is an error. The same is true of
fopen, regardless of mode. mkdir()
even fails unless the recursive
parameter is explicitly specified. I do not think that this behavior
should be different in ext/phar, especially because all that is needed
is to create 1 directory (full/path/to/) and add the file.
Incidentally, you cut out the portion of my objections that related to
the creation of many unnecessary entries in the phar archive. It is
important to understand that every unnecessary entry in a phar archive
causes phar to do extra processing on loadup, which ultimately increases
the latency (this will be true until we implement caching of phar
manifests, which will be relatively easy to do, I think). In addition,
the performance of:
$phar['long/path/to/subdir/file'] = 'hi';
compared to:
$phar['long']['path']['to']['subdir']['file'] = 'hi';
is significantly slower. Each [] results in a call to offsetGet and
instantiation of a new PharFileInfo object plus the final offsetSet call.
In other words, the proposed change is much more complex, introduces
magic and counterintuitive behavior inside the extension. For instance,
let's say we have this code:<?php
$a = $phar['dir'];
$b = $phar['dri']; // typo
?>if phar.readonly=1, the second line would throw an exception for
attempting to create the directory "dri"Wouldn't the second line throw an exeception anyway because the index is
not defined? And: can't offsetGet() check weither Phar::canWrite() is
true?
Yes, but if you're attempting to access the "dir" directory, an
intuitive exception would be "dri not found" instead of "write access is
disabled, cannot create dri directory." If I got the write access
error, I would report a bug, since I was not trying to write anything.
What about this?
<?php
unset($phar['dri']); // dri doesn't exist
?>The above code would first create the dri directory, and then erase it.
No, it would not. Please take a look at
http://lars.schokokeks.org/php/phar.phps and
http://lars.schokokeks.org/php/phar.php
You're right, chock this up to jet lag on my part. :)
In other words, the proposed syntax is not a good idea.
Two of three assumptions were not correct or at least arguable, so I
can't agree that easily.
Only 1 was incorrect, but my objection still holds on a fundamental
level: offsetGet() should not attempt to write anything.
However, let's examine the feature from another perspective - I think it
could be useful as a read access method, for several reasons:
- foreach($phar as $name => $file)
This construct does not iterate over every file in the phar archive, but
instead iterates over only those files in the top-most directory. If a
directory returned implemented ArrayAccess and Iterator, one could treat
it like a phar object. Currently this isn't possible. This means
ultimately one should be able to do:
foreach($phar['subdir'] as $name => $file)
or
foreach($phar['subdir/anothersubdir'] as $name => $file)
or
foreach($phar['subdir']['anothersubdir'] as $name => $file)
- Some might find it easier to do something like:
$phar['subdir/with/really/long/path/file1.txt'] = 'a';
$subdir = $phar['subdir/with/really/long/path'];
$subdir['file2.txt'] = 'a';
$subdir['file3.txt'] = 'a';
Incidentally, if you want to know a real hack in the implementation of
ArrayAccess, you can make a directory now with this code:
$phar['subdir/with/really/long/path/'] = 0;
Needless to say, that probably should be removed as an option :).
Could you live with $phar['this']['thing'] as a read access API
(offsetGet only)?
- Names must be very carefully considered in relation to existing APIs
and the pre-existing methods.Phar::create(), for instance, should be named buildFromDirectory() to
match buildFromIterator().I agree, that this name is much better.
what benefit is there in adding complexity of remembering the Phar::GZ
constant over setCompressedGZ()? We have to think hard about changes
like this.setCompressedGZ() is just not the way object oriented APIs should
behave, as it taints the API with concrete implementation details
("there can be gzip compression and every compression algorithm is
exposed" opposed to "there can be compression, one variant is gzip").
Phar::compress() indicates that "the archive can be compressed" with the
concrete algorithm as a parameter. What about in 5 years when a new,
free compression algorithm rises, which fixes the problems of bzip2.
Would you like to add another setCompressedFooBar() or just add the
algorithm and the class constant and be fine?
If you insist on setCompressedFoo(), I would at least propose to not
implement that as a setter but to describe the attached behaviour, which
would be Phar::compressWithGzip() or something like that. Nevertheless,
please seriously consider to do it with a single compress()-method.
As I was trying to say in the portion you quoted below, Phar::compress()
is already used to compress the entire archive, and should not be
overloaded. I would be happy with these method names:
Phar::compressArchive() /* Steph hated this when I originally proposed
it, perhaps she's changed her mind? */
Phar::compressAllFiles()
Phar::uncompressArchive()
Phar::uncompressAllFiles()
If I need to explain what the two methods do, then we need to choose new
names, could you react to these names and let me know what you think
they would do?
- I would like to suggest these changes, and these alone:
- simplify all compression methods, with more consistency of naming
that clearly differentiates between compressing files within an archive,
and compressing the entire archive. This abstract idea is more
important to me than the specific naming, but it should be very easy to do.What would be your concrete proposal regarding that?
[...]
See above.
Lars, I also think it might help to understand my reaction better that
we've been publicly discussing the API of compression and other
phar-related issues on pecl-dev for a while, with no feedback. To
suddenly see an RFC with major changes without having even been notified
was annoying, I hope that in the future you will at least contact the
leads of the project you are proposing changes for so that minor
disagreements can be worked out in private. I always relish
improvements, Steph can attest to both my flaws and my talents in this
regard (I believe you witnessed a rough patch on IRC one day...), and
also the satisfaction in the ultimate result, so I hope we will find our
right working interaction from here forward. :)
Do you want to take a crack at a patch for any of these API features
once they are worked out (buildFromDirectory should be an easy one, if
you would like to start there)?
Thanks,
Greg
Hi Greg,
Am Samstag, den 29.03.2008, 17:58 -0500 schrieb Greg Beaver:
[...]
If one uses file_put_contents('/path/to/this/file', 'hi') and
'/path/to/this' does not exist, there is an error. The same is true of
fopen, regardless of mode.mkdir()
even fails unless the recursive
parameter is explicitly specified. I do not think that this behavior
should be different in ext/phar, especially because all that is needed
is to create 1 directory (full/path/to/) and add the file.
I mostly agree. Except that the whole argument chain applies to the
current API. You mentioned mkdir("foo/bar/baz") triggering an error
without the recursive argument, but Phar['foo/bar/baz'] does not.
[... Performance concerns ...]
$phar['long/path/to/subdir/file'] = 'hi';
compared to:
$phar['long']['path']['to']['subdir']['file'] = 'hi';
is significantly slower. Each [] results in a call to offsetGet and
instantiation of a new PharFileInfo object plus the final offsetSet call.
Could you estimate how hard that performance hit would be?
[....]
Yes, but if you're attempting to access the "dir" directory, an
intuitive exception would be "dri not found" instead of "write access is
disabled, cannot create dri directory." If I got the write access
error, I would report a bug, since I was not trying to write anything.
The exception message would be far away from being optimal, yes.
[...]
Only 1 was incorrect, but my objection still holds on a fundamental
level: offsetGet() should not attempt to write anything.However, let's examine the feature from another perspective - I think it
could be useful as a read access method, for several reasons:
- foreach($phar as $name => $file)
This construct does not iterate over every file in the phar archive, but
instead iterates over only those files in the top-most directory. If a
directory returned implemented ArrayAccess and Iterator, one could treat
it like a phar object. Currently this isn't possible. This means
ultimately one should be able to do:
[... Code snippets ... ]
I would like that, yes.
- Some might find it easier to do something like:
$phar['subdir/with/really/long/path/file1.txt'] = 'a';
$subdir = $phar['subdir/with/really/long/path'];
$subdir['file2.txt'] = 'a';
$subdir['file3.txt'] = 'a';Incidentally, if you want to know a real hack in the implementation of
ArrayAccess, you can make a directory now with this code:$phar['subdir/with/really/long/path/'] = 0;
Needless to say, that probably should be removed as an option :).
Yes, this shouldn't be allowed while
$phar['foo/bar/baz'] = new DirectoryIterator(); could be allowed.
Could you live with $phar['this']['thing'] as a read access API
(offsetGet only)?
I'm not sure if I would vote for a partial solution. I think - also I
really do think the current ArrayAccess API is cluttered - it is as
least consistent, which is a plus.
[...]
As I was trying to say in the portion you quoted below, Phar::compress()
is already used to compress the entire archive, and should not be
overloaded. I would be happy with these method names:Phar::compressArchive() /* Steph hated this when I originally proposed
it, perhaps she's changed her mind? */
I like it. And much better, it does not contain the compression
format :-)
Phar::compressAllFiles()
Wouldn't compressFiles() be fine?
Phar::uncompressArchive()
Phar::uncompressAllFiles()
Same here, maybe uncompressFiles()
If I need to explain what the two methods do, then we need to choose new
names, could you react to these names and let me know what you think
they would do?
compressAllFiles()/uncompressAllFiles(): compress all the files
contained in the archive
compressArchive()/uncompressArchive(): Compress the archive
[...]
Lars, I also think it might help to understand my reaction better that
we've been publicly discussing the API of compression and other
phar-related issues on pecl-dev for a while, with no feedback.
I would not care so much for phar, if it's inclusion in core has not
been targeted.
To suddenly see an RFC with major changes without having even been notified
was annoying, I hope that in the future you will at least contact the
leads of the project you are proposing changes for so that minor
disagreements can be worked out in private.
On http://pecl.php.net/package/phar Marcus is listed as one of the two
project leads. I've talked to him before writing the RFC on friday. So
this accusation fizzles out. I sort of understand your irritation, but
from my point of view improvement takes place constantly. In the time
Phar was discussed on pecl-dev I was not subscribed to it, now I am. So
the next time I will come up earlier most presumably.
[...]
Do you want to take a crack at a patch for any of these API features
once they are worked out (buildFromDirectory should be an easy one, if
you would like to start there)?
Yes, I'm willing to help with it to a degree, as I said in a previous
mail.
cu, Lars
Lars Strojny wrote:
Hi Greg,
Am Samstag, den 29.03.2008, 17:58 -0500 schrieb Greg Beaver:
[...]If one uses file_put_contents('/path/to/this/file', 'hi') and
'/path/to/this' does not exist, there is an error. The same is true of
fopen, regardless of mode.mkdir()
even fails unless the recursive
parameter is explicitly specified. I do not think that this behavior
should be different in ext/phar, especially because all that is needed
is to create 1 directory (full/path/to/) and add the file.I mostly agree. Except that the whole argument chain applies to the
current API. You mentioned mkdir("foo/bar/baz") triggering an error
without the recursive argument, but Phar['foo/bar/baz'] does not.[... Performance concerns ...]
$phar['long/path/to/subdir/file'] = 'hi';
compared to:
$phar['long']['path']['to']['subdir']['file'] = 'hi';
is significantly slower. Each [] results in a call to offsetGet and
instantiation of a new PharFileInfo object plus the final offsetSet call.Could you estimate how hard that performance hit would be?
We're talking about a similar difference between iterating over an array
and using an ArrayIterator, which is to say about 20 times slower. This
is due to the many extra opcodes for each offset retrieval, which then
calls an expensive method call to offsetGet(), creates the directory
inside the phar archive, writes to disk, and continues. Because we are
writing to disk, without a not-at-all-obvious startBuffering() call,
with large archives, the performance difference could be 100 times
slower or more.
In addition, although Phar does support the idea of thinking of archives
as multi-dimensional arrays, this is an abstraction not supported by the
actual archive file formats. For other examples, take a look at at
ext/zip. There is no support for opendir()
in the stream wrapper of
ext/zip because it requires the kind of path grepping that pecl/phar
does. Internally, no archive format stores paths the way directories
are stored in a typical file system. In other words, typically a
directory is a file that contains a list of other files or directories.
This allows very quick navigation on a random access disk, but causes
extreme bloat inside an archive. Files are stored as full paths in what
is the equivalent of a single-dimensional array. In other words, the
truth of the matter is that phar's ArrayAccess API is a 100% accurate
reflection of the true structure of the actual archives we are
accessing, and the iteration supported is a convenience for simulating
regular file systems. The API you describe actually layers an
artificial complexity over this design, and as a result requires much
more code complexity.
Let's also remember we're talking about phar creation, which is going to
be far less common than phar usage. To use a phar, you don't even need
to know what phar is let alone what the API is. One just includes the
phar archive, or adds it to include_path with a phar:// stream wrapper path.
I've spent a significant amount of time actually implementing the new
read-only implementation of offsetGet() in the past day, and although I
think this might be possible I am running into a weird segfault in SPL
on shutdown caused by a double destruction of the object. In addition,
the patch is not even close to complete, as it does not support the
addition of ArrayAccess to PharFileInfo, which would be required in
order to implement what your RFC describes. The complexity of the patch
is reaching the hundreds of lines and will be in the thousands of lines,
all to add syntactic sugar that is far more inefficient than the current
syntax - is this really a necessary feature?
As Elizabeth points out, most people are going to build their iterator
by creating a directory structure and then add the entire directory all
at once. Use of ArrayAccess to write to the archive would be done for
one or two files at most.
When reading from a phar, there is no real incentive to use relative
paths outside of scanning a directory for files (to locate drivers or
plugins, for instance), and iterating over the entire archive is usually
reserved for searching for a particular file. The ability to glob
through a phar archive would of course be the most ideal solution for
that approach (perhaps globiterator already works? I haven't tried it).
What I would like to do, however, is to rethink offsetSet(). I think
that we should introduce a property of PharFileInfo called "content"
that is read/write, and can be used to perform the equivalent of
file_get_contents()
/file_put_contents(). This will allow a consistency
that is much more intuitive. offsetSet() could then simply be removed,
or perhaps be refactored as a way to pass in a PharFileInfo and clone
the information.
Yes, this shouldn't be allowed while
$phar['foo/bar/baz'] = new DirectoryIterator(); could be allowed.
I will simply add makeEmptyDir() a la zip, although it is too bad
ext/zip never thought to go with mkdir()
.
Phar::compressArchive() /* Steph hated this when I originally proposed
it, perhaps she's changed her mind? */I like it. And much better, it does not contain the compression
format :-)
One problem with this suggestion is that phar renames the archive by
default when compressing. I would rather we combine compression and
format conversion into a single convert() method that accepts constants.
$phar = new Phar('blah.phar');
$phar = $phar->convert(Phar::GZ); // compress with gzip
$phar = $phar->convert(Phar::TAR | PHAR::GZ); // convert to .phar.tar.gz
$phar = $phar->convert(Phar::NONE | PHAR::DATA); // convert to
non-executable .tar, returns PharData object
$phar = $phar->convert(Phar::EXE); // convert back to .phar.tar
This would also solve a subtle but important problem with the current
conversion API where intermediary files are lying around.
Phar::compressAllFiles()
Wouldn't compressFiles() be fine?
Probably, especially if we remove the compress() method.
Lars, I also think it might help to understand my reaction better that
we've been publicly discussing the API of compression and other
phar-related issues on pecl-dev for a while, with no feedback.I would not care so much for phar, if it's inclusion in core has not
been targeted.To suddenly see an RFC with major changes without having even been notified
was annoying, I hope that in the future you will at least contact the
leads of the project you are proposing changes for so that minor
disagreements can be worked out in private.On http://pecl.php.net/package/phar Marcus is listed as one of the two
project leads. I've talked to him before writing the RFC on friday. So
this accusation fizzles out. I sort of understand your irritation, but
from my point of view improvement takes place constantly. In the time
Phar was discussed on pecl-dev I was not subscribed to it, now I am. So
the next time I will come up earlier most presumably.
My point is you should contact the leads (plural) of a project. I've
been breaking my back implementing changes and features requested as far
back as last May, and finally reached a fully functional solution a few
weeks ago. Your RFC would completely redo a large portion of the API.
I think it is a common courtesy to be consulted prior to publishing the
RFC for the simple reason that I am the one writing the majority of the
code and implementing these changes.
Do you want to take a crack at a patch for any of these API features
once they are worked out (buildFromDirectory should be an easy one, if
you would like to start there)?Yes, I'm willing to help with it to a degree, as I said in a previous
mail.
How soon can we expect a patch for this? Steph also expressed interest,
and we are on a rather tight time schedule to get this to beta, which
will mean a feature freeze (no new crap, just bug fixes).
Let me also be clear: the largest priority right now is on making
pecl/phar work with xdebug, APC and on increasing stability, test
coverage, and ensuring we work out all OS X issues since none of the
devs have an OS X platform to test phar on.
Greg
Hi,
For other examples, take a look at at
ext/zip. There is no support foropendir()
in the stream wrapper of
ext/zip because it requires the kind of path grepping that pecl/phar
does.
That's easy to add and will be available at some point, that was
simply not needed and nobody ever requested this feature. Extract or
add using a pattern is however much more useful. It is already
partially supported.
Yes, this shouldn't be allowed while
$phar['foo/bar/baz'] = new DirectoryIterator(); could be allowed.I will simply add makeEmptyDir() a la zip, although it is too bad
ext/zip never thought to go withmkdir()
.
I thought about mkdir but it is a bad name as it implies a shell
command. makeEmptyDir while being longer is self explaining. I also
asked before choosing this name, as far as I remember nobody proposed
mkdir either.
Cheers,
Pierre Joye wrote:
Hi,
For other examples, take a look at at
ext/zip. There is no support foropendir()
in the stream wrapper of
ext/zip because it requires the kind of path grepping that pecl/phar
does.That's easy to add and will be available at some point, that was
simply not needed and nobody ever requested this feature. Extract or
add using a pattern is however much more useful. It is already
partially supported.
Yes, I agree.Yes, this shouldn't be allowed while
$phar['foo/bar/baz'] = new DirectoryIterator(); could be allowed.I will simply add makeEmptyDir() a la zip, although it is too bad
ext/zip never thought to go withmkdir()
.I thought about mkdir but it is a bad name as it implies a shell
command. makeEmptyDir while being longer is self explaining. I also
asked before choosing this name, as far as I remember nobody proposed
mkdir either.
I'm fine with makeEmptyDir(), I was just thinking out loud :)
This wasn't intended to be a referendum on ext/zip, just to be clear.
Greg
Hi Greg,
What I would like to do, however, is to rethink offsetSet(). I think
that we should introduce a property of PharFileInfo called "content"
that is read/write, and can be used to perform the equivalent of
file_get_contents()
/file_put_contents(). This will allow a consistency
that is much more intuitive. offsetSet() could then simply be removed,
or perhaps be refactored as a way to pass in a PharFileInfo and clone
the information.
Could you offer up some pseudo-code showing how this will be used? I can't
visualize it at all from that description.
One problem with this suggestion is that phar renames the archive by
default when compressing. I would rather we combine compression and
format conversion into a single convert() method that accepts constants.$phar = new Phar('blah.phar');
$phar = $phar->convert(Phar::GZ); // compress with gzip
$phar = $phar->convert(Phar::TAR | PHAR::GZ); // convert to .phar.tar.gz
$phar = $phar->convert(Phar::NONE | PHAR::DATA); // convert to
non-executable .tar, returns PharData object
$phar = $phar->convert(Phar::EXE); // convert back to .phar.tar
Didn't we just go full circle here? There was a proposed convert() method
that did absolutely everything (on e-paper) and it was far less intuitive
than when conversion was separated from compression. That's why you ended up
implementing them separately, as I recall.
This would also solve a subtle but important problem with the current
conversion API where intermediary files are lying around.
Can't we simply delete those intermediary files on success? That would be
the normal procedure when renaming, no?
Do you want to take a crack at a patch for any of these API features
once they are worked out (buildFromDirectory should be an easy one, if
you would like to start there)?Yes, I'm willing to help with it to a degree, as I said in a previous
mail.How soon can we expect a patch for this? Steph also expressed interest,
and we are on a rather tight time schedule to get this to beta, which
will mean a feature freeze (no new crap, just bug fixes).
I'll implement buildFromDirectory() this weekend. It would've been last
weekend, but I got caught up in trying to bring the last few non-core PECL
modules into line over versioning structure, which hasn't been entirely
plain sailing. (Still 5 + IBM to go.)
- Steph
Steph Fox wrote:
Hi Greg,
What I would like to do, however, is to rethink offsetSet(). I think
that we should introduce a property of PharFileInfo called "content"
that is read/write, and can be used to perform the equivalent of
file_get_contents()
/file_put_contents(). This will allow a consistency
that is much more intuitive. offsetSet() could then simply be removed,
or perhaps be refactored as a way to pass in a PharFileInfo and clone
the information.Could you offer up some pseudo-code showing how this will be used? I
can't visualize it at all from that description.
I should have done this in the original post, because it turns out that
it would require offsetGet() to implicitly create the file, which brings
me back to my original objection to these changes. I'm adding in
getContents() right now to retrieve the file contents into a string, so
one can do:
<?php
$phar = new Phar('blah.phar');
echo $phar['path/to/file.txt']->getContents();
?>
The truth is, I think offsetSet() could accept either a string/fp (file
contents) or a PharFileInfo() that can be used to set a slew of things:
- metadata
- per-file compression
- file contents
This would also be relatively trivial to implement.
One problem with this suggestion is that phar renames the archive by
default when compressing. I would rather we combine compression and
format conversion into a single convert() method that accepts constants.$phar = new Phar('blah.phar');
$phar = $phar->convert(Phar::GZ); // compress with gzip
$phar = $phar->convert(Phar::TAR | PHAR::GZ); // convert to .phar.tar.gz
$phar = $phar->convert(Phar::NONE | PHAR::DATA); // convert to
non-executable .tar, returns PharData object
$phar = $phar->convert(Phar::EXE); // convert back to .phar.tarDidn't we just go full circle here? There was a proposed convert()
method that did absolutely everything (on e-paper) and it was far less
intuitive than when conversion was separated from compression. That's
why you ended up implementing them separately, as I recall.
It does look similar, but the key difference here is that convert() was
a method that changed the underlying object. If we return a new object,
this is a completely different ballpark, the current implementation
actually can remain the same and remove the intermediary files.
This would also solve a subtle but important problem with the current
conversion API where intermediary files are lying around.Can't we simply delete those intermediary files on success? That would
be the normal procedure when renaming, no?
Actually, no. The current implementation has no modification to the
original file on disk, as there is no way to know in advance if the user
actually intended to erase it, and it is actually impossible to do so if
there are any open references to the archive (i.e. Phar objects or
stream fp's lying around).
Do you want to take a crack at a patch for any of these API features
once they are worked out (buildFromDirectory should be an easy one, if
you would like to start there)?Yes, I'm willing to help with it to a degree, as I said in a previous
mail.How soon can we expect a patch for this? Steph also expressed interest,
and we are on a rather tight time schedule to get this to beta, which
will mean a feature freeze (no new crap, just bug fixes).I'll implement buildFromDirectory() this weekend. It would've been last
weekend, but I got caught up in trying to bring the last few non-core
PECL modules into line over versioning structure, which hasn't been
entirely plain sailing. (Still 5 + IBM to go.)
OK.
Greg
Hi Greg,
Phar::compressArchive() /* Steph hated this when I originally proposed it,
perhaps she's changed her mind? */
Nope :)
I still think compress(Phar::BZ2|GZ), uncompress(), compressAllFiles(),
uncompressAllFiles(). And I also think compressAllFiles() only means
something to a handful of people on the planet and needs good clear
documentation.
Do you want to take a crack at a patch for any of these API features once
they are worked out (buildFromDirectory should be an easy one, if you
would like to start there)?
Aw, I had my eye on that :)
- Steph
Thanks,
Greg