Hi all,
I have been working hard on pecl/phar to address several issues raised
last May when it was first mentioned on the list, and would like to
summarize where phar stands today with regards to those criticisms:
Criticisms:
- non-standard file format
- limited introspection
- no support for web-based applications
- by default, phar archives require the phar extension to run
- massive modification of php applications required to run them as a
phar archive
- no caching of phar files in opcode caches
- has write support in the extension
Current status of phar addresses most of these criticisms:
- full read/write support for tar and zip file formats plus original
phar file format
- introspection of phar-based archives is available via the "phar.phar"
command-line tool, and all standard tar/zip tools can introspect tar and
zip files
- web-based phar archives has been completed, 1 line of code is needed
to enable the web front controller. Concept proved by running
phpMyAdmin from its original tarball without code changes
- default stub for phar-based archives allows standard PHP applications
to run from a phar archive without the phar extension being present.
- interception of read-based file functions and include now allows most
php applications to run from a phar archive without any modification to
the original code
- Gopal committed code to APC that allows caching of files from stream
wrappers
- write support is disabled by default, and can only be enabled on the
system level, this has not changed.
phar is also the first PHP extension to provide full read/write support
of the tar file format on windows (libarchive supports this on unix)
phar implements zip support with native PHP code, enabling some features
not present in ext/zip such as opendir() stream support, bzip2
compression, file permissions stored in the zip archive, and greatly
improved efficiency on accessing just a few files within a large zip
archive.
phar also supports creating and running gzipped/bzipped tar or phar
archives without requiring decompression (this is done on the fly) at
the expense of the expected performance hit.
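As a rough illustration of the on-the-fly decompression described above, the sketch below builds a small tar archive, gzips it as a whole, and reads one file back through the phar:// stream wrapper without unpacking anything by hand. It is only a sketch: it uses the PharData class and compress() method from the phar extension as later released (not the CVS API discussed in this thread), it needs zlib, and the directory and file names are made up for the example.

```php
<?php
// Sketch only: assumes the phar and zlib extensions are loaded.
// PharData is used because it may write even when phar.readonly=1.
$dir = sys_get_temp_dir() . '/phar_gz_demo_' . uniqid();
mkdir($dir);

$tar = new PharData($dir . '/demo.tar');
$tar->addFromString('hello.txt', 'Hello from inside the archive');

// Compress the whole archive; this writes demo.tar.gz next to demo.tar.
$tar->compress(Phar::GZ);

// Read one file straight out of the gzipped tar, decompressed on the fly.
$contents = file_get_contents('phar://' . $dir . '/demo.tar.gz/hello.txt');
echo $contents, "\n";
```

The same phar:// URL form works for include and for the other read-based file functions, which is what lets an application run from a compressed archive at the cost of the decompression overhead.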
Phar has no required dependencies, and optional dependencies on spl,
zlib and bz2 (zlib+bz2 are obviously required for
compressing/decompressing with those formats, spl is only required for
fancy-pants stuff, the stream wrapper, web front controller, and other
major features do not require spl)
Development is still actively occurring on phar, to fix a few known
issues and increase code coverage in the unit tests. The enhancements
above are in CVS in the soon-to-be-released phar 2.0.0.
If phar were in core, it would allow people distributing applications or
libraries to bundle unpacking code or installation code inside the
archive. Applications could also be designed to run right out of the
phar archive for users to try them out or even for final installation
using the standard tar/zip file formats. The phar file format has the
advantage of not requiring the phar extension in order to run. The
tar/zip file formats have the advantage that if the phar extension is
not present a simple "unzip" command (or the equivalent for tar or for
windows) allows easy installation.
As such, I would like to ask for a second consideration of bundling phar
in core, as it has a huge potential for enhancing the distribution of
PHP applications.
Thanks,
Greg
Hello Gregory,
My point of view is this action list:
1. include phar in core
2.a. add ext/zip compatible functions and replace ext/zip
2.b. change ext/zip to use zip lib of phar and add stream support
3. drop ability to disable spl
I have no preference between 2a or 2b. Though technically I guess that 2a
is probably much faster to achieve.
best regards
marcus
p.s.: I haven't done much in regards to Phar lately - and can only say that
you did an amazing, unexpected job. You turned Phar into something way
better than we ever thought of.
(In reply to Greg's message above, sent Monday, January 28, 2008, 6:30:58 PM.)
Hi Marcus,
1. include phar in core
2.a. add ext/zip compatible functions and replace ext/zip
2.b. change ext/zip to use zip lib of phar and add stream support
3. drop ability to disable spl
I have no preference between 2a or 2b. Though technically I guess that 2a
is probably much faster to achieve.
You're behind the times. Check it out; 18 of the 20 broken tests are down to
the fact that we can no longer 'knit' a pure zip archive because Greg
dropped the ext/zip dependency.
p.s.: I haven't done much in regards to Phar lately - and can only say that
you did an amazing, unexpected job. You turned Phar into something way
better than we ever thought of.
Hear, hear!
- Steph
Hello Steph,
so you mean we do not have to confuse our users by solution 2a because we
only have the minimum subset of zip in phar that phar actually needs?
marcus
(In reply to Steph's message above, sent Monday, January 28, 2008, 7:01:42 PM.)
Hi Marcus,
so you mean we do not have to confuse our users by solution 2a because we
only have the minimum subset of zip in phar that phar actually needs?
Yep. But Greg can explain better.
- Steph
Marcus Boerger wrote:
Hello Steph,
so you mean we do not have to confuse our users by solution 2a because we
only have the minimum subset of zip in phar that phar actually needs?
Although phar can be used to create and read any zip archive, it will
automatically add the file ".phar/stub.php" when creating or modifying a
zip file, as this file is needed in order to directly run a zip-based
phar archive. Although this could be disabled to make phar behave like ext/zip, I
don't think phar is the right API for simple data-based zips. For one
thing, write access is disabled by default, meaning on-the-fly creation
of zip/tar in web applications is disabled by default. Phar is designed
with the idea that one creates the phar archive on a development machine
and users just use it read-only.
This is most important for phar-based phar archives, as these can run
without ext/phar being present. One possibility that just occurred to
me would be to allow creation of zip/tar files that do not contain
".phar/stub.php" if phar.readonly=1. These archives would not be
executable because of this omission, and it would allow creation of
tar/zip via phar in any application. However, this may be too confusing
with the current API. One possibility is simply to provide
PharZip/PharTar objects that can only work with tar/zip-based archives
and allow creation of non-executable tar/zip archives. This would
require minimal code changes, probably just a line or two for each
object as we could use the existing Phar object as its base and a simple
flag.
In terms of minimum subset, phar supports nearly the full range of
possibility with zip archives currently (encryption is the only major
feature missing), and I'm not sure how we could successfully subset that
without obfuscating the code.
Steph's reference to test failures is simply that some of the existing
tests use ext/zip to create zip files on the fly, and then pass those to
phar, in order to test phar's ability to take a fully created zip file
as opposed to one it made itself, and is not a user issue, just a
structural issue in the tests themselves that has nothing to do with
phar's zip support. The tests simply need to be modified to create the
zips using pecl/phar and copy the filename to one phar doesn't already
know about, and the failures will go away.
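The PharZip/PharTar objects floated above are only a proposal in this message. As a hedged illustration of the idea, the sketch below uses PharData, the class the phar extension later shipped for non-executable tar/zip archives, to create a plain zip that carries no ".phar/stub.php". The paths and file names are invented for the example, and the two checks simply look at the raw bytes of the file that was written.

```php
<?php
// Sketch only: assumes the phar extension is loaded. The archive format is
// taken from the file extension, so "plain.zip" becomes a zip archive.
$dir = sys_get_temp_dir() . '/phar_zip_demo_' . uniqid();
mkdir($dir);
$path = $dir . '/plain.zip';

$zip = new PharData($path);
$zip->addFromString('docs/readme.txt', 'just data, no stub');

// Inspect the raw file: zip entry names are stored uncompressed in the headers.
$raw = file_get_contents($path);
$isZip = substr($raw, 0, 2) === 'PK';                   // zip signature bytes
$hasStub = strpos($raw, '.phar/stub.php') !== false;    // stub entry present?

var_dump($isZip, $hasStub);
echo $zip['docs/readme.txt']->getContent(), "\n";
```

Because PharData archives are data-only, this also works with write access to executable phars disabled, which matches the "create non-executable tar/zip in any application" case described above.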
Greg
The tests simply need to be modified to create the
zips using pecl/phar and copy the filename to one phar doesn't already
know about, and the failures will go away.
I thought you wanted 'pure' zips for the tests - that told me!
So how do I create a zip with ext/phar then, other than by using
convertToZip()?
- Steph
Just as a side note. I have been following Phar development closely
since basically the beginning of the project and I must admit that,
despite being an advocate and user of this piece of code, over the
past few months Greg has put in a LOT of effort and wonderful
commits. I personally have been impressed by his C skills that I was
not aware of before.
This said, I believe that phar can be used widely by many, many
projects and that anyone using Java and jar files has the right to be
awfully jealous of phar in PHP.
Anyway, thanks Greg, you have made my life much easier with the streak
of commits and optimizations + features you have added to phar. I want
it in core, want it now.
Thanks Greg and Marcus a bunch (once again)
--
David Coallier,
Founder & Software Architect,
Agora Production (http://agoraproduction.com)
51.42.06.70.18
Current status of phar addresses most of these criticisms:
Looks impressive, great work!
phar implements zip support with native PHP code, enabling some features
I am a bit confused about native PHP code here - we are talking about an
extension, right? So what exactly is meant here?
Also, as I understand, there are features that ext/zip missed and phar
extension needs. Any reason not to add them to ext/zip?
Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com
Stanislav Malyshev wrote:
Current status of phar addresses most of these criticisms:
Looks impressive, great work!
phar implements zip support with native PHP code, enabling some features
I am a bit confused about native PHP code here - we are talking about
an extension, right? So what exactly is meant here?
Also, as I understand, there are features that ext/zip missed and phar
extension needs. Any reason not to add them to ext/zip?
Hi Stas,
Sorry, just a confusion of language on my part. ext/zip relies on
libzip, an external lib that uses malloc/open and zlib directly.
pecl/phar has built-in zip support based on php internals, it uses
emalloc/streams which is what allows zlib/bz2 to be optional deps.
Pierre and I are talking about merging efforts, which would benefit
everyone. "libifying" the zip support in pecl/phar would be a
relatively simple task, although some of the features are phar-specific,
such as opendir() support. ext/zip would need to copy the generic
opendir() stuff we use to get that (it scans all files to figure out
which ones are in a directory, which is O(n) but works). zip support is
only a small part of phar's features, merging all of phar's features
into ext/zip would not make any sense and would confuse folks
tremendously, such as being able to create tar archives with ext/zip :).
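To make the O(n) directory scan mentioned above concrete, here is a minimal sketch in plain PHP. It is not phar's actual C code: listDirectory() and the sample manifest are hypothetical names invented for this example. It only shows the general technique of deriving a directory listing from a flat list of archive entry names by checking every entry once.

```php
<?php
// Hypothetical helper, not phar's implementation: list the immediate
// children of $dir by scanning every entry name in a flat manifest (O(n)).
function listDirectory(array $entryNames, string $dir): array
{
    $prefix = $dir === '' ? '' : rtrim($dir, '/') . '/';
    $children = [];
    foreach ($entryNames as $name) {
        if ($prefix !== '' && strncmp($name, $prefix, strlen($prefix)) !== 0) {
            continue; // entry lives outside $dir
        }
        $rest = substr($name, strlen($prefix));
        if ($rest === '') {
            continue; // the directory entry itself
        }
        // Keep only the first path component; deeper files imply a subdirectory.
        $first = explode('/', $rest, 2)[0];
        $children[$first] = true;
    }
    // Numeric-looking names become integer keys, so cast back to strings.
    return array_map('strval', array_keys($children));
}

$entries = ['index.php', 'lib/a.php', 'lib/util/b.php', 'lib/util/c.php', 'docs/readme.txt'];
print_r(listDirectory($entries, 'lib'));
print_r(listDirectory($entries, ''));
```

Each listing touches every entry once, so the cost grows with the size of the archive rather than the size of the directory, which is the trade-off described above.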
Hopefully, the question of how the zip support will eventually end up
looking (inside phar, phar/zip merge, new php zip lib) is kind of a
non-issue. I'm very confident Pierre, Marcus, Steph and I will be able
to work out the best solution for both ext/zip and pecl/phar, the main
question at this point is bundling phar.
Greg
Hi Greg,
Current status of phar addresses most of these criticisms:
Looks impressive, great work!
phar implements zip support with native PHP code, enabling some features
I am a bit confused about native PHP code here - we are talking about an
extension, right? So what exactly is meant here?
Also, as I understand, there are features that ext/zip missed and phar
extension needs. Any reason not to add them to ext/zip?
Exactly and I'm rather surprised to see this post given the recent
efforts to export the Zip symbols to allow any extension to share the
zip features. Most of the discussions have been public on pecl-dev.
There are some private discussions about our respective plans, even
today. I can say that the goals and APIs are different, there is a
need for both extensions (zip will never provide what phar does for
the application archive and phar will never go as far as zip for the
zip format support), that's my understanding of the current situation.
For the record on this list, we also have a lot of work in progress
with the libzip developers and myself to add many new features. The
short list is:
- custom stream support (like in libgd or libxml), allowing native (as
in operating system native supports) IO functions, bringing the
maximum portability and integration (we can use php stream as well as
soon as php supports large files > 2 GB, I did not follow this topic,
maybe it is already in place). The stream works in both read and write
mode.
- crypt support
- Zipstream (not like the previous point), inline zipped data stream.
Like what you can see in many java-based web services.
- Drop of the open files system limit for the amount of entries used
at the same time. This limitation was rather annoying but necessary to
ensure the consistency and safety of zip creation
However, my point remains intact, I'm not in favour of having phar
included, unless there is an improved cooperation with the community
(at large) to create this application-archive format. It would really
rock to have a standard format designed, approved and adopted by all
PHP developers and projects. At this point we can bundle it or it may
be a chicken-egg problem :)
Cheers,
Hi Pierre,
Exactly and I'm rather surprised to see this post given the recent
efforts to export the Zip symbols to allow any extension to share the
zip features.
I think until the zip features were shared the library's limitations hadn't
been too obvious.
Most of the discussions have been public on pecl-dev.
There are some private discussions about our respective plans, even
today.
Not that I'm aware of...
I can say that the goals and APIs are different, there is a
need for both extensions (zip will never provide what phar does for
the application archive and phar will never go as far as zip for the
zip format support), that's my understanding of the current situation.
That's my understanding too.
However, my point remains intact, I'm not in favour of having phar
included, unless there is an improved cooperation with the community
(at large) to create this application-archive format.
What exactly would we need to do to improve cooperation?
It would really
rock to have a standard format designed, approved and adopted by all
PHP developers and projects. At this point we can bundle it or it may
be a chicken-egg problem :)
Well, that's why the aim is to get it bundled now. Because using the Phar
extension you can now optionally write phars that can be opened by: tar (+
bzip2/zlib), zip (including Windows Explorer), or plain PHP, either CLI or
through a browser, using the default stub; it's about as flexible as anyone
could want it to be. There's still also the option to write phars that
require ext/phar enabled before they can be read, which personally I think
is a bad idea but Marcus wanted it that way.
- Steph
Hi Steph,
Exactly and I'm rather surprised to see this post given the recent
efforts to export the Zip symbols to allow any extension to share the
zip features.
I think until the zip features were shared the library's limitations hadn't
been too obvious.
There is no limitation per se. There is room for improvements like the
ones I listed. There is a huge amount of users and a very small amount
of bug reports or support requests, it simply works for almost
everyone :).
Most of the discussions have been public on pecl-dev.
There are some private discussions about our respective plans, even
today.
Not that I'm aware of...
I'm not sure I understand what you mean here. There was discussion on
pecl-dev and there is currently a private discussion about phar and zip
:)
However, my point remains intact, I'm not in favour of having phar
included, unless there is an improved cooperation with the community
(at large) to create this application-archive format.
What exactly would we need to do to improve cooperation?
What we did not do for PEAR, which is why it took too long for PEAR to be
widely adopted (and there is still work to be done in this area). The
problem is that you can't force a given format. I have no magic
solution but I think we should find a way to get some of the major
projects involved, to learn what they would actually like to see and/or
whether what is done fits their needs.
It would really
rock to have a standard format designed, approved and adopted by all
PHP developers and projects. At this point we can bundle it or it may
be a chicken-egg problem :)
Well, that's why the aim is to get it bundled now. Because using the Phar
extension you can now optionally write phars that can be opened by: tar (+
bzip2/zlib), zip (including Windows Explorer), or plain PHP, either CLI or
through a browser, using the default stub; it's about as flexible as anyone
could want it to be. There's still also the option to write phars that
require ext/phar enabled before they can be read, which personally I think
is a bad idea but Marcus wanted it that way.
I think it is a good thing to require ext/phar even for the read
operations. It certainly allows a shit load of optimization and tricks
that will never be possible otherwise. But Greg or Marcus will give us
a better answer :)
Cheers,
I think it is a good thing to require ext/phar even for the read
operations. It certainly allows a shit load of optimization and tricks
that will never be possible otherwise. But Greg or Marcus will give us
a better answer :)
? It's a < 7kb add-in stub to make it open-access.
- Steph
I think it is a good thing to require ext/phar even for the read
operations. It certainly allows a shit load of optimization and tricks
that will never be possible otherwise. But Greg or Marcus will give us
a better answer :)
? It's a < 7kb add-in stub to make it open-access.
What does that mean? It needs 7KB more? irrelevant imho :)
--
Pierre
http://blog.thepimp.net | http://www.libgd.org
Pierre Joye wrote:
Hi Greg,
Current status of phar addresses most of these criticisms:
Looks impressive, great work!
phar implements zip support with native PHP code, enabling some features
I am a bit confused about native PHP code here - we are talking about an
extension, right? So what exactly is meant here?
Also, as I understand, there are features that ext/zip missed and phar
extension needs. Any reason not to add them to ext/zip?
Exactly and I'm rather surprised to see this post given the recent
efforts to export the Zip symbols to allow any extension to share the
zip features. Most of the discussions have been public on pecl-dev.
I expended considerable energy in this direction, but shortly after
succeeding in getting the symbols exported, I began to run into
limitation after limitation, and these were all intrinsic design
problems in libzip, this was why I implemented the new zip stuff. I'm
sorry if my communication was less-than-ideal, I have no intention of
operating in dark alleys or behind closed doors, and have been emailing
you offlist every step of the way as well as onlist for all changes that
involved ext/zip directly.
There are some private discussions about our respective plans, even
today. I can say that the goals and APIs are different, there is a
need for both extensions (zip will never provide what phar does for
the application archive and phar will never go as far as zip for the
zip format support), that's my understanding of the current situation.
I agree, both extensions are needed for the API differences. ext/zip is
focussed on data users and is very good for that purpose, whereas phar
is designed for php applications. However, it is inaccurate to say
phar's zip format support is inferior to ext/zip. Phar's support of the
zip file format is actually more extensive than libzip's in that it
supports extra fields for files, bzip2 compression per-file, and is more
easily extended. For instance, adding encryption support to phar's zip
implementation is trivial compared to adding it to libzip. Neither
extension supports split files, which is something that might make sense
for ext/zip, or zip64, but zip64 would also be trivial to add to phar,
as the header formats are already in pharzip.h waiting to be
implemented, should the need arise, for instance, to support InfoZIP
zips with UTF-8 file comments/filenames.
For the record on this list, we also have a lot of work in progress
with the libzip developers and myself to add many new features. The
short list is:
- custom stream support (like in libgd or libxml), allowing native (as
in operating system native supports) IO functions, bringing the
maximum portability and integration (we can use php stream as well as
soon as php supports large files > 2 GB, I did not follow this topic,
maybe it is already in place). The stream works in both read and write
mode.
phar uses PHP streams, so if there is a 2 GB limit, this will affect
phar until it is fixed.
- crypt support
This, interestingly, was raised in a private conversation with a
developer over the weekend as a major must-have for closed-source apps,
as they could distro as an encrypted zip, so I am fully in favor of
implementing this in phar with the help and support of those who would
use it. The stuff is all there in pharzip.h, all we need is the crypto
and implementation details (for instance where does the key go? ini
setting?).
- Zipstream (not like the previous point), inline zipped data stream.
Like what you can see in many java-based web services.
This is a limitation of phar I have been thinking about for a few days
without a fleshed-out solution - currently bundling an archive inside of
a phar archive is possible, but the stream URL does not support
accessing internal phar archives. We may implement a mount-like thing a
la PHK (Francois, if you are listening, this would be a great time to
help implement PHK's better features in phar).
Accessing individual files within the zip directly is easily done with
Phar::webPhar(), you just put the phar in your web path, and append the
local path to the internal file. In other words, to access file
"path/to/blah.jpg" in phar "myphar.phar.zip.php" in the docroot of
localhost, you browse to
"http://localhost/myphar.phar.zip.php/path/to/blah.jpg"
- Drop of the open files system limit for the amount of entries used
at the same time. This limitation was rather annoying but necessary to
ensure the consistency and safety of zip creation
This is already fixed in phar's zip implementation. When reading files,
phar has at most 2 open file pointers per phar archive, one for
uncompressed and one for compressed files.
However, my point remains intact, I'm not in favour of having phar
included. Unless there is an improved cooperation with the community
(in large) to create this application-archive format. It would really
rock to have a standard format designed, approved and adopted by all
PHP developers and projects. At this point we can bundle it or it may
be a chicken-egg problem :)
I am proposing phar precisely for this purpose - it's time for open
discussions with all interested parties. I have had discussions with
representatives from several projects privately, but now I would like to
take the discussion public because I believe phar provides a foundation
on which to build a standard format, while providing extreme flexibility
for non-standard niche projects.
An important feature of phar I had not mentioned which may alleviate the
need to decide on an exact format is that Phar can convert from phar =>
zip => tar. The code looks like this:
<?php
// assume myphar.phar.zip is some application we downloaded
$a = new Phar('myphar.phar.zip');
$a->convertToTar();
?>
convertToPhar()/convertToTar() can accept an optional compression
argument Phar::GZ or Phar::BZ2 to compress the entire archive.
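The convertToTar()/convertToPhar() names above belong to the CVS code of the time. The phar extension as later released exposes conversion through convertToExecutable() and convertToData(), so a present-day equivalent of a whole-archive conversion with compression might look like the sketch below. It uses PharData and invented file names so that it runs with the default phar.readonly=1, and it assumes zlib is available.

```php
<?php
// Sketch only: assumes the phar and zlib extensions are loaded.
$dir = sys_get_temp_dir() . '/phar_conv_demo_' . uniqid();
mkdir($dir);

// Start from a zip-based data archive.
$zip = new PharData($dir . '/app.zip');
$zip->addFromString('readme.txt', 'converted without unpacking by hand');

// zip => tar, gzipping the entire archive; the result is written as app.tar.gz.
$tgz = $zip->convertToData(Phar::TAR, Phar::GZ);

var_dump(file_exists($dir . '/app.tar.gz'));
echo $tgz['readme.txt']->getContent(), "\n";
```

For an executable archive the analogous call is convertToExecutable(), which requires phar.readonly=0 because it writes a runnable phar.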
Greg
Pierre Joye wrote:
Hi Greg,
Current status of phar addresses most of these criticisms:
Looks impressive, great work!
phar implements zip support with native PHP code, enabling some features
I am a bit confused about native PHP code here - we are talking about an
extension, right? So what exactly is meant here?
Also, as I understand, there are features that ext/zip missed and phar
extension needs. Any reason not to add them to ext/zip?
Exactly and I'm rather surprised to see this post given the recent
efforts to export the Zip symbols to allow any extension to share the
zip features. Most of the discussions have been public on pecl-dev.
I expended considerable energy in this direction, but shortly after
succeeding in getting the symbols exported, I began to run into
limitation after limitation, and these were all intrinsic design
problems in libzip, this was why I implemented the new zip stuff. I'm
sorry if my communication was less-than-ideal, I have no intention of
operating in dark alleys or behind closed doors, and have been emailing
you offlist every step of the way as well as onlist for all changes that
involved ext/zip directly.
There are some private discussions about our respective plans, even
today. I can say that the goals and APIs are different, there is a
need for both extensions (zip will never provide what phar does for
the application archive and phar will never go as far as zip for the
zip format support), that's my understanding of the current situation.
I agree, both extensions are needed for the API differences. ext/zip is
focussed on data users and is very good for that purpose, whereas phar
is designed for php applications. However, it is inaccurate to say
phar's zip format support is inferior to ext/zip. Phar's support of the
zip file format is actually more extensive than libzip's in that it
supports extra fields for files, bzip2 compression per-file, and is more
easily extended. For instance, adding encryption support to phar's zip
implementation is trivial compared to adding it to libzip. Neither
extension supports split files, which is something that might make sense
for ext/zip, or zip64, but zip64 would also be trivial to add to phar,
It is already implemented. We are working on portability issues.
That's why I put the custom stream support as the top priority, it is
the way to go to solve almost all portability issues.
as the header formats are already in pharzip.h waiting to be
implemented, should the need arise, for instance, to support InfoZIP
zips with UTF-8 file comments/filenames.
By no means can you rely only on UTF-8. It is handy but it is not what
the zip specs say. OS-specific encoding is sadly the way they work.
For the record on this list, we also have a lot of work in progress
with the libzip developers and myself to add many new features. The
short list is:
- custom stream support (like in libgd or libxml), allowing native (as
in operating system native supports) IO functions, bringing the
maximum portability and integration (we can use php stream as well as
soon as php supports large files > 2 GB, I did not follow this topic,
maybe it is already in place). The stream works in both read and write
mode.
phar uses PHP streams, so if there is a 2 GB limit, this will affect
phar until it is fixed.
- crypt support
this, interestingly, was raised in a private conversation with a
developer over the weekend as a major must-have for closed-source apps,
as they could distro as an encrypted zip, so I am fully in favor of
implementing this in phar with the help and support of those who would
use it. The stuff is all there in pharzip.h, all we need is the crypto
and implementation details (for instance where does the key go? ini
setting?).
- Zipstream (not like the previous point), inline zipped data stream.
Like what you can see in many java-based web services.This is a limitation of phar I have been thinking about for a few days
without a fleshed-out solution - currently bundling an archive inside of
a phar archive is possible, but the stream URL does not support
accessing internal phar archives. We may implement a mount-like thing a
la PHK (Francois, if you are listening, this would be a great time to
help implement PHK's better features in phar).
Accessing individual files within the zip directly is easily done with
Phar::webPhar(), you just put the phar in your web path, and append the
local path to the internal file. In other words, to access file
"path/to/blah.jpg" in phar "myphar.phar.zip.php" in the docroot of
localhost, you browse to
"http://localhost/myphar.phar.zip.php/path/to/blah.jpg"
I really don't like this feature, but it may be handy for apps :) I
prefer what we do in zip, fopen("zip://...","..") or
imagecreatefromjpeg("zip://..");
An interesting feature for both zip and phar would be the ability to
pipe the read stream to another stream (ie stdout) instead of writing
the read data into a file/variable.
- Dropping the system open-files limit on the number of entries used
at the same time. This limitation was rather annoying but necessary to
ensure the consistency and safety of zip creation.
This is already fixed in phar's zip implementation. When reading files,
phar has at most 2 open file pointers per phar archive, one for
uncompressed and one for compressed files.
It is easy to fix in libzip too but that's not a priority right now.
Especially not before the new IO infrastructure is in place.
Now that we have moved one step forward, I would like to make a
suggestion: we should stop the ext/A vs ext/B. It is pointless (we
agreed that goals are different) and counter productive. I still think
that in the end you should rely on ext/zip for the zip operations. The
"limitations" you are talking about could be fixed very quickly in
ext/zip without waiting for the custom stream support (for the file
handles, checksum, etc.). The write for the php stream support is also
possible without too much work, but it is an extra step I did not want
to take, I prefer to get the new IO in place before.
However, I find your current positions confusing, and each mail brings
opposing arguments about the work shared between the other archive
extensions and phar. Can you clarify your view, please?
Cheers,
Essential: nothing
Optional: bz2, spl, zlib
Completely different and nothing whatever to do with phar: ext/zip.
- Steph
I don't get a word of what you are saying. (sidenote, I was asking
Greg to clarify his view)
--
Pierre
http://blog.thepimp.net | http://www.libgd.org
Pierre Joye wrote:
for ext/zip, or zip64, but zip64 would also be trivial to add to phar,
It is already implemented. We are working on portability issues.
That's why I put custom stream support as the top priority; it is
the way to solve almost all portability issues.
Which is? zip64? This is not implemented in any public releases of
libzip, or in any of the 5 branches of zip's bundling of libzip, is
there a public repo that I have missed that I should be looking at?
as the header formats are already in pharzip.h waiting to be
implemented, should the need arise, for instance, to support InfoZIP
zips with UTF-8 file comments/filenames.
By no means can you rely only on UTF-8. It is handy, but it is not what
the zip specs say. OS-specific encoding is sadly the way they work.
InfoZIP does have a specific extension stored as an extra field. This
is mentioned in appnote.txt. ZIP64 support for UTF-8 is also explicitly
mentioned, perhaps you thought I was talking about a different area of
the spec?
Accessing individual files within the zip directly is easily done with
Phar::webPhar(), you just put the phar in your web path, and append the
local path to the internal file. In other words, to access file
"path/to/blah.jpg" in phar "myphar.phar.zip.php" in the docroot of
localhost, you browse to
"http://localhost/myphar.phar.zip.php/path/to/blah.jpg"
I really don't like this feature, but it may be handy for apps :) I
prefer what we do in zip, fopen("zip://...","..") or
imagecreatefromjpeg("zip://..");
Of course, identical syntax works with the phar:// stream wrapper if
ext/phar is present.
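
As a rough illustration of how the web path and the stream-wrapper path relate (this is a sketch of the idea, not phar's actual implementation), the helper below splits a request path into the archive name and the internal path and builds the equivalent phar:// URL. The function name pharUrlFromRequest and the docroot value are hypothetical, and it assumes the archive name ends in ".php" as in the example above.

```php
<?php
// Hypothetical helper: map "/myphar.phar.zip.php/path/to/blah.jpg" to a
// phar:// URL. It only mimics the idea behind Phar::webPhar().
function pharUrlFromRequest(string $docroot, string $requestPath): ?string
{
    // The archive is the shortest leading part ending in ".php" that is
    // followed by an internal path.
    if (!preg_match('#^(/[^?]*?\.php)(/.*)$#', $requestPath, $m)) {
        return null; // no internal path was appended
    }
    return 'phar://' . rtrim($docroot, '/') . $m[1] . $m[2];
}

echo pharUrlFromRequest('/var/www', '/myphar.phar.zip.php/path/to/blah.jpg'), "\n";
```

The resulting URL could then be passed to fopen() or any other stream-aware function when ext/phar is loaded.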
An interesting feature for both zip and phar would be the ability to
pipe the read stream to another stream (ie stdout) instead of writing
the read data into a file/variable.
stream_copy_to_stream() works great in this arena; I'm not sure any
additional features are needed.
<?php
$fp = fopen('phar://blah.phar/path/to/blah.jpg', 'rb');
stream_copy_to_stream($fp, STDOUT);
?>
Now that we have moved one step forward, I would like to make a
suggestion: we should stop the ext/A vs ext/B. It is pointless (we
agreed that goals are different) and counter productive. I still think
that in the end you should rely on ext/zip for the zip operations. The
"limitations" you are talking about could be fixed very quickly in
ext/zip without waiting for the custom stream support (for the file
handles, checksum, etc.). The write for the php stream support is also
possible without too much work, but it is an extra step I did not want
to take, I prefer to get the new IO in place before.
Here are the issues with ext/zip that are must-fix showstoppers for
pecl/phar to use ext/zip:
- zlib/bz2 must be an optional dependency, currently zlib is required.
The design of libzip is such that the zlib library is required in order
to load, not just to compile it.
- libzip compares local/central directory records and calculates crc on
load for every file. This must be done on access to a specific file in
order for it to be truly viable for phar (lazy validation).
- libzip has no support for extra fields for either file or archive.
We use them to store file permissions in pecl/phar.
Everything else phar's zip implementation provides is a
"would-be-nice-to-have" such as small performance enhancements on access
to a file by name (phar's implementation is O(1), while libzip's is O(n)
because phar uses a hash table and libzip uses a vector).
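
To make the "lazy validation" and hash-table points concrete, here is a minimal toy sketch in PHP. It is not phar's or libzip's code: the LazyArchive class and its in-memory entries are invented for illustration. Entries are looked up by name in a PHP array (a hash table), and the CRC32 of an entry is only checked the first time that entry is read, not when the archive is opened.

```php
<?php
// Toy in-memory "archive": entries keyed by name, each with a stored CRC32.
final class LazyArchive
{
    /** @var array<string, array{data: string, crc: int, verified: bool}> */
    private array $entries = [];

    public function add(string $name, string $data, ?int $crc = null): void
    {
        $this->entries[$name] = [
            'data' => $data,
            'crc' => $crc ?? crc32($data),
            'verified' => false,
        ];
    }

    public function read(string $name): string
    {
        // PHP arrays are hash tables, so lookup by name avoids a linear scan.
        if (!isset($this->entries[$name])) {
            throw new RuntimeException("no such entry: $name");
        }
        $entry = &$this->entries[$name];
        // Validate only on first access to this entry, not at open time.
        if (!$entry['verified']) {
            if (crc32($entry['data']) !== $entry['crc']) {
                throw new RuntimeException("crc mismatch: $name");
            }
            $entry['verified'] = true;
        }
        return $entry['data'];
    }
}

$archive = new LazyArchive();
$archive->add('good.txt', 'hello');
$archive->add('bad.txt', 'hello', 0); // deliberately wrong CRC
echo $archive->read('good.txt'), "\n"; // bad.txt is never checked here
```

With this design a corrupt entry only costs something when it is actually requested, which is the property asked for in the list above.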
However, I find your current positions confusing, and each mail brings
opposing arguments about the work shared between the other archive
extensions and phar. Can you clarify your view, please?
I'm not sure which opposing arguments you're referring to, but I will
try to clarify.
My view is simple and hasn't changed from the first message. pecl/phar
has specific needs in a zip implementation that are not currently
available from ext/zip. I am more than willing to use ext/zip should
these enhancements become available. It also might make sense to
consider having ext/zip use phar's zip implementation, and I am open to
that possibility.
pecl/phar devs hope to limit the number of required dependencies in
order to best benefit end users who often have little to no control over
their configuration. In addition, we'd like to see the best
implementation possible of these archive formats. I would have optional
dependencies on pecl/archive and pecl/zip, but pecl/archive does not
build on windows, and pecl/zip has the issues I listed above. Note that
removing phar's internal zip support and instead using pecl/zip would
make phar's zip support optional, another consideration in the mix, but
I am more than willing to do this.
I also would fix up pecl/zip/lib if I really thought it was possible,
but taking into account my limited abilities, the only way I see that it
could be done is with a redesign of libzip. I defer to you and the
libzip developers to prove me wrong if there is a way to satisfy
pecl/phar's needs in a zip implementation, as you are far more
accomplished and experienced in libzip internals.
Now, here's the actual state of things at cvs.php.net: pecl/phar has the
zip support it needs internally. libzip does not yet have the zip
support pecl/phar needs. Isn't all of this argument moot until
ext/zip/lib has the needed features?
I'd love it if we could steer away from talking about pecl/phar's zip
support, as this is really a side note to the main proposal. pecl/phar
also supports tar archives, based on an implementation Dmitry Stogov
wrote as a proof-of-concept. More importantly, I want to note that the
manual does not seem to have built recently, and you'll have to read the
documentation at http://docs.php.net/manual/en/book.phar.php to see the
most up-to-date stuff.
Thanks,
Greg
Hey Greg,
This looks very promising. Great to see that you took that feedback
and really attacked it, leading to a huge improvement in phar (should I
say night and day :) I think you've really accomplished a lot in these
few months.
Are there any docs which describe the transparent front controller
configuration? I'd love to take a look and give it a spin.
Thanks!
Andi
-----Original Message-----
From: Gregory Beaver [mailto:greg@chiaraquartet.net]
Sent: Monday, January 28, 2008 9:31 AM
To: internals Mailing List; Steph Fox; Marcus Boerger
Subject: [PHP-DEV] re-proposal of pecl/phar for inclusion in core
Andi Gutmans wrote:
Hey Greg,
This looks very promising. Great to see that you took that feedback
and really attacked it, leading to a huge improvement in phar (should I
say night and day :) I think you've really accomplished a lot in these
few months.
Are there any docs which describe the transparent front controller
configuration? I'd love to take a look and give it a spin.
Hi,
I replied to Andi earlier, forgot to hit Reply-All, so I'll send this
again, but with a bit more detail.
These two links describe the main things needed:
http://docs.php.net/manual/en/phar.webphar.php
http://docs.php.net/manual/en/phar.interceptfilefuncs.php
Most apps would need both Phar::interceptFileFuncs() and
Phar::webPhar(). For instance, to get phpMyAdmin working, I downloaded
the tar.gz, and ran this script (it assumes you've got an insecure mysql
with root having no password, so change that if you don't, and you
should change the absolute paths to the right location):
<?php
// Turn the stock phpMyAdmin tarball into a web-runnable phar.
// Write support must be enabled (phar.readonly=0) for this to work.
chdir('/home/cellog/testapache/htdocs');
@unlink('phpMyAdmin.phar.tar.php');
copy('phpMyAdmin-2.11.3-english.tar.gz', 'phpMyAdmin.phar.tar.php');
$a = new Phar('phpMyAdmin.phar.tar.php');
var_dump(count($a));
$a->startBuffering();
// Nowdoc syntax keeps the quotes and $variables of the config file literal.
$a["phpMyAdmin-2.11.3-english/config.inc.php"] = <<<'CONFIG'
<?php
/* Servers configuration */
$i = 0;
/* Server localhost (config:root) [1] */
$i++;
$cfg['Servers'][$i]['host'] = 'localhost';
$cfg['Servers'][$i]['extension'] = 'mysqli';
$cfg['Servers'][$i]['connect_type'] = 'tcp';
$cfg['Servers'][$i]['compress'] = false;
$cfg['Servers'][$i]['auth_type'] = 'config';
$cfg['Servers'][$i]['user'] = 'root';
$cfg['Servers'][$i]['password'] = '';
/* End of servers configuration */
if (strpos(PHP_OS, 'WIN') !== false) {
    $cfg['UploadDir'] = getcwd();
} else {
    $cfg['UploadDir'] = '/tmp/pharphpmyadmin';
    @mkdir('/tmp/pharphpmyadmin');
    @chmod('/tmp/pharphpmyadmin', 0777);
}
CONFIG;
// The stub runs when the archive itself is requested or executed.
$a->setStub('<?php
Phar::interceptFileFuncs();
Phar::webPhar("phpMyAdmin.phar.tar.php",
"phpMyAdmin-2.11.3-english/index.php");
echo "phpMyAdmin is intended to be executed from a web browser\n";
exit(1);
__HALT_COMPILER();
');
$a->stopBuffering();
?>
After doing this, you can copy phpMyAdmin.phar.tar.php to any location
in your document root and browse to it, and it should pop up the
familiar app, barring some bug we haven't encountered yet :).
Greg
+1 for core inclusion.
Hi Gregory,
Do you have any benchmarks that compare the speed between trying to
include/require files NOT in a phar archive, compared with calling
include/require for files inside a phar archive?
I have a large PHP application with about 5000 PHP files and we make use
of the __autoload() functionality and Smarty extensively, each page load
probably includes between 5-100 files itself, so the speed of this
operation is crucial.
It would be great if we could bundle our entire application as a single
phar archive, it would also make automatic in-place upgrades/roll-backs
that much easier, but if the day-to-day operation takes a significant
speed hit, it obviously won't be worth it.
Thanks.
--
Mike <ipso@snappymail.ca
Mike wrote:
Hi Gregory,
Do you have any benchmarks that compare the speed between trying to
include/require files NOT in a phar archive, compared with calling
include/require for files inside a phar archive?
I have a large PHP application with about 5000 PHP files and we make use
of the __autoload() functionality and Smarty extensively, each page load
probably includes between 5-100 files itself, so the speed of this
operation is crucial.
It would be great if we could bundle our entire application as a single
phar archive, it would also make automatic in-place upgrades/roll-backs
that much easier, but if the day-to-day operation takes a significant
speed hit, it obviously won't be worth it.
Hi Mike,
I don't have any - I have been focusing 100% on correctness thus far.
Anecdotally, I couldn't perceive any difference in performance between
gzipped pharred phpMyAdmin and off-the-disk phpMyAdmin, but that is not
so useful :). Based on the implementation, I suspect that an
uncompressed tar may be the fastest phar, with uncompressed phar without
a signature as a close second. zip is slower simply because of the file
format's design, but again I have not measured this to verify it.
Uncompressed tar files don't have any checksum beyond the 512 byte
header, which reduces processing of individual files. Both phar and zip
do a crc32 on file contents to catch corruption, and zip file format has
redundant local/central headers for each file to further catch
corruption and make repair possible (although I have to admit I've never
had any success getting pkzip to repair a broken zip before). phar has
whole-file signature support, which replaces crc32 verification if
present, and zip does not (yet) have signature support.
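
For readers unfamiliar with the tar header checksum mentioned above, here is a small sketch of how it can be verified in PHP. It relies on the standard tar layout (a 512-byte header whose 8-byte checksum field sits at offset 148 and counts as spaces while summing); the function names are made up for this example and are not part of phar.

```php
<?php
// Sum all 512 header bytes, counting the 8-byte checksum field
// (offset 148) as ASCII spaces, as the tar format specifies.
function tarHeaderChecksum(string $header): int
{
    if (strlen($header) !== 512) {
        throw new InvalidArgumentException('tar header must be 512 bytes');
    }
    $blanked = substr_replace($header, str_repeat(' ', 8), 148, 8);
    return array_sum(unpack('C*', $blanked));
}

function tarHeaderIsValid(string $header): bool
{
    // The stored value is an octal string padded with NULs and/or spaces.
    $stored = octdec(trim(substr($header, 148, 8), " \0"));
    return $stored === tarHeaderChecksum($header);
}

// Build a minimal header (only the name field filled in) and stamp it.
$header = str_pad('hello.txt', 512, "\0");
$header = substr_replace($header, sprintf("%06o\0 ", tarHeaderChecksum($header)), 148, 8);
var_dump(tarHeaderIsValid($header));
```

Only the 512 header bytes are summed, so the cost per file does not depend on the file size, unlike a crc32 over the contents.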
However, a phpMyAdmin I pharred up a year or two ago using a pure PHP
implementation of the phar wrapper (http://pear.php.net/PHP_Archive) was
noticeably slower, so I'm encouraged.
My game plan with pecl/phar at the moment is to finish fixing up the few
known remaining issues, finish the docs (the "creating phar" docs are
incomplete), and then profile. I have also not yet tried it with APC to
see how that would work, and this is a high priority. There are
definitely ways to cache the phar manifest, eliminating the need to
parse the file; I have not investigated this either, but it would in
fact be the largest speedup.
I already have a setup I've been using to debug the web front controller
which could be used to benchmark, but I wonder which application I
should use for a benchmark? Any suggestions? If anyone else has the
resources and time to do a benchmark comparison, this would be very
helpful to us poor time-strapped phar devs.
What needs comparison is the app on-disk, and the app in formats
uncompressed phar, uncompressed tar, zip, gzipped tar, bzipped tar,
gzipped phar, bzipped phar, zip compressed individually with deflate
(zlib), ...
It's quite a long list of possibilities. Perhaps the most important is
uncompressed tar/uncompressed phar/on-disk. This will showcase the
latency, and running these with APC is a priority, once we can verify
that APC actually caches as intended with phar.
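
For anyone who wants to try such a comparison, a very small timing harness might look like the sketch below. It is only a starting point under stated assumptions: timeIncludes() is a made-up helper, the demo files are throwaway on-disk files, and a real benchmark would repeat the runs and control for opcode caches.

```php
<?php
// Time how long it takes to include every file in $paths, in seconds.
function timeIncludes(array $paths): float
{
    $start = microtime(true);
    foreach ($paths as $path) {
        include $path;
    }
    return microtime(true) - $start;
}

// Demo with throwaway on-disk files. For the phar side of the comparison,
// pass the same files as "phar:///path/to/app.phar/..." URLs instead.
$dir = sys_get_temp_dir() . '/phar-bench-' . getmypid();
mkdir($dir);
$paths = [];
for ($i = 0; $i < 50; $i++) {
    $paths[] = "$dir/f$i.php";
    file_put_contents("$dir/f$i.php", "<?php \$x$i = $i;\n");
}
printf("on-disk: %.6f s for %d includes\n", timeIncludes($paths), count($paths));
array_map('unlink', $paths);
rmdir($dir);
```

Running the same list once against the on-disk tree and once against each archive format gives the latency comparison described above.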
Greg
Hello Mike,
not that I have any benchmarks, but there is one thing you might want to
know. You can extract a phar and map it to the extracted folder. That is, any
operation that would normally end up in the phar then ends up in direct file
access. Doing so would add only a tiny overhead for loading the file: the
phar extension has to figure out that it is responsible for a file
and then modify the filename. This should not be measurable at all.
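
The idea can be sketched in a few lines of PHP. This is not how the extension implements it; resolvePharPath() and the map of archive to extracted directory are hypothetical and only show why the extra cost is a prefix check plus a string rewrite.

```php
<?php
// Hypothetical map of archive path => directory it was extracted to.
function resolvePharPath(string $url, array $extracted): string
{
    foreach ($extracted as $archive => $dir) {
        $prefix = 'phar://' . $archive . '/';
        if (strncmp($url, $prefix, strlen($prefix)) === 0) {
            // The archive is mapped: rewrite to a plain filesystem path.
            return rtrim($dir, '/') . '/' . substr($url, strlen($prefix));
        }
    }
    return $url; // not mapped: leave it to the stream wrapper
}

$map = ['/srv/app.phar' => '/srv/app-extracted'];
echo resolvePharPath('phar:///srv/app.phar/lib/Foo.php', $map), "\n";
```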
marcus
Tuesday, January 29, 2008, 9:35:06 PM, you wrote:
Hi Gregory,
Do you have any benchmarks that compare the speed between trying to
include/require files NOT in a phar archive, compared with calling
include/require for files inside a phar archive?
I have a large PHP application with about 5000 PHP files and we make use
of the __autoload() functionality and Smarty extensively, each page load
probably includes between 5-100 files itself, so the speed of this
operation is crucial.
It would be great if we could bundle our entire application as a single
phar archive, it would also make automatic in-place upgrades/roll-backs
that much easier, but if the day-to-day operation takes a significant
speed hit, it obviously won't be worth it.
Thanks.
Best regards,
Marcus