Newsgroups: php.internals
Message-ID: <463F75C2.2050008@php.net>
In-Reply-To: <463F7301.9040104@chiaraquartet.net>
Date: Mon, 07 May 2007 14:53:54 -0400
From: davey@php.net (Davey Shafik)
To: internals@lists.php.net, Gregory Beaver
Subject: Re: how does Phar actually work?

The stub could also easily include code to support an extraction flag,
so you could run

    php my.phar --extract

and have the code dumped to the FS as it originally was.

Adding these things (the stub and the extract flag) is just that, a
choice, the same as choosing short tags or relying on magic_quotes* etc.

Of course, sane defaults when creating phars are something we can decide
on as things move on. I would vote for having both PHP_Archive and the
--extract flag code inserted by default; this would solve the issues IMO.

- Davey

Gregory Beaver wrote:
> Hi,
>
> There has been a bit of inevitable FUD with phar.
> Although the manual (http://php.net/phar) describes a fair amount of
> how phar works, the design decisions are not documented.
>
> Originally, the phar stream wrapper was a userspace thing. Davey Shafik
> designed it to take advantage of a neat loophole in the design of the
> tar file format so that a valid tar could be run by PHP without needing
> to have the phar stream wrapper loaded. This was great until I started
> using it to run the PEAR Installer. The performance hit was tremendous,
> as every newly included file required scanning the entire file, header
> by header, until we found the needed file. Worst case, it meant loading
> megabytes of information just to locate a file. The zip file format has
> the same limitation - the entire archive needs to be scanned.
>
> Neither of these formats was designed for random access in the way a
> traditional filesystem is. In fact, I could not find an example of an
> archive format that is designed for this.
>
> As such, borrowing from the design of disk filesystems, I created a new
> format that is very small and processes very quickly. It is so much
> faster that I can't detect a difference in performance between running
> the PEAR installer off of the disk and running it out of a phar. I am
> sure there is a difference that apache benchmark would detect because
> of the extra load-in time of the file manifest.
>
> The way phar works now is that a file manifest sits at the start of the
> phar archive (similar to a directory file in traditional filesystems).
> Each file has a manifest entry containing the file name, size of the
> file, and offset into the archive, plus some flags and optional
> meta-data. The manifest is currently limited in size to 1 MB, so some
> applications probably would not be possible to phar under the current
> design.
>
> Each phar has a loader stub, which can be any PHP code, but must contain
> the __HALT_COMPILER(); token.
> This allows phars that also contain PHP_Archive to work under
> conditions where the phar extension is disabled. It is the loader stub
> that makes it possible to run a phar with plain vanilla PHP.
>
> I see two possible solutions to the concerns raised by others.
>
> 1) don't worry, be happy
> 2) re-design the phar file format such that it is a tar again, and put
>    the manifest for quick loading in one of the first files of the tar
>    archive.
>
> If I had thought #2 was a good idea, I would have already done it, so
> there is my opinion.
>
> One basic assumption I would like to raise here is that nobody is going
> to download a .phar archive who does not already have PHP. Does this
> assumption sound sane?
>
> If so, I would like to provide some simple scripts for unpharring and
> repharring a .phar archive. This is not hard to do with a 5-line PHP
> script.
>
> One of the big questions I would have, though, is for xdebug (hi
> Derick) and designers of IDEs, as it would be good to ensure that it is
> possible to step through a phar, or even to dump the source line with an
> error message. These, to me, seem to be the most pressing disadvantages
> of phar currently - it becomes much harder to debug a problem in a PHP
> script when it is stuffed into a phar.
>
> Thanks,
> Greg
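
For readers who want to see why a manifest at the start of the archive avoids the header-by-header scan described in the quoted message, here is a minimal sketch in Python. It is a toy format invented for illustration, not the actual phar layout: the field widths, the header, and the helper names (build_archive, read_manifest, read_file) are all assumptions. The idea is only that the manifest is read once, after which any file is one seek and one read away.

```python
import io
import struct


def build_archive(files):
    """Pack {name: bytes} into a toy archive: header, manifest, then file data."""
    entries = []
    offset = 0
    for name, data in files.items():
        encoded = name.encode("utf-8")
        # Manifest entry: name length, name, file size, offset into the data area.
        entries.append(
            struct.pack("<H", len(encoded))
            + encoded
            + struct.pack("<II", len(data), offset)
        )
        offset += len(data)
    manifest = b"".join(entries)
    # Header: manifest length and entry count, so a reader knows where data starts.
    header = struct.pack("<II", len(manifest), len(files))
    return header + manifest + b"".join(files.values())


def read_manifest(fh):
    """Read the manifest once; return ({name: (size, offset)}, data_start)."""
    manifest_len, count = struct.unpack("<II", fh.read(8))
    raw = fh.read(manifest_len)
    index = {}
    pos = 0
    for _ in range(count):
        (name_len,) = struct.unpack_from("<H", raw, pos)
        pos += 2
        name = raw[pos:pos + name_len].decode("utf-8")
        pos += name_len
        size, offset = struct.unpack_from("<II", raw, pos)
        pos += 8
        index[name] = (size, offset)
    return index, 8 + manifest_len


def read_file(fh, index, data_start, name):
    """Fetch one file with a single seek, without touching any other entry."""
    size, offset = index[name]
    fh.seek(data_start + offset)
    return fh.read(size)


# Hypothetical archive contents, for demonstration only.
files = {"a.php": b"<?php echo 'a';", "lib/b.php": b"<?php echo 'b';"}
archive = io.BytesIO(build_archive(files))
index, data_start = read_manifest(archive)
print(read_file(archive, index, data_start, "lib/b.php"))
```

The cost of this design is the up-front manifest load, which is the "extra load-in time" mentioned in the quoted message; in exchange, every later lookup is a dictionary hit plus one seek, instead of a walk over every header that precedes the wanted file.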