Newsgroups: php.internals
Date: Tue, 29 Jan 2008 18:32:30 -0600
To: Mike
CC: internals Mailing List
Subject: Re: [PHP-DEV] re-proposal of pecl/phar for inclusion in core
From: greg@chiaraquartet.net (Gregory Beaver)

Mike wrote:
> Hi Gregory,
>
> Do you have any benchmarks that compare the speed between trying to
> include/require files NOT in a phar archive, compared with calling
> include/require for files inside a phar archive?
>
> I have a large PHP application with about 5000 PHP files and we make use
> of the __autoload() functionality and Smarty extensively, each page load
> probably includes between 5-100 files itself, so the speed of this
> operation is crucial.
>
> It would be great if we could bundle our entire application as a single
> phar archive, it would also make automatic in-place upgrades/roll-backs
> that much easier, but if the day-to-day operation takes a significant
> speed hit, it obviously won't be worth it.

Hi Mike,

I don't have any - I have been focusing 100% on correctness thus far. Anecdotally, I couldn't perceive any difference in performance between gzipped pharred phpMyAdmin and off-the-disk phpMyAdmin, but that is not so useful :).

Based on the implementation, I suspect that an uncompressed tar may be the fastest phar, with an uncompressed phar without a signature as a close second. zip is slower simply because of the file format's design, but again I have not measured this to verify it.

Uncompressed tar files don't have any checksum beyond the one covering the 512-byte header, which reduces the processing needed for individual files. Both phar and zip do a crc32 on file contents to catch corruption, and the zip file format has redundant local/central headers for each file to further catch corruption and make repair possible (although I have to admit I've never had any success getting pkzip to repair a broken zip). phar has whole-file signature support, which replaces crc32 verification if present, and zip does not (yet) have signature support.
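To make the checksum difference concrete, here is a minimal sketch in Python (not PHP, and not the pecl/phar code) that uses only the standard tarfile, zipfile and zlib modules. The file name index.php and the payload are made up for illustration. It checks that a tar entry's checksum covers only its 512-byte header, so a flipped byte in the file contents goes unnoticed at the format level, while a zip entry stores a CRC-32 of the contents and a reader that verifies it rejects the same kind of damage.

```python
import io
import tarfile
import zipfile
import zlib

# Hypothetical file contents, used only for this illustration.
PAYLOAD = b"<?php echo 'hello';\n" * 1000


def tar_header_checksum(header: bytes) -> int:
    """Sum the 512 header bytes, counting the 8-byte checksum field as spaces."""
    return sum(header[:148]) + 8 * ord(" ") + sum(header[156:512])


def stored_tar_checksum(header: bytes) -> int:
    """Parse the octal checksum stored at offset 148 of the header."""
    return int(header[148:156].rstrip(b" \0").decode("ascii"), 8)


def build_tar(name: str, data: bytes) -> bytes:
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def build_zip(name: str, data: bytes) -> bytes:
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr(name, data)
    return buf.getvalue()


# tar: the only checksum is the one over the header block.
tar_bytes = build_tar("index.php", PAYLOAD)
header = tar_bytes[:512]
tar_ok = tar_header_checksum(header) == stored_tar_checksum(header)

# Flip a byte in the file contents (first byte after the header).
damaged_tar = bytearray(tar_bytes)
damaged_tar[512] ^= 0xFF
damaged_header = bytes(damaged_tar[:512])
tar_still_ok = tar_header_checksum(damaged_header) == stored_tar_checksum(damaged_header)

# zip: each entry records a CRC-32 of its contents.
zip_bytes = build_zip("index.php", PAYLOAD)
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
    zip_ok = zf.getinfo("index.php").CRC == (zlib.crc32(PAYLOAD) & 0xFFFFFFFF)

# Flip a byte in the stored contents; reading the entry must now fail.
damaged_zip = bytearray(zip_bytes)
damaged_zip[zip_bytes.find(PAYLOAD)] ^= 0xFF
zip_detected = False
with zipfile.ZipFile(io.BytesIO(bytes(damaged_zip))) as zf:
    try:
        zf.read("index.php")
    except zipfile.BadZipFile:
        zip_detected = True
```

The per-entry CRC-32 pass over the full contents is the extra work that a tar reader never does. Whether that cost is measurable next to PHP's compile step is exactly what a benchmark would have to show.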
However, a phpMyAdmin I pharred up a year or two ago using a pure PHP implementation of the phar wrapper (http://pear.php.net/PHP_Archive) was noticeably slower, so I'm encouraged.

My game plan with pecl/phar at the moment is to finish fixing up the few known remaining issues, finish the docs (the "creating phar" docs are incomplete), and then profile. I have also not yet tried it with APC to see how that would work, and this is a high priority. There are definitely ways to cache the phar manifest, eliminating the need to parse the file. I have not investigated this either, and this would in fact be the largest speedup.

I already have a setup I've been using to debug the web front controller which could be used to benchmark, but I wonder which application I should use for a benchmark? Any suggestions?

If anyone else has the resources and time to do a benchmark comparison, this would be very helpful to us poor time-strapped phar devs. What needs comparison is the app on disk against the app in each of these formats: uncompressed phar, uncompressed tar, zip, gzipped tar, bzipped tar, gzipped phar, bzipped phar, zip compressed individually with deflate (zlib), ... It's quite a long list of possibilities. Perhaps the most important comparison is uncompressed tar/uncompressed phar/on-disk. This will showcase the latency, and running these with APC is a priority, once we can verify that APC actually caches as intended with phar.

Greg