Newsgroups: php.internals
Message-ID: <4F9E2173.50005@garfieldtech.com>
Date: Mon, 30 Apr 2012 00:21:55 -0500
From: larry@garfieldtech.com (Larry Garfield)
To: internals@lists.php.net
Subject: readfile() memory usage

So, I've been reading articles for a decade now that say that readfile() is great and wonderful except for memory usage. Specifically, that it reads a file into memory entirely, and then prints it to stdout from there. So if you're outputting a big file you will hit your memory limit and kill the server. Thus, one should always loop over fread() instead. The most recent article I found saying that was from 2007, with a StackExchange thread saying the same from 2011. I've even found mention of it in old PHP Bugs.

However, I cannot replicate that in my own testing. Earlier today I was running some benchmarks of different file streaming techniques in PHP (5.3.6 specifically) and found that fread() looping, fpassthru(), readfile(), and stream_copy_to_stream() perform almost identically on memory, and all are identical on CPU except for fread(), which is slower. That makes sense, since you're looping in PHP space.

What's more, I cranked my memory limit down to 10 MB and then tried streaming a 20 MB file. No change. The PHP peak memory never left around a half-meg or so, most of which I presume is just the Apache/PHP overhead. But if readfile() were buffering the whole file into memory before printing, it could not survive a file bigger than the memory limit. I verified that the data I'm getting downloaded from the script is correct, and exactly matches the file that it should be streaming.
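For anyone who wants to repeat the comparison, here is a rough sketch of the kind of test described above. It is not the original benchmark script: the helper names, the 8 KB chunk size, and the way the test file is generated are all illustrative assumptions. Only the four streaming calls and the 10 MB / 20 MB figures come from the post.

```php
<?php
// Sketch only: helper names, chunk size, and file generation are assumptions.

// Loop over fread() in PHP space, echoing one chunk at a time.
function send_with_fread($path, $chunkSize = 8192)
{
    $in = fopen($path, 'rb');
    $sent = 0;
    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false) {
            break;
        }
        echo $chunk;
        $sent += strlen($chunk);
    }
    fclose($in);
    return $sent;
}

// Hand an open handle to fpassthru(), which returns the byte count.
function send_with_fpassthru($path)
{
    $in = fopen($path, 'rb');
    $sent = fpassthru($in);
    fclose($in);
    return $sent;
}

// readfile() opens, outputs, and closes the file in one call.
function send_with_readfile($path)
{
    return readfile($path);
}

// Copy from the file stream to PHP's output stream.
function send_with_stream_copy($path)
{
    $in = fopen($path, 'rb');
    $out = fopen('php://output', 'wb');
    $sent = stream_copy_to_stream($in, $out);
    fclose($in);
    fclose($out);
    return $sent;
}

// Write a throwaway file of $bytes bytes, 64 KB at a time.
function make_test_file($bytes)
{
    $path = tempnam(sys_get_temp_dir(), 'rf');
    $out = fopen($path, 'wb');
    $block = str_repeat('x', 65536);
    for ($left = $bytes; $left > 0; $left -= strlen($block)) {
        fwrite($out, substr($block, 0, min($left, strlen($block))));
    }
    fclose($out);
    return $path;
}

// Run one method; report bytes sent and the process-wide peak memory.
function measure($method, $path)
{
    $sent = call_user_func($method, $path);
    return array('sent' => $sent, 'peak' => memory_get_peak_usage());
}

// Example run, using the sizes from the post:
// ini_set('memory_limit', '10M');
// $path = make_test_file(20 * 1024 * 1024);
// $result = measure('send_with_readfile', $path);
// error_log(print_r($result, true));
// unlink($path);
```

Since memory_get_peak_usage() reports the peak for the whole process, each method would need its own request (or its own CLI run) for the numbers to be comparable. The sketch also assumes output buffering is off. An active output buffer would hold the streamed data in PHP memory and skew the result.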
My first thought was that this is yet another case of PHP improving and fixing a long-standing bug, but somehow the rest of the world not knowing about it, so "conventional wisdom" persists long after it has stopped being wise. However, I found no mention of readfile() in the PHP 5 change log[1] at all, aside from one note from back in 5.0.0 Beta 1 about improving performance under Windows. (I'm on Linux.)

So, what's going on here? Has readfile() been memory-safe for that long without anyone noticing? Is my test completely flawed (although I don't see how, since I can verify that the code works as expected)? Something else? Please un-confuse me!

(Note: Sending this to internals since this is an engine question, and I am more likely to reach whoever it was that un-sucked readfile() sometime in the silent past that way.)

--Larry Garfield