Newsgroups: php.internals
Date: Sun, 01 Aug 2004 16:28:18 -0400
To: internals@lists.php.net
Subject: Lamenting PHP's streaming support...
From: jtl_phpdotnet@spamex.com

Hi everyone,

I'm trying to write some serious parsing applications in PHP. I find myself frequently lamenting the 4GL-like support for buffered streams. I'd rather have a full-fledged streaming API with stream handles (or objects), like you get in mature 3GL languages such as C and Java.

I'm making do with the single character-stream buffer available to me in the "output buffer." I wrap this stream in classes that emulate distinct character streams by saving the current output buffer, clearing the output buffer for the new virtual stream, and then restoring the original output buffer when the virtual stream is closed. This works, but it adds overhead: I have to repeatedly create string objects to store the old buffers and then write those objects back to the output buffer. This is less than ideal in terms of both performance and complexity (and it increases the potential for weird errors).
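To make the save/clear/restore pattern concrete, here is a minimal sketch of one such wrapper. The class name `VirtualStream` and its `close()` method are hypothetical, chosen only for illustration, and the code assumes PHP 7.4 or later for the typed property. It uses only the standard output-control functions `ob_start()`, `ob_get_contents()`, `ob_clean()`, and `ob_get_clean()`.

```php
<?php
// Hypothetical helper: emulates a separate character stream on top of
// the single active output buffer by saving and restoring its contents.
final class VirtualStream
{
    /** Contents of the output buffer at the time the stream was opened. */
    private string $saved;

    public function __construct()
    {
        // Save whatever is currently buffered, then clear the buffer
        // so the virtual stream starts empty.
        $this->saved = (string) ob_get_contents();
        ob_clean();
    }

    public function close(): string
    {
        // Collect what was written to the virtual stream.
        $captured = (string) ob_get_contents();
        ob_clean();
        // Rewrite the saved contents back into the output buffer.
        echo $this->saved;
        return $captured;
    }
}

ob_start();                     // an output buffer must be active
echo "outer ";
$stream = new VirtualStream();  // "outer " is copied aside, buffer cleared
echo "inner";
$inner = $stream->close();      // buffer holds "outer " again
echo "done";
$outer = ob_get_clean();
```

Each open/close pair copies the old buffer contents twice (once out, once back in), which is the overhead described above.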
I'm not too concerned about the performance cost of these virtual buffers, because I can architect the application to minimize the switches. However, I have so far been unable to architect around another serious performance issue: I have to create a new string for each character sequence that I write to the output buffer. I'd rather copy the substring of the document being parsed directly to the output buffer. Object creation is expensive when thousands of objects need to be created for a single page hit.

All I need to deal with this problem is a new PHP function:

    ob_write($string, $start, $length)

This would write the characters in substr($string, $start, $length) to the output buffer without creating an intermediate string object.

Is there anything on the horizon that would give me the kind of streaming support I'm looking for?

Thanks for your help!

~joe
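
For illustration only, the intended behavior of the proposed function can be approximated in userland. The name `ob_write_sketch` is hypothetical; `ob_write()` itself is a proposal and not an existing PHP function. This stand-in still goes through `substr()`, so it allocates exactly the temporary string that a native implementation would be meant to avoid; it shows the semantics, not the performance benefit.

```php
<?php
// Hypothetical userland stand-in for the proposed ob_write().
// It is NOT a PHP built-in and does not avoid the intermediate string.
function ob_write_sketch(string $string, int $start, int $length): void
{
    // substr() creates a temporary string here; a native ob_write()
    // would copy the bytes straight into the output buffer instead.
    echo substr($string, $start, $length);
}

ob_start();
$document = "<b>hello</b> world";
ob_write_sketch($document, 3, 5);   // writes the five characters after "<b>"
$written = ob_get_clean();
```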