Newsgroups: php.internals
Xref: news.php.net php.internals:15385
Message-ID: <4232163C.1000906@prohost.org>
Date: Fri, 11 Mar 2005 17:05:48 -0500
From: ilia@prohost.org (Ilia Alshanetsky)
To: Wez Furlong
CC: internals@lists.php.net
Subject: Re: [PHP-DEV] HALT Patch
References: <4231F330.6000705@prohost.org> <4e89b426050311134633b22b@mail.gmail.com>
In-Reply-To: <4e89b426050311134633b22b@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------040606000707040501040507"

--------------040606000707040501040507
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Wez Furlong wrote:
> If you name the token __HALT_PHP_PARSER__ instead (or something
> equally unlikely to be used by accident and unambiguous in meaning),
> you'll get a +1 from me :-)

I aim to please ;-)

Here is the revised patch that makes the token a bit
clearer and also increases the strictness of the parser.

For those of you wondering how to quickly locate the end of the script
and the start of the data, here are some solutions:

1) while (fgets($fp) != '');

2) Assuming the author knows the maximum length of the actual code, they
could read X bytes, use strpos to locate the position of the HALT token,
and start reading the data dump from there.

$fp = fopen(__FILE__, "r");
$halt_token = "";
$pos = strpos(fread($fp, 10000), $halt_token);
fseek($fp, $pos + strlen($halt_token));

Ilia

--------------040606000707040501040507
Content-Type: text/plain; name="halt.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="halt.txt"

Index: Zend/zend_language_scanner.l
===================================================================
RCS file: /repository/ZendEngine2/zend_language_scanner.l,v
retrieving revision 1.124
diff -u -a -p -r1.124 zend_language_scanner.l
--- Zend/zend_language_scanner.l	7 Mar 2005 16:48:49 -0000	1.124
+++ Zend/zend_language_scanner.l	11 Mar 2005 21:53:04 -0000
@@ -1342,6 +1342,10 @@ NEWLINE ("\r"|"\n"|"\r\n")
 	return T_INLINE_HTML;
 }
 
+"" {
+	yyterminate();
+}
+
 "" {
 	HANDLE_NEWLINES(yytext, yyleng);
 	if (CG(short_tags) || yyleng>2) { /* yyleng>2 means it's not */

--------------040606000707040501040507--
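
Solution 2 in the message (read a bounded chunk, find the token with strpos, seek past it) could be sketched with error handling roughly as follows. This is only an illustration under stated assumptions: the function name find_data_offset() and its parameters are hypothetical, and the token string is passed in by the caller because the exact token text used by the patch is not visible in this copy of the message.

```php
<?php
// Hypothetical helper (not part of the patch): returns the byte offset of
// the first byte after the halt token, or null if the token is not found
// within the first $maxCodeBytes bytes of the file.
function find_data_offset(string $path, string $haltToken, int $maxCodeBytes = 10000): ?int
{
    if ($haltToken === '' || $maxCodeBytes < 1) {
        return null; // nothing sensible to search for
    }

    $fp = fopen($path, 'rb');
    if ($fp === false) {
        return null;
    }
    $head = fread($fp, $maxCodeBytes);
    fclose($fp);
    if ($head === false) {
        return null;
    }

    // strpos() returns false on no match and 0 for a match at the very
    // start, so the result is compared with === instead of truthiness.
    $pos = strpos($head, $haltToken);
    if ($pos === false) {
        return null;
    }

    return $pos + strlen($haltToken);
}
```

A caller would fseek() to the returned offset and read the data from there. One thing to watch when a script searches its own source: the token literal passed to the search also appears in the code, so strpos() will match that occurrence first unless the needle is assembled at runtime (for example by concatenating two halves).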