Newsgroups: php.internals
To: "Matt Wilmas"
Cc: "Dmitry Stogov"
References: <6604D94D40FD465F992144110B075BB5@pc1>
Date: Thu, 30 Apr 2009 18:18:50 +0100
Subject: Re: [PHP-DEV] [PATCH] Scanner "diet" with fixes, etc.
From: nlopess@php.net ("Nuno Lopes")

The patch looks generally OK. However, I'll need a few more days to review it
carefully and thoroughly (you can merge it in the meantime if you want).
I'm just slightly concerned about the amount of parsing we are now doing by
hand, and about the possible (local) security bugs we might be introducing.

Nuno

----- Original Message -----
> Hi Dmitry, Brian, all,
>
> Here's a scanner patch that I mentioned a while ago, with a possible way to
> work around the re2c EOF handling issues.
>
> The primary change is to do a "manual scan" like I talked about in areas
> that match large amounts and can contain NULL bytes (strings/comments,
> which are now scanned faster too), as is done for inline HTML. I called
> it a "diet" :-) because it removes my complicated string regex patterns
> from a couple years ago, which doesn't make the .l file much smaller after
> adding the manual scan code (easier to understand...?), but it does result
> in a ~34k reduction of 5.3's generated .c file...
>
> This fixes Bug #46817, and provides a better, more proper fix for the older
> Bug #42767, both related to ending comments.
>
> Now inline HTML chunks aren't broken up when a tag starting with "s" is
> encountered (