Newsgroups: php.internals
Message-ID: <49F99A58.9020506@daylessday.org>
Date: Thu, 30 Apr 2009 16:32:24 +0400
To: RQuadling@GoogleMail.com
CC: php-dev
References: <6604D94D40FD465F992144110B075BB5@pc1> <49F94BC6.5060904@zend.com> <49F993FA.2090301@php.net> <10845a340904300515k62fe7dbes4e22b318c61be140@mail.gmail.com>
In-Reply-To: <10845a340904300515k62fe7dbes4e22b318c61be140@mail.gmail.com>
Subject: Re: [PHP-DEV] Re: [PATCH] Scanner "diet" with fixes, etc.
From: tony@daylessday.org (Antony Dovgal)

That's what I call 'overquoting'.

On 30.04.2009 16:15, Richard Quadling wrote:
> 2009/4/30 Scott MacVicar :
>> [^] is a special case to write a portable match any character in re2c.
>>
>> Scott
>>
>> Dmitry Stogov wrote:
>>> Hi Matt,
>>>
>>> Does this patch fix EOF handling issues related to mmap()? (e.g. parsing
>>> of files with size 4096, 8192, ...). Now we have two dirty fixes to
>>> handle them correctly.
>>>
>>> The patch is quite big to understand it quickly. I'll probably take a
>>> look on weekend.
>>>
>>> -ANY_CHAR [^\x00]
>>> +ANY_CHAR [^]
>>>
>>> Is [^] a correct regular expression?
>>>
>>> Thanks. Dmitry.
>>>
>>> Matt Wilmas wrote:
>>>> Hi Dmitry, Brian, all,
>>>>
>>>> Here's a scanner patch that I mentioned awhile ago, with a possible
>>>> way to work around the re2c EOF handling issues.
>>>>
>>>> The primary change is to do a "manual scan" like I talked about in
>>>> areas that match large amounts and can contain NULL bytes
>>>> (strings/comments, which are now scanned faster too), as is done for
>>>> inline HTML. I called it a "diet" :-) because it removes my
>>>> complicated string regex patterns from a couple years ago, which
>>>> doesn't make the .l file much smaller after adding the manual scan
>>>> code (easier to understand...?), but it does result in a ~34k
>>>> reduction of 5.3's generated .c file...
>>>>
>>>> This fixes Bug #46817, as well as a better, more proper fix for the
>>>> older Bug #42767, both related to ending comments.
>>>>
>>>> Now inline HTML chunks aren't broken up when a tag starting with "s"
>>>> is encountered (