Newsgroups: php.internals
From: rquadling@googlemail.com (Richard Quadling)
Reply-To: RQuadling@googlemail.com
Date: Thu, 30 Apr 2009 08:48:01 +0100
Subject: Re: [PHP-DEV] Re: [PATCH] Scanner "diet" with fixes, etc.
To: Dmitry Stogov
Cc: Matt Wilmas, internals@lists.php.net, shire@php.net
Message-ID: <10845a340904300048w328591ccl62ebe069ba6f92b7@mail.gmail.com>
In-Reply-To: <49F94BC6.5060904@zend.com>
References: <6604D94D40FD465F992144110B075BB5@pc1> <49F94BC6.5060904@zend.com>

2009/4/30 Dmitry Stogov:
> Hi Matt,
>
> Does this patch fix EOF handling issues related to mmap()? (e.g. parsing of
> files with size 4096, 8192, ...). Now we have two dirty fixes to handle them
> correctly.
>
> The patch is quite big to understand it quickly. I'll probably take a look
> on weekend.
>
> -ANY_CHAR [^\x00]
> +ANY_CHAR [^]
>
> Is [^] a correct regular expression?
>
> Thanks. Dmitry.
>
> Matt Wilmas wrote:
>>
>> Hi Dmitry, Brian, all,
>>
>> Here's a scanner patch that I mentioned awhile ago, with a possible way to
>> work around the re2c EOF handling issues.
>>
>> The primary change is to do a "manual scan" like I talked about in areas
>> that match large amounts and can contain NULL bytes (strings/comments, which
>> are now scanned faster too), as is done for inline HTML. I called it a
>> "diet" :-) because it removes my complicated string regex patterns from a
>> couple years ago, which doesn't make the .l file much smaller after adding
>> the manual scan code (easier to understand...?), but it does result in a
>> ~34k reduction of 5.3's generated .c file...
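For readers following the thread, here is a rough illustration of what a "manual scan" of a block comment could look like. It is only a minimal sketch and is not taken from Matt's patch: the function name scan_comment_end and its cursor/limit parameters are hypothetical. The assumption is that the scanner knows where the buffer ends, so it can stop at an explicit limit pointer and does not have to treat a NUL byte as end of input.

```c
#include <stddef.h>

// Hypothetical helper (not from the patch): cursor points just past the
// opening slash-star of a block comment, limit points one past the last
// valid byte of the buffer.
// Returns a pointer just past the closing star-slash, or NULL if the
// comment is not terminated before limit.
static const char *scan_comment_end(const char *cursor, const char *limit)
{
    // Bound the loop by the explicit limit, not by a NUL sentinel, so
    // embedded NUL bytes are treated as ordinary comment content.
    while (cursor + 1 < limit) {
        if (cursor[0] == '*' && cursor[1] == '/') {
            return cursor + 2;
        }
        cursor++;
    }
    return NULL;
}
```

Because the loop never reads at or beyond limit, a scan of this shape should not depend on what follows the buffer, which seems to be the concern with mmap()ed files whose size is an exact multiple of the page size.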
>>
>> This fixes Bug #46817, as well as a better, more proper fix for the older
>> Bug #42767, both related to ending comments.
>>
>> Now inline HTML chunks aren't broken up when a tag starting with "s" is
>> encountered (