Newsgroups: php.internals
Message-ID: <5B7B7BA5BF8C40598CABBB39CD24EE8E@pc1>
From: php_lists@realplain.com ("Matt Wilmas")
To: "PHP Internals List", "shire"
Cc: "Lukas Kahwe Smith"
References: <49B57F4F.9080901@tekrat.com> <033E05F2D7264057AEE4FCFFD7E827AE@pc1> <49B5AD5B.908@tekrat.com>
Date: Mon, 9 Mar 2009 19:35:51 -0500
Subject: Re: [PHP-DEV] 5.3 items

Hi Brian,

----- Original Message -----
From: "shire"
Sent: Monday, March 09, 2009

> Hey Matt,
>
> Matt Wilmas wrote:
>
>>> 9. tokenizer misses last single-line comment
>>> (http://bugs.php.net/bug.php?id=46817)
>>
>> I was going to take care of that one, as I mentioned in a previous
>> message, though it's been a while since I've been delayed much longer
>> with stuff here. :-(  (Nothing set up for building PHP on this system
>> yet; hope to in the next several hours finally, and do some things!)
>>
>
> Sorry I missed your earlier email.  I saw this sitting on the 5.3 todo
> list and it was breaking some of our parsing so I figured I'd take a stab
> at it.  Here is my current patch
> http://tekrat.com/downloads/bits/php53.scanner_eof.patch, please let me
> know if you have some suggestions/changes.  It sounds like you commented
> on this initially so please let me know what you/we should do, i.e. merging
> my patch/your work, committing this, or if you had a better fix in mind
> etc.  My biggest complaint is that my current patch requires adding \x00
> to any exclusion rules ("[^").
>
> These changes for handling EOF should probably be ported to the INI
> scanner as well for the above reason and to keep them similar.

I don't have much time right now, but took a quick look, and see that
you're actually trying to work around the re2c issues in general. :-)  I was
only thinking of putting a "band-aid" on the comment symptom(s), since those
are about the only ones that occur with valid code (is the tokenizer ext.
*supposed to* handle all tokens in code that wouldn't really compile?).

And yeah, about excluding \x00 from ANY_CHAR, it could change things, since
it's always been allowed, although it seems strange that code would have
literal NULLs in it (generated eval()'d code?).  That was part of the reason
I couldn't come up with a generic fix while keeping all behavior.  If re2c
would just remember the last matching state it was in at EOF like Flex!
Otherwise, I don't know what to do.
:-/

I'm going to do something else before trying to implement what I was going
to do, so there's no patch yet...

>> As far as I know there's still the other comment-related issue where no
>> Warning is given about "Unterminated comment ..." for unclosed /* ...
>> It's all of course related to the fundamental re2c issue, for now, where
>> when the scanned input ends while a variable length part of a rule is
>> being matched, it just aborts ("return 0;") in YYFILL().
>
> I don't seem to see this problem, perhaps I'm not reproducing it
> correctly?  As far as the Warning, with "

>> And that applies to the case Lukas gave in the bug report: WHITESPACE
>> pattern is variable length.
>
> Didn't see/find this, is there a bug # or link?

I meant the "could be related if not the same problem" comment added the
other day in Bug #46817.

>
> -shire

- Matt
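
P.S. To make the EOF behavior concrete, here is a rough sketch in plain C. It is hand-written for illustration and is not the re2c-generated PHP scanner or the patch discussed above; the names scan_comment, remember_accept, TOK_NONE and TOK_COMMENT are made up for this example, and the rule is simplified to "//" followed by anything up to a newline. With remember_accept set to 0 it gives up when the input ends in the middle of the variable-length part (roughly the "return 0;" case), and with it set to 1 it falls back to the last position where the rule had already matched, which is the Flex-like behavior wished for above.

```c
#include <stddef.h>

/* Hypothetical token ids for this sketch (not PHP's real T_* values). */
enum { TOK_NONE = 0, TOK_COMMENT = 1 };

/*
 * Scan a "//" comment starting at buf[0], where buf holds n bytes.
 *
 * remember_accept == 0: if the input ends while the variable-length part
 *                       is still being matched, abort and report no token.
 * remember_accept != 0: keep the last accepting position and return the
 *                       comment matched so far when the input ends.
 *
 * On success the matched length is stored in *len.
 */
static int scan_comment(const char *buf, size_t n, int remember_accept,
                        size_t *len)
{
    size_t pos;
    size_t accept;

    *len = 0;
    if (n < 2 || buf[0] != '/' || buf[1] != '/') {
        return TOK_NONE;            /* not a single-line comment */
    }

    pos = 2;
    accept = 2;                     /* "//" alone is already a full match */

    for (;;) {
        if (pos == n) {
            /* Input ran out while still inside the rule. */
            if (!remember_accept) {
                return TOK_NONE;    /* the comment token is lost */
            }
            *len = accept;
            return TOK_COMMENT;     /* fall back to the last match */
        }
        if (buf[pos] == '\n') {
            *len = pos + 1;         /* the newline ends the comment */
            return TOK_COMMENT;
        }
        pos++;
        accept = pos;               /* every extra char is still a match */
    }
}
```

For the input "// a" with no trailing newline, the first mode reports no token at all, while the second returns a 4-byte comment. That is the same symptom as in Bug #46817, only in miniature.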