Newsgroups: php.internals
Xref: news.php.net php.internals:55582
Date: Thu, 22 Sep 2011 14:56:14 +0200
From: nikita.ppv@googlemail.com (Nikita Popov)
To: Ferenc Kovacs
Cc: Hannes Magnusson, Christopher Jones, internals@lists.php.net
Subject: Re: [PHP-DEV] Revert Tokenizer behavior for 5.4
References: <4E6FB55E.4060906@oracle.com>

So, is there any consensus about this?

Nikita

On Fri, Sep 16, 2011 at 10:07 AM, Ferenc Kovacs wrote:
>>
>> Wait wait wait. What's the point here?
>> __COMPILER_HALT_OFFSET__ already tells you where the data starts.
>>
>> -Hannes
>>
>
> I didn't send this message first, but after reading the mail from
> Chris, I think maybe it would clear up the confusion:
>
> It is about tokenizing a file which has __halt_compiler(); in it.
> Before the fix for the original bug report, one could get the warning
> "Unexpected character in input" when calling token_get_all() on a
> script which had binary data after the __halt_compiler();
> iliaa's fix was trivial: break from the tokenizer if the __halt_compiler
> token is found.
> But it isn't good enough, because as the original bug reporter pointed out:
> 1. Now token_get_all() won't return the (); after __halt_compiler,
> which means that if you rebuild the code from the tokens, you will
> have invalid PHP code.
> 2. You have no way to get the binary data after the __halt_compiler
> via the tokenizer, so you can't rebuild the original file using only
> the tokenizer. (For example, one could use the tokenizer to strip the
> whitespace and comments from a given file in-place.)
>
> Both problems could be hacked around from userland, but imo it is still
> worth fixing those.
>
> --
> Ferenc Kovács
> @Tyr43l - http://tyrael.hu
>