Newsgroups: php.internals
Date: Mon, 3 Mar 2008 22:51:07 -0800
Message-ID: <698DE66518E7CA45812BD18E807866CE01506D08@us-ex1.zend.net>
References: <1706278209.20080302232134@marcus-boerger.de>
To: "Marcus Boerger" ,
Subject: RE: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer
From: andi@zend.com ("Andi Gutmans")

Hi Marcus, Johannes, and all,

First of all, let me say that I have no conceptual problem with replacing the scanner with re2c. If it's cleaner, performs better, and is a better-maintained piece of software (let's hope Marcus doesn't get run over), then we can move to re2c.

There are a few important things to consider, though:

- There is a huge PHP/MySQL community in the Far East, especially in Japan. You may not hear as much from them because they mostly don't post on our public lists, but the community is large. They very much depend on multibyte support, and it works well for them (I have talked to several people in those communities). Shift-JIS is a fact of life for those communities. We can't just dump them in PHP 5.3.
- We need to make sure that we have a streams story that works and that existing functionality is supported by it (it sounds like this is almost complete, so probably not high risk).
- We should make sure we can achieve compatibility, including support for functionality like declare(...), which is used by some, including the multibyte guys. I haven't heard of a reason why this couldn't be possible with re2c.

I think all of the above is doable, but we shouldn't ship without accomplishing that 100% compatibility, and we especially shouldn't be telling the non-Latin world that we will stop supporting them.

So at the end of the day it all boils down to timing. I have been expecting Johannes to cut a beta any day now (I realize the Sun acquisition somewhat postponed his schedule). PHP 5.3 is on a pretty good track to a good & stable release cycle.
I think re-engineering a core piece of the engine at this point adds considerable risk and would definitely prolong the release cycle. So while I'm supportive of embracing re2c if we get a commitment to reach that 100% compatibility, including multibyte support, I don't quite understand the sense of urgency or why we'd want to introduce this risk so late in the development of PHP 5.3. This is a risk the release manager shouldn't really be willing to take.

Rewriting the multibyte support will require time and interaction with the communities that currently use it, to make sure that it meets their needs. It will not be a trivial project.

We can definitely work towards re2c in parallel and, as Stas said, the engine hasn't really been changing much recently, so that shouldn't be hard (we finished our todos for 5.3). We could even branch off PHP 5.4 right after RC1 for PHP 5.3 and thereby reduce the time during which this patch would need to be maintained separately (although I think it can already be maintained in a branch).

Let's consider all the angles in addition to wanting to get the code into the tree ASAP.

Andi