Newsgroups: php.internals
Message-ID: <47CB8107.1090802@zend.com>
Date: Sun, 02 Mar 2008 20:39:35 -0800
From: stas@zend.com (Stanislav Malyshev)
Organization: Zend Technologies
To: Marcus Boerger
CC: internals@lists.php.net
In-Reply-To: <1642796941.20080303002651@marcus-boerger.de>
Subject: Re: [PHP-DEV] [RFC] Replace the flex-based scanner with an re2c [1] based lexer

Hi!

>> Were the stream support issues solved?
>
> We completely dropped multibyte support. The reason is that the way we were

I wasn't asking about multibyte (which we discuss below), but about other streams - I think I mentioned it on IRC the last time the re2c parser was discussed. I remember re2c used mmap, and not all files PHP can run can be mmap'ed. Was it fixed?

> Once we have finished the move to re2c, we can support all of those
> correctly. The multibyte support also duplicated the encoding tables
> otherwise available in ext/mbstring or ext/iconv or pecl/intl.

pecl/intl per se doesn't have any encoding tables. ICU does, but that would mean you have to have ICU to run PHP. That might not be a big problem, since ICU is supported by IBM (read: good chance more "exotic" systems would have support), but it is still a dependency on a non-bundled 3rd-party library in PHP 5 core. Of course, PHP 6 has this dependency, but we might want to avoid such things in 5.x so that you won't have to change your system too much while staying on 5.x.

> Rely on a not supported undocumented feature? I am rather able to build php
> and rewrite that support.

Being undocumented is nothing to be proud of; however, as poorly documented as it is, it is used. I'm all for implementing it in a better way - and having a new parser is a good time to do it. That's exactly the reason we shouldn't rush it but do it right this time. There's no burning need to have a new parser right now, so we can take a moment to think - ok, how do we want multibyte support there to work? And if we need some modifications, we'd have the time and flexibility to make them, instead of having the code in 5.3, which was supposed to go to RC in Q1 (ending 1 month from now).

> You are free to contribute and make MB support working upfront.

I know I'm free :) However, as much as I understand the eagerness to have it in the source tree, I repeat that I do not think dropping multibyte support in 5.3 is acceptable.
Thus, if it is committed right now, 5.3 would have to be deferred until this is resolved. If this is resolved in time for 5.3 - great. If not, we'd better get it right in 5.4 than wrong in 5.3.

-- 
Stanislav Malyshev, Zend Software Architect
stas@zend.com   http://www.zend.com/
(408)253-8829   MSN: stas@zend.com