Newsgroups: php.internals
From: andrei@gravitonic.com (Andrei Zmievski)
To: Tim Starling
Cc: php-dev
Date: Mon, 21 Jan 2008 20:25:06 -0800
Subject: Re: [PHP-DEV] why we must get rid of unicode.semantics switch ASAP
Message-ID: <0DC2776D-F5A1-4C80-9408-8E5A29D833CD@gravitonic.com>
In-Reply-To: <47956F60.3050902@wikimedia.org>
References: <4794AE48.20005@daylessday.org> <47956F60.3050902@wikimedia.org>

> As for PHP 6 generally: there needs to be a solid migration path,
> such as forwards-compatible syntax introduced to PHP 5. MediaWiki
> has extensive support for unicode in PHP 5, including a pure PHP
> implementation of NFC, cross-script and confusable character
> checks, extensive parsing of UTF-8 text using regexes both with and
> without /u, and megabytes of localisations in the form of PHP
> source files with UTF-8 string literals.
>
> Porting all this to a UTF-16-based environment would be a hassle,
> and we don't gain anything from it in terms of features for our
> users. I'd hate to end up in an adversarial situation, where
> developers working in PHP are forced to boycott or fork PHP 6.
> That's why a simple migration path is important.
>

I don't think that "porting to a UTF-16 environment" is that hard in
your case. UTF-8 source files will work transparently with the proper
script encoding setting, PCRE regexes work the same way, and you could
replace your own implementations of NFC, etc. with the PHP-provided
ones.

-Andrei