Newsgroups: php.internals
Xref: news.php.net php.internals:47370
From: foolistbar@googlemail.com (Geoffrey Sneddon)
To: Philip Olson
Cc: Stanislav Malyshev, Hannes Magnusson, internals@lists.php.net
Date: Wed, 17 Mar 2010 00:23:33 +0100
Subject: Re: [PHP-DEV] PHP 6
Message-ID: <627E479A-B8DA-4156-A28E-5341E8258785@googlemail.com>
In-Reply-To: <7DB4E2BE-9035-4AA0-9A40-564217CB6BD8@roshambo.org>

On 12 Mar 2010, at 20:15, Philip Olson wrote:

>
> On Mar 12, 2010, at 10:46 AM, Stanislav Malyshev wrote:
>
>> Hi!
>>
>>> Yeah.
>>> We tried it, and it simply didn't pan out (performance, bc, lost
>>> interest, ..).
>>
>> I think it is a bit premature to declare the death of Unicode in
>> PHP. Yes, we know there are problems, and yes, it was harder that
>> initially thought, so we may want to take a step back and rethink
>> it. Also we may want to get Unicode out of the way of other PHP
>> development, since it's taking longer than planned. But that
>> doesn't mean we should bury it.
>
> How have other languages progressed down the unicode road? Is there
> anything we can learn from their progress over these past few years?
Of all the languages that I've had dealings with, only Python has attempted anything like the previous PHP 6 attempt. Ruby's move to a certain level of Unicode support in 1.9 is interesting, though I'm not sure it has been out for long enough to draw any real conclusions about its uptake.

I think the most important thing learnt from the Python case is that backwards compatibility is paramount. Even when code can be converted to the new language version programmatically, a backwards-incompatible release struggles to gain uptake, let alone what happened with the old PHP 6 branch, which would've broken a large number of applications with no way to programmatically convert code to it. Python 2, where Unicode strings need to be specifically marked (e.g., u"foo" as opposed to "foo"), had no problem getting uptake, yet Python 3 (which can mostly be programmatically converted from Python 2) has had comparatively little uptake due to its incompatibility.

So, let me start with what I want to be true of PHP 6: anything that runs under PHP 5.3 and does not throw any errors (with E_ALL | E_DEPRECATED) must behave identically under PHP 6.

That single statement has quite a lot of consequences, but, with regard to Unicode, one matters more than any other: Unicode strings cannot be the default. I have plenty of code that uses UTF-8 in some strings and arbitrary binary data in others. I want to be able to move to PHP 6 gradually: I shouldn't have to wait for every library I rely upon to be modified for PHP 6 compatibility. I should just be able to move to PHP 6, look over my own code, and change whichever strings I want into Unicode strings.

To point out what should be obvious to everyone here: one of the biggest strengths of PHP is the large number of libraries and applications already written for it.
Making a large, backwards-incompatible change such as making Unicode strings the default would not only limit adoption to those who have entirely new code, but also alienate most shared-hosting providers, who cannot afford to break their clients' code.

If there's one thing I've learnt from working on browsers for the past few years, it's that backwards compatibility is more valuable than something new and shiny. I have no doubt PHP needs Unicode support, but I don't think that breaking backwards compatibility for it is the right solution. The fact that PHP is deployed as it is, often in shared-hosting setups, should very much be a reason for concern about backwards compatibility. A browser would get almost no market share if it broke a large percentage of existing websites; I believe the same to be true of PHP with the websites it powers.

--
Geoffrey Sneddon