Newsgroups: php.internals
From: stas@zend.com (Stanislav Malyshev)
To: Antony Dovgal
CC: Andrei Zmievski, internals@lists.php.net
Date: Mon, 09 Jul 2007 15:33:56 -0700
Subject: Re: [PHP-DEV] What is the use of "unicode.semantics" in PHP 6?
Message-ID: <4692B7D4.6040001@zend.com>
In-Reply-To: <4692B1A3.1000808@zend.com>

> Do _I_ like that horrible IS_STRING/IS_UNICODE mess we have atm? No.

I don't think there's any way of representing both unstructured character data and Unicode text without having two distinct types. Either that, or you'd have to tell at each step which one it is, and that would suck much more.

> I would love to have clean and easy PHP6 without all the
> "compatibility", which creates gazillion problems to both users and
> developers.

Fixing unicode=on does not remove the IS_STRING/IS_UNICODE duality. We still have two kinds of data: an unstructured bit stream and structured text. If we want strlen("превед") to return 6, since that Russian word has 6 characters, then we have no choice but to recognize that it's not just a collection of bits but Unicode text, and that would require a separate type, as I see it. And as I see it, this is the source of the problems: people try to operate on text as on a bit stream and vice versa.

Unless I totally missed what mess you are referring to...

-- 
Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com
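
To make the bit-stream versus text distinction above concrete, here is a minimal sketch. It is written in Python rather than PHP 6, purely as an analogy: Python's `str` and `bytes` stand in for IS_UNICODE and IS_STRING, and the helper name `mixing_fails` is made up for illustration. It assumes UTF-8 as the byte encoding.

```python
# Analogy only: Python's str/bytes stand in for IS_UNICODE/IS_STRING.
word = "превед"               # structured text: a sequence of code points
data = word.encode("utf-8")   # the same word as an unstructured byte stream

text_len = len(word)   # counts characters (code points)
byte_len = len(data)   # counts bytes; each Cyrillic letter takes 2 in UTF-8


def mixing_fails():
    """Return True if combining text with raw bytes is rejected."""
    try:
        word + data  # two distinct types, so no implicit mixing
    except TypeError:
        return True
    return False


print(text_len, byte_len, mixing_fails())  # prints: 6 12 True
```

The same six-letter word has length 6 as text and 12 as bytes, which is why a single type cannot give both answers without being told at each step which interpretation is meant.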