From: tokul@users.sourceforge.net ("Tomas Kuliavas")
To: internals@lists.php.net
Date: Fri, 29 Jun 2007 12:01:17 +0300 (EEST)
Subject: Re: [PHP-DEV] What is the use of "unicode.semantics" in PHP 6?
Message-ID: <41782.78.61.224.253.1183107677.squirrel@avilys.eik.lt>
In-Reply-To: <4684BB91.4070507@zend.com>

>> Unicode code points can be defined with \u, but PHP6 breaks existing
>> octal and hex escape sequences.
>
> What do you mean? Doesn't \x20 create U0020 character? Or you mean you'd
> expect it to create just one-byte 0x20? Doesn't binary string do that?

Try values higher than 0x7F. If I write "\xA0", I expect a single byte with
hex value A0, not the two-byte sequence \xC2\xA0 (the UTF-8 encoding of
\u00A0). If I use the \x80-\xFF range, I expect functions to match bytes,
not only the code points \u0080-\u00FF.

Binary strings can do that, but they are not backwards compatible. To do
the same thing in PHP4/5 and PHP6, I'll have to move the code into separate
libraries.

>> PHP6 is very noisy ("Notice: fwrite(): 13 character unicode buffer
>> downcoded for binary stream runtime_encoding", "Warning: base64_encode()
>> expects parameter 1 to be strictly a binary string, Unicode string
>> given")
>
> Well, exporting and importing to and from non-unicode contexts are
> tricky, and fwrite and base64_encode do exactly that. Maybe some
> functions need to be less noisy, I don't know - but when people work
> with unicode they must be aware that interoperating with non-unicode
> contexts brings some complexity, I don't see how that can be avoided.

For me it means that I have to maintain wrappers for fwrite, base64_encode,
ord, crc32 and all other Unicode-aware functions.
Any direct call to a PHP string or stream function can cause compatibility
issues or notices. Any function working with binary data will need a
separate version for PHP6. Instead of having Unicode switches in the
interpreter itself, I'll have to implement them in scripts. Talk about
performance issues after that.

--
Tomas