Newsgroups: php.internals
Message-ID: <430D2FCA.7010901@oracle.com>
Date: Wed, 24 Aug 2005 19:41:14 -0700
From: makoto.tozawa@oracle.com (Makoto Tozawa)
To: Andrei Zmievski
CC: christopher.jones@oracle.com, PHP Developers Mailing List
References: <430BDBAC.70701@oracle.com> <74038418fdd7cdc963d93501092f4858@gravitonic.com>
In-Reply-To: <74038418fdd7cdc963d93501092f4858@gravitonic.com>
Subject: Re: [PHP-DEV] Re: PHP Unicode support design document

Andrei Zmievski wrote:
>> Is there any way to keep the byte semantics (as opposed to Unicode
>> semantics) only for the existing functions? For example, the Oracle 8
>> functions can be configured to use UTF-8 for the character encoding of
>> strings. In order for them to work properly, the fundamental functions
>> which the Oracle 8 functions call have to behave with byte semantics.
>> And if they work properly when the Unicode semantics switch is turned
>> on, by setting the runtime_encoding to UTF-8, they can be called by
>> Unicode applications.
>
> I couldn't parse this on the first try. Could you restate this?

Say there is a function which calls strlen($s) expecting it to return the
byte size of $s, and it works fine when $s contains multibyte characters.
For example, the function expects strlen('áéí') to return 6 when the
encoding is UTF-8. If this function is called by a Unicode-ready
application on Unicode-enabled PHP, it will fail, because strlen('áéí')
will return 3.

Is there any way to let strlen('áéí') return 6 only when it is called from
the existing function?

I hope I explained it well this time.

Makoto
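
To make the 3-versus-6 difference concrete, here is a minimal sketch of the two length semantics. It is written in Python rather than PHP only because Python already separates Unicode strings (str) from byte strings (bytes); it is not the proposed PHP API. The names char_length, byte_length and legacy_fits_column are hypothetical and exist only for this illustration.

```python
# Illustration only: Python's str/bytes split stands in for the
# "Unicode semantics" and "byte semantics" discussed in the thread.

# The three characters á, é, í, written as escapes so the source file's
# own encoding cannot change the result.
s = "\u00e1\u00e9\u00ed"


def char_length(text: str) -> int:
    """Length under Unicode semantics: the number of code points."""
    return len(text)


def byte_length(text: str, encoding: str = "utf-8") -> int:
    """Length under byte semantics: the number of bytes after encoding."""
    return len(text.encode(encoding))


def legacy_fits_column(text: str, max_bytes: int) -> bool:
    """Hypothetical legacy routine that assumes 'length' means bytes,
    for example to check that a value fits a fixed-size column."""
    return byte_length(text) <= max_bytes


print(char_length(s))            # 3
print(byte_length(s))            # 6
print(legacy_fits_column(s, 5))  # False: 6 bytes do not fit in 5
```

If legacy_fits_column used char_length instead, the same call would wrongly report that the value fits, which is the kind of failure described above for existing code that relies on strlen() returning bytes.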