Newsgroups: php.internals
From: andrei@gravitonic.com (Andrei Zmievski)
To: Makoto Tozawa
Cc: christopher.jones@oracle.com, PHP Developers Mailing List
Date: Wed, 24 Aug 2005 16:23:19 -0700
Subject: Re: [PHP-DEV] Re: PHP Unicode support design document

Hi,

On Aug 23, 2005, at 7:30 PM, Makoto Tozawa wrote:

> "HTTP Input Encoding
> ...
> If the HTTP request contains the encoding specification in the headers,
> then it will be used instead of this setting."
>
> With my best knowledge there isn't such http request header which
> specifies the encoding of the request.
> In case the intent is to honor
> the ACCEPT-CHARSET, it may cause a problem because browsers don't
> guarantee the encoding in the ACCEPT-CHARSET is same as the encoding
> used to escape characters in the URL query string. After all, the
> ACCEPT-CHARSET is to specify the character encodings acceptable for
> the response.

I took a closer look at this today, and RFC 2616 does not specify
whether user agents are supposed to send a charset parameter in the
Content-Type header of a POST request. I did not see any of my browsers
doing so. I think we can safely disregard this and rely on the
http_input_encoding and output_encoding settings. We are not going to
use Accept-Charset, for the reasons you mention.

> Is there any way to keep the byte semantics (in oppose to unicode
> semantics) only for the existing functions? For example, the Oracle 8
> functions can be configured to use utf-8 for the character encoding of
> strings. In order for them to work properly, fundamental functions,
> which Oracle 8 function call, have to behave in byte semantics. And if
> they work properly when the unicode semantics switch is turned on, by
> setting the runtime_encoding to utf-8, they can be called by unicode
> applications.

I couldn't parse this on the first try. Could you restate this?

-Andrei