Newsgroups: php.internals
Message-ID: <36788.78.61.224.253.1184213665.squirrel@avilys.eik.lt>
In-Reply-To: <2193.24.1.37.132.1184202721.squirrel@www.l-i-e.com>
References: <1181829227.3478.3.camel@localhost.localdomain>
 <7d5a202f0706141844l3c75b556hdbecbcd5a43747c9@mail.gmail.com>
 <4671F184.2020401@lerdorf.com>
 <6sof73dj69ldpspfc5ukrc58qr9ckbin2b@4ax.com>
 <4677E7B1.2080305@lerdorf.com>
 <4677F5FB.1070206@lerdorf.com>
 <4678252F.2050803@sci.fi>
 <46783212.4020900@lerdorf.com>
 <34654.216.230.84.67.1183064088.squirrel@www.l-i-e.com>
 <54557.78.61.224.253.1183098089.squirrel@avilys.eik.lt>
 <2159.24.1.37.132.1183693437.squirrel@www.l-i-e.com>
 <468DDFEB.3080404@zend.com>
 <2031.24.1.37.132.1183965946.squirrel@www.l-i-e.com>
 <43868.195.22.180.233.1183968428.squirrel@avilys.eik.lt>
 <2193.24.1.37.132.1184202721.squirrel@www.l-i-e.com>
Date: Thu, 12 Jul 2007 07:14:25 +0300 (EEST)
To: internals@lists.php.net
Subject: Re: [PHP-DEV] What is the use of "unicode.semantics" in PHP 6?
From: tokul@users.sourceforge.net ("Tomas Kuliavas")

>>>>>> Unicode code points can be defined with \u, but PHP6 breaks
>>>>>> existing octal and hex escape sequences.
>>>>
>>>> I don't understand what this means...
>>>
>>> I think I know...
>>>
>>> I have code like this, somewhere:
>>>
>>> if (preg_match("|[\xF0-\xFF]|", $data)){
>>>     $data = un_microsuck($data);
>>> }
>>>
>>> un_microsuck() basically detects and converts any of the goof-ball
>>> extended ASCII from MS products (Word, Outlook, etc) to an HTML
>>> equivalent character.
>>>
>>> But now \xF0 isn't going to be ASCII 128 anymore, is it?
>>
>> \xF0 never was ASCII. ASCII (ISO-646) is 7bit character set. \xF0 is
>> decimal 240. It is 8bit.
>
> Don't tell me.
>
> Tell Microsoft.
>
> Cuz I sure as heck get a LOT of input data >> \x7f and I have to do
> something reasonable with it...
>
> And I did say "extended ASCII" in the other paragraph, after all...
>
>>> Or maybe \xF0 will "work" but the octal \360 won't?
>>
>> Are you sure that you can't do that by setting
>> unicode.something_encoding to iso-8859-1 or windows-1252?
>
> I dunno.
>
> Doesn't really matter if I can't set those in .htaccess, that's for sure.

All unicode. settings except unicode.semantics are PHP_INI_ALL.

From README.UNICODE
----
Script Encoding
===============
...
If you cannot change the encoding system wide, you can use a pragma to
override the INI setting in a local script:
----

-- 
Tomas
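
For the .htaccess question, a minimal sketch of what a per-directory override might look like. PHP_INI_ALL means a setting can be changed anywhere, including .htaccess, so the following should be accepted under mod_php. The setting name unicode.script_encoding is an assumption based on the "Script Encoding" section quoted above, and windows-1252 is only an example value taken from earlier in the thread; neither is confirmed by this message.

```apache
# .htaccess sketch, assuming Apache with mod_php and a PHP 6 build
# that provides the unicode.* INI settings.
# Assumption: unicode.script_encoding is the setting the README's
# "Script Encoding" section describes.
# Assumption: windows-1252 is the encoding the scripts are saved in.
php_value unicode.script_encoding windows-1252
```

unicode.semantics is the one exception noted above, so it would not be expected to work from .htaccess and would still have to be set system wide.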