Newsgroups: php.internals
Message-ID: <335A483A-55B1-4A1D-A2CF-A6DB0EDDFA5F@gravitonic.com>
In-Reply-To: <59165.88.118.163.159.1179641635.squirrel@avilys.eik.lt>
References: <51491.88.118.163.159.1179577357.squirrel@avilys.eik.lt> <464EEF4B.1030002@zend.com> <40865.88.118.163.159.1179583186.squirrel@avilys.eik.lt> <464F090A.9090200@zend.com> <35054.88.118.163.159.1179589687.squirrel@avilys.eik.lt> <464F650B.6090802@zend.com> <59165.88.118.163.159.1179641635.squirrel@avilys.eik.lt>
Cc: "Antony Dovgal" ,
 internals@lists.php.net
Date: Mon, 21 May 2007 10:13:01 -0700
To: Tomas Kuliavas
Subject: Re: [PHP-DEV] PHP Unicode extension in PHP6
From: andrei@gravitonic.com (Andrei Zmievski)

On May 19, 2007, at 11:13 PM, Tomas Kuliavas wrote:

> 0xC4 and 0x85 are hex codes for latin small letter a with ogonek in
> utf-8. ą
>
> <?php
> var_dump("ą" == "\xC4\x85");
> echo "ą\n";
> echo "\xC4\x85";
> ?>
>
> If script is written in utf-8, I expect bool(true) on var_dump() line.

var_dump("ą" == b"\xC4\x85");

This will give you what you want, if the script is written in UTF-8
and your runtime encoding is set to UTF-8.

> <?php
> // example uses utf-8. similar code is used in iso-8859-2 -
> // iso-8859-16 decoding. utf-8 decoding does not need mapping tables
> // and is written in pcre.
> $s1 = "ą";
> $s2 = "\xC4\x85";
> echo str_replace($s2,'&#261;',$s1);
> ?>
>
> Expected result: &#261;
> Got: ą
>
> test setup (php6.0-200705190630) uses trimmed php.ini with only
> unicode.semantics=on setting
>
> unicode.fallback_encoding - no value
> unicode.filesystem_encoding - no value
> unicode.http_input_encoding - no value
> unicode.output_encoding - no value
> unicode.runtime_encoding - no value
> unicode.script_encoding - no value
> unicode.semantics - On
> unicode.stream_encoding - UTF-8

Why didn't you set any encoding settings?

-Andrei
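
For reference, the byte-level comparison discussed above can be sketched for a released PHP version. This is only an illustration, not the PHP 6 behaviour under unicode.semantics=on: it assumes the file is saved as UTF-8 and run on PHP 7 or 8, where every string is a byte sequence and the b prefix is parsed but changes nothing. The helper hex_bytes() and the '[a-ogonek]' placeholder are made up for this example.

```php
<?php
// Assumption: this file is saved as UTF-8 and run on PHP 7 or 8.
// There, strings are plain byte sequences, so no runtime encoding is involved.

// Hypothetical helper: render the raw bytes of a string as uppercase hex pairs.
function hex_bytes(string $s): string
{
    return strtoupper(implode(' ', str_split(bin2hex($s), 2)));
}

$literal = "ą";          // U+0105, stored in this file as the UTF-8 bytes C4 85
$escaped = "\xC4\x85";   // the same two bytes written with hex escapes
$binary  = b"\xC4\x85";  // binary-string prefix; a no-op on these PHP versions

var_dump(hex_bytes($literal));    // string(5) "C4 85"
var_dump($literal === $escaped);  // bool(true)
var_dump($escaped === $binary);   // bool(true)

// Byte-level replacement in the spirit of the quoted str_replace() example.
// '[a-ogonek]' is a hypothetical placeholder, not taken from the original mail.
echo str_replace($escaped, '[a-ogonek]', $literal), "\n";  // [a-ogonek]
```

Under PHP 6 unicode semantics a plain "\xC4\x85" literal would presumably be read as two separate code points rather than two bytes, which is why the reply above reaches for the b"" prefix together with a UTF-8 runtime encoding.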