Newsgroups: php.internals
Xref: news.php.net php.internals:30689
From: andrei@gravitonic.com (Andrei Zmievski)
To: Tomas Kuliavas
Cc: internals@lists.php.net
Date: Mon, 9 Jul 2007 11:41:46 -0700
Subject: Re: [PHP-DEV] What is the use of "unicode.semantics" in PHP 6?

Once again, you're trying to work with bytes inside Unicode strings,
which just does not make sense. What do you propose we do, somehow
automatically detect that you used \x inside a Unicode string and
turn it into a binary one? Or simply allow one to stick any byte
sequence inside what is supposed to be a valid UTF-16 string?

If you're trying to generate a UTF-8 string on a byte-by-byte basis,
then it needs to be a binary string, I'm sorry. Whether you do this
by running in unicode.semantics=off mode or by using the b"" prefix is
up to you.

-Andrei

> unicode.fallback_encoding => 'utf-8' => 'utf-8'
> unicode.filesystem_encoding => no value => no value
> unicode.http_input_encoding => 'utf-8' => 'utf-8'
> unicode.output_encoding => 'utf-8' => 'utf-8'
> unicode.runtime_encoding => 'utf-8' => 'utf-8'
> unicode.script_encoding => 'utf-8' => 'utf-8'
> unicode.semantics => On => On
> unicode.stream_encoding => UTF-8 => UTF-8
>
> --- test.php ---
> <?php
> $string1 = "ą";
> $string2 = "\xC4\x85";
> var_dump($string1 == $string2);
> var_dump(preg_match("/[\240-\377]/",$string1));
> var_dump(preg_match("/[\240-\377]/",$string2));
> ?>
> ---
>
> ą is in UTF-8 (Latin small letter a with ogonek, Latin Extended-A
> range).
> It contains two bytes with 0xC4 0x85 values.
>
> Expected result and actual result for PHP 5.2.0:
> ---
> bool(true)
> int(1)
> int(1)
> ---
> "/[\240-\377]/" range should match 0xC4 byte.
>
> Actual result (PHP 6):
> ---
> bool(false)
> int(0)
> int(1)
> ---
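
For readers on a current PHP release, the byte versus code point distinction argued over in this thread can be sketched roughly as follows. This is an illustration only, not a reproduction of the PHP 6 behaviour quoted above: it assumes PHP 7 or later, where every string is a byte string (what the thread calls a "binary" string), and it uses the PCRE /u modifier as a stand-in for code point semantics. The variable names are hypothetical.

```php
<?php
// Assumption: PHP 7+, where all strings are plain byte sequences.

$literal = "\u{0105}";  // U+0105 given as a code point escape; PHP stores it as UTF-8
$bytes   = "\xC4\x85";  // the same character assembled byte by byte

var_dump($literal === $bytes);  // bool(true): both are the two bytes 0xC4 0x85
var_dump(strlen($bytes));       // int(2): strlen() counts bytes

// Byte-oriented match: PHP turns the octal escapes into the raw bytes
// 0xA0 and 0xFF, and without /u PCRE compares byte by byte, so 0xC4 matches.
var_dump(preg_match("/[\240-\377]/", $bytes));       // int(1)

// Code-point-oriented match: with /u the subject is decoded as UTF-8 and
// seen as the single code point U+0105, which lies outside U+00A0-U+00FF.
var_dump(preg_match('/[\x{A0}-\x{FF}]/u', $bytes));  // int(0)
```

The second pattern is single-quoted on purpose, so that PCRE rather than PHP interprets the \x{...} escapes. The last two results show why a range that matches at the byte level stops matching once the same data is treated as characters.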