Newsgroups: php.internals
From: johannes@php.net (Johannes Schlüter)
To: Steph Fox
Cc: Andrei Zmievski, Antony Dovgal, internals@lists.php.net
Subject: Re: [PHP-DEV] Unicode progress [Was: unicode.semantics ad infinitum]
Date: Sat, 24 May 2008 21:14:31 +0200

Hi,

On Sat, 2008-05-24 at 13:55 +0100, "Steph Fox" wrote:
> Warning: crc32() expects parameter 1 to be strictly a binary string, Unicode
> string given in ...
>
> Surely if a function's *expecting* a binary string it should do a silent
> conversion, and only throw a warning if the conversion fails? I don't see
> why the onus should always be on the user to adapt to this.

For some functions, taking binary strings is critical for working
correctly. With an automatic conversion in this case, crc32(u"äöü") and
crc32(b"äöü") would give completely different results depending on the
runtime encoding. Relying on an implicit conversion there is most likely
a bug (at least for apps written with PHP 6 in mind).

Oh, and I might also argue that crc32(u"äöü") should give the crc32 of
the internal representation (UTF-16...) of the string, which would be a
total wtf for the user.

The right approach is to use the "S" modifier with care, and not to use
it too much. Since binary casts are allowed in recent PHP versions, I
don't see this as an issue, although such a cast isn't really the best
thing to do: I'd go with unicode_encode() to be sure about the encoding
being used. Everything else is prone to fail at some point (some code
changing unicode.runtime_encoding for some random reason...).

johannes
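
To illustrate the point about crc32() and encodings, here is a minimal sketch in Python rather than PHP, chosen only because its explicit str.encode() step plays roughly the role unicode_encode() plays in the discussion above. The list of encodings is an assumption picked for the example (UTF-8 and ISO-8859-1 as plausible runtime encodings, UTF-16 as the internal representation); it is not taken from any PHP configuration.

```python
import zlib

# The same three characters as in the example above: "äöü".
text = "\u00e4\u00f6\u00fc"

# Encodings assumed for illustration only: two plausible runtime
# encodings plus UTF-16 (little-endian, no BOM) as the internal form.
encodings = ["utf-8", "iso-8859-1", "utf-16-le"]

# A checksum is defined over bytes, so the encoding has to be chosen
# before crc32 can be computed at all.
encoded = {enc: text.encode(enc) for enc in encodings}
checksums = {enc: zlib.crc32(data) for enc, data in encoded.items()}

for enc in encodings:
    print(f"{enc:12} bytes={encoded[enc].hex():14} crc32={checksums[enc]:08x}")
```

Each encoding yields a different byte sequence and therefore a different checksum, which is why naming the encoding at the call site is more robust than depending on whatever the runtime encoding happens to be.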