Newsgroups: php.internals
Message-ID: <014201c8bdd5$8aa482a0$4401a8c0@foxbox>
From: steph@zend.com ("Steph Fox")
To: Johannes Schlüter
Cc: "Andrei Zmievski", "Antony Dovgal"
Date: Sat, 24 May 2008 20:37:01 +0100
Organization: Zend Technologies
Subject: Re: [PHP-DEV] Unicode progress [Was: unicode.semantics adinfinitum]

Heya Johannes,

> For some functions taking binary strings is critical for working nicely
> with an automatic conversion in this case
>     crc32(u"äöü")
> and
>     crc32(b"äöü")
> would give completely different results depending on the runtime
> encoding,

Yes - but why should the user have to do the casting? Why can't the function itself cast to binary when it has an 'S' modifier? Like, during zend_parse_parameters() for example? Whatever happened to keeping PHP simple?

> relying on a implicit conversion there is most likely a bug
> (at least for apps written with PHP 6 in mind).
>
> Oh and I might probably also argue that
>     crc32(u"äöü")
> should give the crc32 of the internal representation (utf-16...) of the
> string, which is a total wtf for the user then.

Nobody's asking to be able to cast it to unicode.
I'm asking whether it's entirely necessary to force users to cast to binary all over the place, and a strict binary parameter spec looks like being one place where the cast could be done internally.

> The correct solution is to make safe use of the "S" modifier and not
> using it too much.
>
> As binary casts are allowed in modern PHP versions I don't see this as
> an issue, while such a cast isn't absolutely the best thing to do: I'd
> go with unicode_encode() to be sure about the encoding being used,
> everything else is prone to fail some time (some code changing
> unicode.runtime_encoding for some random reason...)

You're telling me an explicit cast to binary could fail internally but not externally? That doesn't make a lot of sense somehow.

- Steph
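
The crc32() disagreement quoted in this message comes down to which bytes the function ends up seeing. Below is a minimal sketch of that effect using only byte strings. It assumes the mbstring extension is loaded and that the source file is saved as UTF-8. It does not use the proposed PHP 6 u"" syntax or unicode_encode(), and crc32_in_encoding() is a hypothetical helper written for illustration, not part of PHP.

```php
<?php
// Assumption: this source file is saved as UTF-8, so $text holds UTF-8 bytes.
$text = "äöü";

// Hypothetical helper (not part of PHP): encode the text explicitly, then
// checksum the resulting bytes, so nothing depends on a runtime setting.
function crc32_in_encoding(string $utf8Text, string $encoding): string
{
    $bytes = mb_convert_encoding($utf8Text, $encoding, 'UTF-8');
    return sprintf('%08x', crc32($bytes));
}

// The same three characters produce different byte sequences per encoding,
// and therefore different checksums.
$checksums = [];
foreach (['UTF-8', 'ISO-8859-1', 'UTF-16LE'] as $encoding) {
    $checksums[$encoding] = crc32_in_encoding($text, $encoding);
    echo $encoding, ': ', $checksums[$encoding], "\n";
}
```

Naming the encoding at the call site is roughly what the unicode_encode() suggestion amounts to. An implicit or internal cast would instead choose the encoding from configuration, which is the case the thread is arguing about.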