Newsgroups: php.internals
Xref: news.php.net php.internals:18816
Message-ID: <4326A4E1.30304@zend.com>
Date: Tue, 13 Sep 2005 14:07:29 +0400
From: antony@zend.com (Antony Dovgal)
To: Derick Rethans
CC: val khokhlov, internals@lists.php.net
References: <43215A91.8050409@zend.com> <9CF57DC5-A18B-4264-B20B-8552B0BB66F1@gravitonic.com> <6.2.3.4.2.20050912175136.04449320@localhost> <43268C01.20006@zend.com> <1031242468.20050913123221@vk.kiev.ua> <43269F21.5030705@zend.com>
Subject: Re: [PHP-DEV] unserialize() & unicode issues

On 13.09.2005 13:52, Derick Rethans wrote:
> On Tue, 13 Sep 2005, Antony Dovgal wrote:
>
>> On 13.09.2005 13:32, val khokhlov wrote:
>> > Hello Antony,
>> >
>> > Tuesday, September 13, 2005, 11:21:21 AM, you wrote:
>> >
>> > AD> Even if the class name is in Unicode, we can try to convert it to ASCII
>> > AD> and fail only in the case when we can't find its class entry in the
>> > AD> list.
>> > I think, it's not the only way.
>> > If we don't care about being compatible with previous PHP's
>> > serialize(), a more portable way is to store class/property names in
>> > unicode (if unicode_semantics=off when serializing, convert hash keys to
>> > unicode). Since we do know script encoding, we can always downgrade
>> > unicoded names into local encoding.
>>
>> So you propose to store strings/hash keys/class names in Unicode even if
>> unicode_semantics is Off ?
>> It looks like adding unnecessary overhead to me.
>
> But needed, as even with the semantics off, you can get unicode strings.
> Which can end up as array keys.

Yes, in this case there is no way to avoid converting (while doing
unserialize()), but I don't see any point in creating Unicode strings
when serializing with unicode_semantics Off.

--
Wbr,
Antony Dovgal