From: antony@zend.com (Antony Dovgal)
Date: Tue, 13 Sep 2005 12:21:21 +0400
To: Andi Gutmans
CC: Andrei Zmievski, php-dev, Dmitry Stogov
Newsgroups: php.internals
Message-ID: <43268C01.20006@zend.com>
In-Reply-To: <6.2.3.4.2.20050912175136.04449320@localhost>
Subject: Re: [PHP-DEV] unserialize() & unicode issues

Even if the class name is in Unicode, we can try to convert it to ASCII
and fail only if we can't find its class entry in the list.
So I don't see any need for markers or other fairly major changes.

On 13.09.2005 04:54, Andi Gutmans wrote:
> Not coming with a solution, but I believe this would be a bad idea. I
> do think some people will be using IS_UNICODE strings when
> unicode_semantics=off, mainly for existing applications. They may
> want to serialize Unicode strings even though their classes are
> IS_STRING. It might make sense to raise an error though if a "class"
> is used, but if it's just a value or a hash key, then those are valid
> in unicode_semantics=off.
>
> Andi
>
> At 06:44 AM 9/9/2005, Andrei Zmievski wrote:
>>Yes, serialization is a problem. I would actually advocate putting a
>>marker in the serialized file that indicates what the value of
>>unicode_semantics switch was during the serialization, and if the
>>value is different during deserialization, refuse to load it or start
>>a new session. One really should not be changing that switch on a
>>whim in-between sessions.

-- 
Wbr,
Antony Dovgal
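
The fallback proposed in the reply (narrow the class name to ASCII, and fail only when the class lookup misses) can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the engine's C, and it is not the actual unserialize() code. `CLASS_TABLE` and `lookup_class` are hypothetical names, and treating a name that cannot be narrowed to ASCII as "class not found" is an assumption about how that case would be handled.

```python
# Hypothetical stand-in for the engine's class table, keyed by lowercased
# ASCII class name. The entries are placeholders for illustration only.
CLASS_TABLE = {"stdclass": object, "arrayobject": dict}


def lookup_class(name):
    """Resolve a class name that may have arrived as a Unicode string.

    Returns the registered entry, or None when the name cannot be narrowed
    to ASCII or no class with that name is registered.
    """
    try:
        # Raises UnicodeEncodeError on any non-ASCII code point.
        ascii_name = name.encode("ascii").decode("ascii")
    except UnicodeEncodeError:
        # Assumption: a name that is not pure ASCII cannot match any entry.
        return None
    # PHP class names are case-insensitive, so look up the lowercased form.
    return CLASS_TABLE.get(ascii_name.lower())
```

With this shape, a Unicode-typed but ASCII-only name such as "stdClass" resolves normally, and only a genuinely unknown name produces a failure, so the serialized format itself needs no extra marker.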