Newsgroups: php.internals
From: michal.brzuchalski@gmail.com (Michał Brzuchalski)
Date: Fri, 21 Apr 2017 19:43:49 +0200
To: Nikita Popov
Cc: PHP Internals List
Subject: Re: [PHP-DEV] A replacement for the Serializable interface

Thanks Nikita,

Thank you for the explanation, now I get it.
The only thing I don't know is what "O" and "C" serialization mean, but
never mind. I'll try to google it.

Cheers
--
Michał

21.04.2017 15:51 "Nikita Popov" wrote:

On Fri, Apr 21, 2017 at 2:47 PM, Michał Brzuchalski <
michal.brzuchalski@gmail.com> wrote:

> I know my voice doesn't mean anything, but IMHO an interface with magic
> methods could bring more inconsistency.
>
> I know PHP is consistently inconsistent, but I would prefer, if it is
> possible, to fix the issue with the present method naming.
>
> Cheers
>

Magic methods have a distinct backwards compatibility advantage. They
allow you to add __serialize/__unserialize to an existing class that
currently uses Serializable. Older PHP versions will then use the
Serializable implementation, while newer versions will use
__serialize/__unserialize.
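As a rough sketch of that dual setup, a class can carry both
implementations side by side. The Money class and its fields are made up
for illustration, and the snippet assumes a PHP version that actually
implements the proposed __serialize()/__unserialize() hooks:

```php
<?php
// Hypothetical class for illustration; not taken from the thread.
class Money implements Serializable
{
    private $amount;
    private $currency;

    public function __construct(int $amount, string $currency)
    {
        $this->amount = $amount;
        $this->currency = $currency;
    }

    // Legacy path: the only one an older PHP version knows about.
    public function serialize()
    {
        return serialize([$this->amount, $this->currency]);
    }

    public function unserialize($data)
    {
        [$this->amount, $this->currency] = unserialize($data);
    }

    // New path: assumed to take precedence where the engine supports it.
    public function __serialize(): array
    {
        return ['amount' => $this->amount, 'currency' => $this->currency];
    }

    public function __unserialize(array $data): void
    {
        $this->amount = $data['amount'];
        $this->currency = $data['currency'];
    }

    public function describe(): string
    {
        return $this->amount . ' ' . $this->currency;
    }
}

$payload = serialize(new Money(500, 'EUR'));
$copy = unserialize($payload);
echo $copy->describe(), "\n";
```

On an engine that knows the magic methods the payload should use the "O"
format; an older engine would ignore them and fall back to the
Serializable pair, so no shim interface is needed.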
An interface makes this a lot more complicated, because you either have
to bump your PHP version requirement (unlikely), or you have to provide
a shim interface for older PHP versions. This shim interface would then
be part of any library currently implementing Serializable, which seems
sub-optimal to me. That's why I think magic methods are better for this
case (though I don't strongly care).

Also, to answer an OTR question: We cannot simply reuse the Serializable
interface by allowing an array return value from
Serializable::serialize(). The array return value is only a means to an
end: the important part is that the new serialization mechanism does not
share serialization state -- using arrays instead of strings is just a
convenient way to achieve this. However, Serializable::unserialize()
currently shares the state and we cannot change this without breaking
BC -- so we cannot reuse this interface.

Nikita

> 21.04.2017 11:39 "Nikita Popov" wrote:
>
>> Hi internals,
>>
>> As you are surely aware, serialization in PHP is a big mess. Said mess is
>> caused by some fundamental issues in the serialization format, and
>> exacerbated by the existence of the Serializable interface. Fixing the
>> serialization format is likely not possible at this point, but we can
>> replace Serializable with a better alternative and I'd like to start a
>> discussion on that.
>>
>> The problem is essentially that Serializable::serialize() is expected to
>> return a string, which is generally obtained by recursively calling
>> serialize() in the Serializable::serialize() implementation. This
>> serialize() call shares state information with the outer serialize(), to
>> ensure that two references to the same object (or the same reference) will
>> continue referring to a single object/reference after serialization.
>>
>> This causes two big issues:
>>
>> First, the implementation is highly order-dependent.
>> If Serializable::serialize() contains multiple calls to serialize(), then
>> calls to unserialize() have to be repeated **in the same order** in
>> Serializable::unserialize(), otherwise unserialization may fail or be
>> corrupted. In particular this means that using parent::serialize() and
>> parent::unserialize() is unsafe. (See also
>> https://bugs.php.net/bug.php?id=66052 and linked bugs.)
>>
>> Second, the existence of Serializable introduces security issues that we
>> cannot fix. Allowing the execution of PHP code during unserialization is
>> unsafe, and even innocuous looking code is easily exploited. We have
>> recently mitigated __wakeup() based attacks by delaying __wakeup() calls
>> until the end of the unserialization. We cannot do the same for
>> Serializable::unserialize() calls, as their design strictly requires the
>> unserialization context to still be active during the call. Similarly,
>> Serializable prevents an up-front validation pass of the serialized
>> string, as the format used for Serializable objects is user-defined.
>>
>> The delayed __wakeup() mitigation mentioned in the previous point also
>> interacts badly with Serializable, because we have to delay __wakeup()
>> calls to the end of the unserialization, which in particular also implies
>> that Serializable::unserialize() sees objects prior to wakeup. (See also
>> https://bugs.php.net/bug.php?id=74436.)
>>
>> In the end, everything comes down to the fact that Serializable requires
>> nested serialization calls with context sharing.
>>
>> The alternative mechanism (__sleep + __wakeup) does not have these issues
>> (anymore), but it is not sufficiently flexible for general use: Notably,
>> __sleep() allows you to limit which properties are serialized, but the
>> properties still have to actually exist on the object.
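To make the __sleep() limitation concrete, here is a minimal sketch
(the Report class and its properties are hypothetical). __sleep() can
drop a property from the payload, but it only returns names of
properties that already exist, so it cannot emit a computed or
restructured value:

```php
<?php
// Hypothetical class for illustration; not taken from the thread.
class Report
{
    public $title = 'Q1';
    public $rows = [1, 2, 3];
    public $cache = 'expensive derived data';

    // Only names of existing properties may be returned here.
    public function __sleep()
    {
        return ['title', 'rows'];
    }

    // Runs after the listed properties have been restored.
    public function __wakeup()
    {
        $this->cache = null; // to be rebuilt lazily by the application
    }
}

$payload = serialize(new Report());
$copy = unserialize($payload);
```

The 'cache' property is left out of $payload, while 'title' and 'rows'
are restored on $copy.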
>>
>> I'd like to propose the addition of a new mechanism which essentially
>> works the same way as Serializable, but uses arrays instead of strings
>> and does not share context. I'm not sure about the naming
>> (RealSerializable, anyone?), so I'll just go with magic methods
>> __serialize() and __unserialize() for now:
>>
>>     public function __serialize() : array;
>>     public function __unserialize(array $data) : void;
>>
>> From a userland perspective the implementation should be the same as for
>> Serializable methods, but with interior serialize()/unserialize() calls
>> stripped out. Right now Serializable implementations already usually work
>> by doing something like "return serialize([ ... ])", this would change it
>> to just "return [ ... ]" and move the serialize()/unserialize() call into
>> the engine, where we can perform it safely and robustly.
>>
>> The new methods should reuse the "O" serialization format, rather than
>> introducing a new one. This allows a measure of interoperability with
>> previous PHP versions, which can still decode serialized strings from
>> newer versions using __wakeup().
>>
>> If an object has both __wakeup() and __unserialize(), then __unserialize()
>> should be called. If an object implements both Serializable::unserialize()
>> and __unserialize(), then we should invoke one or the other based on
>> whether "C" or "O" serialization is used.
>>
>> Thoughts?
>>
>> Nikita
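
A minimal sketch of how the proposed methods might look in userland,
assuming a PHP version that implements them as described in the quoted
proposal. Node and Edge are hypothetical classes. The point is that the
child class nests the parent's array instead of calling serialize()
itself, so the order in which __unserialize() reads the data no longer
matters, and two references to the same object should still come back
as one object:

```php
<?php
// Hypothetical classes for illustration; not taken from the thread.
class Node
{
    protected $label;

    public function __construct(string $label)
    {
        $this->label = $label;
    }

    public function __serialize(): array
    {
        return ['label' => $this->label];
    }

    public function __unserialize(array $data): void
    {
        $this->label = $data['label'];
    }

    public function label(): string
    {
        return $this->label;
    }
}

class Edge extends Node
{
    private $from;
    private $to;

    public function __construct(string $label, Node $from, Node $to)
    {
        parent::__construct($label);
        $this->from = $from;
        $this->to = $to;
    }

    public function __serialize(): array
    {
        // Plain array nesting; the engine performs the actual serialization.
        return [
            'parent' => parent::__serialize(),
            'from' => $this->from,
            'to' => $this->to,
        ];
    }

    public function __unserialize(array $data): void
    {
        // Deliberately read in a different order than it was written.
        $this->to = $data['to'];
        $this->from = $data['from'];
        parent::__unserialize($data['parent']);
    }

    public function isLoop(): bool
    {
        return $this->from === $this->to;
    }
}

$node = new Node('a');
$payload = serialize(new Edge('self', $node, $node));
$copy = unserialize($payload);
```

Here $payload should start with the "O" marker, and $copy->isLoop()
should report that both endpoints are still the same object.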