Newsgroups: php.internals
From: michal.brzuchalski@gmail.com (Michał Brzuchalski)
Date: Fri, 21 Apr 2017 14:47:18 +0200
To: Nikita Popov
Cc: PHP Internals List
Subject: Re: [PHP-DEV] A replacement for the Serializable interface

I know my voice doesn't mean anything, but IMHO an interface with magic
methods could bring more inconsistency.
I know PHP is consistently inconsistent, but I would prefer, if possible,
to fix the issue with the present method naming.

Cheers

On 21.04.2017 11:39, "Nikita Popov" wrote:

> Hi internals,
>
> As you are surely aware, serialization in PHP is a big mess. Said mess is
> caused by some fundamental issues in the serialization format, and
> exacerbated by the existence of the Serializable interface. Fixing the
> serialization format is likely not possible at this point, but we can
> replace Serializable with a better alternative and I'd like to start a
> discussion on that.
>
> The problem is essentially that Serializable::serialize() is expected to
> return a string, which is generally obtained by recursively calling
> serialize() in the Serializable::serialize() implementation. This
> serialize() call shares state information with the outer serialize(), to
> ensure that two references to the same object (or the same reference) will
> continue referring to a single object/reference after serialization.
>
> This causes two big issues:
>
> First, the implementation is highly order-dependent. If
> Serializable::serialize() contains multiple calls to serialize(), then
> calls to unserialize() have to be repeated **in the same order** in
> Serializable::unserialize(), otherwise unserialization may fail or be
> corrupted. In particular this means that using parent::serialize() and
> parent::unserialize() is unsafe. (See also
> https://bugs.php.net/bug.php?id=66052 and linked bugs.)
>
> Second, the existence of Serializable introduces security issues that we
> cannot fix. Allowing the execution of PHP code during unserialization is
> unsafe, and even innocuous looking code is easily exploited. We have
> recently mitigated __wakeup() based attacks by delaying __wakeup() calls
> until the end of the unserialization. We cannot do the same for
> Serializable::unserialize() calls, as their design strictly requires the
> unserialization context to still be active during the call. Similarly,
> Serializable prevents an up-front validation pass of the serialized string,
> as the format used for Serializable objects is user-defined.
>
> The delayed __wakeup() mitigation mentioned in the previous point also
> interacts badly with Serializable, because we have to delay __wakeup()
> calls to the end of the unserialization, which in particular also implies
> that Serializable::unserialize() sees objects prior to wakeup. (See also
> https://bugs.php.net/bug.php?id=74436.)
>
> In the end, everything comes down to the fact that Serializable requires
> nested serialization calls with context sharing.
>
> The alternative mechanism (__sleep + __wakeup) does not have these issues
> (anymore), but it is not sufficiently flexible for general use: Notably,
> __sleep() allows you to limit which properties are serialized, but the
> properties still have to actually exist on the object.
>
> I'd like to propose the addition of a new mechanism which essentially works
> the same way as Serializable, but uses arrays instead of strings and does
> not share context. I'm not sure about the naming (RealSerializable,
> anyone?), so I'll just go with magic methods __serialize() and
> __unserialize() for now:
>
>     public function __serialize() : array;
>     public function __unserialize(array $data) : void;
>
> From a userland perspective the implementation should be the same as for
> Serializable methods, but with interior serialize()/unserialize() calls
> stripped out. Right now Serializable implementations already usually work
> by doing something like "return serialize([ ... ])", this would change it
> to just "return [ ... ]" and move the serialize()/unserialize() call into
> the engine, where we can perform it safely and robustly.
>
> The new methods should reuse the "O" serialization format, rather than
> introducing a new one. This allows a measure of interoperability with
> previous PHP versions, which can still decode serialized strings from newer
> versions using __wakeup().
>
> If an object has both __wakeup() and __unserialize(), then __unserialize()
> should be called. If an object implements both Serializable::unserialize()
> and __unserialize(), then we should invoke one or the other based on
> whether "C" or "O" serialization is used.
>
> Thoughts?
>
> Nikita
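
For illustration, the mechanism proposed in the quoted message might look
roughly like this in userland. This is only a sketch, not code from the
thread: the Connection class, its open() helper, and the 'dsn' array key are
hypothetical names. It also assumes a PHP version whose engine actually calls
__serialize() and __unserialize() (PHP 7.4 and later do).

```php
<?php
// Hypothetical example class; none of these names come from the thread.
final class Connection
{
    /** @var string */
    private $dsn;

    /** @var string runtime-only state that must never be serialized */
    private $handle;

    public function __construct(string $dsn)
    {
        $this->dsn = $dsn;
        $this->handle = self::open($dsn);
    }

    // Stand-in for opening a real connection (assumption for the sketch).
    private static function open(string $dsn): string
    {
        return 'handle:' . $dsn;
    }

    // A Serializable::serialize() implementation would typically do
    // "return serialize([...])" here. The proposed method returns the
    // plain array and leaves the serialize() call to the engine.
    public function __serialize(): array
    {
        return ['dsn' => $this->dsn];
    }

    // Receives the array back from the engine. No nested unserialize()
    // call is needed, so there is no call order to keep in sync.
    public function __unserialize(array $data): void
    {
        $this->dsn = $data['dsn'];
        $this->handle = self::open($this->dsn);
    }

    public function dsn(): string
    {
        return $this->dsn;
    }

    public function handle(): string
    {
        return $this->handle;
    }
}

$payload = serialize(new Connection('sqlite::memory:'));
$copy = unserialize($payload);

var_dump($copy->dsn());
var_dump($copy->handle());
```

Unlike __sleep(), the returned array does not have to mirror the object's
properties: here the handle is left out and rebuilt in __unserialize(). The
payload still uses the "O" format, as the proposal asks.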