Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:118699 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 98490 invoked from network); 28 Sep 2022 18:36:51 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 28 Sep 2022 18:36:51 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id 08B32180538 for ; Wed, 28 Sep 2022 11:36:50 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.2 X-Spam-ASN: AS24940 176.9.0.0/16 X-Spam-Virus: No X-Envelope-From: Received: from chrono.xqk7.com (chrono.xqk7.com [176.9.45.72]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Wed, 28 Sep 2022 11:36:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bastelstu.be; s=mail20171119; t=1664390207; bh=/vzW/nxvIsz9OybDwHycTTt8KEw4s46pLd9PzcaYLPA=; h=Date:Subject:To:Cc:References:From:In-Reply-To:From; b=aPP4EwDFUeL0EQpsxCp48atFEjn3Nkmj8t7NUegp2kpuICipWVU4BkhX2KaD07/2I ccVrwQFWbgdZ8cM9pCCguEknYQBkUZMLeWVLDNw3gwW0xPd4tgyRXt8siS6xdrNAfn daXR/PNHu1K/DfqSwjwC50wNwHDw0A3L3I00SyK8b9svcUhGwNfWOsVNh7UiLIOSKH mInTSwt+bXuSrb/AA7mEy57L2qIsQ7Zqs1skJbiEu12D1PR/ta/WVNw12oPksfvtfe 0kvFQ0PGTC6gQII1y6ghauUHxBtIAgzbBUoSTf8Kk/kT5IUXu06jJirE3qhLII6bpA b6tl/s8cl6TPA== Message-ID: <09eff8be-2a36-4fa3-5bb6-791df4d5bc4b@bastelstu.be> Date: Wed, 28 Sep 2022 20:36:46 +0200 MIME-Version: 1.0 Content-Language: en-US To: Nicolas Grekas Cc: internals@lists.php.net References: <530b3a9d-0ee4-6061-8c69-df672d238032@bastelstu.be> <51a57a4d-ccb5-a581-205e-e2684e4db8c8@bastelstu.be> <81f1ceb4-1086-3f1a-942d-aebbb5c2712f@bastelstu.be> <94b9c2ab-70d5-48e1-50f8-409a3073023a@bastelstu.be> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Subject: Re: [PHP-DEV] RFC [Discussion]: Improve unserialize() error handling From: tim@bastelstu.be (=?UTF-8?Q?Tim_D=c3=bcsterhus?=) Hi On 9/28/22 19:53, Nicolas Grekas wrote: > I'm not quoting the full text of the discussion because the php ML server > refused the message - too long it said. Yeah, that's fine. I also aggressively prune irrelevant parts from whatever I quote; saves traffic and storage, and also makes it easier to find the answers. Additional context is available by looking up the quoted message. >> Based on my analysis and understanding of the linked examples, the only >> one that *might* break (I don't fully understand it), is the one in >> src/Symfony/Component/VarExporter/Internal/Registry.php. The others will >> either no change or change to be more correct. >> >> […] >> > > As you saw, the BC break is real. That's what I wanted to highlight and > what worries me for the community at large. This is not some niche BC break. That's not at all what I'd say I've seen. My argument (see quote) was that while it *technically* breaks BC under a strict definition, it does not change behavior in a *negative way*. I don't find this strict definition of BC break particularly useful, because that effectively results in https://xkcd.com/1172/ ("Workflow"), preventing all progress. > There is also an argument that is missing in this conversation and that > Yasuo made (in another thread by mistake I guess?): Yes, I have seen that one. For reference of the other readers, the email is here: https://externals.io/message/118554#118684. > In general, unserializing external serialised data is the same as "Read and >> execute remote PHP code" >> (Executing arbitrary code via memory corruption is trivial task with >> unserialize()) I am aware of that and acknowledged the security issue in the third list point of this section: https://wiki.php.net/rfc/improve_unserialize_error_handling#increase_the_error_reporting_severity_in_the_unserialize_parser > In all valid use cases of unserialize(), we do trust the backend where we > get serialized payloads from. This means constructed broken payloads are > not something we need to deal with (the DateTimeImmutable example you > gave). Okay, that's fair with regard to DTI. The 'Error' case still exists for readonly classes that lose properties, so ... > The thing we need to guard against is much narrower. That makes the problem > this PR aims to solve much smaller. Truncated payloads are a thing because > of race conditions or unreliable network services. FC/BC of serialized > payloads is another thing that can only be dealt with by implementing > __unserialize() or the likes. And that's mostly it. Other errors (e.g. ... while it is true that this FC/BC of payloads needs to be dealt with by implementing __unserialize(), you can't rely on every class doing that (correctly). That's why you currently need to catch 'Throwable', not just 'Exception' and that's why the RFC proposes to wrap the Throwables. Truncated payloads should not happen even with unreliable network services, because of integrity checks embedded into the protocol (e.g. TLS). In any case, truncated payloads will currently emit E_NOTICE, so that relates to the second vote (E_WARNING to Exception) more than the first vote (wrap Exception), whereas I understand your concerns are mostly with the "wrap Exception" part. > parse error) are not caused by unserialize itself, but by regular user code. > Sorry, I don't understand what "parse error" or "other errors" you are referring to. >> "RFC: Static variables in inherited methods" - >> https://wiki.php.net/rfc/static_variable_inheritance >> >> That RFC actually changed the behavior of the application I maintain. >> You voted in favor of the RFC, but did not participate in the >> discussion. The fix was simple and I agree that the new behavior is >> better, so no hard feelings. I'd be curious though how the breakage from >> that RFC is more acceptable to you than the anticipated breakage from >> this RFC? Honest question to understand your PoV better. >> > > Rereading the RFC, Nikita described a niche and obscure behavior of the > engine. There are no numbers to support that understanding (how "niche" > this was) and that's missing in the RFC. I remember other RFCs which did > have numbers to quantify the impact. I guess it was hard to get that Understood, thank you for looking into this. I believe raw numbers are not the (most) important metric. IMO subtlety of the break is more important. A subtle change that affects a small minority is worse than a less subtle change that affects more users. > number. And that's also why I made this gist: so that the impact of the BC > break could be evaluated somehow. Many spots in Symfony likely means many > more spots in the PHP community. I'm not going to be able to conduct a I didn't and don't expect you to perform a larger-scale analysis, that's hard to do, because one has to look at the *context* of the 'unserialize()' call to determine what exactly the intended behavior is. That's what I attempted to do with your Symfony examples (I am not a Symfony user and might have misunderstood some, though) and based on that I disagree with "many spots". For the application I maintain at work, I've also looked through the unserialize() calls and concluded the the vast majority of them are buggy as they already are and I am planning to fix those independent of the results of this RFC. > larger analysis of this impact, but that can still give an intuition of it > to readers. Mine is: it's likely going to be impactful, not in a good way. > Best regards Tim Düsterhus