Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:118610 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 53012 invoked from network); 12 Sep 2022 19:46:26 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 12 Sep 2022 19:46:25 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id BC620180538 for ; Mon, 12 Sep 2022 12:46:24 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,HTML_MESSAGE, RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.2 X-Spam-ASN: AS15169 209.85.128.0/17 X-Spam-Virus: No X-Envelope-From: Received: from mail-yb1-f172.google.com (mail-yb1-f172.google.com [209.85.219.172]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Mon, 12 Sep 2022 12:46:22 -0700 (PDT) Received: by mail-yb1-f172.google.com with SMTP id y82so14416858yby.6 for ; Mon, 12 Sep 2022 12:46:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date; bh=W4MGkXneFqntqcAqSzEaXYTl+KJjJZ9CPr3y6TdwB+w=; b=ocooMxiZiNRbgu8apXaW9wJXENh7mbozZqd01REHoWLiT+8Cg7KXoV9DqqUKvGeTfy CRCN+PfEYSncG9lbFwrkiwU4/axnK/DKsuKQXl8ofkO2h7jf6PwjDNoV5XqrpuWdPZTR f2PivGKdpeNPEv7CfzBJ8fp8a8AIA0vNafrs0zJzEhl92bfBYNh6r9JBd9HQMgjXD6Fb Wy1wz9hqXF5dYjFxj86YNsXZAbe27RKOkntA2Z5fawFt9GI/5d0ijKEFDCo3ZuQpJ/cE GgzknPHVtjV06UUcf98JxKsYUD6rsVoeYBg2zBWhcN8B6JSNlmPuWCABYZvkyWhpfBGy qsQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date; bh=W4MGkXneFqntqcAqSzEaXYTl+KJjJZ9CPr3y6TdwB+w=; b=EI60oT+0x1AfhMDAl6b8Ibuq3SRphw+j6FapNn3sgmJti0Qhhnfe3oFHGWkXJSBa6E 2jief9uUWLpK2GkWZWdaDkqDwvpuZLH8Y48IuPuYS3PhuTkG+kVrW7XyR7pbqEYKXbm3 VoWBoh2n+/cbAQN2UJaVyMARvmBGosTvaKIpJjhBkFBnPt3teSTPh0vJo9ru4ngwHTsa Fs45YTuS1krca94NXg6aswojDy6SAjgfAaL/g6K+56mFl3mkMqPSOl5qHyk1JYYPZtE+ kGjhJTE+pgFl7wILYvNlFwFvLPDFmVWpcZ/arqaCIIrYEqTvjezAOOXa7K/L3+YYa5Dt RBJg== X-Gm-Message-State: ACgBeo1khpkG5AtRhJhWZ9s+nmajSLj+U52/bF2esGDSrVYU/nnH899h VvAzrHBnA8sMvKOthkG7SH8gk5vZ6pfpFTGBPI8ub/qFgFs= X-Google-Smtp-Source: AA6agR5NFF/oviqpdfLkhvA/rg8Kz40Amn7ds4U+jz2bkKst8XPN6+xtIdQmC+1JLgwP6DWXLnVBJeu2Pa4N3RHP828= X-Received: by 2002:a05:6902:1245:b0:6ae:325e:166a with SMTP id t5-20020a056902124500b006ae325e166amr19396666ybu.206.1663011981852; Mon, 12 Sep 2022 12:46:21 -0700 (PDT) MIME-Version: 1.0 References: <530b3a9d-0ee4-6061-8c69-df672d238032@bastelstu.be> <51a57a4d-ccb5-a581-205e-e2684e4db8c8@bastelstu.be> In-Reply-To: <51a57a4d-ccb5-a581-205e-e2684e4db8c8@bastelstu.be> Date: Mon, 12 Sep 2022 21:46:10 +0200 Message-ID: To: =?UTF-8?Q?Tim_D=C3=BCsterhus?= Cc: internals@lists.php.net Content-Type: multipart/alternative; boundary="0000000000006f8cd705e88026de" Subject: Re: [PHP-DEV] RFC [Discussion]: Improve unserialize() error handling From: nicolas.grekas+php@gmail.com (Nicolas Grekas) --0000000000006f8cd705e88026de Content-Type: text/plain; charset="UTF-8" > On 9/8/22 19:06, Nicolas Grekas wrote: > > From what I understand, there are two things in the RFC: > > > > 1. a proposal to wrap any throwables thrown during unserialization > > inside an UnserializationFailedException > > Correct. > > > 2. a discussion to figure out a way to turn these notices+warnings > into > > exceptions. > > Partly correct: > > The RFC primarily proposes unifying the existing non-Exception errors > (which are currently split between E_WARNING and E_NOTICE with no > apparent pattern). To *what* exactly (E_WARNING or some Exception) they > are unified is up for discussion/vote. > Thanks for clarifying. Our accepted release process policy says "Backward compatibility must be respected with the same major releases". I think it's critical to stick to this policy as much as possible and also to plan for the smoothest upgrade path when preparing the next major (deprecations FTW.) That's what I have in mind when I review the RFC and where I don't see the match yet. I know we're only at the beginning of the discussion and we're still looking for what would work best. > > About 1., I read that you think "this is not considered an issue", but to > > me, changing the kind of exceptions that a piece of code might throw is a > > non negligible BC break. There needs to be serious justification for it. > I > > tried looking for one, but I'm not convinced there is one: replacing the > > existing catch(Throwable) by catch(UnserializationFailedException) won't > > provide much added value to userland, if any. And "reliably include more > > than just the call unserialize() in the try" is not worth the BC break > IMHO. > > unserialize() is a generic function that will also call arbitrary > callbacks. You already have no guarantees whatsoever about the kind of > exception that is thrown from it. I am unable to think of a situation > where the input data is well-defined enough to reliably throw a specific > type of exception, but not well-defined enough to successfully unserialize. > > So on the contrary wrapping any Exceptions with a well-defined Exception > allows you to rely on a specific and stable Exception to catch in the > first place. > > As you specifically mention catch(Throwable), I'd like to point out that > this will continue to work, because UnserializationFailedException > implements Throwable. > You might have misunderstood my point so let me give two examples: Example 1. set_error_handler(fn ($t, $m) => throw new ErrorException($m)); try { $ser = serialize($value); catch (ErrorException $e) { // deal with $e finally { restore_error_handler(); } Example 2. try { $ser = @serialize($value); catch (Exception $e) { // deal with $e but don't catch Error on purpose, to let them bubble down } Changing serialize to wrap any throwables into an UnserializationFailedException would break both examples. That's what I mean when I say this breaks BC. As any BC-break, this might cause big troubles for the community and this should be avoided. The release process policy I mentioned above is a wise document. > > About 2., unserialize() accepts a second argument to give it options. Did > > you consider adding a 'throw_on_error' option to opt-in into the behavior > > you're looking for? I think that would be a nice replacement for the > > current logics based on set_error_handler(). If we want to make PHP 9 > throw > > by default, we could then decide to deprecate *not* passing this option. > > I did not consider this when writing the RFC. I now considered it, and I > do not believe adding a flag to control this a good thing. > > 1. "No one" is going to set that flag, because it requires additional > effort. I strongly believe that the easiest solution should also the > correct solution for the majority of use cases. The flag fails this > "test", because the correct solution should be "don't fail silently". > Unless we deprecate calling the method without the argument as I suggested. I agree, it might not be ideal to ask the community to rewrite every call to serialize($v) to serialize($v, ['throw_on_error' => true/false]) but that's still way better than a BC break. Maybe there's another way that doesn't break BC. I'm looking forward to one. 2. If you actually want to set that option in e.g. a library, then you > break compatibility with older PHP versions for no real gain. If you go > all the way and remember to add that extra flag, then you can also write > an unserialize wrapper that does the set_error_handler dance I've shown > in the introduction of the RFC. Similar effort to adding the flag, but > better compatibility. > I think my proposal provides BC with older versions of php: passing extra (unused) options is allowed and polyfilling UnserializationFailedException is trivial, so one could set_error_handler(fn (...) => throw new UnserializationFailedException(...)) and be both BC and FC. About writing a wrapper, that's what we all do right now, each with our own wrapping logic. You're looking for unifying those logics and I support that goal. But this leads to another idea : what about adding a new function, let's name it unserialize2() for now, and deprecate unserialize()? That'd provide the upgrade path we need. > 3. It does literally nothing for users that use a throwing error > handler, which to my understanding includes the vast majority of > framework code. > I'm not sure what you mean here. The flag would allow removing those error handlers, which is what you want to achieve in the end I believe (well, except when BC with older versions is desired of course, see my previous comment.) > 4. Even for users that do not use a throwing error handler, omitting the > option literally does nothing, because unserialize() might already throw > depending on what the unserialize handlers of unserialized objects do. > Sure. I don't see how that's a problem. Throwing is what almost any lines of PHP code can do. We're used to dealing with that and I don't see why serialize() needs special care. I might have missed the point you want to make here though. > > Lastly I'd like to add a 3. to your proposal, because there is one more > > thing that makes unserialization annoying: the unserialize_callback_func > > < > https://www.php.net/manual/var.configuration.php#ini.unserialize-callback-func > > > > ini setting. It would be great to be able to pass a 'callback_func' > option > > to unserialize to provide it, instead of calling ini_set() as we have to > > quite often right now. > > > > Would that make sense to you? > > > > TIL about that ini setting. Can you clarify where this callback comes in > helpful? What can it do for you what your autoloader can't do? I've > attempted to search GitHub to find out about the use cases, but almost > all of the results are copies of php-src that match the .phpt tests. > Sure: that callback is called when PHP tries to create an object of class Foo but the autoloader fails to load said class. When there is no handler, PHP creates a __PHP_Incomplete_Class object, but I regularly use this callback to throw instead when I don't want those strange incomplete objects (note that I also have seen use cases where generating a __PHP_Incomplete_Class *is* the desired behavior.) However as of now I do not believe that it is appropriate to include in > my RFC, because it is only indirectly related to error handling. I'd > like to keep the RFC focused. This makes it easier for readers to > understand the proposal, allowing voters to make an educated vote and > serving as longer-term documentation if the vote passes. > I proposed that because in many situations unserializing a payload to something that contains __PHP_Incomplete_Class objects is not desired and should be leveled up to an error instead. That's the relation with error handling and thus with the RFC. Nicolas --0000000000006f8cd705e88026de--