Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:118589 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 40904 invoked from network); 8 Sep 2022 22:14:19 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 8 Sep 2022 22:14:19 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id 8CCFA180546 for ; Thu, 8 Sep 2022 15:14:18 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.2 X-Spam-ASN: AS24940 176.9.0.0/16 X-Spam-Virus: No X-Envelope-From: Received: from chrono.xqk7.com (chrono.xqk7.com [176.9.45.72]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Thu, 8 Sep 2022 15:14:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bastelstu.be; s=mail20171119; t=1662675256; bh=rboNYHSugPdl+kmIfe9XTb73PTBn4jNUAu5mexZaupM=; h=Date:Subject:To:References:From:In-Reply-To:From; b=brfxtpFZ3/Kri1DoWlPuKyYAHbjbhxpdqOLVYDH8LxIdSNIuQVHoxR99n5ISMtbDy W54D2lRgKwH3LYaTMgfWm9rI4bxc/NQztdgNDulWCJEddtVyJQCrakgPm2DkWWsZ0J QYu842vVSIENOqMSfjD/6J0DCpVvN7kGxrzbkete/aNA4sKJsq25MwhsJULR0JcOME /YxTGwuux4kdUcBBHiYOFt+vgfyEn8uCEct47+KDe98PtDzrregGaliz72uF/HkdYY S89NyoIvp+i4MD2HX34VaDfUiDe15kbOXGn+gNh5SpE8n5R1lrwCRmtbAaGswKh8xz sllReUmy7Rl/Q== Message-ID: <51a57a4d-ccb5-a581-205e-e2684e4db8c8@bastelstu.be> Date: Fri, 9 Sep 2022 00:14:14 +0200 MIME-Version: 1.0 Content-Language: en-US To: internals@lists.php.net, Nicolas Grekas References: <530b3a9d-0ee4-6061-8c69-df672d238032@bastelstu.be> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Subject: Re: [PHP-DEV] RFC [Discussion]: Improve unserialize() error handling From: tim@bastelstu.be (=?UTF-8?Q?Tim_D=c3=bcsterhus?=) Hi On 9/8/22 19:06, Nicolas Grekas wrote: > From what I understand, there are two things in the RFC: > > 1. a proposal to wrap any throwables thrown during unserialization > inside an UnserializationFailedException Correct. > 2. a discussion to figure out a way to turn these notices+warnings into > exceptions. Partly correct: The RFC primarily proposes unifying the existing non-Exception errors (which are currently split between E_WARNING and E_NOTICE with no apparent pattern). To *what* exactly (E_WARNING or some Exception) they are unified is up for discussion/vote. > About 1., I read that you think "this is not considered an issue", but to > me, changing the kind of exceptions that a piece of code might throw is a > non negligible BC break. There needs to be serious justification for it. I > tried looking for one, but I'm not convinced there is one: replacing the > existing catch(Throwable) by catch(UnserializationFailedException) won't > provide much added value to userland, if any. And "reliably include more > than just the call unserialize() in the try" is not worth the BC break IMHO. unserialize() is a generic function that will also call arbitrary callbacks. You already have no guarantees whatsoever about the kind of exception that is thrown from it. I am unable to think of a situation where the input data is well-defined enough to reliably throw a specific type of exception, but not well-defined enough to successfully unserialize. So on the contrary wrapping any Exceptions with a well-defined Exception allows you to rely on a specific and stable Exception to catch in the first place. As you specifically mention catch(Throwable), I'd like to point out that this will continue to work, because UnserializationFailedException implements Throwable. > About 2., unserialize() accepts a second argument to give it options. Did > you consider adding a 'throw_on_error' option to opt-in into the behavior > you're looking for? I think that would be a nice replacement for the > current logics based on set_error_handler(). If we want to make PHP 9 throw > by default, we could then decide to deprecate *not* passing this option. I did not consider this when writing the RFC. I now considered it, and I do not believe adding a flag to control this a good thing. 1. "No one" is going to set that flag, because it requires additional effort. I strongly believe that the easiest solution should also the correct solution for the majority of use cases. The flag fails this "test", because the correct solution should be "don't fail silently". 2. If you actually want to set that option in e.g. a library, then you break compatibility with older PHP versions for no real gain. If you go all the way and remember to add that extra flag, then you can also write an unserialize wrapper that does the set_error_handler dance I've shown in the introduction of the RFC. Similar effort to adding the flag, but better compatibility. 3. It does literally nothing for users that use a throwing error handler, which to my understanding includes the vast majority of framework code. 4. Even for users that do not use a throwing error handler, omitting the option literally does nothing, because unserialize() might already throw depending on what the unserialize handlers of unserialized objects do. > Lastly I'd like to add a 3. to your proposal, because there is one more > thing that makes unserialization annoying: the unserialize_callback_func > > ini setting. It would be great to be able to pass a 'callback_func' option > to unserialize to provide it, instead of calling ini_set() as we have to > quite often right now. > > Would that make sense to you? > TIL about that ini setting. Can you clarify where this callback comes in helpful? What can it do for you what your autoloader can't do? I've attempted to search GitHub to find out about the use cases, but almost all of the results are copies of php-src that match the .phpt tests. However as of now I do not believe that it is appropriate to include in my RFC, because it is only indirectly related to error handling. I'd like to keep the RFC focused. This makes it easier for readers to understand the proposal, allowing voters to make an educated vote and serving as longer-term documentation if the vote passes. Best regards Tim Düsterhus