Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:118840 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 36202 invoked from network); 18 Oct 2022 11:32:48 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 18 Oct 2022 11:32:48 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id 8BD3B1804A7 for ; Tue, 18 Oct 2022 04:32:47 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-0.6 required=5.0 tests=BAYES_00,BODY_8BITS, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.2 X-Spam-ASN: AS24940 176.9.0.0/16 X-Spam-Virus: No X-Envelope-From: Received: from chrono.xqk7.com (chrono.xqk7.com [176.9.45.72]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Tue, 18 Oct 2022 04:32:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bastelstu.be; s=mail20171119; t=1666092766; bh=+ksJ6xYa4n+eEuLHbchQarPyDb4Gj69ljqVk52UH7Lg=; h=Date:Subject:To:References:From:In-Reply-To:From; b=FSx7f6U1rb5eU1HdL85jc5rwERzpF89Z/QfqRXnXv3bjYbCUf4AO7Sb4AmzWkKAaC 2MJc63CeNezptyt53HBqOG50HoDCjOupFSDP3MgPY5ZioDEuRZJHlUvtsfh67rGo4J iG2bDDd3yHIWjVemJOBqH9y92k0lUgtd7szoKiZfpZBI22lmQP0yHx+cKvOI5lbYpG GkMeIv0hAcngqM1/NFCpWIfvfZfmGQMqZk46RMfGY9go9kgPZTNWvGsXTlVOSd7nCn q7REEqHjRS5+TiyUj8vAaPpmAQ6oWMF9K4u98gNFiJHFFK2yeyyYdXjLxCdeddL4KM +dmuR/u8eixjA== Message-ID: <5d5849c5-ca79-71e9-ccc8-5c69952fd408@bastelstu.be> Date: Tue, 18 Oct 2022 13:32:45 +0200 MIME-Version: 1.0 Content-Language: en-US To: Tim Starling , internals@lists.php.net References: <22177032-fe72-c39b-63fe-fa4368a70852@bastelstu.be> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Subject: Re: [PHP-DEV] [VOTE] Improve unserialize() error handling From: tim@bastelstu.be (=?UTF-8?Q?Tim_D=c3=bcsterhus?=) Hi On 10/18/22 03:26, Tim Starling wrote: > If I'm reading the implementation correctly, the original exception is > thrown away, there's no way to get the original message and backtrace. > That will make debugging more difficult. See Derick's reply (and my reply to Derick's reply). > I reviewed about 150 callers of unserialize() in the MediaWiki core > and extensions. There were no instances of the kind of error guarding > shown in the RFC. Occasionally, we have: > > $result = @unserialize( $data ); > if ( !$result ) { >      throw new Exception( ... ); > } > > The assumption is that if there is a failure, false will be returned, > as is documented in the manual. I don't understand the assertion in > the RFC that set_error_handler() is required to detect unserialization > failure. false is a valid value that might've been serialized. The manual specifically notes: > false is returned both in the case of an error and if unserializing the serialized false value. It is possible to catch this special case by comparing data with serialize(false) or by catching the issued E_NOTICE. Note that your example code also errors out if the $result is an empty array, an empty string or another falsy values. This might be an acceptable result in your specific case, but is not the correct behavior in the general case. The example in the RFC was written to be correct for the general case, without imposing any constraints in the input data. > Normally, exceptions in MediaWiki are allowed to propagate back to the > global exception handler. There are a couple of categories of > exceptions which should never be caught as a matter of policy: > > * Database errors. These are permanent errors. Attempting to persist > with executing the request after a database error will mostly likely > lead to another error. > > * Request timeout exceptions. We're using our Excimer extension to > implement request timeouts by throwing exceptions from a timer > callback. This has been an interesting can of worms which has brought > some nice features with a number of unexpected consequences. It means > that it's always incorrect to catch and discard an arbitrary Throwable. Please note that this is already broken independently of this RFC. It is entirely reasonable for a library to catch all Exceptions that happen internally and wrap them in a well-defined "library exception" at the library boundary. As a specific example, the "flysystem" file system abstraction library is doing this: https://github.com/thephpleague/flysystem/blob/f5e93fad64cff5f45a1f79246dae15b31099b105/src/MountManager.php#L35-L37 > Probably nothing in our ecosystem is doing database access in an > __unserialize() magic method. But if someone tried it, I would want > the resulting errors to be properly logged with correct backtraces, > not silently discarded. The internal behavior of unserialization is unspecified. It only guarantees you that you get back your original structure and that all the unserialization callbacks will have been called if unserialization fails. For the error case there are no guarantees, except "a Throwable might be thrown". As an example PHP 7.0.10 changed the behavior for unserialization failures to no longer call __destruct() on incompletely initialized objects to fix a security issue: https://3v4l.org/82UJj I believe this is the corresponding bug: https://bugs.php.net/bug.php?id=72663 In PHP 7.0.15/7.1.1 apparently there was another change: https://3v4l.org/tnh2R As such you can't rely on the database access Exception to actually be emitted if unserialization of another object also fails. Best regards Tim Düsterhus