Newsgroups: php.internals
Xref: news.php.net php.internals:117542
From: craig@craigfrancis.co.uk (Craig Francis)
To: Rowan Tommins
Cc: internals@lists.php.net
Date: Tue, 19 Apr 2022 12:34:58 +0100
In-Reply-To: <42D0A480-F262-4F72-9C4D-887762A8D800@gmail.com>
Subject: Re: [PHP-DEV] NULL Coercion Consistency

On Sat, 16 Apr 2022 at 12:17, Rowan Tommins wrote:

> On 8 April 2022 18:34:52 BST, Craig Francis wrote:
> >I've written a new draft RFC to address the NULL coercion problems:
> >
> >https://wiki.php.net/rfc/null_coercion_consistency
>
> I'm sympathetic to the general problem you're trying to solve, but I'm not
> convinced by the argument that this is about consistency, because user
> defined functions have been consistently rejecting null for non-nullable
> parameters since PHP 7.0, and even before that for arrays and objects.

Thanks for looking at this Rowan. I am open to discussing different solutions (this is my second attempt), and any suggestions are welcome.

With regard to consistency: while I think my RFC provides it, it's not my main motivation for creating this RFC, which is to address the backwards compatibility issues.

But to expand on consistency: I appreciate that user-defined functions (since 7.0) haven't coerced NULL, and I do want internal and user-defined functions to be consistent with each other (which is why I agree with the spirit of the original RFC). It's just odd that string/int/float/bool can be coerced, while NULL cannot be coerced in this context but can be in others, e.g.

```
$i = 0;
$n = NULL;

if ($i == $n) {
  // Fine
}

print($i); // Fine
print($n); // Fine

printf('%s', $i); // Fine
printf('%s', $n); // Fine

echo 'ConCat ' . $i . $n; // Fine

echo htmlspecialchars($i); // Fine
echo htmlspecialchars($n . ''); // Fine
echo htmlspecialchars($n); // Bad???
```

> Consistency is a good argument for small changes that eliminate unusual
> edge cases, but I think far-reaching changes should be able to stand on
> their own merits - regardless of whether it's consistent, do we want the
> language to work this way?

Good question. While I mention consistency, my focus is on backwards compatibility, and on how PHP has accepted NULL for these ~335 parameters since, well, forever... My fear is that the upgrade to PHP 9 will be so difficult (as noted with the lack of tooling to solve this) that I'll be auditing PHP code that's stuck on 8.x for many years to come.

https://twitter.com/mwop/status/1441044164880355342

> To me, the main defence of type coercion is that PHP operates in a world
> full of strings - URLs are strings, HTTP requests are strings, and a lot of
> API responses are strings - so making it easy to work with strings like
> '42' as numbers makes a lot of sense. It's not clear to me that this
> extends to null.

Yep, PHP does work in a world of strings, and for simplicity string coercion works fairly well...
but I think I can include NULL in that definition, because HTTP requests do not guarantee that values have been provided, and PHP (via filter_input) and frameworks (Laravel, Symfony, CakePHP, CodeIgniter, etc.) have used NULL to note when a value has not been provided... and many developers have simply not cared about that distinction (e.g. when passing to htmlspecialchars, trim, strlen, preg_match, etc.).

> I think a large part of the problem here is that null can mean many
> different things in different contexts - "unknown", "not provided",
> "invalid input", "default", "not applicable", etc.
>
> These differences are subtle, but lead to different expectations of
> behaviour:
>
> - Treat null as a specific case with its own meaning, distinct from any
> other valid value. This is what explicitly nullable parameters and union
> types allow.
> - Treat null the same as any other out of range value, and raise an error.
> This is what happens in user-defined functions in PHP, and in built-in
> functions expecting non-scalar arguments. Compare also out of range actions
> like division by zero.
> - Treat null as a special state that propagates through expressions,
> because any operation with an unknown input has an unknown output. This is
> the approach to null taken by SQL, and by IEEE floating point with NaN. It
> is also the basis of the ?-> operator, and of things like Optional.map in
> Swift.
> - Treat null as a generic default value which can be filled in implicitly
> based on requirements. This is the interpretation currently taken by
> internal functions for scalar arguments, and what you are proposing to make
> standard. It is also, as you point out, the way PHP treats null in some
> other contexts, such as many operators.

Thank you, that's a really good explanation.
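
The four readings of NULL listed above can be sketched roughly as follows. This is only an illustration, assuming PHP 8.1 or later; the function names (`greet`, `shout`, `escapeOrNull`, `escapeOrEmpty`) are hypothetical and come from neither the RFC nor the thread.

```php
<?php
// Sketch only: the function names below are hypothetical, not from the RFC.

// 1. NULL as a distinct case: an explicitly nullable parameter.
function greet(?string $name): string
{
    return $name === null ? 'Hello, guest' : 'Hello, ' . $name;
}

// 2. NULL as an out-of-range value: a non-nullable parameter on a
//    user-defined function rejects NULL with a TypeError.
function shout(string $text): string
{
    return strtoupper($text);
}

// 3. NULL propagates: the operation is skipped and NULL is returned.
function escapeOrNull(?string $value): ?string
{
    return $value === null ? null : htmlspecialchars($value);
}

// 4. NULL as a generic default: coerced to an empty string before use.
function escapeOrEmpty(?string $value): string
{
    return htmlspecialchars((string) $value);
}

$name = null; // e.g. a form field that was not submitted

$greeting = greet($name);

try {
    shout($name);
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

$propagated = escapeOrNull($name);
$defaulted  = escapeOrEmpty($name);
```

The fourth function spells out, via an explicit `(string)` cast, the coercion that internal functions have traditionally applied implicitly to scalar parameters.
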
The developers I work with would assume the last definition, in the same way they can pass an integer to `urlencode()` without thinking about it being implicitly coerced into a string; it just works (TM)... That's not to say the `strict_types=1` style of checking, and NULL being treated as an invalid value, doesn't provide value for some developers (but as shown in the stats, they are in the minority).

> Interestingly, one of your examples mentions filter_input, which takes the
> "propagate" approach, and htmlspecialchars, which doesn't. It would often
> be more useful to retain the information that a value is null than to have
> it silently converted to an empty string as a side-effect of some other
> operation.

Those are interesting points. I think `filter_input()` is a bit different, as it's a source of data (not exactly propagating), whereas the other functions either process that data (returning a string, not propagating NULL) or work with it (e.g. preg_match, strcmp and strlen returning an integer, etc.).

Retaining NULL is interesting, but it would be a new thing...

> Perhaps it would be useful to have some function-call equivalent of the ?->
> operator. I'm not sure what this would look like for normal function calls,
> but it would be easy to add if we had a pipe operator, e.g.:
>
> If this was equivalent to htmlspecialchars($foo)
> $foo |> htmlspecialchars(...)
>
> Then this could be equivalent to ($foo === null ? null :
> htmlspecialchars($foo))
> $foo ?|> htmlspecialchars(...)

... adding that syntax might be useful in some contexts (when the developer wants to make NULL easier to propagate); but that would be a new feature, and I don't think it solves the backwards compatibility issue we currently face, which is what I'm trying to address in this RFC.

> I'm not set against this RFC, but I'm not quite convinced by the case it
> makes, and think there may still be other options to explore.

I'm open to any suggestions; I just want to avoid the situation where fatal Type Errors happen seemingly at random (as in, for someone who has never used type checks before and who uses NULL without any thought about it).

I'm even open to the idea of someone coming up with a tool to auto-update existing code, but so far I've only heard people say that is the way it should be done (with absolutely no detail on how it would even begin to work).

I also cannot explain why NULL should be rejected, other than for those developers who see NULL as an invalid value and also use `strict_types=1`... As in, when a team of developers spends a few hundred hours adding strval() everywhere, what does that actually achieve? What do they tell the client? Does it make the code easier to read? Does it avoid any bugs? Or is it only for 8.1 compatibility?

Anyway, thanks again Rowan, I really appreciate your thoughts on this.

Craig