Newsgroups: php.internals
Xref: news.php.net php.internals:117551
From: craig@craigfrancis.co.uk (Craig Francis)
To: Rowan Tommins
Cc: internals@lists.php.net
Date: Wed, 20 Apr 2022 18:02:09 +0100
Subject: Re: [PHP-DEV] NULL Coercion Consistency

On Tue, 19 Apr 2022 at 14:17, Rowan Tommins wrote:

> On 19/04/2022 12:34, Craig Francis wrote:
> > The developers I work with would assume the last definition
>
> I think you've somewhat missed my point. I wasn't talking about people's
> habits or preferences, I was talking about different *scenarios* where
> null is used to mean different things.

Yep, and I'm thankful you listed them... I'm just trying to focus on how PHP has worked (implicitly based on requirements), and how changing that causes problems (and will be an even bigger problem when it becomes a Fatal Type Error).

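To make "how PHP has worked" concrete, here is a minimal sketch of the coercion in question. It assumes PHP 8.1 or a later 8.x release in the default coercive typing mode (no `strict_types` declaration); the variable names are made up for illustration and do not come from the thread.

```php
<?php
// Demo only: hide the PHP 8.1+ deprecation notices so the coerced
// results are easy to see.
error_reporting(E_ALL & ~E_DEPRECATED);

// Hypothetical input: an optional field that was not provided.
$nickname = null;

// Coercive typing mode: NULL is treated as an empty string.
$escaped = htmlspecialchars($nickname); // ''
$length  = strlen($nickname);           // 0

// Other scalar types are coerced too, without any deprecation notice.
$escapedInt = htmlspecialchars(15);     // '15'

// The explicit form that does not rely on NULL coercion.
$explicit = htmlspecialchars($nickname ?? '');

var_dump($escaped, $length, $escapedInt, $explicit);
```

The last call is the kind of extra code that would be needed wherever a value might be NULL, while the integer case keeps working unchanged.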
> Yes, some people prefer languages that "fails early" and some are more
> interested in "do what I mean", but not everything is about that.

Agreed, the only thing I'd add... failing early with NULL might help debug some problems (assuming there is a problem), but I believe static analysis and unit tests are much better at doing this (e.g. static analysis is able to see if a variable can be NULL, and apply those stricter checks, as noted by George with Girgias RFC). In contrast, failing early at runtime, on something that is not actually a problem, like the examples in my RFC, creates two problems (primarily upgrading; but also adding unnecessary code to explicitly cast those possible NULL values to the correct type, which isn't needed for the other scalar types).

> > I also cannot explain why NULL should be rejected ... does it avoid
> > any bugs?
>
> Yes, sometimes. Imagine an array of values provided by the user; during
> validation, those which were optional and not provided get set to null.
> You then loop through and display those which were provided:
>
>     foreach ( $fields as $name => $value ) {
>         if ( $value !== null ) {
>             echo "$name: $value\n";
>         }
>     }
>
> Then, you realise you forgot about escaping, and decide to run
> everything through htmlspecialchars():
>
>     $htmlFields = array_map('htmlspecialchars', $fields);
>     foreach ( $htmlFields as $name => $value ) {
>         if ( $value !== null ) {
>             echo "$name: $value\n";
>         }
>     }
>
> Spot the bug? $value will now never be null, because htmlspecialchars()
> will silently turn the nulls into empty strings.

Thank you... but I will add, while `htmlspecialchars()` rejecting NULL would get you to look at the code again, I wouldn't say it's directly picking up the problem, and it relies on there being a NULL value in $fields for this to be noticed (if that doesn't happen, you're now in a position where random errors will start happening in production). This is where I'd note the value of unit tests, as they are much better placed to check this feature.

Bit of a tangent, I'm uncomfortable that `$name` is not being HTML encoded, which takes us to context aware templating engines, and how you can identify these mistakes via the `is_literal` RFC or the `literal-string` type.

> I've also seen the opposite problem: a string function was removed
> because it was no longer needed, and the code broke because values,
> including nulls, were no longer being cast to string.
>
> Is protecting against this worth the backwards compatibility cost of
> changing the behaviour, and requiring extra code in other scenarios?
> Possibly not. But that's different from not having any benefit.

That's fair, adding in an extra check might give some benefit in some cases, but I don't think it's reliable, or worth the cost in updating existing code (with no tooling to help) and the additional code that will be needed to cast these possible NULL values to the relevant type (which isn't needed for the other scalar types).

That said, if you (or anyone) have any better ideas on how to address this problem, please let me know.

Craig
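
For reference, one way the `array_map()` example quoted above could keep its NULL check working, and also encode `$name`, might look like the sketch below. `escapeOrNull()` and the sample `$fields` data are hypothetical, invented purely for illustration; nothing in the thread proposes this helper.

```php
<?php
// Hypothetical helper (not from the thread): escape strings, but keep
// NULL as NULL so a later "!== null" check still means "not provided".
function escapeOrNull(?string $value): ?string
{
    return $value === null ? null : htmlspecialchars($value);
}

// Made-up sample data: one provided field, one optional field left empty.
$fields = ['name' => 'A & B', 'nickname' => null];

// With a single input array, array_map() preserves the string keys.
$htmlFields = array_map('escapeOrNull', $fields);

$lines = [];
foreach ($htmlFields as $name => $value) {
    if ($value !== null) {
        // The field name is HTML encoded as well.
        $lines[] = htmlspecialchars($name) . ': ' . $value;
    }
}

echo implode("\n", $lines), "\n";
```

A unit test that feeds in an array containing a NULL field and checks which lines are produced would catch the original mistake whether or not `htmlspecialchars()` itself rejects NULL.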