Subject: Re: [PHP-DEV] NULL Coercion Consistency
From: rowan.collins@gmail.com (Rowan Tommins)
To: PHP internals
Date: Sat, 30 Apr 2022 18:05:06 +0100

On 27/04/2022 16:51, Craig Francis wrote:
> Forgive this primitive example, but this shows `$name` being used in three different ways, where an automated tool cannot simply change line 1 so it doesn't return a NULL, because it would break the Register link.
>
> ```
> $name = $request->get('name');
>
> if (trim($name) === '') { // Contains non-whitespace characters; so not "", " ", NULL, etc
>   $where_sql[] = 'name LIKE ?';
>   $where_val[] = $name;
> }
>
> echo '
>   [...]
> ';
>
> if ($name !== NULL) {
>   $register_url = '/admin/accounts/add/?name=' . urlencode($name);
>   echo '
>     [...] Add Account [...]
>   ';
> }
> ```
>
> In this example, why does `trim()` and `htmlspecialchars()` justify a fatal Type Error?

Honestly, I fail to see how the inconsistent null handling in this example is anything other than a bug:

* If strings containing only spaces are considered empty, it's probably a mistake that they're not trimmed everywhere. But of course adding $name = trim($name) would remove all distinction between null and ''.
* If null is equivalent to an empty string when deciding whether to run the SQL, why is it not also equivalent to an empty string when rendering the register link? The current logic means that clicking "Go" causes a register link to appear, with an empty name on the query string, which doesn't seem like useful functionality.
* On the other hand, if NULL is considered a different state, why is there no behaviour distinguishing it around the SQL logic? It seems likely that an else clause would be added there, perhaps "else { echo 'You must enter a name.'; }" In the current code, that would appear for both null and empty strings, which is probably a mistake.

I think that actually demonstrates quite nicely that most code would benefit from treating "string" and "?string" as strictly different types, and either defaulting to an empty string explicitly and early, or considering null at every step. Simultaneously relying on null values being preserved, and them being silently coerced, leads to fragile code where it's not clear where nulls are deliberately handled as empty strings, and where they've simply been forgotten about.

>> Telling users when they've passed null to a non-nullable parameter is precisely about *preserving* that distinction: if you want null to mean something specific, treating it as a string is a bug.
>
> I don't think that represents a bug, are we are talking about a system that takes user input (so often nullable), supports coercion with other simple types, and supports NULL coercion in other contexts (e.g. string concat).

Read the sentence you replied to again: IF you want to treat null as distinct from '', THEN failing to do so is a bug.

If, on the other hand, you just want to take nullable user input and handle it CONSISTENTLY as a string, then it's fine to explicitly default to '' AT SOURCE.

>> Despite all of the above, I am honestly torn on this issue. It is a disruptive change, and I'm not a fan of errors for errors' sake; but I can see the value in the decision made back in 7.0 to exclude nulls by default.
>
> Thanks Rowan, I'll just add that I think the fatal Type Error for NULL will be much more disruptive, and I would rather relax that requirement for user defined functions, so NULL coercion works in all contexts.

I think the main difference between our positions is that I believe that if PHP's type system was designed from scratch today, null would not be silently coerced in these situations. So while I agree that the change will be disruptive, I disagree with your position that it brings no benefit.

On 27/04/2022 18:34, Craig Francis wrote:
> But I'm wondering, is it only one function? and assuming it's a problem, could we use `Z_PARAM_LONG_OR_NULL()` and specifically throw an exception when either parameter is NULL, like the `max < min` check? On the basis that I'd rather have one extra check for this function, and keep NULL coercion working everywhere else (i.e. where it's fine).

Well, since it's one of three examples in Guilliam's e-mail, the answer to the first question seems rather trivially "no", unless I'm missing something?

As for the second question, certainly we could add specific prohibitions to null on a case by case basis, but that's basically equivalent to your previous suggestion of explicitly allowing null on a case by case basis, and doesn't really answer the question of what the default behaviour should be - especially bearing in mind that any default should apply to both built-in and user-defined functions.

Regards,

-- 
Rowan Tommins
[IMSoP]
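
P.S. A minimal sketch of the two consistent options described above: defaulting to '' at source, or keeping null distinct and considering it at every step. The helper names `inputString()` and `describeName()` are hypothetical, invented purely for illustration, and the sketch assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: option 1, default to '' explicitly and at source.
// After this call there is no null left to coerce further down.
function inputString(?string $value): string
{
    return trim($value ?? '');
}

// Hypothetical helper: option 2, treat null as its own state and
// handle each of the three cases explicitly.
function describeName(?string $name): string
{
    if ($name === null) {
        return 'not submitted';
    }
    if (trim($name) === '') {
        return 'submitted but empty';
    }
    return 'submitted: ' . trim($name);
}
```

Either option on its own keeps "string" and "?string" apart; the fragile case is mixing the two for the same variable.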