Newsgroups: php.internals
Message-ID: <5c455146-aee3-b1d3-f6d4-b19e0408204b@gmail.com>
Date: Mon, 30 May 2022 15:59:50 +0100
To: internals@lists.php.net
References: <1755E8B5-229B-47B2-BBAF-B5E014F5473D@craigfrancis.co.uk> <1180af01-080f-ee0a-3159-74bf7e0a8aea@gmail.com>
Subject: Re: [PHP-DEV] Re: NULL Coercion Consistency
From: rowan.collins@gmail.com (Rowan Tommins)

On 30/05/2022 15:04, Guilliam Xavier wrote:
> For this specific example, shouldn't it rather [already] be like this anyway?
>
> function getByIdentifier(string $identifier) {
>     if ( $identifier === '' ) {
>         throw new InvalidArgumentException("Empty identifier");
>     }
>     // ...
> }
>
> (but I guess you could find actual examples where an empty string is
> "valid" and null is not, indeed)

The actual code in this case ended up in a generic routine that used
isset() to choose which SQL to generate. An empty string would generate
a WHERE clause that matched zero rows, but a null would omit the WHERE
clause entirely, and match *all* rows. So an extra pre-validation on the
string format might be useful for debugging, but wouldn't produce
materially different results.

> I'm just worried to see people rather disabling deprecation notices
> (via error_reporting, or a custom error handler, or even patching
> PHP's source and recompiling their own) than "fixing" the code
> (granted, that's not specific to *this* RFC, but somewhat "highlighted" here)

Indeed, I think we have a general problem with how deprecations are
communicated and acted on.
I have been thinking about how to improve that, other than "never change
anything" or "never warn people we're going to change anything", and will
try to write up my ideas soon.

> function null_to_empty_string(?string $string_or_null): string
> { return $string_or_null === null ? '' : $string_or_null; }
>
> (but also its "opposite" empty_string_to_null(string $string): ?string)

That's actually an interesting observation. It's probably quite common to
treat empty strings as null when going from input to storage; and to treat
null as empty string when retrieving again. Importantly, databases
generally *don't* treat them as equivalent, so forgetting that translation
can be a real cause of bugs. I often advocate for string columns in
databases to allow either null or empty string, but not both (by adding a
check constraint), so that such bugs are caught earlier.

To go back to Craig's favourite example, that could be a genuine problem
caused by passing null to htmlspecialchars() - if we intended that null to
be stored as such in the database, we've silently converted it into a
non-equivalent empty string. (Yes, escaping should be done on output not
input, but it's not completely infeasible that that combination might
happen.)

Regards,

-- 
Rowan Tommins
[IMSoP]