Newsgroups: php.internals
Xref: news.php.net php.internals:118738
Date: Tue, 4 Oct 2022 11:33:50 +0100
To: internals@lists.php.net
References: <0cfb9a7b-1168-42ef-ae1a-bdc72210de43@app.fastmail.com>
Subject: Re: [PHP-DEV] Sanitize filters
From: rowan.collins@gmail.com (Rowan Tommins)

On 04/10/2022 01:38, David Gebler wrote:
> What about FILTER_VALIDATE_EMAIL which is notorious for being next to
> useless?
> [...]
> Seems to me like there could at the very least be a plausible case for some
> better [...] is_valid_email() etc. type functions in core
> to replace some of the filter API.

The "notorious" thing I know is that validating e-mail addresses is next
to impossible because of multiple overlapping standards, and a huge
number of esoteric variations that might or might not actually be
deliverable in practice.

If you think the implementation can be improved, that doesn't need a new
is_valid_email() function, just a tested and documented patch to the
existing one; if it can't be improved, then any new function will be
just as useless.

In practice, the most common typos don't result in invalid e-mail
addresses anyway, just incorrect ones - "gamil.com" instead of
"gmail.com", and so on. For those, you don't need to Validate or
Sanitize; you need to Escape and Verify: escape what you're given
(context-dependent, so necessarily part of an SMTP or API client
library), attempt to send an e-mail, and wait for the user to verify
they've received it.
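
To make the point about typos concrete, here is a minimal sketch. The
addresses and the helper name looks_like_email() are made up for
illustration; only filter_var() and FILTER_VALIDATE_EMAIL come from PHP
itself.

```php
<?php
// Hypothetical helper: true when the filter extension accepts the string
// as a syntactically valid e-mail address.
function looks_like_email(string $address): bool
{
    // filter_var() returns the filtered value on success, false on failure.
    return filter_var($address, FILTER_VALIDATE_EMAIL) !== false;
}

// A typo in the domain is still a well-formed address, so it passes.
var_dump(looks_like_email('someone@gamil.com'));  // bool(true)

// Only structurally broken input is rejected.
var_dump(looks_like_email('someone@@gmail.com')); // bool(false)
```

Neither result says whether mail to the address would reach the person
who typed it; only sending a message and waiting for confirmation can
establish that.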

On 04/10/2022 02:29, Vasilii Shpilchin wrote:
> filter_input() is the only alternative to accessing superglobal arrays
> directly. [...]
> FILTER_SANITIZE_EMAIL - helps to clean up typical mess caused by
> copy-pasting an email.
> FILTER_SANITIZE_URI - similar thing but to URIs.
> FILTER_SANITIZE_NUMBER_FLOAT - nice since it provides a flag to control
> scientific notation

None of these sounds very useful to me, but I think that just confirms
the biggest problem with the extension: it's trying to be everything to
everyone, and ends up with a bewildering set of options as a result.

I don't think any rewrite or replacement can ever avoid that problem,
because it's inherent in the problem space. I have a draft proposal I
might share soon for some "strict cast" functions, but even simple cases
like "string to integer" could have a dozen different implementations
which would all be equally "valid" according to some use case or
opinion, so it's a bit of a quagmire.

Regards,

-- 
Rowan Tommins
[IMSoP]