Newsgroups: php.internals
Date: Tue, 6 Jul 2021 08:38:39 +0200
From: george.banyard@gmail.com ("G. P. B.")
To: Craig Francis
Cc: PHP internals
Subject: Re: [PHP-DEV] [RFC] [VOTE] is_literal

On Mon, 5 Jul 2021 at 20:15, Craig Francis wrote:

> Hi Internals,
>
> I have opened voting on https://wiki.php.net/rfc/is_literal for the
> is-literal function.
>
> The vote closes 2021-07-19
>
> The proposal is to add the function is_literal(), a simple way to identify
> if a string was written by a developer, removing the risk of a variable
> containing an Injection Vulnerability.
>
> This implementation is for literal strings ONLY (after discussion over
> allowing integers) and, thanks to the amazing work of Joe Watkins, now
> works fully with compiler optimisations, interned strings etc.
>
> Craig

Hi Craig,

Although I think the idea of the feature is useful, I'm not so sure about the implementation.
I have watched the talk you referenced a couple of times at different points in time (the first being a couple of years back), and I fail to see how this RFC is a similar implementation to it. At Google, the check is part of the type system (arguably in a weird way, but nonetheless), and because the language is compiled, the compiler will flat out refuse to produce an executable if the types mismatch. From my understanding, the RFC's implementation is similar to what Google does in that it "annotates" the string, but without the guarantees of a compiler to back it up.

This approach is totally reasonable for static analysis, as running the analyser is akin to the compilation step in checking validity. However, having this approach built into the language itself seems rather problematic to me. Ideally we would want to declare a variable to be of 'literal' type, to ensure that none of the operations applied to it demote it from being a literal, and for any such demotion to throw a TypeError. Due to PHP's nature we cannot do this (yet?), therefore overloading the concatenation operation seems rather unwise.

The case where a literal and a non-literal are concatenated without error is very similar to passing around a nullable type until one function/method/property doesn't accept null, at which point it blows up in your face and you need to track down where on earth the null value came from, which might be multiple calls prior. And there have been a hell of a lot of talks/articles/etc. about *not* using nullable types because of this issue.

This is, I think, the main issue with the current shape of the proposal. This implementation will detect certain security issues, but finding their root cause is going to be rather complicated, as the concatenation operation is basically kicking the can down the road on the responsibility of checking whether or not the result is a literal.
By contrast, a function like concat_literal(), which checks that its inputs are indeed literals, provides immediate feedback on whether the type constraint is being violated.

For this reason, I'm voting against this proposal.

Best regards,

George P. Banyard
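
P.S. To make the difference concrete, here is a minimal userland sketch of the two behaviours. It is not the RFC's implementation: the RFC tracks the literal flag inside the engine, whereas this sketch assumes a hypothetical wrapper class TaggedString to carry the flag, a hypothetical concat_silent() to model concatenation that quietly drops it, and a hypothetical run_query() sink. The concat_literal() shown is only my guess at how such a function might behave. It assumes PHP 7.4 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the per-string literal flag (engine-level in the RFC).
final class TaggedString
{
    public string $value;
    public bool $isLiteral;

    public function __construct(string $value, bool $isLiteral)
    {
        $this->value = $value;
        $this->isLiteral = $isLiteral;
    }
}

// Models flag-propagating concatenation: never fails, silently demotes the result.
function concat_silent(TaggedString ...$parts): TaggedString
{
    $value = '';
    $isLiteral = true;
    foreach ($parts as $part) {
        $value .= $part->value;
        $isLiteral = $isLiteral && $part->isLiteral;
    }
    return new TaggedString($value, $isLiteral);
}

// Models a checking concatenation: fails at the exact point of demotion.
function concat_literal(TaggedString ...$parts): TaggedString
{
    $value = '';
    $position = 0;
    foreach ($parts as $part) {
        $position++;
        if (!$part->isLiteral) {
            throw new TypeError('Argument #' . $position . ' is not a literal string');
        }
        $value .= $part->value;
    }
    return new TaggedString($value, true);
}

// Hypothetical sink that only accepts literals.
function run_query(TaggedString $sql): string
{
    if (!$sql->isLiteral) {
        throw new TypeError('Query must be a literal string');
    }
    return $sql->value;
}

$prefix = new TaggedString('SELECT * FROM users WHERE name = ', true);
$input = new TaggedString("'x' OR 1=1", false); // stands in for user input

// No error here: the demotion goes unnoticed at the place it happens.
$query = concat_silent($prefix, $input);

try {
    run_query($query); // the error only surfaces at the sink
} catch (TypeError $e) {
    echo 'At the sink: ', $e->getMessage(), PHP_EOL;
}

try {
    concat_literal($prefix, $input); // the error surfaces where the strings are joined
} catch (TypeError $e) {
    echo 'At the concatenation: ', $e->getMessage(), PHP_EOL;
}
```

In the first case the failure is reported by run_query(), which may be several calls away from the offending concatenation; in the second it is reported where the non-literal was introduced.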