Newsgroups: php.internals
Xref: news.php.net php.internals:115414
Date: Mon, 12 Jul 2021 19:56:57 +0100
To: Dan Ackroyd
Cc: "G. P. B.", PHP internals
Subject: Re: [PHP-DEV] [RFC] [VOTE] is_literal
From: craig@craigfrancis.co.uk (Craig Francis)

On Mon, 12 Jul 2021 at 14:17, Dan Ackroyd wrote:

> On Mon, 12 Jul 2021 at 02:09, Craig Francis wrote:
> >
> > you can choose not to use string concatenation ... allowing anyone to
> > customise it to their needs
>
> Please can you explain how:
>
> i) An individual programmer can enforce that they don't accidentally
> use string concatenation for stuff that is used in a security
> sensitive context.

i) Usually, if developers want to add specific restrictions for themselves
or their team that the vast majority of a userbase would not need or use,
they would use a separate tool that fills that niche to enforce their
chosen coding style (like how they might want to enforce "tabs vs spaces"
in their own codebase).

> ii) A team of 50 programmers can enforce that they don't accidentally
> use string concatenation for stuff that is used in a security
> sensitive context.

ii) Same answer as i), as it scales.

> iii) A library can enforce that string concatenation isn't used in the
> params passed to it.

iii) A library shouldn't care whether the developer used concatenation; it
should only care whether user data has been included incorrectly (i.e.
checking for an injection vulnerability). A library’s job is to check
whether code is safe, not to enforce personal coding styles.

> and doesn’t improve security in any way,
>
> It [disallowing concatenation] prevents issues from being able to make it
> through to production.
> But the main reason would be to reduce the cost of using this feature
> long-term.

Disallowing concatenation doesn’t guarantee that issues can’t get through
to production though; for example, a value that can sometimes be a
non-literal, but during development defaults to a literal.

From a security point of view, it doesn’t matter whether the error is
caught at the point of concatenation or when the library checks the value
it receives, because the injection vulnerability gets caught either way.

I take it the "cost" that you’re referring to is just the debugging time?
Ultimately, any extra debugging time that might occur is orders of
magnitude less than the time it would take almost every developer to check
and rewrite most of the projects that exist today, which is what not
supporting concatenation would require.

For a developer who really finds debugging too onerous, there are tools
better suited to it, such as Xdebug, or static analysis tools like Psalm
(and as above, if you wanted to force yourself or your team not to
concatenate, you would be using a static analysis tool already (i/ii)).

> you can choose not to use string concatenation (I haven't needed to).
>
> Wait....what?
>
> Is your position both that preserving literal-ness across string
> concatenation is required, otherwise this feature is too hard to use,
> and at the same time, you've not needed that in your own applications.
>
> Is that right? Because if preserving literal-ness across strings
> wasn't required for you...why would it be required for every other
> project?
>
> And to be clear, I don't think it's required.

We are trying to improve the language for the majority of developers.

I’m an experienced PHP developer who is genuinely passionate about
security. I find it fun; curating my code to be as secure as possible is
practically a hobby. That makes me an outlier, like a lot of us here.

My focus is on writing an RFC that works for as many developers as
possible, so whether it’s ‘necessary’ in my own personal projects is
irrelevant to whether a simple safety feature should exist for the
community. We need to make things easy and safe not just for high-level
programmers, but for people who don’t know everything we do and need
things to be as simple and functional as possible.

My projects do use a lot of string concatenation. Not an excessive amount,
probably about normal.

So let’s say we do break string concatenation. For some numbers, I've
found roughly 1,300 instances of SQL and HTML concatenation in my
projects. And even if I were willing to take on such a big task to replace
them, the real problem is actually finding them, because if you’re trying
to use a search, well, who would have thought looking for "." would return
so many results?

> I think by listening to feedback from people who aren't sure it's a
> good idea, who said "this is a good idea but only if it's really easy
> to start using it" that this RFC has been watered down from the most
> useful proposal.
> At the core, there is a good idea behind this RFC,
> but the set of trade-offs chosen just aren't the right ones, and
> aren't the "proven" trade-offs made at Google.

The proposal hasn’t changed. This keeps to the original concept, and while
you wanted to remove string concatenation support, that does not mean that
anything so strict was our intention. The proposal was always meant to
allow the widest and easiest adoption possible. You disagreed with us on
that, which is fine, but it doesn’t mean that this isn’t exactly as
originally intended.

The only real change to `is_literal()` since the start of this RFC has
been improvements to the compilation process. Otherwise it is the same as
day one.

If it will put your mind at ease, Krzysztof Kotowicz is probably the best
placed person to provide feedback, as he is the implementor of this same
principle in JavaScript. He provided feedback to this project, saying that
they trust concatenated constants, and that yes, while a programmer could
go out of their way to do something *intentionally* dangerous (like
building up an array of single character literals, and joining them
together based on user input), the “go-safe-html” library authors decided
that "the ergonomics of trusting concatenated constants far outweighs the
security concern".

Craig