Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:114858 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 34927 invoked from network); 14 Jun 2021 12:13:55 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 14 Jun 2021 12:13:55 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id A74AC1804F4 for ; Mon, 14 Jun 2021 05:30:09 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=BAYES_40,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=no autolearn_force=no version=3.4.2 X-Spam-Virus: No X-Envelope-From: Received: from mail-vs1-f43.google.com (mail-vs1-f43.google.com [209.85.217.43]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Mon, 14 Jun 2021 05:30:09 -0700 (PDT) Received: by mail-vs1-f43.google.com with SMTP id y207so7653725vsy.12 for ; Mon, 14 Jun 2021 05:30:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=basereality-com.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=r/Gp8M9UjpnqITVybp+csJ3DL86qEMiAPqWS4kArxW4=; b=Dlo8bidCxQrbxv1Aj6PdnsB+U/BfZ7TGMFERG5TzfNV1A8Neo6kdfBy+SFmZViCcIw JuP/tvTKIgKCO9m/2/UU4Xu8O7uFKIjpKQrmub2g3vdIN6AU/cM4b3QgdwQFRYP5v6V1 SCmnC+CqhPV9WEvZJaOFBPZ5Ru4hiG56LREZw2LgQVB1tXRYbSO9/h+GeZT2hnR/J6pi zXhUx7pNYBD8K1XJypV/fgBY5lNv2lvs9J4TlI7TWlGhQzJBIkC4YXryGKrh6OxqlBvJ hPcEBnjW2oiD+mKV1PQQxFGmVStjQUke21ELYhZIe5VEQ+F7z5Tvgx/GKH7+9g5XwPqg zj2w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=r/Gp8M9UjpnqITVybp+csJ3DL86qEMiAPqWS4kArxW4=; b=FfbPsAwi511I8ZMnEUX5X2oBJRrO4TCE8twQJPGt84TnFekOPRhxY2iQ4AiZFR8ApD 6BJC3X235bqpXgDcyPF1n4TO4g3p76apDFSqZIejwpYO1fSHCurjh3CTWl+hhYXiCbjB /3OTqvAaZJeghYYQVg3kvlV3rBfwkNLD/thZ3B4WZf19QfpxL7dkTX7GhupWaqU54wF1 oU+6B9JAt/sdelq4KljUTNTsg9A5g640CrkyUvtMcWnploUGzBvaFjS8s/KXOTc2equG vF65FkmKdmTYQoRdS2U5oqwg8iXEaAuIYu9B5JO65qbEziNGSQd7qyuhJod1RpkoJd+Z WOPQ== X-Gm-Message-State: AOAM530IUPRNfd3hjowE3E0ixYLpDmDBPjWM6l0ilZ8l2CzCONmf78+z RnQpU90IsBtzxzP1TxcgeUIRQbRB5mIgMPtrcVXXgg== X-Google-Smtp-Source: ABdhPJzKTy/M3qcXo0L8jaHU6d3wzhuYOztjZoIWWzaPa9e1JGcnhUmyLXnUAv6h45iiunTHvsQsoopIfnCx808+iMU= X-Received: by 2002:a67:e9c9:: with SMTP id q9mr17699324vso.24.1623673805796; Mon, 14 Jun 2021 05:30:05 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: Date: Mon, 14 Jun 2021 13:29:53 +0100 Message-ID: To: Joe Watkins Cc: Craig Francis , PHP internals , Matthew Brown Content-Type: text/plain; charset="UTF-8" Subject: Re: [PHP-DEV] [RFC] is_literal From: Danack@basereality.com (Dan Ackroyd) Hi Craig, Joe, While I think the idea behind this RFC is great, the current implementation is terrible, as it will cause some fatal errors in production. # The problem Imagine we have this code: function getInfoPanel(UserPreferences $up): string { $html = "
"; $html .= // more html here $html .= "
" return $html; } class UserPreferences { private DB $db; function getColor(): string { $preferredColor = $this->db->getColor(); if ($preferredColor === 'light') { return '#fff'; } if ($preferredColor === 'dark') { return '#000'; } return '#fff'; // default is light } } And then the string returned by getInfoPanel() is used elsewhere that does a check for is_literal. That check will pass. Now imagine a feature request comes in, to allow people to customise the color of their UI, so the UserPreferences code is changed to: class UserPreferences { private DB $db; function getColor(): string { $preferredColor = $this->db->getColor(); if ($preferredColor === 'light') { return '#fff'; } if ($preferredColor === 'dark') { return '#000'; } return $preferredColor; // user has set custom color. } } Assume that both UserPreferences and getInfoPanel code is covered by unit tests. The developer who made that change would run the tests, see the tests pass and then push their code live. At which point, it would cause an error in production, as the string returned would not pass the is_literal check. This would be a complete pain in the butt to fix. The only information would be that the html string is not literal....but there wouldn't be any info about which bit was the problem. Someone would have to manually check which code had been called to produce the non-literal string. That could mean the site would be unavailable for users that had set a custom color. Or more likely, the developers would just turn off the is_literal check or at least downgrade it to a mere logging error aka something to be ignored. # How to fix this in the RFC By not carrying the literal flag on string concatenation, but requiring developers to use functions like literal_concat() and literal_implode(), the 'false affordance'* goes away: function getInfoPanel(UserPreferences $up): string { $html[] = "
"; $html[] = // more html here $html[] = "
" return literal_concat(...$html); } Now, when the feature request comes in to allow users to set custom colors, the code will fail on the literal_concat. When that function checks internally that all the strings passed to it are literal, it will throw an exception from the function that has made the bad assumption about whether a string is user input or not. Yes, it's slightly ironic that making errors be louder and happen earlier in the program is better than making them happen later. # Wouldn't forcing people to use literal_concat() and literal_implode() be annoying I don't think so. The code change required is pretty small, and can be done either by a junior developer, or could be automated by things like https://github.com/rectorphp/rector. Even if there are 100 places where an existing series of string concatenations needs to be converted to chucking in an array, and then using literal_concat() and each of those takes 10 minutes... that's 2 days to do the conversion. In return for which the application becomes a lot more secure than it was previously. But additionally, I don't think many people are actually going to be using literal_concat() in many places. My guess is that the vast majority of places where you are combining bits of strings together, you would realise that you can switch to using type-specific escaping, as you're going to need to do the escaping anyway. # Is this a problem in other languages? The similar implementation used in Google is used for Java. As Java is statically compiled, any attempt to pass a non-literal string to a function that expects a literal string fails to compile. And if it won't compile, it won't run, and so won't have errors in production. The only thing that needs to be changed in my opinion is not carrying the literal flag through string concatenation. Although that would be slightly more work to use the feature, avoiding having inevitable and hard to debug errors in production is the trade-off. cheers Dan Ack * false affordance - something that works, except when it doesn't.