Newsgroups: php.internals
Xref: news.php.net php.internals:108537
Date: Thu, 13 Feb 2020 12:31:44 +0000
From: craig@craigfrancis.co.uk (Craig Francis)
To: PHP internals
Subject: Re: Literal / Taint checking

Hi,

While there was a brief discussion about an is_literal() function in August, I'm wondering where I can go next?

As a reminder, the main objection seemed to be that taint checking is the current solution, for example the tools created by Laruence [1], MediaWiki [2], and Matthew [3].

But this can never be as good as the PHP engine explicitly stating that a variable *only* contains literal values, which can be checked at runtime and be a key part of the development process.

And while I'm using SQL injection in my examples (because it's easy to show how it can enforce the use of parameterised queries), it would also be useful to protect against command line injection and HTML/XSS (e.g. a templating system could accept HTML only as literal strings, with the user-supplied values provided separately).

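To make the templating case a little more concrete, here is a minimal sketch. It assumes a hypothetical helper called render_html() (not an existing PHP or library function), and it borrows the "?" placeholder from the SQL examples purely for illustration. Since is_literal() is only a proposal, the check is shown as a comment rather than as working code.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: fills "?" placeholders in a developer-written
 * HTML template with escaped, user-supplied values.
 */
function render_html(string $template, array $values): string
{
    // With the proposed (not yet existing) function this could be enforced:
    // if (!is_literal($template)) { throw new Exception('Template must be a literal.'); }

    $parts = explode('?', $template);
    if (count($parts) - 1 !== count($values)) {
        throw new InvalidArgumentException('Placeholder count does not match value count.');
    }

    // Start with the text before the first placeholder.
    $html = array_shift($parts);
    foreach (array_values($values) as $i => $value) {
        // Escape every value, then append the template text that follows it.
        $html .= htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8') . $parts[$i];
    }
    return $html;
}

echo render_html('<p>Hello <strong>?</strong></p>', ['<script>alert(1)</script>']), "\n";
```

The point of the split is that the template never has user data concatenated into it, so a literal check on the first argument alone would be enough to rule out injection through the template string.
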
I'm assuming this would change the zval structure (to include an "is_literal" flag?), and it would be more of a PHP 8.0 change, rather than 8.1.

Craig

---

Broken taint check, due to missing quote marks:

    $sql = '... WHERE id = ' . mysqli_real_escape_string($db, $_GET['id']);

---

Support for "WHERE ... IN", ideally done via an abstraction, so you don't need to write this every time:

    $sql = '... WHERE id IN (' . substr(str_repeat('?,', count($ids)), 0, -1) . ')';

---

[1] https://github.com/laruence/taint
[2] https://www.mediawiki.org/wiki/Phan-taint-check-plugin
[3] https://psalm.dev/r/ebb9522fea

---

On Thu, 15 Aug 2019 at 19:02, Craig Francis wrote:

> Hi,
>
> How likely would it be for PHP to do Literal tracking of variables?
>
> This is something that's being discussed by JavaScript's TC39 at the moment
> [1], and I think it would be even more useful in PHP.
>
> We already know we should use parameterized/prepared SQL, but there is no
> way to prove the SQL string hasn't been tainted by external data in large
> projects, or even in an ORM.
>
> This could also work for templating systems (blocking HTML injection) and
> commands.
>
> Internally it would need to introduce a flag on every variable, and a
> single function to check if a given variable has only been created by
> Literal(s).
>
> Unlike the taint extension, there should be no way to override this (e.g.
> no taint/untaint functions); and if it was part of the core language, it
> will continue to work after every update.
>
> One day certain functions (e.g. mysqli_query) might use this information
> to generate an error/warning/notice; but for now, having it available for
> checking would be more than enough.
>
> Craig
>
>
>
> public function exec($sql, $parameters = []) {
>     if (!is_literal($sql)) {
>         throw new Exception('SQL must be a literal.');
>     }
>     $statement = $this->pdo->prepare($sql);
>     $statement->execute($parameters);
>     return $statement->fetchAll();
> }
>
> ...
>
> $sql = 'SELECT * FROM table WHERE id = ?';
>
> $result = $db->exec($sql, [$id]);
>
>
>
> [1] https://github.com/tc39/proposal-array-is-template-object
>     https://github.com/mikewest/tc39-proposal-literals
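
The "WHERE ... IN" abstraction mentioned in the notes above could be sketched roughly as follows. The function name in_placeholders() and the table name "items" are hypothetical and only there for illustration. Whether a proposed is_literal() would still report the resulting string as a literal depends on how the engine would carry the flag through functions like implode(), which is an open question here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns "?,?,?" with one placeholder per value,
 * built only from literal strings (no user data is concatenated).
 */
function in_placeholders(array $ids): string
{
    if (count($ids) === 0) {
        // "IN ()" is rejected by e.g. MySQL, so refuse early.
        throw new InvalidArgumentException('At least one value is required.');
    }
    return implode(',', array_fill(0, count($ids), '?'));
}

$ids = [3, 7, 42];
$sql = 'SELECT * FROM items WHERE id IN (' . in_placeholders($ids) . ')';

echo $sql, "\n"; // SELECT * FROM items WHERE id IN (?,?,?)
```

The values in $ids would then be passed separately as the parameters of the prepared statement, in the same way as [$id] in the exec() example.
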