Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:88218 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 73947 invoked from network); 15 Sep 2015 16:23:10 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 15 Sep 2015 16:23:10 -0000 Authentication-Results: pb1.pair.com smtp.mail=ircmaxell@gmail.com; spf=pass; sender-id=pass Authentication-Results: pb1.pair.com header.from=ircmaxell@gmail.com; sender-id=pass Received-SPF: pass (pb1.pair.com: domain gmail.com designates 209.85.212.193 as permitted sender) X-PHP-List-Original-Sender: ircmaxell@gmail.com X-Host-Fingerprint: 209.85.212.193 mail-wi0-f193.google.com Received: from [209.85.212.193] ([209.85.212.193:35678] helo=mail-wi0-f193.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id AD/18-28087-AE548F55 for ; Tue, 15 Sep 2015 12:23:08 -0400 Received: by wicxq10 with SMTP id xq10so5866631wic.2 for ; Tue, 15 Sep 2015 09:23:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=FR+9f2dUkVs+nTObIUBxjyrN3H11L2PLEuxZRVMCoKE=; b=AlnXvQ+Wgb6OUh9o8POJ9rJkezYJx2LroFkDQEu38iFq7CIq0/HLmLaGcIF1bm8nvW +KiIK7zpxb0sz2oixcLO/fG+jq27/0WXeRitTPm60sbcFM6ILGHmQ6r5MzwtHHl0Z601 SBmRIMaZ/ZTWRJB87vC3cUBgf639XUKWHXQLWNW+KgtjQP5dy6KaMlr7mAAUmgp39mhs N1i8QgIMfscALKNea2yJGcqKQ8yHRYwJowGUwwloGO8KFn9+e3hUm9gLSxbFBFvkWboM SoZ+9WL7u8y7NIPWfeMXpXuLBoBY7MmtuOxQfWbCXcBrPFN2Nxsre39Bey9lZ8hL5D8C AATw== MIME-Version: 1.0 X-Received: by 10.194.82.167 with SMTP id j7mr41577304wjy.123.1442334182749; Tue, 15 Sep 2015 09:23:02 -0700 (PDT) Received: by 10.28.55.18 with HTTP; Tue, 15 Sep 2015 09:23:02 -0700 (PDT) In-Reply-To: References: Date: Tue, 15 Sep 2015 12:23:02 -0400 Message-ID: To: Arvids Godjuks Cc: Craig Francis , Christopher Owen , Kalle Sommer Nielsen , Internals , wietse@porcupine.org, Xinchen Hui Content-Type: text/plain; charset=UTF-8 Subject: Re: [PHP-DEV] taint From: ircmaxell@gmail.com (Anthony Ferrara) All, On Tue, Sep 15, 2015 at 11:15 AM, Arvids Godjuks wrote: > I fully support your effort to get this into the PHP to be part of core > extensions, or at least one of those that keep up with the language > releases. > This is a very good tool to have, and you can actually run it in production > to catch things that may slipped the stating (things happen). And it's > invaluable during during testing. I'm against any form of tainting for a few reasons. First, it's only really useful as a first-order heuristic. A failure does not indicate a security vulnerability, and a non-failure does not indicate the absence of one. An example of a false-positive would be if (preg_match('(^[a-z0-9]+$)', $input)) echo $input;. Taint-based systems would flag that as a possible XSS, where it's clearly not in any context. An example of a false-negative would be the following: >Bar. If the input is "foo onclick=dosomethingbad();", an XSS is still executed. Another example of a false negative would be: $query = "SELECT * FROM foo WHERE bar = " . mysqli_real_escape_string($c, $input) . ";"; A false positive is potentially OK. A false negative is not. Second, it will encourage the improper pattern of "sanitize" functions to de-taint input. These are functions which call `mysqli_real_escape_string(htmlspecialchars(whatever(etc($input))))`. No matter what the context the variables are used in. This doesn't improve security (and in many cases can demonstrably harm it). It just encourages people to work around it. And that ignores that there's an `untaint()` function that they can call on any variable to "silence the errors". Third, it ignores context. This is related to the first two, but I think is a separate concern. An example from the taint RFC (https://wiki.php.net/rfc/taint) is the shell-execution. If the variable is used in the context of command, one escape function is needed. If it's used as an argument, another is needed. Detecting which is not something that's trivial for a language-level taint function. And calling the wrong function defeats the purpose. So if we can't detect the context, we can't reliably say whether it's safe or not. In the hands of an experienced engineer/security researcher, taint systems can be a really handy tool to help find possible issues. In the hands of a junior engineer, it can actually make things worse because it encourages them to just "silence the errors". Considering that it's not a reliable tool, and is only a heuristic, I would suggest that this be implemented as an extension, and not in the core of a programming language. With that said, if there are engine hooks necessary to implement this feature better as an extension, I'm all for adding those hooks. I'm just against adding something to the language that won't actually be able to accomplish what it promises. Escaping is always context-sensitive, in every case. Tainting as has been proposed in the past doesn't (and literally can't in the general case) know the context. Therefore it can't reliably know the correct way to escape. Consider the case of addslashes() and magic_quotes_gpc. The reason that they are insecure is that they don't know the context that the variable is used in. This leads to character set issues, as well as cases where backslashes enter contexts they don't escape (like HTML output). We learned years ago that context-insensitive escaping doesn't work and is literally bad. Let's not make the same mistakes again (even if it's for the correct reasons this time). Anthony