Newsgroups: php.internals
Date: Tue, 18 Sep 2012 13:52:26 -0500
Message-ID: <011901cd95ce$bfea0900$3fbe1b00$@org>
References: <011201cd95c7$33d43c30$9b7cb490$@org>
From: bryan@ravensight.org ("Bryan C. Geraghty")
To: "'Anthony Ferrara'"
Subject: RE: [PHP-DEV] RFC: Implementing a core anti-XSS escaping class

Anthony,

I'll concede that the term "escaping" is improperly used in many places, even in the OWASP documentation. But I'll point out that the CWE document identifies a distinction between the two terms when it says, "This overlapping usage extends to the Web, such as the "escape" JavaScript function whose purpose is stated to be encoding".

But when you say, "With the end result being the exact same...", I don't think you've thought it through. I've read some of your stuff and I'm pretty confident that you understand the benefits of white-listing over black-listing. For the uninitiated: yes, a black-list can be configured to produce the same results at a given point in time, but the fundamental approach is different. A white-list operates on an explicit specification and lets nothing else through. A black-list assumes that the input data is mostly correct and filters out the bad.

To add to that, how do you convert from ISO-8859-1 to UTF-8 with a black-list or by escaping?

Your reference to mysql_real_escape_string is exactly the point I'm trying to make.
The use of that function is "discouraged" because it DID escape; it looked for specific bad characters. It was fundamentally flawed. And that is the functionality PHP developers, as you just demonstrated, will refer to. The current recommendation is to use a library that properly encodes the entire data stream.

I'll also agree that consistency with the industry is not as important, because there seem to be plenty of misuses. However, I do think that we should use terminology that sets the functionality apart. So, given the difference in operating mode and the precedent set by mysql_escape_string, mysql_real_escape_string, etc., I think "encode" is the way to go.

Thanks,
Bryan

From: Anthony Ferrara [mailto:ircmaxell@gmail.com]
Sent: Tuesday, September 18, 2012 1:09 PM
To: Bryan C. Geraghty
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] RFC: Implementing a core anti-XSS escaping class

Bryan et al,

On Tue, Sep 18, 2012 at 1:58 PM, Bryan C. Geraghty wrote:

> Hello everyone,
>
> Paddy is correct here. The purpose of this API is output ENCODING which is a very good thing. This discussion provides a very good case for a point I made via Twitter this morning: In this RFC, all uses of the term "escape" should be replaced by the term "encode".
>
> This is not solely a problem with this RFC. The term "escape" is being used by developers in the industry when they mean "encoding". This is a bad thing because, from a security perspective, escaping is exactly the opposite of encoding.

It's a very common thing:

http://cwe.mitre.org/data/definitions/116.html

> The usage of the "encoding" and "escaping" terms varies widely. For example, in some programming languages, the terms are used interchangeably, while other languages provide APIs that use both terms for different tasks. This overlapping usage extends to the Web, such as the "escape" JavaScript function whose purpose is stated to be encoding. Of course, the concepts of encoding and escaping predate the Web by decades. Given such a context, it is difficult for CWE to adopt a consistent vocabulary that will not be misinterpreted by some constituency.

I think that picking one and sticking with it is fine. No matter which is chosen...

> - Escaping is done by setting up a black-list and replacing those elements with an approved variant.
> - Encoding is done by converting all of the input data into the target format. Some bytes may end up being exactly the same but they are all processed.

With the end result being the exact same...

> I understand why people on this list are associating the functionality defined in this RFC with filtering because the name is leading them astray.
>
> Besides the fundamental difference in the definitions of each item, the security industry is using the term "encoding"; take a look at the OWASP documentation for a quick example.

The OWASP documentation uses them interchangeably. However, specifically for this task, the ESAPI is defined as an escaping library:

https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet

> The OWASP ESAPI project has created an escaping library in a variety of languages including Java, PHP, Classic ASP, Cold Fusion, Python, and Haskell.

> If we want developers with little application security background to be able to understand these things, we need to be consistent.

In this case, I'm not sure consistency with the industry is as important (mainly because the industry is itself inconsistent). The important thing is to pick one and stick to it. I would suggest "escape", mainly because people in PHP are already familiar with it (via mysql_real_escape_string, etc.)...

Anthony
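
For readers following the thread, the two operating modes being debated (replacing a black-list of characters versus processing every character against a white-list) can be sketched roughly as follows. This is only an illustration in Python under stated assumptions: the names escape_blacklist, encode_whitelist and latin1_to_utf8, the five-character black-list, and the letters-and-digits white-list are all hypothetical and come neither from the RFC nor from any PHP or ESAPI API.

```python
import string

# Hypothetical black-list: five characters commonly replaced for HTML output.
HTML_BLACKLIST = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#x27;",
}

# Hypothetical white-list: only ASCII letters and digits pass unchanged.
SAFE_CHARS = frozenset(string.ascii_letters + string.digits)


def escape_blacklist(text: str) -> str:
    """Replace only the listed characters; everything else passes through."""
    return "".join(HTML_BLACKLIST.get(ch, ch) for ch in text)


def encode_whitelist(text: str) -> str:
    """Process every character; anything off the white-list becomes a
    hexadecimal numeric character reference."""
    return "".join(
        ch if ch in SAFE_CHARS else f"&#x{ord(ch):X};" for ch in text
    )


def latin1_to_utf8(data: bytes) -> bytes:
    """Transcode ISO-8859-1 bytes to UTF-8; every byte is decoded and re-encoded."""
    return data.decode("iso-8859-1").encode("utf-8")


if __name__ == "__main__":
    sample = "x=`y`<b>"
    print(escape_blacklist(sample))
    print(encode_whitelist(sample))
    print(latin1_to_utf8(b"caf\xe9"))
```

In this sketch the equals sign and the backtick pass through the black-list untouched but are rewritten by the white-list encoder, which is the kind of difference the thread is arguing about. The transcoding helper is in the same spirit as the ISO-8859-1 question raised above: there is no list of "bad" bytes, and every byte is converted.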