Newsgroups: php.internals
Date: Tue, 18 Sep 2012 18:06:24 -0400
To: internals@lists.php.net
Subject: Re: [PHP-DEV] RFC: Implementing a core anti-XSS escaping class
From: adamjonr@gmail.com (Adam Jon Richardson)

On Tue, Sep 18, 2012 at 7:30 AM, Pádraic Brady wrote:
> Hi all,
>
> I've written an RFC for PHP over at: https://wiki.php.net/rfc/escaper.
> The RFC is a proposal to implement a standardised means of escaping
> data which is being output into XML/HTML.
>
> https://wiki.php.net/rfc/escaper

Some Quick Thoughts

Multiparadigm PHP
I hope any implementation would embrace procedural coding paradigms AND OOP paradigms. I tend to code using a Functional Programming (FP) style, and I don't need/want objects to be the only interface.

Extension First
It seems wise to get this working and tested as an extension first, just as Rasmus and others suggested.

Ability To Pass Some HTML Through Without Escaping (Whitelisting)
Functions should allow whitelisting of elements when desired. For example, html escaping may be desired for all elements in a paragraph except for spans, br's, etc. I've built a quick extension that I use in my web framework that does this:
https://github.com/AdamJonR/nephtali-php-ext

string nephtali_str_escape_html(string str [, array whitelist [, string charset]])

The escaping works as outlined below:
1) Escape all html special characters in str.
2) Loop through whitelist items.
3a) If the item begins and ends with '/', consider it a regex and replace the matches in the string with the original (htmlspecialchars decoded) text (this works because <, >, ", ', and & are not meta characters in regexes.)
3b) Otherwise, handle as a standard string and replace the matches with the unescaped whitelist item text.

The idea is that, to be safe, everything should be escaped first. Then, only the items that match the whitelist are unescaped (e.g., array('<p>','</p>','etc.').) The regex option is handy because you often have situations where the internal contents of the tag vary (e.g., id, class, href, etc.) and this allows you to pass these through unescaped.

Of note, I've not officially released the extension, as I'm still testing/developing it, but it serves as an example for ideas.

PHP Escaping-Specific Tags Could Be Considered
I wonder if PHP tags for escaping could be considered, as it seems that there's still a plurality of developers that use PHP itself as the templating language. For example:

// automatically echo'd and escaped for special html chars
val ?>

// automatically echo'd and escaped for special html chars whilst letting through p's
val, array('<p>','</p>') ?>

// automatically echo'd and escaped for special html chars whilst letting through p's and using different encoding
val, array('<p>','</p>'), $encoding = 'something' ?>

// automatically echo'd and escaped for special html chars, no whitelisting allowed
val ?>

// automatically echo'd and escaped for special url chars, no whitelisting allowed
val ?>

Thanks,

Adam
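
For illustration, steps 1) through 3b) of the whitelist escaping described above could be sketched in userland PHP roughly as follows. This is not the nephtali extension's source: the function name sketch_escape_html, the ENT_QUOTES flag, and the UTF-8 default are assumptions made for the example. It also assumes that each whitelist item is itself escaped before being searched for, which is one reading of the remark that <, >, ", ', and & are not regex meta characters.

```php
<?php
// Hypothetical sketch of "escape everything, then unescape whitelisted items".
// Not the nephtali extension's code; name, flags and default charset are assumptions.
function sketch_escape_html(string $str, array $whitelist = [], string $charset = 'UTF-8'): string
{
    $flags = ENT_QUOTES;

    // 1) Escape all html special characters first, so the default is safe.
    $escaped = htmlspecialchars($str, $flags, $charset);

    // 2) Loop through whitelist items.
    foreach ($whitelist as $item) {
        // Escape the item too, so it can be located inside the escaped text.
        $needle = htmlspecialchars($item, $flags, $charset);

        if (strlen($item) > 2 && $item[0] === '/' && substr($item, -1) === '/') {
            // 3a) Regex item: restore each match to its decoded (original) text.
            $result = preg_replace_callback(
                $needle,
                function (array $m) use ($flags) {
                    return htmlspecialchars_decode($m[0], $flags);
                },
                $escaped
            );
            // preg_replace_callback() returns null on failure; keep the escaped text then.
            if ($result !== null) {
                $escaped = $result;
            }
        } else {
            // 3b) Plain string item: swap its escaped form back to the raw item.
            $escaped = str_replace($needle, $item, $escaped);
        }
    }

    return $escaped;
}

// Usage: only the whitelisted p tags survive unescaped.
echo sketch_escape_html('<p>Hello <b>world</b></p>', ['<p>', '</p>']), "\n";
// prints: <p>Hello &lt;b&gt;world&lt;/b&gt;</p>
```

One caveat with this reading of step 3a: a pattern that places <, >, or & inside a character class (for example [^<]) changes meaning once it is escaped, so patterns in this sketch should only use those characters literally.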