Newsgroups: php.internals
Date: Fri, 4 Jan 2013 22:59:33 -0500
To: "internals@lists.php.net"
Subject: Providing improved
functionality for escaping html (and other) output.
From: adamjonr@gmail.com (Adam Jon Richardson)

It's important to escape output according to context. PHP provides functions such as htmlspecialchars() to escape output when the context is HTML. However, one often wants to let some subset of HTML tags through without escaping.

Functions such as strip_tags() do allow whitelisting, but their use poses security risks because attributes on whitelisted tags are left intact (e.g., a whitelisted tag around 'click me' keeps any event-handler attributes it carries).

One can develop a more robust mechanism in userland that first escapes input using htmlspecialchars() and then unescapes whitelisted sequences. Because HTML tags vary with their attributes (e.g., optional classes, img src attributes, etc.), the ability to specify a whitelisted sequence as a regex could also offer significant benefits (e.g., any whitelist entry starting and ending with '/' would be handled as a regex).

However, the common nature of this need, coupled with the performance benefits of implementing it internally, prompts my interest in two options:

- Add a fifth parameter to htmlspecialchars() that takes an array of whitelisted sequences. Although this makes for a terribly long function call, one could easily wrap the call in a facade function.
- Add a new function called str_escape(), though this introduces potential BC issues.

There are of course other options (e.g., integrating this as an additional filter).

I've built an extension that, while focused on an old web framework of mine, contains a function that can serve as a proof of concept for the functionality outlined above (see nephtali_str_escape_html):

https://github.com/AdamJonR/nephtali-php-ext/blob/master/nephtali.c

I've tossed out the idea on this list before, but it was only tangentially related to the discussion at the time. At this point, I'd really like to focus on the idea directly to see which approach seems wisest (including doing nothing, if the frequency of use does not justify bringing the functionality into the core).

Thoughts?

Adam
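
A rough userland sketch of the escape-then-unescape approach described above might look like the following. It is not the code from the extension: the function name escape_html_with_whitelist() is hypothetical, and having regex entries match against the already-escaped text is an assumption of this sketch rather than something the proposal specifies. It assumes PHP 7.4 or later for the arrow function.

```php
<?php
/**
 * Hypothetical helper: escape everything, then restore whitelisted sequences.
 *
 * Literal entries (e.g. '<b>') are written in their raw form.
 * Entries that start and end with '/' are treated as regexes and are
 * matched against the ESCAPED text (an assumption of this sketch).
 */
function escape_html_with_whitelist(string $input, array $whitelist): string
{
    // Step 1: escape the whole input, including both quote styles.
    $escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

    foreach ($whitelist as $entry) {
        $isRegex = strlen($entry) >= 2
            && $entry[0] === '/'
            && substr($entry, -1) === '/';

        if ($isRegex) {
            // Step 2a: unescape only the spans the regex matches.
            $result = preg_replace_callback(
                $entry,
                static fn(array $m): string => htmlspecialchars_decode($m[0], ENT_QUOTES),
                $escaped
            );
            // preg_replace_callback() returns null on error; keep the
            // fully escaped text in that case.
            $escaped = $result ?? $escaped;
        } else {
            // Step 2b: swap the escaped form of a literal back to the literal.
            $needle = htmlspecialchars($entry, ENT_QUOTES, 'UTF-8');
            $escaped = str_replace($needle, $entry, $escaped);
        }
    }

    return $escaped;
}

// Literal whitelist: <b> survives, <i> stays escaped.
echo escape_html_with_whitelist('<b>hi</b><i>x</i>', ['<b>', '</b>']), "\n";

// Regex whitelist: only anchors whose sole attribute is an http(s) href.
$links = ['/&lt;a href=&quot;https?:\/\/[a-z0-9.\/-]+&quot;&gt;/', '</a>'];
echo escape_html_with_whitelist('<a href="http://example.com/">click me</a>', $links), "\n";
```

Because everything is escaped first, a tag carrying an unexpected attribute simply fails to match the whitelist and stays escaped, which is the property strip_tags() lacks. Entries are applied in order, so a later entry can in principle match text that an earlier entry has already unescaped; a stricter version would do the matching in a single pass.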