Newsgroups: php.internals
From: dragoonis@gmail.com (Paul Dragoonis)
To: PHP Internals List
Cc: Rasmus Lerdorf, Stanislav Malyshev
Date: Tue, 18 Sep 2012 21:48:00 +0100
Subject: Re: [PHP-DEV] RFC: Implementing a core anti-XSS escaping class
In-Reply-To: <5058D807.3030209@gmail.com>
References: <5058B7A3.3030708@gmail.com> <5058D807.3030209@gmail.com>

@All,

I'd like to provide a real use case, since I feel people have gone off on tangents of their own. For example, a list of blog posts:

getID();?>" title="escapeHtmlAttr($post->getTitle());?>"> escapeHtml($post->getTitle());?>
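To make the two contexts in the fragment above concrete, here is a minimal sketch. The function names sketch_escape_html() and sketch_escape_html_attr() are hypothetical and are not the RFC's API; the attribute variant is a simplified assumption (encode every ASCII character outside a small safe set as a hex character reference) and is not taken from the RFC text.

```php
<?php
// Hypothetical helpers for illustration; neither name exists in PHP core.

// Body context: the characters that matter are & < > and the two quote characters.
function sketch_escape_html(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Attribute context: encode every ASCII byte that is not alphanumeric or one of
// , . - _ as a hexadecimal character reference. Bytes >= 0x80 pass through
// untouched, which assumes the input is valid UTF-8 and the page is served as UTF-8.
function sketch_escape_html_attr(string $value): string
{
    return preg_replace_callback(
        '/[^a-zA-Z0-9,.\-_\x80-\xFF]/',
        static function (array $m): string {
            return sprintf('&#x%02X;', ord($m[0]));
        },
        $value
    );
}

$title = 'x onmouseover=alert(1)';

// htmlspecialchars() leaves this value unchanged, so an unquoted attribute
// would still gain an injected onmouseover handler.
echo '<a title=' . sketch_escape_html($title) . '>', "\n";

// The attribute variant also encodes the space, "=", "(" and ")".
echo '<a title=' . sketch_escape_html_attr($title) . '>', "\n";
```

With a quoted attribute and ENT_QUOTES, the body-context function is usually sufficient; the stricter variant only matters when the author cannot guarantee that the attribute is quoted.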

Note the different needs here: escaping general HTML output, and escaping the same value inside an attribute. This is an important problem that we need to try to solve. htmlspecialchars() isn't good enough; otherwise we wouldn't need a custom preg_match() solution like the one in the proposed RFC.

I'm happy for this to be an SPL class or a function such as escape_var() with options (similar to how filter_var() works right now). Adding another extension in today's PHP ecosystem is not going to help us at all, since only something like 2% of people are ever going to install it. It has to be in ./ext/standard/ or ./ext/spl/.

@Rasmus/Stas: Are you happy with us adding a new class or function to ./ext/spl/ or ./ext/standard/? This isn't one of those shiny "must have" features; it addresses a very important problem. For PHP developers to benefit from the escaping functions provided by Zend/Symfony, they have to actually be using those frameworks, and that's really a small portion of the PHP code out there in the wild. If we can introduce the new escape_var() function or a new OO class (as per the RFC), then it's going to be readily available in the future.

Many thanks,
Paul Dragoonis.

On Tue, Sep 18, 2012 at 9:22 PM, Ángel González wrote:
> On 18/09/12 21:06, Pádraic Brady wrote:
>> Hi Ángel,
>>
>> The methods all refer to literal strings, values or digits. We can't
>> reasonably escape data while allowing valid markup for the current
>> context since that's a contradiction by its very nature. If you needed
>> to let user values drive CSS names, Javascript functions or variable
>> naming, or HTML markup, you need something completely different. For
>> example, HTML markup can be sanitised against a whitelist using
>> HTMLPurifier.
>>
>>> I'm fine with the concept, but I'm not sold on the interface.
>>> It should be really clear when each of them should be used.
>>>
>>> escapeHtml()
>>> Ok, this is going to be used to show content inside a html document.
>>>
>>> escapeHtmlAttr()
>>> Use when using unquoted html attributes, otherwise use html escaping.
>>> When was the last time I saw an unquotted attribute with user-provided content?
>> Hopefully never since that's the ideal ;). However, HTML5 allows
>> unquoted attributes which is perfectly valid. We don't make the user's
>> choice on this but we could provide the relevant tool for escaping if
>> they are completely and irredeemably insane :P.
> Someone may be insane enough to try to destroy his planet, but "some insane
> soul might want it" is no reason to build such weapon. :)
>
> As it's a crazy thing to do, we shouldn't provide means to do it. If
> your parameter
> is not a hardcoded number, just quote it and use escapeX function on its
> content.
>
>
>>> I think it should be replaced by a quoteHtmlAttr() function which properly
>>> escapes the content and adds the quotes for you (or it might skip them
>>> if it determines it's not needed in this case).
>> The RFC focuses on escaping - not sanitising or reformatting.
> As an api client I just want to pass a parameter to the attribute.
>
> Doing
> echo 'escapeHtml("font-weight: normal") . '">';
> or
> echo 'quoteHtmlAttrib("font-weight: normal") . '>';
>
> is equivalent, just a distinction on the function contract. But in the
> second case the function avoids the ambiguity on whether the attribute
> used double quotes, single ones or no quote at all, since it can choose
> the one it "prefers".
>
> The goal is to make easy to write secure code. I think the second way
> does it better. If we need to change the name of the rfc, so be it.
>
>
>>> escapeJs()
>>> Escape javascript... but inside