I am beginning to see this as another 'date/time' type of problem. Adopt the
standard that everything internally is UTC and many of the problems go away.
I can remember discussions on unicode and PHP6. PHP5 was just being RC'ed with
tools for handling unicode (mbstring) but there was no coherence on how to
handle things ... as there still isn't. I was hoping that PHP6 would be
internally unicode, and then one only had to ensure that the interfaces
coded correctly to and from unicode. Internally everything is easy because there
are no 'encoding problems'.
'Content' going in and out needs to be correctly processed and that is the base
of this. The bulk of my own 'persistent data' is content such as 'wiki', 'blog',
'forum posts', 'articles' and so on. Others will most likely say that I should
not be using 'html' as the storage medium, but it does provide a flexible
standard format and 'ckeditor' provides a generic editor for all content. The
problem of course is that we are storing html tags within the data, so 'crude'
filtering using htmlspecialchars is not practical. The current process sanitizes
data input for normal users, but still allows 'admin' users direct source access
which is still a security risk, but we have to trust someone.
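To illustrate what I mean by 'not practical' (the snippet below is only a made-up
example of the sort of content involved), blanket escaping destroys the legitimate
markup along with anything malicious:

<?php
// Hypothetical wiki/blog content stored as HTML (e.g. produced by ckeditor).
$stored = '<p>See the <a href="/wiki/Install">install guide</a> for details.</p>';

// Blanket escaping mangles the markup we actually want to keep:
echo htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
// Output: &lt;p&gt;See the &lt;a href=&quot;/wiki/Install&quot;&gt;install guide&lt;/a&gt; ...

// So rich content needs whitelist-based sanitizing of the allowed tags and
// attributes on input, not a blanket escape on output.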
My point here is that much of what is being discussed on 'a core anti-XSS
escaping class' is missing some of the basic problems, and 'filtering' is my
own take on the correct way of managing this! Many of the recent XSS holes have
simply been the likes of the 'highlight' function in Smarty, which had no
filtering at all ... and just needed sanitizing before anything was done with
it. This 'class' is purely targeting a small area of the problem and repackaging
functions which still need the user to understand which 'filter' to apply to
which string. If it is expected that simply applying a process to the output will
'protect users' then it can never succeed. The users need to understand just
where to 'filter' the strings they are using and what filters to use.
Now if what is proposed is a 'class' that will decompose an html page with
embedded css and js and magically remove any XSS injection then it might be
useful, and I think the creator of that would be in line for a Nobel prize?
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
My point here is that much of what is being discussed on 'a core anti-XSS
escaping class' is missing some of the basic problems and 'filtering'
is my own take on the correct way of managing this!
and this is where you are wrong.
see
https://www.owasp.org/index.php/Abridged_XSS_Prevention_Cheat_Sheet#A_Positive_XSS_Prevention_Model
and
https://www.owasp.org/index.php/Abridged_XSS_Prevention_Cheat_Sheet#Why_Can.27t_I_Just_HTML_Entity_Encode_Untrusted_Data.3F
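to make that second link concrete, here is a small made-up example: the value
contains none of the characters htmlspecialchars() touches, yet dropped into an
unquoted attribute it still injects an event handler, because the encoding has
to match the output context:

<?php
// attacker-supplied value: no < > " & ' at all, so htmlspecialchars()
// returns it unchanged
$class = 'x onmouseover=alert(document.cookie)';

// unquoted attribute context: still exploitable despite the "escaping"
echo '<div class=' . htmlspecialchars($class, ENT_QUOTES, 'UTF-8') . '>hi</div>';
// renders as: <div class=x onmouseover=alert(document.cookie)>hi</div>

// a context-aware attribute escaper (or at minimum always-quoted attributes)
// is needed here, not just entity encoding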
Many of the recent XSS holes have simply been the likes of the 'highlight'
function in Smarty which had no filtering at all ... and just needed
sanitizing before anything was done with it.
you haven't experienced all of the possible contexts where an XSS
vulnerability can take place. that doesn't mean that those vectors don't
exist.
This 'class' is purely targeting a small area of the problem and
repackaging functions which still need the user to understand which
'filter' to apply to which string.
nope.
this class aims to provide developers with a tool to safely encode content
for each possible output context.
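something roughly like this (just my own toy sketch, not the proposed
implementation: it only covers two contexts with PHP built-ins, while the real
class needs dedicated rules for HTML attributes, JavaScript and CSS as well):

<?php
class ToyEscaper
{
    private $encoding;

    public function __construct($encoding = 'UTF-8')
    {
        $this->encoding = $encoding;
    }

    // HTML body context
    public function escapeHtml($value)
    {
        return htmlspecialchars($value, ENT_QUOTES, $this->encoding);
    }

    // URL parameter context
    public function escapeUrl($value)
    {
        return rawurlencode($value);
    }
}

$escaper = new ToyEscaper('UTF-8');
$q = isset($_GET['q']) ? $_GET['q'] : '';
echo '<p>You searched for ' . $escaper->escapeHtml($q) . '</p>';
echo '<a href="/search?q=' . $escaper->escapeUrl($q) . '">search again</a>';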
If it is expected that simply applying a process to the output will 'protect
users' then it can never succeed.
escaping the output doesn't mean that you can't also filter the input
(usually they walk hand in hand: "filter in, escape out").
you are the only one preaching here that half of that is an ok solution.
if you only filter the input, you cannot use more than one output context
without the risk of compromise, and you also put all your defense in the
belief that the data stored in your relational database (or cache, etc.) is
safely filtered.
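a rough illustration of "filter in, escape out" (the field name is made up);
neither layer replaces the other:

<?php
// filter on the way in: reject values that make no sense at all
$age = filter_input(INPUT_POST, 'age', FILTER_VALIDATE_INT,
    array('options' => array('min_range' => 0, 'max_range' => 150)));
if ($age === false || $age === null) {
    die('invalid age');
}
// ... store $age somewhere ...

// escape on the way out anyway, because the escaping is keyed to the
// output context (HTML here), not to how much we trust the stored data
echo '<td>' . htmlspecialchars((string) $age, ENT_QUOTES, 'UTF-8') . '</td>';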
The users need to understand just where to 'filter' the strings they are
using and what filters to use.
yeah, that's one thing that we can't fix, as for properly encoding the
output you need to know the output context.
Now if what is proposed is a 'class' that will decompose an html page with
embedded css and js and magically remove any XSS injection then it might be
useful, and I think the creator of that would be in line for a Nobel prize?
how does that relate to the current discussion?
--
Ferenc Kovács
@Tyr43l - http://tyrael.hu
Hi Lester,
'Content' going in and out needs to be correctly processed and that is the
base of this. The bulk of my own 'persistent data' is content such as
'wiki', 'blog', 'forum posts', 'articles' and so on. Others will most likely
say that I should not be using 'html' as the storage medium, but it does
provide a flexible standard format and 'ckeditor' provides a generic editor
for all content. The problem of course is that we are storing html tags
within the data, so 'crude' filtering using htmlspecialchars is not
practical. The current process sanitizes data input for normal users, but
still allows 'admin' users direct source access which is still a security
risk, but we have to trust someone.
How you store data is somewhat irrelevant. If you store it as plain
text with no markup, that doesn't guarantee that someone will never
sneak in and add markup. This applies whether the context is itself
HTML or anything else. As a result the "crude" filtering is anything
but crude. It's a simple and effective part of a Defense In Depth
strategy. A far better solution, though only starting to gain traction,
is to adopt a Content Security Policy, which informs browsers about
what your markup should enable (i.e. a whitelist). By default, this
disables all inline Javascript, for example, which is the usual target
of XSS attacks.
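Shipping a basic policy is a one-liner in PHP. The directives below are only a
sketch and every site needs its own tuning; the prefixed header names are an
assumption about the experimental support in current browsers:

<?php
// Whitelist our own origin for scripts and styles; among other things this
// disables inline <script> blocks, the usual XSS payload.
header("Content-Security-Policy: default-src 'self'; script-src 'self'; style-src 'self'");

// Prefixed variants for browsers that only understand the experimental headers.
header("X-Content-Security-Policy: default-src 'self'");
header("X-WebKit-CSP: default-src 'self'");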
The other thing about Defense In Depth is that when its advantages
exceed its disadvantages, applying it should be automatic. If folk
extract a value from a database and echo it to an HTML template
unescaped because it "should be" safe, I consider that a security
vulnerability. What if an SQLi attack altered it? What if an admin is
crooked or their password cracked? What if...x100. Security
vulnerabilities are not isolated events - they can be combined and
chained together.
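A short sketch of that point (the table and data are invented, and I'm assuming
the SQLite PDO driver only to keep it self-contained): escape at the point of
output even though the value came from your own database.

<?php
$pdo = new PDO('sqlite::memory:');
$pdo->exec("CREATE TABLE users (id INTEGER PRIMARY KEY, display_name TEXT)");
$pdo->exec("INSERT INTO users (display_name) VALUES ('<script>alert(1)</script>')");

$row = $pdo->query("SELECT display_name FROM users WHERE id = 1")
           ->fetch(PDO::FETCH_ASSOC);

// Wrong: trusts that nothing ever tampered with the stored value.
// echo '<p>Hello ' . $row['display_name'] . '</p>';

// Right: the HTML context decides the escaping, not where the data came from.
echo '<p>Hello ' . htmlspecialchars($row['display_name'], ENT_QUOTES, 'UTF-8') . '</p>';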
My point here is that much of what is being discussed on 'a core anti-XSS
escaping class' is missing some of the basic problems and 'filtering' is
my own take on the correct way of managing this! Many of the recent XSS
holes have simply been the likes of the 'highlight' function in Smarty which
had no filtering at all ... and just needed sanitizing before anything was
done with it. This 'class' is purely targeting a small area of the problem
and repackaging functions which still need the user to understand which
'filter' to apply to which string. If it is expected that simply applying a
process to the output will 'protect users' then it can never succeed. The
users need to understand just where to 'filter' the strings they are using
and what filters to use.
Filtering/Input Sanitisation goes hand in hand with Output
Encoding/Escaping. You can't have one without the other and also claim
to have executed a Defense In Depth strategy. You're then stripping
away defenses based on the expectation that whatever remains will
never fail. If it does, and it does all the time in reality, then your
lack of escaping as a backup is a massive problem.
What does this mean? Escaping is not a small area of the problem -
it's one of the biggest areas of the problem - potentially bigger than
input sanitisation, since invalid values are irrelevant to proper
escaping, which operates blindly by design. A lack of escaping impacts
every single point in every shred of application output which contains
data sourced from everything not literally defined in the current
request, and just one failure may be sufficient for an attacker to dump
encoded Javascript into the browser to steal cookies, perform
requests, track key presses, rewrite HTTPS links, attack browser
extensions, and any number of other effects.
Your final point is accurate to a point. Users, by and large, don't
understand XSS. This is not, however, a justification for withholding
tools that are useful to those who do know how to properly use them.
Education is a separate issue which I'm also trying to address:
http://phpsecurity.readthedocs.org
Now if what is proposed is a 'class' that will decompose an html page with
embedded css and js and magically remove any XSS injection then it might be
useful, and I think the creator of that would be in line for a Nobel prize?
HTMLPurifier by Edward Z. Yang. It only works on body content - not
the header section, but knock yourself out ;). There's also the far
less CPU-intensive option of the Content Security Policy, though we're
reliant on the penetration of modern browsers to distribute that
across more users. That said, Defense In Depth - folk should seriously
consider implementing this right now.
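A rough usage sketch of HTMLPurifier, assuming it's installed and its stock
autoloader is available (the allowed tag list is just an example):

<?php
require_once 'HTMLPurifier.auto.php';

// Whitelist the markup we're prepared to keep; everything else is dropped.
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.Allowed', 'p,a[href],strong,em,ul,ol,li');

$purifier = new HTMLPurifier($config);

$untrusted = isset($_POST['body']) ? $_POST['body'] : '';
$clean = $purifier->purify($untrusted); // safe to persist and later echo as body HTML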
Paddy
--
Pádraic Brady
http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team