Hi all,
This is a little improvement for HTML escape.
https://wiki.php.net/rfc/secure-html-escape
"/" escape is recommended by OWASP and we may follow them.
Any comments?
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
This is a little improvement for HTML escape.
https://wiki.php.net/rfc/secure-html-escape
"/" escape is recommended by OWASP and we may follow them.
Could you include some samples of malicious input and what the output
would actually look like? It's not obvious from the RFC or the link
referenced.
-Sara
Hi Sara,
This is a little improvement for HTML escape.
https://wiki.php.net/rfc/secure-html-escape
"/" escape is recommended by OWASP and we may follow them.
Could you include some samples of malicious input and what the output
would actually look like? It's not obvious from the RFC or the link
referenced.
They don't explain it with code. AFAIK this is the case where the output
becomes invalid HTML that destroys the tag structure.
<tag attr=<?php echo htmlentities($str, ENT_QUOTES, 'UTF-8') ?>>
When $str is
sometext /
the produced HTML would be
<tag attr=sometext />
and the tag is closed.
The code is broken in the first place, since attributes must be enclosed in
" (HTML5/XHTML) or ' (HTML4), but many (if not most) browsers simply allow
attributes without quotes.
As long as the user makes no other mistakes, it's not a security issue. It's
not vulnerable by itself, but it may be possible to do something bad on some
implementations. It's just a precaution, and a good one, since it does
not break any existing browsers. IMHO.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
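To make the example in the message above concrete, here is a minimal sketch of the unquoted-attribute case. The helper name render_unquoted_attr is an assumption made up for illustration; only htmlentities() itself comes from the thread.

```php
<?php
// Hypothetical helper for illustration; not part of PHP or the RFC.
function render_unquoted_attr(string $value): string
{
    // The attribute value is deliberately left unquoted, as in the message above.
    return '<tag attr=' . htmlentities($value, ENT_QUOTES, 'UTF-8') . '>';
}

// htmlentities() leaves both the space and the slash untouched,
// so the value runs straight into the closing ">".
echo render_unquoted_attr('sometext /'), "\n"; // prints: <tag attr=sometext />
```

The output matches the "produced HTML" quoted above: the escaping call changes nothing in this input.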
They don't explain as code. AFAIK This is the case for generating invalid
HTML that destroys HTML tag structure.
<tag attr=<?php htmlentities($str, ENT_QUOTES, 'UTF-8') ?>>
When $str is
sometext /
Produced HTML would be
<tag attr=sometext />
and tag is closed.
Oh, I see what the RFC is suggesting now: Include encoding '/' by
default. Yeah, I got no problem with that. Seems completely
reasonable. ((Though I twitch every time I see unquoted attributes))
-Sara
The code is broken in first place since attribute must be enclosed by
"(HTML5/XHTML) or '(HTML4), but many (if not most) browsers just allows
attributes without qoutes.
Well, the HTML5 specification says attribute values can be left
unquoted, so I'd say that the "code is broken" statement is invalid.
http://www.w3.org/TR/html-markup/syntax.html#syntax-attr-unquoted
Hi Pavel,
The code is broken in first place since attribute must be enclosed by
"(HTML5/XHTML) or '(HTML4), but many (if not most) browsers just allows
attributes without quotes.
Well, the HTML5 specification says attribute values can be left
unquoted, so I'd say that the "code is broken" statement is invalid.
http://www.w3.org/TR/html-markup/syntax.html#syntax-attr-unquoted
Thank you for the heads-up!
We must have this change as a security fix, then.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi!
They don't explain as code. AFAIK This is the case for generating invalid
HTML that destroys HTML tag structure.
<tag attr=<?php htmlentities($str, ENT_QUOTES, 'UTF-8') ?>>
htmlentities() is no good for encoding unquoted tags. This is obvious
since it does not encode space and space is a significant character with
unquoted tags. So if you have code like the above, it's game over for
you, no need to do anything further.
As long as user don't have other mistakes, it's not a security issue. It's
not vulnerable by itself, but it may be possible do some bad thing on some
implementations. It's just a precaution. It's good precaution as it does
not break any existing browsers. IMHO.
I don't see any reason so far to do this. The code above is broken with
and without quoting /, and quoting / adds nothing to its security as far
as I can see.
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
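The point about spaces in the message above can be checked with a short sketch. The helper escape_with_slash is hypothetical: it simulates the proposed "/" escape on top of htmlentities(), and the payload is an invented example, not one taken from the thread.

```php
<?php
// Hypothetical helper: htmlentities() plus the "/" escape proposed in the RFC.
function escape_with_slash(string $value): string
{
    $escaped = htmlentities($value, ENT_QUOTES, 'UTF-8');
    return str_replace('/', '&#x2F;', $escaped);
}

// Invented payload: no quotes, no slash, just a space and an event handler.
$payload = 'x onmouseover=alert(1)';

// The space, "=" and parentheses have no HTML 4.01 named entities,
// so the value passes through unchanged and adds a second attribute.
echo '<tag attr=' . escape_with_slash($payload) . '>', "\n";
// prints: <tag attr=x onmouseover=alert(1)>
```

Escaping "/" changes nothing here, which is the objection being raised: the unquoted context is already lost once a space is allowed through.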
"/" escape is recommended by OWASP and we may follow them.
Surely if this is to stop <foo bar=<?=htmlspecialchars($foobar); ?>>,
then we'd have to escape ' ' too?
--
Andrea Faulds
http://ajf.me/
Hi Andrea,
"/" escape is recommended by OWASP and we may follow them.
Surely if this is to stop <foo bar=<?=htmlspecialchars($foobar); ?>>,
then we'd have to escape ' ' too?
Making ENT_QUOTES the default is a good idea also.
I should have added this to the RFC.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi!
Making ENT_QUOTES the default is a good idea also.
I should have added this to the RFC.
Why is it a good idea? Could you explain what it adds to the security of
this function?
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
Making ENT_QUOTES the default is a good idea also.
I should have added this to the RFC.
Why is it a good idea? Could you explain what it adds to the security of
this function?
I suppose the argument could be made for "safe by default", since single quotes are now valid for HTML attributes as well. (I miss XHTML...)
More interesting to me, what's the use case for ENT_NOQUOTES? This one causes issues whatever attribute syntax one chooses.
Best regards
Rouven
Hi all,
It's dated but: https://wiki.php.net/rfc/escaper I see Yasuo edited it a
wee bit in September on its 1 year anniversary to add ext/filter as an
option. I had hoped Anthony would get around to it but c'est la vie.
Without quotes you need to escape almost ALL non alphanumeric characters in
an attribute value just to make sure you cover every known and unknown
browser parsing oddity. It's just a bad practice full stop despite HTML5
allowing it.
ENT_QUOTES should be the default for obvious reasons. It escapes quotes.
htmlentities() doesn't do anything more than htmlspecialchars() unless you
count turning "Pádraic Ó'Brádaigh" into "P&aacute;draic
&Oacute;'Br&aacute;daigh" as a positive benefit to the Irish language and
the size of its webpages :P.
Paddy
--
Pádraic Brady
http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team
Zend Framework PHP-FIG Representative
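The difference between the two functions described in the message above can be seen in a few lines. This is a sketch that assumes the source file is saved as UTF-8; ENT_COMPAT is passed explicitly because it was the default at the time of this thread, while newer PHP versions default to different flags.

```php
<?php
// This file is assumed to be saved as UTF-8.
$name = "Pádraic Ó'Brádaigh";

// ENT_COMPAT is passed explicitly to reproduce the default of the time.
$special  = htmlspecialchars($name, ENT_COMPAT, 'UTF-8');
$entities = htmlentities($name, ENT_COMPAT, 'UTF-8');

echo $special, "\n";   // accented letters are left as they are
echo $entities, "\n";  // accented letters become named entities
echo strlen($special), ' vs ', strlen($entities), " bytes\n";
```

Neither call touches the apostrophe under ENT_COMPAT; the only difference is the longer output from htmlentities().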
Without quotes you need to escape almost ALL non alphanumeric characters in an attribute value just to make sure you cover every known and unknown browser parsing oddity. It's just a bad practice full stop despite HTML5 allowing it.
ENT_QUOTES should be the default for obvious reasons. It escapes quotes.
Just to be clear, the current default (ENT_COMPAT) does escape double quotes. The change to ENT_QUOTES would escape single quotes as well.
Best regards
Rouven
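For reference, the three quote flags discussed in the last few messages can be compared side by side. This is a small sketch; the input string is an invented example.

```php
<?php
// Invented sample containing one double quote and one single quote.
$str = "a\"b'c";

$results = [
    'ENT_NOQUOTES' => htmlspecialchars($str, ENT_NOQUOTES, 'UTF-8'),
    'ENT_COMPAT'   => htmlspecialchars($str, ENT_COMPAT, 'UTF-8'),
    'ENT_QUOTES'   => htmlspecialchars($str, ENT_QUOTES, 'UTF-8'),
];

foreach ($results as $flag => $out) {
    echo str_pad($flag, 13), $out, "\n";
}
// ENT_NOQUOTES a"b'c
// ENT_COMPAT   a&quot;b'c
// ENT_QUOTES   a&quot;b&#039;c
```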
Hi Stas,
On Sun, Feb 2, 2014 at 7:21 PM, Stas Malyshev smalyshev@sugarcrm.com wrote:
Making ENT_QUOTES the default is a good idea also.
I should have added this to the RFC.
Why is it a good idea? Could you explain what it adds to the security of
this function?
Users can do
<tag attr='<?php echo htmlentities($str)?>' >
and this is valid. I think there is no reason not to escape ' by default.
I agree that users should not use unquoted attributes in general.
The '/' escape could still be useful. For example, a user may have validation
code that allows printable ASCII chars without spaces. The '/' escape may protect
apps from generating an invalid tag in this case.
We could say "your application is broken in the first place".
However, neither the "'" nor the "/" escape breaks apps at all, yet they
cover some issues.
There is no reason not to escape these chars by default. IMHO.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
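The single-quoted attribute case from the message above can be sketched as follows. The helper render_single_quoted_attr and the payload are assumptions for illustration; the flags are passed explicitly so the result does not depend on the PHP version's defaults.

```php
<?php
// Hypothetical helper: renders a single-quoted attribute with the given flags.
function render_single_quoted_attr(string $value, int $flags): string
{
    return "<tag attr='" . htmlentities($value, $flags, 'UTF-8') . "'>";
}

// Invented payload that tries to close the attribute and add a handler.
$payload = "x' onclick='alert(1)";

// ENT_COMPAT leaves the single quote alone, so the attribute is terminated early.
echo render_single_quoted_attr($payload, ENT_COMPAT), "\n";
// prints: <tag attr='x' onclick='alert(1)'>

// ENT_QUOTES encodes the single quote, so the value stays inside the attribute.
echo render_single_quoted_attr($payload, ENT_QUOTES), "\n";
// prints: <tag attr='x&#039; onclick=&#039;alert(1)'>
```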
Hi all,
We might even deprecate ENT_COMPAT and ENT_QUOTES. We could ignore
them and always escape all the chars recommended by OWASP (except with ENT_NOQUOTES).
I think ENT_COMPAT/ENT_QUOTES do more harm than good.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi!
Users can do
<tag attr='<?php echo htmlentities($str)?>' >
They also can do <? echo $str; ?> and <? eval($_GET['f']); ?>. That's
not what they should be doing, but they can do it. That doesn't mean
there's something wrong with echo or PHP compiler.
and this is valid. I think there is no reason not to escape ' by default.
I agree that user should not use unquoted attributes in general.
'/' escape could be still useful. For example, user may have validation
I don't see how it would be useful.
code that allows printable ASCII chars w/o spaces. '/' escape may protect
apps from generating invalid tag in this case.
This seems to be a very contrived scenario invented to fit your point.
If they already pre-filter input, they could also remove / or other
special characters. The fact is that htmlentities is useless as security
feature in this context, and removing / does not make it useful. Saying
"we'll add escape so that it would be safe" is magic-quotes kind of
mistake - it gives the users wrong impression that it's OK to do things
that they should not be doing.
There is no reason not to escape these chars by default. IMHO.
There is a reason - there's no reason to escape them. In every scenario
where htmlentities should be used, escaping them is useless. In every
scenario where escaping / is useful, htmlentities should not be used. By
promoting usage of htmlentities in scenarios where it should absolutely
not be used, we are only doing the users a disservice.
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
Hi Stas,
Hi!
Users can do
<tag attr='<?php echo htmlentities($str)?>' >
They also can do <? echo $str; ?> and <? eval($_GET['f']); ?>. That's
not what they should be doing, but they can do it. That doesn't mean
there's something wrong with echo or PHP compiler.
I don't believe this has anything to do with the question at hand.
and this is valid. I think there is no reason not to escape ' by default.
I agree that user should not use unquoted attributes in general.
'/' escape could be still useful. For example, user may have validation
I don't see how it would be useful.
I'm not sure it is either. OWASP definitely notes it, but it's not an attribute
termination character inside quotes.
There is no reason not to escape these chars by default. IMHO.
There is a reason - there's no reason to escape them. In every scenario
where htmlentities should be used, escaping them is useless. In every
scenario where escaping / is useful, htmlentities should not be used. By
promoting usage of htmlentities in scenarios where it should absolutely
not be used, we are only doing the users a disservice.
There are three ways to present an attribute value validly in HTML5:
- Double quoted
- Single quoted
- Unquoted.
Bearing in mind that people who use htmlentities() make a mockery of UTF-8 by
overescaping and increasing output page size for no good reason whatsoever, both
htmlspecialchars() and htmlentities() only work by default for the first option.
They do not work by default for the last two options.
In userland, virtually all security-conscious libraries and frameworks cover TWO
options: 1 and 2 by setting ENT_QUOTES. It seems reasonable for PHP to make the
change also unless it has some hitherto unmentioned downside.
Also, for reference, here is the actual paragraph from the OWASP XSS cheatsheet:
"Escape the following characters with HTML entity encoding to prevent switching
into any execution context, such as script, style, or event handlers. Using hex
entities is recommended in the spec. In addition to the 5 characters significant
in XML (&, <, >, ", '), the forward slash is included as it helps to end an HTML
entity."
I read "entity" as "tag".
--
Pádraic Brady
http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team
Zend Framework PHP-FIG Representative
Hi Padraic,
On Mon, Feb 3, 2014 at 8:06 PM, Pádraic Brady padraic.brady@gmail.com wrote:
There are three ways to present an attribute value validly in HTML5:
- Double quoted
- Single quoted
- Unquoted.
Unquoted attributes are a really bad standard with respect to security. I don't
understand why they are allowed, but I think we need to address this
somehow.
htmlentities/htmlspecialchars could have ENT_NO_SPACE as an option. If
there is a space char, an empty string is returned. The standard allows spaces
before the attribute value, so a user may write
<tag attr = <?php echo htmlentities($str, ENT_NO_SPACE);?> >
Use of this option is not recommended, but the standard allows this syntax. We may
support it even if we don't recommend it.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
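ENT_NO_SPACE does not exist in PHP; it is only a suggestion in the message above. As a rough userland sketch of the behaviour described there, assuming "space" is meant to cover the ASCII whitespace characters HTML treats as attribute separators, and using the made-up name escape_no_space:

```php
<?php
// Hypothetical userland stand-in for the suggested ENT_NO_SPACE option.
// ENT_NO_SPACE itself is NOT a real PHP constant.
function escape_no_space(string $value): string
{
    // Reject the value outright if it contains a space, tab, CR, LF or form feed.
    if (preg_match('/[ \t\r\n\f]/', $value) === 1) {
        return '';
    }
    return htmlentities($value, ENT_QUOTES, 'UTF-8');
}

echo '<tag attr = ' . escape_no_space('some text') . ' >', "\n"; // value dropped
echo '<tag attr = ' . escape_no_space('a&b') . ' >', "\n";       // value kept, & encoded
```

This only illustrates the idea. It does not make unquoted attributes safe in general, since characters such as "=" and the backtick still pass through.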
Hi!
Use of this option is not recommended, but there is the standard. We may
support it even if we don't recommend it.
Nowhere in any standard it says we must use htmlentities to support
every possible context. There are contexts where htmlentities is
completely unsuitable - such as unquoted attributes, Javascript, CSS,
etc. In these contexts, other ways of escaping output should be used.
I get an impression you're trying to fit a square peg into a round hole
here. There are other ways to escape things and they should match the
context the output is used in. Trying to serve every scenario with one
function would not work.
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
Hi Stas,
On Tue, Feb 4, 2014 at 7:21 AM, Stas Malyshev smalyshev@sugarcrm.com wrote:
Use of this option is not recommended, but there is the standard. We may
support it even if we don't recommend it.
Nowhere in any standard it says we must use htmlentities to support
every possible context. There are contexts where htmlentities is
completely unsuitable - such as unquoted attributes, Javascript, CSS,
etc. In these contexts, other ways of escaping output should be used.
I get an impression you're trying to fit a square peg into a round hole
here. There are other ways to escape things and they should match the
context the output is used in. Trying to serve every scenario with one
function would not work.
We may or may not support unquoted attributes.
I think it's really dangerous, therefore we may not support it ;)
It may be good for PHP to declare "We support HTML5!", though.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Yasuo Ohgaki wrote (on 03/02/2014):
On Tue, Feb 4, 2014 at 7:21 AM, Stas Malyshev smalyshev@sugarcrm.com wrote:
Nowhere in any standard it says we must use htmlentities to support
every possible context.
We may or may not support unquoted attributes.
I think it's really dangerous, therefore we may not support it ;)
It may be good for PHP to declare "We support HTML5!", though.
I think part of the misunderstanding here is the distinction between
"should PHP support an appropriate escape mechanism for this situation?"
and "should the htmlentities() function be extended to be the
appropriate escape mechanism for this situation?"
The security requirement is for users to use appropriate escaping, and
quoting, mechanisms for the output formats they use. The combination of
quoted attributes and htmlspecialchars() with ENT_QUOTES is a secure
escaping method, provided by the core of PHP.
HTML5 allows users to use non-quoted attributes, but PHP does not
currently have a built-in function which provides adequate escaping for
that scenario. Such a function would need to do more than just escaping
/, as others have pointed out; for instance, it would need to either
escape, filter, or reject all forms of whitespace.
I have no real opinion on what that function should be, except that I
will personally never use it, because I will simply put quotes around my
attributes and remove any need for it.
Regards,
Rowan Collins
[IMSoP]
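As a rough idea of what "adequate escaping" for unquoted attributes might involve, here is a sketch that encodes every ASCII character other than letters and digits as a hex entity, in the spirit of the earlier remark that almost all non-alphanumerics need escaping. The name escape_html_attr and the choice to return an empty string for invalid UTF-8 are assumptions for this example, not an existing PHP function.

```php
<?php
// Hypothetical attribute escaper; not a built-in PHP function.
function escape_html_attr(string $value): string
{
    // Assumption: refuse input that is not valid UTF-8 instead of guessing.
    if (preg_match('//u', $value) !== 1) {
        return '';
    }

    // Encode every ASCII byte that is not a letter or digit as a hex entity.
    // Bytes 0x80-0xFF (parts of multibyte UTF-8 characters) are left as they are.
    $result = preg_replace_callback(
        '/[^A-Za-z0-9\x80-\xFF]/',
        static function (array $m): string {
            return sprintf('&#x%02X;', ord($m[0]));
        },
        $value
    );

    return $result ?? '';
}

echo '<tag attr=' . escape_html_attr('sometext /') . '>', "\n";
// prints: <tag attr=sometext&#x20;&#x2F;>
```

With whitespace, "=", quotes and "/" all encoded, the value can no longer end the attribute or the tag. Quoting the attribute remains the simpler option, as the message above says.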
Hi Rowan,
I think part of the misunderstanding here is the distinction between "should
PHP support an appropriate escape mechanism for this situation?" and "should
the htmlentities() function be extended to be the appropriate escape
mechanism for this situation?"
The security requirement is for users to use appropriate escaping, and
quoting, mechanisms for the output formats they use. The combination of
quoted attributes and htmlspecialchars() with ENT_QUOTES is a secure
escaping method, provided by the core of PHP.
HTML5 allows users to use non-quoted attributes, but PHP does not
currently have a built-in function which provides adequate escaping for that
scenario. Such a function would need to do more than just escaping /, as
others have pointed out; for instance, it would need to either escape,
filter, or reject all forms of whitespace.
I have no real opinion on what that function should be, except that I will
personally never use it, because I will simply put quotes around my
attributes and remove any need for it.
That's what we should be doing. Part of the concern with having a full
on unquoted attribute value escaping mechanism is what happens over
the course of an application's lifecycle. I'm absolutely of your
opinion, but others would argue that attribute escaping is defence in
depth against the day someone removes quotes without thinking. HTML5
has made that side of the fence more relevant.
If it were to be done, it would be a separate function other than
htmlspecialchars(), which I assume is why htmlentities() as the local
greedy escaper makes an attractive carrier at face value. I don't
actually think it does fit there without redefining its purpose so a
separate function would be wiser.
Paddy
--
Pádraic Brady
http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team
Zend Framework PHP-FIG Representative
Hi Stas,
On Mon, Feb 3, 2014 at 5:17 PM, Stas Malyshev smalyshev@sugarcrm.com wrote:
Users can do
<tag attr='<?php echo htmlentities($str)?>' >
They also can do <? echo $str; ?> and <? eval($_GET['f']); ?>. That's
not what they should be doing, but they can do it. That doesn't mean
there's something wrong with echo or PHP compiler.
and this is valid. I think there is no reason not to escape ' by default.
I agree that user should not use unquoted attributes in general.
'/' escape could be still useful. For example, user may have validation
I don't see how it would be useful.
code that allows printable ASCII chars w/o spaces. '/' escape may protect
apps from generating invalid tag in this case.
This seems to be a very contrived scenario invented to fit your point.
If they already pre-filter input, they could also remove / or other
special characters. The fact is that htmlentities is useless as security
feature in this context, and removing / does not make it useful. Saying
"we'll add escape so that it would be safe" is magic-quotes kind of
mistake - it gives the users wrong impression that it's OK to do things
that they should not be doing.
There is no reason not to escape these chars by default. IMHO.
There is a reason - there's no reason to escape them. In every scenario
where htmlentities should be used, escaping them is useless. In every
scenario where escaping / is useful, htmlentities should not be used. By
promoting usage of htmlentities in scenarios where it should absolutely
not be used, we are only doing the users a disservice.
I think we have different perspectives.
Some users have to conform to standards like PCI DSS.
PCI DSS requires following security standards and guidelines from OWASP,
SANS, etc.
Why not make PHP standard-compliant?
It does not hurt existing applications at all, and it is a simple enough
change.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi!
Some users have to conform to standards like PCI DSS.
PCI DSS requires following security standards and guidelines from OWASP,
SANS, etc.
Why not make PHP standard-compliant?
It does not hurt existing applications at all, and it is a simple enough
change.
I'm sorry, could you please quote me a standard that requires PHP to
escape / in a function called htmlentities? If there's no such standard,
the argument of "but the standard requires it" is void. No standard can
require you to use htmlentities where it should not be used. Putting
stuff into the language just because somebody on the internet mentioned in
a different context that it might be a good idea - is not. We should
understand why it is done and why it is a good idea, especially when
we're talking about security. In this case, the proposed use case should
never be used with htmlentities, due to obvious gaping security hole.
Adding code to enable such scenario is just not right. Instead, we
should tell people "Never ever do it. Ever.".
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
Hi Stas,
On Tue, Feb 4, 2014 at 7:14 AM, Stas Malyshev smalyshev@sugarcrm.com wrote:
Some users have to conform to standards like PCI DSS.
PCI DSS requires following security standards and guidelines from OWASP,
SANS, etc.
Why not make PHP standard-compliant?
It does not hurt existing applications at all, and it is a simple enough
change.
I'm sorry, could you please quote me a standard that requires PHP to
escape / in a function called htmlentities?
I've already written the URL to OWASP.
PCI DSS v3 states in section 6.5:
Develop applications based on secure coding guidelines.
Note: The vulnerabilities listed at 6.5.1 through 6.5.10 were current with
industry best practices when this version of PCI DSS was published. However,
as industry best practices for vulnerability management are updated (for
example, the OWASP Guide, SANS CWE Top 25, CERT Secure Coding, etc.), the
current best practices must be used for these requirements.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi!
I've already written the URL to OWASP.
PCI DSS v3 states in section 6.5
Develop applications based on secure coding guidelines.
Secure coding guidelines in this case is to not use htmlentities in this
context. If you already violate this requirement, why would you expect
PHP to un-violate it for you?
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
Hi Stas,
On Tue, Feb 4, 2014 at 7:24 AM, Stas Malyshev smalyshev@sugarcrm.com wrote:
I've already written the URL to OWASP.
PCI DSS v3 states in section 6.5
Develop applications based on secure coding guidelines.
Secure coding guidelines in this case is to not use htmlentities in this
context. If you already violate this requirement, why would you expect
PHP to un-violate it for you?
I'm lost here.
OWASP suggests to escape at least
& --> &amp;
< --> &lt;
> --> &gt;
" --> &quot;
' --> &#x27;     &apos; not recommended because its not in the HTML spec
(See: section 24.4.1) &apos; is in the XML and XHTML specs.
/ --> &#x2F;     forward slash is included as it helps end an HTML entity
https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)Prevention_Cheat_Sheet#RULE.231_-_HTML_Escape_Before_Inserting_Untrusted_Data_into_HTML_Element_Content
I'm not sure why you state "already violate this requirement".
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
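The six-character list quoted in the message above can be approximated in userland today. The following is a sketch only; the function name owasp_html_escape is an assumption, and it builds on htmlspecialchars() with ENT_QUOTES rather than on anything proposed in the RFC.

```php
<?php
// Hypothetical userland helper; the name is made up for this example.
function owasp_html_escape(string $value): string
{
    // htmlspecialchars() with ENT_QUOTES already covers & < > " and '.
    $escaped = htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    // Switch the single quote to the hex form from the list and add the slash.
    return str_replace(['&#039;', '/'], ['&#x27;', '&#x2F;'], $escaped);
}

echo owasp_html_escape("<a href='/x'>&\""), "\n";
// prints: &lt;a href=&#x27;&#x2F;x&#x27;&gt;&amp;&quot;
```

A literal "&#039;" in the input is not affected by the replacement, because its ampersand has already been encoded to "&amp;" by the first step.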
Yasuo Ohgaki wrote:
I'm lost here.
OWASP suggests to escape at least
& --> &amp;
< --> &lt;
> --> &gt;
" --> &quot;
' --> &#x27;     &apos; not recommended because its not in the HTML spec
(See: section 24.4.1) &apos; is in the XML and XHTML specs.
/ --> &#x2F;     forward slash is included as it helps end an HTML entity
https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)Prevention_Cheat_Sheet#RULE.231_-_HTML_Escape_Before_Inserting_Untrusted_Data_into_HTML_Element_Content
I'm not sure why you state "already violate this requirement".
It may be that what you are asking for is a flag on htmlentities for 'OWASP'
compliant option. Others would probably view that as not then being html5
compliant since html5 has its own list of 'escaped' characters. One of the
irritating things I find is 'unescaping' a string does not return the original
string simply because the html5 rule has not been followed! A clean html5 result
should be the default.
Looking at the Rule 2 from the OWASP they are actually asking for every
character below 256 to be escaped when used in an attribute! But the important
thing here is 'untrusted' data, and sanitising any externally supplied data
needs a little more care than simply trying to wrap it in htmlentities which I
think is what Stas is saying? Personally I try to avoid any path where input can
be processed directly back to output: filter the input, don't simply try to patch
the output.
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Hi Lester,
Yasuo Ohgaki wrote:
I'm lost here.
OWASP suggests to escape at least
& --> &amp;
< --> &lt;
> --> &gt;
" --> &quot;
' --> &#x27;     &apos; not recommended because its not in the HTML spec
(See: section 24.4.1) &apos; is in the XML and XHTML specs.
/ --> &#x2F;     forward slash is included as it helps end an HTML entity
https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)Prevention_Cheat_Sheet#RULE.231_-_HTML_Escape_Before_Inserting_Untrusted_Data_into_HTML_Element_Content
I'm not sure why you state "already violate this requirement".
It may be that what you are asking for is a flag on htmlentities for 'OWASP'
compliant option. Others would probably view that as not then being html5
compliant since html5 has it's own list of 'escaped' characters. One of the
irritating things I find is 'unescaping' a string does not return the
original string simply because the html5 rule has not been followed! A clean
html5 result should be the default.
OWASP compliance focuses on the special characters which are the same
regardless of HTML spec. What is output MAY differ which is why it
suggests something like hex encoding where differences between specs
exist.
Looking at the Rule 2 from the OWASP they are actually asking for every
character below 256 to be escaped when used in an attribute! But the
important thing here is 'untrusted' data, and sanitising any externally
supplied data needs a little more care than simply trying to wrap it in
htmlentities which I think is what Stas is saying? Personally I try to avoid
any path where input can be processed direct back to output, filter the
input, don't simply try and patch the output?
It's not a question of validating/filtering input. Handling input
gets it into the application where Mystery Process 1 - Infinity are
performed. Who knows what these Mystery Processes do? I don't - I'm
not writing everyone's application for them! They could be grabbing
data, transforming it, reading from the database, using a Composer
package replaced en route by the NSA, etc. Ergo, we escape on output
to HTML/JSON at all times and without exception. The same way we
escape on output and without exception when the output target is a
database. Input and Output are like borders - nobody gets across them
without a customs check. It may seem unnecessary at times but that's
because most of the point is to be consistent to a fault to eliminate the
risk of any errors in those Mystery Processes and to guarantee that
the correct escaping is performed - DB? JSON? HTML? XML? RPC? Command
Line?
Also helps not having to dissect every single application route just
to figure out every input's output encoding... That just drives me
nuts, and I have seen it. It's easy to forget sometimes that other
people have to maintain and audit your applications at times, so go
easy on them!
Paddy
--
Pádraic Brady
http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team
Zend Framework PHP-FIG Representative
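The "escape on output, per target" rule from the message above can be sketched with one small function per output context. The names html_out and json_out are made up for this example; a database target would normally use parameterised queries, which are left out here to keep the sketch self-contained.

```php
<?php
// Hypothetical per-context output helpers; the names are assumptions.

// For text placed in HTML element content or a quoted attribute value.
function html_out(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// For data embedded as JSON, for instance inside a <script> block.
// The JSON_HEX_* flags encode <, >, &, ' and " as \uXXXX sequences.
function json_out($value): string
{
    return json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}

$untrusted = '</script>';
echo html_out($untrusted), "\n"; // prints: &lt;/script&gt;
echo json_out($untrusted), "\n"; // prints: "\u003C\/script\u003E"
```

The same value is encoded differently depending on where it is going, which is why a single function cannot serve every context.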
Hi Lester,
Yasuo Ohgaki wrote:
I'm lost here.
OWASP suggests to escape at least
& --> &amp;
< --> &lt;
> --> &gt;
" --> &quot;
' --> &#x27;     &apos; not recommended because its not in the HTML spec
(See: section 24.4.1) &apos; is in the XML and XHTML specs.
/ --> &#x2F;     forward slash is included as it helps end an HTML entity
https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)Prevention_Cheat_Sheet#RULE.231_-_HTML_Escape_Before_Inserting_Untrusted_Data_into_HTML_Element_Content
I'm not sure why you state "already violate this requirement".
It may be that what you are asking for is a flag on htmlentities for
'OWASP' compliant option. Others would probably view that as not then being
html5 compliant since html5 has it's own list of 'escaped' characters. One
of the irritating things I find is 'unescaping' a string does not return
the original string simply because the html5 rule has not been followed! A
clean html5 result should be the default.
Looking at the Rule 2 from the OWASP they are actually asking for every
character below 256 to be escaped when used in an attribute! But the
important thing here is 'untrusted' data, and sanitising any externally
supplied data needs a little more care than simply trying to wrap it in
htmlentities which I think is what Stas is saying? Personally I try to
avoid any path where input can be processed direct back to output, filter
the input, don't simply try and patch the output?
I suggest "Validate all inputs to check they meet spec" and "Escape/Use secure
and proper API".
Currently PHP lacks these APIs. JavaScript literal escape is one of them.
As you can see in my blog, escaping is not a simple task. There should be
APIs for it.
I'm willing to address this issue and there is an RFC for it already.
https://wiki.php.net/rfc/escaper
I just don't have time for it :)
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Stas,
Hi!
Some users have to conform to standards like PCI DSS.
PCI DSS requires following security standards and guidelines from OWASP,
SANS, etc.
Why not make PHP standard-compliant?
It does not hurt existing applications at all, and it is a simple enough
change.
I'm sorry, could you please quote me a standard that requires PHP to
escape / in a function called htmlentities? If there's no such standard,
the argument of "but the standard requires it" is void. No standard can
require you to use htmlentities where it should not be used. Putting
stuff into the language just because somebody on the internet mentioned in
a different context that it might be a good idea - is not. We should
understand why it is done and why it is a good idea, especially when
we're talking about security. In this case, the proposed use case should
never be used with htmlentities, due to obvious gaping security hole.
Adding code to enable such scenario is just not right. Instead, we
should tell people "Never ever do it. Ever.".
There is far too much going on here...
- Bear in mind that htmlentities() and htmlspecialchars() are
equivalent for HTML special characters.
- PCI DSS is a real standard that real people apply in real
applications. You can google it if you don't believe me. It's at
version 3.0.
- PCI DSS specifically notes OWASP guides as a source of best
practice as part of Requirement 6 (which covers XSS among other
things).
- The OWASP guide for XSS mentions escaping the forward slash.
- We do not currently escape the forward slash.
While I'm dubious about forward slash escaping myself and think it
might have been OWASP veering into overkill, it doesn't change the
fact that Yasuo's argument is perfectly sound. Nor does it change the
state of single quote escaping which is very obviously out of sync with
best practice.
Ignoring all this is tantamount to telling people not to use
htmlspecialchars() or htmlentities() at all. As in, never! Which,
coincidentally, is almost the current best practice in PHP where we
wrap those functions for the sake of insulating ourselves from its
more colourful behaviours. Yep, the standard is not to use these
functions as-is. It would be nice to see them fixed or replaced, but
I'm not holding my breath.
Paddy
--
Pádraic Brady
http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team
Zend Framework PHP-FIG Representative
Hi Padraic,
On Tue, Feb 4, 2014 at 7:31 AM, Pádraic Brady padraic.brady@gmail.com wrote:
While I'm dubious about forward slash escaping myself and think it
might have been OWASP veering into overkill,
Yes, they are. They are very conservative about security.
For example, they suggest escaping almost all characters by applying
HEX escapes for JavaScript string literals.
It may be too much, but I'm sure it's more secure.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi all,
On Tue, Feb 4, 2014 at 7:31 AM, Pádraic Brady padraic.brady@gmail.com wrote:
While I'm dubious about forward slash escaping myself and think it
might have been OWASP veering into overkill,
Yes, they are. They are very conservative about security.
For example, they suggest escaping almost all characters by applying
HEX escapes for JavaScript string literals.
It may be too much, but I'm sure it's more secure.
If anyone is interested in a PHP implementation of the JavaScript string
literal escaping that OWASP suggests, I wrote a blog post about it:
http://blog.ohgaki.net/javascript-string-escape
It's written in Japanese. You should be able to recognize the PHP code, and
you can use Google Translate or similar if you would like to read the content.
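For readers who do not follow the link, here is a rough sketch of the idea. It is not the code from the blog post. The function name js_string_hex_escape() is hypothetical, and it assumes that bytes 0x80 and above (UTF-8 multibyte sequences) can pass through untouched, which is a simplification of the OWASP rule.

```php
<?php
// Hypothetical helper; a sketch of hex-escaping for JS string literals.
function js_string_hex_escape(string $value): string
{
    // Replace every ASCII byte that is not alphanumeric with \xHH.
    // Assumption: bytes >= 0x80 are left as they are.
    return preg_replace_callback(
        '/[^A-Za-z0-9\x80-\xFF]/',
        function (array $m): string {
            return sprintf('\\x%02X', ord($m[0]));
        },
        $value
    );
}

$js = js_string_hex_escape("'</script>");
echo $js, "\n"; // \x27\x3C\x2Fscript\x3E
```

The result is meant to sit inside a quoted JavaScript string literal; it still needs HTML escaping if that literal ends up in an HTML attribute.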
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi,
Some thoughts on this. So far as I see it, the point of adding /
escaping would be for unquoted attribute values. However, there's
nothing we can really do for unquoted attribute values - the moment
someone adds a space, they're broken. Unquoted attribute values being
legal in HTML5 is, IMO, a good thing. They're nice when authoring HTML,
as this:
<input type="text" name="username" id="username"
placeholder="Username" />
Can become this:
<input type=text name=username id=username placeholder=Username>
However, you should never be putting user input as an attribute value
without quotes around it. I don't think adding / here is a good idea.
With or without it, code which does <input foo=<?=$bar?> /> will be
vulnerable anyway. But if we add it, people might think doing that is
safe, which would be worse.
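To illustrate, here is a small sketch with a made-up input value. The input contains none of the characters that htmlspecialchars() escapes and no slash either, so neither the current behaviour nor the proposed '/' escape changes it.

```php
<?php
// Hypothetical user input: no <, >, &, quotes or slash, only a space.
$bar = 'x onmouseover=alert(1)';

$escaped = htmlspecialchars($bar, ENT_QUOTES | ENT_HTML401, 'UTF-8');
// Apply the proposed '/' escape as well; it has nothing to replace here.
$escaped = str_replace('/', '&#x2F;', $escaped);

$unquoted = '<input foo=' . $escaped . ' />';
$quoted   = '<input foo="' . $escaped . '" />';

echo $unquoted, "\n"; // <input foo=x onmouseover=alert(1) />
echo $quoted, "\n";   // <input foo="x onmouseover=alert(1)" />
```

In the unquoted case the space ends the foo value and onmouseover becomes a separate attribute. In the quoted case the whole string stays inside foo.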
Andrea Faulds
http://ajf.me/
Hi,
Doing a bit of due diligence, the reason the forward slash was added
was to prevent any possibility of someone introducing a Javascript
comment into an attribute. It's very sketchy, but the theory is that
since html escaping won't escape /, using html escaping on a
javascript attribute like onmouseover might allow the browser to
interpret a comment, disregard the terminating quotes of an attribute,
and then inject HTML. Personally, it seems a bit garbled - user input
in a Javascript attribute should be escaped as a Javascript string
literal with no input injected as actual Javascript code - so it would
require both a severe browser parsing issue AND a lack of proper
contextual escaping.
Paddy
Hi Padraic,
On Wed, Feb 5, 2014 at 6:22 AM, Pádraic Brady padraic.brady@gmail.com wrote:
Doing a bit of due diligence, the reason the forward slash was added
was to prevent any possibility of someone introducing a Javascript
comment into an attribute. It's very sketchy, but the theory is that
since html escaping won't escape /, using html escaping on a
javascript attribute like onmouseover might allow the browser to
interpret a comment, disregard the terminating quotes of an attribute,
and then inject HTML. Personally, it seems a bit garbled - user input
in a Javascript attribute should be escaped as a Javascript string
literal with no input injected as actual Javascript code - so it would
require both a severe browser parsing issue AND a lack of proper
contextual escaping.
It depends on how the HTML parser parses HTML.
<tag onmouseover="user_code_here; /*"><tag foo="*/; evil_code_here;">
<tag onmouseover=user_code_here;/* ><tag foo=*/; evil_code_here; >
If the parser treats "/*" as the start of a JS comment, then it can be attacked.
Escaping '/' would prevent this kind of attack when the user has sloppy
validation and/or filtering.
(An HTML parser should parse an HTML document as HTML first and should not
recognize JS elements or anything else, but who knows whether every parser
out there is implemented correctly?)
I've added deprecation of ENT_COMPAT/ENT_QUOTES to the RFC.
Is it ready to vote?
No more issues to discuss?
Anyone?
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi Yasuo,
Is it ready to vote?
No more issues to discuss?
Anyone?
I would remove all mention of htmlentities() other than briefly noting
that it would also be changed as part of the RFC. The rationale is
that the proper escaping function for HTML is htmlspecialchars() (so
emphasise that function). htmlentities() escapes anything with a
suitable HTML entity including non-special UTF-8 characters, i.e. it's
overkill and it disproportionately increases output size in
non-English languages such as Gaelic. The use of htmlentities() is
just a senseless bad habit by English speaking programmers based on
historic ties to non-Unicode output that needs to die already:
http://stackoverflow.com/questions/12648655/html-encoding-of-japanese-text
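A quick sketch of the difference, using a Gaelic word as sample input. The string is spelled with explicit UTF-8 bytes so the example does not depend on the file encoding, and the flags are passed explicitly rather than relying on defaults.

```php
<?php
// "Fáilte": the á is written as its two UTF-8 bytes.
$text = "F\xC3\xA1ilte";

$special  = htmlspecialchars($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');
$entities = htmlentities($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');

echo $special, "\n";  // unchanged: there is nothing HTML-special in it
echo $entities, "\n"; // F&aacute;ilte

// Byte lengths: 7 for htmlspecialchars(), 13 for htmlentities().
echo strlen($special), ' vs ', strlen($entities), "\n";
```

With UTF-8 output the named entity buys nothing; it only makes the markup longer.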
I would also split the vote into three sections:
- Should we escape single quotes by default?
- Should we escape forward slashes by default?
- Should we deprecate ENT_COMPAT and ENT_QUOTES?
The main risk I'd see is if people don't want to escape forward slash
and kill the entire RFC over that one change.
You also don't mention single quotes anywhere in the RFC ;). You
should note that with an example so voters know it will be encoded by
default.
Paddy
--
Pádraic Brady
http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team
Zend Framework PHP-FIG Representative