Resolution for var_export()/addslashes() encoding-based script execution attack?

11 years ago by Yasuo Ohgaki

Hi all,

Since this RFC is declined,

https://wiki.php.net/rfc/multibyte_char_handling

We need another short term resolution for it at least.
Any suggestions?

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Nikita Popov

Hi all,

Since this RFC is declined,

https://wiki.php.net/rfc/multibyte_char_handling

We need another short term resolution for it at least.
Any suggestions?

Quoting from another thread:

I'd like to start off by saying that I disagree with your premise that
this is a security vulnerability that needs to be fixed quickly and across
all supported versions. As far as I can see the issue is somebody using
addslashes() in an inappropriate context - this is a vulnerability in the
application, not PHP. This is a lot like saying that we have an RCE
vulnerability in eval() because someone had the genius idea of putting
eval($_GET['str']) in his or her code.

There is no vulnerability here as far as PHP is concerned. As such there is
no need for a short term resolution.

Nikita

11 years ago by Yasuo Ohgaki

Hi Nikita,

Hi all,

Since this RFC is declined,

https://wiki.php.net/rfc/multibyte_char_handling

We need another short term resolution for it at least.
Any suggestions?

Quoting from another thread:

I'd like to start off by saying that I disagree with your premise that
this is a security vulnerability that needs to be fixed quickly and across
all supported versions. As far as I can see the issue is somebody using
addslashes() in an inappropriate context - this is a vulnerability in the
application, not PHP. This is a lot like saying that we have an RCE
vulnerability in eval() because someone had the genius idea of putting
eval($_GET['str']) in his or her code.

There is no vulnerability here as far as PHP is concerned. As such there
is no need for a short term resolution.

This kind of bug is considered a vulnerability. There are a number of
them. Examples are encoding-based JavaScript/SQL injections. I'm not sure
what you mean by "not a vulnerability".

using addslashes() in an inappropriate context - this is a vulnerability
in the application, not PHP.

This is incorrect. PHP supports SJIS and others even in engine.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Stas Malyshev

Hi!

This is incorrect. PHP supports SJIS and others even in engine.

I am not sure what kind of support you're referring to, what support the
engine has for SJIS? Do you mean input filters? Those are just to read
script code, AFAIK, aren't they?

I think we need to face the fact that the scenario you are proposing is
at least perceived as a rare case which is encountered in a rare
environment and is enabled by shoddy code. Thus, proposals for
global-level engine changes for it are unlikely to gain consensus.

Looking at the RFC, it makes claims of addslashes etc. being insecure,
and other functions being insecure and unreliable, without giving proper
examples and explanations of the context. I have a feeling that once
people learn the context, they feel the claims in the RFC are
overreaching and the solution proposed makes too many changes for the
case that is perceived as too rare and need sloppy coding to enable.

I would suggest to go back to the use case here and consider this:

1. Do we have a use case that is hard to handle in user's code?
2. What exactly makes it hard to handle - which capabilities for the
   user are missing? Is it access to certain information, certain
   algorithms, lack of knowledge?
3. What is the minimal set of actions we'd have to take to make it
   easier? What is the set of actions that would cover 80% of user's needs?

I think once we have that, we have better chance at arriving at some
resolution that would be more acceptable.

Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

11 years ago by Yasuo Ohgaki

Hi Stas,

On Tue, Feb 25, 2014 at 3:47 PM, Stas Malyshev smalyshev@sugarcrm.com wrote:

This is incorrect. PHP supports SJIS and others even in engine.

I am not sure what kind of support you're referring to, what support the
engine has for SJIS? Do you mean input filters? Those are just to read
script code, AFAIK, aren't they?

Engine supports SJIS/other encoding script execution. Therefore,
addslashes()/var_export() behavior is security vulnerability just like
encoding based SQL/JavaScript injections.

Databases/browsers fixed these issues as vulnerability.

I think we need to face the fact that the scenario you are proposing is
at least perceived as a rare case which is encountered in rare
environment and is enabled by a shoddy code. Thus, proposing
global-level engine changes for it are unlikely to gain consensus.

Looking at the RFC, it makes claims of addslashes etc. being insecure,
and other functions being insecure and unreliable, without giving proper
examples and explanations of the context. I have a feeling that once
people learn the context, they feel the claims in the RFC are
overreaching and the solution proposed makes too many changes for the
case that is perceived as too rare and need sloppy coding to enable.

A knowledgeable attacker would figure out how, though. I don't see the
point of disclosing details of a vulnerability in public before fixing it.
That is not the proper way to do it. (I might have posted details to the
ML, though.)

I would suggest to go back to the use case here and consider this:

Do we have a use case that is hard to handle in user's code?

Writing their own var_export() is not a simple task.
In contrast, fixing them in PHP is easy by using mbstring.

What exactly makes it hard to handle - which capabilities for the
user are missing? Is it access to certain information, certain
algorithms, lack of knowledge?

Same as 1. Writing their own var_export() is not a simple task.
addslashes()/var_export() do not recognize char encoding, just like old
database escape functions.

What is the minimal set of actions we'd have to take to make it
easier? What is the set of actions that would cover 80% of user's needs?

Ideally, modifying addslashes()/var_export() is best, as users don't
have to change function names and no new functions are needed.

Adding functions is the second option. It is close to modifying
addslashes()/var_export(), since it requires adding an extra parameter.
However, there would then be two mostly identical pieces of code. The size
of the duplicated code is not small, which is not good at all. It would be
better to pass a function pointer internally, i.e. pass a function pointer
to make the escaping multibyte-aware.

I think once we have that, we have better chance at arriving at some
resolution that would be more acceptable.

Regarding the details of this issue, I just think it's not the right thing
to disclose details of a fatal vulnerability before fixing it. However, I
don't mind much disclosing details of the attack method because it is
already well known to certain people anyway.

I think I've posted enough info to security@php.net. Is it okay to post it
here?

Regards,

P.S. We are not users, but developers. We should consider only the
consequences, i.e. the number of affected users, likelihood, etc. do not
matter. That said, the resolution could vary. Even for a fatal issue, we may
choose to document it as a limitation if a resolution is not feasible.

We have two feasible resolutions for this issue, IMO: change the functions
or add new functions. Although I would not suggest it, we may also ask users
to handle it by themselves.

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Stas Malyshev

Hi!

Engine supports SJIS/other encoding script execution. Therefore,

I'm not sure what you mean by "other encoding script execution". Engine
execution is the same regardless of the data you use, nowhere in the
opcodes there's any encoding that is important.

addslashes()/var_export() behavior is security vulnerability just like
encoding based SQL/JavaScript injections.

I don't see how it is "therefore". If there's SQL injection or JS
injection that is not the result of wrong code, then let's outline them
and consider them. Specifically for var_export, which behavior is
broken? I see no mention of the specific examples in the RFC.

Databases/browsers fixed these issues as vulnerability.

Which "these issues"? Databases and browsers don't have var_export()
function.

Although, knowledgeable attacker would figure out how. I don't see the
point disclosing details of vulnerability in public before fixing it. It

We can take it to security list if you think it's too sensitive, or in
private bug. But if we have RFC that say var_export() "lacks proper
multibyte handling" but doesn't say what is expected from it, it's hard
to accept it.

Writing their own var_export() is not simple task.
In contrast, fixing them in PHP is easy by using mbstring.

Why would one need to write their own var_export? What are "them" that
are easy to fix by using mbstring?

addslashes()/var_export() do not recognize char encoding, just like old
database escape functions.

OK, this is a bit more. Why var_export needs to "recognize char
encoding" and what it means for var_export to "recognize char encoding"?

Regarding details of this issue, I just think it's not right thing to do
disclosing detail of fatal vulnerability before fixing. However, I don't

I think many people (myself included) do not see "fatal vulnerability"
being there. If you have some details and feel it's not good to disclose
it you could send it to security@php.net or privately to me or open
private bug. Since I don't know what these details are I rely here on
your judgement.

I think I've posted enough info to security@php.net. Is it okay to post it
here?

Yes, but I don't remember the details about var_export, etc. there.
Maybe I missed the email, could you forward it to me privately then?

--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

11 years ago by johannes@schlueters.de

Hi!

Engine supports SJIS/other encoding script execution. Therefore,

I'm not sure what you mean by "other encoding script execution". Engine
execution is the same regardless of the data you use, nowhere in the
opcodes there's any encoding that is important.

I think Yasuo refers to
php -d zend.script_encoding=SJIS

addslashes()/var_export() behavior is security vulnerability just like
encoding based SQL/JavaScript injections.

I don't see how it is "therefore". If there's SQL injection or JS
injection that is not the result of wrong code, then let's outline them
and consider them. Specifically for var_export, which behavior is
broken? I see no mention of the specific examples in the RFC.

I think Yasuo assumes that results from those operations with a
script_encoding setting would be handled "correctly".

I don't think we can do that, though. zend.script_encoding is a barely
documented feature which should be used with care.

The documentation of addslashes() refers to "characters". I don't think
the behavior should depend on PHP settings; like all "basic"
functions, it should assume ASCII-compatible single-byte strings. Adding
magic there makes it more confusing and harder to use. As said in a previous
discussion on this topic, rather than using addslashes() users should use
context-aware escaping functions. Using addslashes() is almost always a
bad idea.
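
To make the byte-wise behaviour concrete, here is a minimal sketch that is
not taken from the RFC or from any message in this thread. It relies only on
addslashes() escaping the bytes for ', ", \ and NUL without looking at the
character encoding, and on Shift_JIS allowing 0x5C as the second byte of a
two-byte character (0x95 0x5C is one character). How a particular consumer
of the escaped bytes decodes them is an assumption of the example.

```php
<?php
// Attacker-style input: a Shift_JIS lead byte (0x95) followed by a quote.
$input = "\x95'";

// addslashes() is byte-oriented: it only sees the quote and prefixes 0x5C.
$escaped = addslashes($input);

echo bin2hex($input), "\n";   // 9527
echo bin2hex($escaped), "\n"; // 955c27

// A byte-wise reader sees: 0x95, then an escaped quote (0x5C 0x27).
// A reader that decodes Shift_JIS first would pair 0x95 0x5C into one
// character, which leaves the trailing 0x27 as an unescaped quote.

// A legitimate two-byte character is affected as well: its trailing 0x5C
// is treated as a backslash and doubled.
$char = "\x95\x5c";
echo bin2hex(addslashes($char)), "\n"; // 955c5c
```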

The situation around var_export() is a bit more complicated.
var_export() is used to create application configuration, cache data
etc. so one might expect the PHP which created that to be able to read
that, again. Doing this isn't easy, though, as it makes the generated
file non-portable.
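
The round-trip expectation can be sketched the same way. This is an
illustrative example, not code from the thread; it assumes a default PHP
setup in which the generated code is read back byte-wise (no
zend.script_encoding in effect).

```php
<?php
// One Shift_JIS character whose second byte is 0x5C (the ASCII backslash).
$value = "\x95\x5c";

// var_export() escapes backslashes and single quotes byte by byte.
$code = var_export($value, true);
echo bin2hex($code), "\n"; // 27955c5c27

// Read back byte-wise (the default), the literal restores the original bytes.
$restored = eval('return ' . $code . ';');
var_dump($restored === $value); // bool(true)

// A reader that first decodes the file as Shift_JIS would instead group the
// bytes as 0x27, (0x95 0x5C), 0x5C 0x27: the closing quote now looks escaped.
```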

Adding "magic" into probably security relevant features is problematic
and unless the engine is truly encoding-aware I'd abstain from such
changes.

johannes

11 years ago by Yasuo Ohgaki

Hi Johannes,

On Tue, Feb 25, 2014 at 8:39 PM, Johannes Schlüter
johannes@schlueters.de wrote:

Hi!

Engine supports SJIS/other encoding script execution. Therefore,

I'm not sure what you mean by "other encoding script execution". Engine
execution is the same regardless of the data you use, nowhere in the
opcodes there's any encoding that is important.

I think Yasuo refers to
php -d zend.script_encoding=SJIS

Yes, that is it.
There are users who use SJIS and SJIS-like encodings for PHP scripts. This
is a nice feature for Windows users in East Asia.

addslashes()/var_export() behavior is security vulnerability just like
encoding based SQL/JavaScript injections.

I don't see how it is "therefore". If there's SQL injection or JS
injection that is not the result of wrong code, then let's outline them
and consider them. Specifically for var_export, which behavior is
broken? I see no mention of the specific examples in the RFC.

i think Yasuo assumes that results from those operations with a
script_encoding setting would be handled "correctly".

I don't think we can do that, though. zend.script_encoding is a hardly
documented feature which should be used with care.

The documentation of addslashes() refers to "characters". I don't think
the behavior should depend on PHP settings but like all "basic"
functions assume on ASCII compatible single byte strings. Adding magic
there makes it more confusing and harder to use. As said in a previous
discussion on this topic rather than using addslashes users should use
context-aware escaping functions. Using addslashes is almost always a
bad idea.

I think many users realize this thanks to magic_quotes_gpc ;) Therefore,
many users know there would be issue with addslashes(), but not for
var_export(), I suppose.

The situation around var_export() is a bit more complicated.
var_export() is used to create application configuration, cache data
etc. so one might expect the PHP which created that to be able to read
that, again. Doing this isn't easy, though, as it makes the generated
file non-portable.

It would be portable as long as the user uses the correct encoding. If we
modify addslashes()/var_export(), the default would be UTF-8 for 5.6 and
ASCII for other versions. (If mb_*() functions are added instead, the
internal_encoding setting is used.) Users who are not affected by this
issue can use the default.

For affected users, all we have to do is ask them to specify the proper
encoding so that php_addslashes() does not break the structure of
var_export() data for certain encodings.

As you know, all databases' escaping functions have an encoding parameter.
PostgreSQL uses the encoding parameter stored in the db connection structure.
This is the reason why pg_escape_string() has an optional database connection
parameter for escaping.

Adding "magic" into probably security relevant features is problematic
and unless the engine is truly encoding-aware I'd abstain from such
changes.

I agree. For example, the pgsql/mysql/mysqli modules do some magic by
assuming that the default connection's encoding parameter can be used to
escape strings properly. If users are using multiple connections with
different encodings, it may cause problems. It would almost never cause a
problem, though, because the chances that users are using multiple
encodings at once are small.

Anyway, if we add mb_*() functions or modify the existing functions, the
user has to specify the correct encoding when it differs from the default
encoding ("default_charset" for 5.6 and up, mb_internal_encoding for older
versions).

Although there would be code duplication, adding mb_*() functions seems to
be the best choice. The duplication may be removed with a few changes in the
API. This cannot happen in released versions, but it is possible for 5.6. I
should write this in the RFC.
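
As a rough sketch of what an encoding-aware variant could look like in
userland: the function name sketch_mb_addslashes() is hypothetical (it is
neither an existing PHP function nor the RFC's implementation), and the
example assumes ext/mbstring with mb_str_split() (PHP 7.4 or later). Input
that is not valid in the given encoding is rejected rather than escaped.

```php
<?php
// Hypothetical helper: escapes the same characters as addslashes(), but walks
// the string character by character in the caller-supplied encoding.
function sketch_mb_addslashes(string $str, string $encoding): string
{
    // Refuse malformed input instead of guessing how to escape it.
    if (!mb_check_encoding($str, $encoding)) {
        throw new InvalidArgumentException("Invalid $encoding input");
    }

    $out = '';
    foreach (mb_str_split($str, 1, $encoding) as $char) {
        if ($char === "\0") {
            $out .= '\\0';          // addslashes() writes NUL as backslash + "0"
        } elseif ($char === "'" || $char === '"' || $char === '\\') {
            $out .= '\\' . $char;   // real single-byte quote or backslash
        } else {
            $out .= $char;          // multibyte characters pass through intact
        }
    }
    return $out;
}

// 0x95 0x5C is one Shift_JIS character, so only the real quote is escaped.
echo bin2hex(sketch_mb_addslashes("\x95\x5c'", 'SJIS')), "\n"; // 955c5c27
```

For the same input, the byte-wise addslashes() would also double the 0x5C
that belongs to the two-byte character and produce 955c5c5c27.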

If this is fixed in all released versions, it's nicer for users and good
for us, but I'm not against fixing this issue only in 5.6 and warning
affected users, hoping nobody will be attacked.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Lester Caine

Yasuo Ohgaki wrote:

As you know, all databases' escaping functions have encoding parameter.
PostgreSQL uses encoding parameter stored in db connection structure. This
is the reason why pg_escape_string() has optional database base connection
parameter for escaping.

On the whole any database access I'm doing with Firebird is done using
parameters which are handled in the database connection rather than having to
worry about many of these sorts of 'protections'. The result for me is that I
don't have to worry about many of the problems the more lax handling of data in
MySQL can create. But the more important thing here is that I've not used a
'locale' other than UTF8 for websites for many years and so the whole underlying
structure needs fixing rather than trying to patch small areas that are better
handled by doing the job correctly in the first place!
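
For comparison, parameter binding as described above looks roughly like
this. The sketch uses PDO with an in-memory SQLite database purely so that
it is self-contained (an assumption of the example; the post is about
Firebird), and the table and column names are made up.

```php
<?php
// Assumes the pdo_sqlite extension; any PDO driver with real parameter
// binding would follow the same pattern.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE notes (body TEXT)');

// The value travels separately from the SQL text, so no escaping function
// (and no guess about its encoding) is involved.
$value = "it's a \\ test";
$stmt = $pdo->prepare('INSERT INTO notes (body) VALUES (:body)');
$stmt->execute([':body' => $value]);

$stored = $pdo->query('SELECT body FROM notes')->fetchColumn();
var_dump($stored === $value); // bool(true)
```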

--
Lester Caine - G8HFL

Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

11 years ago by Yasuo Ohgaki

Hi Lester,

Yasuo Ohgaki wrote:

As you know, all databases' escaping functions have encoding parameter.
PostgreSQL uses encoding parameter stored in db connection structure. This
is the reason why pg_escape_string() has optional database base connection
parameter for escaping.

On the whole any database access I'm doing with Firebird is done using
parameters which are handled in the database connection rather than having
to worry about many of these sorts of 'protections'. The result for me is
that I don't have to worry about many of the problems the more lax handling
of data in MySQL can create. But the more important thing here is that I've
not used a 'locale' other than UTF8 for websites for many years and so the
whole underlying structure needs fixing rather than trying to patch small
areas that are better handled by doing the job correctly in the first place!

We cannot force users to use Unicode for database/file/etc ;)
I'm not proposing the use of locale, but a new escape API that supports
multibyte encodings.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Lester Caine

Yasuo Ohgaki wrote:

    As you know, all databases' escaping functions have encoding parameter.
    PostgreSQL uses encoding parameter stored in db connection structure. This
    is the reason why pg_escape_string() has optional database base connection
    parameter for escaping.


On the whole any database access I'm doing with Firebird is done using
parameters which are handled in the database connection rather than having
to worry about many of these sorts of 'protections'. The result for me is
that I don't have to worry about many of the problems the more lax handling
of data in MySQL can create. But the more important thing here is that I've
not used a 'locale' other than UTF8 for websites for many years and so the
whole underlying structure needs fixing rather than trying to patch small
areas that are better handled by doing the job correctly in the first place!

We cannot force users to use Unicode for database/file/etc ;)
I'm not proposing use of locale, but new escape API that support multibyte
encoding.

My point, Yasuo, is that I think the reason this has been rejected is simply
that there needs to be a more comprehensive review of how 'multi_byte'
characters are handled? There needs to be at least some agreement moving forward
on whether PHP should be made fully UTF-8 compliant or, alternatively, 'ASCII only' for
code, so that perhaps 'mb_' SHOULD be the only way to handle multi byte strings?
Short term fixes are just exacerbating the real problem? Also I don't see how
the RFC addressed the problem anyway. While I use UTF-8 output exclusively, I've
not had to resort to the mbstring extension to do so yet, so I would not even use the
extra functions in normal production.

--
Lester Caine - G8HFL

11 years ago by Yasuo Ohgaki

Hi Lester,

Yasuo Ohgaki wrote:
As you know, all databases' escaping functions have encoding parameter.
PostgreSQL uses encoding parameter stored in db connection structure. This
is the reason why pg_escape_string() has optional database base connection
parameter for escaping.

On the whole any database access I'm doing with Firebird is done using
parameters which are handled in the database connection rather than having
to worry about many of these sorts of 'protections'. The result for me is
that I don't have to worry about many of the problems the more lax handling
of data in MySQL can create. But the more important thing here is that I've
not used a 'locale' other than UTF8 for websites for many years and so the
whole underlying structure needs fixing rather than trying to patch small
areas that are better handled by doing the job correctly in the first place!

We cannot force users to use Unicode for database/file/etc ;)
I'm not proposing use of locale, but new escape API that support multibyte
encoding.
My point Yasuo is that I think the reason this has been rejected is simply
because there needs to be a more comprehensive review of how 'multi_byte'
characters are handled? There needs to be at least some agreement moving
forward if PHP should be made fully UTF-8 compliant or the alternative
'ASCII only' for code, so that perhaps 'mb_' SHOULD be the only way to
handle multi byte strings? Short term fixes are just exacerbating the real
problem? Also I don't see how the RFC addressed the problem anyway. While I
use UTF-8 output exclusively I've not had to resort to mbstring extension
to do so yet so would not even use the extra functions in normal production.

IMO, PHP core should only support UTF-8, with a simple/fast Unicode library.
Use UTF-8 internally, allow any encoding for I/O. This would make many
users' lives easier.

so that perhaps 'mb_' SHOULD be the only way to handle multi byte strings?

I mostly agree. It would be a feasible solution for now. I don't care what
module/feature is going to handle multibyte strings, but we need multiple
encoding support for I/O related functions at least. Escape functions are
among them: escaped data is escaped in order to be output somewhere. That's
the reason why we have multiple encoding support for htmlspecialchars(). (We
could force all text to be generated in UTF-8 and then convert it when it is
output, but this is a BC break.)

We cannot force users to use "\n" as line break. Likewise, we cannot
force users to use a specific encoding for I/O. OSX filenames are another
example: we cannot force OSX to use NFC file names. There are things like these.

Short term fixes are just exacerbating the real problem?

I don't think so. It's just a simple addition to mbstring. We already have
various messes around multibyte support. E.g. when we need to know the
length of a string in multibyte characters, there is only mb_strlen(). Adding
a function that is required for security carries weight. Besides, users who
don't use mb_*() do not have to care at all. Framework developers that
support i18n should care, though.

IMHO, it is good enough as both a short- and long-term resolution. It
neither harms nor affects future PHP development at all. It's a module anyway.

I don't see how the RFC addressed the problem anyway.

Please research how databases fixed this issue many years ago. I don't
remember well, but I guess it was around 2005.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by padraic.brady@gmail.com

Hi,

I don't see how the RFC addressed the problem anyway.

Please research how databases were fixed this issue many years ago. I don't
remember well, but I guess it was around 2005.

I have a vague recollection of issues, but since there's little
specific detail on this (as it pertains to PHP) publicly it's
impossible for most of us to assess what the problem may be. It's even
stranger to see a secret security report being RFC'd publicly, with
the attendant discussions on list, which appears to go against
responsible disclosure if one can put two and two together in a Eureka
moment. It just spreads a lot of doubt and confusion to no end.

Paddy

--
Pádraic Brady

http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team
Zend Framework PHP-FIG Representative

11 years ago by Yasuo Ohgaki

Hi Padraic,

On Fri, Feb 28, 2014 at 7:22 AM, Pádraic Brady padraic.brady@gmail.com wrote:

I don't see how the RFC addressed the problem anyway.

Please research how databases were fixed this issue many years ago. I don't
remember well, but I guess it was around 2005.

I have a vague recollection of issues, but since there's little
specific detail on this (as it pertains to PHP) publicly it's
impossible for most of us to assess what the problem may be. It's even
stranger to see a secret security report being RFC'd publicly, with

Right. This kind of discussion should be done on a closed list.

the attendant discussions on list, which appears to go against
responsible disclosure if one can put two and two together in a Eureka
moment. It just spreads a lot of doubt and confusion to no end.

For the time being, I suggest looking up the details of char encoding based
SQL/JavaScript injections. The basics are the same.

Regards,

P.S. Are we really going to discuss this kind of discussion in public?
Can't we just discuss implementation?

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Pierre Joye

hi,

P.S. Are we really going to discuss this kind of discussion in public?

Yes, these issues are public anyway. The RFC and these discussions
alone provide enough information to anyone willing to know or do
more about them.

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

11 years ago by Stas Malyshev

Hi!

The situation around var_export() is a bit more complicated.
var_export() is used to create application configuration, cache data
etc. so one might expect the PHP which created that to be able to read
that, again. Doing this isn't easy, though, as it makes the generated
file non-portable.

Are you suggesting if var_export generates the data it may not be
readable by standard PHP? Or by PHP running with specific
script_encoding like SJIS? If the latter, I think var_export to generate
valid SJIS data is hard to achieve, since SJIS is not ASCII-compatible.

Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

11 years ago by Yasuo Ohgaki

Hi Stas,

On Wed, Feb 26, 2014 at 8:52 PM, Stas Malyshev smalyshev@sugarcrm.com wrote:

The situation around var_export() is a bit more complicated.
var_export() is used to create application configuration, cache data
etc. so one might expect the PHP which created that to be able to read
that, again. Doing this isn't easy, though, as it makes the generated
file non-portable.

Are you suggesting if var_export generates the data it may not be
readable by standard PHP? Or by PHP running with specific
script_encoding like SJIS? If the latter, I think var_export to generate
valid SJIS data is hard to achieve, since SJIS is not ASCII-compatible.

I think you've mailed this before reading my mail to you.

PHP supports SJIS and the like. Escape functions should provide safe
escaping like databases. The only way to solve this issue is encoding-aware
escaping, which databases adopted years ago. I'm reporting a known
vulnerability with a known method to fix it.

BTW, users who do not have to worry about this are not affected by proposed
change.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Stas Malyshev

Hi!

PHP supports SJIS and the like. Escape functions should provide safe
escaping like databases. The only way to solve this issue is encoding

You are making global claims which sound plausible on surface, but when
you put them in context, it gets much more complicated. PHP supports
SJIS, yes - just as it supports any other existing encoding. That does
not mean every scenario possible is automatically working or can be made
to work consistently. Specifically, SJIS encoding is not compatible with
ASCII encoding, which means same function with same parameters would not
work, moreover - choosing parameters which would work in every situation
may be quite problematic.

aware escape which databases adopted years ago. I'm proposing known
vulnerability with known method to fix.

Encoding-aware inside PHP means different thing than inside the
database. The database has only one encoding within the session and
doesn't have to deal with anything else. PHP has much more to deal with,
especially in scenarios you are proposing - where you are writing out
PHP scripts to be executed later. Your solution assumes one set of
environment settings and will break in another, current solution works
in one set of environment settings and will break in another. So I'm not
sure we'll be making much of an improvement there by switching set of
cases where it will break.

BTW, users who do not have to worry about this are not affected by
proposed change.

That is not the reason to accept the change. It is a necessary, but not
a sufficient condition for acceptance.

Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

11 years ago by Yasuo Ohgaki

Hi Stas,

On Sat, Mar 1, 2014 at 3:57 AM, Stas Malyshev smalyshev@sugarcrm.com wrote:

PHP supports SJIS and the like. Escape functions should provide safe
escaping like databases. The only way to solve this issue is encoding

You are making global claims which sound plausible on surface, but when
you put them in context, it gets much more complicated. PHP supports
SJIS, yes - just as it supports any other existing encoding. That does
not mean every scenario possible is automatically working or can be made
to work consistently. Specifically, SJIS encoding is not compatible with
ASCII encoding, which means same function with same parameters would not
work, moreover - choosing parameters which would work in every situation
may be quite problematic.

It seems you are trying to solve the issue in a way in which it cannot be
solved.

To eliminate encoding-based attacks, proper use of encoding is mandatory.
Leaving the encoding setting automagic would be a cause of issues. An
example is the use of locale for deciding the encoding in a programming
language. (Note: use of locale is not always a bad choice. It's good enough
for applications in many cases, but it's bad in a programming language in
general.)

There are many lessons from databases/browsers in past years. Why not adopt
them for PHP?

aware escape which databases adopted years ago. I'm proposing known
vulnerability with known method to fix.

Encoding-aware inside PHP means different thing than inside the
database. The database has only one encoding within the session and
doesn't have to deal with anything else. PHP has much more to deal with,
especially in scenarios you are proposing - where you are writing out
PHP scripts to be executed later. Your solution assumes one set of
environment settings and will break in another, current solution works
in one set of environment settings and will break in another. So I'm not
sure we'll be making much of an improvement there by switching set of
cases where it will break.

Just as database users may have multiple connections at once and should know
the encoding of each connection, PHP users can use multiple exported
PHP scripts with multiple encodings and should know the
correct encoding for each script. There is a way to handle this situation
properly now: declare(encoding='NAME');

What I would like to fix is a design bug in encoding support. There should be
encoding-aware escape/unescape if various encodings are supported;
otherwise we open a hole to attackers.

BTW, users who do not have to worry about this are not affected by
proposed change.

That is not the reason to accept the change. It is a necessary, but not
a sufficient condition for acceptance.

Necessity is clear. There is a situation in which attackers could execute their
script with a PHP-supported encoding. This is a fatal bug. The countermeasure is
also clear. Use of an encoding-aware function is mandatory. Therefore, all we
have to discuss is "implementation of encoding aware escaping/unescaping
functions".

Please note that it does not matter at all if PHP is going to support UTF-8
as internal character encoding or not. Encoding-aware escape/unescape is
mandatory regardless of internal UTF-8 support because this is an issue of
input/output, i.e. whether PHP can handle I/O properly or not.

Let's make it simple.

There is an issue (code execution by attacker).
There is a solution (encoding aware escape/unescape).
There are 2 feasible resolutions (modify existing function or add new
function).
Proper implementation may vary (this is what we should discuss).

To Pierrie,

I'll post (or create a new RFC) to explain the details of encoding-based attacks.
Knowledge about encoding-based attacks is mandatory for Unicode support
also.

Any more comments on disclosing attack details at this point?

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

11 years ago by Yasuo Ohgaki — view source — reply

unread

Hi Stas,

On Tue, Feb 25, 2014 at 8:01 PM, Stas Malyshev smalyshev@sugarcrm.com wrote:

Engine supports SJIS/other encoding script execution. Therefore,

I'm not sure what you mean by "other encoding script execution". Engine
execution is the same regardless of the data you use, nowhere in the
opcodes there's any encoding that is important.

addslashes()/var_export() behavior is security vulnerability just like
encoding based SQL/JavaScript injections.

I don't see how it is "therefore". If there's SQL injection or JS
injection that is not the result of wrong code, then let's outline them
and consider them. Specifically for var_export, which behavior is
broken? I see no mention of the specific examples in the RFC.

Databases/browsers fixed these issues as vulnerability.

Which "these issues"? Databases and browsers don't have var_export()
function.

There is a basic awareness issue. There is a class of attack referred to as
encoding-based attack. The basic principle is the same regardless of
system/program.

All injection attacks involve instructions and data. Injection
attacks are attacks that inject instructions into data, e.g. buffer overflow,
JavaScript injection, SQL injection, etc. Encoding-based attacks are used
against text interface systems that support a certain encoding with a certain
escape method. Notably, SJIS/etc and \ escape.
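
A minimal sketch of the general principle, using the widely documented Shift_JIS case. It is written in Python for illustration only; `bytewise_addslashes` is a hypothetical, simplified stand-in for a byte-oriented escape and is not PHP's actual implementation. It shows how a lead byte followed by a quote can absorb the backslash that the escape inserts.

```python
# Hypothetical, simplified stand-in for a byte-oriented escape function:
# it prefixes ', " and \ with a backslash without knowing the encoding.
def bytewise_addslashes(data: bytes) -> bytes:
    out = bytearray()
    for b in data:
        if b in (0x27, 0x22, 0x5C):  # ' " \
            out.append(0x5C)
        out.append(b)
    return bytes(out)

# 0x95 is a Shift_JIS lead byte; the quote after it gets a backslash.
raw = b"\x95" + b"' tail"
escaped = bytewise_addslashes(raw)

# A Shift_JIS-aware reader pairs 0x95 with the inserted 0x5C as one
# character, so the quote that follows is no longer escaped.
print(escaped)
print(escaped.decode("shift_jis"))
```

The escape is correct at the byte level and wrong at the character level, which is why the consumer's encoding matters.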

Although, knowledgeable attacker would figure out how. I don't see the
point disclosing details of vulnerability in public before fixing it. It

We can take it to security list if you think it's too sensitive, or in
private bug. But if we have RFC that say var_export() "lacks proper
multibyte handling" but doesn't say what is expected from it, it's hard
to accept it.

I think you've missed my mails in security@php.net.

Writing their own var_export() is not simple task.

In contrast, fixing them in PHP is easy by using mbstring.

Why would one need to write their own var_export? What are "them" that
are easy to fix by using mbstring?

var_export() and other functions listed in the RFC.

addslashes()/var_export() do not recognize char encoding, just like old
database escape functions.

OK, this is a bit more. Why var_export needs to "recognize char
encoding" and what it means for var_export to "recognize char encoding"?

var_export() and others have to recognize char encoding to perform escaping
properly.
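
One possible reading of "recognize char encoding", sketched in Python for illustration. This is an assumption about the general approach, not the RFC's implementation, and `escape_encoding_aware` is a hypothetical name. The idea is to decode with the declared encoding first (rejecting malformed input), escape whole characters, then re-encode.

```python
# Hypothetical sketch: escape at the character level, not the byte level.
def escape_encoding_aware(data: bytes, encoding: str) -> bytes:
    # Strict decoding raises UnicodeDecodeError on malformed sequences,
    # e.g. a Shift_JIS lead byte followed by an invalid trail byte.
    text = data.decode(encoding)
    # Escape backslashes first so inserted backslashes are not doubled.
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return escaped.encode(encoding)

# A valid two-byte character whose trail byte is 0x5C is left untouched.
print(escape_encoding_aware(b"\x95\x5c", "shift_jis"))
# Plain ASCII input is escaped as usual.
print(escape_encoding_aware(b"it's", "shift_jis"))
```

The output is only safe if whatever reads it later uses the same encoding, which is the environment dependency raised elsewhere in this thread.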

Regarding details of this issue, I just think it's not right thing to do
disclosing detail of fatal vulnerability before fixing. However, I don't

I think many people (myself included) do not see "fatal vulnerability"
being there. If you have some details and feel it's not good to disclose
it you could send it to security@php.net or privately to me or open
private bug. Since I don't know what these details are I rely here on
your judgement.

If attacker-provided script execution is not fatal, what would be?

I think I've posted enough info to security@php.net. Is it okay to post it here?

Yes, but I don't remember the details about var_export, etc. there.
Maybe I missed the email, could you forward it to me privately then?

No problem. I'll find and send it later.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net

I think once we have that, we have better chance at arriving at some
resolution that would be more acceptable.

--
Lester Caine - G8HFL

Are you suggesting if var_export generates the data it may not be
readable by standard PHP? Or by PHP running with specific
script_encoding like SJIS? If the latter, I think var_export to generate
valid SJIS data is hard to achieve, since SJIS is not ASCII-compatible.

That is not the reason to accept the change. It is a necessary, but not
a sufficient condition for acceptance.