Hi all,
escapeshellargs()/fgetcsv() have multibyte char support via mblen() which
depends on LC_TYPE, but addslashes()
don't take into count MBCS at all.
addslashes()
could behave like escapeshellargs(). Current addslashes()
is not usable with SJIS/BIG5/etc.
Depending on locale is problematic. Is it safe with multi threaded SAPI?
Using mbstring feature and introduce encoding php.ini setting might be
better.
Any comment making addslashes()
locale aware? and/or current
escapeshellargs()/fgetcsv() implementation?
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi all,
escapeshellargs()/fgetcsv() have multibyte char support via mblen() which
depends on LC_TYPE, butaddslashes()
don't take into count MBCS at all.
addslashes()
could behave like escapeshellargs(). Currentaddslashes()
is not usable with SJIS/BIG5/etc.Depending on locale is problematic. Is it safe with multi threaded SAPI?
Using mbstring feature and introduce encoding php.ini setting might be
better.Any comment making
addslashes()
locale aware? and/or current
escapeshellargs()/fgetcsv() implementation?
I don't think locale based MBCS support is optimum, but I'll add it to
addslashes()
for now.
Question is where should I start?
It's security issue for certain char encodings such as SJIS/BIG5.
I'll fix php_addslashes(). Therefore, any functions that use it internally
are affected. e.g. var_export()
, etc. Users are not affected as long as
they are using correct locale.
Should I fix this from 5.3?
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
I don't think locale based MBCS support is optimum, but I'll add it to
addslashes()
for now.
I don't think basing behaviour on the locale is a great idea (as
evidenced by the various issues with Turkish and Azeri over the
years). Could we just add an explicit encoding parameter like
htmlspecialchars()
?
Question is where should I start?
It's security issue for certain char encodings such as SJIS/BIG5.
Is there a case other than database access (where we've been directing
users to better APIs like PDO and mysqli for several years, at least)
where this is likely to cause security issues?
I'll fix php_addslashes(). Therefore, any functions that use it internally
are affected. e.g.var_export()
, etc. Users are not affected as long as
they are using correct locale.Should I fix this from 5.3?
This feels like it has the potential to be a really nasty implicit BC
break. I don't think we'd want to change default behaviour on any
stable branch, honestly.
Adam
I don't think locale based MBCS support is optimum, but I'll add it to
addslashes()
for now.I don't think basing behaviour on the locale is a great idea (as
evidenced by the various issues with Turkish and Azeri over the
years). Could we just add an explicit encoding parameter like
htmlspecialchars()
?
That's an option, but it requires a lot of work w/o mbstring being a
default "compiled" module.
Question is where should I start?
It's security issue for certain char encodings such as SJIS/BIG5.Is there a case other than database access (where we've been directing
users to better APIs like PDO and mysqli for several years, at least)
where this is likely to cause security issues?
When users are exporting PHP variables that are supposed to be parsed
as PHP (or PHP data, e.g. serialize()
), addslashes()
ed SJIS/BIG5/etc will
break string quoting with certain chars.
This can be used to execute attacker provided PHP code.
I'll fix php_addslashes(). Therefore, any functions that use it
internally
are affected. e.g.var_export()
, etc. Users are not affected as long as
they are using correct locale.Should I fix this from 5.3?
This feels like it has the potential to be a really nasty implicit BC
break. I don't think we'd want to change default behaviour on any
stable branch, honestly.
I cannot object this argument.
There may be unexpected side effects.
However, users will not be affected as long as they are using correct
locale if locale system is not broken...
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
I'll fix php_addslashes(). Therefore, any functions that use it
internallyare affected. e.g.
var_export()
, etc. Users are not affected as long as
they are using correct locale.Should I fix this from 5.3?
This feels like it has the potential to be a really nasty implicit BC
break. I don't think we'd want to change default behaviour on any
stable branch, honestly.I cannot object this argument.
There may be unexpected side effects.
However, users will not be affected as long as they are using correct
locale if locale system is not broken...
But honestly, how many users will set their locale to SJIS/BIG5 when
dealing with multibyte charsets like those? I typically never touch locales
when dealing with code that handles multibyte charsets (and that's becoming
more and more often these days), because it rarely makes sense. It makes a
lot more sense to me to deal with those issues on the intl/mb_* level and
as for serialization I hardly think that serialize()
with addslashes is a
problem PHP needs to fix. More likely the correct and better fix is for
those developers to be more charset-conscious while making decisions about
how to handle serialized data after transport.
On Tue, Dec 17, 2013 at 7:13 AM, Sherif Ramadan theanomaly.is@gmail.comwrote:
I'll fix php_addslashes(). Therefore, any functions that use it
internallyare affected. e.g.
var_export()
, etc. Users are not affected as long
as
they are using correct locale.Should I fix this from 5.3?
This feels like it has the potential to be a really nasty implicit BC
break. I don't think we'd want to change default behaviour on any
stable branch, honestly.I cannot object this argument.
There may be unexpected side effects.
However, users will not be affected as long as they are using correct
locale if locale system is not broken...But honestly, how many users will set their locale to SJIS/BIG5 when
dealing with multibyte charsets like those? I typically never touch locales
when dealing with code that handles multibyte charsets (and that's becoming
more and more often these days), because it rarely makes sense. It makes a
lot more sense to me to deal with those issues on the intl/mb_* level and
as for serialization I hardly think thatserialize()
with addslashes is a
problem PHP needs to fix. More likely the correct and better fix is for
those developers to be more charset-conscious while making decisions about
how to handle serialized data after transport.
There are many. SJIS, BIG5 and similar char encoding is used widely in East
Asia. For instance, Windows uses SJIS as default char encoding for Japanese
edition.
I bet you never touch locale since you don't have to. I suppose those who
have never touch locale will not be affected by this change.
The issue here is unreliable locale system for certain locales... Locale is
system dependent and we cannot do much for it. However, code uses locale is
in PHP already. e.g. escapeshellcmd/escapeshellarg/fgetcsv or like.
That's the reason why I decided to fix this with locale even if use of
locale is not optimum..
Alternative solution is to use mbstring.
It may be 5.6 only, but I don't mind much with this solution. In fact, I
prefer this ;)
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi all,
The issue here is unreliable locale system for certain locales... Locale
is system dependent and we cannot do much for it. However, code uses locale
is in PHP already. e.g. escapeshellcmd/escapeshellarg/fgetcsv or like.
To be precise, users who never use zend multibyte and script encoding will
not have security issue.
User who never care about locale will not be affected by this change.
So affected users base would be limited.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net