It has come to my attention that it's possible to abuse some
autoloader by trying to autoload class names like
"Foo\Bar....\uploads\exploit.php" for example, which could be
transformed to src/Foo/Bar/../../uploads/exploits.php, and then would
require that file.
Obviously the code would most likely end up in fatal error because the
class is missing, but the file still gets required. This would be
possible if for some reason you use unvalidated user input to create
class names which I know I have done myself in APIs to deserialize
POSTed data to a class Foo\X or Foo\Y for example.
There are different ways to handle this, but the way I see it it would
be best handled at the php core level by simply preventing classes
containing dots (and possibly null bytes and other chars?) from ever
reaching the userland autoloaders. The other alternative is for every
autoloader out there to protect itself, inducing perf penalties and
having most autoloaders being vulnerable. It's a slim vector perhaps
but a valid one nonetheless, and I think it can be fixed at the source
without any BC issue since the only way to use classes with dots right
now is to use stuff like class_alias()
to define them and then use
them through strings/var names consistently, which sounds so painful I
can't imagine why one would have ever done so.
Cheers
It has come to my attention that it's possible to abuse some
autoloader by trying to autoload class names like
"Foo\Bar....\uploads\exploit.php" for example, which could be
transformed to src/Foo/Bar/../../uploads/exploits.php, and then would
require that file.
Are there codepaths where this can be injected from the outside? In the
unserializer this was prevented in
https://github.com/php/php-src/commit/ff8055fc5c9750482aac7a25a074aae0b1e64706 and some further commits, i.e.
https://github.com/php/php-src/commit/7126de4912d9d4c7499deb1f9239980400aa7ec7#diff-d697fc054b607bb0ffd7493daeb6a1afR616
Reasons against moving those in a more central place were
- Performance. checking always costs time (any autoloader is
way slower, shouldn't matter too much) - There are more esoteric ways (mostly for extensions) to create
classes with "illegal" names (only illegal class names I know are
OCI-Lob and OCI-Collection from oci8 extension) an autoloader
(inside an extension) might produce them (I think that is unlikely
to exist in reality) - Autoloader magic (developers sometimes do things I can't imagine)
- Autoloaders still have to verify (i.e. maximum length for the backend
used, some backends might require additional protection (i.e. \ needs
to be escaped))
I didn't consider the reasons to be strong, and I still don't. But maybe
I missed something.
For whitelisting "valid" classnames are described by
[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff\]*
in the parser.
johannes
On Thu, Oct 17, 2013 at 2:31 PM, Johannes Schlüter
johannes@schlueters.de wrote:
Are there codepaths where this can be injected from the outside? In the
unserializer this was prevented in
https://github.com/php/php-src/commit/ff8055fc5c9750482aac7a25a074aae0b1e64706 and some further commits, i.e.
https://github.com/php/php-src/commit/7126de4912d9d4c7499deb1f9239980400aa7ec7#diff-d697fc054b607bb0ffd7493daeb6a1afR616
Sorry when I said deserialize I meant this as a generic term not php's
unserialize()
. I mean say you have two types of something, and the API
works taking a type param + some data for it, and you create it with
something along those lines:
$class = 'Foo\Bar\Type'.$userGivenType;
$obj = new $class($userData);
In this case, the autoloader would be called with whatever the user
passed in unless you validate it, but since autoloader injection isn't
a very common vector many devs wouldn't validate this and assume a
happy path I guess.
Reasons against moving those in a more central place were
- Performance. checking always costs time (any autoloader is
way slower, shouldn't matter too much)- There are more esoteric ways (mostly for extensions) to create
classes with "illegal" names (only illegal class names I know are
OCI-Lob and OCI-Collection from oci8 extension) an autoloader
(inside an extension) might produce them (I think that is unlikely
to exist in reality)- Autoloader magic (developers sometimes do things I can't imagine)
- Autoloaders still have to verify (i.e. maximum length for the backend
used, some backends might require additional protection (i.e. \ needs
to be escaped))
IMO performance is the only valid concern, but it'll be faster done
once in the proper way in C than done in various crappy ways in every
userland autoloader, and it will also protect everyone at once. There
might be edge cases not covered by invalidating "." and "NUL", but to
keep it fast it may be best to only check for known bad chars instead
of the full regex?
Cheers
On Thu, Oct 17, 2013 at 2:31 PM, Johannes Schlüter
johannes@schlueters.de wrote:Are there codepaths where this can be injected from the outside? In the
unserializer this was prevented in
https://github.com/php/php-src/commit/ff8055fc5c9750482aac7a25a074aae0b1e64706 and some further commits, i.e.
https://github.com/php/php-src/commit/7126de4912d9d4c7499deb1f9239980400aa7ec7#diff-d697fc054b607bb0ffd7493daeb6a1afR616Sorry when I said deserialize I meant this as a generic term not php's
unserialize()
. I mean say you have two types of something, and the API
works taking a type param + some data for it, and you create it with
something along those lines:$class = 'Foo\Bar\Type'.$userGivenType;
$obj = new $class($userData);In this case, the autoloader would be called with whatever the user
passed in unless you validate it, but since autoloader injection isn't
a very common vector many devs wouldn't validate this and assume a
happy path I guess.
I know new $class() and other ways, I was more curious about the real
security impact -- usage of unchecked class names has quite a few
security implications even without autoloader. Are there "common" ways
about how unsafe class names might be injected?
An attacker could use that to trigger "bad" constructor behavior or
such.
IMO performance is the only valid concern, but it'll be faster done
once in the proper way in C than done in various crappy ways in every
userland autoloader, and it will also protect everyone at once. There
might be edge cases not covered by invalidating "." and "NUL", but to
keep it fast it may be best to only check for known bad chars instead
of the full regex?
If PHP does a "minimal" check only there's nothing won. Autoloaders
still have to check according to their needs. For example on Windows
checking : is important, else I inject $classname = "c:\foobar" as
filename and bypass one or the other autoloader. And who tells that
classes are loaded from filesystems, not database+eval, suddenly ' is
dangerous. There are tons of other examples easy to construct and we
should not pretend to be safe there.
If one takes class names from untrusted sources one has to be careful.
johannes
On Thu, Oct 17, 2013 at 3:21 PM, Johannes Schlüter
johannes@schlueters.de wrote:
I know new $class() and other ways, I was more curious about the real
security impact -- usage of unchecked class names has quite a few
security implications even without autoloader. Are there "common" ways
about how unsafe class names might be injected?An attacker could use that to trigger "bad" constructor behavior or
such.
Certainly this is possible, but it's a much slimmer risk IMO since it
would need bad code to already be loaded vs allowing the attacker to
trigger a require of an arbitrary file.
In any case the fact that this would not address every security issue
doesn't make it a worthless change I think?
If PHP does a "minimal" check only there's nothing won. Autoloaders
still have to check according to their needs. For example on Windows
checking : is important, else I inject $classname = "c:\foobar" as
filename and bypass one or the other autoloader. And who tells that
classes are loaded from filesystems, not database+eval, suddenly ' is
dangerous. There are tons of other examples easy to construct and we
should not pretend to be safe there.
OK so maybe we need to add : to the list to protect windows. I don't
think the database+eval case is realistic, I'm sure someone out there
does that but what I am trying to achieve is protecting most people
doing sane things. If you load classes from the DB then escaping the
class name is a no-brainer, and it's not a performance hit compared to
the DB IO anyway. A require() is expected to be safe, and people may
not expect php to let through such messed up "Foo..\Bar" class names
since they are pretty much never seen out in the wild. Please try to
see this in a pragmatic way. And in the worst case if select chars
isn't feasible, as I said I'm fine with filtering the whole class
identifier regex if that's the safest way. I just would like this to
be addressed in some way.
If one takes class names from untrusted sources one has to be careful.
True of course, but I don't think it's good to deflect that
responsibility to php users while there is something that can be done
at fairly low cost in the language to help them.
Cheers