Hi!
As many probably know, unserialize() has a security issue following from 
the fact that you can create objects with data from unserialize(), and 
these object may have behavior that is invoked automatically - namely 
__destruct - that can result in unintended results. See e.g. 
http://heine.familiedeelstra.com/security/unserialize among others for 
more detailed description.
So I propose a modification to unserialize(): 
https://wiki.php.net/rfc/secure_unserialize
that would make one of the common cases - serializing data to be stored 
on user side or user-accessible side - more secure by avoiding 
instantiating all object (or all objects not belonging to a whitelist) 
and keeping them as incomplete objects instead.
Comments and suggestions welcome,
Stanislav Malyshev, Software Architect 
SugarCRM: http://www.sugarcrm.com/ 
(408)454-6900 ext. 227
Stas,
Serious question: Why bother trying to clean this up? Why not just 
recommend using JSON or some other generic serialization without tieing 
into specific objects, and pushing the creation logic into userland (where 
it belongs IMHO, at least from a security perspective). At least for any 
times where serialized data may come into contact with a user...
I'm not saying that this isn't a useful addition, I'm just wondering if it 
goes down the right path of what we should be recommending to users as the 
best practice. I wonder if it'd be better to simply recommend a generic 
serialization (JSON, XML, basically something with no class-type 
information other than array, object) instead for any use-case where end 
users might even remotely be able to tamper with the data.
That's not to say serialization would be useless, not at all. But more that 
serialization (as it stands) would be recommended for server-side only 
(which it should be anyway)...
I just fear this may let some people think that serialized data is OK to 
pass to the client. Which is only true with this patch if it's implemented 
well and correctly...
Just my $0.02... Thoughts?
Anthony
On Sat, Mar 30, 2013 at 10:54 PM, Stas Malyshev smalyshev@sugarcrm.comwrote:
Hi!
As many probably know,
unserialize()has a security issue following from
the fact that you can create objects with data fromunserialize(), and
these object may have behavior that is invoked automatically - namely
__destruct - that can result in unintended results. See e.g.
http://heine.familiedeelstra.com/security/unserialize among others for
more detailed description.So I propose a modification to
unserialize():
https://wiki.php.net/rfc/secure_unserializethat would make one of the common cases - serializing data to be stored
on user side or user-accessible side - more secure by avoiding
instantiating all object (or all objects not belonging to a whitelist)
and keeping them as incomplete objects instead.Comments and suggestions welcome,
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
On Sat, Mar 30, 2013 at 11:05 PM, Anthony Ferrara ircmaxell@gmail.comwrote:
Stas,
Serious question: Why bother trying to clean this up? Why not just
recommend using JSON or some other generic serialization without tieing
into specific objects, and pushing the creation logic into userland (where
it belongs IMHO, at least from a security perspective). At least for any
times where serialized data may come into contact with a user...I'm not saying that this isn't a useful addition, I'm just wondering if it
goes down the right path of what we should be recommending to users as the
best practice. I wonder if it'd be better to simply recommend a generic
serialization (JSON, XML, basically something with no class-type
information other than array, object) instead for any use-case where end
users might even remotely be able to tamper with the data.That's not to say serialization would be useless, not at all. But more that
serialization (as it stands) would be recommended for server-side only
(which it should be anyway)...I just fear this may let some people think that serialized data is OK to
pass to the client. Which is only true with this patch if it's implemented
well and correctly...Just my $0.02... Thoughts?
Anthony
I agree with Anthony on the note that serialize really only should be used 
where the serialized data is being stored server-side.
I already deal with users that believe the solution to storing compound 
data in cookies, for example, is OK to do with serialize. Unfortunately, 
for those that take on this practice and write poorly though-out code they 
are susceptible to these kinds of security vulnerabilities. Same goes for 
those who chose to continue using things like register_globals (which 
luckicly we have removed), but my point is we can't fix poor mentality. We 
can only educate others on the risks.
I think Stas proposes a solution to the problem and I think Anthony 
proposes a viable alternative. I would say that Anthony has found the 
shortest distance between the two points (the problem and the solution), 
however.
On Sat, Mar 30, 2013 at 10:54 PM, Stas Malyshev <smalyshev@sugarcrm.com
wrote:
Hi!
As many probably know,
unserialize()has a security issue following from
the fact that you can create objects with data fromunserialize(), and
these object may have behavior that is invoked automatically - namely
__destruct - that can result in unintended results. See e.g.
http://heine.familiedeelstra.com/security/unserialize among others for
more detailed description.So I propose a modification to
unserialize():
https://wiki.php.net/rfc/secure_unserializethat would make one of the common cases - serializing data to be stored
on user side or user-accessible side - more secure by avoiding
instantiating all object (or all objects not belonging to a whitelist)
and keeping them as incomplete objects instead.Comments and suggestions welcome,
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
Hi!
I think Stas proposes a solution to the problem and I think Anthony
proposes a viable alternative. I would say that Anthony has found the
shortest distance between the two points (the problem and the solution),
however.
The fact is that people do use serialize() for data that may be 
accessible by the user (or made accessible by unrelated security issues, 
which may be upgraded in severity by this - e.g. from SQL injection to 
persistent code backdoor on the server). There are many ways to do 
things differently, and they are known. However, as I said, the fact is 
people do use serialize() and may not even realize the data aren't as 
secure as they are. That's why many security tools flag any object with 
dtor in application using unserialize as insecure. This is not a good 
situation, and presently there are no way to avoid it except dropping 
serialize() completely - which may not be an option is some cases and in 
any case would require serious changes to the production code.
This enhancement is to fix this problem. It is not to change best 
practices or give advice on how to write the most secure system - it is 
to make existing code more secure easily.
Stanislav Malyshev, Software Architect 
SugarCRM: http://www.sugarcrm.com/ 
(408)454-6900 ext. 227
This is not a good situation, and presently there are no way to
avoid it except droppingserialize()completely - which may not be
an option is some cases and in any case would require serious
changes to the production code.
And what about automatic un/serialize() of objects in $_SESSION? 
People don't even see those function calls in their code, so dropping 
the function/ality would be a wildly drastic move.
IMO, there's a minefield of "most surprise" to worry about unless you 
tread gently, as in your suggestion of an extra param. And probably 
want two optional PHP.INI settings: one for when unserialize() is 
called automatically (so you can't pass it anything), and one for when 
unserialize() is called in user code without a second argument but you 
want a default whitelist to be applied (say, to instantly "harden" a 
codebase and sort out consequences later).
-- S.
Hi!
And what about automatic un/serialize() of objects in $_SESSION?
People don't even see those function calls in their code, so dropping
the function/ality would be a wildly drastic move.
Nothing about it, the change is for unserialize() function.
tread gently, as in your suggestion of an extra param. And probably
want two optional PHP.INI settings: one for whenunserialize()is
As we learned many times in the past, behavior-changing ini settings are 
not a good idea. We have to get away from mentality of "if we need to 
modify some behavior, we just put a variable in global state to control 
it". Global state is the last resort, not the first one. Variables that 
have local influence should have local scope.
Stanislav Malyshev, Software Architect 
SugarCRM: http://www.sugarcrm.com/ 
(408)454-6900 ext. 227
And what about automatic un/serialize() of objects in $_SESSION?
People don't even see those function calls in their code, so dropping
the function/ality would be a wildly drastic move.
Nothing about it, the change is for
unserialize()function.
OK. I thought of this as one core security issue with multiple 
possible ways of getting a payload to the internal (C) unserialize 
code. (If not, guess I could draw up an RFC for the other vector.)
It is harder to inject arbitrary objects into session storage than to 
exploit blind-request-variable-unserialization type stuff (though the 
latter can be a stepping stone to the former). But the potential 
payoff in $_SESSION is so huge, I think having "secure unserialize" 
for sessions is fully justified. Otherwise you're saying "I can't 
guarantee objects with killer wakeups/dtors were not injected via one 
of the apps on my server, and I have no way to stop them from 
instantiating magically provided they get through the right way."
We have to get away from mentality of "if we need to modify some
behavior, we just put a variable in global state to control it".
Global state is the last resort, not the first one.
I guess it could be another argument to session_start() instead.
-- S.
P.S. Sure you'll shoot down this idea as well, but I think it would be 
good if Filters had a corresponding validator, such as:
FILTER_VALIDATE_UNSERIALIZED: detect strings in PHP bytestream format. 
Flags FILTER_ALLOW_SERIALIZED_SCALAR, 
FILTER_ALLOW_SERIALIZED_NONOBJECT to fine-tune.
Otherwise, if you are still expecting bytestream format from the 
client and want to detect tampering on input, you have to write a 
best-guess regex to try to differentiate between 'Some:free { text; 
}"' and 'O:8:"class":0:{}' and 'S:12...' etc.
.
Stas,
The fact is that people do use serialize() for data that may be
accessible by the user (or made accessible by unrelated security issues,
which may be upgraded in severity by this - e.g. from SQL injection to
persistent code backdoor on the server). There are many ways to do
things differently, and they are known. However, as I said, the fact is
people do useserialize()and may not even realize the data aren't as
secure as they are. That's why many security tools flag any object with
dtor in application using unserialize as insecure. This is not a good
situation, and presently there are no way to avoid it except dropping
serialize()completely - which may not be an option is some cases and in
any case would require serious changes to the production code.This enhancement is to fix this problem. It is not to change best
practices or give advice on how to write the most secure system - it is
to make existing code more secure easily.
I definitely see your point, and don't disagree with it at all. Again, my 
concern is that people will then be tempted to use serialization to the 
client as it's "safe" (with these modifications). Which I think we should 
discourage for new code....
So what if we did this: We implement your RFC, but also put a warning in 
the docs that serialize() shouldn't be used in places where a user or third 
party can modify the output (to use json_encode() for those areas). That 
way we're not encouraging serialize to be used in places it shouldn't, but 
also give those with legacy codebases or really awkward use-cases the 
ability to be "more secure"...
Thoughts?
On Sun, Mar 31, 2013 at 5:27 AM, Stas Malyshev smalyshev@sugarcrm.comwrote:
I think Stas proposes a solution to the problem and I think Anthony
proposes a viable alternative. I would say that Anthony has found the
shortest distance between the two points (the problem and the solution),
however.The fact is that people do use
serialize()for data that may be
accessible by the user
Yeah, well, the people who do that are also the ones that are unlikely to 
make use of the new parameters to secure themselves. In order to make them 
use of the new feature they have to be explicitly educated about it and in 
that case we can just as well educate them to use json_encode. In that 
regard, I don't think this proposal is particularly useful.
JSON and serialize() are (inherently) different serialization formats with 
different use-cases. The former is rather restricted and as such safe to be 
provided by the user. The latter on the other hand aims at exactly 
replicating the structure. Using the latter format for the former task is 
in any case a bad idea. It's not like unserializing objects with dtors is 
the only issue that can turn up. serialize() also supports references and 
object-references, so one could probably use it quite easily to trigger 
some kind of infinite loop/recursion in the application.
So, I personally don't see much value in this addition. Rather it could 
provide people an excuse to use the function on user-provided data, which 
is, as already mentioned, a bad idea. Even with this additional protection. 
Also I'd like to point out that unserialize() in the past had a relatively 
large number of different security vulnerabilities, so one should really, 
really not trust it with data of unknown origin. Internal classes can quite 
commonly be made to segfault with specially crafted serialization input. 
E.g. the user might think that, hey, DateTime is a safe class, let's allow 
unserializing that. Sounds legit doesn't it? Then let's remember those 
various serialization bugs that were recently fixed in DateTime (or the 
related classes, didn't really look at it). And then we would have a 
potentially exploitable segmentation fault :)
Nikita
Hi!
Yeah, well, the people who do that are also the ones that are unlikely
to make use of the new parameters to secure themselves. In order to make
Why? Making use of one parameter is orders of magnitude easier than 
refactoring your whole application to not use serialize() (especially if 
you need object support).
exactly replicating the structure. Using the latter format for the
former task is in any case a bad idea. It's not like unserializing
You completely ignore situations where two could mix - i.e. where you 
could store data that can potentially be not reproducible by JSON and 
still may - due to some or other oversight - be controlled by the user. 
That's why security has layers - so that if you make mistake on one 
layer, other layer would limit the exposure. That's why you don't run 
you webserver as root - even though you could just say "we shouldn't 
have remote execution holes, so it doesn't matter under which user the 
webserver runs". You seem to be under impression that if particular 
technique solves particular problem, there shouldn't be any security 
measures protecting you if you do not use that technique. I think it is 
plain wrong approach to security.
doesn't it? Then let's remember those various serialization bugs that
were recently fixed in DateTime (or the related classes, didn't really
look at it). And then we would have a potentially exploitable
segmentation fault :)
I don't see how this has anything to do with anything. There are tons of 
different bugs, and I never claimed this patch protects against all of 
them, so if you think somehow finding a scenario that this change does 
not protect from invalidates it, then by the same logic we should never 
have any security features at all that do not make the application 
perfectly secure in the hands of most ignorant user. 
The point is not to make perfect protection, the point is to make common 
scenario safer and more resistant to common attacks that happen to 
existing applications all the time now.
Stanislav Malyshev, Software Architect 
SugarCRM: http://www.sugarcrm.com/ 
(408)454-6900 ext. 227
Why? Making use of one parameter is orders of magnitude easier than
refactoring your whole application to not useserialize()(especially if
you need object support).
Under the RFC, unserialize could potentially create __PHP_Incomplete_Class objects 
(including nested ones), so new handling code would be needed to care for these cases and 
I wouldn't describe that as a simple fix depending on how the app wants to deal with these 
cases.
IMO these projects would be better off creating drop-in replacements for (un)serialize 
that wrap in HMAC to secure the data channel. Trivially done in userland.
The next best thing would be an unserialize that would simply fail if a non-whitelisted 
class was found.
Steve Clay
I think this RFC would worsen the problem of misusing the serialize round-trip.
Even if we make the docs clearer, we'd still be sending the message that there's a new 
"safer" unserialize and some would certainly use that new feature to be more lax about 
guarding serialized structures.
It also sets up a situation where altering either the whitelist or one of the classes in 
the whitelist could open a vulnerability that's not obvious.
I'm also not convinced that this feature would spur developers to fix insecure code.
But setting my arguments against, if the goal is to make unserialize() secure, then it 
should behave like a tripwire: fail loudly if a non-whitelisted class object is found. I 
think returning partially-usable values would gives devs more rope to hang themselves with.
Re the 2nd arg, I'd make only two cases:
- null is given: default behavior
- non-null given: cast to array and that's the class whitelist.
If the overall goal is to make the serialize/unserialize round-trip tamper-proof, we could 
build HMAC right into the API: add secret key args to both functions. No doubt the Suoshin 
patch already uses HMAC during encryption of the session data.
Steve Clay
JSON and
serialize()are (inherently) different serialization formats with different use-cases [...]
Yes, and json requires that all strings (including the keys) has to be 
valid utf-8, and I'm sure that's not always the case (serialize can 
use binary data in both places).
JSON and
serialize()are (inherently) different serialization formats with different use-cases [...]
Yes, and json requires that all strings (including the keys) has to be
valid utf-8, and I'm sure that's not always the case (serialize can
use binary data in both places).
Yes, it is a problem.
var_dump(json_encode("\xe1 - \xc3\xa1"));
PHP Warning:json_encode(): Invalid UTF-8 sequence in argument in php
shell code on line 1
string(4) "null"
In a perfect world, all your input is utf-8, but sometimes what you get
is in a different encoding...
(and you still want to store it as-it-came in the first layer)