Would the Internals team be open to discussing mitigating HashDoS in a
future version of PHP? i.e. everywhere, even for json_decode()
and friends,
by fixing the problem rather than capping the maximum number of input
parameters and hoping it's good enough.
I'd propose SipHash (and/or a derivative): https://www.131002.net/siphash/
(Look at all the other languages that already adopted SipHash.)
https://medium.freecodecamp.com/hash-table-attack-8e4371fc5261#.s5r5j42x3
Scott Arciszewski
Chief Development Officer
Paragon Initiative Enterprises <https://paragonie.com>
On Thu, Sep 15, 2016 at 8:48 PM, Scott Arciszewski scott@paragonie.com
wrote:
Would the Internals team be open to discussing mitigating HashDoS in a
future version of PHP? i.e. everywhere, even for json_decode()
and friends,
by fixing the problem rather than capping the maximum number of input
parameters and hoping it's good enough.
I'd propose SipHash (and/or a derivative): https://www.131002.net/siphash/
(Look at all the other languages that already adopted SipHash.)
https://medium.freecodecamp.com/hash-table-attack-8e4371fc5261#.s5r5j42x3
Previous discussion on the topic:
http://markmail.org/message/ttbgcvdu4f7uymfb
Nikita
Hi Nikita,
Previous discussion on the topic:
http://markmail.org/message/ttbgcvdu4f7uymfb
Your proposal is mandatory, IMHO.
Let's implement it ASAP.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi!
Would the Internals team be open to discussing mitigating HashDoS in a
future version of PHP? i.e. everywhere, even for json_decode()
and friends,
by fixing the problem rather than capping the maximum number of input
parameters and hoping it's good enough.
I'd propose SipHash (and/or a derivative): https://www.131002.net/siphash/
I am worried about performance. The base hash structure has to be very
fast. I have doubts that a cryptographic function can perform at these
levels. Did you test the performance of this function compared to the
existing hash function?
(Look at all the other languages that already adopted SipHash.)
Adopted for the base data structure in the engine? Which ones? What were
the performance costs?
Stas Malyshev
smalyshev@gmail.com
Hi!
Would the Internals team be open to discussing mitigating HashDoS in a
future version of PHP? i.e. everywhere, even for json_decode()
and friends,
by fixing the problem rather than capping the maximum number of input
parameters and hoping it's good enough.
I'd propose SipHash (and/or a derivative): https://www.131002.net/siphash/
I am worried about performance. The base hash structure has to be very
fast. I have doubts that a cryptographic function can perform at these
levels. Did you test the performance of this function compared to the
existing hash function?
If anyone wants a VERY rough estimate of relative performance
degradation as a result of switching to SipHash, here's a somewhat naive
C++ implementation of a similar data structure to that found in PHP:
https://github.com/cubiclesoft/cross-platform-cpp
(See the "Hash performance benchmark" results at the above link.)
In short, there's a significant degradation just switching from djb2 to
SipHash depending on key type. A similar effect would probably be seen
in PHP.
Randomizing the starting hash value for djb2 during the core startup
sequence could also be effective for mitigating HashDoS. Extensive
testing would have to be done to determine how collision performance
plays out with randomized starting hash values. I can't find any
arguments anywhere against using randomized starting hash values for
djb2. Also of note, the 33 multiplier seems more critical than anything
else for mixing bits together.
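A minimal sketch of the seeded times-33 (djb2/DJBX33A-family) hash being
discussed; the seed parameter is the hypothetical randomization (the
classic variant starts from the constant 5381), and this is illustrative
C, not PHP's actual implementation:

    #include <stddef.h>
    #include <stdint.h>

    /* Illustrative sketch only -- a times-33 hash in the djb2/DJBX33A
     * family with a hypothetical seed parameter, not PHP's actual code. */
    static uint64_t djbx33a_seeded(const char *key, size_t len, uint64_t seed)
    {
        uint64_t hash = seed;  /* the classic variant starts at 5381 */
        for (size_t i = 0; i < len; i++) {
            hash = hash * 33 + (unsigned char)key[i];  /* "times 33, plus byte" */
        }
        return hash;
    }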
--
Thomas Hruska
CubicleSoft President
I've got great, time saving software that you will find useful.
On Fri, Sep 16, 2016 at 7:59 AM, Thomas Hruska thruska@cubiclesoft.com
wrote:
Hi!
Would the Internals team be open to discussing mitigating HashDoS in a
future version of PHP? i.e. everywhere, even for json_decode()
and
friends,
by fixing the problem rather than capping the maximum number of input
parameters and hoping it's good enough.
I'd propose SipHash (and/or a derivative): https://www.131002.net/siphash/
I am worried about performance. The base hash structure has to be very
fast. I have doubts that a cryptographic function can perform at these
levels. Did you test the performance of this function compared to the
existing hash function?
If anyone wants a VERY rough estimate of relative performance degradation
as a result of switching to SipHash, here's a somewhat naive C++
implementation of a similar data structure to that found in PHP:
https://github.com/cubiclesoft/cross-platform-cpp
(See the "Hash performance benchmark" results at the above link.)
In short, there's a significant degradation just switching from djb2 to
SipHash depending on key type. A similar effect would probably be seen in
PHP.
Randomizing the starting hash value for djb2 during the core startup
sequence could also be effective for mitigating HashDoS. Extensive
testing would have to be done to determine how collision performance plays
out with randomized starting hash values. I can't find any arguments
anywhere against using randomized starting hash values for djb2. Also of
note, the 33 multiplier seems more critical than anything else for mixing
bits together.
Randomizing the starting hash value of DJB has come up in the previous
thread as well -- it sounds nice, but it turns out to be completely
ineffective, because DJB collisions (at least the standard class of
collisions ignoring overflow 1) are independent of the starting hash
value.
Nikita
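To make the seed-independence point concrete, a short derivation for the
additive times-33 variant (DJBX33A), ignoring integer overflow:

    h(s) = h_0 * 33^n + \sum_{i=0}^{n-1} s[i] * 33^{n-1-i}

so for two strings a and b of the same length n,

    h(a) - h(b) = \sum_{i=0}^{n-1} (a[i] - b[i]) * 33^{n-1-i},

which does not involve the starting value h_0 at all: equal-length keys
that collide under one seed collide under every seed.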
Significant degradation?
SipHash 1-3 is almost as fast as HashDoS-vulnerable hash functions:
https://github.com/funny-falcon/funny_hash
Scott Arciszewski
Chief Development Officer
Paragon Initiative Enterprises https://paragonie.com
On Fri, Sep 16, 2016 at 1:59 AM, Thomas Hruska thruska@cubiclesoft.com
wrote:
Hi!
Would the Internals team be open to discussing mitigating HashDoS in a
future version of PHP? i.e. everywhere, even for json_decode()
and
friends,
by fixing the problem rather than capping the maximum number of input
parameters and hoping it's good enough.
I'd propose SipHash (and/or a derivative): https://www.131002.net/siphash/
I am worried about performance. The base hash structure has to be very
fast. I have doubts that a cryptographic function can perform at these
levels. Did you test the performance of this function compared to the
existing hash function?
If anyone wants a VERY rough estimate of relative performance degradation
as a result of switching to SipHash, here's a somewhat naive C++
implementation of a similar data structure to that found in PHP:
https://github.com/cubiclesoft/cross-platform-cpp
(See the "Hash performance benchmark" results at the above link.)
In short, there's a significant degradation just switching from djb2 to
SipHash depending on key type. A similar effect would probably be seen in
PHP.
Randomizing the starting hash value for djb2 during the core startup
sequence could also be effective for mitigating HashDoS. Extensive
testing would have to be done to determine how collision performance plays
out with randomized starting hash values. I can't find any arguments
anywhere against using randomized starting hash values for djb2. Also of
note, the 33 multiplier seems more critical than anything else for mixing
bits together.
--
Thomas Hruska
CubicleSoft President
I've got great, time saving software that you will find useful.
Hi!
Significant degradation?
SipHash 1-3 is almost as fast as HashDoS-vulnerable hash
functions: https://github.com/funny-falcon/funny_hash
I see a comparison to Murmur3 at this link - but that's not the function
we are using. Is there a comparison to the PHP one?
--
Stas Malyshev
smalyshev@gmail.com
Hi all,
Significant degradation?
SipHash 1-3 is almost as fast as HashDoS-vulnerable hash
functions: https://github.com/funny-falcon/funny_hash
I see a comparison to Murmur3 at this link - but that's not the function
we are using. Is there a comparison to the PHP one?
Unfortunately, SipHash was a lot slower. DJB hash is simple and super
fast; even Google's CityHash was a lot slower when I tested. (Sorry, I
didn't keep the numbers.)
I think we are better off limiting max collisions.
I'm +1 for Nikita's proposal, which does this.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi all,
On Sat, Sep 17, 2016 at 5:13 PM, Stanislav Malyshev smalyshev@gmail.com
wrote:
Significant degradation?
SipHash 1-3 is almost as fast as HashDoS-vulnerable hash
functions: https://github.com/funny-falcon/funny_hash
I see a comparison to Murmur3 at this link - but that's not the function
we are using. Is there a comparison to the PHP one?
Unfortunately, SipHash was a lot slower. DJB hash is simple and super
fast; even Google's CityHash was a lot slower when I tested. (Sorry, I
didn't keep the numbers.)
I think we are better off limiting max collisions.
I'm +1 for Nikita's proposal, which does this.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
As long as the problem gets solved, I'm not in favor of any particular
solution.
I do find this amusing, though: the author of djb3 was Daniel J Bernstein,
who also helped develop SipHash. I guess non-cryptographic hash functions
are a small world.
Nikita:
- Do you think your proposed strategy can solve this problem entirely
  without dropping djb3?
- Would randomization still help as a defense-in-depth?
To elaborate on the second question: even a 4-byte prefix for the hash
function inputs that's randomly generated at $appropriateIntervalHere might
make intentional collisions harder to trigger. (Then again, maybe not! The
underlying structure of djb3 isn't exactly cryptographic.)
To be clear, "Yes this is entirely solved without switching away from djb"
- "No, randomization just hurts opcache and doesn't buy us any security"
are an acceptable set of answers to these questions. I just wanted to ask.
Scott Arciszewski
Chief Development Officer
Paragon Initiative Enterprises <https://paragonie.com/>
Hi!
- Do you think your proposed strategy can solve this problem entirely
  without dropping djb3?
- Would randomization still help as a defense-in-depth?
Note that to avoid problems with opcache we can only randomize on
initial boot (even then synchronizing among different processes sharing
opcache may be challenging). That means that the process would be
running for an extended time (at least days, in theory as long as uptime
allows) with the same seed. Given that, I'm not sure how much
randomization would really improve things.
To elaborate on the second question: even a 4-byte prefix for the hash
function inputs that's randomly generated at $appropriateIntervalHere
might make intentional collisions harder to trigger. (Then again, maybe
not! The underlying structure of djb3 isn't exactly cryptographic.)
I don't see how we can do $appropriateIntervalHere if we use opcache. We
could clean the cache of course but I'm not sure server owners would be
very happy if their cache dropped at random intervals with accompanying
load spike.
Stas Malyshev
smalyshev@gmail.com
Note that to avoid problems with opcache we can only randomize on
initial boot (even then synchronizing among different processes sharing
opcache may be challenging). That means that the process would be
running for extended time (at least days, in theory as long as uptime
allows) with the same seed. Given that, I'm not sure how much
randomization would really improve.
While randomization doesn't eliminate the problem, isn't it still a
valid complication for attackers? If everybody's PHP instance is running
with a different hash key, that's harder to attack than if they all have
the same key, even if the key isn't frequently changed.
It reminds me of when Logjam was in the news and we realized it wasn't
smart for everyone to use the same default DH primes.
Tom
Hi!
I think we are better off limiting max collisions.
I'm +1 for Nikita's proposal, which does this.
Max collisions per what? What would the limit be?
--
Stas Malyshev
smalyshev@gmail.com
Hi Stas,
On Wed, Sep 21, 2016 at 11:26 AM, Stanislav Malyshev
smalyshev@gmail.com wrote:
I think we are better off limiting max collisions.
I'm +1 for Nikita's proposal, which does this.
Max collisions per what? What would the limit be?
Collisions by key.
It would be nice to have a configurable limit like the regex
stack/backtrack limit.
That said, wouldn't 1000 be enough for almost all apps?
Anyway, we have two choices:
- Simply limit the number of collisions. (Fast and has no impact on code.)
- Use a crypt-safe hash and salt. (Slow and has an impact on opcache/etc.)
Limiting something is good to have sometimes.
Python even limits the number of recursions to 1000 by default.
We have PCRE stack/backtrack limits. (We'll have an mbregex stack limit soon.)
A collision limit is a good one also.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
2016-09-21 8:54 GMT+02:00 Yasuo Ohgaki yohgaki@ohgaki.net:
Hi Stas,
On Wed, Sep 21, 2016 at 11:26 AM, Stanislav Malyshev
smalyshev@gmail.com wrote:
I think we are better off limiting max collisions.
I'm +1 for Nikita's proposal, which does this.
Max collisions per what? What would the limit be?
Collisions by key.
It would be nice to have a configurable limit like the regex
stack/backtrack limit.
That said, wouldn't 1000 be enough for almost all apps?
Having a limit isn't a good idea for all use cases, but it works for some.
Anyway, we have two choices
- Simply limit the number of collisions. (Fast and has no impact on code.)
It totally has an impact on code. What happens if we have enough collisions?
Throw an exception? Might break code, any array insert operation might
suddenly fail. Simply fatal? Not suitable for long running applications
like Aerys (https://github.com/amphp/aerys) or PPM (
https://github.com/php-pm/php-pm).
I guess to support these use cases, we have to switch to a cryptographic
hash function here. But I guess it would be totally fine having such a
feature as a compile flag: limit the number of collisions by default and
provide a cryptographic hash function for long running apps to protect them.
I think such an option is fine for those running long running applications,
since they don't rely on setups like shared hosting and can usually compile
their own versions.
- Use a crypt-safe hash and salt. (Slow and has an impact on opcache/etc.)
Do we need a salt here? If not, how does it then impact opcache?
Limiting something is good to have sometimes.
Python even limits number of recursions to 1000 by default.
Recursion is / should not be affected by user input and thus is a slightly
different topic.
Regards, Niklas
We have PCRE stack/backtrack limits. (We'll have mbregex stack limit soon)
Collision limit is good one also.Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hello,
What if we had some sort of configuration limit on collision length?
It would give most of us a viable way to prevent our apps from being attacked,
and yet in the use cases where we require a higher limit we retain the
ability to up the limit or disable it completely.
Regards, Glenn
2016-09-21 8:54 GMT+02:00 Yasuo Ohgaki yohgaki@ohgaki.net:
Hi Stas,
On Wed, Sep 21, 2016 at 11:26 AM, Stanislav Malyshev
smalyshev@gmail.com wrote:
I think we are better off limiting max collisions.
I'm +1 for Nikita's proposal, which does this.
Max collisions per what? What would the limit be?
Collisions by key.
It would be nice to have a configurable limit like the regex
stack/backtrack limit.
That said, wouldn't 1000 be enough for almost all apps?
Having a limit isn't a good idea for all use cases, but it works for some.
Anyway, we have two choices:
- Simply limit the number of collisions. (Fast and has no impact on code.)
It totally has an impact on code. What happens if we have enough collisions?
Throw an exception? Might break code, any array insert operation might
suddenly fail. Simply fatal? Not suitable for long running applications
like Aerys (https://github.com/amphp/aerys) or PPM
(https://github.com/php-pm/php-pm).
I guess to support these use cases, we have to switch to a cryptographic
hash function here. But I guess it would be totally fine having such a
feature as a compile flag: limit the number of collisions by default and
provide a cryptographic hash function for long running apps to protect them.
I think such an option is fine for those running long running applications,
since they don't rely on setups like shared hosting and can usually compile
their own versions.
- Use a crypt-safe hash and salt. (Slow and has an impact on opcache/etc.)
Do we need a salt here? If not, how does it then impact opcache?
Limiting something is good to have sometimes.
Python even limits the number of recursions to 1000 by default.
Recursion is / should not be affected by user input and thus is a slightly
different topic.
Regards, Niklas
We have PCRE stack/backtrack limits. (We'll have an mbregex stack limit soon.)
A collision limit is a good one also.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
What if we had some sort of configuration limit on collision length?
Previous discussions have come to the conclusion that the difference between normal collision frequency and sufficient for a DoS is so large that the only meaningful settings would be on or off. e.g. the proposed limit is 1000, and randomly inserting millions of rows produces about 12.
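That figure of about 12 is in line with the classic balls-into-bins estimate
for the longest chain after n random insertions into a table of roughly n
slots (a back-of-the-envelope aside, not a number from the thread):

    max load \approx \frac{\ln n}{\ln \ln n}(1 + o(1)),
    e.g. n = 10^7: \frac{\ln 10^7}{\ln \ln 10^7} \approx \frac{16.1}{2.78} \approx 5.8,

and with lower-order terms the observed maximum is typically around ten, so
benign random data stays far below a limit of 1000.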
The problem with long running applications is not that they need to raise the limit, it's that they need to handle the error gracefully if they are in fact under attack. Because hash tables are so ubiquitous in the engine, there's no guarantee that that's possible, so an attacker would have the ability to crash the process with the limit turned on, or hang the CPU with the limit turned off.
Regards,
--
Rowan Collins
[IMSoP]
2016-09-21 14:37 GMT+02:00 Rowan Collins rowan.collins@gmail.com:
On 21 September 2016 13:02:20 BST, Glenn Eggleton geggleto@gmail.com
wrote:
What if we had some sort of configuration limit on collision length?
Previous discussions have come to the conclusion that the difference
between normal collision frequency and sufficient for a DoS is so large
that the only meaningful settings would be on or off. e.g. the proposed
limit is 1000, and randomly inserting millions of rows produces about 12.
The problem with long running applications is not that they need to raise
the limit, it's that they need to handle the error gracefully if they are
in fact under attack. Because hash tables are so ubiquitous in the engine,
there's no guarantee that that's possible, so an attacker would have the
ability to crash the process with the limit turned on, or hang the CPU with
the limit turned off.
Another suggestion by Markus Staab is to move the code into the SAPIs and
have SipHash in CLI and the current hash function in e.g. Apache's SAPI.
Since long running applications use only the CLI, it would be fine for them
and it wouldn't even need a compile flag.
Regards, Niklas
2016-09-21 14:37 GMT+02:00 Rowan Collins rowan.collins@gmail.com:
On 21 September 2016 13:02:20 BST, Glenn Eggleton geggleto@gmail.com
wrote:
What if we had some sort of configuration limit on collision length?
Previous discussions have come to the conclusion that the difference
between normal collision frequency and sufficient for a DoS is so large
that the only meaningful settings would be on or off. e.g. the proposed
limit is 1000, and randomly inserting millions of rows produces about 12.
The problem with long running applications is not that they need to raise
the limit, it's that they need to handle the error gracefully if they are
in fact under attack. Because hash tables are so ubiquitous in the
engine,
there's no guarantee that that's possible, so an attacker would have the
ability to crash the process with the limit turned on, or hang the CPU
with
the limit turned off.
Another suggestion by Markus Staab is to move the code into the SAPIs and
have SipHash in CLI and the current hash function in e.g. Apache's SAPI.
Since long running applications use only the CLI, it would be fine for them
and it wouldn't even need a compile flag.
Regards, Niklas
Just so we're clear, switching to siphash is not just a matter of replacing
a hash function implementation. It requires larger changes [1] to support
non-trivial integer hashes which a) change the structure of HashTable
buckets and may have performance impact even if we continue using identity
hashing and b) will definitely lead to changes in the internal hashtable
API, so this may require changes in extensions. (Collision counting on the
other hand is fully transparent.)
Nikita
[1] https://github.com/php/php-src/compare/master...nikic:integerHash
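To illustrate what "non-trivial integer hashes" would change, a simplified
sketch follows; it is not the actual zend_hash bucket layout or API, and
keyed_hash() is a hypothetical stand-in for a SipHash-style keyed function:

    #include <stdint.h>
    #include <stddef.h>

    /* Simplified illustration only -- not the real zend_hash bucket. */
    typedef struct {
        uint64_t h;    /* stored hash; with identity hashing this IS the integer key */
        char    *key;  /* string key, or NULL for integer keys */
        void    *val;
    } bucket_t;

    /* Identity hashing: integer key -> hash, essentially free. */
    static uint64_t hash_int_identity(uint64_t ikey) { return ikey; }

    /* With a keyed hash, even integer keys must be run through the hash
     * with the process seed, and h no longer equals the key -- which is
     * why bucket handling and the hashtable API have to change.
     * keyed_hash() is hypothetical, not a real API. */
    extern uint64_t keyed_hash(const uint8_t seed[16], const void *data, size_t len);
    static uint64_t hash_int_keyed(const uint8_t seed[16], uint64_t ikey) {
        return keyed_hash(seed, &ikey, sizeof ikey);
    }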
What if we had some sort of configuration limit on collision length?
Previous discussions have come to the conclusion that the difference between normal collision frequency and sufficient for a DoS is so large that the only meaningful settings would be on or off. e.g. the proposed limit is 1000, and randomly inserting millions of rows produces about 12.
The problem with long running applications is not that they need to raise the limit, it's that they need to handle the error gracefully if they are in fact under attack. Because hash tables are so ubiquitous in the engine, there's no guarantee that that's possible, so an attacker would have the ability to crash the process with the limit turned on, or hang the CPU with the limit turned off.
Right. It seems like count-and-limit pushes the problem onto the user
who then has to discriminate normal from malicious causes for rising
counters and find appropriate actions for each.
Even a sophisticated user who understands hash collision counters may
not welcome this since it adds complexity that's hard to test and
involves questionable heuristics.
Tom
On 21 September 2016 13:02:20 BST, Glenn Eggleton geggleto@gmail.com
wrote:
What if we had some sort of configuration limit on collision length?
Previous discussions have come to the conclusion that the difference
between normal collision frequency and sufficient for a DoS is so large
that the only meaningful settings would be on or off. e.g. the proposed
limit is 1000, and randomly inserting millions of rows produces about 12.
The problem with long running applications is not that they need to raise
the limit, it's that they need to handle the error gracefully if they are
in fact under attack. Because hash tables are so ubiquitous in the engine,
there's no guarantee that that's possible, so an attacker would have the
ability to crash the process with the limit turned on, or hang the CPU with
the limit turned off.
Right. It seems like count-and-limit pushes the problem onto the user who
then has to discriminate normal from malicious causes for rising counters
and find appropriate actions for each.
Even a sophisticated user who understands hash collision counters may not
welcome this since it adds complexity that's hard to test and involves
questionable heuristics.
Tom
Quoting a relevant part of the previous discussion:
Let's [try] to quantify the probability of reaching the collision limit C
with a hashtable of size N and assuming a random hash distribution. The
upper bound for this should be (N choose C) * (1/N)^(C-1), with (1/N)^(C-1)
being the probability that C elements hash to one value and (N choose C) the
number of C-element subsets of an N-element set. For large N and N >> C we
approximate (N choose C) by (e*N/C)^C / sqrt(2*pi*C). As such our upper
bound becomes N * (e/C)^C / sqrt(2*pi*C). Choosing N = 2^31 (largest
possible hashtable) we get for C = 20 probability 8.9E-10 and for C = 100
probability 2.3E-149. The patch uses C = 1000.
In other words, it is extremely unlikely that you hit the collision limit
by accident, with non-malicious data. So no, the user does not have to
discriminate normal from malicious causes. If the limit triggers, it's
malicious.
Nikita
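Plugging the patch's C = 1000 into the bound quoted above gives, as a quick
back-of-the-envelope evaluation (not a figure from the thread):

    N * (e/C)^C / sqrt(2*pi*C)
      = 2^31 * (e/1000)^1000 / sqrt(2000*pi)
      \approx 10^{9.3} * 10^{-2565.7} / 79
      \approx 10^{-2558},

far smaller than, say, the chance of guessing a 256-bit key, so an
accidental chain of 1000 equal hashes simply does not happen.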
This might be a bit off topic....
Given that you can set POST_REQUEST_SIZE in a production PHP application,
how likely is it really that an app will encounter a HashDos attack?
From what I gather this will require MBs to GBs of data in order to cause a
DoS.
From the web side, I think there are enough tools to prevent HashDos from
happening...
Would the issue then affect only CLI users?
Sorry, if this seems like a derail, I am pretty new to the internals list.
Cheers, Glenn
On 21 September 2016 13:02:20 BST, Glenn Eggleton geggleto@gmail.com
wrote:
What if we had some sort of configuration limit on collision length?
Previous discussions have come to the conclusion that the difference
between normal collision frequency and sufficient for a DoS is so large
that the only meaningful settings would be on or off. e.g. the proposed
limit is 1000, and randomly inserting millions of rows produces about 12.
The problem with long running applications is not that they need to
raise
the limit, it's that they need to handle the error gracefully if they
are
in fact under attack. Because hash tables are so ubiquitous in the
engine,
there's no guarantee that that's possible, so an attacker would have the
ability to crash the process with the limit turned on, or hang the CPU
with
the limit turned off.
Right. It seems like count-and-limit pushes the problem onto the user who
then has to discriminate normal from malicious causes for rising counters
and find appropriate actions for each.
Even a sophisticated user who understands hash collision counters may not
welcome this since it adds complexity that's hard to test and involves
questionable heuristics.
Tom
Quoting a relevant part of the previous discussion:
Let's [try] to quantify the probability of reaching the collision limit C
with a hashtable of size N and assuming a random hash distribution. The
upper bound for this should be (N choose C) * (1/N)^(C-1), with (1/N)^(C-1)
being the probability that C elements hash to one value and (N choose C) the
number of C-element subsets of an N-element set. For large N and N >> C we
approximate (N choose C) by (e*N/C)^C / sqrt(2*pi*C). As such our upper
bound becomes N * (e/C)^C / sqrt(2*pi*C). Choosing N = 2^31 (largest
possible hashtable) we get for C = 20 probability 8.9E-10 and for C = 100
probability 2.3E-149. The patch uses C = 1000.
In other words, it is extremely unlikely that you hit the collision limit
by accident, with non-malicious data. So no, the user does not have to
discriminate normal from malicious causes. If the limit triggers, it's
malicious.
Nikita
This might be a bit off topic....
Given that you can set POST_REQUEST_SIZE in a production PHP application,
how likely is it really that an app will encounter a HashDos attack?
From what I gather this will require MBs to GBs of data in order to cause
a DoS.
From the web side, I think there are enough tools to prevent HashDos from
happening...
Would the issue then affect only CLI users?
Sorry, if this seems like a derail, I am pretty new to the internals list.
Cheers, Glenn
Again quoting previous thread:
This DOS vulnerability is efficient: A 700kb payload can easily take 5
CPU seconds to process on PHP 7 and from there it goes up quadratically
(i.e. 1.4MB payload takes 20 seconds etc)
I don't remember exactly under what circumstances these numbers are
correct. I think the sizes refer to JSON payloads and the system is a
recent-gen i5. So for the default post_max_size of 8M, which is approx
11 * 700kb, we get an expected execution time of 5 * 11^2 seconds, which is
about 10 minutes. Of course, the execution time limit will trigger before
that :)
So unless your post data size limit is very small or you perform additional
size validation on JSON data (and other data) you receive, this attack is
quite practical and not just a theoretical concern :)
Nikita
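A rough model of those figures, extrapolating from the numbers quoted above
rather than from any new measurement:

    t(p) \approx 5\,s \cdot (p / 700\,KB)^2,
    so  t(1.4\,MB) \approx 20\,s  and  t(8\,MB) \approx 5 \cdot (8192/700)^2 \approx 680\,s,

consistent with the roughly ten-minute estimate above before
max_execution_time cuts the request off.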
Hi!
Let's [try] to quantify the probability of reaching the collision limit C
with a hashtable of size N and assuming a random hash distribution. The
upper bound for this should be (N choose C) * (1/N)^(C-1), with (1/N)^(C-1)
being the probability that C elements hash to one value and (N choose C) the
number of C-element subsets of an N-element set. For large N and N >> C we
approximate (N choose C) by (e*N/C)^C / sqrt(2*pi*C). As such our upper
bound becomes N * (e/C)^C / sqrt(2*pi*C). Choosing N = 2^31 (largest
possible hashtable) we get for C = 20 probability 8.9E-10 and for C = 100
probability 2.3E-149. The patch uses C = 1000.
I assume you're talking here about a per-hashtable limit?
Here I notice two things:
- Hash keys are not really random. This can both improve things and make
  them worse, but we have a lot of keys named "key" and not a lot of keys
  named "\x07\xF5\xDD".
- This does not account for the fact that if we have two keys used by the
  app colliding, we'll probably see this collision a lot. Of course, if
  we're counting per hashtable, that would make it somewhat less, but still.
To validate this theory, I applied the following patch:
https://gist.github.com/smalyshev/2c3c085998ba81d6c163a27386aaacd8
And ran composer update on one of my projects. I got these lines:
Destroyed hash with coll = 48454
Destroyed hash with coll = 41871
Destroyed hash with coll = 41828
Destroyed hash with coll = 40535
Destroyed hash with coll = 32774
Destroyed hash with coll = 22782
Destroyed hash with coll = 21381
Destroyed hash with coll = 15232
Destroyed hash with coll = 14745
Destroyed hash with coll = 12216
Destroyed hash with coll = 11742
Destroyed hash with coll = 11625
Destroyed hash with coll = 8916
Destroyed hash with coll = 6820
Destroyed hash with coll = 5911
Destroyed hash with coll = 5429
Destroyed hash with coll = 3390
Destroyed hash with coll = 1188
I.e. there were 18 hashes with more than 1000 collisions over the course
of the run. Is there something wrong with my count, or am I counting the
wrong things? (I didn't see the patch for limiting counts, so maybe I'm
still counting the wrong things.)
In other words, it is extremely unlikely that you hit the collision limit
by accident, with non-malicious data. So no, the user does not have to
discriminate normal from malicious causes. If the limit triggers, it's
malicious.
I'm not so sure of this now - unless, again, I misunderstand what we're
counting.
--
Stas Malyshev
smalyshev@gmail.com
Hi!
I'm not so sure of this now - unless, again, I misunderstand what we're
counting.
It just occurred to me that I possibly do misunderstand what is counted
- if the proposal is to count collisions per lookup.
I reran my scripts with that assumption, and it turns out the longest
collision chain I have is about 8. So in this case - if I'm
understanding it right now - my argument about the limit is wrong, and
collision chain length is like recursion.
In which case some limit like 1000 (just a random number, but it can be
tested) would probably be OK. The question now is whether it would be
enough to block DoS. I.e. if we construct data to cause 999 collisions
each time to stay just under the limit, can we still cause trouble or not?
It's still almost 1000 times slower than it's supposed to be...
P.S. Sorry for the confusion and the noise, I hope I got the proposed
solution right this time.
Stas Malyshev
smalyshev@gmail.com
In which case some limit like 1000 (just random number but can be
tested) would probably be OK. The question now is would it be enough to
block DoS? I.e. if we construct data to cause 999 collisions each time
to stay just under the limit, can we still cause trouble or not? It's
still almost 1000 times slower than it's supposed to be...
I think I'm right in saying that the power of the attack comes in the fact that the total time doesn't scale linearly but exponentially. Inserting a third element into a chain of two is faster than inserting a tenth element into a chain of nine. So a hash that hits the limit is more than 1000 times slower than a natural one, but 1000 chains of 999 is orders of magnitude faster than one chain of 999000.
That doesn't exactly answer the question of whether 1000 is the right value, of course.
Regards,
--
Rowan Collins
[IMSoP]
I think I'm right in saying that the power of the attack comes in the fact that the total time doesn't scale linearly but exponentially.
quadratic is what i read in the previous thread, iirc. even so, it's
still a useful gain.
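To put rough numbers on that quadratic behaviour (a back-of-the-envelope
comparison, not a benchmark): inserting k keys that all share one hash
costs about k(k+1)/2 bucket comparisons, so

    one chain of 999,000 keys:  999000^2 / 2 \approx 5 \times 10^{11} comparisons
    1000 chains of 999 keys:    1000 \cdot (999 \cdot 1000 / 2) \approx 5 \times 10^{8} comparisons

about three orders of magnitude apart, which is the gain the limit buys
even against an attacker who stays just under it.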
That doesn't exactly answer the question of whether 1000 is the right value, of course.
it's the parameter for what's in effect a statistical hypothesis test
for randomness, built on the assumption that key patterns that are not
hostile are quasi-random and those that are not random are hostile. 1000
seems large if testing randomness, were that the only consideration.
but i guess there is a concern that, in some cases, legitimate use could
have key patterns with regularities that lead to accumulation in some bins.
so it should work, to a useful extent, if there is a parameter value
- low enough to give a worthwhile degree of dos attack protection
- high enough to false-positive only for benign patterns that would in
  any case cause terrible performance degradation
tom
On Thu, Sep 22, 2016 at 1:07 AM, Stanislav Malyshev smalyshev@gmail.com
wrote:
Hi!
Let's [try] to quantify the probability of reaching the collision limit C
with a hashtable of size N and assuming a random hash distribution. The
upper bound for this should be (N choose C) * (1/N)^(C-1), with (1/N)^(C-1)
being the probability that C elements hash to one value and (N choose C) the
number of C-element subsets of an N-element set. For large N and N >> C we
approximate (N choose C) by (e*N/C)^C / sqrt(2*pi*C). As such our upper
bound becomes N * (e/C)^C / sqrt(2*pi*C). Choosing N = 2^31 (largest
possible hashtable) we get for C = 20 probability 8.9E-10 and for C = 100
probability 2.3E-149. The patch uses C = 1000.
I assume you're talking here about a per-hashtable limit?
Here I notice two things:
- Hash keys are not really random. This can both improve things and make
  them worse, but we have a lot of keys named "key" and not a lot of keys
  named "\x07\xF5\xDD".
- This does not account for the fact that if we have two keys used by the
  app colliding, we'll probably see this collision a lot. Of course, if
  we're counting per hashtable, that would make it somewhat less, but still.
To validate this theory, I applied the following patch:
https://gist.github.com/smalyshev/2c3c085998ba81d6c163a27386aaacd8
And ran composer update on one of my projects. I got these lines:
Destroyed hash with coll = 48454
Destroyed hash with coll = 41871
Destroyed hash with coll = 41828
Destroyed hash with coll = 40535
Destroyed hash with coll = 32774
Destroyed hash with coll = 22782
Destroyed hash with coll = 21381
Destroyed hash with coll = 15232
Destroyed hash with coll = 14745
Destroyed hash with coll = 12216
Destroyed hash with coll = 11742
Destroyed hash with coll = 11625
Destroyed hash with coll = 8916
Destroyed hash with coll = 6820
Destroyed hash with coll = 5911
Destroyed hash with coll = 5429
Destroyed hash with coll = 3390
Destroyed hash with coll = 1188
I.e. there were 18 hashes with more than 1000 collisions over the course
of the run. Is there something wrong with my count, or am I counting the
wrong things? (I didn't see the patch for limiting counts, so maybe I'm
still counting the wrong things.)
In other words, it is extremely unlikely that you hit the collision limit
by accident, with non-malicious data. So no, the user does not have to
discriminate normal from malicious causes. If the limit triggers, it's
malicious.
I'm not so sure of this now - unless, again, I misunderstand what we're
counting.
As linked in my first reply, there was a previous discussion on the topic:
http://markmail.org/message/ttbgcvdu4f7uymfb
The collision-counting approach that Yasuo references is linked at the
start of the first mail: https://github.com/php/php-src/pull/1565
Collisions are counted during insertion operations. While we perform a
hashtable insertion, we first need to iterate over the existing buckets for
a certain hash to see if the insert is actually an update. The
implementation simply keeps track of how many buckets we inspect while
doing that. If the number is above a certain threshold, an error is
generated.
As such, the number of collisions per hashtable is not important. Only the
number of collisions for a certain hash-value is relevant. Getting C=1000
collisions for a single hash-value is extremely unlikely unless you
explicitly design your keys to trigger this situation.
Nikita
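A minimal sketch of that counting idea over a simple chained hash table;
this is illustrative C, not the actual zend_hash patch, and the 1000
threshold just mirrors the figure discussed above:

    #include <stdint.h>
    #include <string.h>

    #define NBUCKETS        (1u << 16)
    #define COLLISION_LIMIT 1000

    typedef struct entry {
        struct entry *next;
        uint64_t      h;
        char          key[32];
        int           val;
    } entry;

    static entry *buckets[NBUCKETS];

    /* Insert-or-update; returns -1 if the chain walked for this insert is
     * suspiciously long, which is exactly the collision count to bound. */
    static int ht_insert(entry *e)
    {
        unsigned idx = e->h & (NBUCKETS - 1);
        unsigned seen = 0;

        for (entry *cur = buckets[idx]; cur; cur = cur->next) {
            if (cur->h == e->h && strcmp(cur->key, e->key) == 0) {
                cur->val = e->val;          /* existing key: update in place */
                return 0;
            }
            if (++seen > COLLISION_LIMIT) {
                return -1;                  /* caller raises an error/exception */
            }
        }
        e->next = buckets[idx];             /* genuinely new key: prepend */
        buckets[idx] = e;
        return 0;
    }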
Hi!
As linked in my first reply, there was a previous discussion on the
topic: http://markmail.org/message/ttbgcvdu4f7uymfb
The collision-counting approach that Yasuo references is linked at the
start of the first mail: https://github.com/php/php-src/pull/1565
Collisions are counted during insertion operations. While we perform a
hashtable insertion, we first need to iterate over the existing buckets
for a certain hash to see if the insert is actually an update. The
implementation simply keeps track of how many buckets we inspect while
doing that. If the number is above a certain threshold, an error is
generated.
So the count is per lookup; then I think it's fine. Sorry for
misunderstanding it initially. Is anything preventing us from merging it?
If not, let's do it.
--
Stas Malyshev
smalyshev@gmail.com
Hi!
As linked in my first reply, there was a previous discussion on the
topic: http://markmail.org/message/ttbgcvdu4f7uymfb
The collision-counting approach that Yasuo references is linked at the
start of the first mail: https://github.com/php/php-src/pull/1565
Collisions are counted during insertion operations. While we perform a
hashtable insertion, we first need to iterate over the existing buckets
for a certain hash to see if the insert is actually an update. The
implementation simply keeps track of how many buckets we inspect while
doing that. If the number is above a certain threshold, an error is
generated.
So the count is per lookup; then I think it's fine. Sorry for
misunderstanding it initially. Is anything preventing us from merging it?
If not, let's do it.
I don't like the initial version of the patch that was causing a fatal error
for json_decode. That's not how json_decode should work. I think that Bob
came up later with a better version that was using the json recursion error.
It might require a bit more work for 7.1 as I changed the json parser since
then.
Cheers
Jakub
I don't like the initial version of the patch that was causing a fatal error
for json_decode. That's not how json_decode should work. I think that Bob
came up later with a better version that was using the json recursion error.
It might require a bit more work for 7.1 as I changed the json parser since
then.
The point of the proposed patch is that it causes a fatal error anywhere
that a hash is attacked (and, as discussed, it really is only going to
trigger on a crafted attack).
Adding mitigations elsewhere such as in the JSON parser can be done on
top of that, since they'll presumably catch the problem before the hash
is inserted into.
It's the same as if the attack caused an exponential amount of memory
usage: the engine will bail out as soon as the hard memory limit is
reached, but extensions can and should detect and avoid scenarios likely
to cause that.
Regards,
Rowan Collins
[IMSoP]
On Thu, Sep 22, 2016 at 10:06 AM, Rowan Collins rowan.collins@gmail.com
wrote:
I don't like the initial version of the patch that was causing a fatal error
for json_decode. That's not how json_decode should work. I think that Bob
came up later with a better version that was using the json recursion error.
It might require a bit more work for 7.1 as I changed the json parser since
then.
The point of the proposed patch is that it causes a fatal error anywhere
that a hash is attacked (and, as discussed, it really is only going to
trigger on a crafted attack).
Adding mitigations elsewhere such as in the JSON parser can be done on
top of that, since they'll presumably catch the problem before the hash is
inserted into.
It's the same as if the attack caused an exponential amount of memory
usage: the engine will bail out as soon as the hard memory limit is
reached, but extensions can and should detect and avoid scenarios likely to
cause that.
Nope, the point of Bob's patch is to use graceful handling with an
exception that can be easily checked by the json parser, for example! See
https://github.com/php/php-src/pull/1706
From a quick look, it actually just requires regenerating the parser from the
json ext point of view.
Cheers
Jakub
Nope, the point of Bob's patch is to use graceful handling with an
exception that can be easily checked by the json parser, for example! See
https://github.com/php/php-src/pull/1706
Ah, I stand corrected, I hadn't seen that version referenced before.
Am I right in thinking that the idea here is that if the context is
exception-safe it can opt in to a more graceful handling mechanism? And
that if not, it will go ahead and bail out as in Niki's patch?
If so, seems like a good way forward, assuming the additional complexity
isn't too major.
Regards,
Rowan Collins
[IMSoP]
On Thu, Sep 22, 2016 at 10:54 AM, Rowan Collins rowan.collins@gmail.com
wrote:
Nope, the point of Bob's patch is to use graceful handling with an
exception that can be easily checked by the json parser, for example! See
https://github.com/php/php-src/pull/1706
Ah, I stand corrected, I hadn't seen that version referenced before.
Am I right in thinking that the idea here is that if the context is
exception-safe it can opt in to a more graceful handling mechanism? And
that if not, it will go ahead and bail out as in Niki's patch?
Yeah, it introduces new functions for updating a hash, which are used by
json for updating arrays, and it's also in the std object handler which is
used when updating a json object. For some other bits like updating arrays,
it will stay with a fatal. The thing is that the json parser can then
easily check if there was an exception and, if so, it will set
JSON_ERROR_DEPTH and clear it. It seems much better though.
Cheers
Jakub
2016-09-22 20:10 GMT+02:00 Jakub Zelenka bukka@php.net:
On Thu, Sep 22, 2016 at 10:54 AM, Rowan Collins rowan.collins@gmail.com
wrote:
Nope, the point of Bob's patch is to use graceful handling with an
exception that can be easily checked by the json parser, for example! See
https://github.com/php/php-src/pull/1706
Ah, I stand corrected, I hadn't seen that version referenced before.
Am I right in thinking that the idea here is that if the context is
exception-safe it can opt in to a more graceful handling mechanism? And
that if not, it will go ahead and bail out as in Niki's patch?
Yeah, it introduces new functions for updating a hash, which are used by
json for updating arrays, and it's also in the std object handler which is
used when updating a json object. For some other bits like updating arrays,
it will stay with a fatal. The thing is that the json parser can then
easily check if there was an exception and, if so, it will set
JSON_ERROR_DEPTH and clear it. It seems much better though.
But why JSON_ERROR_DEPTH and not a new constant?
Regards, Niklas
2016-09-22 20:10 GMT+02:00 Jakub Zelenka bukka@php.net:
On Thu, Sep 22, 2016 at 10:54 AM, Rowan Collins rowan.collins@gmail.com
wrote:
Nope, the point of Bob's patch is to use graceful handling with an
exception that can be easily checked by the json parser, for example! See
https://github.com/php/php-src/pull/1706
Ah, I stand corrected, I hadn't seen that version referenced before.
Am I right in thinking that the idea here is that if the context is
exception-safe it can opt in to a more graceful handling mechanism? And
that if not, it will go ahead and bail out as in Niki's patch?
Yeah, it introduces new functions for updating a hash, which are used by
json for updating arrays, and it's also in the std object handler which is
used when updating a json object. For some other bits like updating arrays,
it will stay with a fatal. The thing is that the json parser can then
easily check if there was an exception and, if so, it will set
JSON_ERROR_DEPTH and clear it. It seems much better though.
But why JSON_ERROR_DEPTH and not a new constant?
Yeah I agree and I already suggested it in the PR an hour ago. ;)
https://github.com/php/php-src/pull/1706#discussion_r80102953
Cheers
Jakub
Is the XML API also affected by HashDoS?
cheers, glenn
On Thu, Sep 22, 2016 at 8:13 PM, Niklas Keller <me@kelunik.com> wrote:
2016-09-22 20:10 GMT+02:00 Jakub Zelenka <bukka@php.net>:
On Thu, Sep 22, 2016 at 10:54 AM, Rowan Collins <rowan.collins@gmail.com>
wrote:
Nope, the point of Bob's patch is to use graceful handling with an
exception that can be easily checked by the json parser, for example! See
https://github.com/php/php-src/pull/1706
Ah, I stand corrected, I hadn't seen that version referenced before.
Am I right in thinking that the idea here is that if the context is
exception-safe it can opt in to a more graceful handling mechanism?
And
that if not, it will go ahead and bail out as in Niki's patch?
Yeah, it introduces new functions for updating a hash, which are used by
json for updating arrays, and it's also in the std object handler which is
used when updating a json object. For some other bits like updating arrays,
it will stay with a fatal. The thing is that the json parser can then
easily check if there was an exception and, if so, it will set
JSON_ERROR_DEPTH and clear it. It seems much better though.
But why JSON_ERROR_DEPTH and not a new constant?
Yeah I agree and I already suggested it in the PR an hour ago. ;)
https://github.com/php/php-src/pull/1706#discussion_r80102953
Cheers
Jakub
Hi!
Yeah, it introduces new functions for updating a hash, which are used by
json for updating arrays, and it's also in the std object handler which is
used when updating a json object. For some other bits like updating arrays,
it will stay with a fatal. The thing is that the json parser can then
easily check if there was an exception and, if so, it will set
JSON_ERROR_DEPTH and clear it. It seems much better though.
I'm not sure why there's special handling for JSON. JSON is certainly not
the only way user data can be ingested, and the problem of hash collisions
is common to all these ways.
--
Stas Malyshev
smalyshev@gmail.com
On 22.9.2016 at 22:08, Stanislav Malyshev smalyshev@gmail.com wrote:
Hi!
Yeah, it introduces new functions for updating a hash, which are used by
json for updating arrays, and it's also in the std object handler which is
used when updating a json object. For some other bits like updating arrays,
it will stay with a fatal. The thing is that the json parser can then
easily check if there was an exception and, if so, it will set
JSON_ERROR_DEPTH and clear it. It seems much better though.
I'm not sure why there's special handling for JSON. JSON is certainly not
the only way user data can be ingested, and the problem of hash collisions
is common to all these ways.
--
Stas Malyshev
smalyshev@gmail.com
The patch is not only targeting JSON. He just used JSON as an example.
Every function generating arrays with keys based on user-defined input needs to be updated.
I’m going to update the patch soon and will notify you then.
Bob
Hi!
The patch is not only targeting JSON. He just used JSON as an example.
Every function generating arrays with keys based on user-defined input
needs to be updated.
That looks like a very good way to make a lot of mistakes, miss a lot of
cases and end up playing whack-a-mole with covering all functions. Why
not just patch zend_hash.c and be done with it?
--
Stas Malyshev
smalyshev@gmail.com
On 23.09.2016 at 03:16, Stanislav Malyshev smalyshev@gmail.com wrote:
Hi!
The patch is not only targeting JSON. He just used JSON as an example.
Every function generating arrays with keys based on user-defined input
needs to be updated.
That looks like a very good way to make a lot of mistakes, miss a lot of
cases and end up playing whack-a-mole with covering all functions. Why
not just patch zend_hash.c and be done with it?
--
Stas Malyshev
smalyshev@gmail.com
Hey,
Note that the implementation is going to fall back to a fatal error if even more collisions are reached (in the current patch: 1100 instead of 1000).
So, even if we miss some cases with the exceptions, there will still be a safety net for us.
We could patch zend_hash.c in two ways: SipHash (sloooow) or only fatals (very bad for e.g. servers written in PHP: when they have to decode some JSON, it's trivial for an attacker to crash them very easily). As that's not an option, we need to use exceptions.
Bob
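A tiny sketch of the two-threshold behaviour Bob describes (illustrative C,
not the actual patch; the 1000/1100 values mirror the figures above):

    /* Callers that can handle failure get a recoverable error at SOFT_LIMIT;
     * if insertion somehow continues past that, HARD_LIMIT still aborts as
     * a safety net. */
    #define SOFT_LIMIT 1000   /* recoverable: e.g. surfaced as an exception */
    #define HARD_LIMIT 1100   /* last resort: fatal error */

    enum insert_status { INSERT_OK, INSERT_TOO_MANY_COLLISIONS };

    static enum insert_status check_collisions(unsigned seen, void (*bailout)(void))
    {
        if (seen > HARD_LIMIT) {
            bailout();                          /* does not return */
        }
        if (seen > SOFT_LIMIT) {
            return INSERT_TOO_MANY_COLLISIONS;  /* caller may throw and recover */
        }
        return INSERT_OK;
    }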
Hi!
We could patch zend_hash.c in two ways: SipHash (sloooow) or only fatals
(very bad for e.g. servers written in PHP. When they have to decode some
Why very bad?
JSON, it's trivial for an attacker to crash them very easily). As that's
A fatal error is not a crash. It's a normal ending of the request; if the
server cannot tolerate it, how can it deal with memory limits, string
overflows, etc.? There are a lot of things right now that can cause a fatal
error.
--
Stas Malyshev
smalyshev@gmail.com
Hi,
On Fri, Sep 23, 2016 at 8:16 PM, Stanislav Malyshev smalyshev@gmail.com
wrote:
Hi!
We could patch zend_hash.c in two ways: SipHash (sloooow) or only fatals
(very bad for e.g. servers written in PHP. When they have to decode some
Why very bad?
JSON, it's trivial for an attacker to crash them very easily). As that's
A fatal error is not a crash. It's a normal ending of the request; if the
server cannot tolerate it, how can it deal with memory limits, string
overflows, etc.? There are a lot of things right now that can cause a fatal
error.
That's exactly what we don't want - letting the attacker end our request.
All other things like string overflows and memory limits are under our
control (e.g. we can set a limit on the server and reject such requests),
but this isn't.
Cheers
Jakub
Hi!
That's exactly what we don't want - letting the attacker end our request.
Why not? What else can you do with this request that has clearly bad and
maliciously constructed data?
All other things like string overflows and memory limits are under our
control (e.g. we can set a limit on the server and reject such requests)
Not sure I understand what you mean. How exactly are memory limits under
your control? If somebody sends a request that blows up your memory
limit, how do you control it? In fact, if somebody sends, say, a POST that
goes above your post limit - how do you handle it without terminating the
request?
Stas Malyshev
smalyshev@gmail.com
Hi
On Fri, Sep 23, 2016 at 8:47 PM, Stanislav Malyshev smalyshev@gmail.com
wrote:
Hi!
That's exactly what we don't want - letting the attacker end our request.
Why not? What else can you do with this request that has clearly bad and
maliciously constructed data?
I think there is a confusion about the "servers written in PHP". Those
applications serves more requests in a single (main) PHP request using the
even loop. Good examples of that are Aerys or ReactPHP. So we don't want to
kill that main request if one of the handled requests is malicious (ideally
we just ignore that malicious request and server others).
All other things like string overflows and memory limits are under our
control (e.g. we can set a limit on the server and reject such requests)
Not sure I understand what you mean. How exactly are memory limits under
your control? If somebody sends a request that blows up your memory
limit, how do you control it? In fact, if somebody sends, say, a POST that
goes above your post limit - how do you handle it without terminating the
request?
Usually you will have it behind a proxy, so you can terminate it there if
the request is too big. For example, you could set client_max_body_size in
nginx. However, we can't effectively catch HashDoS by size because it can
be a relatively small request (see the nice description of that in one of
the previous emails from Nikita...).
Cheers
Jakub
Hi!
I think there is a confusion about the "servers written in PHP". Those
applications serves more requests in a single (main) PHP request using
the even loop. Good examples of that are Aerys or ReactPHP. So we don't
want to kill that main request if one of the handled requests is
malicious (ideally we just ignore that malicious request and server others).
That can't work well. PHP makes a lot of assumptions in short-lived
requests, and assuming the same request lives forever is bound to cause
a lot of issues - from stale objects sitting around to suboptimal memory
use to eventual overflows in all kinds of counters, etc. Why not use
real servers for server work?
You yourself say you'll have it behind nginx or so - so why not let
nginx do the server part?
We have several hundred places in PHP where a fatal error (E_ERROR or
E_CORE_ERROR) can be produced. Of course, some of them are compile-time,
but not all of them. E.g., various string overflow scenarios can be
triggered by combining or processing strings (such as uncompressing or
decoding various formats). If your server relies on a proxy to filter out
all fatal error scenarios, I'm afraid it's much harder than it seems.
I of course do not propose to make nginx filter JSON data. On the
contrary, I propose that when we have a clearly malicious DoS attempt (like
a string overflow or hash collision overflow) we fail fast instead of
trying to make it right and playing whack-a-mole. We have enough fails in
this area already.
--
Stas Malyshev
smalyshev@gmail.com
2016-09-24 0:21 GMT+02:00 Stanislav Malyshev smalyshev@gmail.com:
Hi!
I think there is a confusion about the "servers written in PHP". Those
applications serves more requests in a single (main) PHP request using
the even loop. Good examples of that are Aerys or ReactPHP. So we don't
want to kill that main request if one of the handled requests is
malicious (ideally we just ignore that malicious request and server
others).That can't work well.
It works really well.
PHP makes a lot of assumptions in short-lived
requests, and assuming the same request lives forever is bound to cause
a lot of issues - from stale objects sitting around to suboptimal memory
use to eventual overflows in all kinds of counters, etc.
It's not long running requests. They're just CLI apps.
Why not use
real servers for server work?
Performance. Bootstrapping everything on every request is very, very
resource intensive.
You yourself say you'll have it behind nginx or so - so why not let
nginx do the server part?
You don't have to run it behind Nginx at all. Most often you do it, because
you have other (traditional) apps on the same server.
We have several hundred places in PHP where fatal error (E_ERROR or
E_CORE_ERROR) can be produced.
None of these should be triggerable by user input if you do it right.
Of course, some of them are compile-time
but not all of them. E.g., various string overflow scenarios can be
triggered by combining or processing strings (such as uncompressing or
decoding various formats).
Uncompressing files shouldn't be done in web requests anyway. It's best
pushed to a queue and processed by workers then.
If you server relies on proxy to filter out
all fatal error scenarios, I'm afraid it's much harder than it seems.
We don't.
I of course do not propose to make nginx filter JSON data. On the
contrary, I propose when we have clearly malicious DoS attempt (like
string overflow or hash collision overflow) to fail fast instead of
trying to make it right and playing whack-a-mole. We have enough fails in
this area already.
Again, user input shouldn't be able to trigger unrecoverable fatal errors.
Regards, Niklas
Hi Niklas,
2016-09-21 8:54 GMT+02:00 Yasuo Ohgaki yohgaki@ohgaki.net:
Hi Stas,
On Wed, Sep 21, 2016 at 11:26 AM, Stanislav Malyshev
smalyshev@gmail.com wrote:
I think we are better off limiting max collisions.
I'm +1 for Nikita's proposal, which does this.
Max collisions per what? What would the limit be?
Collisions by key.
It would be nice to have a configurable limit like the regex
stack/backtrack limit.
That said, wouldn't 1000 be enough for almost all apps?
Having a limit isn't a good idea for all use cases, but it works for some.
Anyway, we have two choices:
- Simply limit the number of collisions. (Fast and has no impact on code.)
It totally has an impact on code. What happens if we have enough collisions?
Throw an exception? Might break code, any array insert operation might
suddenly fail. Simply fatal? Not suitable for long running applications
like Aerys (https://github.com/amphp/aerys) or PPM
(https://github.com/php-pm/php-pm).
I guess to support these use cases, we have to switch to a cryptographic
hash function here. But I guess it would be totally fine having such a
feature as a compile flag: limit the number of collisions by default and
provide a cryptographic hash function for long running apps to protect them.
I think such an option is fine for those running long running applications,
since they don't rely on setups like shared hosting and can usually compile
their own versions.
That is the same as saying the PCRE stack/backtrack limit and/or the
memory limit is not a good idea.
The same argument applies to the stack/backtrack/memory limits, but I
don't think these limits are a bad idea at all.
AFAIK, we haven't had a bug report complaining about slow hash operations
caused by collisions yet, other than intentional attacks, have we?
The limit does not have to be 1000; it can be as high as possible while
still being safe against an intentional attack.
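
To make the first option concrete, here is a minimal sketch in C of what a
per-key collision cap could look like. The structure and names below are
invented for illustration and do not correspond to zend_hash or to the
proposed patch; the point is only that the check is a single counter on the
collision-chain walk.

/* Minimal sketch of a per-key collision cap; names are invented and do
 * not correspond to the real zend_hash implementation. */
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define MAX_COLLISIONS_PER_KEY 1000   /* hypothetical configurable limit */

typedef struct bucket {
    uint64_t       h;        /* full hash of the key */
    const char    *key;
    size_t         key_len;
    struct bucket *next;     /* collision chain */
} bucket;

enum lookup_result { FOUND, NOT_FOUND, COLLISION_LIMIT_HIT };

/* nslots must be a power of two so the mask selects a slot. */
static enum lookup_result
lookup_capped(bucket **slots, size_t nslots, uint64_t h,
              const char *key, size_t key_len, bucket **out)
{
    size_t walked = 0;
    for (bucket *b = slots[h & (nslots - 1)]; b != NULL; b = b->next) {
        if (b->h == h && b->key_len == key_len &&
            memcmp(b->key, key, key_len) == 0) {
            *out = b;
            return FOUND;
        }
        /* Every extra bucket visited in this chain is a collision
         * for the key being looked up. */
        if (++walked > MAX_COLLISIONS_PER_KEY) {
            return COLLISION_LIMIT_HIT;   /* caller decides: error, bailout, ... */
        }
    }
    return NOT_FOUND;
}

The open question in this thread is exactly what the caller should do on
COLLISION_LIMIT_HIT: a fatal error is cheap but, as noted above, unacceptable
for long-running servers.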
- Use a crypto-safe hash and salt. (Slow and has an impact on opcache, etc.)
Do we need a salt here? If not, how does it then impact opcache?
Yes, it's mandatory.
Even a cryptographic-quality hash has collisions. An attack is possible once
attackers collect a large enough number of them.
A salt breaks hashed key compatibility. Opcache saves hashes, and a salt breaks that.
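
To illustrate the compatibility problem: if the hash of a string depends on a
per-process random seed, the same key hashes differently in different
processes, so anything that stores precomputed hashes in shared memory (as
opcache does) can no longer reuse them. A rough sketch, not PHP's actual hash
function, with arbitrary seed values:

/* Sketch only: a seeded djb2-style hash. The seed values are arbitrary. */
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

static uint64_t hash_seed;   /* imagine this filled from a CSPRNG at startup */

static uint64_t seeded_hash(const char *s, size_t len)
{
    uint64_t h = 5381 ^ hash_seed;            /* seed folded into the start value */
    for (size_t i = 0; i < len; i++) {
        h = h * 33 + (unsigned char)s[i];     /* classic *33 mixing step */
    }
    return h;
}

int main(void)
{
    hash_seed = 0x1234abcdULL;                /* "process A" */
    printf("%llx\n", (unsigned long long)seeded_hash("foo", 3));

    hash_seed = 0xdeadbeefULL;                /* "process B" */
    printf("%llx\n", (unsigned long long)seeded_hash("foo", 3));

    /* Same key, different outputs: a hash value cached by process A is
     * meaningless to process B, which is the opcache problem above. */
    return 0;
}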
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi!
Hi Stas,
On Wed, Sep 21, 2016 at 11:26 AM, Stanislav Malyshev
smalyshev@gmail.com wrote:
I think we are better off limiting max collisions.
I'm +1 for Nikita's proposal, which does this.
Max collisions per what? How much would the limit be?
Collisions by key.
Not sure I understand. What would be counted - the number of collisions per
key? Per hashtable? Per process? Per request?
It would be nice to have a configurable limit like the regex stack/backtrack limit.
That said, wouldn't 1000 be enough for almost all apps?
Certainly not. Not even nearly enough. Collisions are pretty frequent
with short strings, for example, and for a big long-running application
1000 hash collisions is nothing. I think you severely underestimate how
frequent hash collisions are with a simple function like the one we're using,
over the millions and millions of hash accesses that we do routinely.
I did a quick check, and just running run-tests.php -h (without any
tests!) produces about 5K collisions. Running composer (without doing
anything) - 8K collisions. Running composer update on a simple project -
400K (!) collisions. Now these are pretty simple cases compared to what
a complex modern PHP application does. So I think you are
underestimating it by about 4-5 orders of magnitude.
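
If anyone wants to reproduce that kind of count outside the engine, here is a
rough standalone sketch in C (not PHP's code; the key pattern and table size
are made up): hash a batch of short keys with a djb2-style function, mask down
to a slot index, and count how many land in an already occupied slot.

/* Rough standalone sketch, not the PHP engine: count how often short
 * keys collide after masking a djb2-style hash down to a table size. */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

static uint64_t djb2(const char *s)
{
    uint64_t h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

int main(void)
{
    const size_t nslots = 1 << 16;            /* 65536 slots, power of two */
    const size_t nkeys  = 50000;              /* short keys like "k12345" */
    unsigned char *used = calloc(nslots, 1);
    size_t collisions = 0;
    char key[32];

    if (!used) return 1;
    for (size_t i = 0; i < nkeys; i++) {
        snprintf(key, sizeof key, "k%zu", i);
        size_t slot = djb2(key) & (nslots - 1);
        if (used[slot]) collisions++;         /* slot already taken */
        else used[slot] = 1;
    }
    printf("%zu keys, %zu occupied-slot collisions\n", nkeys, collisions);
    free(used);
    return 0;
}

Even a benign, well-mixing hash filling a table to this load factor produces
thousands of occupied-slot hits, so collisions are routine rather than
exceptional, which is the point being made above.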
Anyway, we have two choices:
- Simply limit the number of collisions. (Fast and has no impact on code.)
Fast: true; no impact: not so much. If the limit is too low for your app
(and you probably have no idea how many collisions your app has, and it
probably varies widely, too), your app breaks in some very creative
ways, including dropping dead in the middle of transactions, being
unable to handle errors, etc.
- Use a crypto-safe hash and salt. (Slow and has an impact on opcache, etc.)
These are not the only two choices.
Limits on things are sometimes good to have.
Python even limits the number of recursions to 1000 by default.
Recursion limits are completely different. With recursion, with any
proper algorithm, the deeper you go, the more sharply the probability of
taking the next recursion step drops (usually exponentially - that's how
you build performant algorithms). However, hash collisions are
completely independent, so having one does not change the chance of
having another (in fact, it may make it higher if the same data is
processed again).
Which means, for recursion, it is very probable there is a limit that
most of your code will never reach. Even then we are cautious about it, as
there's no good number for a natural limit.
But with hash collisions, I just see no way to say "10K collisions is
enough for anything", and at least my quick checks - unless I massively
messed up the test - show that it's not easy to find a practical limit,
at least with the current hash function.
We have PCRE stack/backtrack limits. (We'll have an mbregex stack limit soon.)
A collision limit would be a good one, too.
That does not follow, as per the above.
Stas Malyshev
smalyshev@gmail.com
Hi Stas,
On Wed, Sep 21, 2016 at 11:26 AM, Stanislav Malyshev
smalyshev@gmail.com wrote:
I think we are better off limiting max collisions.
I'm +1 for Nikita's proposal, which does this.
Max collisions per what? How much would the limit be?
Collisions by key.
Not sure I understand. What would be counted - the number of collisions per
key? Per hashtable? Per process? Per request?
IIRC, the proposed patch was detecting collisions per key.
It would be nice to have a configurable limit like the regex stack/backtrack limit.
That said, wouldn't 1000 be enough for almost all apps?
Certainly not. Not even nearly enough. Collisions are pretty frequent
with short strings, for example, and for a big long-running application
1000 hash collisions is nothing. I think you severely underestimate how
frequent hash collisions are with a simple function like the one we're using,
over the millions and millions of hash accesses that we do routinely.
I did a quick check, and just running run-tests.php -h (without any
tests!) produces about 5K collisions. Running composer (without doing
anything) - 8K collisions. Running composer update on a simple project -
400K (!) collisions. Now these are pretty simple cases compared to what
a complex modern PHP application does. So I think you are
underestimating it by about 4-5 orders of magnitude.
I agree that we cannot be sure what collision limit is proper for a
certain app.
The same is true for the memory limit, stack limit, backtrack limit,
and recursion limit.
It is possible to set a reasonable limit that is good enough for almost
all apps. AFAIK, we don't have a bug report complaining about slow hash
operations for normal code yet. IMO, this is evidence that we can
set a collision limit safely and prevent intentional hash collision
attacks.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
If anyone wants a VERY rough estimate of relative performance
degradation as a result of switching to SipHash, here's a somewhat naive
C++ implementation of a similar data structure to that found in PHP:
https://github.com/cubiclesoft/cross-platform-cpp
(See the "Hash performance benchmark" results at the above link.)
In short, there's a significant degradation just switching from djb2 to
SipHash depending on key type. A similar effect would probably be seen
in PHP.
The difference is big enough that people won't want this as a precaution
affecting all of PHP's hashes.
But it's small enough that people might opt for it as a defensive
measure in case of serious attacks in the wild. So having an
implementation but not compiling it by default would be interesting.
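
For what it's worth, a compile-time opt-in could look roughly like the sketch
below. The HARDENED_HASH flag, the siphash24() prototype, and the seed
handling are all invented for illustration; PHP has no such switch today, and
a reference SipHash implementation would have to be linked in. The default
build keeps the cheap multiplicative hash, and the hardened build uses a keyed
SipHash.

/* Sketch of a compile-time switch between the default hash and SipHash.
 * All names here are hypothetical. */
#include <stdint.h>
#include <stddef.h>

#ifdef HARDENED_HASH
/* Provided by an external SipHash-2-4 implementation. */
uint64_t siphash24(const void *data, size_t len, const unsigned char key[16]);

static unsigned char hash_key[16];            /* filled from a CSPRNG at startup */

static uint64_t string_hash(const char *s, size_t len)
{
    return siphash24(s, len, hash_key);       /* keyed, HashDoS-resistant */
}
#else
static uint64_t string_hash(const char *s, size_t len)
{
    uint64_t h = 5381;                        /* cheap djb2-style default */
    for (size_t i = 0; i < len; i++) {
        h = h * 33 + (unsigned char)s[i];
    }
    return h;
}
#endif

Whether the cost of the hardened path is acceptable would still have to be
measured against real workloads, as the benchmark above suggests.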
Randomizing the starting hash value for djb2 during the core startup
sequence could also be effective for mitigating HashDoS. Extensive
testing would have to be done to determine how collision performance
plays out with randomized starting hash values. I can't find any
arguments anywhere against using randomized starting hash values for
djb2. Also of note, the 33 multiplier seems more critical than anything
else for mixing bits together.
This is consistent with what Nicholas Clark wrote[1] that I mentioned
already in my reply to Scott.
However, he also says
I've got a sneaking suspicion that this story still has legs, and
that someone will pop up with some new surprise or twist. Hence I'm
keeping an eye open to spot any more developments in this decade old
saga, in case there is action Perl 5 needs to take.
In which case it is nice for Perl to have SipHash implemented but not
compiled by default.
Tom
[1]
http://news.perlfoundation.org/2012/12/improving-perl-5-grant-report-11.html
Would the Internals team be open to discussing mitigating HashDoS in a
future version of PHP? i.e. everywhere, even for json_decode()
and friends,
by fixing the problem rather than capping the maximum number of input
parameters and hoping it's good enough.
I'd propose SipHash (and/or a derivative): https://www.131002.net/siphash/
(Look at all the other languages that already adopted SipHash.)
I briefly looked through the "Users" list and didn't find anything
equivalent to using it as PHP's internal base hash.
Python and Rust have an implementation available to users. Ruby is using
it internally but I think it's focused on JSON.
There's some good info[1] on the situation in Perl 5. While SipHash is
available, it requires a non-default compile-time option.
Correct me if I'm not reading the situation right.
Tom
[1]
http://news.perlfoundation.org/2012/12/improving-perl-5-grant-report-11.html