Binary (un)safety of password_hash() used with PASSWORD_BCRYPT

4 years ago by Sara Golemon — view source

unread

I have this notion that we've discussed this before, I'm certain I knew
that bcrypt wasn't binary safe, but someone reminded me that
password_hash() could be called with null bytes in the password itself and
that is just SCREAMING to have some safety-rails put on IMO.

So I've thrown together https://github.com/php/php-src/pull/6897 to test
that argon2 algos behave well (they do!), and modified the bcrypt algo to
throw an exception if you try to hash a new password containing a null, but
only warns if you've got a null when running password_verify(). My
reasoning for the latter is because anyone trying to auth with a null
character that succeeds does definitely know enough of the password to pass
by simply not passing the null, so we shouldn't break existing users who
already have hashes. This only aims to break users who would otherwise be
able to include a null, because they would have a false sense of security
having their password truncated and can remedy that in their password
choosing.

Since this does introduce a (small) break, I'm putting it to the list for
opinions.
If nobody objects, I'll merge this (8.1 only) at the end of the month.

-Sara

4 years ago by Niklas Keller — view source

unread

Sara Golemon pollita@php.net schrieb am Do., 22. Apr. 2021, 03:47:

I have this notion that we've discussed this before, I'm certain I knew
that bcrypt wasn't binary safe, but someone reminded me that
password_hash() could be called with null bytes in the password itself and
that is just SCREAMING to have some safety-rails put on IMO.

So I've thrown together https://github.com/php/php-src/pull/6897 to test
that argon2 algos behave well (they do!), and modified the bcrypt algo to
throw an exception if you try to hash a new password containing a null, but
only warns if you've got a null when running password_verify(). My
reasoning for the latter is because anyone trying to auth with a null
character that succeeds does definitely know enough of the password to pass
by simply not passing the null, so we shouldn't break existing users who
already have hashes. This only aims to break users who would otherwise be
able to include a null, because they would have a false sense of security
having their password truncated and can remedy that in their password
choosing.

Hey Sara,

Thank you for your work. I think this is a really good safety check to
have. I'd however go a step further and also throw on NUL in
password_verify.

You seem to assume that NUL bytes as input come from the end user, but I
think it's more likely the developer uses a hash function with raw output
as a pre-encoding, for reasons such as bringing long passwords below the
bcrypt character limit.

Best,
Niklas

Since this does introduce a (small) break, I'm putting it to the list for

opinions.
If nobody objects, I'll merge this (8.1 only) at the end of the month.

-Sara

4 years ago by Sara Golemon — view source

unread

Thank you for your work. I think this is a really good safety check to
have. I'd however go a step further and also throw on NUL in
password_verify.

You seem to assume that NUL bytes as input come from the end user, but I
think it's more likely the developer uses a hash function with raw output
as a pre-encoding, for reasons such as bringing long passwords below the
bcrypt character limit.

I definitely am making the assumption you describe and it slightly
horrifies me to learn that people are using a pre-digest (though I can
understand why they would; ostensibly this should improve significance of
the input bits).

Do you have a link to places where frameworks are doing this? I built a
contrived example which I think summarizes the behavior you described here:
https://3v4l.org/6tunp

-Sara

4 years ago by Kamil Tekiela — view source

unread

I don't like throwing exceptions for pretty much the same reasons as Nikita
described. This is a rather limited attack vector. It depends either on the
user going out of their way to make their password vulnerable or on the
developer introducing the vulnerability with the use of another function
mangling the passwords. The latter is more likely. In your example, you had
to make the output of MD5 binary to make the function fail. In fact, as far
as I know, browsers strip NUL bytes or convert them to Unicode unknown
character. See spec
https://html.spec.whatwg.org/multipage/parsing.html#parse-error-unexpected-null-character
If developers want to add such measures as part of defence in depth then
the onus is on them to deal with passwords containing NUL bytes. As for
PHP's behaviour, what we could do is strip NUL bytes before hashing with
bcrypt. This would still weaken passwords, but at least wouldn't discard
the rest of the string. We could also add a warning to the manual
explaining that the function is not binary safe.

4 years ago by Sara Golemon — view source

unread

I don't like throwing exceptions for pretty much the same reasons as
Nikita described.
This is a rather limited attack vector. It depends either on the user
going out of their way
to make their password vulnerable or on the developer introducing the
vulnerability with
the use of another function mangling the passwords. The latter is more
likely. In your
example, you had to make the output of MD5 binary to make the function
fail. In fact,
as far as I know, browsers strip NUL bytes or convert them to Unicode
unknown character.
See spec
https://html.spec.whatwg.org/multipage/parsing.html#parse-error-unexpected-null-character

To simplify this discussion, let's take it as a given that we're not
getting nul bytes directly from the end user, or if we are that they
deserve what they get.

I'd like to focus on the other case, where pre-digests are done. I'll
agree that it's the framework author doing something unexpected that puts
us in this situation, but I don't agree that them doing so is unreasonable.
It's known that bcrypt has a length limitation. It's also clear that bits
of collision space are wasted by using ascii input. Those two facts make
producing binary output from a pre-digest function entirely reasonable.
Specifically, the largest output digest function available for the input
space. sha512 produces 64 bytes of output which fit neatly into bcrypt's
72 character limit.
There is a 64 * (1/256) chance that at least one of those octets is a \0.
Or more concisely 1 in 8. The degree to which this weakens the password
depends on the specific position, anywhere from negligible (last position)
to complete (first position).

I don't like 1/8 odds of my password being made less secure due to a poor
understanding of the limits of the default* password algorithm. That's my
motivation.

bcrypt being the only "always available" algorithm is an issue unto
itself and deserves a separate thread.

As for PHP's behaviour, what we could do is strip NUL bytes before
hashing with bcrypt.
This would still weaken passwords, but at least wouldn't discard the rest
of the string.

That's an option, but then we run into multi-version incompatibilities.
If a machine running 8.1 hashes "foo\0bar" (meaning it effectively hashes
"foobar"), other machines running 8.1+ can verify it just fine, but any
machine running 8.0- try to verify against "foo", and fail.
This is a specific issue for heterogeneous deployments, so maybe that's a
tolerable thing, but I'd probably introduce it in a major (e.g. 9.0). We
could add the fallback check in verify now, and the change to hash in 9.0
so that the only way it'd fail would be producing on 9.0+ but verifying on
8.0- which is a bit extreme in terms of mixed deployments.

I'll be honest, I'd probably defer the exception in this PR until 9.0 as
well, given the pushback.

We could also add a warning to the manual explaining that the function is
not binary safe.

100% Gonna do that regardless of any decision made here.

-Sara

4 years ago by Niklas Keller — view source

unread

Sara Golemon pollita@php.net schrieb am Do., 22. Apr. 2021, 17:27:

Thank you for your work. I think this is a really good safety check to
have. I'd however go a step further and also throw on NUL in
password_verify.

You seem to assume that NUL bytes as input come from the end user, but I
think it's more likely the developer uses a hash function with raw output
as a pre-encoding, for reasons such as bringing long passwords below the
bcrypt character limit.

I definitely am making the assumption you describe and it slightly
horrifies me to learn that people are using a pre-digest (though I can
understand why they would; ostensibly this should improve significance of
the input bits).

Do you have a link to places where frameworks are doing this? I built a
contrived example which I think summarizes the behavior you described here:
https://3v4l.org/6tunp

I have links to a library / blog post:

https://github.com/paragonie/password_lock

https://paragonie.com/blog/2016/02/how-safely-store-password-in-2016#why-bcrypt

We're probably better off returning false for verify then instead of
throwing? Hash could hash a random password instead if NUL bytes are
present.

Best,
Niklas

-Sara

4 years ago by Sara Golemon — view source

unread

Do you have a link to places where frameworks are doing this? I built a
contrived example which I think summarizes the behavior you described
here:
https://3v4l.org/6tunp

I have links to a library / blog post:

https://github.com/paragonie/password_lock

https://paragonie.com/blog/2016/02/how-safely-store-password-in-2016#why-bcrypt

We're probably better off returning false for verify then instead of
throwing? Hash could hash a random password instead if NUL bytes are
present.

So this library (which I'd need convincing is widely used) doesn't actually
have any null byte problems.
Yes, the digest produced by hash(...,true) could have null bytes,
but it's immediately piped into base64_encode() which papers directly over
that issue producing only ascii printable output.

-Sara

4 years ago by Niklas Keller — view source

unread

Sara Golemon pollita@php.net schrieb am Fr., 23. Apr. 2021, 00:21:

Do you have a link to places where frameworks are doing this? I built a
contrived example which I think summarizes the behavior you described
here:
https://3v4l.org/6tunp

I have links to a library / blog post:

https://github.com/paragonie/password_lock

https://paragonie.com/blog/2016/02/how-safely-store-password-in-2016#why-bcrypt

We're probably better off returning false for verify then instead of
throwing? Hash could hash a random password instead if NUL bytes are
present.

So this library (which I'd need convincing is widely used) doesn't
actually have any null byte problems.
Yes, the digest produced by hash(...,true) could have null bytes,
but it's immediately piped into base64_encode() which papers directly over
that issue producing only ascii printable output.

Scott knows what he's doing, so yes, this library isn't vulnerable nor did
I say that. It is rather an example of pre-encoding. People might remember
the approach incorrectly or have a similar idea themselves and make
mistakes in their own version of such code.

Best,
Niklas

4 years ago by Kamil Tekiela — view source

unread

We can also consider switching the default to Argon2id.
As Scott says the NUL byte truncation is not a bug in PHP, but a bug in the
algorithm. I don't know the exact specification but maybe we should leave
the current implementation as is?

4 years ago by Sara Golemon — view source

unread

We can also consider switching the default to Argon2id.
As Scott says the NUL byte truncation is not a bug in PHP, but a bug in
the algorithm. I don't know the exact specification but maybe we should
leave the current implementation as is?

The only way we can make argon2i(d) into the default is if it's always
available.
Currently, the only implementations we have are from external (non-system)
libraries, and making those libraries required is essentially a non-starter.

-Sara

4 years ago by Sara Golemon — view source

unread

People might remember the approach incorrectly or have a similar idea
themselves and make mistakes in their own version of such code.

While I agree in principle that this is possibly true and a good
justification for adding guard rails, I have to admit that assumptions are
only assumptions, they're not data. If this pattern is to be used as
evidence of a problem, then citations are required.

-Sara

4 years ago by Nikita Popov — view source

unread

I have this notion that we've discussed this before, I'm certain I knew
that bcrypt wasn't binary safe, but someone reminded me that
password_hash() could be called with null bytes in the password itself and
that is just SCREAMING to have some safety-rails put on IMO.

So I've thrown together https://github.com/php/php-src/pull/6897 to test
that argon2 algos behave well (they do!), and modified the bcrypt algo to
throw an exception if you try to hash a new password containing a null, but
only warns if you've got a null when running password_verify(). My
reasoning for the latter is because anyone trying to auth with a null
character that succeeds does definitely know enough of the password to pass
by simply not passing the null, so we shouldn't break existing users who
already have hashes. This only aims to break users who would otherwise be
able to include a null, because they would have a false sense of security
having their password truncated and can remedy that in their password
choosing.

Since this does introduce a (small) break, I'm putting it to the list for
opinions.
If nobody objects, I'll merge this (8.1 only) at the end of the month.

I don't think this is a good idea. This adds an error condition that is
based on the input string, which is generally user-provided. As there is
nothing in the default stack preventing null bytes from passing in via HTTP
(correct me if I'm wrong), this means the error can be trivially triggered,
probably on about any PHP application using password_hash(). I don't think
this is acceptable, for much the same reason that having file_exists()
throw on null bytes is a bad idea (something we recently changed).

I'm not sure what problem you're actually trying to solve here. Ultimately,
all this allows is a user to shoot themselves in the foot if they want to
-- and they do need to try actively, it's not like you can end up with a
null byte in a password by accident.

Regards,
Nikita

4 years ago by Sara Golemon — view source

unread

I don't think this is a good idea.

Fair, and that objection alone is enough to merit going RFC if I decide to
proceed with this.

This adds an error condition that is based on the input string, which is
generally user-provided.
As there is nothing in the default stack preventing null bytes from
passing in via HTTP (correct me if I'm wrong),
this means the error can be trivially triggered, probably on about any
PHP application using password_hash().

Correct on all counts. However I see that as a good thing. The site
developer no longer is left in a state where they have less security than
they think they have. Say they have a minimum password length constraint
(quite common), and that check has passed, "Yes, the user came up with a
password that's at least 20 character long, good." Now they pass that into
password_hash(), but because they didn't check for null bytes, the password
that actually gets hashed is only three bytes long, or maybe even an empty
string. Nobody in the chain realizes this is the case, because why on
Earth would it be?

I'm not sure what problem you're actually trying to solve here.
Ultimately, all this allows is a user to shoot themselves in the foot if
they want to -- and they do need to try actively, it's not like you can end
up with a null byte in a password by accident.

Niklas' reply just prior to yours actually offers a case where that's not
unlikely.

it's more likely the developer uses a hash function with raw output as a
pre-encoding,
for reasons such as bringing long passwords below the bcrypt character
limit.

I hadn't actually considered this, but it makes null bytes showing up in an
input password not only possible, but reasonably likely.
In fact, there's a 1/256 chance that the lead byte would be null making
their effective password into an empty string from bcrypt's point of view.
This means that a brute force attack trying to login as that user would
also have a 1/256 chance of match it since the hashed version of a
completely dissimilar password is also empty.

The actual attack complexity has a worst case of 1/65536 since they have to
try lead-null hashable passwords on all users as they won't know which have
the unlucky property of having already hashed into a lead-null input.
Though if the pre-hash algorithm is known they could probably bring this
back to 1/256 by precomputing lead-null hashing inputs before scanning
users for a match.

Yes, this is something that can be addressed in userspace, but is it? I
don't actually know the answer to that, but it worries me.

-Sara

4 years ago by Claude Pache — view source

unread

Le 22 avr. 2021 à 03:47, Sara Golemon pollita@php.net a écrit :

I have this notion that we've discussed this before, I'm certain I knew
that bcrypt wasn't binary safe, but someone reminded me that
password_hash() could be called with null bytes in the password itself and
that is just SCREAMING to have some safety-rails put on IMO.

So I've thrown together https://github.com/php/php-src/pull/6897 to test
that argon2 algos behave well (they do!), and modified the bcrypt algo to
throw an exception if you try to hash a new password containing a null, but
only warns if you've got a null when running password_verify(). My
reasoning for the latter is because anyone trying to auth with a null
character that succeeds does definitely know enough of the password to pass
by simply not passing the null, so we shouldn't break existing users who
already have hashes. This only aims to break users who would otherwise be
able to include a null, because they would have a false sense of security
having their password truncated and can remedy that in their password
choosing.

Since this does introduce a (small) break, I'm putting it to the list for
opinions.
If nobody objects, I'll merge this (8.1 only) at the end of the month.

-Sara

Hi,

The intention is good, but the solution not so, especially for password_verify. You introduce a new, subtle way for this function to behave unexpectedly; because currently, password_verify does not currently trigger warning when given input of correct type. (Or if it does, it must be documented and fixed, because it would be a bad thing.)

Also, the warning message you introduced contains “... this hash should be regenerated using ...”. There is already a dedicated function for conveying that information in a coder-friendly way, namely password_needs_rehash(): let’s use that (it implies that the default algorithm should be changed).

—Claude

4 years ago by Sara Golemon — view source

unread

Also, the warning message you introduced contains “... this hash should
be regenerated using ...”.
There is already a dedicated function for conveying that information in a
coder-friendly way,
namely password_needs_rehash(): let’s use that (it implies that the
default algorithm should be changed).

Negative. There's no way for password_needs_rehash() to know that the
original password contained a null byte.
1/ Because the original password isn't sent to password_needs_rehash, only
the generated hash
2/ Because even if you could derive what password was used to generate the
hash, there's no way to tell if it was "foo" or "foo\0bar", they would both
generate the same hash (given the same salt)

-Sara