Tom,
- arc4random puts a generator in the user process.
This is much more controversial. Some people (Anthony F. for one, and
myself until recently) argue that a generator algorithm in the user
process degrades security. It must in any case be downstream of the
kernel source and therefore cannot compensate for any problems in how
kernel randomness is used. Moreover, it adds another point of failure
(a place where there might be bugs, like OpenSSL's RNG bugs). And
finally, since the downstream generator is in the user process rather
than protected kernel memory, it's easier for an attacker to learn the
generator's state and thus predict future outputs.

Others, including Dan Bernstein[1] and the OpenBSD crowd[2], argue that
the latter is a theoretical rather than a practical vulnerability,
because if an attacker can read your memory then your crypto system is
a total failure regardless of where you get random numbers from.

There is one important nuance here though. If an attacker can read
memory, they can compromise anything currently in memory for the
application. Things that have since been freed (such as tokens and
secrets generated from prior requests) are typically safe. However,
with a userland RNG that's not necessarily true unless a reseed (stir)
event happened in between. So you do lose some practical forward
security: things that otherwise would not have leaked may leak. I would
still consider it a practical vulnerability.

And an attacker reading memory is only game over if they can control
how and when that memory is read. It's a fail in any case, but if the
exploit only ever allows them to see memory from their own request,
it's a lot less damaging than if they can see others' requests on the
system.
It's still quite bad, but there's definitely a difference.
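
To make that nuance concrete, here is a toy PHP sketch (the helper name
is made up, and this is not arc4random's actual construction): a
userland generator whose state is just a key and a counter, with no
stir. An attacker who captures the state only after two requests have
finished can rewind the counter and recover tokens the application
freed long ago.

    <?php
    // Toy userland DRBG, for illustration only.
    // State = (key, counter); output block i = SHA-256(key . i).
    function drbg_block(string $key, int $counter): string {
        return hash('sha256', $key . pack('J', $counter), true);
    }

    $key = random_bytes(32); // seeded once from the kernel
    $counter = 0;

    // Two requests each draw a secret token, then free it.
    $token1 = drbg_block($key, $counter++);
    $token2 = drbg_block($key, $counter++);

    // Attacker captures ($key, $counter) only *after* both requests.
    // Rewinding the counter recovers the freed tokens anyway:
    var_dump(hash_equals($token1, drbg_block($key, 0))); // bool(true)
    var_dump(hash_equals($token2, drbg_block($key, 1))); // bool(true)

    // A stir between the requests would have broken the chain: replacing
    // the key destroys the state those earlier tokens were derived from.
    $key = hash('sha256', $key . random_bytes(32), true);
    $counter = 0;
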
A counterargument to this, if I understand it correctly, goes as follows.
Imagine the PHP server process's RNG as a 32-byte key expanded into a very
long string. Each call to get something random consumes a substring of
some length. The next call consumes the next N bytes, and so forth.
The order of these calls and how many bytes each consumes is
unpredictable, so the argument goes. Even a relatively small diversity of
uses for random bytes (diverse requests each with diverse calls) in the
history of the process up to any given RNG call effectively stirs the RNG.
This unpredictability can be considered a source of entropy under the
same kinds of definitions and assumptions under which the kernel
gathers its entropy.
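
Here is a minimal sketch of that mental model in PHP (again with a
made-up helper name, not PHP's real internals): the key defines a long
pseudorandom string, each call consumes the next N bytes, and so the
position any particular call reads from depends on the entire prior
history of the process.

    <?php
    // Expanded-keystream model: block i of the "very long string" is
    // SHA-256(key . i); a read returns the $n bytes starting at $offset.
    function stream_bytes(string $key, int $offset, int $n): string {
        $block = intdiv($offset, 32);
        $skip  = $offset % 32;
        $buf   = '';
        while (strlen($buf) < $skip + $n) {
            $buf .= hash('sha256', $key . pack('J', $block++), true);
        }
        return substr($buf, $skip, $n);
    }

    $key = random_bytes(32); // from the kernel
    $offset = 0;

    // Diverse requests draw diverse amounts. To predict what the *next*
    // call returns, an attacker needs not just the key but the cumulative
    // offset, i.e. the order and size of every call that came before.
    foreach ([16, 9, 32, 3] as $n) { // sizes vary per request in practice
        $chunk = stream_bytes($key, $offset, $n);
        $offset += $n;
    }

Whether that history amounts to meaningful entropy from an attacker's
point of view is exactly the judgment call discussed next.
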
As for my opinion, the argument is categorically invalid only in very
special cases that are pretty much irrelevant in PHP practice. In the
general case, its validity boils down to how much entropy one believes
is introduced this way, and how often. That belief is clearly
subjective.
As a PHP user, I get a lot of confidence from this argument. I only saw
it properly set out very recently, but I was using essentially the same
argument against people who (having read the notorious Linux random(4)
man page or having picked up the meme) said urandom is unsafe for
crypto because you must not consume more entropy than you can gather.
Granted, there's nothing estimating how much entropy is introduced this
way. Granted, there's even more of this mixing if we all use the
system's one central RNG. But my subjective feeling is that it need
introduce only a little unpredictability to make practical exploits
based on predicting future outputs very difficult.
- Thanks again for the discussion.

It's my pleasure; an interesting topic, and as yet unresolved among its
cognoscenti.
Anthony