Could you give some examples?
I'm not sure what kind of IoT devices/OS that support PHP do not have
CSPRNG.
I'm sorry, my reply ended up with subject "Re: internals Digest 27 Jan
2017 10:58:15 -0000 Issue 4425". My fault. I'll copy it here...
There are two problems. One is embedded OSs with crummy RNGs.
The other is any OS in a "low-entropy environment", fancy-talk for the
situation when the OS's techniques for gathering "noise" from devices
are frustrated by their absence, or little to no activity on those
devices, or the activity not being random.
I don't want to get into an argument about which IoT Things you might
find PHP on. But we know it's growing fast, that the Things are significant
in botnets, and that the Things often come with a web server for admin. It's
not unreasonable to use PHP+SQLite to admin a Linux-based baby monitor, for
example.
OSes can provide CSPRNG w/o hardware based RNG. Security on IoT matters
a lot, especially for IoT that supports PHP. CSPRNG features are in PHP core
already. Secure PHP scripts wouldn't work on such devices anyway,
e.g. generating nonce or like.
Correct. But we're talking about mt_rand() and therefore not about
crypto. Apps that used it correctly, i.e. that did not rely on its
output being unpredictable, will stop working if you make mt_rand() fail
on these systems.
It's a policy that selects competent developers who understand mt_rand()
for greatest punishment.
Issues are
- Current mt_rand() is not fully exploited. It wastes more than 99% of its
random cycle.
This was discussed in Aug last year and dropped.
- Current uniqid()'s entropy is extremely poor and there is fair chances
for collisions.
I don't much care about uniqid().
Question is
- Are we going to keep these poor behaviors as PHP spec/standard forever
or not.
idk.
My only argument in this is to disagree with you that it's ok to change
these functions so they don't work in situations in which they did
before. You can improve their seeding without doing that. Just don't
make them fail when the improved seeding fails.
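A minimal userland sketch of the behaviour argued for above: try a CSPRNG-derived 32-bit seed first and silently fall back when that fails. The helper name pick_seed() and the crc32()-based fallback are assumptions made for illustration; they stand in for the internal seeding code and are not the actual php-src implementation.

```php
<?php
// Hypothetical helper: returns a 32-bit seed, preferring the CSPRNG.
function pick_seed(): int
{
    try {
        // unpack('N', ...) decodes 4 bytes as an unsigned 32-bit big-endian integer.
        return unpack('N', random_bytes(4))[1];
    } catch (\Throwable $e) {
        // Weak fallback (illustration only): hash of the current time and process id.
        return crc32(microtime() . getmypid());
    }
}

$seed = pick_seed();
mt_srand($seed);         // seeding succeeds whichever branch supplied the seed
$value = mt_rand(1, 6);  // mt_rand() keeps working as it did before
```

The point of the design is that the caller never sees the CSPRNG failure; only the quality of the seed changes.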
Apart from that, I don't mind you flogging these two dead horses.
Tom
Hi Tom,
Could you give some examples?
I'm not sure what kind of IoT devices/OS that support PHP do not have
CSPRNG.
I'm sorry, my reply ended up with subject "Re: internals Digest 27 Jan
2017 10:58:15 -0000 Issue 4425". My fault. I'll copy it here...
There are two problems. One is embedded OSs with crummy RNGs
http://samvartaka.github.io/cryptanalysis/2017/01/03/33c3-embedded-rngs.
The other is any OS in a "low-entropy environment", fancy-talk for the
situation when the OS's techniques for gathering "noise" from devices are
frustrated by their absence, or little to no activity on those devices, or
the activity not being random.
I don't want to get into an argument about on which IoT Things you might
find PHP. But we know its growing fast, the Things are significant in
botnets
https://krebsonsecurity.com/2016/09/krebsonsecurity-hit-with-record-ddos/,
and that the Things often come with a web server for admin. It's not
unreasonable to use PHP+SQLite to admin a Linux-based baby monitor, for
example.
Generating CS random values is not simple, as the document you mentioned
explains. "secure randomness is ideally provided as a system service." I
agree with this statement. Systems that support complex software like PHP
should have a CSPRNG. As far as I understand, all IoT devices that support
PHP have at least a working CSPRNG.
"This last issue also means that most embedded OSes that do implement a
CSPRNG only implement a non-blocking interface (eg. the /dev/urandom
interface on Unix-like systems) in order to prevent insufficient entropy
from holding up cryptographic operations"
Systems like IoT devices may lack a TRNG and proper reseeding. Therefore,
"random" values risk becoming known to attackers. E.g. some OSes' CSPRNGs
use RC4 for /dev/urandom. RC4 is known to be vulnerable, but it is still
considered secure if the first part of the stream is discarded. However, we
cannot guarantee 100% security, because RC4 is a stream cipher and because
of its simplicity.
Therefore, application developers had better take additional care even
with a CSPRNG. For example, the current session module reads a random value
from the CSPRNG and generates the session ID from it directly. The session
module reads at least 64 extra bytes from the CSPRNG to mitigate the risk of
leaked CSPRNG state.
We need such caution with a CSPRNG, but I suppose we don't have to worry
about CSPRNG availability because it should be there, and if it cannot be
used, it's a system problem, not a PHP issue. We don't care whether data
read from memory is broken or not. CSPRNG availability is the same.
OSes can provide CSPRNG w/o hardware based RNG. Security on IoT matters
a lot, especially for IoT that supports PHP. CSPRNG features are in PHP core
already. Secure PHP scripts wouldn't work anyway on such devices anyway.
e.g. generating nonce or like.
Correct. But we're talking about mt_rand() and therefore not about crypto.
Apps that used it correctly, i.e. that did not relying on its output
being unpredictable, will stop working if you make mt_rand() fail on these
systems.
It's a policy that selects competent developers that understand mt_rand()
for greatest punishment.
I will change my opinion if CSPRNG usage causes breakage that cannot be
ignored.
IMO, if users really need a working mt_rand() w/o CSPRNG, they should write
code like
mt_srand((int) (lcg_value() * 10000000000));
mt_rand();
The vast majority of PHP users should enjoy better random values. They should
not be disturbed by the very few users whose systems lack a fundamental
security related service, i.e. CSPRNG. IMO.
A single additional line of PHP code shouldn't be difficult for any user.
Issues are
- Current mt_rand() is not fully exploited. It wastes more than 99% of its
random cycle.
This was discussed in Aug last year and dropped.
I didn't notice the discussion, let's discuss again.
It does not make sense to discard more than 99% of the cycle. In addition,
the MT rand reference implementation is capable of initializing the whole
state buffer, so why shouldn't we have it?
- Current uniqid()'s entropy is extremely poor and there is fair
chances for collisions.
I don't much care about uniqid().
Question is
- Are we going to keep these poor behaviors as PHP spec/standard forever
or not.
idk.
My only argument in this is to disagree with you that it's ok to change
these functions so they don't work in situations in which they did before.
You can improve their seeding without doing that. Just don't make them fail
when the improved seeding fails.
Apart from that, I don't mind you flogging these two dead horses.
My point is described in the previous paragraph.
The vast majority of PHP users should enjoy better random values. They should
not be disturbed by the very few users whose systems lack a fundamental
security related service, i.e. CSPRNG. IMO.
There is no real BC issue here, IMHO.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi all,
Issues are
- Current mt_rand() is not fully exploited. It wastes more than 99% of its
random cycle.
This was discussed in Aug last year and dropped.
I didn't notice the discussion, let's discuss again.
It does not make sense to discard more than 99% of cycle. In addition, MT
rand reference implementation is capable to initialize whole state buffer,
why we shouldn't have it?
First of all, users should never use mt_rand() for any security related
purposes.
However, there is too much code that uses mt_rand() for that purpose.
MT rand is a lot harder to predict than older predictable PRNGs. Many users
misunderstand it as "secure like CSPRNG" because it has been advertised too
much as "very secure and harder to predict than older PRNGs".
Attackers can get an "mt_rand() generated random string" very easily from
many apps.
Getting a value generated by mt_rand() is just a matter of requesting the
proper feature of the app, e.g. a password generated by mt_rand() or an
access token generated by mt_rand().
With an ideal implementation, attackers must search for the random sequence
in a cycle of 2^19937−1, but they can guess the random sequence easily
because PHP's mt_rand() has only 2^32 initial states.
That is far less than 1% of MT rand's capability. This allows attackers to
guess an "mt_rand() generated random string" more easily than they otherwise
could.
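To make the size argument concrete, here is a small sketch of the search an attacker could run. It assumes the attacker has seen a few raw mt_rand() outputs and that the generator was seeded with a plain integer; the function name recover_seed() and the deliberately tiny search bound are illustrative only. A full search would cover at most 2^32 candidate seeds.

```php
<?php
// Hypothetical helper: tries every seed up to $maxSeed and returns the first
// one whose initial mt_rand() outputs match the observed values.
function recover_seed(array $observed, int $maxSeed): ?int
{
    for ($seed = 0; $seed <= $maxSeed; $seed++) {
        mt_srand($seed);
        $match = true;
        foreach ($observed as $value) {
            if (mt_rand() !== $value) {
                $match = false;
                break;
            }
        }
        if ($match) {
            return $seed;
        }
    }
    return null;
}

// "Victim" side: seed with a small secret and leak three outputs.
mt_srand(4242);
$observed = [mt_rand(), mt_rand(), mt_rand()];
$next = mt_rand(); // the value the attacker wants to predict

// "Attacker" side: recover a matching seed, replay the sequence, predict.
$found = recover_seed($observed, 10000);
mt_srand($found);
mt_rand(); mt_rand(); mt_rand();
$predicted = mt_rand();
```

The search bound of 10000 only keeps the demo fast; nothing in the approach depends on it.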
Attackers may get a victim's correct password by combining "getting an
mt_rand() generated password" with "social engineering", e.g. notifying the
victim: "Your account is compromised, please reset your password. Please
make sure to use the system generated strong random password".
Exploiting an access token that by itself allows access to the system could
be even easier.
Attackers obtain an mt_rand() created access token, guess the next access
token to be generated, then use that access token, which is owned by some
poor victim.
Mainly, it is the users' fault because mt_rand() should never be used for
the above purposes, but mt_rand() generated random strings being far weaker
than they could be is our fault.
Our part could be fixed by us. Let's fix it now.
Lauri made a patch for unseeded mt_rand(). I'll prepare a patch that allows
int array initialization for mt_srand() so that the whole state buffer can
be initialized as the user specifies.
void mt_srand(int|array $seed)
where $seed could be
$seed = [123456789, 987654321, ....]; // Up to max size of state buffer
It can be said that the current mt_rand() is good enough for its purpose.
I totally agree with this.
However, I cannot agree that the current mt_rand() implementation is
ideal/what it should be.
Any comments?
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi all,
Seed is very important for a PRNG, and the current seeding code/behavior has
other issues.
First issue is:
- PHP does not care whether seeding is done by the "user" or by the "system"
(lcg random now).
- If the user seeds with mt_srand(1234), then the seed stays in effect for
mt_rand()/rand() calls across requests.
Most users would expect "random seeding" when there is no
mt_srand()/srand() call in the current execution, but currently that is not
the case.
I can think of 2 choices to fix this behavior:
- Set BG(mt_rand_is_seeded) = 0 in RINIT always, forcing a reseed by the
system where applicable.
- Add a new BG(mt_rand_is_user_seeded) flag; if it is 1, set
BG(mt_rand_is_seeded) = 0 in RINIT. (A little more efficient than 1)
Thoughts?
In addition to the previous issue, rand()/srand() is now an alias of
mt_rand()/mt_srand(). Most developers expect rand() and mt_rand() to be
unrelated PRNGs and may write the following code
srand(1234);
$rnd = rand(); // We need the same rand() for XXX
Somewhere in other code in the same app,
$rnd = mt_rand(); // We need hard to predict non CS purpose random here.
Obviously, the mt_rand() call is not random at all.
This affects all MT rand usage such as shuffle(), etc.
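The shared state described above can be observed directly. This is a small sketch for PHP 7.1 or later, where srand() is documented as an alias of mt_srand(); the variable names are made up for the demonstration.

```php
<?php
// Since PHP 7.1, srand()/rand() and mt_srand()/mt_rand() share one generator state.
srand(1234);
$first = mt_rand();

srand(1234);
$second = mt_rand();

// Same seed through srand(), same value out of mt_rand().
$sameValue = ($first === $second);

// shuffle() draws from the same state, so a user seed pins its result too.
$a = range(1, 10);
$b = range(1, 10);
srand(1234);
shuffle($a);
srand(1234);
shuffle($b);
$sameOrder = ($a === $b);

var_dump($sameValue, $sameOrder);
```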
Instead of sharing the same MT rand state, it may be better to have a
dedicated state for rand()/srand() at least.
There are a few functions that use MT rand, like shuffle(), but I would like
to avoid allocating a state buffer for each MT rand usage. One possible
resolution may be adding a reseed flag to srand()/mt_srand().
// Force system reseeding
srand(TRUE);
mt_srand(TRUE);
then it may be used as follows
// Need randomness that is not affected by other parts of the code, i.e.
// srand(123)/mt_srand(123) somewhere else.
mt_srand(TRUE);
shuffle($my_random_array);
I don't like this idea myself. I don't like a seeding flag for shuffle()/etc
either. Writing the code is easy, but this issue is not easy to fix.
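For comparison, the existing API already has a close relative of the proposed flag: calling mt_srand() with no argument picks a fresh seed chosen by PHP. The sketch below uses that call instead of the proposed mt_srand(TRUE), which is not an existing signature; $deck is a made-up variable, and how good the automatically chosen seed is remains the open question of this thread.

```php
<?php
mt_srand(123);         // some other part of the app pinned the sequence

$deck = range(1, 52);
mt_srand();            // no argument: reseed with a seed chosen by PHP
shuffle($deck);        // no longer tied to the seed 123 above
```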
Any better ideas?
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
I'll prepare patch that allows int array
initialization for mt_srand() so that whole state buffer can be
initialized as user specifies.
Don't.
See these:
https://github.com/php/php-src/pull/2089
https://www.mail-archive.com/internals@lists.php.net/msg87590.html
--
Lauri Kenttä
Hi Lauri,
On Mon, Jan 30, 2017 at 7:41 PM, Lauri Kenttä lauri.kentta@gmail.com
wrote:
I'll prepare patch that allows int array
initialization for mt_srand() so that whole state buffer can be
initialized as user specifies.
Don't.
See these:
https://github.com/php/php-src/pull/2089
https://www.mail-archive.com/internals@lists.php.net/msg87590.html
Nice!
Thank you for letting me know.
BTW, do you have a good idea for the seeding issue?
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
I got it now. An object based PRNG sounds nice. I may write it, but
I already have a long todo list.
Anyway, we have to take care of the existing API and make it usable.
We must have an MT rand reseed API somehow.
Ideas are appreciated.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi all,
Posting RFC draft before discussion
https://wiki.php.net/rfc/improve_predictable_prng_random
This RFC includes results of recent PRNG related discussions.
I would like to keep it simple, but a basic object feature will be
implemented.
Methods could raise exceptions for invalid operations rather than ignoring
them.
Comments?
--
Yasuo Ohgaki
yohgaki@ohgaki.net
The proposed interface has way too many methods. Nor should CSPRNG extend
the other PRNG interface. And also not the other way round, because then
you can pass a non-CSPRNG to an only CSPRNG-accepting function.
interface RandomInterface {
    public function getInt(int $min = NULL, int $max = NULL); // Int random
    public function getBytes(int $length); // Raw bytes
    public function getString(int $length, int $bits = 6); // String [0-9a-zA-Z,-]+
    public function seed($seed = NULL); // No use with CS RNG, return TRUE always.
    public function getState(); // Return string representation PRNG state. No use with CS RNG, return NULL.
    public function setState(string $state); // Set PRNG state. No use with CS RNG, return TRUE always.
    public function getCount(); // No use with CS RNG, return 0 always.
    public function getReseedCycle(); // No use with CS RNG, return 0 always.
    public function setReseedCycle(int $count); // No use with CS RNG, return TRUE always.
}
You can already see from the comments that there are too many methods, e.g.
"No use with CS RNG, return 0 always."
I think the interfaces should look like that:
interface PRNG {
    public function getInt(int $min = null, int $max = null): int;
    public function getBytes(int $length): string;
}
interface SeedablePRNG extends PRNG {
    public function seed(?? $seed): void;
}
interface CSPRNG {
    public function getInt(int $min = null, int $max = null): int;
    public function getBytes(int $length): string;
}
getString() should probably be a function accepting a PRNG / CSPRNG and a
string containing possible chars. Maybe also an array, so you can use
non-single-byte encodings like UTF-8.
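A rough sketch of what such a free function could look like. The names IntSource, SystemSource and random_string() are hypothetical and not part of the RFC or of PHP; the interface is reduced to a single getInt() method for brevity, and the alphabet is passed as an array so multi-byte characters stay intact.

```php
<?php
// Hypothetical minimal interface, modelled on the getInt() method discussed above.
interface IntSource
{
    public function getInt(int $min, int $max): int;
}

// CSPRNG-backed source using PHP's random_int().
final class SystemSource implements IntSource
{
    public function getInt(int $min, int $max): int
    {
        return random_int($min, $max);
    }
}

// Builds a string of $length symbols drawn uniformly from $alphabet.
function random_string(IntSource $rng, array $alphabet, int $length): string
{
    $alphabet = array_values($alphabet);
    $last = count($alphabet) - 1;
    if ($last < 0) {
        throw new InvalidArgumentException('Alphabet must not be empty.');
    }
    $out = '';
    for ($i = 0; $i < $length; $i++) {
        $out .= $alphabet[$rng->getInt(0, $last)];
    }
    return $out;
}

// Split a UTF-8 string into characters without requiring mbstring.
$alphabet = preg_split('//u', 'abcdefäöü', -1, PREG_SPLIT_NO_EMPTY);
$token = random_string(new SystemSource(), $alphabet, 12);
```

Because the function only depends on the interface, a seedable non-CS generator could be passed in the same way where predictability is acceptable.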
While getState and setState might be useful to save a state and continue
somewhere in the future with the same state, it's not something for an
interface. If you do something like that, you need the same RNG again, so
those can be exposed only in specific implementations, they shouldn't be in
the interface.
getCount and getReseedCycle: What's the use case? I think those should
be handled internally, nothing that should be user configurable.
There are also a lot of functions, I don't think we should introduce them.
We can just add static methods with a default instance to each RNG.
Regards, Niklas
I think this RFC is badly prepared. You're overhauling the whole mt_rand
system in one go, but you're not doing it properly. There is no
justification for breaking compatibility, not in 7.x and not even in 8.0
in my opinion.
There's now three completely unrelated "issues":
- You want to improve automatic seeding and GENERATE_SEED. You could
just generate a 32-bit value from php_random_bytes and silently use the
current as fallback; this solution was practically accepted already. You
just waste time with your arguments about CSPRNG being sooooo important:
everybody has already heard you, and most people seem to disagree.
- You want to support long seeds. However, 2^32 is a lot of random
states. It's enough for almost any legitimate MT use case. As was
earlier discussed, adding this support to the global mt_srand is not
practical. Anyone who really needs a longer seed should most probably
also use a PRNG object to avoid cases where some internal function (say,
shuffle) modifies the MT state by accident.
- You want to use long seeds by default. This would be possible, as
discussed earlier, by seeding the whole MT state buffer from a CSPRNG.
However, you should consider also the possible performance impact of
generating 2.5 kB from CSPRNG on each request/reseed. And again, 2^32 is
probably enough already.
FWIW, my Raspberry Pi kernel log has several lines about /dev/urandom
not being properly seeded before the system is fully started, so using a
CSPRNG is not guaranteed to work so well.
--
Lauri Kenttä
Hi Lauri,
I think this RFC is badly prepared. You're overhauling the whole mt_rand
system in one go, but you're not doing it properly. There is no
justification for breaking compatibility, not in 7.x and not even in 8.0 in
my opinion.
Which part are you referring to as a BC break? I suppose it is
<?php
$state = mt_srand(1234);
mt_rand($state); // Get static seed result
?>
Current behavior is extremely bad if there is an srand()/mt_srand() call in
the execution path.
- When a seed is provided to srand()/mt_srand(), subsequent
rand()/mt_rand() calls become non random. (Unacceptable)
- This applies across requests. (Extremely bad)
- Even without a seed, an srand()/mt_srand() call makes guessing a lot
easier than it should be, i.e. combined LCG is weak.
Suppose an attacker finds an execution path that has a static seed for
srand()/mt_srand(); then the attacker can predict the exact random values
or can guess them easily.
a.php
<?php
// I need static random sequence
srand(1234);
for ($i = 0; $i < 10; $i++)
    $rand[] = rand();
?>
b.php
<?php
// I need random sequence, let PHP seed it
for ($i = 0; $i < 10; $i++)
    $better_rand[] = mt_rand();
?>
If a.php is called, then b.php's mt_rand() is not random at all.
Even if mt_rand()/rand() is predictable, this kind of behavior is
ABSOLUTELY unacceptable. IMO.
It's a BC break, but how many srand()/mt_srand() calls are used in real apps?
I know there are some, but the benefits outweigh the BC impact.
There's now three completely unrelated "issues":
- You want to improve automatic seeding and GENERATE_SEED. You could just
generate a 32-bit value from php_random_bytes and silently use the current
as fallback; this solution was practically accepted already. You just waste
time with your arguments about CSPRNG being sooooo important: everybody has
already heard you, and most people seem to disagree.
True, but I've never yet heard of a CSPRNG failure that could cause a BC
break that matters as a PHP problem.
- You want to support long seeds. However, 2^32 is a lot of random
states. It's enough for almost any legitimate MT use case. As was earlier
discussed, adding this support to the global mt_srand is not practical.
Anyone who really needs a longer seed should most probably also use a PRNG
object to avoid cases where some internal function (say, shuffle) modifies
the MT state by accident.
- You want to use long seeds by default. This would be possible, as
discussed earlier, by seeding the whole MT state buffer from a CSPRNG.
However, you should consider also the possible performance impact of
generating 2.5 kB from CSPRNG on each request/reseed. And again, 2^32 is
probably enough already.
FWIW, my Raspberry Pi kernel log has several lines about /dev/urandom not
being properly seeded before the system is fully started, so using a CSPRNG
is not guaranteed to work so well.
Although users must never do this, there is code that generates random
passwords/access keys with mt_rand().
If the algorithm is known to the attacker, all the attacker has to do could
be to build a very small dictionary, or current computers are fast enough
to compute the next random string to be generated in a short enough time.
If MT rand is seeded fully, the above task becomes hard enough for attackers
to give up on such attacks.
Reading data from a CSPRNG shouldn't be too slow, as you know. One in a
hundred calls shouldn't matter. Even when it could matter, users may set
PHP to reseed one in a million, but the default shouldn't be too large.
When a hardware TRNG is not available, a CSPRNG could generate poor random
values until the system collects enough entropy events, i.e. it is a matter
of time. CSPRNG output should be available. If not, I'm surprised and I'll
change my opinion.
How many mt_rand()/rand() calls are executed during a PHP process's
lifetime? With a 32 bit seed, users will never use even 0.0000000000000001%
of MT rand's potential. It's too much waste of the algorithm, isn't it?
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Although users must never do this, but there are codes that generate random
password/access key by mt_rand().
There is also code that stores clear text passwords. How would you
prevent that?
IMHO, if users don't care to read the docs[1], it's their fault, and we
shouldn't waste our time to fix their bugs.
[1] http://php.net/manual/en/function.mt-rand.php
--
Christoph M. Becker
Although users must never do this, but there are codes that generate random
password/access key by mt_rand().
There is also code that stores clear text passwords. How would you
prevent that?
IMHO, if users don't care to read the docs[1], it's their fault, and we
shouldn't waste our time to fix their bugs.
We cannot fix these bugs without making mt_rand a CSPRNG, which means it
is no longer mt_rand.
All we can do is mitigate the problem (to some unknowable extent) by
seeding mt_rand from php_random_bytes. I don't care if we do this or not
so long as the change is simple and BC, i.e. 32-bit seed that falls back
to something else if php_random_bytes fails.
Tom
2017-02-02 14:24 GMT+01:00 Christoph M. Becker cmbecker69@gmx.de:
Although users must never do this, but there are codes that generate random
password/access key by mt_rand().
There is also code that stores clear text passwords. How would you
prevent that?
IMHO, if users don't care to read the docs[1], it's their fault, and we
shouldn't waste our time to fix their bugs.
While the documentation states that, it can still be improved.
I've just submitted a patch, you can find the diff here:
https://gist.github.com/kelunik/bb534d4c4ede160d97ef17014052052a (linking
patches via edit.php.net doesn't really work, it just links to the newest
patch of a file and will break once merged).
Regards, Niklas
Hi Niklas,
Nice patch! I'm OK with your patch.
Currently, the mt_rand() value is affected by srand() in PHP 7.1.
That may be documented, but I think there will be a new PRNG state for
rand()/srand() at least, hopefully soon.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
On Thu, Feb 2, 2017 at 10:24 PM, Christoph M. Becker cmbecker69@gmx.de
wrote:
Although users must never do this, but there are codes that generate random
password/access key by mt_rand().
There is also code that stores clear text passwords. How would you
prevent that?
IMHO, if users don't care to read the docs[1], it's their fault, and we
shouldn't waste our time to fix their bugs.
I totally agree.
However, there are valid usage like
a.php
<?php
// I need static random sequence
srand(1234);
for ($i = 0; $i < 10; $i++)
    $rand[] = rand();
?>
b.php
<?php
// I need random sequence, let PHP seed it
for ($i = 0; $i < 10; $i++)
    $better_rand[] = mt_rand();
// VALID USAGE
// I'm going to randomize which quiz is displayed
?>
This is an unacceptable BC break in PHP 7.1.
For PHP 7.1, rand() must at least have its own state.
This was discussed in another thread, "Reseeding rand()/mt_rand()".
User and system seeds should be separated and independent.
Anyway, which code must be fixed, a.php or b.php, in such a case?
Suppose you are a Drupal (or any app's) module developer using mt_rand() in
code that requires random values. Someone else wants to use a static random
sequence, which is a rare usage compared to a plain mt_rand() call w/o user
seed, and then suddenly your code is broken. IMO, a.php must be fixed.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Posting RFC draft before discussion
https://wiki.php.net/rfc/improve_predictable_prng_random
This RFC includes results of recent PRNG related discussions.
I would like to keep it simple, but basic object feature will be
implemented.
Methods could raise exceptions for invalid operations rather than
ignoring.
Comments?
I don't see in it any results of recent PRNG related discussions. I only
see your ideas and a disregard for other opinions.
1 The very first sentence
"Current predictable PRNG, i.e. mt_rand() and rand(), produces very weak
random values even produces non random values."
is plain nonsense. mt_rand implements the Mersenne Twister 19937 with
32-bit seed. It's a standard algorithm that's been used widely. It does
exactly what it's supposed to do.
The idea that it produces "very weak random values" expresses your
obsession with its use to produce unpredictable values, which it cannot
possibly do. Nobody else is trying to make it do this.
2 Purpose of object API
You've misunderstood the idea of the "object-based PRNG interface". Its
purpose is not to instrument mt_rand internals but to introduce
alternative new generators. We concluded last year that there's very
little we can do about mt_rand without breaking BC and/or making more
mess, so it should be left to rot while we offer users something better.
The new OOP API might offer things like MT, Xorshift, PCG, legacy LCGs,
or whatever you fancy (maybe even MT-PHP :), distributions and
utilities.
3 This API
Where to start? Have you any experience with the random APIs of any
decent statistical packages? Take a look at Boost Random, R, ...
Who wants a low-level interface to the MT algorithm's internals?
4 "Fix" for abuse of mt_rand
Your concern that people abuse mt_rand to get unpredictable values is
valid. But I don't think your attempts to mitigate this are useful.
Hypothetically, say we take a radical approach and make mt_rand draw
from php_random_bytes by default on every call. How much good would it
do?
Unmaintained legacy apps are unaffected because nobody upgrades them
from whatever PHP 4 or 5 they are on. Only apps that have stopped
receiving security updates but that are nevertheless actively maintained
to migrate them to new major PHP versions can benefit. It's a niche we can
better target with new APIs and education. W.r.t. unpredictable randoms,
that's already done in 7.0.
Hence I don't believe abuse of mt_rand is something we can fix by
modifying its behavior. It's too late.
But if you really insist on doing something to mt_rand then make its
automatic seed use php_random_bytes and, if that fails, fall back to
GENERATE_SEED. While I doubt it will make the world any safer, it is as
effective as my hypothetical radical "fix" and should be harmless so
long as you don't introduce new bugs.
Tom
Hi Tom,
Posting RFC draft before discussion
https://wiki.php.net/rfc/improve_predictable_prng_random
This RFC includes results of recent PRNG related discussions.
I would like to keep it simple, but basic object feature will be
implemented.
Methods could raise exceptions for invalid operations rather than
ignoring.
Comments?
I don't see in it any results of recent PRNG related discussions. I only
see your ideas and a disregard for other opinions.
Incorrect. See the "Reseeding rand()/mt_rand()" thread.
My initial idea was to separate rand() and mt_rand().
Andrea suggested that user and system state should be separated, and I agree.
That is why the current RFC was written.
1 The very first sentence
"Current predictable PRNG, i.e. mt_rand() and rand(), produces very weak
random values even produces non random values."
is plain nonsense. mt_rand implements the Mersenne Twister 19937 with
32-bit seed. It's a standard algorithm that's been used widely. It does
exactly what it's supposed to do.
The idea that it produces "very weak random values" expresses your
obsession with its use to produce unpredictable values, which it cannot
possibly do. Nobody else is trying to make it do this.
How can you say current MT rand usage is as strong as it could/should be?
There are 2^19937-1 states with MT rand, but we only use 2^32 (or fewer due
to the LCG seed).
Current PHP usage is obviously extremely weak.
2 Purpose of object API
You've misunderstood the idea of the "object-based PRNG interface". Its
purpose is not to instrument mt_rand internals but to introduce alternative
new generators. We concluded last year that there's very little we can do
about mt_rand without breaking BC and/or making more mess, so it should be
left to rot while we offer users something better. The new OOP API might
offer things like MT, Xorshift, PCG, legacy LCGs, or whatever you fancy
(maybe even MT-PHP :), distributions and utilities.
That's what the RFC does with minimal BC impact.
Swift even changes syntax, so why not change a rarely used API?
3 This API
Where to start? Have you any experience with the random APIs of any
decent statistical packages? Take a look at Boost Random, R, ...
Who wants a low-level interface to the MT algorithm's internals?
I'm not trying to build a full featured API, but a minimal API sufficient to
fix the mess.
Suggestions for improving the minimal API are welcome.
4 "Fix" for abuse of mt_rand
Your concern that people abuse mt_rand to get unpredictable values is
valid. But I don't think your attempts to mitigate this are useful.
Hypothetically, say we take a radical approach and make mt_rand draw from
php_random_bytes by default on every call. How much good would it do?
I'm not proposing a CSPRNG seed value for every mt_rand() call, but using a
CSPRNG value at initialization and at reseeding; the reseeding period is
configurable also.
What's good about it is written previously.
"There are 2^19937-1 states with MT rand, but we only use 2^32 (or less due
to LCG seed).
Current PHP usage is extremely weak obviously."
Unmaintained legacy apps are unaffected because nobody upgrades them from
whatever PHP 4 or 5 they are on. Only apps that have stopped receiving
security updates but that are nevertheless actively maintained to migrate
them to new major PHP versions can benefit. It's a niche we can better
target with new APIs and education. W.r.t. unpredictable randoms, that's
already done in 7.0.
Hence I don't believe abuse of mt_rand is something we can fix by
modifying its behavior. It's too late.
IMO, nothing is too late.
We broke the predictable PRNG in PHP 7.1. See my other reply in this thread.
This should be fixed ASAP.
But if you really insist on doing something to mt_rand then make its
automatic seed use php_random_bytes and, if that fails, fall back to
GENERATE_SEED. While I doubt it will make the world any safer, it is as
effective as my hypothetical radical "fix" and should be harmless so long
as you don't introduce new bugs.
Basically, I'm trying to use MT rand optimally. Safety is a side effect.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi all,
I realized the BC break in PHP 7.1 is not so obvious. I added what's broken
by PHP 7.1.
I also made it raise exceptions for insane usage.
The reseed cycle can be set up to 2^31.
https://wiki.php.net/rfc/improve_predictable_prng_random
The RFC is written so that the implementation is minimal yet sufficient to
fix the issues consistently.
Adding a new Random object or methods is easy if these are needed.
Please note that the RFC objectives are
- Fix the BC break made by PHP 7.1. (For PHP 7.1, we may simply add a new
state for rand()/srand(), but this is out of scope)
- Use MT rand in an optimal way
- Keep it simple to be minimal
- Keep consistency across APIs
There are worries about CSPRNG overhead; reading 2500 bytes from the CSPRNG
on a PHP 7.1-dev debug build took about 0.00001 sec on my PC.
[yohgaki@dev PHP-7.1]$ ./php-bin -d error_reporting=-1 -r '$s = microtime(true);for ($i=0; $i<1000; $i++) $n=random_bytes(2500); var_dump(microtime(true)-$s);'
float(0.010347843170166)
Since it will reseed 1 in 100, the average overhead is about 0.0000001 sec
with a debug build. 100 seems a good choice, IMO. I don't mind using a
larger default reseed cycle if it isn't too large.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net