Hi internals.
Although I have explained that due to the global turmoil I will delay
voting on the RFC as long as possible, it is time to start voting on the
RFC in order to get the implementation to full status by the PHP 8.2
Feature Freeze.
I apologize for the delay in responding to the points you had already
pointed out. It has been addressed as follows.
- Random\Engine\PCG64::__construct() now takes an
int|string
$inc as its
second argument- This makes it fully compatible with the PCG64 original implementation
- Fixed an algorithm implementation error in PCG64
- Fixed compatibility issues with PHP 8.2 in test cases
- More detailed description in RFC
Previous discussion thread: https://externals.io/message/117295
Voting will begin at 2022-06-14 00:00:00 (UTC).
https://wiki.php.net/rfc/rng_extension
Regards,
Go Kudo
Le 31 mai 2022 11:54:18 GMT+02:00, Go Kudo g-kudo@colopl.co.jp a écrit :
Hi internals.
Although I have explained that due to the global turmoil I will delay
voting on the RFC as long as possible, it is time to start voting on the
RFC in order to get the implementation to full status by the PHP 8.2
Feature Freeze.I apologize for the delay in responding to the points you had already
pointed out. It has been addressed as follows.
- Random\Engine\PCG64::__construct() now takes an
int|string
$inc as its
second argument
- This makes it fully compatible with the PCG64 original implementation
- Fixed an algorithm implementation error in PCG64
- Fixed compatibility issues with PHP 8.2 in test cases
- More detailed description in RFC
Previous discussion thread: https://externals.io/message/117295
Voting will begin at 2022-06-14 00:00:00 (UTC).
https://wiki.php.net/rfc/rng_extension
Regards,
Go Kudo
The first section starts with "However", which sounds like an error from a text reorganization?
Also:
"adding any randomization functions between the seeding the intended usage would break the code. " -> missing "and" I suppose
"Even with JIT enabled, the native implementation is not far behind. " -> it is far behind no? It's not even behind but ahead as I see it so maybe I just misunderstand this comment.
Côme
2022年6月4日(土) 18:03 Côme Chilliet come@chilliet.eu:
Le 31 mai 2022 11:54:18 GMT+02:00, Go Kudo g-kudo@colopl.co.jp a écrit :
Hi internals.
Although I have explained that due to the global turmoil I will delay
voting on the RFC as long as possible, it is time to start voting on the
RFC in order to get the implementation to full status by the PHP 8.2
Feature Freeze.I apologize for the delay in responding to the points you had already
pointed out. It has been addressed as follows.
- Random\Engine\PCG64::__construct() now takes an
int|string
$inc as its
second argument
- This makes it fully compatible with the PCG64 original
implementation- Fixed an algorithm implementation error in PCG64
- Fixed compatibility issues with PHP 8.2 in test cases
- More detailed description in RFC
Previous discussion thread: https://externals.io/message/117295
Voting will begin at 2022-06-14 00:00:00 (UTC).
https://wiki.php.net/rfc/rng_extension
Regards,
Go KudoThe first section starts with "However", which sounds like an error from a
text reorganization?
Also:
"adding any randomization functions between the seeding the intended usage
would break the code. " -> missing "and" I suppose
"Even with JIT enabled, the native implementation is not far behind. " ->
it is far behind no? It's not even behind but ahead as I see it so maybe
I just misunderstand this comment.Côme
--
To unsubscribe, visit: https://www.php.net/unsub.php
The first section starts with "However", which sounds like an error from
a text reorganization?
"adding any randomization functions between the seeding the intended
usage would break the code. " -> missing "and" I suppose
Thanks, I fixed it.
"Even with JIT enabled, the native implementation is not far behind. " ->
it is far behind no? It's not even behind but ahead as I see it so maybe
I just misunderstand this comment.
Just wanted to explain that the native implementation is faster. (C > PHP)
I changed it to a more clear form.
(I was replying directly to you. sorry)
Hi
- More detailed description in RFC
a)
For the Random\Engine interface I suggest to clarify that the returned
bytestring will be interpreted as a little endian integer.
b)
I'm still missing an explanation of the guarantees the Randomizer
implementation will make (last bullet point in
https://externals.io/message/117295#117299).
To me the guarantees is the most important thing about this RFC. As a
user of the API I need to know what of the behavior can and what cannot
change in future PHP versions. I don't really care whether the RNG is
PCG64 or some Xoshiro. Both are great.
The Engine part is pretty solid: They implement a well-defined RNG and
any numbers are returned as little endian integers (see (a)). Either the
implementation is correct or it is not. This is not likely to change in
the future.
However the Randomizer part is pretty undefined: As an example:
->getInt() will return an integer uniformly selected from the given
range. But there's more than one way to perform this uniform selection.
Will the algorithm stay the safe in future PHP versions? There are more
examples in my previous emails.
Best regards
Tim Düsterhus
2022年6月8日(水) 0:24 Tim Düsterhus tim@bastelstu.be:
Hi
- More detailed description in RFC
a)
For the Random\Engine interface I suggest to clarify that the returned
bytestring will be interpreted as a little endian integer.b)
I'm still missing an explanation of the guarantees the Randomizer
implementation will make (last bullet point in
https://externals.io/message/117295#117299).To me the guarantees is the most important thing about this RFC. As a
user of the API I need to know what of the behavior can and what cannot
change in future PHP versions. I don't really care whether the RNG is
PCG64 or some Xoshiro. Both are great.The Engine part is pretty solid: They implement a well-defined RNG and
any numbers are returned as little endian integers (see (a)). Either the
implementation is correct or it is not. This is not likely to change in
the future.However the Randomizer part is pretty undefined: As an example:
->getInt() will return an integer uniformly selected from the given
range. But there's more than one way to perform this uniform selection.
Will the algorithm stay the safe in future PHP versions? There are more
examples in my previous emails.Best regards
Tim Düsterhus
Hi Tim.
Thanks for the continued feedback!
I have added the following regarding the points you pointed out.
- interface Random\Engine
It has a single generate(): string method that generates random numbers
as a binary string. This string must be non-empty and attempting to return
an empty will result in a RuntimeException.
If you implement a random number generator in PHP, the generated numbers
must be converted to binary using thepack()
function, and the values must
be little-endian.
- class Random Randomizer
The same current PHP algorithm is used to generate random numbers within
the specified range in Randomizer::getInt(). This is necessary for backward
compatibility.
It also provides a guarantee of consistency in the future.
However, I understand that perhaps this fix will not satisfy your request.
I just do not have a good understanding of your intentions due to my poor
English....
I am considering the following possibilities regarding your intentions:
- Should be stated in detail so that consistency of results is maintained
in the future. - Current PHP ranged random number generation algorithm has room for
improvement and should be examined further - Consistency of results is difficult to maintain in the future (Maybe
incorrect)
In case 1, I think the following statement would satisfy the requirement.
The values generated by the seedable Engine guarantee future
reproducibility of results, and the Randomizer uses the results to process
them, so if the results generated by the Engine are identical, the
Randomizer's results will also be consistent.
Although the consistency of the Randomizer results is not mentioned here,
it would be a clear BC Break if the results were to change after the
Randomizer is officially implemented, so I do think it is sufficient that
an RFC be created and voted on as necessary.
In case 2, I also thought about it along the way. Nikita also taught me
about better algorithms: https://externals.io/message/115918#115982
But, I also think that the current PHP implementation is good enough, and
there is no need to change it to the point of breaking compatibility.
I think the current global scope MT implementation is very troublesome for
some use cases and should first be able to be drop-in-replaceable with this
implementation.
In case 3, I think it is necessary to guarantee consistency at least once
at the language level, even though this may change in the future. I have
already indicated the need for this in the RFC.
Most of all, I do not believe you intend this to be the case. (And this
sentence is not intended to offend you either. Please don't misunderstand
me.)
I would like to hear more about your opinion.
Regards,
Go Kudo
Hi
It has a single generate(): string method that generates random numbers
as a binary string. This string must be non-empty and attempting to return
an empty will result in a RuntimeException.
If you implement a random number generator in PHP, the generated numbers
must be converted to binary using thepack()
function, and the values must
be little-endian.
Thanks, that looks good to me.
- class Random Randomizer
The same current PHP algorithm is used to generate random numbers within
the specified range in Randomizer::getInt(). This is necessary for backward
compatibility.
It also provides a guarantee of consistency in the future.However, I understand that perhaps this fix will not satisfy your request.
I just do not have a good understanding of your intentions due to my poor
English....
Don't worry. I understand that using a foreign language can sometimes be
hard - I am not a native speaker of English either and I suspect that
this also applies to many other participants.
I am considering the following possibilities regarding your intentions:
- Should be stated in detail so that consistency of results is maintained
in the future.- Current PHP ranged random number generation algorithm has room for
improvement and should be examined further- Consistency of results is difficult to maintain in the future (Maybe
incorrect)In case 1, I think the following statement would satisfy the requirement.
The values generated by the seedable Engine guarantee future
reproducibility of results, and the Randomizer uses the results to process
them, so if the results generated by the Engine are identical, the
Randomizer's results will also be consistent.Although the consistency of the Randomizer results is not mentioned here,
it would be a clear BC Break if the results were to change after the
Randomizer is officially implemented, so I do think it is sufficient that
an RFC be created and voted on as necessary.
If I understand you correctly, then (1) is what I am looking for: It
should be spelled out explicitly what behavior the user may rely on and
what should be considered an implementation detail.
Something like the following would work for me:
The engines implement a specific well-defined random number generator.
For a given seed it is guaranteed that they return the same sequence as
the reference implementation.
For the Randomizer it is considered a breaking change if the observable
behavior of the methods changes. For a given seeded engine and identical
method parameters the following must hold:
- The number of calls to the Engine's ->generate() method remains the same.
- The return value remains the same for a given result retrieved from
->generate().
Any changes to the Randomizer that violate these guarantees require a
separate RFC.
In case 2, I also thought about it along the way. Nikita also taught me
about better algorithms: https://externals.io/message/115918#115982But, I also think that the current PHP implementation is good enough, and
there is no need to change it to the point of breaking compatibility.I think the current global scope MT implementation is very troublesome for
some use cases and should first be able to be drop-in-replaceable with this
implementation.In case 3, I think it is necessary to guarantee consistency at least once
at the language level, even though this may change in the future. I have
already indicated the need for this in the RFC.
Can you comment on whether the Randomizer behaves identically on both 32
and 64 bit PHP and also on little endian and big endian architectures?
As an example: Will ->getInt() calculate the same "uniform distribution"
on all bitnesses? If not I consider that a bug.
The engines should behave identically, because of the "little endian
string" return value.
If that's already the case then something like the following should be
added to the RFC guarantees:
- The implementation will guarantee that the same results are returned
independent of the processor architecture (little endian / big endian)
and integer bit length's (32 / 64).
Most of all, I do not believe you intend this to be the case. (And this
sentence is not intended to offend you either. Please don't misunderstand
me.)
Best regards
Tim Düsterhus
2022年6月10日(金) 19:42 Tim Düsterhus tim@bastelstu.be:
Hi
It has a single generate(): string method that generates random numbers
as a binary string. This string must be non-empty and attempting to
return
an empty will result in a RuntimeException.
If you implement a random number generator in PHP, the generated numbers
must be converted to binary using thepack()
function, and the values
must
be little-endian.Thanks, that looks good to me.
- class Random Randomizer
The same current PHP algorithm is used to generate random numbers within
the specified range in Randomizer::getInt(). This is necessary for
backward
compatibility.
It also provides a guarantee of consistency in the future.However, I understand that perhaps this fix will not satisfy your
request.
I just do not have a good understanding of your intentions due to my poor
English....Don't worry. I understand that using a foreign language can sometimes be
hard - I am not a native speaker of English either and I suspect that
this also applies to many other participants.I am considering the following possibilities regarding your intentions:
- Should be stated in detail so that consistency of results is
maintained
in the future.- Current PHP ranged random number generation algorithm has room for
improvement and should be examined further- Consistency of results is difficult to maintain in the future (Maybe
incorrect)In case 1, I think the following statement would satisfy the requirement.
The values generated by the seedable Engine guarantee future
reproducibility of results, and the Randomizer uses the results to
process
them, so if the results generated by the Engine are identical, the
Randomizer's results will also be consistent.Although the consistency of the Randomizer results is not mentioned here,
it would be a clear BC Break if the results were to change after the
Randomizer is officially implemented, so I do think it is sufficient that
an RFC be created and voted on as necessary.If I understand you correctly, then (1) is what I am looking for: It
should be spelled out explicitly what behavior the user may rely on and
what should be considered an implementation detail.Something like the following would work for me:
The engines implement a specific well-defined random number generator.
For a given seed it is guaranteed that they return the same sequence as
the reference implementation.For the Randomizer it is considered a breaking change if the observable
behavior of the methods changes. For a given seeded engine and identical
method parameters the following must hold:
- The number of calls to the Engine's ->generate() method remains the same.
- The return value remains the same for a given result retrieved from
->generate().Any changes to the Randomizer that violate these guarantees require a
separate RFC.
In case 2, I also thought about it along the way. Nikita also taught me
about better algorithms: https://externals.io/message/115918#115982But, I also think that the current PHP implementation is good enough, and
there is no need to change it to the point of breaking compatibility.I think the current global scope MT implementation is very troublesome
for
some use cases and should first be able to be drop-in-replaceable with
this
implementation.In case 3, I think it is necessary to guarantee consistency at least once
at the language level, even though this may change in the future. I have
already indicated the need for this in the RFC.Can you comment on whether the Randomizer behaves identically on both 32
and 64 bit PHP and also on little endian and big endian architectures?
As an example: Will ->getInt() calculate the same "uniform distribution"
on all bitnesses? If not I consider that a bug.The engines should behave identically, because of the "little endian
string" return value.If that's already the case then something like the following should be
added to the RFC guarantees:
- The implementation will guarantee that the same results are returned
independent of the processor architecture (little endian / big endian)
and integer bit length's (32 / 64).Most of all, I do not believe you intend this to be the case. (And this
sentence is not intended to offend you either. Please don't misunderstand
me.)Best regards
Tim Düsterhus
It does not depend on endianness, but does depend on the number of bits.
This is because the new algorithm generates 64-bit values.
When using a 64-bit RNG in a 32-bit environment, Engine::generate() returns
the same binary string as in a 64-bit environment, but the Randomizer
methods return different values. This is because the size of zend_long in a
32-bit environment does not match uint64_t and is truncated.
To keep the results the same in 32-bit / 64-bit environments, only the
lower 32-bit of the 64-bit value should be used. However, this leads to
reduced randomness and does not seem appropriate considering that most
environments running PHP are 64-bit.
I have created a PoC that allows all internal operations to be performed in
64-bit environments to achieve the same results, although the efficiency of
execution in 32-bit environments will be reduced. (Note that
Randomizer::getInt() with no argument is still incompatible.)
https://github.com/php/php-src/commit/dbed218bfcd45e713fa3df2c88a4c2efce9f0651
Another idea I had was to throw an exception when trying to generate a
64-bit RNG in a 32-bit environment.
Regards,
Go Kudo
Hi
I have created a PoC that allows all internal operations to be performed in
64-bit environments to achieve the same results, although the efficiency of
execution in 32-bit environments will be reduced. (Note that
Randomizer::getInt() with no argument is still incompatible.)https://github.com/php/php-src/commit/dbed218bfcd45e713fa3df2c88a4c2efce9f0651
I believe this is a good solution. The majority of the modern setups
(i.e. the setups that are using PHP 8.2) will likely be 64-bit anyway
and reduced performance on 32-bit is then acceptable.
Best regards
Tim Düsterhus
2022年6月13日(月) 16:14 Tim Düsterhus tim@bastelstu.be:
Hi
I have created a PoC that allows all internal operations to be performed
in
64-bit environments to achieve the same results, although the efficiency
of
execution in 32-bit environments will be reduced. (Note that
Randomizer::getInt() with no argument is still incompatible.)https://github.com/php/php-src/commit/dbed218bfcd45e713fa3df2c88a4c2efce9f0651
I believe this is a good solution. The majority of the modern setups
(i.e. the setups that are using PHP 8.2) will likely be 64-bit anyway
and reduced performance on 32-bit is then acceptable.Best regards
Tim Düsterhus
Hi Tim
Thanks! I have updated the implementation and RFCs.
https://wiki.php.net/rfc/rng_extension
Voting will open tomorrow (2022-06-14 00:00:00 UTC) as planned.
I have reviewed the implementation and would like to clean up the
implementation as it is cluttered with past spec changes. This may possibly
be too late to start voting, but I do not plan to change the behavior in
any way.
Best regards,
Go Kudo
2022年5月31日(火) 18:54 Go Kudo g-kudo@colopl.co.jp:
Hi internals.
Although I have explained that due to the global turmoil I will delay
voting on the RFC as long as possible, it is time to start voting on the
RFC in order to get the implementation to full status by the PHP 8.2
Feature Freeze.I apologize for the delay in responding to the points you had already
pointed out. It has been addressed as follows.
- Random\Engine\PCG64::__construct() now takes an
int|string
$inc as its
second argument
- This makes it fully compatible with the PCG64 original implementation
- Fixed an algorithm implementation error in PCG64
- Fixed compatibility issues with PHP 8.2 in test cases
- More detailed description in RFC
Previous discussion thread: https://externals.io/message/117295
Voting will begin at 2022-06-14 00:00:00 (UTC).
https://wiki.php.net/rfc/rng_extension
Regards,
Go Kudo
Good evening.
While refactoring, I found an error in the PCG64 algorithm we are
implementing.
It should be implemented as pcg64_oneseq_128 (XSL-RR), but currently
pcg64_setseq_128 (XSL-RR-RR) is implemented. The RFC also incorrectly
states that NumPy implements pcg64_oneseq_128.
https://wiki.php.net/rfc/rng_extension
The RFC has been corrected for the above. This change removes the
int|string $inc argument from RandomEngine\PCG64::__construct(), leaving
only int|string|null $seed.
Regards,
Go Kudo