Hi Internals.
The Random Extension has been accepted and is being implemented for PHP 8.2.
Many thanks for the review.
The changes to enable arc4random in glibc were recently merged.
https://github.com/php/php-src/pull/8984
This reduces the number of getrandom system calls issued on Linux, which
improves performance.
However, it only works in environments that use GNU libc, and does not
work on Linux distributions that use another libc (e.g. Alpine Linux, which
uses musl).
As we discussed briefly in the PR (which was not the right place for that
discussion, apologies), if we could implement a CSPRNG in PHP itself, for
example, it would improve performance on all platforms.
However, there are several challenges to this.
- Increased maintenance costs
- Requires optimization for CPU architecture
- Requires familiarity with CSPRNG
PHP already bundles xxHash and appears ready to make this happen.
Also, an appropriate CSPRNG implementation may be able to resolve the
current complex macro branching.
What do you think about this?
Regards
Go Kudo
On Sat, Jul 16, 2022 at 0:54, Go Kudo zeriyoshi@gmail.com wrote:
xxHash has nothing to do with it. Forget it.
Hi
However, there are several challenges to this.
- Increased maintenance costs
- Requires optimization for CPU architecture
- Requires familiarity with CSPRNG
PHP already bundles xxHash and appears ready to make this happen.
Also, an appropriate CSPRNG implementation may be able to resolve the
current complex macro branching.
What do you think about this?
This would be a strong no from my side. There are all kinds of failure
modes that decrease the security of a CSPRNG (i.e. make it insecure),
and we really don't want to be the ones to blame if something goes
wrong. Historically, many non-kernel CSPRNGs later proved to be
insecure in specific situations.
I would also assume that for a typical PHP application all of the
following are true:
- The majority of the requests don't need any randomness.
- The majority of the requests that need randomness don't need any
  significant amount of randomness.
- The majority of the requests that need significant amounts of
  randomness are fine with a regular PRNG (e.g. Xoshiro or Pcg).
- The cost of a few getrandom() syscalls is not really measurable
  compared to the time spent waiting for the database, file IO or template
  rendering.
Attempting to optimize the speed of the CSPRNG is premature
optimization. That is also the reason why I suggested using the 'Secure'
engine by default in the Randomizer: It's a safe default choice for the
vast majority of users.
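To make the distinction concrete, here is a minimal sketch using the PHP 8.2 Random\Randomizer API. It is not taken from this thread: the seed value and variable names are illustrative assumptions. Constructed without an engine, the Randomizer uses the CSPRNG-backed 'Secure' engine, while a seeded Xoshiro256StarStar engine is a regular, reproducible PRNG for bulk work that is not security sensitive.

```php
<?php
use Random\Engine\Xoshiro256StarStar;
use Random\Randomizer;

// No engine given: the Randomizer uses the CSPRNG-backed Secure engine.
$secure = new Randomizer();
$token  = bin2hex($secure->getBytes(16)); // e.g. for a token or identifier

// Seeded regular PRNG: fast and reproducible, but NOT suitable for secrets.
// The seed 1234 is an arbitrary illustrative value.
$fast  = new Randomizer(new Xoshiro256StarStar(1234));
$rolls = [];
for ($i = 0; $i < 5; $i++) {
    $rolls[] = $fast->getInt(1, 6);
}

echo $token, "\n";
echo implode(',', $rolls), "\n";
```

Because the Xoshiro engine is seeded, rerunning the script yields the same rolls, whereas the token differs on every run.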
Personally I likely wouldn't have merged the PR in question for the same
reasons. But at least in that case glibc is at fault :-)
Best regards
Tim Düsterhus
On Sun, Jul 17, 2022 at 6:33, Tim Düsterhus tim@bastelstu.be wrote:
Hi Tim.
You are right. Implementing a CSPRNG on your own obviously increases
maintenance costs and security risks.
However, I still think the overhead of the getrandom syscall in a Linux
environment is significant and should be considered.
I would suggest deprecating mt_srand()/srand() and using php_random_bytes()
in sessions etc. for PHP 8.3 for better security.
However, as Nikita mentioned before, the overhead of the getrandom syscall
in a Linux environment is significant and this proposal may result in
performance degradation.
https://github.com/php/php-src/commit/53ee3f7f897f7ee33a4c45210014648043386e13
Therefore, I have created a PoC to buffer php_random_bytes. This has
resulted in a significant performance improvement.
https://github.com/zeriyoshi/php-src/tree/random_buf
# perf result of current implementation:
Samples: 120 of event 'cpu-clock:pppH', Event count (approx.): 30000000
Overhead  Command  Shared Object          Symbol
  32.50%  php      [kernel.kallsyms]      [k] preempt_count_sub
  30.00%  php      libc.so.6              [.] syscall
  18.33%  php      [kernel.kallsyms]      [k] do_el0_svc
   2.50%  php      php                    [.] execute_ex
   1.67%  php      [kernel.kallsyms]      [k] __this_cpu_preempt_check
   1.67%  php      [kernel.kallsyms]      [k] __uaccess_mask_ptr
   1.67%  php      [kernel.kallsyms]      [k] _raw_spin_unlock_irqrestore
   0.83%  php      [kernel.kallsyms]      [k] __arm64_sys_getrandom
   0.83%  php      [kernel.kallsyms]      [k] __kern_my_cpu_offset
   0.83%  php      [kernel.kallsyms]      [k] __mod_lruvec_state
   0.83%  php      [kernel.kallsyms]      [k] __mod_memcg_state
   0.83%  php      [kernel.kallsyms]      [k] __next_zones_zonelist
   0.83%  php      [kernel.kallsyms]      [k] arch_local_irq_restore
   0.83%  php      [kernel.kallsyms]      [k] arch_local_irq_save
   0.83%  php      [kernel.kallsyms]      [k] check_stack_object
   0.83%  php      ld-linux-aarch64.so.1  [.] 0x000000000000a260
   0.83%  php      php                    [.] php_random_bytes
   0.83%  php      php                    [.] syscall@plt
   0.83%  php      php                    [.] zend_hash_add
   0.83%  php      php                    [.] zend_hash_graceful_reverse_destroy
   0.83%  php      php                    [.] zend_string_init_interned_permanent
$ time ./sapi/cli/php -r 'for ($i = 0; $i < 65535; $i++){ random_bytes(4); }'
real 0m0.069s
user 0m0.025s
sys 0m0.044s
# PoC result:
Samples: 19 of event 'cpu-clock:pppH', Event count (approx.): 4750000
Overhead  Command  Shared Object      Symbol
  31.58%  php      [kernel.kallsyms]  [k] __flush_cache_user_range
  15.79%  php      php                [.] _efree
  10.53%  php      php                [.] zend_register_functions
   5.26%  php      [kernel.kallsyms]  [k] __rcu_read_unlock
   5.26%  php      [kernel.kallsyms]  [k] page_add_file_rmap
   5.26%  php      [kernel.kallsyms]  [k] pfn_valid
   5.26%  php      [kernel.kallsyms]  [k] preempt_count_sub
   5.26%  php      libc.so.6          [.] cfree
   5.26%  php      libc.so.6          [.] 0x0000000000097b40
   5.26%  php      php                [.] execute_ex
   5.26%  php      php                [.] zend_register_constant
$ time ./sapi/cli/php -r 'for ($i = 0; $i < 65535; $i++){ random_bytes(4); }'
real 0m0.016s
user 0m0.009s
sys 0m0.007s
Given the nature of a CSPRNG, I think this is a safe implementation. What do
you think?
Best Regards,
Go Kudo
You are right. Implementing a CSPRNG on your own obviously increases
maintenance costs and security risks.
However, I still think the overhead of the getrandom syscall in a Linux
environment is significant and should be considered.
There is already a good CSPRNG available in OpenSSL which we expose
with openssl_random_pseudo_bytes (except on Windows which is historical and
should change) so for those that are impacted by the syscall overhead, this
might be the best option considering that most users are using at least
OpenSSL version 1.1.1 where the new CSPRNG is available.
Regards
Jakub
On Mon, Jul 25, 2022 at 20:32, Jakub Zelenka bukka@php.net wrote:
Hi (sorry, I first sent this to you directly)
Indeed, but ext-openssl is not always available.
To be used in ext-session etc., it would have to be reliably bundled.
Best Regards
Go Kudo
Indeed, but ext-openssl is not always available.
To be used in ext-session etc., it would have to be reliably bundled.
Which means that users can generally call random_bytes()/random_int() for
an always available, but maybe not most performant CSPRNG source, or if
they need those extra cycles, they can make sure OpenSSL is installed and
available and call that API instead.
Library authors can even abstract this away using function_exists() and a
graceful fallback to random_int(). I think it's okay to trust developers
to know how to program well, and to learn when they don't.
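A minimal sketch of such an abstraction, under stated assumptions: the helper name fast_secure_bytes() is hypothetical, and the fallback uses random_bytes(), the byte-oriented sibling of random_int(). Whether the OpenSSL path is actually faster on a given system would need to be measured.

```php
<?php
// Hypothetical helper (not part of PHP): prefer OpenSSL's CSPRNG when
// ext-openssl is loaded, otherwise fall back to the always-available
// random_bytes().
function fast_secure_bytes(int $length): string
{
    if (function_exists('openssl_random_pseudo_bytes')) {
        $bytes = openssl_random_pseudo_bytes($length);
        // Defensive check for older PHP versions that could return false.
        if (is_string($bytes) && strlen($bytes) === $length) {
            return $bytes;
        }
    }
    return random_bytes($length);
}

echo bin2hex(fast_secure_bytes(16)), "\n";
```

Callers get the same interface either way, so the choice of source stays an implementation detail of the library.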
-Sara
There is already a good CSPRNG available in OpenSSL which we expose
with openssl_random_pseudo_bytes (except on Windows which is historical and
should change)
TIL! Yes, that should change.
so for those that are impacted by the syscall overhead, this
might be the best option considering that most users are using at least
OpenSSL version 1.1.1 where the new CSPRNG is available.
We cannot, however, rely on any OpenSSL functionality in the core or
ext/standard, since OpenSSL might not be available.
--
Christoph M. Becker
Hi
However, I still think the overhead of the getrandom syscall in a Linux
environment is significant and should be considered.
I disagree. On my Intel(R) Core(TM) i5-2430M with Ubuntu 20.04 with
Linux 5.4.0-123-generic I can call random_bytes(16) (128 Bits of
randomness which is sufficient for ~everything) 100000 times in ~140ms:
<?php
for ($i = 0; $i < 100000; $i++) {
    $foo = random_bytes(16);
}
The same script modified to just set $foo = is_int(1) runs in 20ms:
<?php
for ($i = 0; $i < 100000; $i++) {
    $foo = is_int(1);
}
Thus the time of syscalling getrandom() on my machine (which definitely
is not modern hardware) 100k times is 120ms or 1.2us per call.
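The arithmetic is (140ms - 20ms) / 100000 calls = 1.2us per call. A small harness can repeat the measurement on other machines. This is a sketch under assumptions: the helper name time_loop() and the use of hrtime() are illustrative and not from this thread, and absolute numbers will differ per system.

```php
<?php
// Hypothetical helper: run $fn $iterations times, return elapsed nanoseconds.
function time_loop(callable $fn, int $iterations): int|float
{
    $start = hrtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $foo = $fn();
    }
    return hrtime(true) - $start;
}

$iterations = 100000;
$withRandom = time_loop(fn () => random_bytes(16), $iterations);
$baseline   = time_loop(fn () => is_int(1), $iterations);

// Per-call cost attributable to random_bytes(), in microseconds.
$perCallUs = ($withRandom - $baseline) / $iterations / 1000;
printf("random_bytes(16): ~%.2f us per call above baseline\n", $perCallUs);
```

Subtracting the baseline removes the loop and function-call overhead, leaving an estimate of the cost of the underlying system call.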
I would suggest deprecating mt_srand()/srand() and using php_random_bytes()
in sessions etc. for PHP 8.3 for better security.
Syscalling getrandom() a few times to seed a PRNG or to generate a
session ID is not going to have a measurable effect. As I said in my
previous email:
"The cost of a few getrandom() syscalls is not really measurable
compared to the time spent waiting for the database, file IO or template
rendering."
I think this is a safe implementation due to the nature of CSPRNG, what do
you think?
I'm pretty sure the implementation is unsafe when the process calls
fork() which might happen with
https://www.php.net/manual/en/function.pcntl-fork.php.
The only thing I trust with actually generating proper
cryptographically secure randomness is the kernel. Non-kernel
implementations have proven to be insecure over and over again.
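To illustrate the fork() concern, here is a minimal sketch of a naive userland buffer. It is not the PoC's code: the class NaiveRandomBuffer is hypothetical, and clone merely stands in for the memory copy that fork() gives the child process. After the copy, both sides hand out the same "random" bytes.

```php
<?php
// Hypothetical naive buffer: fetches a chunk from the CSPRNG once, then
// serves small requests from memory without further syscalls.
final class NaiveRandomBuffer
{
    private string $buffer = '';

    public function getBytes(int $length): string
    {
        if (strlen($this->buffer) < $length) {
            // Refill with at least 256 bytes, or more for large requests.
            $this->buffer .= random_bytes(max(256, $length));
        }
        $out = substr($this->buffer, 0, $length);
        $this->buffer = substr($this->buffer, $length);
        return $out;
    }
}

$parent = new NaiveRandomBuffer();
$parent->getBytes(4); // first call fills the buffer

// Stand-in for fork(): the child starts with a byte-for-byte copy of the
// parent's memory, including the unread part of the buffer.
$child = clone $parent;

$fromParent = bin2hex($parent->getBytes(16));
$fromChild  = bin2hex($child->getBytes(16));
var_dump($fromParent === $fromChild); // bool(true): both sides get identical bytes
```

Calling random_bytes() directly avoids this, since each call asks the operating system for fresh bytes instead of reading from process memory.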
Best regards
Tim Düsterhus
Hi
Personally I likely wouldn't have merged the PR in question for the same
reasons. But at least in that case glibc is at fault :-)
For those following along:
It turns out the glibc "userland" implementation of arc4random() was
questionable and was simplified to be a relatively simple wrapper around
getrandom():
https://github.com/php/php-src/pull/8984#issuecomment-1195986646
and
https://sourceware.org/pipermail/libc-alpha/2022-July/140939.html
Best regards
Tim Düsterhus
On Thu, Jul 28, 2022 at 1:47, Tim Düsterhus tim@bastelstu.be wrote:
Hi
Thank you. After considering the various points of view, I realized that my
proposal is very dangerous. The language side should not take on
something that causes confusion even at the libc layer.
Also, the recently discussed vDSO implementation of getrandom (which I see no
safe way to do at the moment) seems like a better option that would benefit
all Linux distributions. Perhaps waiting for it is a better option than
anything else.
Thank you!
Regards,
Go Kudo