i want BLAKE3 hash support in php,
some justifications can be found here https://bugs.php.net/bug.php?id=79492
primary reasons being that it's a "cryptographically secure hash",
predecessor BLAKE was nearly crowned "SHA3" (SHA3 finalist),
and it's very fast in software,
when i run it against php-src/ext/hash/bench.php , on a cpu without
the most beneficial instruction-set available, it was still over 4
times faster than sha3-224, outperforming even several
non-crypto-hashes, exact numbers here:
https://github.com/php/php-src/pull/6358#issuecomment-713506971 )
i have a PR here: https://github.com/php/php-src/pull/6358
initially had several issues, but i think most of them are resolved now,
-
SSE/AVX optimizations are added for "gcc+(unix|windows)+x86_64" targets,
and i have asked the php-windows mailing list for help in adding
SSE/AVX optimizations to the MSVC builds,
but for the time being, MSVC builds are just using a portable generic
C implementation.
it would also be possible to add an optimized implementation for
ARM+Neon targets, i haven't looked into that at all. -
currently the code is rolling it's own cpu feature detection,
instead of asking Zend/zend_cpuinfo.c (specifically
ext/hash/blake3/blake3_dispatch.c ),
is that something that "must be fixed" before upstream merging can be
considered? or is it ok-ish? -
frankly, there's just a lot of code in there, is it possible that
it will be rejected just because "it's too much code" ? (there's a
reason there's so much code though, it contains cpu-specific
implementations for a lot of different environments, compilers, OS's,
and CPU's, from SSE2 to AVX512 to ARM+Neon, also a much smaller
implementation using just the "portable generic C implementation"
would be possible, at the cost of performance)
hmm i'll try again,
made another PR: https://github.com/php/php-src/pull/6393
this PR is an alternative to https://github.com/php/php-src/pull/6358
with some notable differences,
-
this PR is roughly 1700 SLOC, compared to #6358 at roughly 31,000 SLOC
-
this PR only bundles the "portable C" version of BLAKE3, it does not
contain any of the SSE/AVX/NEON cpu-optimized non-portable
implementations (hence the huge SLOC difference)
the idea here is that just having BLAKE3 supported in PHP, even if
it's not the most performant implementation, is better than no support
at all,
unsurprisingly, the portable version of BLAKE3 isn't nearly as fast as
in #6358 , being beaten by MD4 (still beats MD5 though), but it still
does a pretty good job in /ext/hash/bech.php, being roughly 1.9 times
faster than sha3-224 on my laptop (compared to 4.2 times faster in
#6358 ), and being the fastest secure hash on the list, for exact
numbers check the PR.
thoughts?
i want BLAKE3 hash support in php,
some justifications can be found here https://bugs.php.net/bug.php?id=79492
primary reasons being that it's a "cryptographically secure hash",
predecessor BLAKE was nearly crowned "SHA3" (SHA3 finalist),
and it's very fast in software,
when i run it against php-src/ext/hash/bench.php , on a cpu without
the most beneficial instruction-set available, it was still over 4
times faster than sha3-224, outperforming even several
non-crypto-hashes, exact numbers here:
https://github.com/php/php-src/pull/6358#issuecomment-713506971 )i have a PR here: https://github.com/php/php-src/pull/6358
initially had several issues, but i think most of them are resolved now,
SSE/AVX optimizations are added for "gcc+(unix|windows)+x86_64" targets,
and i have asked the php-windows mailing list for help in adding
SSE/AVX optimizations to the MSVC builds,
but for the time being, MSVC builds are just using a portable generic
C implementation.
it would also be possible to add an optimized implementation for
ARM+Neon targets, i haven't looked into that at all.currently the code is rolling it's own cpu feature detection,
instead of asking Zend/zend_cpuinfo.c (specifically
ext/hash/blake3/blake3_dispatch.c ),
is that something that "must be fixed" before upstream merging can be
considered? or is it ok-ish?frankly, there's just a lot of code in there, is it possible that
it will be rejected just because "it's too much code" ? (there's a
reason there's so much code though, it contains cpu-specific
implementations for a lot of different environments, compilers, OS's,
and CPU's, from SSE2 to AVX512 to ARM+Neon, also a much smaller
implementation using just the "portable generic C implementation"
would be possible, at the cost of performance)
hmm i'll try again,
made another PR: https://github.com/php/php-src/pull/6393this PR is an alternative to https://github.com/php/php-src/pull/6358
with some notable differences,
this PR is roughly 1700 SLOC, compared to #6358 at roughly 31,000 SLOC
this PR only bundles the "portable C" version of BLAKE3, it does not
contain any of the SSE/AVX/NEON cpu-optimized non-portable
implementations (hence the huge SLOC difference)the idea here is that just having BLAKE3 supported in PHP, even if
it's not the most performant implementation, is better than no support
at all,unsurprisingly, the portable version of BLAKE3 isn't nearly as fast as
in #6358 , being beaten by MD4 (still beats MD5 though), but it still
does a pretty good job in /ext/hash/bech.php, being roughly 1.9 times
faster than sha3-224 on my laptop (compared to 4.2 times faster in
#6358 ), and being the fastest secure hash on the list, for exact
numbers check the PR.thoughts?
I think the slower, portable version will be fine.
However, I still see CPU related code in get_cpu_features. Are you
sure you pushed the right thing?
@Levi
However, I still see CPU related code in get_cpu_features. Are you
sure you pushed the right thing?
I think it's correct, yes, the problem is ext/hash/blake3/blake3_dispatch.c,
which is just a copypaste of the upstream
https://github.com/BLAKE3-team/BLAKE3/blob/master/c/blake3_dispatch.c
(specifically version 0.3.7, with 2 patches that will be part of
0.3.8.. specifically https://github.com/BLAKE3-team/BLAKE3/pull/128
and https://github.com/BLAKE3-team/BLAKE3/pull/131
, both of which were accepted upstream)
Unfortunately the upstream blake3_dispatch.c always run cpu feature
detection when compiled for x86,
even when it's only compiling the portable implementation.
we are free to rip out that part for PHP's copy, and it should be a
pretty easy job to do too (i'm guessing ~30min including make test
on decent hardware),
note that if we do it, comparing php's version with the upstream
version is slightly more work, and updating to newer upstream versions
will be slightly more work
(we'll need to rip out the cpu-detection stuff next time we want to
sync with upstream and the upstream dispatch code changed. I'm
guessing it will be years between each time someone actually has a
good reason to do this, though.)
if ripping out the cpu feature detection code in blake3_dispatch is
desired, i can volunteer for that, if nobody else want to ^^