Newsgroups: php.internals
Date: Thu, 25 Jan 2024 06:21:25 +0100
To: Marco Pivetta
Cc: tag Knife, Hans Henrik Bergan, internals@lists.php.net
Subject: Re: [PHP-DEV] BLAKE3 hash
From: divinity76@gmail.com (Hans Henrik Bergan)

On Wed, 24 Jan 2024 at 17:59, Marco Pivetta wrote:
>
> Depends on the actual numbers: is there any way to make a comparison that
> is relatively stable across architectures?
>
> Would it be feasible to start with the
> cross-platform-let-the-compiler-do-its-job version (that somebody may
> actually be capable of auditing), and then introduce other versions when
> the jump is significant enough?
>

I don't know about "relatively stable across architectures", but I wrote
some benchmarking code, keep reading.

On Wed, 24 Jan 2024 at 17:55, tag Knife wrote:
> Should we even be considering the specific instruction implementations?
> I've always been in the camp
> of you are not smarter than the compiler. As even the best human written
> ASM code can be slower
> than the obscure instructions the compiler might choose to use in a weird
> and wonderful way.

The BLAKE3 team is smarter than GCC 11.4. Even with -march=native
-mtune=native, which is *not* commonly used to build PHP, the compiler
didn't stand a chance against the hand-optimized assembly versions. I wrote
some benchmarks, but the TL;DR, on my AMD Ryzen 9 7950X, is:

  portable, -O2 (what PHP usually uses):  1126 MB/s
  portable, -O2 -march=native:             533 MB/s (wtf? gcc obviously got something wrong here)
  hand-written SSE2, -O2:                 3144 MB/s
  hand-written SSE4.1, -O2:               3332 MB/s
  hand-written AVX2, -O2:                 6554 MB/s
  hand-written AVX512, -O2:               8913 MB/s

Benchmarking code:
https://gist.github.com/divinity76/5729472dd5d77e94cd0acb245aac2226

Raw output:

array(6) {
  ["O2-portable-march"]=>
  array(2) {
    ["microseconds_for_16_kib"]=>
    int(29295)
    ["mb_per_second"]=>
    float(533.3674688513398)
  }
  ["O2-portable"]=>
  array(2) {
    ["microseconds_for_16_kib"]=>
    int(13876)
    ["mb_per_second"]=>
    float(1126.0449697319111)
  }
  ["O2-sse2"]=>
  array(2) {
    ["microseconds_for_16_kib"]=>
    int(4969)
    ["mb_per_second"]=>
    float(3144.4958744214127)
  }
  ["O2-sse41"]=>
  array(2) {
    ["microseconds_for_16_kib"]=>
    int(4688)
    ["mb_per_second"]=>
    float(3332.977815699659)
  }
  ["O2-avx2"]=>
  array(2) {
    ["microseconds_for_16_kib"]=>
    int(2384)
    ["mb_per_second"]=>
    float(6554.1107382550335)
  }
  ["O2-avx512"]=>
  array(2) {
    ["microseconds_for_16_kib"]=>
    int(1753)
    ["mb_per_second"]=>
    float(8913.291500285226)
  }
}
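
For anyone who wants to sanity-check the figures, here is a small Python sketch (not part of the gist) that recomputes mb_per_second from the reported microseconds and expresses each build as a speedup over the portable -O2 build. One assumption to flag loudly: the total of 16 * 1024 * 1000 bytes (15.625 MiB, e.g. a 16 KiB buffer hashed 1000 times) is inferred from the numbers in the raw output, not read from the benchmark source, and "MB" is taken to mean 1024 * 1024 bytes. The function names are made up for illustration.

```python
# Timings copied from the raw output above (microseconds per benchmark run).
TIMINGS_US = {
    "O2-portable-march": 29295,
    "O2-portable": 13876,
    "O2-sse2": 4969,
    "O2-sse41": 4688,
    "O2-avx2": 2384,
    "O2-avx512": 1753,
}

# ASSUMPTION: total bytes hashed per run, inferred from the reported
# throughput figures rather than taken from the benchmark source.
TOTAL_BYTES = 16 * 1024 * 1000
BYTES_PER_MB = 1024 * 1024  # ASSUMPTION: "MB" here means MiB


def mb_per_second(microseconds, total_bytes=TOTAL_BYTES):
    """Throughput in MB/s for total_bytes processed in the given time."""
    return (total_bytes / BYTES_PER_MB) / (microseconds / 1_000_000)


def speedups(timings, baseline="O2-portable"):
    """Speedup of each build relative to the baseline build (1.0 = equal)."""
    base = timings[baseline]
    return {name: base / us for name, us in timings.items()}


if __name__ == "__main__":
    ratios = speedups(TIMINGS_US)
    for name, us in TIMINGS_US.items():
        print(f"{name:18s} {mb_per_second(us):9.1f} MB/s  {ratios[name]:5.2f}x")
```

Under that assumption the recomputed throughput agrees with the mb_per_second values in the dump. The ratios put the AVX512 build at a bit under 8x the portable -O2 build, and the -march=native build at less than half of it.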