Newsgroups: php.internals
Date: Thu, 25 Jan 2024 07:59:41 +0100
To: Hans Henrik Bergan
Cc: tag Knife, Hans Henrik Bergan, PHP Internals List
Subject: Re: [PHP-DEV] BLAKE3 hash
From: ocramius@gmail.com
(Marco Pivetta)

On Thu, 25 Jan 2024, 06:22 Hans Henrik Bergan wrote:

> On Wed, 24 Jan 2024 at 17:59, Marco Pivetta wrote:
> >
> > Depends on the actual numbers: is there any way to make a comparison that
> > is relatively stable across architectures?
> >
> > Would it be feasible to start with the
> > cross-platform-let-the-compiler-do-its-job version (that somebody may
> > actually be capable of auditing), and then introduce other versions when
> > the jump is significant enough?
> >
>
> don't know about "relatively stable across architectures" but wrote
> some benchmarking code, keep reading.
>
>
> > On Wed, 24 Jan 2024 at 17:55, tag Knife wrote:
> > Should we even be considering the specific instruction implementations?
> > I've always been in the camp
> > of you are not smarter than the compiler. As even the best human written
> > ASM code can be slower
> > than the obscure instructions the compiler might choose to use in a weird
> > and wonderful way.
>
> The BLAKE3 team is smarter than GCC11.4, even with -march=native
> -mtune=native, which is *not* commonly used in PHP,
> the compiler didn't stand a chance against the hand-optimized assembly
> versions,
>
> wrote some benchmarks, but the TL;DR is:
> portable -O2 usually used by PHP managed 1126MB/s,
> portable -O2 -march=native managed 533MB/s (wtf? gcc obviously got
> something wrong here),
> hand-written -O2 SSE2 managed 3144MB/s,
> hand-written -O2 SSE41 managed 3332MB/s,
> hand-written -O2 avx2 managed 6554MB/s,
> hand-written -O2 AVX512 managed 8913MB/s,
> on my AMD Ryzen 9 7950x,
> benchmarking code:
> https://gist.github.com/divinity76/5729472dd5d77e94cd0acb245aac2226
> raw output:
> array(6) {
>   ["O2-portable-march"]=>
>   array(2) {
>     ["microseconds_for_16_kib"]=>
>     int(29295)
>     ["mb_per_second"]=>
>     float(533.3674688513398)
>   }
>   ["O2-portable"]=>
>   array(2) {
>     ["microseconds_for_16_kib"]=>
>     int(13876)
>     ["mb_per_second"]=>
>     float(1126.0449697319111)
>   }
>   ["O2-sse2"]=>
>   array(2) {
>     ["microseconds_for_16_kib"]=>
>     int(4969)
>     ["mb_per_second"]=>
>     float(3144.4958744214127)
>   }
>   ["O2-sse41"]=>
>   array(2) {
>     ["microseconds_for_16_kib"]=>
>     int(4688)
>     ["mb_per_second"]=>
>     float(3332.977815699659)
>   }
>   ["O2-avx2"]=>
>   array(2) {
>     ["microseconds_for_16_kib"]=>
>     int(2384)
>     ["mb_per_second"]=>
>     float(6554.1107382550335)
>   }
>   ["O2-avx512"]=>
>   array(2) {
>     ["microseconds_for_16_kib"]=>
>     int(1753)
>     ["mb_per_second"]=>
>     float(8913.291500285226)
>   }
> }
>

Oh yes, the AVX jump is impressive 😵
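
To put the quoted figures side by side, the relative speedups can be recomputed from the raw timings. The following is only a sketch in Python, not part of the linked benchmark: the timings are copied from the output above, while the constant 15.625 is an assumption inferred from the numbers (it is the value that reproduces every reported mb_per_second entry), and the names timings_us, DATA_MB, throughput and speedup are made up for illustration.

```python
# Timings copied from the quoted raw output, in microseconds per run.
timings_us = {
    "O2-portable-march": 29295,
    "O2-portable": 13876,
    "O2-sse2": 4969,
    "O2-sse41": 4688,
    "O2-avx2": 2384,
    "O2-avx512": 1753,
}

# Assumption: amount of data hashed per run, in the benchmark's "MB" unit.
# Inferred from the output (mb_per_second * seconds is 15.625 for every
# entry); the thread itself does not state it.
DATA_MB = 15.625


def throughput(microseconds: int) -> float:
    """Return MB/s for one run that took `microseconds`."""
    return DATA_MB / (microseconds / 1_000_000)


# Speedup of each build relative to the portable -O2 build.
baseline_us = timings_us["O2-portable"]
speedup = {name: baseline_us / us for name, us in timings_us.items()}

# Print slowest first, so the progression across instruction sets is visible.
for name in sorted(timings_us, key=timings_us.get, reverse=True):
    print(f"{name:<18} {throughput(timings_us[name]):9.1f} MB/s  "
          f"{speedup[name]:5.2f}x vs portable")
```

On these numbers the AVX-512 build comes out at roughly 7.9 times the portable -O2 build, and the -march=native build at roughly 0.47 times. Both ratios come from a single run on a single machine, so they say little about stability across architectures.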