Newsgroups: php.internals
List-Id: internals.lists.php.net
From: bukka@php.net (Jakub Zelenka)
Date: Sat, 30 Mar 2024 12:03:02 +0000
Subject: Re: [PHP-DEV] Consider removing autogenerated files from tarballs
To: Marco Pivetta
Cc: Ben Ramsey, Bob Weinand, Daniil Gentili, PHP Internals List
References: <9008050F-4EE1-4E19-B513-654602E118A7@benramsey.com>

Hi,

On Sat, Mar 30, 2024 at 7:08 AM Marco Pivetta <ocramius@gmail.com> wrote:

> On Sat, 30 Mar 2024, 05:19 Ben Ramsey, <ben@benramsey.com> wrote:
>
>> On Mar 29, 2024, at 20:20, Bob Weinand <bobwei9@hotmail.com> wrote:
>>
>> On 29.3.2024 23:31:26, Daniil Gentili wrote:
>>
>> In light of the recent supply chain attack in xz/lzma, leading to a backdoor in openSSH (https://www.openwall.com/lists/oss-security/2024/03/29/4), I believe that it would be a good idea to remove the huge attack surface offered by the pre-generated autoconf build scripts and lexers, offered in the release tarballs.
>>
>> In particular, the xz supply chain attack injected the exploit with a few obfuscated lines, manually added to the end of the pre-generated configure script, that was only bundled in the tarballs.
>>
>> Even if the exploits themselves were committed to the repo in the form of test files, the code that actually injected the exploit in the library was not committed to the repo, and was only present in the pre-generated configure script in the tarball: this injection mode makes sense, as extra files in the tarball not present in the git repo would raise suspicions, but machine-generated configure scripts containing hundreds of thousands of lines of code not present in the upstream VCS are the norm, and are usually not checked before execution.
>>
>> Specifically in the case of PHP, along from the configure script, the tarball also bundles generated lexer files which contain actual C code, which is an additional attack vector, i.e. here's the diff between the tarball of the 8.3.4 release, and the PHP-8.3.4 tag on the git repo:
>>
>> ```
>> ~ $ diff -r php-8.3.4 php-src -q
>> Only in php-src: .git
>> Files php-8.3.4/NEWS and php-src/NEWS differ
>> Files php-8.3.4/Zend/zend.h and php-src/Zend/zend.h differ
>> Only in php-8.3.4/Zend: zend_ini_parser.c
>> Only in php-8.3.4/Zend: zend_ini_parser.h
>> Only in php-8.3.4/Zend: zend_ini_parser.output
>> Only in php-8.3.4/Zend: zend_ini_scanner.c
>> Only in php-8.3.4/Zend: zend_ini_scanner_defs.h
>> Only in php-8.3.4/Zend: zend_language_parser.c
>> Only in php-8.3.4/Zend: zend_language_parser.h
>> Only in php-8.3.4/Zend: zend_language_parser.output
>> Only in php-8.3.4/Zend: zend_language_scanner.c
>> Only in php-8.3.4/Zend: zend_language_scanner_defs.h
>> Only in php-8.3.4: configure
>> Files php-8.3.4/configure.ac and php-src/configure.ac differ
>> Only in php-8.3.4/ext/json: json_parser.tab.c
>> Only in php-8.3.4/ext/json: json_parser.tab.h
>> Only in php-8.3.4/ext/json: json_scanner.c
>> Only in php-8.3.4/ext/json: php_json_scanner_defs.h
>> Only in php-8.3.4/ext/pdo: pdo_sql_parser.c
>> Only in php-8.3.4/ext/phar: phar_path_check.c
>> Only in php-8.3.4/ext/standard: url_scanner_ex.c
>> Only in php-8.3.4/ext/standard: var_unserializer.c
>> Only in php-8.3.4/main: php_config.h.in
>> Files php-8.3.4/main/php_version.h and php-src/main/php_version.h differ
>> Only in php-8.3.4/pear: install-pear-nozlib.phar
>> Only in php-8.3.4/sapi/phpdbg: phpdbg_lexer.c
>> Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.c
>> Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.h
>> Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.output
>> ```
>>
>> To prevent attacks from malevolent/compromised RMs, I propose completely removing all autogenerated files from the release tarballs, and ensuring their content exactly matches the content of the associated git tag (this means also removing the -dev prefix from the version number in main/php_version.h, Zend/zend.h, configure.ac and NEWS in the git tag).
>>
>> Of course this means that users will have to generate the build scripts when compiling PHP, as when installing PHP from the VCS repo.
>>
>> I'm sending a copy of this email to security@php.net as well.
>>
>> Hey Daniil,
>>
>> You can also have a public CI (i.e. a github action) generate the artifacts, along with hash computation.
>> It should be a github action which runs on tags. This makes it fully verifiable; i.e. the code for the generation of action, including the hash. Anyone who wants can trivially trace this back.
>>
>> There's nothing in the tarballs which cannot be trivially automated and made verifiable.
>>
>> I don't think providing pre-generated files is fundamentally flawed, the primary lacking thing is verifiability. Which is also what enabled the xz backdoor.
>>
>> Bob
>>
>> This is also why our release managers sign the tarballs with their own GPG keys, after generating the artifacts. This verifies the release manager was the one who generated the files.
>>
>> Cheers,
>> Ben
>
> Hey Ben,
>
> I understand that the XZ project had signed releases too: that still means that downstream consumers would need to trust the release managers anyway, and reproduce the whole chain themselves.
>
> I suppose that's part of OP's concern.

I agree that a compromised RM is a problem that we should look into.

We have actually already been discussing something similar. I have been thinking about it, and it could potentially be used for all builds. The idea is that we would set up a workflow on CI that runs on tag push and calls the downloads.php.net server (via an authenticated HTTPS request). The server would do the actual builds, sign them, and return the hashes to the CI job, which would display them and do extra verification (probably its own build, to verify that the downloads server works as expected). Then the builds would be made available for download. The RM's job would just be to check that everything worked as expected, potentially verify the builds offered for download, and do all the announcements. This is a bit of work, but I think it should completely remove the possibility of a compromised RM compromising the builds, which is currently possible. It would probably make sense to let the RM sign the builds as well, which should reduce the risk posed by a compromised downloads server.
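
To make the extra verification step a bit more concrete, here is a rough sketch in Python of how the CI job could compare the hashes returned by the downloads server with the hashes of its own build. Nothing like this exists yet: the function names and the directory layout are made up for illustration, and it assumes the builds are reproducible, i.e. that two independent builds of the same tag produce byte-identical files.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_builds(build_dir: Path) -> dict[str, str]:
    """Map each file name directly inside build_dir to its SHA-256 digest."""
    return {p.name: sha256_of(p) for p in sorted(build_dir.iterdir()) if p.is_file()}


def find_mismatches(server_hashes: dict[str, str], ci_hashes: dict[str, str]) -> list[str]:
    """List build names whose hashes differ or that exist on only one side."""
    names = sorted(set(server_hashes) | set(ci_hashes))
    return [name for name in names if server_hashes.get(name) != ci_hashes.get(name)]
```

The CI job would call `hash_builds()` on its own build directory, pass the result to `find_mismatches()` together with the hashes reported by the server, and fail if the returned list is not empty.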

It needs more thinking to iron out all the details and make sure it is secure, but I think it would be worth looking at.

Regards

Jakub

