From: ocramius@gmail.com (Marco Pivetta)
Date: Sat, 30 Mar 2024 08:03:02 +0100
Subject: Re: [PHP-DEV] Consider removing autogenerated files from tarballs
To: Ben Ramsey
Cc: Bob Weinand, Daniil Gentili, PHP Internals List
Newsgroups: php.internals


On Sat, 30 Mar 2024, 05:19 Ben Ramsey, <ben@benramsey.com> wrote:
On Mar 29, 2024, at 20:20, Bob Weinand <bobwei9@hotmail.com> wrote:

On 29.3.2024 23:31:26, Daniil Gentili wrote:
In light of the recent supply chain attack in xz/lzma, leading to a backdoor affecting OpenSSH (https://www.openwall.com/lists/oss-security/2024/03/29/4), I believe that it would be a good idea to remove the huge attack surface offered by the pre-generated autoconf build scripts and lexers shipped in the release tarballs.

In particular, the xz supply chain attack injected the exploit with a few obfuscated lines, manually added to the end of the pre-generated configure script, which was only bundled in the tarballs.

Even if the exploits themselves were committed to the repo in the form of test files, the code that actually injected the exploit in the library was not committed to the repo, and was only present in the pre-generated configure script in the tarball: this injection mode makes sense, as extra files in the tarball not present in the git repo would raise suspicions, but machine-generated configure scripts containing hundreds of thousands of lines of code not present in the upstream VCS are the norm, and are usually not checked before execution.

Specifically in the case of PHP, apart from the configure script, the tarball also bundles generated lexer files which contain actual C code, which is an additional attack vector. Here's the diff between the tarball of the 8.3.4 release and the PHP-8.3.4 tag on the git repo:

```
~ $ diff -r php-8.3.4 php-src -q
Only in php-src: .git
Files php-8.3.4/NEWS and php-src/NEWS differ
Files php-8.3.4/Zend/zend.h and php-src/Zend/zend.h differ
Only in php-8.3.4/Zend: zend_ini_parser.c
Only in php-8.3.4/Zend: zend_ini_parser.h
Only in php-8.3.4/Zend: zend_ini_parser.output
Only in php-8.3.4/Zend: zend_ini_scanner.c
Only in php-8.3.4/Zend: zend_ini_scanner_defs.h
Only in php-8.3.4/Zend: zend_language_parser.c
Only in php-8.3.4/Zend: zend_language_parser.h
Only in php-8.3.4/Zend: zend_language_parser.output
Only in php-8.3.4/Zend: zend_language_scanner.c
Only in php-8.3.4/Zend: zend_language_scanner_defs.h
Only in php-8.3.4: configure
Files php-8.3.4/configure.ac and php-src/configure.ac differ
Only in php-8.3.4/ext/json: json_parser.tab.c
Only in php-8.3.4/ext/json: json_parser.tab.h
Only in php-8.3.4/ext/json: json_scanner.c
Only in php-8.3.4/ext/json: php_json_scanner_defs.h
Only in php-8.3.4/ext/pdo: pdo_sql_parser.c
Only in php-8.3.4/ext/phar: phar_path_check.c
Only in php-8.3.4/ext/standard: url_scanner_ex.c
Only in php-8.3.4/ext/standard: var_unserializer.c
Only in php-8.3.4/main: php_config.h.in
Files php-8.3.4/main/php_version.h and php-src/main/php_version.h differ
Only in php-8.3.4/pear: install-pear-nozlib.phar
Only in php-8.3.4/sapi/phpdbg: phpdbg_lexer.c
Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.c
Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.h
Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.output
```
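For anyone who wants to script the comparison above rather than read `diff` output, here is a minimal sketch in Python, assuming both trees are already unpacked or checked out locally. The function name `compare_trees` and its return shape are made up for illustration and are not part of any PHP tooling.

```python
import filecmp
from pathlib import Path


def compare_trees(tarball_dir, repo_dir, ignore=(".git",)):
    """Roughly emulate `diff -rq`: return (only_in_tarball, only_in_repo, differing).

    Hypothetical helper. Paths are relative to the tree roots and use "/".
    """
    only_tarball, only_repo, differing = [], [], []

    def walk(cmp, prefix):
        only_tarball.extend((prefix / name).as_posix() for name in cmp.left_only)
        only_repo.extend((prefix / name).as_posix() for name in cmp.right_only)
        # dircmp only compares stat signatures by default, so compare the
        # contents of the common files explicitly with shallow=False.
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        differing.extend((prefix / name).as_posix() for name in mismatch + errors)
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix / name)

    walk(filecmp.dircmp(tarball_dir, repo_dir, ignore=list(ignore)), Path(""))
    return sorted(only_tarball), sorted(only_repo), sorted(differing)
```

Calling `compare_trees("php-8.3.4", "php-src")` would list the generated files under "only in tarball" and the version-bearing files under "differing"; unlike the `diff` run above, this sketch skips `.git` by default.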

To prevent attacks from malevolent/compromised RMs, I propose completely removing all autogenerated files from the release tarballs, and ensuring the tarball content exactly matches the content of the associated git tag (this means also removing the -dev suffix from the version number in main/php_version.h, Zend/zend.h, configure.ac and NEWS in the git tag).
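A check along these lines could be run without even unpacking the tarball. The sketch below is only an illustration under assumptions: `extra_tarball_files` is a made-up name, and `tracked_paths` is assumed to be the file list of the release tag, for example the output lines of `git ls-tree -r --name-only` for that tag.

```python
import tarfile


def extra_tarball_files(tarball_path, tracked_paths, strip_components=1):
    """Return regular files in the tarball that the git tag does not track.

    Hypothetical helper. strip_components drops the leading "php-x.y.z/"
    directory that release tarballs usually wrap their content in.
    """
    tracked = set(tracked_paths)
    extra = []
    with tarfile.open(tarball_path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, symlinks, etc.
            rel = "/".join(member.name.split("/")[strip_components:])
            if rel and rel not in tracked:
                extra.append(rel)
    return sorted(extra)
```

An empty result would mean the tarball ships no file that is missing from the tag; it says nothing about whether the shared files have identical content, which needs a separate comparison.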

Of course this means that users will have to generate the build scripts when compiling PHP, as when installing PHP from the VCS repo.

I'm sending a copy of this email to security@php.net as well.

Hey Daniil,

You can also have a public CI (e.g. a GitHub Action) generate the artifacts, along with hash computation.
It should be a GitHub Action which runs on tags. This makes it fully verifiable, i.e. the code of the action that does the generation, including the hash computation, is public. Anyone who wants can trivially trace this back.
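The hash computation step could look roughly like the following Python sketch, which a CI job might run over the generated tarballs. The function names are hypothetical, SHA-256 is an assumption (the thread does not name an algorithm), and the output format merely imitates `sha256sum`.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_lines(paths):
    """Format digests as '<hex>  <file name>' lines, one per artifact."""
    return [f"{sha256_of(p)}  {os.path.basename(p)}" for p in sorted(paths)]
```

Anyone rebuilding the artifacts from the same tag can run the same function and compare the resulting lines with the ones published by the CI run.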

There's nothing in the tarballs which cannot be trivially automated and made verifiable.

I don't think providing pre-generated files is fundamentally flawed; what is primarily lacking is verifiability, and that lack is also what enabled the xz backdoor.

Bob


This is also why our release managers sign the tarballs with their own GPG keys, after generating the artifacts. This verifies the release manager was the one who generated the files.
Cheers,
Ben

Hey Ben,
I understand that the XZ project had signed releases too: that still means that downstream consumers would need to trust the release managers anyway, and reproduce the whole chain themselves.

I suppose that's part of OP's concern.

