Hi all,
Debian, Arch Linux and other distros are trying to achieve fully
reproducible builds. There are some issues in PHP's codebase which make
builds unreproducible. Reproducibility is currently tested in
Arch Linux by building PHP twice, in two different environments, varying
the hostname, system time, etc. [1]
One issue is PHP_BUILD_DATE, which makes the build
non-reproducible. I've made a PR which uses SOURCE_DATE_EPOCH, which is
set in the reproducible build environment. This should keep the current
functionality intact, while adding support for reproducible builds. [2]
[3]
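As an illustration of the idea (a minimal Python sketch, not the code from the PR; the function names and the date format are assumptions made for this example), a build step can prefer SOURCE_DATE_EPOCH over the wall clock like this:

```python
import os
import time


def build_timestamp(environ=None):
    """Return the Unix time to embed in the build output.

    Uses SOURCE_DATE_EPOCH when it is set; otherwise falls back to the
    current time, which is the non-reproducible behaviour.
    A malformed value raises ValueError instead of being silently ignored.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is not None:
        return int(raw)
    return int(time.time())


def build_date_string(timestamp):
    """Format the timestamp in UTC so the result does not depend on the
    build machine's time zone. The format itself is only illustrative."""
    return time.strftime("%b %d %Y %H:%M:%S", time.gmtime(timestamp))
```

With SOURCE_DATE_EPOCH exported by the build environment, build_date_string(build_timestamp()) yields the same string on every run; without it, the behaviour is unchanged from today.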
Another issue is the php_uname() function, whose output contains the
hostname. Since the hostname is varied per build, this makes the build
non-reproducible. This is caused by the following line:
configure.ac:PHP_UNAME=`uname -a | xargs`
which is used in:
ext/standard/info.c: php_uname = PHP_UNAME;
Which is there as a fallback, as the php.net documentation describes:
"On some older UNIX platforms, it may not be able to determine the
current OS information in which case it will revert to displaying the OS
PHP was built on. This will only happen if your uname() library call
either doesn't exist or doesn't work."
I would argue that this is strange, unexpected behaviour; maybe it
should throw an exception instead? Or could it show only "Linux" as a
fallback, basically PHP_OS? Ideas?
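To illustrate the fallback idea, here is a minimal Python sketch. It assumes a hypothetical BUILD_OS_FALLBACK constant standing in for PHP_OS, and it is not how ext/standard/info.c is written; it only shows a fallback that carries the OS name and no hostname:

```python
import os

# Hypothetical stand-in for a value fixed at build time (the role PHP_OS
# would play). It deliberately contains no hostname or kernel build info.
BUILD_OS_FALLBACK = "Linux"


def runtime_uname(uname_func=None, fallback=BUILD_OS_FALLBACK):
    """Describe the running system, or return the build-time OS name
    when the uname call is missing or fails."""
    if uname_func is None:
        # os.uname is not available on every platform.
        uname_func = getattr(os, "uname", None)
    if uname_func is None:
        return fallback
    try:
        info = uname_func()
    except OSError:
        return fallback
    return " ".join(
        [info.sysname, info.nodename, info.release, info.version, info.machine]
    )
```

The hostname still appears in the normal case, but it is read at run time, so nothing host-specific has to be baked into the binary.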
The last issue is phar.phar being non-reproducible; I am not
sure what the cause is, as I don't know how the binary data in
phar.phar is generated.
[1] https://tests.reproducible-builds.org/archlinux/extra/php/php-7.2.0-2-x86_64.pkg.tar.xz.html
[2] https://github.com/php/php-src/pull/2965
[3] https://reproducible-builds.org/specs/source-date-epoch/
Thanks,
--
Jelle van der Waa
Arch Linux Developer
Hi!
> One issue is PHP_BUILD_DATE, which makes the build
> non-reproducible. I've made a PR which uses SOURCE_DATE_EPOCH, which is
> set in the reproducible build environment. This should keep the current
> functionality intact, while adding support for reproducible builds. [2]
> [3]
SOURCE_DATE_EPOCH (or any other variable) looks like a good way to make
it predictable.
> Another issue is the php_uname() function, whose output contains the
> hostname. Since the hostname is varied per build, this makes the build
> non-reproducible. This is caused by the following line:
> configure.ac:PHP_UNAME=`uname -a | xargs`
> which is used in:
> ext/standard/info.c: php_uname = PHP_UNAME;
I think the best solution here would be to have another variable to
override this.
> I would argue that this is strange, unexpected behaviour; maybe it
> should throw an exception instead? Or could it show only "Linux" as a
> fallback, basically PHP_OS? Ideas?
If those old systems run PHP and need uname, changing stuff there is
probably harder and more expensive than on other systems. With this in
mind, I'd rather not mess with it, especially for a purpose that can
easily be achieved without it.
--
Stas Malyshev
smalyshev@gmail.com
Hi!
> > One issue is PHP_BUILD_DATE, which makes the build
> > non-reproducible. I've made a PR which uses SOURCE_DATE_EPOCH, which is
> > set in the reproducible build environment. This should keep the current
> > functionality intact, while adding support for reproducible builds. [2]
> > [3]
> SOURCE_DATE_EPOCH (or any other variable) looks like a good way to make
> it predictable.
> > Another issue is the php_uname() function, whose output contains the
> > hostname. Since the hostname is varied per build, this makes the build
> > non-reproducible. This is caused by the following line:
> > configure.ac:PHP_UNAME=`uname -a | xargs`
> > which is used in:
> > ext/standard/info.c: php_uname = PHP_UNAME;
> I think the best solution here would be to have another variable to
> override this.
The issue with this approach would be that every distribution has to set
this variable. I know it's the same with SOURCE_DATE_EPOCH, but that is
well established.
> > I would argue that this is strange, unexpected behaviour; maybe it
> > should throw an exception instead? Or could it show only "Linux" as a
> > fallback, basically PHP_OS? Ideas?
> If those old systems run PHP and need uname, changing stuff there is
> probably harder and more expensive than on other systems. With this in
> mind, I'd rather not mess with it, especially for a purpose that can
> easily be achieved without it.
Hmmm, true, but the fallback being the hostname of the machine PHP was
built on seems a little bit odd, doesn't it?
--
Jelle van der Waa
Hi!
> > I think the best solution here would be to have another variable to
> > override this.
> The issue with this approach would be that every distribution has to set
> this variable. I know it's the same with SOURCE_DATE_EPOCH, but that is
> well established.
Every distro that wants a reproducible build of PHP, that is. But I assume
they need to do some special magic to initiate a reproducible build anyway;
if so, we could document the procedure for setting up a reproducible build
in some readme file, and make it easy to set up. They won't need to set it
up for all builds, just for the PHP build, and since most use special
scripts to build PHP anyway, it shouldn't be too hard to add.
> > If those old systems run PHP and need uname, changing stuff there is
> > probably harder and more expensive than on other systems. With this in
> > mind, I'd rather not mess with it, especially for a purpose that can
> > easily be achieved without it.
> Hmmm, true, but the fallback being the hostname of the machine PHP was
> built on seems a little bit odd, doesn't it?
Yes, but I'd follow the "Chesterton's fence" principle here. Maybe we could
use some ifdefs and configure magic to ensure this is actually not
happening on the kind of systems where reproducible builds are run?
Stas Malyshev
smalyshev@gmail.com
> Hi all,
> Debian, Arch Linux and other distros are trying to achieve fully
> reproducible builds. There are some issues in PHP's codebase which make
> builds unreproducible. Reproducibility is currently tested in
> Arch Linux by building PHP twice, in two different environments, varying
> the hostname, system time, etc. [1]
> One issue is PHP_BUILD_DATE, which makes the build
> non-reproducible. I've made a PR which uses SOURCE_DATE_EPOCH, which is
> set in the reproducible build environment. This should keep the current
> functionality intact, while adding support for reproducible builds. [2]
> [3]
It looks good to me.
> Another issue is the php_uname() function, whose output contains the
> hostname. Since the hostname is varied per build, this makes the build
> non-reproducible. This is caused by the following line:
> configure.ac:PHP_UNAME=`uname -a | xargs`
> which is used in:
> ext/standard/info.c: php_uname = PHP_UNAME;
> Which is there as a fallback, as the php.net documentation describes:
> "On some older UNIX platforms, it may not be able to determine the
> current OS information in which case it will revert to displaying the OS
> PHP was built on. This will only happen if your uname() library call
> either doesn't exist or doesn't work."
> I would argue that this is strange, unexpected behaviour; maybe it
> should throw an exception instead? Or could it show only "Linux" as a
> fallback, basically PHP_OS? Ideas?
I wouldn't throw an exception here. It seems PHP_OS is
under-documented; maybe PHP_OS_FAMILY is better:
"The operating system family PHP was built for. Either of 'Windows', 'BSD', 'Darwin', 'Solaris', 'Linux' or 'Unknown'. Available as of PHP 7.2.0."
However, I really don't think we should change this for already
released PHP versions. We should ask our maintainers how they feel about
changing it in an x.y.NEXT patch. My inclination is to do this for PHP
7.3 and beyond and accept that official PHP sources of earlier
versions will not produce reproducible builds.
> The last issue is phar.phar being non-reproducible; I am not
> sure what the cause is, as I don't know how the binary data in
> phar.phar is generated.
Phars are like tars that are also valid PHP files. This means there
are probably modification times, etc., set in there. Not sure what else
would need to be changed.
> > Hi all,
> > Debian, Arch Linux and other distros are trying to achieve fully
> > reproducible builds. There are some issues in PHP's codebase which make
> > builds unreproducible. Reproducibility is currently tested in
> > Arch Linux by building PHP twice, in two different environments, varying
> > the hostname, system time, etc. [1]
> > One issue is PHP_BUILD_DATE, which makes the build
> > non-reproducible. I've made a PR which uses SOURCE_DATE_EPOCH, which is
> > set in the reproducible build environment. This should keep the current
> > functionality intact, while adding support for reproducible builds. [2]
> > [3]
> It looks good to me.
> > Another issue is the php_uname() function, whose output contains the
> > hostname. Since the hostname is varied per build, this makes the build
> > non-reproducible. This is caused by the following line:
> > configure.ac:PHP_UNAME=`uname -a | xargs`
> > which is used in:
> > ext/standard/info.c: php_uname = PHP_UNAME;
> > Which is there as a fallback, as the php.net documentation describes:
> > "On some older UNIX platforms, it may not be able to determine the
> > current OS information in which case it will revert to displaying the OS
> > PHP was built on. This will only happen if your uname() library call
> > either doesn't exist or doesn't work."
> > I would argue that this is strange, unexpected behaviour; maybe it
> > should throw an exception instead? Or could it show only "Linux" as a
> > fallback, basically PHP_OS? Ideas?
> I wouldn't throw an exception here. It seems PHP_OS is
> under-documented; maybe PHP_OS_FAMILY is better:
The difference between PHP_OS and PHP_OS_FAMILY is strange indeed. I'll
have to do some further digging.
> "The operating system family PHP was built for. Either of 'Windows', 'BSD', 'Darwin', 'Solaris', 'Linux' or 'Unknown'. Available as of PHP 7.2.0."
> However, I really don't think we should change this for already
> released PHP versions. We should ask our maintainers how they feel about
> changing it in an x.y.NEXT patch. My inclination is to do this for PHP
> 7.3 and beyond and accept that official PHP sources of earlier
> versions will not produce reproducible builds.
Indeed, as an Arch Linux developer I'm fine with these changes landing
in the next release without backporting.
> > The last issue is phar.phar being non-reproducible; I am not
> > sure what the cause is, as I don't know how the binary data in
> > phar.phar is generated.
> Phars are like tars that are also valid PHP files. This means there
> are probably modification times, etc., set in there. Not sure what else
> would need to be changed.
Thanks for the information, I'll see if I can do some more digging.
--
Jelle van der Waa
> > > The last issue is phar.phar being non-reproducible; I am not
> > > sure what the cause is, as I don't know how the binary data in
> > > phar.phar is generated.
> > Phars are like tars that are also valid PHP files. This means there
> > are probably modification times, etc., set in there. Not sure what else
> > would need to be changed.
> Thanks for the information, I'll see if I can do some more digging.
I have had similar issues with Phar files when I tried to make Composer
builds reproducible. The cause is that the Phar extension uses the
current Unix timestamp as the filemtime for all files in the table of
contents (at least when using addFromString), so every time you build, the
TOC is different and hence the signature at the end is too.
I built a tool to fix this which just overwrites the TOC timestamps with
whatever you want and then updates the signature. If it helps, you can
find it here:
https://github.com/Seldaek/phar-utils
Example usage in Composer:
I guess an alternative fix would be for someone to actually fix the Phar
extension so addFromString has a filemtime parameter you can pass the
desired mtime to. I have not checked whether addFile suffers from the
same issue or not, but possibly it needs to be fixed to read the mtime
from the file you add.
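To illustrate why the timestamp matters, here is a minimal Python sketch. It uses a plain tar archive as a stand-in for a Phar and a SHA-256 digest as a stand-in for the signature; both are assumptions for the example, and neither is the Phar format nor the phar-utils API:

```python
import hashlib
import io
import tarfile


def build_archive(files, mtime):
    """Pack {name: bytes} into an in-memory tar, stamping every entry
    with the given mtime instead of the current time."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):  # fixed order, independent of dict order
            data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime  # the only per-build input in this sketch
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def signature(archive_bytes):
    """Digest over the whole archive, standing in for the trailing signature."""
    return hashlib.sha256(archive_bytes).hexdigest()
```

Two builds of the same files with the same mtime produce identical bytes and therefore the same digest; changing the mtime by a single second changes the digest, which is the effect described above.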
Best,
Jordi
--
Jordi Boggiano
@seldaek - http://seld.be
On 15.12.2017 at 11:13, Jordi Boggiano wrote:
> I guess an alternative fix would be for someone to actually fix the Phar
> extension so addFromString has a filemtime parameter you can pass the
> desired mtime to. I have not checked whether addFile suffers from the same
> issue or not, but possibly it needs to be fixed to read the mtime from the
> file you add.
+1
> On 15.12.2017 at 11:13, Jordi Boggiano wrote:
> > I guess an alternative fix would be for someone to actually fix the Phar
> > extension so addFromString has a filemtime parameter you can pass the
> > desired mtime to. I have not checked whether addFile suffers from the same
> > issue or not, but possibly it needs to be fixed to read the mtime from the
> > file you add.
> +1
I'm not sure timestamps are the issue; the generated phar.phar binary
is non-reproducible, as can be seen in this diff. I'll do some more
digging :)
https://tests.reproducible-builds.org/archlinux/extra/php/php-7.2.0-2-x86_64.pkg.tar.xz.html
--
Jelle van der Waa