Hi all,
this is about the PHP dependencies for Windows which are available at
https://downloads.php.net/~windows/php-sdk/deps/. As is, new
dependency builds are uploaded manually. This has a couple of problems:
- only few people can do these uploads
At first, that is a good thing, because many people could easily step on
each other's toes, since there are no locks when uploading. However,
whenever a new dependency build needs to be uploaded, you need to get
one of these few people to actually do the upload. It is not clear who
is allowed to do it, and who has time.
- the process is not transparent
I.e. there is no easy way of being informed about updates, so you would
need to regularly run phpsdk_deps -u, often just to see that there are
no updates available.
- there is no history of the series
While not super important, it would be nice to be able to fetch an older
set of dependencies to be able to build an older PHP version with its
original dependency version (a step towards reproducible builds).
- the process is prone to error
It's all too easy to make some mistakes when editing the series files.
Sometimes you forget to update one of the series files, sometimes you
have a typo (e.g. you add an x64 dependency to an x86 series file),
sometimes you may update a series file which is not supposed to be
updated. And not so rarely, you may forget to archive a dependency
version which is no longer required.
All these problems could be solved, or at least mitigated, by setting up
a GitHub repository for the dependency builds and the series files. The
upload could be as simple as a cron'd git pull on the server. The
history of the series would be implicitly tracked (or even explicitly
when using tags). Users who want to be informed about new dependencies
could subscribe to that repository. And mistakes are less likely to
occur due to the improved transparency, and there could even be GH
actions which do some basic sanity checks.
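To illustrate the kind of basic sanity check meant here, the following is a minimal Python sketch that flags entries whose architecture does not match the series file they are listed in (the x64-in-x86 typo mentioned above). The file naming scheme, the function names and the example entries are assumptions made for illustration; they are not taken from an existing tool.

```python
import re

# Assumed naming scheme (not confirmed by this thread): a series file such as
# "packages-8.3-vs16-x64-staging.txt" lists one archive per line, e.g.
# "openssl-3.0.15-vs16-x64.zip".
ARCH_RE = re.compile(r"-(x64|x86)(?=[-.])")


def arch_of(name):
    """Return 'x64' or 'x86' if the name carries an architecture tag, else None."""
    match = ARCH_RE.search(name)
    return match.group(1) if match else None


def find_arch_mismatches(series_name, lines):
    """Return entries whose architecture differs from (or is missing compared
    to) the architecture in the series file name."""
    expected = arch_of(series_name)
    mismatches = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # ignore blank lines
        if arch_of(entry) != expected:
            mismatches.append(entry)
    return mismatches
```

A CI job could run such a check over every series file in a PR and fail if any list comes back non-empty.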
The only potential drawback I see would be the size of the repository.
While I believe the new repo would be way smaller than our distribution
repo[1], and we might exclude vc11 and vc15 builds, it may still be too
large for practical handling. But in this case we can consider using
Git LFS[2] which has been developed to address this issue.
What do you think? Would this require the RFC process?
Note that I am deliberately referring to php-sdk/deps/ only, since this
appears to be the most important part for now. Uploading the QA and
release builds of PHP is not supposed to be an issue, and the PECL
dependencies are less important than the PHP dependencies (besides these
do not have series files, which simplifies the upload process).
[1] https://github.com/php/web-php-distributions
[2] https://git-lfs.com/
Christoph
Hi Christoph,
> - the process is prone to error
Originally, the idea was to do this (painful) manual process first
to validate the flow. Then the plan was to use the releases in each
dep's repository, with one repository defining which version/build
is needed per PHP version (including patch versions, e.g. for security
updates). That has never been done.
It was also a required step for out-of-php-src extension builds
(pickle partially has it, no idea where the new tool stands on
this). But that's a bit off topic here :)
> What do you think?
100% for it.
> Would this require the RFC process?
Documentation for sure, especially for external developers
(nativephp, roadrunner, etc.) so they can use it for their own flows.
Best,
Pierre
@pierrejoye | http://www.libgd.org
> All these problems could be solved, or at least mitigated, by setting
> up a GitHub repository for the dependency builds and the series files.
> The upload could be as simple as a cron'd git pull on the server.
All our websites, including PHP downloads, use the rsync server for
this. That server has a GIT checkout, and then the Web servers rsync
from there. This is superior because it means the web servers never
need the GIT checkout, solving duplication and permission issues. It
also means it is easy to set up mirrors (unofficial).
> The only potential drawback I see would be the size of the repository.
This is already a major issue for the normal PHP downloads, which
currently sit at 15 GB.
> While I believe the new repo would be way smaller than our
> distribution repo[1], and we might exclude vc11 and vc15 builds, it
> may still be too large for practical handling.
> But in this case we can consider using Git LFS[2] which has been
> developed to address this issue.
GitHub LFS isn't free though, once you get to large storage sizes (over
1GB) [1]. I also found it really fiddly when using it. As it's not free,
I don't think it would qualify as something to use in an Open Source
project.
I don't believe GIT is a good fit for versioning binaries. Not for this,
nor the PHP distributions. I understand the history aspects are useful,
but it's never been designed for keeping binaries. It's not a file
management tool, but a source code tracking solution.
I also don't think Git LFS has been created for this either. It is
useful for large files in a repository, not large repositories.
Files can be at most 2GB, which is plenty for DLLs and our releases.
> What do you think? Would this require the RFC process?
I think it needs some good thinking through first. I also don't believe
the RFC system is something we need to use for deciding how to serve
files.
cheers,
Derick
--
https://derickrethans.nl | https://xdebug.org | https://dram.io
Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support
mastodon: @derickr@phpc.social @xdebug@phpc.social
>> The upload could be as simple as a cron'd git pull on the server.
> All our websites, including PHP downloads, use the rsync server for
> this. That server has a GIT checkout, and then the Web servers rsync
> from there. This is superior because it means the web servers never
> need the GIT checkout, solving duplication and permission issues. It
> also means it is easy to set up mirrors (unofficial).
Thanks for the explanation!
> GitHub LFS isn't free though, once you get to large storage sizes (over
> 1GB) [1]. I also found it really fiddly when using it. As it's not free,
> I don't think it would qualify as something to use in an Open Source
> project.
Oh, I wasn't aware of that.
> I don't believe GIT is a good fit for versioning binaries. Not for this,
> nor the PHP distributions. I understand the history aspects are useful,
> but it's never been designed for keeping binaries. It's not a file
> management tool, but a source code tracking solution.
I agree.
> I think it needs some good thinking through first. I also don't believe
> the RFC system is something we need to use for deciding how to serve
> files.
This is not about how to serve files, but rather which files to
serve; for instance, I'm currently working on updating libpng, where we
still ship v1.6.34 from Sep 29, 2017.
Anyhow, coming back to my list of problems:
(1) only few people can do these uploads
(2) the process is not transparent
(3) there is no history of the series
(4) the process is prone to error
If we only had the series files in a GitHub repository, (2) and (3)
would be solved, and (4) at least partially.
The workflow for updating a dependency might then look like:
- someone submits a PR with updates to the series files and a link to
the new dependency build on winlibs/winlib-builder (a PR template might
be useful)
- after some basic CI has been run, a notification is sent to those who
can do the uploads to downloads.php.net (or to the rsync server)
- one of these people can then check the PR, and if okay, upload the
dependency builds
- afterwards the PR is merged, and synced with the server
- archiving no longer needed dependencies could be done on the server (a
simple script should do; and it's not a very important task anyway, and
maybe it shouldn't be done at all, so that older Git revisions of the
series are still usable)
While that would not solve problem (1), it would at least avoid having
to ping some "random" people ("can you please upload?"), and if there is
an appropriate PR template, some further issues with problem (4) could
be resolved (e.g. do the series files refer to existing files?)
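The "do the series files refer to existing files?" check could be sketched roughly as follows in Python. This assumes the dependency archives are available in a local directory and that a series file lists one archive name per line; the function name and the directory layout are hypothetical.

```python
from pathlib import Path


def missing_archives(series_lines, deps_dir):
    """Return the entries of a series file that have no matching file in deps_dir."""
    deps_dir = Path(deps_dir)
    missing = []
    for line in series_lines:
        entry = line.strip()
        # Blank lines are skipped; every other line must name an existing file.
        if entry and not (deps_dir / entry).is_file():
            missing.append(entry)
    return missing
```

On the server (or wherever the archives live), an empty result would mean that every entry of the series file can actually be downloaded.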
Cheers,
Christoph
>> I think it needs some good thinking through first. I also don't
>> believe the RFC system is something we need to use for deciding how
>> to serve files.
> This is not about how to serve files, but rather which files to
> serve; for instance, I'm currently working on updating libpng, where
> we still ship v1.6.34 from Sep 29, 2017.
Ok, but I still don't see why you need an RFC for this? :-)
> Anyhow, coming back to my list of problems:
> (1) only few people can do these uploads
> (2) the process is not transparent
> (3) there is no history of the series
> (4) the process is prone to error
>
> If we only had the series files in a GitHub repository, (2) and (3)
> would be solved, and (4) at least partially.
>
> The workflow for updating a dependency might then look like:
> - someone submits a PR with updates to the series files and a link to
> the new dependency build on winlibs/winlib-builder (a PR template might
> be useful)
> - after some basic CI has been run, a notification is sent to those who
> can do the uploads to downloads.php.net (or to the rsync server)
> - one of these people can then check the PR, and if okay, upload the
> dependency builds
> - afterwards the PR is merged, and synced with the server
> - archiving no longer needed dependencies could be done on the server (a
> simple script should do; and it's not a very important task anyway, and
> maybe it shouldn't be done at all, so that older Git revisions of the
> series are still usable)
>
> While that would not solve problem (1), it would at least avoid having
> to ping some "random" people ("can you please upload?"), and if there is
> an appropriate PR template, some further issues with problem (4) could
> be resolved (e.g. do the series files refer to existing files?)
I know Shivam (https://github.com/php/web-downloads) has also been
working on doing automatic pulls of PECL builds onto the "downloads"
server.
The idea was to trigger a GitHub action that calls this API, which then
downloads the file. Ideally the downloads server pulls files, as
uploading to it can't work through GHA as we require 2FA through a
jump host.
We'll have to have multiple Git repositories, and perhaps subdomain
names to make this all work.
The downloads.php.net site currently doesn't have any code yet, as I am
waiting for this 404 ErrorHandler to be included in it:
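<?php
// Redirect URLs containing "Win32-vc" to the "Win32-VC" spelling.
if (preg_match('/Win32-vc/', $_SERVER['REQUEST_URI'])) {
    $fixed = str_replace('Win32-vc', 'Win32-VC', $_SERVER['REQUEST_URI']);
    header("Location: $fixed", true, 301);
    exit();
}
// Anything else is answered with a 404 status.
header('Location: /', true, 404);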
Ideally, instead of having downloads.php.net/~windows, we have
downloads.php.net/windows which is a Git repository for the series
files, but it is probably better if it's all in that same web-downloads
repository.
cheers,
Derick
> Ok, but I still don't see why you need an RFC for this? :-)
Oh, I don't need an RFC for this. Actually, you can read my question as
"does this really need an RFC, or can we do without?"
> I know Shivam (https://github.com/php/web-downloads) has also been
> working on doing automatic pulls of PECL builds onto the "downloads"
> server.
>
> The idea was to trigger a GitHub action that calls this API, which then
> downloads the file. Ideally the downloads server pulls files, as
> uploading to it can't work through GHA as we require 2FA through a
> jump host.
I see.
> We'll have to have multiple Git repositories, and perhaps subdomain
> names to make this all work.
>
> The downloads.php.net site currently doesn't have any code yet, as I am
> waiting for this 404 ErrorHandler to be included in it:
>
> <?php
> if (preg_match('/Win32-vc/', $_SERVER['REQUEST_URI'])) {
>     $fixed = str_replace('Win32-vc', 'Win32-VC', $_SERVER['REQUEST_URI']);
>     header("Location: $fixed", true, 301);
>     exit();
> }
> header('Location: /', true, 404);
>
> Ideally, instead of having downloads.php.net/~windows, we have
> downloads.php.net/windows which is a Git repository for the series
> files, but it is probably better if it's all in that same web-downloads
> repository.
Oh, there might be a slight misunderstanding. By "series files" I only
refer to what is in
https://downloads.php.net/~windows/php-sdk/deps/series/; nothing else.
While the upload/download issue might be solved one way or the other,
having a Git repository for the series files might solve a couple of
issues. I've set up https://github.com/cmb69/php-windeps-series as a
demonstration of what I have in mind. For instance, it is hard to keep
an overview of which packages are in which series; packages.csv helps a
bit (and GH displays this nicely[1]). And then I made PR #1, which
shows an x64/x86 mismatch[2] (actually multiple, but ignore the trailing
-1 lines mismatches). Finally, I made PR #2, which uses a locally run
script to push staging to stable. Many more checks and automations can
be done in the future; but you already get the gist.
[1] https://github.com/cmb69/php-windeps-series/blob/main/packages.csv
[2] https://github.com/cmb69/php-windeps-series/actions/runs/10815285786/job/30003806974?pr=1#step:3:17
Christoph
>> Ideally, instead of having downloads.php.net/~windows, we have
>> downloads.php.net/windows which is a Git repository for the series
>> files, but it is probably better if it's all in that same
>> web-downloads repository.
> Oh, there might be a slight misunderstanding. By "series files" I only
> refer to what is in
> https://downloads.php.net/~windows/php-sdk/deps/series/; nothing
> else.
Yes, I realised that.
> While the upload/download issue might be solved one way or the other,
> having a Git repository for the series files might solve a couple of
> issues. I've set up https://github.com/cmb69/php-windeps-series as a
> demonstration of what I have in mind. For instance, it is hard to keep
> an overview of which packages are in which series; packages.csv helps a
> bit (and GH displays this nicely[1]).
Yeah, that's exactly what we had been talking about already, but
generating the series from the CSV file seems like a step forwards, as
maintaining the series files by hand was a little awkward.
I probably would have sorted the versions the other way around though.
> And then I made PR #1, which shows an x64/x86 mismatch[2] (actually
> multiple, but ignore the trailing -1 lines mismatches). Finally, I
> made PR #2, which uses a locally run script to push staging to stable.
> Many more checks and automations can be done in the future; but you
> already get the gist.
>
> [1] https://github.com/cmb69/php-windeps-series/blob/main/packages.csv
> [2] https://github.com/cmb69/php-windeps-series/actions/runs/10815285786/job/30003806974?pr=1#step:3:17
Once the actual pulling from GHA to the downloads server has been
added, I can add variants of this to our deployment scripts (aka
https://github.com/php/systems/blob/master/update-phpweb-backend).
I am not sure if you're aware, but some time ago we created a Google doc
to list all the things for moving the windows downloads to
downloads.php.net:
https://docs.google.com/document/d/10YUSdAcSP0xd9XbShKYlyRxG3Q3LNvHr9_P71Ml8XC4/edit#heading=h.11s3yohk7tu9
If you provide an email address, I can give you access (or request it).
cheers,
Derick
>> While the upload/download issue might be solved one way or the other,
>> having a Git repository for the series files might solve a couple of
>> issues. I've set up https://github.com/cmb69/php-windeps-series as a
>> demonstration of what I have in mind. For instance, it is hard to keep
>> an overview of which packages are in which series; packages.csv helps a
>> bit (and GH displays this nicely[1]).
> Yeah, that's exactly what we had been talking about already, but
> generating the series from the CSV file seems like a step forwards, as
> maintaining the series files by hand was a little awkward.
Well, as is, the CSV is generated from the series files (not the other
way round), and while changing this is possible, I don't think it's the
best option. Instead I'd add some simple scripts, e.g. so you could do
something like:
  update openssl-3.0.15 8.2 8.3 8.4 master
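The core of such an update script might be sketched like this in Python. It only covers replacing one package's entry within the lines of a single series file; mapping the branch arguments (8.2, 8.3, ...) to series files is left out. The entry format "<package>-<version>-<toolset>-<arch>.zip" and both function names are assumptions for illustration.

```python
import re

# Assumption: the package name is everything before the first dash that is
# followed by a digit, e.g. "openssl" in "openssl-3.0.15-vs16-x64.zip".
NAME_RE = re.compile(r"^(.+?)-\d")


def package_name(entry):
    """Return the package part of an archive name."""
    match = NAME_RE.match(entry)
    if match is None:
        raise ValueError(f"cannot parse entry: {entry!r}")
    return match.group(1)


def update_series(lines, new_entry):
    """Return a copy of lines with the entry for new_entry's package replaced.

    If the package is not listed yet, the new entry is appended.
    """
    target = package_name(new_entry)
    updated = []
    replaced = False
    for line in lines:
        entry = line.strip()
        if entry and package_name(entry) == target:
            updated.append(new_entry)
            replaced = True
        else:
            updated.append(entry)
    if not replaced:
        updated.append(new_entry)
    return updated
```

Since the function returns a new list instead of writing files, the result can be diffed or validated before anything is committed.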
> I probably would have sorted the versions the other way around though.
I didn't sort them (only the package names are sorted). Just a quick
script.
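For illustration only, a quick script of that kind could look roughly like the following Python sketch, which builds a CSV overview from the series files with package names sorted and series kept in the order given. This is not the script used for the demonstration repository; the column layout, the entry format and the function name are assumptions.

```python
import csv
import io
import re

# Assumption: entries look like "<package>-<version>-<toolset>-<arch>.zip",
# with a toolset such as "vs16" or "vc15".
ENTRY_RE = re.compile(
    r"^(?P<name>.+?)-(?P<version>\d.*)-v[cs]\d+-x(?:64|86)\.zip$"
)


def packages_csv(series):
    """Build a CSV overview from a mapping {series name: [entries]}.

    One row per package (sorted by name), one column per series (in the given
    order); a cell holds the version listed in that series, or stays empty.
    """
    versions = {}  # package name -> {series name: version}
    for series_name, entries in series.items():
        for line in entries:
            match = ENTRY_RE.match(line.strip())
            if match:  # lines that do not parse are silently skipped
                versions.setdefault(match["name"], {})[series_name] = match["version"]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["package", *series])
    for name in sorted(versions):
        writer.writerow([name] + [versions[name].get(s, "") for s in series])
    return out.getvalue()
```

Keeping the series files as the source of truth and regenerating the CSV from them matches the direction described above.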
> Once the actual pulling from GHA to the downloads server has been
> added, I can add variants of this to our deployment scripts (aka
> https://github.com/php/systems/blob/master/update-phpweb-backend).
>
> I am not sure if you're aware, but some time ago we created a Google doc
> to list all the things for moving the windows downloads to
> downloads.php.net:
> https://docs.google.com/document/d/10YUSdAcSP0xd9XbShKYlyRxG3Q3LNvHr9_P71Ml8XC4/edit#heading=h.11s3yohk7tu9
> If you provide an email address, I can give you access (or request it).
I'm aware of this document, and if you like you can take my email
address from the From address of this email.
However, I'm afraid that this will not be finished within the next
couple of weeks, and I really think that some further updates to Windows
dependencies are long overdue (besides those already available in the
winlibs repositories). Would be great to get these uploaded soon, so
there are still a few 8.4.0RCs left to be able to correct potential issues.
Cheers,
Christoph