Hi,
I think many core developers have seen unexpected changes in "zend_language_scanner.c" or "var_unserializer.c" after rebuilds.
This occurs because we use different versions of re2c, and some of them produce substantially different code.
They also embed the version number into the generated source. Currently, different files in the PHP source tree are generated by different re2c versions:
ext/json/json_scanner.c 0.16
ext/date/lib/parse_date.c 0.15.3
ext/date/lib/parse_iso_intervals.c 0.15.3
ext/pdo/pdo_sql_parser.c 0.16
ext/phar/phar_path_check.c 1.0.3
sapi/phpdbg/phpdbg_lexer.c 0.16
ext/standard/url_scanner_ex.c 0.16
ext/standard/var_unserializer.c 1.0.1
Zend/zend_ini_scanner.c 0.15
Zend/zend_language_scanner.c 1.0.1
I propose changing the build scripts (in master and PHP-7.3) to require at least re2c version 1.0.0 (it seems 1.0.0-1.0.3 produce the same result) and to suppress version output in the generated files.
I'm not sure about timelib files.
Thanks. Dmitry.
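A minimal sketch of how the mixed versions listed above could be detected, assuming each generated file starts with a header comment of the form "/* Generated by re2c X.Y.Z */". The helper names and the sample header lines are hypothetical; only the file names, the versions, and the 1.0.0 floor come from the message above.

```python
import re

# Header comment that re2c writes at the top of a generated file
# (assumed form: "/* Generated by re2c 0.16 */").
HEADER_RE = re.compile(r"Generated by re2c (\d+(?:\.\d+)*)")

MINIMUM = (1, 0, 0)  # the floor proposed in this thread


def embedded_version(first_line):
    """Return the re2c version found in a header line as a tuple, or None."""
    match = HEADER_RE.search(first_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def below_minimum(headers, minimum=MINIMUM):
    """Map each file to its embedded version if it is older than minimum."""
    stale = {}
    for path, line in headers.items():
        version = embedded_version(line)
        if version is not None and version < minimum:
            stale[path] = version
    return stale


# Hypothetical header lines for three entries from the list above.
headers = {
    "Zend/zend_ini_scanner.c": "/* Generated by re2c 0.15 */",
    "ext/date/lib/parse_date.c": "/* Generated by re2c 0.15.3 */",
    "ext/phar/phar_path_check.c": "/* Generated by re2c 1.0.3 */",
}
print(below_minimum(headers))
```

Versions are compared as integer tuples rather than strings, so 0.16 sorts after 0.15.3 and before 1.0.0.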
I don't think normalizing the version really solves anything. These files
should be dropped from version control entirely instead. Generated files do
not belong in version control.
re2c is widely available on Linux distros nowadays (probably specifically
because PHP uses it) and while there might have been historical ground to
bundle these generated files, there no longer is one.
Nikita
I think, many core developers saw unexpected changes in
"zend_language_scanner.c" or "var_unserializer.c" after rebuilds.
This occurs, because we use different versions of re2c, and some of them
produce really different code.
They also embed version number into the generated source. Currently
different files in PHP source tree generated by different re2c versions:
ext/json/json_scanner.c 0.16
ext/date/lib/parse_date.c 0.15.3
ext/date/lib/parse_iso_intervals.c 0.15.3
ext/pdo/pdo_sql_parser.c 0.16
ext/phar/phar_path_check.c 1.0.3
sapi/phpdbg/phpdbg_lexer.c 0.16
ext/standard/url_scanner_ex.c 0.16
ext/standard/var_unserializer.c 1.0.1
Zend/zend_ini_scanner.c 0.15
Zend/zend_language_scanner.c 1.0.1
I propose, to change build scripts (in master and PHP-7.3) to require at
least re2c version 1.0.0 (it seems 1.0.0-1.0.3 produce the same result) and
suppress version output into the generated files.
I'm not sure about timelib files.
I don't think normalizing the version really solves anything. These
files should be dropped from version control entirely instead.
Generated files do not belong in version control.
the timelib files must be generated with 0.15.3:
https://github.com/mongodb/mongo/blob/master/src/third_party/scripts/timelib_get_sources.sh#L10-L11
re2c is widely available on Linux distros nowadays (probably
specifically because PHP uses it) and while there might have been
historical ground to bundle these generated files, there no longer is
one.
The timelib files will need to continue to be bundled. There is no
rule for them in the Makefile either (unlike zend_language parser(s)), so
this should not cause issues with Git.
cheers,
Derick
--
https://derickrethans.nl | https://xdebug.org | https://dram.io
Like Xdebug? Consider a donation: https://xdebug.org/donate.php,
or become my Patron: https://www.patreon.com/derickr
twitter: @derickr and @xdebug
Hi!
the timelib files must be generated with 0.15.3:
https://github.com/mongodb/mongo/blob/master/src/third_party/scripts/timelib_get_sources.sh#L10-L11
That comment says 0.16 was problematic. Is it still true for 1.0.*? Was
it reported to re2c?
--
Stas Malyshev
smalyshev@gmail.com
I propose, to change build scripts (in master and PHP-7.3) to require at
least re2c version 1.0.0 (it seems 1.0.0-1.0.3 produce the same result) and
suppress version output into the generated files.
I don't think normalizing the version really solves anything. These files
should be dropped from version control entirely instead. Generated files do
not belong in version control.
+1
--
Christoph M. Becker
Hi!
re2c is widely available on Linux distros nowadays (probably
On Linux distros on common platforms (Intel/AMD) - sure. But what if you
need an uncommon platform, or one that does not run Linux? It's those
platforms where you'd have to build PHP from source (after all, PHP is
also widely available as a package on Linux distros anyway) and adding
another hassle of figuring out how to build a third-party tool - I don't
think it's a good service to the community.
--
Stas Malyshev
smalyshev@gmail.com
Hi!
BTW, looking at re2c source, I don't see any mention of Windows. Does it
even build on Windows? What would be prerequisites for it? I thought
Windows is still a supported platform? It won't be nice if one couldn't
build PHP from source on Windows anymore...
--
Stas Malyshev
smalyshev@gmail.com
-----Original Message-----
From: Stanislav Malyshev smalyshev@gmail.com
Sent: Friday, July 13, 2018 8:46 PM
To: Nikita Popov nikita.ppv@gmail.com; Dmitry Stogov dmitry@zend.com
Cc: PHP internals list internals@lists.php.net; derick@derickrethans.nl;
Christoph M. Becker cmbecker69@gmx.de
Subject: Re: [PHP-DEV] re2c version(s)
Hi!
On Linux distros on common platforms (Intel/AMD) - sure. But what if
you need an uncommon platform, or one that does not run Linux? It's
those platforms where you'd have to build PHP from source (after all,
PHP is also widely available as a package on Linux distros anyway) and
adding another hassle of figuring out how to build a third-party tool
- I don't think it's a good service to the community.
BTW, looking at re2c source, I don't see any mention of Windows. Does it even
build on Windows? What would be prerequisites for it? I thought Windows is still
a supported platform? It won't be nice if one couldn't build PHP from source on
Windows anymore...
The binary SDK has undergone a major revamp in the last couple of years. It now uses the MSYS2 port, which works almost fine and is an active project. Before that, some tools like bison 2.4.1 were used because there was no way to upgrade them. See https://github.com/Microsoft/php-sdk-binary-tools/tree/master/msys2/usr . I repeatedly asked systems@ to move this repo to git.php.net, but unfortunately that didn't happen. Anyway, the approach is platform specific and was inspired by how mozbuild does it. One always has full control over which tool versions are used, and bad versions can be omitted.
The binary SDK for Windows currently uses bison 3.0.4 (3.0.5 in staging) and re2c 1.0.3 on all the branches. The exact versions run on AppVeyor and are tested continuously. Also every release is tested individually and snapshots from windows.php.net are tested, too. The binary SDK has all the tools that are needed to build PHP. The sources for Windows zipballs are distributed separately, which has its obvious reasons.
Regards
Anatol
re2c is widely available on Linux distros nowadays (probably
On Linux distros on common platforms (Intel/AMD) - sure. But what if you
need an uncommon platform, or one that does not run Linux? It's those
platforms where you'd have to build PHP from source (after all, PHP is
also widely available as a package on Linux distros anyway) and adding
another hassle of figuring out how to build a third-party tool - I don't
think it's a good service to the community.
I would offer that this is what official releases and/or snaps.php.net is for.
Checks snaps.php.net
...wait, is this not a thing anymore?
-Sara
Checks snaps.php.net
....wait, is this not a thing anymore?
IIRC, that was already gone before I got my php.net account. Nowadays
users are supposed to check out from Git.
--
Christoph M. Becker
Hi!
I would offer that this is what official releases and/or snaps.php.net is for.
Do you mean releases would contain the generated files but the regular
source won't? I am not sure then why - if we say our "official" source
has generated files, why keep VCS out of sync with the source? Also,
what if you want to build a version that has not been released yet? I
think keeping VCS in a state where it can not be built in a major system
is not a good thing.
--
Stas Malyshev
smalyshev@gmail.com
On Fri, Jul 13, 2018 at 8:40 PM, Stanislav Malyshev smalyshev@gmail.com
wrote:
Hi!
re2c is widely available on Linux distros nowadays (probably
On Linux distros on common platforms (Intel/AMD) - sure. But what if you
need an uncommon platform, or one that does not run Linux? It's those
platforms where you'd have to build PHP from source (after all, PHP is
also widely available as a package on Linux distros anyway) and adding
another hassle of figuring out how to build a third-party tool - I don't
think it's a good service to the community.
We always bundle generated files in distributed sources, so end-users need
neither re2c nor bison to build releases.
re2c is not a problem for Windows either, it is bundled as part of the PHP
SDK.
Nikita
Hi!
re2c is not a problem for Windows either, it is bundled as part of the
PHP SDK.
OK then, Windows is not a problem, that makes it better.
--
Stas Malyshev
smalyshev@gmail.com
Stanislav Malyshev in php.internals (Fri, 13 Jul 2018 11:40:12 -0700):
re2c is widely available on Linux distros nowadays (probably
On Linux distros on common platforms (Intel/AMD) - sure. But what if you
need an uncommon platform, or one that does not run Linux? It's those
platforms where you'd have to build PHP from source (after all, PHP is
also widely available as a package on Linux distros anyway) and adding
another hassle of figuring out how to build a third-party tool - I don't
think it's a good service to the community.
Directadmin just makes the tarballs from php.net available at every new
release. re2c is not even installed on my Directadmin powered CentOS 6
systems and bison is not used in the build process.
The Directadmin admins surely will not be happy if the generated files
are removed from the tarballs. And the Directadmin users will probably
not be happy either, because chances are high that new releases will
have a time delay. Currently a new PHP release is distributed through
Directadmin on the day of release.
Jan
We're not talking about removing from tarballs, only git. RMs will continue generating bison and re2c targets so that these are not build requirements for normal users.
-Sara
-----Original Message-----
From: Nikita Popov [mailto:nikita.ppv@gmail.com]
Sent: Friday, July 13, 2018 12:26 PM
To: Dmitry Stogov dmitry@zend.com
Cc: PHP internals list internals@lists.php.net; Stanislav Malyshev
smalyshev@gmail.com; derick@derickrethans.nl; Christoph M. Becker
cmbecker69@gmx.de
Subject: Re: [PHP-DEV] re2c version(s)
Hi,
I think, many core developers saw unexpected changes in
"zend_language_scanner.c" or "var_unserializer.c" after rebuilds.
This occurs, because we use different versions of re2c, and some of
them produce really different code.
They also embed version number into the generated source. Currently
different files in PHP source tree generated by different re2c versions:
ext/json/json_scanner.c 0.16
ext/date/lib/parse_date.c 0.15.3
ext/date/lib/parse_iso_intervals.c 0.15.3
ext/pdo/pdo_sql_parser.c 0.16
ext/phar/phar_path_check.c 1.0.3
sapi/phpdbg/phpdbg_lexer.c 0.16
ext/standard/url_scanner_ex.c 0.16
ext/standard/var_unserializer.c 1.0.1
Zend/zend_ini_scanner.c 0.15
Zend/zend_language_scanner.c 1.0.1
I propose, to change build scripts (in master and PHP-7.3) to require
at least re2c version 1.0.0 (it seems 1.0.0-1.0.3 produce the same
result) and suppress version output into the generated files.
I'm not sure about timelib files.
Thanks. Dmitry.
I don't think normalizing the version really solves anything. These files should be
dropped from version control entirely instead. Generated files do not belong in
version control.
re2c is widely available on Linux distros nowadays (probably specifically because
PHP uses it) and while there might have been historical ground to bundle these
generated files, there no longer is one.
I agree that these files shouldn't be in version control, but I think normalizing the version does buy us something: it ensures (or at least tries to) that whoever generates these files does so with a version that we believe is suitable for the job. This is of course especially important for the RM's machine, but it's also important for whoever else might be building from a direct checkout and not a source package (I think the files we currently see in source control are indicative of that).
So why not do both - remove these files from version control, but also update the re2c requirements in configure and makedist..?
And of course we still want to bundle these in our distros - just not track them in our source control.
Zeev
On 13/07/2018 at 23:48, Zeev Suraski wrote:
So why not do both - remove these files from version control, but also update the re2c requirements in configure and makedist..?
And of course we still want to bundle these in our distros - just not track them in our source control.
I agree
Perhaps we can also add all the generated files (including configure) to
the tagged versions, so the tag will have the same content as the official
archive.
Remi
On 13/07/2018 at 23:48, Zeev Suraski wrote:
Perhaps we can also add all the generated files (including configure) in
the tagged versions, so the tag will have same content than the official
archive.
Ick, no. That's the worst outcome IMO. I don't think they need to be
in git, but if we're going to have them there during tags, then they
should always be there. One way or the other, not some
middle-of-the-road thing.
-Sara
On Tue, Jul 17, 2018 at 1:04 AM, Remi Collet remi@fedoraproject.org
wrote:
On 13/07/2018 at 23:48, Zeev Suraski wrote:
Perhaps we can also add all the generated files (including configure) in
the tagged versions, so the tag will have same content than the official
archive.
Ick, no. That's the worst outcome IMO. I don't think they need to be
in git, but if we're going to have them there during tags, then they
should always be there. One way or the other, not some
middle-of-the-road thing.
I can explain why I think that what I propose is the best outcome:
- It ensures that the correct versions of re2c are always used.
- It doesn't track generated files in source control.
- It allows users to build PHP from source on platforms where re2c is
unavailable.
Correct me if I'm wrong, but isn't that exactly what we do with
zend_language_parser.c? makedist is responsible for generating it so that
it's available in distributions, but it isn't tracked. Why would
zend_language_scanner.c be any different? My guess is that it's probably
because at the time we moved to re2c it wasn't nearly as ubiquitous as it
is today and even most developers didn't have access to it, but now that's
changed.
I'm also fine with what Remi proposed which is adding these files
specifically to the source control at the time of tagging, in the spirit of
tracking everything that we actually end up releasing in source control
(and perhaps do that for zend_language_parser.c if we decide that this is
the right thing to do).
Either way - the first step (normalizing re2c versions - updating our re2c
requirements) seems to make sense. We can decide about whether or not we
track the generated files in git as we do today, only during tagging or not
at all independently of that.
Zeev
I feel like we are all really in violent agreement that these files should
be dropped from git, and at this point I'm not even sure what the
discussion is about anymore. Let's wait until after PHP-7.3 branching in
two weeks and drop them at that point.
Normalizing the version numbers seems unnecessary after they are dropped --
at least Dmitry's original motivation for that was related exclusively to
the spurious diffs caused by different versions, which will no longer be an
issue.
Nikita
While we all agree that the files should be dropped from git - there
appears to be disagreement regarding what else we need to do in addition.
In my opinion if that's the only action we'd take then I don't think we
should do it and the status quo is actually better - as it would mean that
it will no longer be possible to build our packages on platforms that don't
have re2c available or typically installed. It needs to happen hand in
hand with providing these files in the source packages, and also ensuring
that whatever boxes one uses to create the packages - as well as developers
who check out the source code directly from git - have an acceptable
version of re2c. It may be that we can accept a wide range of re2c
versions (although if there are substantial differences in code perhaps
it's better to err on the side of caution).
I'm not sure why we're not simply following exactly what we're doing with
the parser. We have a list of acceptable bison versions. We check both in
configure and makedist against that list, and refuse to generate the parser
otherwise. We don't track the generated .c file in source control - but we
do include it in distros to account for environments that don't typically
have bison installed. Why not do exactly the same with the re2c scanner?
Zeev
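The bison-style gate described above could look roughly like the following for re2c. This is a sketch under assumptions, not the actual configure or makedist logic: the allow-list contents and the function names are made up for illustration, and the --no-version and --no-generation-date flags should be confirmed against `re2c --help` for the installed version.

```python
import re

# Illustrative allow-list only; the thread does not settle on a set of versions.
ACCEPTED_RE2C = {"1.0.1", "1.0.2", "1.0.3"}


def parse_tool_version(version_output):
    """Extract '1.0.3' from output such as 're2c 1.0.3'."""
    match = re.search(r"(\d+(?:\.\d+)+)", version_output)
    if match is None:
        raise ValueError(f"no version found in {version_output!r}")
    return match.group(1)


def scanner_command(version_output, source, target, accepted=ACCEPTED_RE2C):
    """Return the re2c argv for one scanner, or refuse on an unlisted version."""
    version = parse_tool_version(version_output)
    if version not in accepted:
        raise RuntimeError(f"re2c {version} is not in the accepted list")
    # Assumed flags: omit the version and date from the generated header
    # so that rebuilds do not produce spurious diffs.
    return ["re2c", "--no-version", "--no-generation-date", "-o", target, source]


print(scanner_command(
    "re2c 1.0.3",
    "Zend/zend_language_scanner.l",
    "Zend/zend_language_scanner.c",
))
```

The function only builds the command line; running it (for example with subprocess.run) is left to the caller, so the version gate can be exercised without re2c installed.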
Ah yes, of course the generated files will be part of distribution
tarballs, just like we do with all generated files (not just the parser,
but also configure.) While I forgot to write this in my original mail, it
has been mentioned already 4 days ago. So again, it seems like we're really
in total agreement here, just a matter of turning it into reality ;)
Nikita
You know, you made me go back to Sara's email where she disagreed with me,
only to find she actually was disagreeing with Remi's proposal to track the
generated files for releases in git. That's definitely not a hill to die
on for me :)
So all in all I think you're right, we agree on the important things:
- Remove generated files from git
- Keep them in source packages
What we seem to disagree on is that we should have a narrower list of
acceptable re2c versions determined by configure/makedist. This isn't a
hill to die on for me either, although I think that narrowing it down is
better in terms of our ability to deliver a source package with confidence,
and be sure that everyone who's testing/using it is testing the same thing
(similar to how we do it with bison). Arguably it's more important to be
on the safe side with makedist than it is with configure.
Thanks,
Zeev
You know, you made me go back to Sara's email where she disagreed with me,
only to find she actually was disagreeing with Remi's proposal to track the
generated files for releases in git. That's definitely not a hill to die on
for me :)
Correct. I'm in agreement with you that we DO include the generated
files in tarballs (using a standardized version of re2c, or at least,
a consistent-per-branch version), but that we do NOT ever check those
generated files into git.
So all in all I think you're right, we agree on the important things:
- Remove generated files from git
- Keep them in source packages
+1
What we seem to disagree on is that we should have a narrower list of
acceptable re2c versions determined by configure/makedist. This isn't a
hill to die on for me either, although I think that narrowing it down is
better in terms of our ability to deliver a source package with confidence,
and be sure that everyone who's testing/using it is testing the same thing
(similar to how we do it with bison). Arguably it's more important to be on
the safe side with makedist than it is with configure.
I think devs should be able to use flexible versions of re2c (and
other tools, e.g. bison), BUT that we should declare formally what
versions of these build tools will be used on what branches so that
those working on features can predictably know what their changes will
generate.
For example, my builder (which Remi and I both use for 7.2 builds)
currently uses debian:jessie (and therefore re2c 0.13.5 and bison
3.0.2). Barring any pressing need, we'll plan to keep them at these
versions until 7.2 goes EOL.
-Sara
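One way to picture a formal per-branch declaration is a small table that a developer's setup can be compared against. This is only a sketch: the table layout and the function name are hypothetical, and the single PHP-7.2 entry is taken from the builder versions mentioned above.

```python
# Tool versions declared per branch. Only the PHP-7.2 entry comes from the
# message above; any other entry would be an assumption.
DECLARED_TOOLS = {
    "PHP-7.2": {"re2c": "0.13.5", "bison": "3.0.2"},
}


def mismatches(branch, installed, declared=DECLARED_TOOLS):
    """Return {tool: (declared, installed)} for every tool that differs."""
    expected = declared.get(branch)
    if expected is None:
        raise KeyError(f"no tool versions declared for {branch}")
    return {
        tool: (want, installed.get(tool))
        for tool, want in expected.items()
        if installed.get(tool) != want
    }


# A developer machine with a newer re2c than the release builder.
print(mismatches("PHP-7.2", {"re2c": "1.0.3", "bison": "3.0.2"}))
```

The check reports differences instead of failing, which matches the idea that developers may use flexible versions while still knowing what the release build will generate.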
I really like this, because when packager build systems do need to
regenerate the files, the build system can package the exact versions
specified (even if only for the build system, and regardless of the version
the distro has) and take one possible, however unlikely, source of
obscure bugs out of the equation.