Hi internals!
Currently our git repo contains files like zend_language_scanner.c, 
zend_ini_scanner.c, etc which are files generated by re2c. Historically 
these files have been included because re2c was not readily available on 
many platforms. In the thread on bison 3 compatibility [1] there was some
discussion as to whether this limitation still applies. Quoting Adam Harvey:
> +1. I don't think re2c is that onerous a requirement anyway, for the
> most part: it's available through apt-get, brew, yum, and probably
> most other packaging systems. Given the amount of other things a
> developer has to install to build php-src from git, re2c is hardly
> going to break the camel's back.
So, I'd like to bring this up again. Are there any objections to removing 
generated lexer files from the repo? If not, does it suffice to just git rm 
them and add them to .gitignore, or are other changes required?
Nikita
> Currently our git repo contains files like zend_language_scanner.c,
> zend_ini_scanner.c, etc which are files generated by re2c. Historically
> these files have been included because re2c was not readily available on
> many platforms. In the thread on bison 3 compatibility [1] there was some
> discussion as to whether this limitation still applies.
On a similar theme, could we also get rid of the generated Zend VM, and wire up zend_vm_gen.php to make, like we wire up bison and re2c? That would make PHP a dependency to build PHP, but it would hardly be the first language which is reliant on itself to be built. Are there any distributions out there in which PHP is not available? Bear in mind that for non-developers, we would still include a generated VM in the source packages, so most people wishing to compile PHP don’t need it; this would only affect people using git.
-- 
Andrea Faulds 
http://ajf.me/
Hi!
> On a similar theme, could we also get rid of the generated Zend VM,
> and wire up zend_vm_gen.php to make, like we wire up bison and re2c?
That would make building PHP for non-internals people harder and the
list of dependencies they need to get right longer, while providing zero
added value. And yes, there are a lot of non-internals people building
PHP. Sometimes they may even use git.
> That would make PHP a dependency to build PHP, but it would hardly be
> the first language which is reliant on itself to be built. Are there
It would not be the first, but why do it? It works just fine right now,
why break it?
> any distributions out there in which PHP is not available?
There are systems that aren't Linux and don't have PHP as a standard
package, yes.
-- 
Stanislav Malyshev, Software Architect 
SugarCRM: http://www.sugarcrm.com/
On 27 August 2014, at 4:09, Stas Malyshev smalyshev@sugarcrm.com wrote:
> It would not be the first, but why do it? It works just fine right now,
> why break it?
+1 here.
I am wondering how it will benefit us?
IMO, nothing but a worthless discussion again.
Thanks
> +1 here. I am wondering how it will benefit us?
Same. While I am annoyed from time to time by git co after new
builds, it is nothing so bad that I would ask anyone else to
install PHP to compile PHP.
> IMO, nothing but a worthless discussion again.
Only off-base answers are worthless. Discussions happen, and it is up
to us to keep them constructive and end them in time.
-- 
Pierre
@pierrejoye | http://www.libgd.org
> On a similar theme, could we also get rid of the generated Zend VM, and
> wire up zend_vm_gen.php to make, like we wire up bison and re2c? That would
> make PHP a dependency to build PHP, but it would hardly be the first
> language which is reliant on itself to be built. Are there any
> distributions out there in which PHP is not available? Bear in mind that
> for non-developers, we would still include a generated VM in the source
> packages, so most people wishing to compile PHP don’t need it; this would
> only affect people using git.
-1 on removing zend_vm_execute.h. Having to install php before I can build 
php would be very inconvenient.
Nikita
Hi!
Spinning this discussion (removing generated VM from git) off into its own thread.
> -1 on removing zend_vm_execute.h. Having to install php before I can build php would be very inconvenient.
I’d point out PHP already requires extra dependencies (and your proposal adds one further) before building PHP from git. I also doubt it would cause inconvenience to developers, as if you’re developing PHP you need a stable PHP install anyway. You wouldn’t need PHP to build PHP for most cases, only if you’re building it from git.
> That would make building PHP for non-internals people harder and the
> list of dependencies they need to get right longer, while providing zero
> added value. And yes, there are a lot of non-internals people building
> PHP. Sometimes they may even use git.
Why would a non-internals person want to build PHP from git? That’s just making things harder on themselves. Besides PHP itself (should my proposal succeed), you need other extra dependencies, including a lexer generator if Nikita’s proposal succeeds.
> > That would make PHP a dependency to build PHP, but it would hardly be
> > the first language which is reliant on itself to be built. Are there
> It would not be the first, but why do it? It works just fine right now,
> why break it?
The advantages are twofold:
- We avoid git tracking generated files that don’t provide meaningful diffs and that can have massive changes just from changing the source code or the generation script.
- It’s no longer necessary to manually generate the VM every time an opcode is modified. (Less debugging pain if you forget.)
> > any distributions out there in which PHP is not available?
> There are systems that aren't Linux and don't have PHP as a standard
> package, yes.
OK, that’s true. But on such systems the release package would still build.
-- 
Andrea Faulds 
http://ajf.me/
Hi!
> Why would a non-internals person want to build PHP from git? That’s
Why not? It's an open-source project, isn't it? People may prefer using
git; many integration systems (including PHP's own composer) rely on git.
> just making things harder on themselves. Besides PHP itself (should
> my proposal succeed), you need other extra dependencies, including a
> lexer generator if Nikita’s proposal succeeds.
It's like "since we depend on gcc, adding more dependencies is no
problem". That doesn't make any sense to me: adding dependencies makes it
harder, so it means more problems. Especially with recursive dependencies.
> - We avoid git tracking generated files that don’t provide
>   meaningful diffs and that can have massive changes just from changing
>   the source code or the generation script.
> - It’s no longer necessary to manually generate the VM every time an
>   opcode is modified. (Less debugging pain if you forget.)
That's not an advantage. Having extra file in git is no problem at all, 
we don't pay per byte, and somebody who can't handle regenerating the VM 
file should not be messing with the VM (for one, they would notice the 
problem immediately on running the test for the change locally, and if 
they don't test the changes locally we probably don't want these 
changes). And it's not like we change the VM every day.
-- 
Stanislav Malyshev, Software Architect 
SugarCRM: http://www.sugarcrm.com/
> That's not an advantage. Having extra file in git is no problem at all,
> we don't pay per byte, and somebody who can't handle regenerating the VM
> file should not be messing with the VM (for one, they would notice the
> problem immediately on running the test for the change locally, and if
> they don't test the changes locally we probably don't want these
> changes). And it's not like we change the VM every day.
Actually, I wouldn’t be so bothered by this if I didn’t have to manually regenerate the VM. Would it be possible to make make track Zend/zend_vm_def.h?
-- 
Andrea Faulds 
http://ajf.me/
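For readers unfamiliar with what "make tracking Zend/zend_vm_def.h" would mean in practice: make rebuilds a target when the target is missing or older than one of its prerequisites. The following is a minimal Python sketch of that staleness check, offered only as an illustration and not as how php-src's build system works. The helper name needs_regeneration is hypothetical.

```python
import os


def needs_regeneration(target, sources):
    """Mimic make's rule: rebuild when the target is missing or
    older than any of its prerequisite (source) files."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    # Any source modified after the target was written makes it stale.
    return any(os.path.getmtime(src) > target_mtime for src in sources)
```

Under the assumption (not stated in this thread) that zend_vm_execute.h and zend_vm_gen.php sit next to zend_vm_def.h under Zend/, a build step could call needs_regeneration("Zend/zend_vm_execute.h", ["Zend/zend_vm_def.h", "Zend/zend_vm_gen.php"]) and run the generator only when it returns True.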
On Tuesday 26 August 2014 21:30:16 Andrea Faulds wrote:
> Why would a non-internals person want to build PHP from git? That’s just
> making things harder on themselves.
I disagree.
The build environment I scripted together for myself initializes a build
tree (php-src and various extensions) from git and svn. For git it also 
automatically uses a local git mirror, so I only need to fetch changes 
once regardless of what I want to build.
This allows me to easily set up several build trees which check out 
different tags / branches.
Furthermore I "enter" the build tree with some magic that bind-mounts, 
in a local mount namespace, a build-tree local directory to /opt/php (and 
another one to /usr/lib64/apache2 for mod_php), so while building, the 
"outer system" PHP is completely hidden.
This bind-mounting-in-namespace stuff then permits me to simply run 
make install, run any kind of private tests against the newly built setup, 
run several of these build trees in parallel without disturbing each other, 
and without disturbing the outer system; I can even enter this as root, 
and restart apache in a build tree to test out a fresh mod_php.
So, I use git as a source for building, AND I have a setup that has no 
php CLI available during the build.
> Besides PHP itself (should my proposal succeed), you need other extra
> dependencies, including a lexer generator if Nikita’s proposal succeeds.
The difference is that none of these are both required during the build 
and produced by the build. They are just installed once on the build 
system and used by all builds.
best regards 
Patrick
> > Besides PHP itself (should my proposal succeed), you need other extra
> > dependencies, including a lexer generator if Nikita’s proposal succeeds.
> The difference is that none of these are both required during the build
> and produced by the build. They are just installed once on the build
> system and used by all builds.
You can install PHP on the build system and use it for all builds if you wish.
-- 
Andrea Faulds 
http://ajf.me/
> So, I'd like to bring this up again. Are there any objections to removing
> generated lexer files from the repo? If not, does it suffice to just git rm
> them and add them to .gitignore, or are other changes required?
I'd like to bring this up again. It seems the thread got side-tracked with 
unrelated discussion (about vm_execute.h) and I didn't get any answers to 
the original question.
Case in point: The current zend_language_scanner.c has been generated by 
Andrea, who uses a different re2c version from everybody else (0.13.6 
instead of 0.13.5). This means that if I do some tiny change to 
zend_language_scanner.l I immediately get a 3000 line diff. So we just end 
up changing this file back and forth depending on the algorithm used by 
different versions.
Nikita
Hello,
sorry if I'm totally wrong, as I have no experience in this area, but if
someone is strongly against removing those files for any reason (not
that I would be) then we might consider passing the -i flag when generating
C files with re2c.
As far as I can see, most of the changes listed in every diff touching
this file are different line numbers in comments. This option should
disable them, if the re2c documentation [1] is right.
Regards, 
Maciej.
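To make the point about line-number noise concrete, here is a small Python sketch that checks whether two versions of a generated file differ only in their line-number directives. It assumes the noise takes the form of C `#line N "file"` directives, which is the information the -i flag is meant to suppress. The function names are hypothetical, and this says nothing about diffs caused by a different re2c version generating different code.

```python
import re

# Matches C preprocessor line directives such as: #line 123 "scanner.l"
LINE_DIRECTIVE = re.compile(r'^\s*#\s*line\s+\d+')


def strip_line_directives(text):
    """Return the lines of text with all #line directives removed."""
    return [ln for ln in text.splitlines() if not LINE_DIRECTIVE.match(ln)]


def differs_only_in_line_info(old, new):
    """True if old and new differ, but only in their #line directives."""
    return old != new and strip_line_directives(old) == strip_line_directives(new)
```

If this returns True for the old and new contents of a generated scanner, the diff between them carries no change in the generated code itself.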
Hi!
> Case in point: The current zend_language_scanner.c has been generated by
> Andrea, who uses a different re2c version from everybody else (0.13.6
> instead of 0.13.5). This means that if I do some tiny change to
> zend_language_scanner.l I immediately get a 3000 line diff. So we just end
> up changing this file back and forth depending on the algorithm used by
> different versions.
So, from time to time we'd get a big diff. But what's the big problem with
that? It doesn't seem to hurt anything. And the language scanner is not
changed every day. I don't see any benefit in such a change, just making
building PHP harder.
-- 
Stanislav Malyshev, Software Architect 
SugarCRM: http://www.sugarcrm.com/
Hi,
Sorry to intrude, but why would building be harder? Tbh I don't see the 
point of keeping generated files in git. Why not keep release binaries too! 
(I'm kidding ofc.)
Also, there may be a small number of "big diffs", but one is enough to 
introduce a bug. Generating the file every time ensures there is no hidden 
bug.
Just my 2 cents.
Regards, 
Florian Margaine 
On 2 Oct 2014 at 21:04, "Stas Malyshev" smalyshev@sugarcrm.com wrote:
> So, from time to time we'd get a big diff. But what's the big problem with
> that? It doesn't seem to hurt anything. And the language scanner is not
> changed every day. I don't see any benefit in such a change, just making
> building PHP harder.
Hi!
> Sorry to intrude, but why would building be harder? Tbh I don't see the
Because there are more dependencies and tools needed to build the
parsers. You'd have to have a recent re2c, for example, which does not
come by default with many systems. I see no reason to add this hurdle
with no benefit for anybody.
> Also, there may be a small number of "big diffs", but one is enough to
> introduce a bug. Generating the file every time ensures there is no
> hidden bug.
Which bug? They are generated parsers from the same source, what kind of
bug are you talking about?
-- 
Stanislav Malyshev, Software Architect 
SugarCRM: http://www.sugarcrm.com/
Hi,
On Thu, Oct 2, 2014 at 10:12 PM, Stas Malyshev smalyshev@sugarcrm.com 
wrote:
> > Sorry to intrude, but why would building be harder? Tbh I don't see the
> Because there are more dependencies and tools needed to build the
> parsers. You'd have to have a recent re2c, for example, which does not
> come by default with many systems. I see no reason to add this hurdle
> with no benefit for anybody.
It's already necessary; PHP won't pass the configure step without re2c.
> > Also, there may be a small number of "big diffs", but one is enough to
> > introduce a bug. Generating the file every time ensures there is no
> > hidden bug.
> Which bug? They are generated parsers from the same source, what kind of
> bug are you talking about?
Slipping malicious code into such a diff could easily go unnoticed, for
instance. Or someone simply opens the file to look at the generated code,
leaves a stray letter by mistake because the cat jumps on the keyboard, and
there you go. I just mean that such big diffs are simply unreviewable; you
have to trust that it was generated and not touched after. Why require this
unnecessary trust when we can simply not have the file?
-- 
Florian Margaine
Hi!
> Slipping malicious code into such a diff could easily go unnoticed,
This is not a bug. And if we have a malicious committer, we have much
bigger problems than generated lexers. Fortunately, there's exactly zero
evidence that it is of any concern to us.
> you have to trust that it was generated and not touched after. Why require
> this unnecessary trust when we can simply not have the file?
Again, if you do not trust the people who are working on the most sensitive
part of the engine to observe minimal rules of sane coding,
you have bigger problems than lexers. Not that there are hundreds of
them committing anyway: this year we had exactly 1 (one) big lexer
commit so far, and last year there were three. And it's not that hard to
scan through them either, if you're interested.
-- 
Stanislav Malyshev, Software Architect 
SugarCRM: http://www.sugarcrm.com/