Hi internals!
I've created a draft RFC and implementation for the introduction of an
Abstract Syntax Tree (AST) as an intermediate structure in our compilation
process:
https://wiki.php.net/rfc/abstract_syntax_tree
The RFC outlines why an AST is beneficial, how it impacts performance and
memory usage of the compile process and what changes to syntax or behavior
it introduces.
Furthermore the RFC contains an outline of how the current implementation
works and what APIs it provides. This section is just an overview and I
hope to extend it in the future.
Note: I'm on vacation as of tomorrow and wanted to put this up to
discussion beforehand. I won't be able to implement any feedback while I'm
away, but can of course answer questions :)
Thanks,
Nikita
I've created a draft RFC and implementation for the introduction of an
Abstract Syntax Tree (AST) as an intermediate structure in our compilation
process:https://wiki.php.net/rfc/abstract_syntax_tree
The RFC outlines why an AST is beneficial, how it impacts performance and
memory usage of the compile process and what changes to syntax or behavior
it introduces.Furthermore the RFC contains an outline of how the current implementation
works and what APIs it provides. This section is just an overview and I
hope to extend it in the future.
+1 all the way, man.
Hi internals!
I've created a draft RFC and implementation for the introduction of an
Abstract Syntax Tree (AST) as an intermediate structure in our compilation
process:https://wiki.php.net/rfc/abstract_syntax_tree
The RFC outlines why an AST is beneficial, how it impacts performance and
memory usage of the compile process and what changes to syntax or behavior
it introduces.Furthermore the RFC contains an outline of how the current implementation
works and what APIs it provides. This section is just an overview and I
hope to extend it in the future.
Sounds great! BTW, wasn't there a related AST GSOC project once?
--
Regards,
Mike
OMG, so +1 on this! rolling eyes
We'd be a step away for hookable grammars! \o/
Hi internals!
I've created a draft RFC and implementation for the introduction of an
Abstract Syntax Tree (AST) as an intermediate structure in our
compilation
process:https://wiki.php.net/rfc/abstract_syntax_tree
The RFC outlines why an AST is beneficial, how it impacts performance and
memory usage of the compile process and what changes to syntax or
behavior
it introduces.Furthermore the RFC contains an outline of how the current implementation
works and what APIs it provides. This section is just an overview and I
hope to extend it in the future.Sounds great! BTW, wasn't there a related AST GSOC project once?
--
Regards,
Mike
--
Guilherme Blanco
MSN: guilhermeblanco@hotmail.com
GTalk: guilhermeblanco
Toronto - ON/Canada
I've created a draft RFC and implementation for the introduction of an
Abstract Syntax Tree (AST) as an intermediate structure in our compilation
process:
Oh man, you got the implementation done? That makes me so happy. ^^
Like Sara, I am very, very much for this. There are so many potential benefits, and the only real downside IMO (memory use and performance at parsing-time) is largely negated by using opcache, a built-in feature of PHP. An AST is really the most obvious and best way to do compilation, IMO, and as it so happens, I’ve used them in the past when implementing a programming language I designed.
A benefit I particularly like is we could perhaps add nicer generator expression and list() syntaxes, as you have described. I can’t remember if you implemented the former. If you did, I’d love it if you’d also propose it should ASTs get in, they’re an awesome feature and it is a shame we didn’t get them when we got generators.
On another note, while we could allow extensions to hook into the AST, I’d stray away from letting them modify it and implement new features. That really doesn’t sound like a good idea at all (I could elaborate on why, but I won’t bother unless someone asks me). Letting them read it would clearly be a good thing though. Side note: Would that obsolete your PHP-Parser userland library once an extension allows reading the AST? If an extension is written, perhaps it should have the same API as your library had. I believe many projects already use it.
On an implementation note, I assume this is based on the existing constant ASTs, unless you renamed it to something other then zend_ast. Does this cause any problems for them? While I can see the benefits of course, I’d just like to be sure that the validator doesn’t let anything slip through.
It’s also nice to see zend_emit_op and _jump, I suspect they’ll make code easier to read (currently I have to scan for the other operands sometimes as they’re not always set together).
Again, I really like this. I look forward to its inclusion. Thank you so much!
--
Andrea Faulds
http://ajf.me/
I've created a draft RFC and implementation for the introduction of an
Abstract Syntax Tree (AST) as an intermediate structure in our
compilation
process:Oh man, you got the implementation done? That makes me so happy. ^^
Like Sara, I am very, very much for this. There are so many potential
benefits, and the only real downside IMO (memory use and performance at
parsing-time) is largely negated by using opcache, a built-in feature of
PHP. An AST is really the most obvious and best way to do compilation, IMO,
and as it so happens, I’ve used them in the past when implementing a
programming language I designed.A benefit I particularly like is we could perhaps add nicer generator
expression and list() syntaxes, as you have described. I can’t remember if
you implemented the former. If you did, I’d love it if you’d also propose
it should ASTs get in, they’re an awesome feature and it is a shame we
didn’t get them when we got generators.On another note, while we could allow extensions to hook into the AST,
I’d stray away from letting them modify it and implement new features. That
really doesn’t sound like a good idea at all (I could elaborate on why, but
I won’t bother unless someone asks me). Letting them read it would clearly
be a good thing though. Side note: Would that obsolete your PHP-Parser
userland library once an extension allows reading the AST? If an extension
is written, perhaps it should have the same API as your library had. I
believe many projects already use it.On an implementation note, I assume this is based on the existing
constant ASTs, unless you renamed it to something other then zend_ast. Does
this cause any problems for them? While I can see the benefits of course,
I’d just like to be sure that the validator doesn’t let anything slip
through.It’s also nice to see zend_emit_op and _jump, I suspect they’ll make code
easier to read (currently I have to scan for the other operands sometimes
as they’re not always set together).Again, I really like this. I look forward to its inclusion. Thank you so
much!--
Andrea Faulds
http://ajf.me/--
That's nice Nik ;-)
As I told you on IRC, I'm all in for developping an extension that could
play with the AST.
I like Andrea's suggestion of publishing to userland an API that looks like
PHPParser :-)
Julien.P
Hi,
I don't really have anything new to say about this, just couldn't
resist another +1.
This RFC is an example to follow - well thought and written, detailed,
implementation ready in spite of it being a real challenge,
non-controversial, and even as a userland developer, I can see the
benefits. If we were on 9gag, I'd be throwing up rainbows.
Great job!
Cheers,
Andrey.
On another note, while we could allow extensions to hook into the AST,
I’d stray away from letting them modify it and implement new features. That
really doesn’t sound like a good idea at all (I could elaborate on why, but
I won’t bother unless someone asks me).
Why do you think this isn't a good idea? I think it would be a nice way to
prototype language features before pulling them into PHP. Though admittedly
I don't think there are many things that could be implemented that way.
Letting them read it would clearly be a good thing though. Side note:
Would that obsolete your PHP-Parser userland library once an extension
allows reading the AST? If an extension is written, perhaps it should have
the same API as your library had. I believe many projects already use it.
A native extension has the limitation that it will not be able to parse
files for newer PHP versions (which, depending on the use case may or may
not be a problem) and probably won't provide a stable structure across
versions. At least I think giving BC guarantees on the AST structure
between minor versions would be way too limiting for us. As such I think
both a native ext (which provides awesome perf) and PHP-Parser (which
provides x-compat) have their place ;)
On an implementation note, I assume this is based on the existing
constant ASTs, unless you renamed it to something other then zend_ast. Does
this cause any problems for them? While I can see the benefits of course,
I’d just like to be sure that the validator doesn’t let anything slip
through.
Yes, the AST structure is based on the existing work on constant scalar
expressions, though by now the structure and API deviate a good bit from
that. It doesn't cause problems for them - constant expressions go through
a validation that checks that only valid nodes are used and adjusts those
nodes that have special representation for the constexpr case (e.g.
constant and class constant access).
Nikita
Hi!
Why do you think this isn't a good idea? I think it would be a nice way to prototype language features before pulling them into PHP. Though admittedly I don't think there are many things that could be implemented that way.
If people can extend the syntax, they will, and I don’t like the possible consequences of that. I’m all for overloading, but if people start relying on custom syntactical features, it means non-portable and confusing to read code. Granted, it might be useful for prototyping, but prototyping itself would be easier with an AST, so I’m not sure it matters.
A native extension has the limitation that it will not be able to parse files for newer PHP versions (which, depending on the use case may or may not be a problem) and probably won't provide a stable structure across versions. At least I think giving BC guarantees on the AST structure between minor versions would be way too limiting for us. As such I think both a native ext (which provides awesome perf) and PHP-Parser (which provides x-compat) have their place ;)
Yeah, I was thinking that you could keep it around for version compatibility. You could even make your library pass through to the native ext where possible. :)
Yes, the AST structure is based on the existing work on constant scalar expressions, though by now the structure and API deviate a good bit from that. It doesn't cause problems for them - constant expressions go through a validation that checks that only valid nodes are used and adjusts those nodes that have special representation for the constexpr case (e.g. constant and class constant access)
Ah, I see.
--
Andrea Faulds
http://ajf.me/
When is this planned to go through voting process?
Hi!
Why do you think this isn't a good idea? I think it would be a nice way
to prototype language features before pulling them into PHP. Though
admittedly I don't think there are many things that could be implemented
that way.If people can extend the syntax, they will, and I don’t like the possible
consequences of that. I’m all for overloading, but if people start relying
on custom syntactical features, it means non-portable and confusing to read
code. Granted, it might be useful for prototyping, but prototyping itself
would be easier with an AST, so I’m not sure it matters.A native extension has the limitation that it will not be able to parse
files for newer PHP versions (which, depending on the use case may or may
not be a problem) and probably won't provide a stable structure across
versions. At least I think giving BC guarantees on the AST structure
between minor versions would be way too limiting for us. As such I think
both a native ext (which provides awesome perf) and PHP-Parser (which
provides x-compat) have their place ;)Yeah, I was thinking that you could keep it around for version
compatibility. You could even make your library pass through to the native
ext where possible. :)Yes, the AST structure is based on the existing work on constant scalar
expressions, though by now the structure and API deviate a good bit from
that. It doesn't cause problems for them - constant expressions go through
a validation that checks that only valid nodes are used and adjusts those
nodes that have special representation for the constexpr case (e.g.
constant and class constant access)Ah, I see.
--
Andrea Faulds
http://ajf.me/--
--
Guilherme Blanco
MSN: guilhermeblanco@hotmail.com
GTalk: guilhermeblanco
Toronto - ON/Canada
On Wed, Aug 13, 2014 at 7:18 PM, guilhermeblanco@gmail.com <
guilhermeblanco@gmail.com> wrote:
When is this planned to go through voting process?
Before voting I'd like to have some opinions on the behavioral differences
the proposal introduces. In particular I'd like to know whether we are okay
with changing order of evaluation in some places. We document that
evaluation order is undefined, we document that it can change without
warning between versions and that it is inconsistent within one version.
However in the current implementation evaluation is usually left-to-right.
The AST implementation introduces some places where evaluation is
explicitly right-to-left, and not just incidentally.
So basically the question is whether we are committed to things like
$a[$i++] = $i++ or $a[$i++][$i++] = $j being undefined behavior, in which
case order doesn't matter. If not, I can preserve left-to-right behavior
here (CVs notwithstanding of course), but it would come at the cost of a
good bit of additional complexity in the implementation (I suspect that I'd
have to reintroduce parts of the bp stack to properly shuffle the oplines
around).
My personal opinion is that things like $a[$i++] = $i++ have zero practical
relevance (and also think that anyone using something like this deserves
any breakage he gets). I'd rather save some lines of code than maintain any
behavior guarantees for that. But some people have contrary opinions on
this, so I'm bringing up the point for discussion.
Nikita
Hi,
I'd personally find it horrible if $foo[$i] = $bar[$i++]; is executed
right-to-left, but given that your examples are a bit weird, I'm not sure
if mine is affected. Is it?
Cheers,
Andrey.
On Wed, Aug 13, 2014 at 7:18 PM, guilhermeblanco@gmail.com <
guilhermeblanco@gmail.com> wrote:When is this planned to go through voting process?
Before voting I'd like to have some opinions on the behavioral differences
the proposal introduces. In particular I'd like to know whether we are
okay
with changing order of evaluation in some places. We document that
evaluation order is undefined, we document that it can change without
warning between versions and that it is inconsistent within one version.
However in the current implementation evaluation is usually left-to-right.
The AST implementation introduces some places where evaluation is
explicitly right-to-left, and not just incidentally.So basically the question is whether we are committed to things like
$a[$i++] = $i++ or $a[$i++][$i++] = $j being undefined behavior, in which
case order doesn't matter. If not, I can preserve left-to-right behavior
here (CVs notwithstanding of course), but it would come at the cost of a
good bit of additional complexity in the implementation (I suspect that
I'd
have to reintroduce parts of the bp stack to properly shuffle the oplines
around).My personal opinion is that things like $a[$i++] = $i++ have zero
practical
relevance (and also think that anyone using something like this deserves
any breakage he gets). I'd rather save some lines of code than maintain
any
behavior guarantees for that. But some people have contrary opinions on
this, so I'm bringing up the point for discussion.Nikita
Hi,
I'd personally find it horrible if $foo[$i] = $bar[$i++]; is executed
right-to-left, but given that your examples are a bit weird, I'm not sure
if mine is affected. Is it?Cheers,
Andrey.
I've decided that I'd rather avoid this particular discussion and just
implemented LTR evaluation.
Nikita
Andrey Andreev:
I'd personally find it horrible if $foo[$i] = $bar[$i++]; is executed
right-to-left, but given that your examples are a bit weird, I'm not sure
if mine is affected. Is it?
Isn't that evaluated from right to left already since PHP 5.1.0? See
http://3v4l.org/Ivfdr#v510.
--
Christoph M. Becker
Andrey Andreev:
I'd personally find it horrible if $foo[$i] = $bar[$i++]; is executed
right-to-left, but given that your examples are a bit weird, I'm not sure
if mine is affected. Is it?Isn't that evaluated from right to left already since PHP 5.1.0? See
http://3v4l.org/Ivfdr#v510.
Yes, anything containing simple variables (like $i) will often deviate from
the usual evaluation order due to the CV optimization. I don't touch that.
What I've changed now is that ${a()}[b()] = c() will continue to call a, b,
c, rather than calling c, b, a (which was what the AST implementation
initially did). I honestly don't think that it matters, but I figured that
it's best to stick with the more "logical" behavior as long as it's not too
much trouble :)
Nikita
Nikita Popov wrote:
Yes, anything containing simple variables (like $i) will often deviate from
the usual evaluation order due to the CV optimization. I don't touch that.
What I've changed now is that ${a()}[b()] = c() will continue to call a, b,
c, rather than calling c, b, a (which was what the AST implementation
initially did). I honestly don't think that it matters, but I figured that
it's best to stick with the more "logical" behavior as long as it's not too
much trouble :)
Thanks for the explanation. :)
Anyway, I try to avoid relying on any particular evaluation order
between sequence points.
--
Christoph M. Becker
Andrey Andreev:
I'd personally find it horrible if $foo[$i] = $bar[$i++]; is executed
right-to-left, but given that your examples are a bit weird, I'm not sure
if mine is affected. Is it?Isn't that evaluated from right to left already since PHP 5.1.0? See
http://3v4l.org/Ivfdr#v510.--
Christoph M. Becker
Huh, I never suspected that. I've written a lot of code that relies on
increments in one-liners, possibly not for assignment though.
I better go and do some greps for this ...
Cheers,
Andrey.
On Wed, Aug 13, 2014 at 7:18 PM, guilhermeblanco@gmail.com <
guilhermeblanco@gmail.com> wrote:When is this planned to go through voting process?
Before voting I'd like to have some opinions on the behavioral differences
the proposal introduces. In particular I'd like to know whether we are okay
with changing order of evaluation in some places. We document that
evaluation order is undefined, we document that it can change without
warning between versions and that it is inconsistent within one version.
However in the current implementation evaluation is usually left-to-right.
The AST implementation introduces some places where evaluation is
explicitly right-to-left, and not just incidentally.So basically the question is whether we are committed to things like
$a[$i++] = $i++ or $a[$i++][$i++] = $j being undefined behavior, in which
case order doesn't matter. If not, I can preserve left-to-right behavior
here (CVs notwithstanding of course), but it would come at the cost of a
good bit of additional complexity in the implementation (I suspect that I'd
have to reintroduce parts of the bp stack to properly shuffle the oplines
around).My personal opinion is that things like $a[$i++] = $i++ have zero practical
relevance (and also think that anyone using something like this deserves
any breakage he gets). I'd rather save some lines of code than maintain any
behavior guarantees for that. But some people have contrary opinions on
this, so I'm bringing up the point for discussion.
+1. It's documented as undefined behaviour, doing it in the first
place is highly non-obvious to the reader with a clear ambiguity, it's
fine by me to break from the current behaviour.
PHP is not and should not be designed around people who spend their
days playing code golf and/or optimising away one opcode at a time.
My personal opinion is that things like $a[$i++] = $i++ have zero practical
relevance (and also think that anyone using something like this deserves
any breakage he gets). I'd rather save some lines of code than maintain any
behavior guarantees for that. But some people have contrary opinions on
this, so I'm bringing up the point for discussion.+1. It's documented as undefined behaviour, doing it in the first
place is highly non-obvious to the reader with a clear ambiguity, it's
fine by me to break from the current behaviour.
+1
It is OK to define some things as undefined and have them break between releases -
even minor releases. I always teach it as undefined - as it is in most
programming languages.
Likewise, if the next 2 characters on input are 'a' then 'b' what gets assigned
could either be 'ab' or 'ba':
$TwoChars = fgetc($in) . fgetc($in);
Some code is just broken.
--
Alain Williams
Linux/GNU Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
+44 (0) 787 668 0256 http://www.phcomp.co.uk/
Parliament Hill Computers Ltd. Registration Information: http://www.phcomp.co.uk/contact.php
#include <std_disclaimer.h
My personal opinion is that things like $a[$i++] = $i++ have zero practical
relevance (and also think that anyone using something like this deserves
any breakage he gets). I'd rather save some lines of code than maintain any
behavior guarantees for that. But some people have contrary opinions on
this, so I'm bringing up the point for discussion.+1. It's documented as undefined behaviour, doing it in the first
place is highly non-obvious to the reader with a clear ambiguity, it's
fine by me to break from the current behaviour.+1
It is OK to define some things as undefined and have them break between releases -
even minor releases. I always teach it as undefined - as it is in most
programming languages.Likewise, if the next 2 characters on input are 'a' then 'b' what gets assigned
could either be 'ab' or 'ba':$TwoChars = fgetc($in) . fgetc($in);
Some code is just broken.
Wat? No. This is totally different to two in/decrements on one line. This has to be executed in order.
Cheers,
Mike
Michael Wallner wrote:
It is OK to define some things as undefined and have them break between releases -
even minor releases. I always teach it as undefined - as it is in most
programming languages.Likewise, if the next 2 characters on input are 'a' then 'b' what gets assigned
could either be 'ab' or 'ba':$TwoChars = fgetc($in) . fgetc($in);
Some code is just broken.
Wat? No. This is totally different to two in/decrements on one line. This has to be executed in order.
The current specification draft says[1]:
| Unless stated explicitly in this specification, the order in which
| the operands in an expression are evaluated relative to each other is
| unspecified.
| [...]
| Finally, in the full expression f() + g() * h(), the order in which
| the three functions are called, is unspecified).
[1]
https://github.com/php/php-langspec/blob/master/spec/10-expressions.md#expressions
--
Christoph M. Becker
Michael Wallner wrote:
Some code is just broken.
Wat? No. This is totally different to two in/decrements on one line. This has to be executed in order.
Ignore that.
The current specification draft says[1]:
| Unless stated explicitly in this specification, the order in which
| the operands in an expression are evaluated relative to each other is
| unspecified.
| [...]
| Finally, in the full expression f() + g() * h(), the order in which
| the three functions are called, is unspecified).[1]
https://github.com/php/php-langspec/blob/master/spec/10-expressions.md#expressions
You're right, it's documented, too; in the operator section not in the
expression section, though:
---8<---
PHP does not (in the general case) specify in which order an expression
is evaluated and code that assumes a specific order of evaluation should
be avoided, because the behavior can change between versions of PHP or
depending on the surrounding code.
--->8---
http://php.net/manual/en/language.operators.precedence.php
--
Regards,
Mike
Am 31.07.2014 um 20:11 schrieb Nikita Popov:
Is it Christmas already? Seriously: +1 from me. Especially because of
"The generated AST can be exposed to userland via an extension, for use
by static analysers. This should be relatively easy to implement and we
might even want to provide this as a bundled extension."
Your PHP-Parser is great, Nikita, but having the same functionality
built into PHP means that it will be authorative (as it is based on the
actual AST that the compiler uses) and faster.
Hi internals!
I've created a draft RFC and implementation for the introduction of an
Abstract Syntax Tree (AST) as an intermediate structure in our compilation
process:https://wiki.php.net/rfc/abstract_syntax_tree
The RFC outlines why an AST is beneficial, how it impacts performance and
memory usage of the compile process and what changes to syntax or behavior
it introduces.Furthermore the RFC contains an outline of how the current implementation
works and what APIs it provides. This section is just an overview and I
hope to extend it in the future.Note: I'm on vacation as of tomorrow and wanted to put this up to
discussion beforehand. I won't be able to implement any feedback while I'm
away, but can of course answer questions :)Thanks,
Nikita
I know I am not supposed to comment as just a PHP dev who uses the
language day to day for my living -- above is another example of why
Nikita is a person who just delivers - for me he completely defused the
64bit vs php-ng stuff a few weeks back with a technical analysis i.e. no
political crap - so maybe we need an rfc for a benign dictator for PHP
(as Rasmus is too busy I guess) and the community seems pretty
fractioned with endless arguments .. anyway I think you'd know who'd get
my vote :) -- we all know php-ng and 64bit are essential so get them
implemented and the documentation can follow... :)
Sorry for any unwelcome noise from user land
Rich
Hi Nikita,
I think RFC misses few important notes about behavior changes:
-
The behavior of $a->$b[$c] is changed. PHP apps that used such syntax
have to be changed into $a->{$b[$c]}. -
The evaluation order of left and right parts of assignment is changed.
$a[$i++] = $a[$i++]; It wasn't guaranteed before, but it may break some
code anyway. -
$a=[1,2]; list($a,$b) = $a; won't work in the same way as before
-
Usage of undefined constants now leads to fatal errors? (see
Zend/tests/use_const/no_global_fallback.php) -
The patch also enables expressions on constant arrays e.g.
isset(arra(1,2,3)[$x]);
Personally, I would prefer to separate syntax and behavior changes from the
AST patch itself or at least don't miss such changes, because they must be
very significant from users point of view.
Also some implementation related notes:
-
Changes to Zend/tests/bug43450.phpt are unrelated to AST. It's just a bad
test that may be fixed separately. -
I didn't get why you deleted Zend/tests/errmsg_014.php
-
ZEND_INIT_FCALL_BY_NAME handler shouldn't check if constant operand is a
string. Compiler must not provide non-string constant operand. (All the
changes to ZEND_INIT_FCALL_BY_NAME need to be verified more careful) -
The same in ext/opcache/Optimizer/optimize_func_calls.c, but it's not
critical here. -
I think you don't need IS_CONST operand for ZEND_ISSET_ISEMPTY_PROP_OBJ
handler. -
Can OPcahce always keep AST in shared memory and don't copy it into
process memory on each request? (ext/opcache/zend_accelerator_util_funcs.c)
Thanks. Dmitry.
Hi internals!
I've created a draft RFC and implementation for the introduction of an
Abstract Syntax Tree (AST) as an intermediate structure in our compilation
process:https://wiki.php.net/rfc/abstract_syntax_tree
The RFC outlines why an AST is beneficial, how it impacts performance and
memory usage of the compile process and what changes to syntax or behavior
it introduces.Furthermore the RFC contains an outline of how the current implementation
works and what APIs it provides. This section is just an overview and I
hope to extend it in the future.Note: I'm on vacation as of tomorrow and wanted to put this up to
discussion beforehand. I won't be able to implement any feedback while I'm
away, but can of course answer questions :)Thanks,
Nikita
Am 18.08.2014 um 09:32 schrieb Dmitry Stogov:
Can OPcahce always keep AST in shared memory and don't copy it into
process memory on each request? (ext/opcache/zend_accelerator_util_funcs.c)
Why would the AST need to be cached?
Am 18.08.2014 um 09:32 schrieb Dmitry Stogov:
Can OPcahce always keep AST in shared memory and don't copy it into
process memory on each request? (ext/opcache/zend_accelerator_util_funcs.c)Why would the AST need to be cached?
I can imagine optimizations happening during the cache lifetime and
not only right on the 1st request/compilation. Having the AST at hand
can be handy.
Cheers,
Pierre
@pierrejoye | http://www.libgd.org
Am 18.08.2014 um 10:01 schrieb Pierre Joye:
I can imagine optimizations happening during the cache lifetime and
not only right on the 1st request/compilation. Having the AST at hand
can be handy.
That makes sense.
constant expressions for properties, etc.
class Foo {
public $x = 5 + SOME_CONST;
}
Thanks. Dmitry.
On Mon, Aug 18, 2014 at 11:52 AM, Sebastian Bergmann sebastian@php.net
wrote:
Am 18.08.2014 um 09:32 schrieb Dmitry Stogov:
Can OPcahce always keep AST in shared memory and don't copy it into
process memory on each request?
(ext/opcache/zend_accelerator_util_funcs.c)Why would the AST need to be cached?
Magento doesn't work with AST patch.
500 response on home page, no crash.
Other applications I tested seem to work.
Thanks. Dmitry.
Hi Nikita,
I think RFC misses few important notes about behavior changes:
The behavior of $a->$b[$c] is changed. PHP apps that used such syntax
have to be changed into $a->{$b[$c]}.The evaluation order of left and right parts of assignment is changed.
$a[$i++] = $a[$i++]; It wasn't guaranteed before, but it may break some
code anyway.$a=[1,2]; list($a,$b) = $a; won't work in the same way as before
Usage of undefined constants now leads to fatal errors? (see
Zend/tests/use_const/no_global_fallback.php)The patch also enables expressions on constant arrays e.g.
isset(arra(1,2,3)[$x]);Personally, I would prefer to separate syntax and behavior changes from
the AST patch itself or at least don't miss such changes, because they must
be very significant from users point of view.Also some implementation related notes:
Changes to Zend/tests/bug43450.phpt are unrelated to AST. It's just a
bad test that may be fixed separately.I didn't get why you deleted Zend/tests/errmsg_014.php
ZEND_INIT_FCALL_BY_NAME handler shouldn't check if constant operand is
a string. Compiler must not provide non-string constant operand. (All the
changes to ZEND_INIT_FCALL_BY_NAME need to be verified more careful)The same in ext/opcache/Optimizer/optimize_func_calls.c, but it's not
critical here.I think you don't need IS_CONST operand for ZEND_ISSET_ISEMPTY_PROP_OBJ
handler.Can OPcahce always keep AST in shared memory and don't copy it into
process memory on each request? (ext/opcache/zend_accelerator_util_funcs.c)Thanks. Dmitry.
On Thu, Jul 31, 2014 at 10:11 PM, Nikita Popov nikita.ppv@gmail.com
wrote:Hi internals!
I've created a draft RFC and implementation for the introduction of an
Abstract Syntax Tree (AST) as an intermediate structure in our compilation
process:https://wiki.php.net/rfc/abstract_syntax_tree
The RFC outlines why an AST is beneficial, how it impacts performance and
memory usage of the compile process and what changes to syntax or behavior
it introduces.Furthermore the RFC contains an outline of how the current implementation
works and what APIs it provides. This section is just an overview and I
hope to extend it in the future.Note: I'm on vacation as of tomorrow and wanted to put this up to
discussion beforehand. I won't be able to implement any feedback while I'm
away, but can of course answer questions :)Thanks,
Nikita
Hi Nikita,
I think RFC misses few important notes about behavior changes:
- The behavior of $a->$b[$c] is changed. PHP apps that used such syntax
have to be changed into $a->{$b[$c]}.
This is a change from the uniform variable syntax (
https://wiki.php.net/rfc/uniform_variable_syntax), not due to the AST. The
AST patch already includes the uniform variable syntax patch for technical
reasons.
- The evaluation order of left and right parts of assignment is changed.
$a[$i++] = $a[$i++]; It wasn't guaranteed before, but it may break some
code anyway.
This is no longer true, the AST now implements left-to-right evaluation for
assignments.
- $a=[1,2]; list($a,$b) = $a; won't work in the same way as before
Yes, this is true. I think assigning list() elements from left-to-right
makes a lot more sense and I also think that this is the kind of behavior
that it's okay to change in a major version. In any case, I've added this
particular example to the list() section of the RFC as well.
- Usage of undefined constants now leads to fatal errors? (see
Zend/tests/use_const/no_global_fallback.php)
This was a bug in the previous implementation - it uses a namespaced
constant "foo\bar\baz" and usage of undefined namespaced constants results
in a fatal error. This does not change that undefined unqualified constants
(like just "baz") will results in a notice and string fallback. This only
fixes the resolution of "use const" imports.
- The patch also enables expressions on constant arrays e.g.
isset(arra(1,2,3)[$x]);
This is also a change from the uniform variable syntax, not from the AST.
Personally, I would prefer to separate syntax and behavior changes from
the AST patch itself or at least don't miss such changes, because they must
be very significant from users point of view.
I think the confusion here is mainly that the AST patch already includes
the UVS patch (which was already voted in previously). I didn't list the
changes from the UVS patch in this RFC, because they're already listed in
https://wiki.php.net/rfc/uniform_variable_syntax.
Also some implementation related notes:
- Changes to Zend/tests/bug43450.phpt are unrelated to AST. It's just a
bad test that may be fixed separately.
The AST makes it fail due to different memory usage patterns, which is why
I changed it to use a test compatible with different peak memory usage.
- I didn't get why you deleted Zend/tests/errmsg_014.php
I allowed doing $obj->__clone() calls, because __clone was the only magic
method that had a compile-time checks preventing some calls to it. If we
allow all other magic methods to be called, then there's no reason to
forbid __clone. I've added a note about this to the RFC.
- ZEND_INIT_FCALL_BY_NAME handler shouldn't check if constant operand is
a string. Compiler must not provide non-string constant operand. (All the
changes to ZEND_INIT_FCALL_BY_NAME need to be verified more careful)
How should this be implemented instead? The case where an IS_CONST operand
is called needs to be handled somewhere. One possible approach would be to
send all constant-string calls through ZEND_INIT_FCALL. It uses the same
code code as ZEND_INIT_FCALL_BY_NAME with the difference that BY_NAME has
an extra zval literal storing the original (non-lowercased) function name.
We could store this additional function name only in cases of a
non-ct-bound function, because that's the only case where the undefined
function error can be thrown. Does that make sense?
The same in ext/opcache/Optimizer/optimize_func_calls.c, but it's not
critical here.I think you don't need IS_CONST operand for ZEND_ISSET_ISEMPTY_PROP_OBJ
handler.Can OPcahce always keep AST in shared memory and don't copy it into
process memory on each request? (
ext/opcache/zend_accelerator_util_funcs.c)
I'm not particularly familiar with opcache, so I'm not sure whether this is
possible. Right now this doesn't work because the constant expression AST
is heap allocated, so we have efree calls. But if we allocate the
constant-expression AST in CG(arena), maybe this would be possible?
Thanks for your review!
Nikita
Hi Nikita,
I think RFC misses few important notes about behavior changes:
- The behavior of $a->$b[$c] is changed. PHP apps that used such syntax
have to be changed into $a->{$b[$c]}.This is a change from the uniform variable syntax (
https://wiki.php.net/rfc/uniform_variable_syntax), not due to the AST.
The AST patch already includes the uniform variable syntax patch for
technical reasons.
I missed it. I think you should add a clear note that AST RFC also
implements "Uniform Variable Syntax" (completely or with some exceptions).
- The evaluation order of left and right parts of assignment is changed.
$a[$i++] = $a[$i++]; It wasn't guaranteed before, but it may break some
code anyway.This is no longer true, the AST now implements left-to-right evaluation
for assignments.
great. I don't talk that the old evaluation order better or worse. I just
don't see a big reason to change it.
- $a=[1,2]; list($a,$b) = $a; won't work in the same way as before
Yes, this is true. I think assigning list() elements from left-to-right
makes a lot more sense and I also think that this is the kind of behavior
that it's okay to change in a major version. In any case, I've added this
particular example to the list() section of the RFC as well.
thanks.
- Usage of undefined constants now leads to fatal errors? (see
Zend/tests/use_const/no_global_fallback.php)This was a bug in the previous implementation - it uses a namespaced
constant "foo\bar\baz" and usage of undefined namespaced constants results
in a fatal error. This does not change that undefined unqualified constants
(like just "baz") will results in a notice and string fallback. This only
fixes the resolution of "use const" imports.
- The patch also enables expressions on constant arrays e.g.
isset(arra(1,2,3)[$x]);
This is also a change from the uniform variable syntax, not from the AST.
Personally, I would prefer to separate syntax and behavior changes from
the AST patch itself or at least don't miss such changes, because they must
be very significant from users point of view.I think the confusion here is mainly that the AST patch already includes
the UVS patch (which was already voted in previously). I didn't list the
changes from the UVS patch in this RFC, because they're already listed in
https://wiki.php.net/rfc/uniform_variable_syntax.Also some implementation related notes:
- Changes to Zend/tests/bug43450.phpt are unrelated to AST. It's just a
bad test that may be fixed separately.The AST makes it fail due to different memory usage patterns, which is why
I changed it to use a test compatible with different peak memory usage.
This test is ugly, and it starts fail from time to time after changes in
absolutely unrelated areas :(
- I didn't get why you deleted Zend/tests/errmsg_014.php
I allowed doing $obj->__clone() calls, because __clone was the only magic
method that had a compile-time checks preventing some calls to it. If we
allow all other magic methods to be called, then there's no reason to
forbid __clone. I've added a note about this to the RFC.
ok. I can't remember why it prohibited.
- ZEND_INIT_FCALL_BY_NAME handler shouldn't check if constant operand is
a string. Compiler must not provide non-string constant operand. (All the
changes to ZEND_INIT_FCALL_BY_NAME need to be verified more careful)How should this be implemented instead? The case where an IS_CONST operand
is called needs to be handled somewhere. One possible approach would be to
send all constant-string calls through ZEND_INIT_FCALL. It uses the same
code code as ZEND_INIT_FCALL_BY_NAME with the difference that BY_NAME has
an extra zval literal storing the original (non-lowercased) function name.
We could store this additional function name only in cases of a
non-ct-bound function, because that's the only case where the undefined
function error can be thrown. Does that make sense?
I mean the additional check (Z_TYPE_P(opline->op2.zv) == IS_STRING) at line
2391 must be useless.
Am I wrong?
or is it intended to handle "class_name", "method_name" syntax?
The same in ext/opcache/Optimizer/optimize_func_calls.c, but it's not
critical here.I think you don't need IS_CONST operand for ZEND_ISSET_ISEMPTY_PROP_OBJ
handler.Can OPcahce always keep AST in shared memory and don't copy it into
process memory on each request? (
ext/opcache/zend_accelerator_util_funcs.c)I'm not particularly familiar with opcache, so I'm not sure whether this
is possible. Right now this doesn't work because the constant expression
AST is heap allocated, so we have efree calls. But if we allocate the
constant-expression AST in CG(arena), maybe this would be possible?
If the constant expression AST is not changed at run-time it may be kept in
SHM.
In this case you may just avoid copying from SHM to process memory and keep
pointers to AST in SHM.
Probably, some special flags should be introduced to avoid such AST
destruction.
You may run PHP disable the AST copying code and run PHP with
"opcache.protect_memory=1".
Once PHP will try to modify the SHM contents, it'll crash.
Thanks. Dmitry.
Thanks for your review!
Nikita
one more problem
sapi/cli/php -r ' $a = [1,2]; list($b, $a) = $a; var_dump($a,$b);'
Segmentation fault
Thanks. Dmitry.
On Mon, Aug 18, 2014 at 2:26 PM, Nikita Popov nikita.ppv@gmail.com
wrote:Hi Nikita,
I think RFC misses few important notes about behavior changes:
- The behavior of $a->$b[$c] is changed. PHP apps that used such syntax
have to be changed into $a->{$b[$c]}.This is a change from the uniform variable syntax (
https://wiki.php.net/rfc/uniform_variable_syntax), not due to the AST.
The AST patch already includes the uniform variable syntax patch for
technical reasons.I missed it. I think you should add a clear note that AST RFC also
implements "Uniform Variable Syntax" (completely or with some exceptions).
- The evaluation order of left and right parts of assignment is
changed. $a[$i++] = $a[$i++]; It wasn't guaranteed before, but it may break
some code anyway.This is no longer true, the AST now implements left-to-right evaluation
for assignments.great. I don't talk that the old evaluation order better or worse. I just
don't see a big reason to change it.
- $a=[1,2]; list($a,$b) = $a; won't work in the same way as before
Yes, this is true. I think assigning list() elements from left-to-right
makes a lot more sense and I also think that this is the kind of behavior
that it's okay to change in a major version. In any case, I've added this
particular example to the list() section of the RFC as well.thanks.
- Usage of undefined constants now leads to fatal errors? (see
Zend/tests/use_const/no_global_fallback.php)This was a bug in the previous implementation - it uses a namespaced
constant "foo\bar\baz" and usage of undefined namespaced constants results
in a fatal error. This does not change that undefined unqualified constants
(like just "baz") will results in a notice and string fallback. This only
fixes the resolution of "use const" imports.
- The patch also enables expressions on constant arrays e.g.
isset(arra(1,2,3)[$x]);
This is also a change from the uniform variable syntax, not from the AST.
Personally, I would prefer to separate syntax and behavior changes
from the AST patch itself or at least don't miss such changes, because they
must be very significant from users point of view.I think the confusion here is mainly that the AST patch already includes
the UVS patch (which was already voted in previously). I didn't list the
changes from the UVS patch in this RFC, because they're already listed in
https://wiki.php.net/rfc/uniform_variable_syntax.Also some implementation related notes:
- Changes to Zend/tests/bug43450.phpt are unrelated to AST. It's just a
bad test that may be fixed separately.The AST makes it fail due to different memory usage patterns, which is
why I changed it to use a test compatible with different peak memory usage.This test is ugly, and it starts fail from time to time after changes in
absolutely unrelated areas :(
- I didn't get why you deleted Zend/tests/errmsg_014.php
I allowed doing $obj->__clone() calls, because __clone was the only magic
method that had a compile-time checks preventing some calls to it. If we
allow all other magic methods to be called, then there's no reason to
forbid __clone. I've added a note about this to the RFC.ok. I can't remember why it prohibited.
- ZEND_INIT_FCALL_BY_NAME handler shouldn't check if constant operand
is a string. Compiler must not provide non-string constant operand. (All
the changes to ZEND_INIT_FCALL_BY_NAME need to be verified more careful)How should this be implemented instead? The case where an IS_CONST
operand is called needs to be handled somewhere. One possible approach
would be to send all constant-string calls through ZEND_INIT_FCALL. It uses
the same code code as ZEND_INIT_FCALL_BY_NAME with the difference that
BY_NAME has an extra zval literal storing the original (non-lowercased)
function name. We could store this additional function name only in cases
of a non-ct-bound function, because that's the only case where the
undefined function error can be thrown. Does that make sense?I mean the additional check (Z_TYPE_P(opline->op2.zv) == IS_STRING) at
line 2391 must be useless.
Am I wrong?or is it intended to handle "class_name", "method_name" syntax?
The same in ext/opcache/Optimizer/optimize_func_calls.c, but it's not
critical here.I think you don't need IS_CONST operand for ZEND_ISSET_ISEMPTY_PROP_OBJ
handler.Can OPcahce always keep AST in shared memory and don't copy it into
process memory on each request? (
ext/opcache/zend_accelerator_util_funcs.c)I'm not particularly familiar with opcache, so I'm not sure whether this
is possible. Right now this doesn't work because the constant expression
AST is heap allocated, so we have efree calls. But if we allocate the
constant-expression AST in CG(arena), maybe this would be possible?If the constant expression AST is not changed at run-time it may be kept
in SHM.
In this case you may just avoid copying from SHM to process memory and
keep pointers to AST in SHM.
Probably, some special flags should be introduced to avoid such AST
destruction.You may run PHP disable the AST copying code and run PHP with
"opcache.protect_memory=1".
Once PHP will try to modify the SHM contents, it'll crash.Thanks. Dmitry.
Thanks for your review!
Nikita
On Mon, Aug 18, 2014 at 2:26 PM, Nikita Popov nikita.ppv@gmail.com
wrote:Hi Nikita,
I think RFC misses few important notes about behavior changes:
- The behavior of $a->$b[$c] is changed. PHP apps that used such syntax
have to be changed into $a->{$b[$c]}.This is a change from the uniform variable syntax (
https://wiki.php.net/rfc/uniform_variable_syntax), not due to the AST.
The AST patch already includes the uniform variable syntax patch for
technical reasons.I missed it. I think you should add a clear note that AST RFC also
implements "Uniform Variable Syntax" (completely or with some exceptions).
- The evaluation order of left and right parts of assignment is
changed. $a[$i++] = $a[$i++]; It wasn't guaranteed before, but it may break
some code anyway.This is no longer true, the AST now implements left-to-right evaluation
for assignments.great. I don't talk that the old evaluation order better or worse. I just
don't see a big reason to change it.
- $a=[1,2]; list($a,$b) = $a; won't work in the same way as before
Yes, this is true. I think assigning list() elements from left-to-right
makes a lot more sense and I also think that this is the kind of behavior
that it's okay to change in a major version. In any case, I've added this
particular example to the list() section of the RFC as well.thanks.
- Usage of undefined constants now leads to fatal errors? (see
Zend/tests/use_const/no_global_fallback.php)This was a bug in the previous implementation - it uses a namespaced
constant "foo\bar\baz" and usage of undefined namespaced constants results
in a fatal error. This does not change that undefined unqualified constants
(like just "baz") will results in a notice and string fallback. This only
fixes the resolution of "use const" imports.
- The patch also enables expressions on constant arrays e.g.
isset(arra(1,2,3)[$x]);
This is also a change from the uniform variable syntax, not from the AST.
Personally, I would prefer to separate syntax and behavior changes
from the AST patch itself or at least don't miss such changes, because they
must be very significant from users point of view.I think the confusion here is mainly that the AST patch already includes
the UVS patch (which was already voted in previously). I didn't list the
changes from the UVS patch in this RFC, because they're already listed in
https://wiki.php.net/rfc/uniform_variable_syntax.Also some implementation related notes:
- Changes to Zend/tests/bug43450.phpt are unrelated to AST. It's just a
bad test that may be fixed separately.The AST makes it fail due to different memory usage patterns, which is
why I changed it to use a test compatible with different peak memory usage.This test is ugly, and it starts fail from time to time after changes in
absolutely unrelated areas :(
Maybe we should just remove the memory usage checks and rely on ZMM
warnings in debug mode?
- I didn't get why you deleted Zend/tests/errmsg_014.php
I allowed doing $obj->__clone() calls, because __clone was the only magic
method that had a compile-time checks preventing some calls to it. If we
allow all other magic methods to be called, then there's no reason to
forbid __clone. I've added a note about this to the RFC.ok. I can't remember why it prohibited.
- ZEND_INIT_FCALL_BY_NAME handler shouldn't check if constant operand
is a string. Compiler must not provide non-string constant operand. (All
the changes to ZEND_INIT_FCALL_BY_NAME need to be verified more careful)How should this be implemented instead? The case where an IS_CONST
operand is called needs to be handled somewhere. One possible approach
would be to send all constant-string calls through ZEND_INIT_FCALL. It uses
the same code code as ZEND_INIT_FCALL_BY_NAME with the difference that
BY_NAME has an extra zval literal storing the original (non-lowercased)
function name. We could store this additional function name only in cases
of a non-ct-bound function, because that's the only case where the
undefined function error can be thrown. Does that make sense?I mean the additional check (Z_TYPE_P(opline->op2.zv) == IS_STRING) at
line 2391 must be useless.
Am I wrong?or is it intended to handle "class_name", "method_name" syntax?
Yes, that's the case it handles.
The same in ext/opcache/Optimizer/optimize_func_calls.c, but it's not
critical here.I think you don't need IS_CONST operand for ZEND_ISSET_ISEMPTY_PROP_OBJ
handler.Can OPcahce always keep AST in shared memory and don't copy it into
process memory on each request? (
ext/opcache/zend_accelerator_util_funcs.c)I'm not particularly familiar with opcache, so I'm not sure whether this
is possible. Right now this doesn't work because the constant expression
AST is heap allocated, so we have efree calls. But if we allocate the
constant-expression AST in CG(arena), maybe this would be possible?If the constant expression AST is not changed at run-time it may be kept
in SHM.
In this case you may just avoid copying from SHM to process memory and
keep pointers to AST in SHM.
Probably, some special flags should be introduced to avoid such AST
destruction.You may run PHP disable the AST copying code and run PHP with
"opcache.protect_memory=1".
Once PHP will try to modify the SHM contents, it'll crash.
Looking at the AST evaluation code, I think this would be problematic:
http://lxr.php.net/xref/phpng/Zend/zend_ast.c#286 Here the constant is
updated in-place.
Nikita
On Mon, Aug 18, 2014 at 2:26 PM, Nikita Popov nikita.ppv@gmail.com
wrote:Hi Nikita,
I think RFC misses few important notes about behavior changes:
- The behavior of $a->$b[$c] is changed. PHP apps that used such
syntax have to be changed into $a->{$b[$c]}.This is a change from the uniform variable syntax (
https://wiki.php.net/rfc/uniform_variable_syntax), not due to the AST.
The AST patch already includes the uniform variable syntax patch for
technical reasons.I missed it. I think you should add a clear note that AST RFC also
implements "Uniform Variable Syntax" (completely or with some exceptions).
- The evaluation order of left and right parts of assignment is
changed. $a[$i++] = $a[$i++]; It wasn't guaranteed before, but it may break
some code anyway.This is no longer true, the AST now implements left-to-right evaluation
for assignments.great. I don't talk that the old evaluation order better or worse. I just
don't see a big reason to change it.
- $a=[1,2]; list($a,$b) = $a; won't work in the same way as before
Yes, this is true. I think assigning list() elements from left-to-right
makes a lot more sense and I also think that this is the kind of behavior
that it's okay to change in a major version. In any case, I've added this
particular example to the list() section of the RFC as well.thanks.
- Usage of undefined constants now leads to fatal errors? (see
Zend/tests/use_const/no_global_fallback.php)This was a bug in the previous implementation - it uses a namespaced
constant "foo\bar\baz" and usage of undefined namespaced constants results
in a fatal error. This does not change that undefined unqualified constants
(like just "baz") will results in a notice and string fallback. This only
fixes the resolution of "use const" imports.
- The patch also enables expressions on constant arrays e.g.
isset(arra(1,2,3)[$x]);
This is also a change from the uniform variable syntax, not from the AST.
Personally, I would prefer to separate syntax and behavior changes
from the AST patch itself or at least don't miss such changes, because they
must be very significant from users point of view.I think the confusion here is mainly that the AST patch already includes
the UVS patch (which was already voted in previously). I didn't list the
changes from the UVS patch in this RFC, because they're already listed in
https://wiki.php.net/rfc/uniform_variable_syntax.Also some implementation related notes:
- Changes to Zend/tests/bug43450.phpt are unrelated to AST. It's just
a bad test that may be fixed separately.The AST makes it fail due to different memory usage patterns, which is
why I changed it to use a test compatible with different peak memory usage.This test is ugly, and it starts fail from time to time after changes in
absolutely unrelated areas :(Maybe we should just remove the memory usage checks and rely on ZMM
warnings in debug mode?
The test is intended to check that the engine completely frees unused
resources at specific points,
but it may fail if MM cache some of released memory blocks for quick reuse.
It's better to ignore this test failure now, then hide it. Most probably
this failure is completely unrelated to AST, however few weeks ago it
helped me found a real problem in phpng.
- I didn't get why you deleted Zend/tests/errmsg_014.php
I allowed doing $obj->__clone() calls, because __clone was the only
magic method that had a compile-time checks preventing some calls to it. If
we allow all other magic methods to be called, then there's no reason to
forbid __clone. I've added a note about this to the RFC.ok. I can't remember why it prohibited.
- ZEND_INIT_FCALL_BY_NAME handler shouldn't check if constant operand
is a string. Compiler must not provide non-string constant operand. (All
the changes to ZEND_INIT_FCALL_BY_NAME need to be verified more
careful)How should this be implemented instead? The case where an IS_CONST
operand is called needs to be handled somewhere. One possible approach
would be to send all constant-string calls through ZEND_INIT_FCALL. It uses
the same code code as ZEND_INIT_FCALL_BY_NAME with the difference that
BY_NAME has an extra zval literal storing the original (non-lowercased)
function name. We could store this additional function name only in cases
of a non-ct-bound function, because that's the only case where the
undefined function error can be thrown. Does that make sense?I mean the additional check (Z_TYPE_P(opline->op2.zv) == IS_STRING) at
line 2391 must be useless.
Am I wrong?or is it intended to handle "class_name", "method_name" syntax?
Yes, that's the case it handles.
uh :)
The same in ext/opcache/Optimizer/optimize_func_calls.c, but it's
not critical here.I think you don't need IS_CONST operand for ZEND_ISSET_ISEMPTY_PROP_OBJ
handler.Can OPcahce always keep AST in shared memory and don't copy it into
process memory on each request? (
ext/opcache/zend_accelerator_util_funcs.c)I'm not particularly familiar with opcache, so I'm not sure whether this
is possible. Right now this doesn't work because the constant expression
AST is heap allocated, so we have efree calls. But if we allocate the
constant-expression AST in CG(arena), maybe this would be possible?If the constant expression AST is not changed at run-time it may be kept
in SHM.
In this case you may just avoid copying from SHM to process memory and
keep pointers to AST in SHM.
Probably, some special flags should be introduced to avoid such AST
destruction.You may run PHP disable the AST copying code and run PHP with
"opcache.protect_memory=1".
Once PHP will try to modify the SHM contents, it'll crash.Looking at the AST evaluation code, I think this would be problematic:
http://lxr.php.net/xref/phpng/Zend/zend_ast.c#286 Here the constant is
updated in-place.
I remember some related problems, but it would be great to find a better
solution.
Of course, it's not required to do it together with AST patch, it might be
done later or during vote
Some more problems I found:
- few tests started fail with the patch
Zend Multibyte and ShiftJIS
[Zend/tests/multibyte/multibyte_encoding_001.phpt]
zend multibyte (8) [ext/mbstring/tests/zend_multibyte-08.phpt]
serialization: arrays with references amonst elements
[ext/standard/tests/serialize/serialization_arrays_002.phpt]
Object serialization / unserialization: references amongst properties
[ext/standard/tests/serialize/serialization_objects_013.phpt]
-
make install-pear
reports memory leaks in debug build
Thanks. Dmitry.
Nikita
Some more problems I found:
- few tests started fail with the patch
Zend Multibyte and ShiftJIS
[Zend/tests/multibyte/multibyte_encoding_001.phpt]
zend multibyte (8) [ext/mbstring/tests/zend_multibyte-08.phpt]
Thanks, those are fixed now.
serialization: arrays with references amonst elements
[ext/standard/tests/serialize/serialization_arrays_002.phpt]
Object serialization / unserialization: references amongst properties
[ext/standard/tests/serialize/serialization_objects_013.phpt]
These are caused by
https://wiki.php.net/rfc/abstract_syntax_tree#auto-vivification_order_for_by-reference_assignments,
which is an intentional change - I've now adjusted the tests to use an
explicit order.
make install-pear
reports memory leaks in debug build
I've fixed a PEAR install related leak in master - likely these leaks were
the same.
Nikita