Prior to PHP 5.1 (and the introduction of compiled variables), the
following code would output warnings for the undefined variables in
the order they were used:
echo $x . $y . $z;
However, with the introduction of CVs, we wind up getting the warning
for $y, then $x, and finally $z. ((Insert "middle-out" reference from
HBO's Silicon Valley, here)) The reason for this, is that ZEND_CONCAT
(and indeed, most binary operations) is implemented as a single inline
function call:
concat_function(EX_VAR(opline->result.var),
GET_OP1_ZVAL_PTR(BP_VAR_R), GET_OP2_ZVAL_PTR(BP_VAR_R));
When gcc looks at this function call, it treats the execution order
for the arguments as undefined. In practice (at least on 3v4l.org) it
ends up evaluating the third argument (OP2) first, then the second
argument (OP1), and finally the first argument (result.var). This is
perfectly legal as far as GCC is concerned since the result of one
function call argument shouldn't have side effects relative to another
argument. I'd like to propose making the order of evaluation defined
by splitting this into separate statements:
zval *op1 = GET_OP1_ZVAL_PTR(BP_VAR_R);
zval *op2 = GET_OP2_ZVAL_PTR(BP_VAR_R);
concat_function(EX_VAR(opline->result.var), op1, op2);
The optimizer should do a proper job of removing the intermediate
assignment while keeping the defined resolution order intact.
God help us if anyone is actually /depending/ on this behavior...
-Sara
Hi!
argument. I'd like to propose making the order of evaluation defined
by splitting this into separate statements:
What is the purpose of this? I.e. why is it important that these notices
would be produced in certain order?
Stas Malyshev
smalyshev@gmail.com
Hi!
----- Original Message -----
From: "Stanislav Malyshev" smalyshev@gmail.com
Sent: Monday, May 11, 2015
Hi!
argument. I'd like to propose making the order of evaluation defined
by splitting this into separate statements:What is the purpose of this? I.e. why is it important that these notices
would be produced in certain order?
Logic, consistency, "polish," etc. :-)
I never noticed this, but seems like a good change to make. I guess GCC,
and other compilers, are just executing the arguments in the order they're
push'ed (x86), which is reasonable and logical...
Stas Malyshev
smalyshev@gmail.com
- Matt
----- Original Message -----
From: "Stanislav Malyshev" smalyshev@gmail.com
Sent: Monday, May 11, 2015argument. I'd like to propose making the order of evaluation defined
by splitting this into separate statements:What is the purpose of this? I.e. why is it important that these notices
would be produced in certain order?Logic, consistency, "polish," etc. :-)
Exactly this. :)
It's just my CDO kicking in. It's not necessary for the language to
function. Heck, HHVM already matches PHP 5.1-7.0's out-of-order
behavior, so fixing PHP only creates double work for me. I just look
at these error messages and twitch uncomfortably.
-Sara
----- Original Message -----
From: "Stanislav Malyshev" smalyshev@gmail.com
Sent: Monday, May 11, 2015argument. I'd like to propose making the order of evaluation defined
by splitting this into separate statements:What is the purpose of this? I.e. why is it important that these notices
would be produced in certain order?Logic, consistency, "polish," etc. :-)
Exactly this. :)It's just my CDO kicking in. It's not necessary for the language to
function. Heck, HHVM already matches PHP 5.1-7.0's out-of-order
behavior, so fixing PHP only creates double work for me. I just look
at these error messages and twitch uncomfortably.
I don't think it is worth the change. I actually prefer to leave code path optimization up to gcc here vs. trying to tell it what to do.
Andi
-Sara
I don't think it is worth the change. I actually prefer to leave code path
optimization up to gcc here vs. trying to tell it what to do.
This isn't about optimizing code paths, it's about changing undefined
behavior into defined behavior. Either way the compiler is left to
optimize and organize code according to its preferences.
But whatever, this is so not something worth arguing over. If you
like PHP being clowny in this minor edge case, I'm not going to get in
your way.
Cheers,
-Sara
Hi all,
----- Original Message -----
From: "Sara Golemon"
Sent: Monday, May 11, 2015
On Mon, May 11, 2015 at 5:12 PM, Matt Wilmas php_lists@realplain.com
wrote:----- Original Message -----
From: "Stanislav Malyshev" smalyshev@gmail.com
Sent: Monday, May 11, 2015argument. I'd like to propose making the order of evaluation defined
by splitting this into separate statements:What is the purpose of this? I.e. why is it important that these notices
would be produced in certain order?Logic, consistency, "polish," etc. :-)
Exactly this. :)
It's just my CDO kicking in. It's not necessary for the language to
function. Heck, HHVM already matches PHP 5.1-7.0's out-of-order
behavior, so fixing PHP only creates double work for me. I just look
at these error messages and twitch uncomfortably.
Hah, looks like this just changed last week after barely 3 weeks. :-P I
didn't verify, just noticed the code change:
http://git.php.net/?p=php-src.git;a=commitdiff;h=c09698753e7d1d95299dca54c8ca888c885fd45b
Andi hasn't objected, yet. :-O And ironically it was part of an
optimization...
Now CONCAT is consistent, but not others. Which places need changing
anyway? Just binary ops in VM?
A couple/few extra instructions are unavoidable, I guess on any
architecture, for the intermediate save of op1, but can anyone show that it
makes ANY measurable difference (other than instruction count)?
For ADD, MUL, IS_[NOT_]IDENTICAL, IS_[NOT_]EQUAL, BW_*, and BOOL_XOR, ops
can be passed as
... result, op2, op1);
with no ill effects (I think) and still have the same (fewest) instructions
as now! That is, if anyone thinks, or can show, that it matters. :-)
Whoops, didn't check first -- I see ADD, MUL, and IS_[NOT_]EQUAL (among
others) are already consistent/fixed (but extra instructions :-D).
-Sara
- Matt
I don't see a big problem changing this for few opcodes in PHP-7 VM, and
make behavior "more defined".
On the other hand I don't think we shouldn't "define" the order of operand
evaluation.
This may prevent us performing optimisations in the future.
Fixing this in PHP-5.* doesn't cost the effort from my point of view.
Thanks. Dmitry.
Hi all,
----- Original Message -----
From: "Sara Golemon"
Sent: Monday, May 11, 2015On Mon, May 11, 2015 at 5:12 PM, Matt Wilmas php_lists@realplain.com
wrote:
----- Original Message -----
From: "Stanislav Malyshev" smalyshev@gmail.com
Sent: Monday, May 11, 2015argument. I'd like to propose making the order of evaluation defined
by splitting this into separate statements:
What is the purpose of this? I.e. why is it important that these notices
would be produced in certain order?Logic, consistency, "polish," etc. :-)
Exactly this. :)
It's just my CDO kicking in. It's not necessary for the language to
function. Heck, HHVM already matches PHP 5.1-7.0's out-of-order
behavior, so fixing PHP only creates double work for me. I just look
at these error messages and twitch uncomfortably.Hah, looks like this just changed last week after barely 3 weeks. :-P I
didn't verify, just noticed the code change:
http://git.php.net/?p=php-src.git;a=commitdiff;h=c09698753e7d1d95299dca54c8ca888c885fd45bAndi hasn't objected, yet. :-O And ironically it was part of an
optimization...Now CONCAT is consistent, but not others. Which places need changing
anyway? Just binary ops in VM?A couple/few extra instructions are unavoidable, I guess on any
architecture, for the intermediate save of op1, but can anyone show that it
makes ANY measurable difference (other than instruction count)?For ADD, MUL, IS_[NOT_]IDENTICAL, IS_[NOT_]EQUAL, BW_*, and BOOL_XOR, ops
can be passed as... result, op2, op1);
with no ill effects (I think) and still have the same (fewest)
instructions as now! That is, if anyone thinks, or can show, that it
matters. :-)Whoops, didn't check first -- I see ADD, MUL, and IS_[NOT_]EQUAL (among
others) are already consistent/fixed (but extra instructions :-D).-Sara
- Matt
Hah, looks like this just changed last week after barely 3 weeks. :-P I
didn't verify, just noticed the code change:
http://git.php.net/?p=php-src.git;a=commitdiff;h=c09698753e7d1d95299dca54c8ca888c885fd45b
Dmitry, what's the reasoning behind this diff in the first place?
Doesn't the compiler fold (<const-string> . <const-string>) already
anyhow? How would we wind up with CONCAT_CONST_CONST at runtime?
Now CONCAT is consistent, but not others. Which places need changing
anyway? Just binary ops in VM?A couple/few extra instructions are unavoidable, I guess on any
architecture, for the intermediate save of op1, but can anyone show that it
makes ANY measurable difference (other than instruction count)?
I'm 90% certain that fixing the ordering (by saving to a temp var)
doesn't actually result in extra instructions. GCC's optimizer is
pretty clever. It'll notice you're only writing to the temp var once,
and reading from it once (immediately), so it'll end up organizing the
instructions in a way which satisfies the explicit order, but as an
inline call.
-Sara
Dmitry, what's the reasoning behind this diff in the first place?
Doesn't the compiler fold (<const-string> . <const-string>) already
anyhow? How would we wind up with CONCAT_CONST_CONST at runtime?
Derp. Looked again, and it's (const OR string) && (const OR
string). I read those as ANDs initially.
But now I'm confused again, because that makes it look like we can
reach this with an expression like "1 . 2" 1 and 2 are consts, but not
strings, so then we get to:
zend_string *op1_str = Z_STR_P(op1);
Which will lead to op1_str pointing at 0x00000001 and a segfault.
What am I missing here?
-Sara
Dmitry, what's the reasoning behind this diff in the first place?
Doesn't the compiler fold (<const-string> . <const-string>) already
anyhow? How would we wind up with CONCAT_CONST_CONST at runtime?Derp. Looked again, and it's (const OR string) && (const OR
string). I read those as ANDs initially.But now I'm confused again, because that makes it look like we can
reach this with an expression like "1 . 2" 1 and 2 are consts, but not
strings, so then we get to:zend_string *op1_str = Z_STR_P(op1);
Which will lead to op1_str pointing at 0x00000001 and a segfault.
What am I missing here?
Okay, answered my own question (I need to be gentler with the [Send] button).
Zend/zend_compile.c sets up an assumption:
if (left_node.op_type == IS_CONST) {
convert_to_string(&left_node.u.constant);
}
if (right_node.op_type == IS_CONST) {
convert_to_string(&right_node.u.constant);
}
So any const nodes passed to CONCAT will always be strings (having
been preconverted), and the if check using an OR make sense, because
if it's CONST, then we KNOW it's a string, and there's no point
testing for it. We won't ever reach CONCAT_CONST_CONST (since that
would be folded in the compiler), but we might reach CONCAT_CONST_CV
or CONCAT_TMPVAR_CONST or some other combination thereof for which
this case is set to handle.
TL;DR - Never mind me. I just didn't think it all the way through.
-Sara
P.S. - An assert(Z_TYPE_P(op1) == IS_STRING); and assert(Z_TYPE_P(op2)
== IS_STRING); does seem reasonable though... Maybe even with a
comment referencing that we can make that assumption because the
compiler fixes up the types.
Dmitry, what's the reasoning behind this diff in the first place?
Doesn't the compiler fold (<const-string> . <const-string>) already
anyhow? How would we wind up with CONCAT_CONST_CONST at runtime?Derp. Looked again, and it's (const OR string) && (const OR
string). I read those as ANDs initially.But now I'm confused again, because that makes it look like we can
reach this with an expression like "1 . 2" 1 and 2 are consts, but not
strings, so then we get to:zend_string *op1_str = Z_STR_P(op1);
Which will lead to op1_str pointing at 0x00000001 and a segfault.
What am I missing here?
Okay, answered my own question (I need to be gentler with the [Send]
button).Zend/zend_compile.c sets up an assumption:
if (left_node.op_type == IS_CONST) {
convert_to_string(&left_node.u.constant);
}
if (right_node.op_type == IS_CONST) {
convert_to_string(&right_node.u.constant);
}So any const nodes passed to CONCAT will always be strings (having
been preconverted), and the if check using an OR make sense, because
if it's CONST, then we KNOW it's a string, and there's no point
testing for it. We won't ever reach CONCAT_CONST_CONST (since that
would be folded in the compiler), but we might reach CONCAT_CONST_CV
or CONCAT_TMPVAR_CONST or some other combination thereof for which
this case is set to handle.
exectly.
anyway CONCAT_CONST_CONST may be removed at all, at least to reduce the
code size.
TL;DR - Never mind me. I just didn't think it all the way through.
-Sara
P.S. - An assert(Z_TYPE_P(op1) == IS_STRING); and assert(Z_TYPE_P(op2)
== IS_STRING); does seem reasonable though... Maybe even with a
comment referencing that we can make that assumption because the
compiler fixes up the types.
ZEND_ASSERT() would be better.
Thanks. Dmitry.
Hi Sara,
https://gist.github.com/dstogov/6a90601872b538d2ddd6
I see no problems committing this (running tests now).
Thanks. Dmitry.
Dmitry, what's the reasoning behind this diff in the first place?
Doesn't the compiler fold (<const-string> . <const-string>) already
anyhow? How would we wind up with CONCAT_CONST_CONST at runtime?Derp. Looked again, and it's (const OR string) && (const OR
string). I read those as ANDs initially.But now I'm confused again, because that makes it look like we can
reach this with an expression like "1 . 2" 1 and 2 are consts, but not
strings, so then we get to:zend_string *op1_str = Z_STR_P(op1);
Which will lead to op1_str pointing at 0x00000001 and a segfault.
What am I missing here?
Okay, answered my own question (I need to be gentler with the [Send]
button).Zend/zend_compile.c sets up an assumption:
if (left_node.op_type == IS_CONST) {
convert_to_string(&left_node.u.constant);
}
if (right_node.op_type == IS_CONST) {
convert_to_string(&right_node.u.constant);
}So any const nodes passed to CONCAT will always be strings (having
been preconverted), and the if check using an OR make sense, because
if it's CONST, then we KNOW it's a string, and there's no point
testing for it. We won't ever reach CONCAT_CONST_CONST (since that
would be folded in the compiler), but we might reach CONCAT_CONST_CV
or CONCAT_TMPVAR_CONST or some other combination thereof for which
this case is set to handle.exectly.
anyway CONCAT_CONST_CONST may be removed at all, at least to reduce the
code size.TL;DR - Never mind me. I just didn't think it all the way through.
-Sara
P.S. - An assert(Z_TYPE_P(op1) == IS_STRING); and assert(Z_TYPE_P(op2)
== IS_STRING); does seem reasonable though... Maybe even with a
comment referencing that we can make that assumption because the
compiler fixes up the types.ZEND_ASSERT() would be better.
Thanks. Dmitry.
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Wednesday, June 10, 2015
Hi Sara,
https://gist.github.com/dstogov/6a90601872b538d2ddd6
I see no problems committing this (running tests now).
Cool. :-) Just FYI, I noticed you didn't update ZEND_BOOL_XOR, at least...
Thanks. Dmitry.
- Matt
and also forgot _DEREF suffix in IDENTICAL/NOT_IDENTICAL opcodes. :(
Thanks. Dmitry.
On Wed, Jun 10, 2015 at 9:45 PM, Matt Wilmas php_lists@realplain.com
wrote:
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Wednesday, June 10, 2015Hi Sara,
https://gist.github.com/dstogov/6a90601872b538d2ddd6
I see no problems committing this (running tests now).
Cool. :-) Just FYI, I noticed you didn't update ZEND_BOOL_XOR, at least...
Thanks. Dmitry.
- Matt
Hi Sara,
----- Original Message -----
From: "Sara Golemon"
Sent: Wednesday, June 10, 2015
On Tue, Jun 9, 2015 at 6:05 AM, Matt Wilmas php_lists@realplain.com
wrote:Hah, looks like this just changed last week after barely 3 weeks. :-P I
didn't verify, just noticed the code change:
http://git.php.net/?p=php-src.git;a=commitdiff;h=c09698753e7d1d95299dca54c8ca888c885fd45bDmitry, what's the reasoning behind this diff in the first place?
Doesn't the compiler fold (<const-string> . <const-string>) already
anyhow? How would we wind up with CONCAT_CONST_CONST at runtime?Now CONCAT is consistent, but not others. Which places need changing
anyway? Just binary ops in VM?A couple/few extra instructions are unavoidable, I guess on any
architecture, for the intermediate save of op1, but can anyone show that
it
makes ANY measurable difference (other than instruction count)?I'm 90% certain that fixing the ordering (by saving to a temp var)
doesn't actually result in extra instructions. GCC's optimizer is
pretty clever. It'll notice you're only writing to the temp var once,
and reading from it once (immediately), so it'll end up organizing the
instructions in a way which satisfies the explicit order, but as an
inline call.
I don't know... Yeah, if op1 is coming from a "plain" variable, either
something simple (like CONST?) or after function inlining. e.g. Like
"directly" accessible, then it should be able to be done the same. (I don't
know a lot about any of this though.)
I was thinking of when some sort of function call is involved (as far as
unavoidable extra instructions). If op1 is fetched after op2, it's
optimized for passing to concat_function(), etc. Doing stuff in reverse
order will need fewer instructions than "human order," at least when
function calls are needed for "stuff."
Hmm, maybe that's only for x86 with push'ed args, where each thing can keep
push'ing whatever it needs (eax) for the final (outer) function... x86_64
is I guess still going to need to store stuff "out of the way" for the next
(inner) call, before moving it back to the necessary registers for the final
function! If that's true, maybe x86_64 doesn't need more instructions.
I think I've confused myself more trying to picture what's happening. :-O
-Sara
- Matt
On Tue, Jun 9, 2015 at 6:05 AM, Matt Wilmas php_lists@realplain.com
wrote:Hah, looks like this just changed last week after barely 3 weeks. :-P I
didn't verify, just noticed the code change:http://git.php.net/?p=php-src.git;a=commitdiff;h=c09698753e7d1d95299dca54c8ca888c885fd45b
Dmitry, what's the reasoning behind this diff in the first place?
Inlining of fast path and omitting check for Z_TYPE_P() for IS_CONST
operands.
Doesn't the compiler fold (<const-string> . <const-string>) already
anyhow? How would we wind up with CONCAT_CONST_CONST at runtime?
It's probably makes sense to try to do this for most binary/unary opcodes.
I'll think about it.
Now CONCAT is consistent, but not others. Which places need changing
anyway? Just binary ops in VM?A couple/few extra instructions are unavoidable, I guess on any
architecture, for the intermediate save of op1, but can anyone show that
it
makes ANY measurable difference (other than instruction count)?I'm 90% certain that fixing the ordering (by saving to a temp var)
doesn't actually result in extra instructions. GCC's optimizer is
pretty clever. It'll notice you're only writing to the temp var once,
and reading from it once (immediately), so it'll end up organizing theIt
won't take more than a half an hour.instructions in a way which satisfies the explicit order, but as an
inline call.
I also think, this won't affect performance, and I may check this, creating
a patch for all binary operators.
Anyway, I don't object against this change, but I think, the order of
operands evaluation must not be defined (by PHP specification).
Thanks. Dmitry.
-Sara