Hi internals
I'd like to announce the match expression RFC that replaces the switch
expression RFC I announced a few weeks ago.
New: https://wiki.php.net/rfc/match_expression
Old: https://wiki.php.net/rfc/switch_expression
Due to the feedback I decided to take the RFC in a slightly different
direction. The main objections were:
- It didn't behave like the traditional switch so it shouldn't use
the switch keyword
- It didn't fix type coercion
- The RFC was poorly structured
In hindsight I understand why I had a hard time writing the RFC. I
tried making a case against the switch statement while really not
addressing the switch statement at all. The new RFC proposes a match
expression that fixes all the objections raised against the switch
statement. Additionally, match arms can now contain statement lists
which allows you to use the match expression anywhere the switch
statement would have been used previously.
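To make the type coercion objection concrete, here is a minimal sketch. It assumes the objection refers to switch comparing loosely (==); the function name describe_with_switch is hypothetical and not taken from the RFC.

```php
<?php
// Hypothetical helper: demonstrates that switch compares with loose (==) semantics.
function describe_with_switch($value): string
{
    switch ($value) {
        case 1:
            return 'matched int 1';
        default:
            return 'no match';
    }
}

// The numeric string '1.0' is not identical to the integer 1, but switch
// compares loosely, so the first case is still taken.
$loose = describe_with_switch('1.0');

// A strict comparison tells the two values apart.
$strict = ('1.0' === 1);
```

A construct that compares strictly would fall through to the default arm for '1.0' instead.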
While some people have suggested statement lists aren't necessary I
don't think it makes sense to raise objections against the switch
statement without offering an alternative.
I also experimented with pattern matching but decided against it. The
exact reason is described in the RFC.
I'm now confident this is the right approach. I hope you will be
happier with this proposal.
Happy Easter!
Hello Ilija,
I think this is a step in the right direction compared to your previous
attempt.
Your reasoning for not including pattern matching in this RFC also makes
sense.
Moreover, this could be added at a future time when it may make more sense.
One small decision that I personally disagree with is the usage of =>
compared to using the colon as the switch statement does.
As for me => means assigning a value to a key which isn't the case here
IMHO.
Also, combining this with the less-than-or-equal and greater-than-or-equal
binary operators makes for a rather confusing line in my eyes.
E.g. in Zend/tests/match/004.phpt in your PR, I can't really see where the
condition ends and where the 'case' starts:
Best regards
George P. Banyard
Hi George
I appreciate your feedback.
One small decision that I personally disagree with is the usage of =>
compared to using the colon as the switch statement does.
I picked the => symbol because IMO the extra spacing and wider
symbol visually separates the two sides better. The difference isn't
huge, I don't have a strong preference either way.
Also, combining this with less/greater or equal than binary operators makes for a rather confusing line in my eyes.
I guess that depends on the lhs. For example, using the ternary
operator has the opposite effect.
match (true) {
// =>
$cond ? $foo : $bar => $val, // Better
$val >= 42 => $val, // Worse
// :
$cond ? $foo : $bar: $val, // Worse
$val >= 42: $val, // Better
}
Regards,
Ilija
Hello Ilija,
I like this new approach much more. Just a couple of questions:
Would it be possible for blocks to require a return statement instead of raising an error?
$y = match($x) {
0 => 'a',
1 => {
foo();
return 'b'; // OK
}
2 => {
var();
} // Error, return statement is required
}
Would it be feasible to solve the fallthrough problem by using “continue” when you really want to chain “cases”?
match ($x) {
0 => {
// Only for 0
continue; // Same as omitting a break in a traditional switch
},
1 => {
// Same for 0 and 1
}
}
Regards,
Iván Arias.
Hi Iván
Would it be possible for blocks to require a return statement instead of raising an error?
It would be possible for the blocks to return values (though not with
the return keyword). I created a small prototype a few weeks ago:
$result = match ($x) {
0 => {
echo 'Foo';
echo 'Bar';
'Baz'
},
...
};
https://github.com/php/php-src/compare/master...iluuu1994:block-expression
Heavily inspired by Rust:
https://doc.rust-lang.org/reference/expressions/block-expr.html
While this would solve the problem perfectly, it feels a
little foreign in PHP. Either way, this is something that can be
discussed and implemented in a separate RFC.
Would it be feasible to solve the fallthrough problem by using “continue” when you really want to chain “cases”?
Yes, that would be feasible. I personally just don't see a huge
benefit of fallthrough. If we ever implemented pattern matching it
could become unfeasible again (or at least require additional sanity
checks):
match ($value) {
Foo { foo: $foo } => {
fallthrough;
}
Bar { bar: $bar } => {
// $bar is unbound
}
}
Ilija
Would it be possible for blocks to require a return statement instead of raising an error?
It would be possible for the blocks to return values (though not with
the return keyword). I created a small prototype a few weeks ago:
$result = match ($x) { 0 => { echo 'Foo'; echo 'Bar'; 'Baz' }, ... };
https://github.com/php/php-src/compare/master...iluuu1994:block-expression
Heavily inspired by Rust:
https://doc.rust-lang.org/reference/expressions/block-expr.html
While this would solve the problem perfectly, it feels a
little foreign in PHP. Either way, this is something that can be
discussed and implemented in a separate RFC.
My only concern about blocks is that it feels a bit odd that they can only be used
under some circumstances. I think I would prefer to simply not include them
for now, and discuss these “block expressions” in another RFC, as they seem to
be an independent language feature.
Regards,
Iván Arias.
Hi Ilija,
Thanks for continuing to work on this, I think the new syntax feels much more natural when introduced with a new keyword.
Additionally, match arms can now contain statement lists
which allows you to use the match expression anywhere the switch
statement would have been used previously.
I quite strongly disagree with this part, however. I don't think it's necessary for the new keyword to be a replacement for every switch statement, any more than switch replaces every if statement or vice versa, and doing so adds a lot of complexity to the proposal.
In languages where expression blocks with an implicit return are a standard feature, the combination works naturally, but without that we have to add a bunch of special rules:
- Can match arms contain other flow control, such as "return" or "continue"? What if we later want to give those special meaning?
- Can I mix a bare value and a block as arms of the same match, or does the whole construct need to be in either "expression form" or "statement form"? Will this restriction be easy to explain to users?
- Can I omit the trailing semicolon even in "expression form", i.e. if my branches have no braces around, as long as I don't use the result? What will the error message look like if I get this rule wrong?
I also think the use of => and , feels natural when pairing an input to an output, but odd when pairing a test to a block of imperative code. In the "expression form" (no braces, right hand side is an expression evaluated if the branch matches), it matches array syntax (right hand side is an expression immediately evaluated as a value) and short closure syntax (right hand side is an expression evaluated when function is executed).
In the "statement form", its closest relative would be if-elseif-else, but only if we wrote those like this:
if ($score==10) => {
echo "Congratulations";
},
elseif ($score==9) => {
echo "Nearly there";
},
else => {
echo "Keep trying";
}
Switch statements are different again, but their ancestor is not if blocks but goto labels, which explains both the syntax and the fall-through behaviour:
goto "case_$score" else "default"; // imaginary dynamic goto becomes "switch ($score)"
case_10:
echo "Congratulations";
goto end; // becomes "break"
case_9:
echo "Nearly there";
goto end;
default:
echo "Keep trying";
goto end;
end: // becomes closing brace of switch block
The inability to nest ternary ?: without parentheses makes the single expression form particularly useful in PHP, so I'd love to see the RFC focus on that.
We can always add statement blocks in later, either as a special case as currently proposed, or because we're adding them elsewhere, and perhaps with a way to return a value rather than a requirement not to use it.
Regards,
--
Rowan Tommins
[IMSoP]
Hi Rowan
Can match arms contain other flow control, such as "return" or "continue"?
Yes and yes. They behave exactly the same as in the switch. This
should definitely be described in the RFC.
Can I mix a bare value and a block as arms of the same match, or does the whole construct need to be in either "expression form" or "statement form"?
Yes, you can mix blocks and expressions given you don't use the return
value of the whole match expression.
Can I omit the trailing semicolon even in "expression form"
Yes. To clarify what's going on here: There's no statement variant of
the match expression. PHP already allows using any expression as a
statement.
https://github.com/php/php-src/blob/422c8390a01ce9b6f612e0183bd0e4500c0608be/Zend/zend_language_parser.y#L446
That's why you can do something like "10 + 20;". PHP will just discard
the result value. The same thing is happening here. The only
difference is that I added a rule to the grammar to allow dropping the
semicolon in this case because I noticed that I kept forgetting to add
it in my tests.
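As a small illustration of the point above that any expression can already be used as a statement, the following sketch runs on current PHP; the function compute() is hypothetical and exists only for the example.

```php
<?php
// Hypothetical function used only for illustration.
function compute(): int
{
    return 10 + 20;
}

10 + 20;     // a bare expression used as a statement; the value is discarded
compute();   // same here: the call runs, the return value is thrown away

$kept = compute(); // the result is only retained when it is actually used
```

Both discarded forms still need their trailing semicolon today, which is the rule the proposed grammar change would relax for a standalone match.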
We can always add statement blocks in later
I think we'll never completely agree on this point, and that's fine.
As mentioned in my announcement IMO it doesn't make sense criticizing
the switch statement but then not offering an alternative.
Thanks for your input!
Ilija
Can match arms contain other flow control, such as "return" or "continue"?
Yes and yes. They behave exactly the same as in the switch. This
should definitely be described in the RFC.
You didn't quote the second part of that question, which admittedly was
a bit rhetorical, but important: allowing these rules out using either
of those keywords in future extensions of this syntax. That means if we
want to add a way to get a value out of a statement block, we can't use
"return"; and if we want to add explicit fallthrough, we can't use
"continue".
Also note that in switch, "continue" and "break" are actually synonyms,
both jumping to the end of the switch block (see also
https://wiki.php.net/rfc/continue_on_switch_deprecation). I'm not sure
that's actually all that intuitive or useful in the proposed match
statement.
Can I mix a bare value and a block as arms of the same match, or does the whole construct need to be in either "expression form" or "statement form"?
Yes, you can mix blocks and expressions given you don't use the return
value of the whole match expression.
Again, you skipped the second part of the question - will this make
sense to users? What error message will they get if they write this:
$x = match($y) { 1 => "Hello", 2 => "Goodbye", 3 => { throw new Exception; } };
(If https://wiki.php.net/rfc/throw_expression passes, then "3 => throw
new Exception" without the braces will presumably be valid, but that
just makes it even more confusing, IMO.)
Can I omit the trailing semicolon even in "expression form"
Yes.
So this would be valid PHP?
$x = match($y) { 1 => "Hello", 2 => "Goodbye" } echo $y;
To clarify what's going on here: There's no statement variant of
the match expression [...]
I don't think users will see it that way. An "expression" where you
can't use the result (but can use flow control like "throw" and
"return") feels very much like a statement, regardless of how the parser
defines it.
... because I noticed that I kept forgetting to add it in my tests.
I think needing to add a special case to the grammar is a sign that the
proposed syntax doesn't fit well with the rest of the language. Did you
ever forget to put the semicolon when using it in expression context, or
only when using statement blocks?
As mentioned in my announcement IMO it doesn't make sense criticizing
the switch statement but then not offering an alternative.
I don't think this is true. If you were proposing "foreach", you might
criticize the "for" statement, but that doesn't mean you'd extend
"foreach" so that it could replace every single "for" statement.
To reiterate, I think the match expression - with no statement blocks -
stands up really well as a replacement for those switch statements where
you're deciding on a value, and for nested ternaries (which are pretty
much unused in PHP due to the broken precedence).
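As a rough sketch of that "deciding on a value" use case, the example below compares a parenthesized nested ternary with the expression-only form of match. It assumes a PHP version that provides match (8.0 or later); the function names and thresholds are hypothetical.

```php
<?php
// Hypothetical example; requires PHP 8.0+ for match.
function size_label_ternary(int $items): string
{
    // Nested ternary: explicit parentheses are needed to group it as intended.
    return $items === 0 ? 'empty' : ($items < 10 ? 'small' : 'large');
}

function size_label_match(int $items): string
{
    // match (true) selects the first arm whose condition is identical to true.
    return match (true) {
        $items === 0 => 'empty',
        $items < 10 => 'small',
        default => 'large',
    };
}
```

Neither version needs a statement block, which is the subset of the proposal being argued for here.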
Other proposals can replace other uses of switch, or switch can live on
with more limited use - after all, we even have a "goto" statement in
PHP, for those rare occasions where none of the built-in flow control
is good enough.
To respond to something you said to Larry:
People can't be forced into
writing clean code and my fear here is that they will just revert back
to using what is more convenient.
That's their loss! Whether or not "match" supports blocks, some people
won't use it, just as some people write unnecessarily complicated if
statements because they don't spend the time structuring them with
elseif or switch.
My concern is that if match supports blocks, but the syntax and
behaviour is unintuitive, we'll be stuck with another feature that
nobody likes very much but we can't easily replace. Worse, the RFC might
lose support based on that part, and we'd miss out on getting a really
nice new expression.
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
Hi Rowan
That means if we want to add a way to get a value out of a statement block,
we can't use "return"; and if we want to add explicit fallthrough, we can't
use "continue".
And we probably shouldn't do that. return means return from the
function, it makes little sense to change it in one specific context.
We'd either need a different keyword (like pass) or some syntax that
doesn't require a keyword at all (like the block expressions I've
mentioned). The same goes for continue. A keyword like fallthrough
would be much more fitting.
Again, you skipped the second part of the question - will this make sense to
users? What error message will they get if they write this:
$x = match($y) { 1 => "Hello", 2 => "Goodbye", 3 => { throw new Exception; } };
It's described in the RFC:
//> Fatal error: Match expressions that utilize the result value can't contain statement lists
So this would be valid PHP?
$x = match($y) { 1 => "Hello", 2 => "Goodbye" } echo $y;
No. A semicolon is needed. Here the semicolon belongs to the
assignment, not the match. The parsing would be done somewhat like
this:
- Expression statement <- Requires a semicolon
- Assignment
- $x
- match
- Assignment
The only case where expression statement doesn't require a semicolon is here:
- Expression statement <- Doesn't require a semicolon
    - match
Because match is found directly under the expression statement.
I think needing to add a special case to the grammar is a sign that the
proposed syntax doesn't fit well with the rest of the language.
I find it odd that people keep referring to Rust but apparently have
no clue how it works. This is precisely what they do.
https://doc.rust-lang.org/reference/statements.html#expression-statements
ExpressionStatement :
ExpressionWithoutBlock ;
| ExpressionWithBlock
A lot of our last conversation is repeating and very little new has
been said. The only productive thing is to move on. I think this RFC
satisfies your needs. It has some extra parts you don't need, and
that's fine.
Thanks again for taking the time to read the RFC and provide your feedback!
Ilija
It's described in the RFC:
//> Fatal error: Match expressions that utilize the result value can't contain statement lists
Thanks, I missed that. Reasonably clear, I guess, although I imagine there'll be questions on Stack Overflow asking what a "statement list" is.
So this would be valid PHP?
$x = match($y) { 1 => "Hello", 2 => "Goodbye" } echo $y;
No. A semicolon is needed. Here the semicolon belongs to the
assignment, not the match.
...
The only case where expression statement doesn't require a semicolon is
here:
- Expression statement <- Doesn't require a semicolon
    - match
Right, so I go back to my earlier point that there are basically two forms of match proposed:
- one where you can use it anywhere an expression is expected, with no statement blocks
- one which can be used as a standalone statement, which can contain statement blocks, and which doesn't need a trailing semicolon
I think needing to add a special case to the grammar is a sign that the
proposed syntax doesn't fit well with the rest of the language.
I find it odd that people keep referring to Rust but apparently have
no clue how it works. This is precisely what they do.
https://doc.rust-lang.org/reference/statements.html#expression-statements
I didn't mention Rust, and am fully aware that I don't know how its syntax works. What I said was that the proposed syntax doesn't fit into PHP easily.
Two things stand out to me following that link:
Firstly, block expressions are a foundational part of the Rust language, not a special case used in only one place. They're used as the basis for everything from if statements to async blocks, all of which evaluate to a value. That's a really neat design feature, but not something that exists in PHP.
It also makes the semicolon rule somewhat more natural: it makes all the control flow blocks in Rust feel more natural to developers coming from other languages. It's not an ad hoc rule for one expression, it's a general rule across the language.
Secondly, the page you linked calls out an ambiguity caused by omitting the semicolon, which would apply equally in PHP:
match($x) { 1 => ['foo', 'bar'] }
[0];
It then explains the important restriction of only omitting the semicolon if the result is void/null rather than a concrete value. An equivalent in PHP would be to only allow omitting the semicolon if every arm of the match is a statement list, further distinguishing the expression and statement forms.
I think this RFC
satisfies your needs. It has some extra parts you don't need, and
that's fine.
Just to be clear, that may be fine with you, but if I had a vote, I would probably vote no to the current proposal, because I actively dislike the proposed syntax and semantics of statement blocks. They "poison the barrel" for me. That's a real shame, because I really like the core of the proposal.
Regards,
--
Rowan Tommins
[IMSoP]
Firstly, block expressions are a foundational part of the Rust
language, not a special case used in only one place.
From a previous e-mail:
It would be possible for the blocks to return values (though not with
the return keyword). I created a small prototype a few weeks ago:
https://github.com/php/php-src/compare/master...iluuu1994:block-expression
Heavily inspired by Rust:
https://doc.rust-lang.org/reference/expressions/block-expr.html
While this would solve the problem perfectly, it feels a
little foreign in PHP. Either way, this is something that can be
discussed and implemented in a separate RFC.
We can't do everything at once. The current block syntax is completely
compatible with that suggestion. All we'd have to do is relax the
restriction of only allowing blocks when the result isn't used.
it makes all the control flow blocks in Rust feel more natural to
developers coming from other languages.
Same thing here. If you don't use the result value, it looks just like
the switch. That's why I'd feel odd to require a semicolon.
the page you linked calls out an ambiguity caused by omitting the
semicolon
This code is parsed like this:
match($x) { 1 => ['foo', 'bar'] }
[0];
- statement list
    - match expr
    - array literal
Note that we can't have ambiguous grammar since Bison would
immediately expose it.
It then explains the important restriction of only omitting the
semicolon if the result is void/null rather than a concrete value.
That's true. Unfortunately we don't have this info at compile time.
That's why we throw away the expression result just like with any
other expression statement.
I would probably vote no to the current proposal
I understand. Everybody is entitled to their opinion. If the vote
fails that doesn't mean we'll just completely give up. I would
reiterate and try again. Sadly, it's very hard to get a feel for how
many people have which opinion through the mailing list.
Ilija
It would be possible for the blocks to return values (though not with
the return keyword).
[...]
We can't do everything at once.
Then why not introduce the match expression first, and add block
expressions (as a general concept) later? That way, we don't need all
the special restrictions on when to use them, and can work out a
consistent set of rules for them which can apply language-wide, rather
than being constrained later by decisions we make half-implementing them
now.
The ability to use them with short closure syntax would be very
interesting, if the scoping/capture rules could be agreed.
it makes all the control flow blocks in Rust feel more natural to
developers coming from other languages.
Same thing here. If you don't use the result value, it looks just like
the switch. That's why I'd feel odd to require a semicolon.
Not really. In Rust, it's about making a whole bunch of control
structures, which are already consistent with each other, more user
friendly (but still consistent). In your proposal, it's about making one
specific control structure look slightly more like other syntaxes in the
same language, by adding a special case which applies nowhere else.
To be specific, in PHP, there is no "omitted semicolon" at the end of
"if ($x) { y(); }" - that is just how the syntax is defined. There's no
need to have a parser rule to allow it to look like C or Java, because
it already does.
This code is parsed like this:
match($x) { 1 => ['foo', 'bar'] } [0];
- statement list
    - match expr
    - array literal
Note that we can't have ambiguous grammar since Bison would
immediately expose it.
I'm not sure what you mean by "we can't have ambiguous grammar" - the
code above is ambiguous, in that it could be defined to mean one of
two things; what you've shown is which of the two your implementation
chooses.
That resolution makes sense for that example, but consider some others:
// As proposed, parsed as two meaningless expression-statements; as a
// normal expression, would select and then invoke a callable
match($x) { 1 => 'print', 2 => 'var_dump' }
($x);
// Invalid under the proposed rule, but a valid method call if match was
// always a normal expression
match($x) { 1 => new Foo, 2 => new Bar }
->doSomething();
To be fair, the same limitation exists for closures, where this is invalid:
function($x) { echo $x; } ("Hello");
and you have to write this instead:
(function($x) { echo $x; }) ("Hello");
So I guess the above would follow the same rule to force expression context:
(match($x) { 1 => 'print', 2 => 'var_dump' })($x);
(match($x) { 1 => new Foo, 2 => new Bar })->doSomething();
The only downside is that omitting the parentheses wouldn't be an error,
as with the closure case, it would just silently do nothing, which might
be rather confusing. Dropping the special semicolon handling would mean
it either did the expected thing or gave a parse error straight away.
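The parenthesized forms above can be tried out in a sketch like the following. It assumes PHP 8.0 or later for match; the callables 'strtoupper' and 'strrev' are swapped in for 'print' and 'var_dump' because print is a language construct and cannot be invoked through a string.

```php
<?php
// Wrapping the closure in parentheses puts it in expression context,
// so it can be invoked immediately (valid since PHP 7.0).
$greeting = (function ($x) { return "Hello, $x"; })('internals');

// Hypothetical selector, requires PHP 8.0+: the parenthesized match
// evaluates to a function name, which is then called with 'abc'.
$x = 1;
$result = (match ($x) { 1 => 'strtoupper', 2 => 'strrev' })('abc');
```

With the parentheses in place there is only one way to read each line, regardless of how the trailing-semicolon rule is defined.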
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
With all the humility of the world and without wanting to be exhaustive
about this, my only question is why can't we keep it as a switch, instead
of creating a new keyword?
$x = switch ($y) {
case 0: return 1;
case 1: return 20;
// default: return null;
};
I say this because, in the future, we could do something similar for if():
$x = if ($y instanceof Foo) {
return $y->bar();
} else if ($y instanceof Bar) {
return $y->foo();
}; // else { return null; }
Or even for loops:
$x = foreach ($items as $item) {
yield $item;
}; // $x == $items
So we would not need to create a new keyword for each equivalent type when
used as an expression.
The only reason would be something like "switch has no return support", but
this could happen exclusively in this scope, just as "use ()" exists only
in functions and not in methods, which are basically the same thing,
however, in different scopes. Likewise, "break 2" would be impossible in
this case, so just issue an error saying "break n is not supported in this
situation".
Kind regards,
David Rodrigues
On Mon, 13 Apr 2020 at 11:49, Rowan Tommins rowan.collins@gmail.com
wrote:
[...]
With all the humility of the world and without wanting to be exhaustive
about this, my only question is why can't we keep it as a switch, instead
of creating a new keyword?
That was the original proposal. The main issue came down to it being too much different functionality crammed into one bit of syntax, where lots of subtle little context would change the meaning and what you had to do. That's difficult for users, and for the parser.
Fundamentally, this is not a switch statement. I think it is an error to equate it to or relate it to a switch statement. It's a more powerful, multi-branch version of the ternary operator, and should be conceptualized as such. And conceptualized that way, I really want it. :-)
Piggybacking off of if makes slightly more sense than off of switch, but it then involves a lot more syntax and a lot more typing, plus the confusion about when it needs a semi-colon and when it doesn't.
A new keyword creates a mental separation that is beneficial to users, and allows it to be implemented with a minimum of syntax and no conditional syntax (where certain sigils are needed only in some situations and are an error in others).
On Mon, 13 Apr 2020 at 11:49, Rowan Tommins rowan.collins@gmail.com
wrote:
It would be possible for the blocks to return values (though not with
the return keyword).
[...]
We can't do everything at once.
Then why not introduce the match expression first, and add block
expressions (as a general concept) later? That way, we don't need all
the special restrictions on when to use them, and can work out a
consistent set of rules for them which can apply language-wide, rather
than being constrained later by decisions we make half-implementing them
now.
The ability to use them with short closure syntax would be very
interesting, if the scoping/capture rules could be agreed.
I really feel like this is the best solution. Multi-line expressions are not a logical one-off part of match(). They may or may not be a good idea for PHP generally, in which case they should be proposed and thought through generally. If they do end up being a good move generally, they'll apply to match naturally just like everywhere else; if not, then they wouldn't confuse people by being a one-off part of match() and nowhere else.
The various discussions about parse logic when the semicolon is there or not only highlight that it is not a fully-developed concept. There's a lot more thought and design that needs to go into multi-line expressions that is not in scope for a match expression; embedding statements inside an expression is just weird...
Please, pull that part out. It's the most (only?) controversial part right now. I suspect (note: no data) that the RFC would pass almost unanimously without it (which is good; I want this RFC to pass). With it, it may pass or not, but it would be a weaker proposal.
At the very least bump it to a secondary add-on vote.
--Larry Garfield
Please, pull that part out.
Guys. You've made your opinions loud and clear. But I'm not just
building this for you, I'm also building it for others and myself.
With it, it may pass or not, but it would be a weaker proposal.
Maybe, maybe not. You're assuming people will share your opinion but
that is not just a given. We probably won't find out until this is put
to a vote.
Ilija
Hi Rowan
Then why not introduce the match expression first, and add block
expressions (as a general concept) later?
Mainly because I fear it won't pass and we'll end up with a
handicapped match expression that can't be used half the time.
That way, we don't need all the special restrictions on when to use them
I think that's a little overblown. We're talking about ~5 lines here.
the code above is ambiguous
Whether or not some code is ambiguous depends on the grammar that
defines how it's interpreted. "1 + 2 * 3" can be ambiguous if the
precedence isn't clearly defined. Bison will check that your grammar
is unambiguous, otherwise you'll get a shift/reduce or reduce/reduce
error (as in there's multiple ways to parse your code). The suggested
grammar defines that the semicolon can be dropped under exactly one
circumstance: When the match is the first and thus only element of an
expression statement. This is what Rust does too. This clearly needs
to be described better in the RFC.
The only downside is that omitting the parentheses wouldn't be an error,
as with the closure case, it would just silently do nothing, which might
be rather confusing.
That is true. Since array subscripts and function calls don't allow
arbitrary lhs expressions the code wouldn't work without parentheses
but at least you'd get a parser error.
Ilija
I love everything about this, except for one deal killer.
I love the use of =>. And I love the blocks.
Plus I love that it can fully replace switch because of how using switch can easily allow a developer's oversight to introduce subtle bugs.
The deal killer IMO is this:
"match was added as a keyword (reserved_non_modifiers). This means it can't be used in the following contexts anymore:
• namespaces
• class names
• function names
• global constants"
-Mike
Hi internals
I'd like to announce the match expression RFC that replaces the switch
expression RFC I announced a few weeks ago.
New: https://wiki.php.net/rfc/match_expression
Old: https://wiki.php.net/rfc/switch_expression
Due to the feedback I decided to take the RFC in a slightly different
direction. The main objections were:
- It didn't behave like the traditional switch so it shouldn't use
the switch keyword
- It didn't fix type coercion
- The RFC was poorly structured
In hindsight I understand why I had a hard time writing the RFC. I
tried making a case against the switch statement while really not
addressing the switch statement at all. The new RFC proposes a match
expression that fixes all the objections raised against the switch
statement. Additionally, match arms can now contain statement lists
which allows you to use the match expression anywhere the switch
statement would have been used previously.
While some people have suggested statement lists aren't necessary I
don't think it makes sense to raise objections against the switch
statement without offering an alternative.
I also experimented with pattern matching but decided against it. The
exact reason is described in the RFC.
I'm now confident this is the right approach. I hope you will be
happier with this proposal.
Happy Easter!
This is a substantial improvement over the previous RFC, thank you!
- The use of => is fine with me. I'm not bothered by it, but I'm also open to other options like : if there's a strong consensus otherwise.
- I am also still against block statements in the right-hand side, largely for the same reasons as Rowan. They're unnecessary and lead to confusion. In my mind, match expressions are more like a multi-branch ternary than a switch statement.
When I say unnecessary, I mean that very literally. Because it can take any expression, that means it can simply execute a pre-defined function or anonymous function.
function b() {
// Something multi-line and complicated.
}
$c = function ($x) {
// Something multi-line and complicated.
};
match ($x) {
1 => 'A',
2 => b(),
3 => $c($x),
};
That not only alleviates the need to support multi-line blocks, it keeps the match statement itself clearer to understand at-a-glance, and encourages the definition of named, testable, small blocks of code (ie, functions whether anonymous or not), which is a net win on its own. The overall effect is the same.
And if you really, really need to inline the block for some reason, this is ugly but syntactically valid:
match ($x) {
1 => 'A',
2 => b(),
3 => function() use ($whatever) {
// Something multi-line and complicated.
}(),
};
(Incidentally, the same "factor complex expressions out to a function and everyone is better off" applies to the left hand side, too. That's a good thing.)
I would strongly recommend removing the block statement support, as it just muddies the water.
- I'm fine with match() always being strict comparison, regardless of the type mode. However, it should probably be made more explicit in the RFC that it is unaffected by the type mode.
- Possible addition: Make the match input optional, defaulting to "true". That is:
match (true) {
$x < 5 => 'A',
$x > 10 => 'B',
$x == 7 => 'C',
default => 'D',
}
Can be shortened to:
match {
$x < 5 => 'A',
$x > 10 => 'B',
$x == 7 => 'C',
default => 'D',
}
Since it's an expression it still has all the values available from the scope, so it's just as capable and saves a little typing in that case.
--Larry Garfield
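To make the function-call alternative above concrete, here is a minimal sketch. It assumes a PHP version where match is available (8.0 or later) and uses no block arms. The names describe() and $prefix and the bodies of the helpers are made up for illustration. One detail worth noting: an inline closure that is invoked immediately has to be wrapped in parentheses before the call, otherwise PHP reports a parse error.

```php
<?php
// Hypothetical helper standing in for "something multi-line and complicated".
function b(): string {
    $parts = ['B', 'from', 'function'];
    return implode(' ', $parts);
}

// Hypothetical wrapper so the match result can be returned and inspected.
function describe(int $x): string {
    $prefix = 'C';

    // Closure stored in a variable, as in the example above.
    $c = function (int $x) use ($prefix): string {
        return $prefix . ' for ' . $x;
    };

    return match ($x) {
        1 => 'A',
        2 => b(),
        3 => $c($x),
        // Inline closure invoked immediately; the wrapping parentheses are required.
        4 => (function () use ($prefix): string {
            $tmp = strtolower($prefix);
            return $tmp . ' inline';
        })(),
        default => 'other',
    };
}

echo describe(4), "\n";
```

Every arm stays a single expression, so the match itself remains easy to scan even when one branch needs a temporary variable.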
Hi Larry
That not only alleviates the need to support multi-line blocks, it keeps the match statement itself clearer to understand at-a-glance, and encourages the definition of named, testable, small blocks of code (ie, functions whether anonymous or not), which is a net win on its own.
I'm generally not a fan of this approach. I think it's unrealistic to
expect users to move code into a separate function just because it
takes two lines of code instead of one. People can't be forced into
writing clean code and my fear here is that they will just revert back
to using what is more convenient.
I'm fine with match() always being strict comparison, regardless of the type mode. However, it should probably be made more explicit in the RFC that it is unaffected by the type mode.
That's a good point, I will add it to the RFC.
Possible addition: Make the match input optional, defaulting to "true".
This has been proposed before. I do like it but I'm not sure if other people do.
I would strongly recommend removing the block statement support, as it just muddies the water.
As mentioned in my e-mail to Rowan, we'll probably never agree on this
point. At the end of the day, you might be getting a feature you don't
have to use. It barely adds any complexity to the language but will
make other users happy.
Thanks for your input!
Ilija
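The point about strict comparison being independent of the type mode can be illustrated with a small sketch. It assumes PHP 8.0 or later, where match compares with ===, and the function names viaSwitch() and viaMatch() are hypothetical. The file deliberately does not declare strict_types, so it runs in the default weak type mode.

```php
<?php
// No declare(strict_types=1): this file runs in the default (weak) type mode.

// switch compares with ==, so the string '1' matches the integer 1.
function viaSwitch(mixed $value): string {
    switch ($value) {
        case 1:
            return 'int one';
        default:
            return 'no match';
    }
}

// match compares with ===, so the string '1' does not match the integer 1.
function viaMatch(mixed $value): string {
    return match ($value) {
        1 => 'int one',
        default => 'no match',
    };
}

echo viaSwitch('1'), "\n"; // prints "int one"
echo viaMatch('1'), "\n";  // prints "no match"
```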
Hi Ilija,
Ilija Tovilo wrote:
Hi internals
I'd like to announce the match expression RFC that replaces the switch
expression RFC I announced a few weeks ago.
New: https://wiki.php.net/rfc/match_expression
I like this proposal in general, I've wanted something like Haskell and
Rust's case/match statements for years but never got round to proposing
them. I do have some comments and concerns however:
- While I like the Rust-like feature of being able to use it both as a
  statement and as an expression, I share others' concern that it is
  inconsistent with PHP's existing statements. I would suggest either
  an RFC to apply this to other statements, or to take it out for now.
- It is a shame you don't suggest taking this opportunity to add pattern
  matching. We already have destructuring assignment syntax in PHP,
  albeit only for arrays, so why not support pattern-matching for arrays
  in match right now? I guess variable scope might be awkward (matched
  values would leak into function scope), but it is for other statements
  already. If we do not add pattern matching right now, perhaps we can
  design the syntax so it doesn't prevent adding it later? Using a
  keyword prefix (e.g. let) in future means we end up with two kinds
  of arm (one with === behaviour, one with pattern-matching behaviour),
  but wouldn't it be simpler to have just one?
- Relatedly, it is a shame not to have guards. switch (true) is not a
  very nice idiom, match (true) is worse (no handling of truthiness),
  and neither provides the full power that patterns + guards together
  provide, where you can neatly combine pure value matching with
  arbitrary boolean checks. Sure, you can write ($foo === BAR) =>,
  but you might as well use an if-statement then. I shouldn't let
  perfect be the enemy of the good, but I think there's so much more
  that could be done here. Note that even the limited possible patterns
  without inventing new syntax (a plain variable, and an array pattern)
  are more useful if guards are available.
- It is unfortunate if we have to add another reserved word… I don't
  know if there'd be a way to do this with switch that isn't
  confusable with the existing statement, though.
Thanks,
Andrea
Thanks for the proposal, I like the direction.
Regarding the block syntax: I do think that block support is pretty
important, but the way it is introduced in the RFC is not very clear to me.
I think part of the problem is that it's a case of either / or. If you have
a match that returns a value, but one of the match cases becomes too
complicated and would benefit from a temporary variable, you cannot
introduce it easily. You have to go all the way from
$x = match ($y) {
0 => foo(),
1 => bar(),
2 => baz(),
3 => oof(),
};
to
match ($y) {
0 => {
$x = foo();
},
1 => {
$x = bar();
},
2 => {
$x = baz();
},
3 => {
$tmp = foobar(); // <-- The only new part!
$x = oof($tmp);
}
}
While what you really wanted is the Rust-style:
$x = match ($y) {
0 => foo(),
1 => bar(),
2 => baz(),
3 => {
$tmp = foobar();
oof($tmp)
},
};
We already have a similar issue with arrow functions, where adding an extra
line requires conversion to a normal closure. This is already not great,
but it's a very contained change. Here you may have to refactor a large
match expression to incorporate a local change.
I'm not quite sure what the way forward regarding that is. I can understand
the reluctance to propose the Rust-style block expression syntax, given its
currently fairly limited usefulness outside match expressions. I do think
that this is principally the right approach though, if we want to introduce
control flow expressions like match. The alternative is to leave match as a
statement only. (If we do not support both expressions and statements,
then I believe we should only be supporting statements, not the other way
around.)
Regarding pattern matching: I agree that we shouldn't tackle this topic as
part of this proposal, but we do need to already have a pretty firm idea of
what that is going to look like to avoid forward-compatibility hazards. It
is very easy to run into pattern vs expression ambiguities.
The section of your RFC discussing pattern matching tries to avoid most of
the problem by prefixing patterns with "let". However, even under that
premise there are going to be issues. For example, you have this example:
match ($value) {
let [1, 2, $c] => ..., // Array pattern
}
The intention here is clearly that $c is supposed to capture the third
array element. But if we stay consistent with the remaining proposal, then
$c should be interpreted as a value to match against here, just like 1 and
2. The way to actually capture $c should be
match ($value) {
let [1, 2, let $c] => ...
}
I believe, which is sub-optimal. If we allow arbitrary expressions as match
values, then every captured variable would have to be prefixed by let
individually to make things work (I think, correct me if I'm missing
something here).
Another consideration here is that "let" carries a very strong implication
of block scoping. Block scoping is something we might well want to have,
but I'm not sure we would want it to be bound to pattern matching
syntax.
Finally, the use of "," to specify multiple match values could be a
composition problem. Rust uses | to specify multiple match values, and also
allows its use in sub-patterns (this is the "or patterns" feature). This
allows you to write patterns like
Some(Enum::A | Enum::B | Enum::C | Enum::D) => ...
which are equivalent to
Some(Enum::A) | Some(Enum::B) | Some(Enum::C) | Some(Enum::D) => ...
Picking "," as separator makes that impossible, as it would cause
ambiguities with commas in array patterns, if nothing else. On the other
hand, we cannot use "|" if we want to allow arbitrary expressions for match
values. (Do we really want that?)
Regards,
Nikita
Hi Nikita
I think part of the problem is that it's a case of either / or.
I agree that your example is an unpleasant limitation.
I'm not quite sure what the way forward regarding that is. I can understand
the reluctance to propose the Rust-style block expression syntax, given its
currently fairly limited usefulness outside match expressions. I do think
that this is principally the right approach though, if we want to introduce
control flow expressions like match.
I personally don't have anything against that approach. I only fear
that proposing a block expression RFC first will most likely fail
because at present the only real use case is arrow functions. If PHP
had block scoping it could have other benefits.
$this->foo = {
$foo = new Foo();
$foo->prop1 = ...;
$foo->prop2 = ...;
$foo
};
But since $foo escapes the block this isn't incredibly helpful either
other than making it a little more readable. There are also a few
issues we'll have to tackle. In my prototype block expressions had the
following grammar:
'{' inner_statement_list expr '}'
This means the following would be invalid.
$x = fn() => {
return 'Foo';
// Parse error, expression expected
};
match ($x) {
0 => {
echo 'Foo';
// Parse error, expression expected
}
}
Neither of those are great. If we define the grammar like this:
'{' inner_statement_list optional_expr '}'
We'll have a conflict with statement lists and if statements.
function foo() {
{
// Valid today
}
}
if ($x) {
// Is this a block expression or just an if block?
};
So to make that implementation compatible we'd have to allow dropping
the semicolon in this case as well which was highly criticized in this
RFC. We'd also have to check on a case per case basis if the ending
expression in the block expression is allowed/required or not.
Since there's quite a bit of time left until the feature freeze I can
try to think of a few solutions and propose this RFC but I'm afraid if
it fails we'll be right back where we started.
But if we stay consistent with the remaining proposal, then
$c should be interpreted as a value to match against here, just like 1 and
2. The way to actually capture $c should be
match ($value) {
let [1, 2, let $c] => ...
}
This isn't quite how I imagined it. In this example, 1 and 2 are not
arbitrary expressions but literal patterns.
https://doc.rust-lang.org/reference/patterns.html#literal-patterns
Thus every member of the pattern array itself is a pattern, not an
arbitrary expression which would make the let keyword in the array
pattern unnecessary (and actually invalid). Of course it doesn't
necessarily need to be this way. Your example would work but as you
said prefixing every single pattern would be very tedious.
With the pattern only approach, if you wanted to check if some member
of the array has a dynamic value you'd do something like this:
match ($value) {
let [1, 2, $c] if $c === $y => ...
}
In other words, the let is more of a prefix of the whole arm, not just
a single pattern.
Another consideration here is that "let" carries a very strong implication
of block scoping.
Agreed. I couldn't think of a better keyword but I'm definitely open
to suggestions.
Finally, the use of "," to specify multiple match values could be a
composition problem. Rust uses | to specify multiple match values, and also
allows its use in sub-patterns (this is the "or patterns" feature). This
allows you to write patterns like
Unfortunately using | is not possible for arbitrary expressions as it
collides with the bitwise or operator. However it can be used in
patterns because they would not allow arbitrary expressions.
match ($x) {
let [1|2, 3|4] => ...,
}
The inconsistency is a little unfortunate. But I don't think there's
another feasible way.
Ilija
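The collision with the bitwise or operator can be seen in a small sketch, assuming PHP 8.0 or later where match arms take arbitrary expressions. The function name classify() is hypothetical. In an arm, 1|2 is evaluated as an ordinary expression and yields 3, so it does not mean "1 or 2"; a comma-separated list is what lets several values share one arm.

```php
<?php
// Hypothetical function used only to demonstrate how arm conditions are read.
function classify(int $x): string {
    return match ($x) {
        // 1|2 is a plain expression here: bitwise OR, which evaluates to 3.
        1|2 => 'matched the value 3',
        // A comma-separated list is how several values share one arm.
        1, 2 => 'matched 1 or 2',
        default => 'something else',
    };
}

echo classify(3), "\n"; // prints "matched the value 3"
echo classify(1), "\n"; // prints "matched 1 or 2"
```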
Hi!
I'm not quite sure what the way forward regarding that is. I can understand
the reluctance to propose the Rust-style block expression syntax, given its
Speaking of which, what is the problem with that? I mean, if we just
declare that the value of a block is the return of the last expression,
would it break anything? Of course, by itself it'd be useless since we
still aren't using blocks as expressions. But if we then had constructs
that could accept both blocks and expressions - and we don't need to
make everything that accepts expressions to accept blocks - we just need
to make it make sense in those "dual" constructs.
--
Stas Malyshev
smalyshev@gmail.com
Hi!
I'm not quite sure what the way forward regarding that is. I can understand
the reluctance to propose the Rust-style block expression syntax, given its
Speaking of which, what is the problem with that? I mean, if we just
declare that the value of a block is the return of the last expression,
would it break anything? Of course, by itself it'd be useless since we
still aren't using blocks as expressions. But if we then had constructs
that could accept both blocks and expressions - and we don't need to
make everything that accepts expressions to accept blocks - we just need
to make it make sense in those "dual" constructs.
I think that introducing a new construct like that would require its own RFC,
especially since there is a lot of confusion around how those blocks
should work.
E.g. why should whether or not I use a semicolon depend on
what I will do with the return value of the expression?
I think a much more sane solution would be to only allow expressions in
match.
$x = match(true) {
$a === 1 => foo(),
$b > 2 => bar(),
}
$a = match($a) {
1 => foo(),
2 => bar(),
}
And then a block could be just an alias for a self calling anonymous
function with all of its consequences:
- continue and break forbidden
- return being the keyword used to match the result instead of missing
  semicolon
- if no return was provided result defaults to null
- etc.
E.g. these two are equivalent statements assigning the result of baz() to $y:
$y = match($x) {
1 => {
foo();
bar();
return baz();
},
...
}
$y = match($x) {
1 => function() use(everything) {
foo();
bar();
return baz();
}(),
...
}
Also I think that suggestion that someone already made about making the
parameter default to true is a really good idea.
I even believe that statements in this form are probably going to be
more common than the standard one:
$y = match {
$a < 10 => foo(),
$a < 20 => bar(),
...
}
Hi Ilija,
I'd like to announce the match expression RFC that replaces the switch
expression RFC I announced a few weeks ago.
New: https://wiki.php.net/rfc/match_expression
Old: https://wiki.php.net/rfc/switch_expression
I know I'm a little late to the game and sorry for that, but I didn't
have time to thoroughly read the proposal earlier.
And again, sorry if it's just me going "blind" but I can't find
anything explaining what will happen if there is more than one
matching arm?
I imagine that since the case of no matching arms will throw an
UnhandledMatchError exception, having several matching arms will also
complain that something looks wrong?
However, that will require all the arms to be evaluated every time,
even after a match is found, which might not be the most optimal
approach.
Then again, if not all arms might be evaluated, the order of
evaluation is important (since several matching arms could be found),
but it's not stated in the RFC if the order of evaluating arms follows
the order they are written, or if the compiler/interpreter is allowed
to choose another order if it thinks it might find the matching arm
sooner?
To recap, perhaps you can answer these questions clearly in the RFC:
- What happens if more than one arm is matching?
- In which order are the arms evaluated (in case order is significant)
  - is it predictable or an implementation detail that the developer
  should not rely on?
- Can the left side of the arms be any expression or only
  "constant-style"? (If expressions are allowed, order also becomes
  significant)
Apart from these questions, it might be a good idea to evaluate the
impact of the Backwards Incompatible Changes by checking top
repositories for the use of the new keyword "match" in classes and
namespaces etc.
Finally, I find the mixing of match as expression and match as
stand-alone statement unfortunate. I understand very well the
closeness between the two, but I also think that those are two quite
different use-cases which presume different intentions and
expectations for behavior:
For example, when using "match" as an expression and assigning the
value to a variable (the second example at the top of the RFC - as an
alternative to hash maps and nested ternary operators) it's natural to
assume that one and exactly one arm must match. Otherwise the variable
will end up with an ambiguous or undefined value.
I also would consider it "flippant" and not a common need to use block
statements as the arms in this case!
On the contrary, when using "match" as a statement (much more like
"switch" and like "if"), there is no problem if no arms match, and if
several arms match, why not execute all of them? Also, block
statements in the arms would be natural and common.
Would it be a possibility to separate the two into different RFCs?
Personally I find the expression style most useful and I'd be happy to
see this part passed, which I think would be relatively easy since a
lot of the more complex points could be evaded: Block statements,
return statements, break/continue statements (which I don't really
understand in the context of an expression) and the trailing semicolon
issue.
Hope the feedback was useful and best of luck with the RFC!
Jakob
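For reference, the no-matching-arm case mentioned above can be sketched as follows. This assumes PHP 8.0 or later, where UnhandledMatchError exists as a built-in class; the function names label() and safeLabel() are hypothetical, and the exact exception message is deliberately not relied on.

```php
<?php
// Hypothetical lookup with no default arm.
function label(int $code): string {
    return match ($code) {
        200 => 'ok',
        404 => 'not found',
    };
}

// Hypothetical wrapper that turns the error into a plain return value.
function safeLabel(int $code): string {
    try {
        return label($code);
    } catch (\UnhandledMatchError $e) {
        // Thrown when no arm matches and there is no default arm.
        return 'unhandled';
    }
}

echo safeLabel(200), "\n"; // prints "ok"
echo safeLabel(500), "\n"; // prints "unhandled"
```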
Hi internals
Just a heads up, I'd like to start the voting on the match expression
RFC in a couple of days.
https://wiki.php.net/rfc/match_expression
I have made a number of changes to the RFC.
- Block return values are now allowed but limited to match arms
  (https://wiki.php.net/rfc/match_expression#blocks)
  - Nikita and I have discussed this here:
    https://github.com/php/php-src/pull/5407
- continue targeting match is no longer allowed (compilation error)
  (https://wiki.php.net/rfc/match_expression#breakcontinue)
- Minor rewording
The jury is still out on:
- Optional semicolon when using match as a statement
  (https://wiki.php.net/rfc/match_expression#semicolon)
  - It makes the grammar somewhat complicated
  - Nesting match expression blocks without parentheses isn't
    possible (https://github.com/php/php-src/pull/5407#issuecomment-616612763)
- Compilation error when not returning a value from a block
  (https://wiki.php.net/rfc/match_expression#blocks)
  - When the block terminates (e.g. through throw or exit) no
    return value is needed but the current implementation requires it
  - We could make this a runtime error instead
If you have anything new to add to the discussion, this is your chance!
Ilija
Just a heads up, I'd like to start the voting on the match expression
RFC in a couple of days.
https://wiki.php.net/rfc/match_expression
I have made a number of changes to the RFC.
- Block return values are now allowed but limited to match arms
(https://wiki.php.net/rfc/match_expression#blocks)
[...]
If you have anything new to add to the discussion, this is your chance!
Would you consider making the block support an additional vote?
Alternatively, if the RFC as proposed doesn't pass, would you be willing
to propose a version without this? If not, I might be inclined to do so,
because I know I'm not alone in liking the expression part as a
standalone feature, without the complexity of using it as a flow-control
statement as well.
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
Hi Rowan
If we were to remove blocks we'd probably also reconsider other things
(namely the optional semicolon and break/continue) and many examples
in the RFC would become invalid and irrelevant. This would probably
lead to even more confusion which is why I will most likely not move
blocks to an additional vote.
However, I will definitely include a poll to find out why it failed. I
am committed to getting this into the language, in some form or
another.
Regards,
Ilija
Just a heads up, I'd like to start the voting on the match expression
RFC in a couple of days.
The jury is still out on:
<*multiple issues>
If you have anything new to add to the discussion, this is your chance!
In the tests there is one test for "Test warning on duplicate match
conditions". I can't see that mentioned in the RFC. Is that going to
be applied? To me, that sounds more like something that should be done
at a static analysis level as:
- Sometimes it's convenient to have redundant match conditions. e.g.
  within generated code, as otherwise you would have to eliminate
  duplicate conditions.
- I don't think PHP can guarantee to detect duplicate conditions.
  Would the duplicate condition* below be detected?
Are the match conditions guaranteed to be called in order that they
are defined? I'm guessing yes, but it doesn't appear to be mentioned.
I would prefer having an explicit fall-through statement, for the
relatively rare cases when fall-through is needed. Is there a strong
reason to not add it now?
Is there any protection against people doing odd stuff with
references*, or is that their own problem if they choose to do that?
cheers
Dan
Ack
//* Duplicate condition
function in_range(int $value, int $min, int $max) {
if ($value < $min) {return false;}
if ($value > $max) {return false;}
return true;
}
$value = rand(5, 20);
match (true) {
in_range($value, 10, 15) => 'in range',
in_range($value, 10, 15) => 'in range',
default => 'out of range'
}
//* odd stuff with references
function why_do_this(&$value) {
$oldValue = $value;
$value = rand(0, 10);
if ($oldValue < $value) {
return true;
}
return false;
}
$value = rand(5, 20);
match ($value) {
why_do_this($value) => 'foo',
why_do_this($value) => 'bar',
7 => 'lucky number',
default => 'not sure why anyone would do this'
}
Hi Dan
I don't think PHP can guarantee to detect duplicate conditions
This warning is only shown if duplication is guaranteed, namely when a
jumptable is generated. Which also makes the check fairly cheap.
Would the duplicate condition* below be detected?
No, as jumptables are only generated when the conditions are all
integers or all strings. It's worth noting there are some cases that
could be detected but aren't. For example:
match ($x) {
1, 1 => { echo $x; }
}
This is because we don't generate a jumptable when there are less than
5 integer conditions. Because of this inconsistency I'm not sure if we
should just remove the warning. As you mentioned, it could be
intentional in some cases. Tyson also mentioned that some constants
are platform dependent which would make it warn on some platforms but
not on others.
Are the match conditions guaranteed to be called in order that they
are defined? I'm guessing yes, but it doesn't appear to be mentioned.
They are, yes. The behavior here should be equivalent to the switch.
It wouldn't hurt mentioning that in the RFC.
I would prefer having an explicit fall-through statement, for the
relatively rare cases when fall-through is needed. Is there a strong
reason to not add it now?
I wouldn't call it a strong reason but we'd have to be more careful
with fallthrough when we implement pattern matching.
match ($x) {
let $y @ 10..<20 => {
fallthrough;
},
let $z @ 0..<10 => {
// Pattern $z was never bound
}
}
Other than that, I just don't think it's awfully useful but that's of
course a matter of opinion. But there's nothing directly stopping us
from adding it. For me personally it's not a priority.
Is there any protection against people doing odd stuff with
references*, or is that their own problem if they choose to do that?
We have some tests for references, we couldn't discover any weird or
surprising behavior (although we could definitely use more tests
here).
Ilija
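The in-order evaluation described above can be checked with a small sketch, assuming PHP 8.0 or later. The Trace class and the check() helper are hypothetical and exist only to record which conditions were evaluated. Conditions are tried from top to bottom and evaluation stops at the first arm that matches, so later conditions never run.

```php
<?php
// Hypothetical recorder for the order in which conditions are evaluated.
final class Trace {
    public static array $calls = [];
}

// Hypothetical condition helper that logs its name before returning a result.
function check(string $name, bool $result): bool {
    Trace::$calls[] = $name;
    return $result;
}

$outcome = match (true) {
    check('first', false) => 'first arm',
    check('second', true) => 'second arm',
    check('third', true) => 'third arm', // never evaluated: an earlier arm matched
    default => 'no arm',
};

echo $outcome, "\n";                      // prints "second arm"
echo implode(', ', Trace::$calls), "\n";  // prints "first, second"
```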
Hi Ilija,
This is because we don't generate a jumptable when there are less than
5 integer conditions. Because of this inconsistency I'm not sure if we
should just remove the warning. As you mentioned, it could be
intentional in some cases. Tyson also mentioned that some constants
are platform dependent which would make it warn on some platforms but
not on others.
It's also inconsistent in that float, false, null, true, and possibly arrays not containing objects or references don't trigger warnings.
I'd prefer to leave that check to linters or static analyzers, rather than an E_COMPILE_WARNING, because PHP compiles at runtime (unless preload is used properly).
Multiple static analyzers can do this for switch, e.g. phan --plugin DuplicateArrayKeyPlugin
https://github.com/phan/phan/tree/master/.phan/plugins#duplicatearraykeypluginphp
If we did start doing this, I'd personally want it both for switch and match at the same time to be consistent, as a separate RFC.
Another idea I'd thought of would be to add an opcache.pedantic mode for checks like that,
which would warn about errors noticed by opcache (emitted at compile time, not runtime) within a single file without running the code
(e.g. calling a function with the wrong type (strlen($obj)), too few or too many arguments to native functions, dead code (e.g. conditions that were always true/false on a platform), etc.).
- Due to platform dependence, that would have false positives, which is why this would be called pedantic
- This would emit errors directly to a log file or stderr, in order to not trigger side effects such as user error handlers
Tyson