Hi internals,
PHP has a couple of legacy string interpolation syntaxes, the most
egregious being "$array[foobar]". The issue with this syntax is that the
array key is not quoted, is required to be not quoted and is silently
accepted.
We've been fighting back against unquoted strings for a long time,
culminating with $array[foobar] in normal code becoming an Error exception
in PHP 8, as the string fallback for unknown constants has been removed.
In this context, it is particularly weird that "$array[foobar]" continues
to be silently allowed, and the more expected form "$array['foobar']"
yields an inscrutable error message:
Parse error: syntax error, unexpected '' (T_ENCAPSED_AND_WHITESPACE),
expecting '-' or identifier (T_STRING) or variable (T_VARIABLE) or number
(T_NUM_STRING)
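For reference, a minimal sketch of the forms that currently compile. The array contents and variable names here are made up for illustration and are not taken from any real code base.

```php
<?php
// Illustrative data only.
$array = ['foobar' => 'value', 0 => 'zero'];

// Legacy shorthand: the key must be unquoted and is treated as a string.
$legacy = "$array[foobar]";

// Shorthand with an integer key.
$numeric = "$array[0]";

// Generic brace syntax: the key is an ordinary PHP expression.
$braced = "{$array['foobar']}";

echo $legacy, "\n", $numeric, "\n", $braced, "\n";

// The quoted-key shorthand "$array['foobar']" is left out on purpose:
// it does not compile and produces the parse error quoted above.
```

Both the shorthand and the brace form read the same element here; only the way the key is spelled differs.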
I think there are two ways to address this. One is to deprecate and
eventually remove the non-wrapped array interpolation syntax entirely,
requiring people to use the generic "{$array['foobar']}" syntax instead.
For the sake of consistency, I think this would also include deprecating
the "$array[0]" variant.
The other is to add support for "$array['foobar']" and
"$array[some(complex(expression()))]" in general, and only deprecate the
"$array[foobar]" syntax.
What do you think about this? I think my personal preference would be
towards phasing out the syntax entirely: If we're going to make people
migrate, we can just as well migrate them towards our generic and preferred
interpolation syntax.
Regards,
Nikita
> PHP has a couple of legacy string interpolation syntaxes, the most
> egregious being "$array[foobar]". The issue with this syntax is that the
> array key is not quoted, is required to be not quoted and is silently
> accepted....
> What do you think about this? I think my personal preference would be
> towards phasing out the syntax entirely: If we're going to make people
> migrate, we can just as well migrate them towards our generic and preferred
> interpolation syntax.
Hi Nikita,
I agree that this syntax is peculiar, and think you're right that the most
consistent way forward is to phase out both "$array[foobar]" and
"$array[0]".
The biggest problem I see with allowing "$array['foobar']" as an
alternative spelling is that it makes the inconsistent handling of the bare
string even more surprising. Consider these pairs:
const DEBUG = 'yes';
$array = [
    'DEBUG' => 'blah blah blah',
    'yes' => 'yes yes yes',
];
// Normal PHP code
echo $array['DEBUG']; // quoted string as key - echoes blah blah blah
echo $array[DEBUG];   // constant as key - echoes yes yes yes
// Properly delimited string interpolation
echo "{$array['DEBUG']}"; // quoted string as key - echoes blah blah blah
echo "{$array[DEBUG]}";   // constant as key - echoes yes yes yes
// Shorthand string interpolation; only the second form currently compiles
echo "$array['DEBUG']"; // quoted string as key - would echo blah blah blah if allowed
echo "$array[DEBUG]";   // looks like a constant, but acts like a string! - echoes blah blah blah
My only hesitation is the usual one for language changes: how much working
code will need to be amended, and how easy is it to identify and correctly
fix that code?
Since the current syntax isn't ambiguous, it ought in principle to be
possible for a tool to parse source code and identify and fix instances. I
think in cases like this, the upgrading notes should link specifically to
one or more such tools.
Even with such a tool, it's potentially quite a disruptive change, so we
need to decide how important we think removing this syntax would be.
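As a rough sketch of the detection half of such a tool, the built-in tokenizer (token_get_all) can tell the shorthand form apart from the brace form. The function name findBareKeyInterpolations is hypothetical, and this only reports line numbers; it does not rewrite anything and ignores backtick strings.

```php
<?php
/**
 * Hypothetical helper: returns the line number of every shorthand
 * "$array[foobar]" interpolation (bare-word key) found in $source.
 */
function findBareKeyInterpolations(string $source): array
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $stack = []; // nesting context: 'string' or 'code'
    $lines = [];

    for ($i = 0; $i < $count; $i++) {
        $token = $tokens[$i];
        $id = is_array($token) ? $token[0] : $token;

        if ($id === '"') {
            // A bare '"' token only surrounds strings that interpolate.
            if (end($stack) === 'string') {
                array_pop($stack);
            } else {
                $stack[] = 'string';
            }
        } elseif ($id === T_START_HEREDOC) {
            $stack[] = 'string';
        } elseif ($id === T_END_HEREDOC) {
            array_pop($stack);
        } elseif ($id === '{' || $id === T_CURLY_OPEN || $id === T_DOLLAR_OPEN_CURLY_BRACES) {
            // Inside braces the key is a normal expression, not the shorthand.
            $stack[] = 'code';
        } elseif ($id === '}') {
            array_pop($stack);
        } elseif ($id === T_VARIABLE && end($stack) === 'string') {
            $open = $tokens[$i + 1] ?? null;
            $key = $tokens[$i + 2] ?? null;
            $close = $tokens[$i + 3] ?? null;
            // Bare-word keys are T_STRING; "$array[0]" uses T_NUM_STRING instead.
            if ($open === '[' && is_array($key) && $key[0] === T_STRING && $close === ']') {
                $lines[] = $token[2];
            }
        }
    }

    return $lines;
}

// Usage: only the first interpolation on line 1 is the bare-word shorthand.
$source = '<?php echo "$a[foo] {$a[\'bar\']} $a[0]";' . "\n" . 'echo $a[baz];';
print_r(findBareKeyInterpolations($source));
```

A fixer would additionally need the token offsets to rewrite each hit to the brace form with a quoted key.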
Regards,
Rowan Tommins
[IMSoP]
On Thu, Jan 9, 2020 at 12:12 PM Rowan Tommins rowan.collins@gmail.com
wrote:
>> PHP has a couple of legacy string interpolation syntaxes, the most
>> egregious being "$array[foobar]". The issue with this syntax is that the
>> array key is not quoted, is required to be not quoted and is silently
>> accepted....
>> What do you think about this? I think my personal preference would be
>> towards phasing out the syntax entirely: If we're going to make people
>> migrate, we can just as well migrate them towards our generic and
>> preferred interpolation syntax.
> Hi Nikita,
> I agree that this syntax is peculiar, and think you're right that the most
> consistent way forward is to phase out both "$array[foobar]" and
> "$array[0]".
> The biggest problem I see with allowing "$array['foobar']" as an
> alternative spelling is that it makes the inconsistent handling of the bare
> string even more surprising. Consider these pairs:
That's true, but I think if the "$array[foobar]" syntax is deprecated in
the same version that "$array['foobar']" is allowed, we can mostly avoid
confusion, because the deprecation message will tell people what's what.
[snip]
> My only hesitation is the usual one for language changes: how much working
> code will need to be amended, and how easy is it to identify and correctly
> fix that code?
> Since the current syntax isn't ambiguous, it ought in principle to be
> possible for a tool to parse source code and identify and fix instances. I
> think in cases like this, the upgrading notes should link specifically to
> one or more such tools.
> Even with such a tool, it's potentially quite a disruptive change, so we
> need to decide how important we think removing this syntax would be.
Yes, reliable auto-upgrade should be possible for this change. I'll also do
an analysis run on how common this syntax is in open-source projects later.
Assuming that we do want to phase out the "$array[foobar]" interpolation
syntax, I think a relevant consideration regarding the upgrade-path is
this: Nearly all code-upgrades work by retaining compatibility with the
previous PHP version for a while (for open-source code often a long while),
so effectively the only possible upgrade path is to go from
"$array[foobar]" to "{$array['foobar']}", as a hypothetical
"$array['foobar']" syntax would not be available in the previous PHP
version.
As such, the question of whether we want to support "$array['foobar']" is
not really relevant to the upgrade path, it can only be answered with
long-term evolution in mind. That is, as an end result, 10 years down the
line, does it make sense for us to support "$array['foobar']" or not? I
think the answer to that may well be "yes", because we already support
"$string" and "$object->prop", so it is in a way natural that
"$array['key']" is also supported, as the last of the "fundamental"
variable syntaxes.
Finally, if we want to leave things alone, we have the option of just
generating a better error message for this.
Regards,
Nikita
Hi,
> [...] we already support
> "$string" and "$object->prop", so it is in a way natural that
> "$array['key']" is also supported, as the last of the "fundamental"
> variable syntaxes.
What about rather deprecating "$object->prop" too? The current
situation can be surprising:
"$object->foo()" // $object->foo . '()'
"$object->obj->bar" // $object->obj . '->bar'
"$object->arr[qux]" // $object->arr . '[qux]'
"$array[arr][bar]" // $array['arr'] . '[bar]'
"$array[obj]->qux" // $array['obj'] . '->qux'
In any case, I'm +1 for deprecating "$array[key]", and "$array[0]" to
avoid introducing another inconsistency.
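A minimal sketch that reproduces four of the cases above, assuming string-valued properties and elements so that the results are easy to read. The property names and values are made up for illustration.

```php
<?php
// Illustrative data only.
$object = new stdClass();
$object->foo = 'F';
$object->arr = 'A';
$array = ['arr' => 'X', 'obj' => 'Y'];

// Only the first property or offset is interpolated; the rest is literal text.
$call   = "$object->foo()";    // F()
$offset = "$object->arr[qux]"; // A[qux]
$nested = "$array[arr][bar]";  // X[bar]
$prop   = "$array[obj]->qux";  // Y->qux

echo $call, "\n", $offset, "\n", $nested, "\n", $prop, "\n";
```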
--
Guilliam Xavier
Hi!
> I think there are two ways to address this. One is to deprecate and
> eventually remove the non-wrapped array interpolation syntax entirely,
> requiring people to use the generic "{$array['foobar']}" syntax instead.
> For the sake of consistency, I think this would also include deprecating
> the "$array[0]" variant.
The first part seems to make sense but I don't think losing "$array[0]"
does... I get the consistency argument but I feel most people would
rather have this useful syntax working and not worry about the fact that
it's theoretically "inconsistent". Consistency only helps when it allows
us to make useful inferences; something not working is rarely a useful
inference.
> The other is to add support for "$array['foobar']" and
> "$array[some(complex(expression()))]" in general, and only deprecate the
> "$array[foobar]" syntax.
This also seems an acceptable option, if we can make stuff inside []
behave reasonably. We probably don't actually need anything really
complex - if one wants complex stuff, {} syntax should be used. I think
the primary target here should be making simple stuff like numbers and
strings work; all the rest is optional.
--
Stas Malyshev
smalyshev@gmail.com
> The first part seems to make sense but I don't think losing "$array[0]"
> does... I get the consistency argument but I feel most people would
> rather have this useful syntax working and not worry about the fact that
> it's theoretically "inconsistent". Consistency only helps when it allows
> us to make useful inferences; something not working is rarely a useful
> inference.
Agreed.
-Mike