Hi internals
I've been thinking about ways to improve string interpolation. String
interpolation in PHP is currently pretty limited to say the least.
What's worse, we have several ways to do the same thing, and a few of
them are quite inconsistent and confusing.
We have these syntaxes:
1. Directly embedding variables: "$foo"
2. Braces outside the variable: "{$foo}"
3. Braces after the dollar sign: "${foo}"
4. Dynamic variable lookup: "${expr}"
The difference between syntaxes 3 and 4 is especially confusing. Passing a
simple variable name without a dollar sign inside the braces (optionally
followed by an offset access) embeds the given variable in the string.
When passed a more complex expression, ${} behaves completely
differently and performs a dynamic variable lookup.
https://3v4l.org/uqcjf
Let's look at some different examples.
Simple local variables work well, although it is questionable why we
have 4 ways to do the same thing.
- "$foo"
- "{$foo}"
- "${foo}"
- "${'foo'}"
Accessing offsets is supported for syntax 1, 2 and 3. String quoting
in the offset expression is inconsistent.
- "$foo[bar]"
- "{$foo['bar']}"
- "${foo['bar']}"
Accessing properties is supported for syntax 1 and 2.
- "$foo->bar"
- "{$foo->bar}"
Nested property fetches or offset accesses only work with syntax 2.
- "$foo[bar][baz]" // [baz] is not part of the expression, equivalent
to "{$foo['bar']}[baz]"
- "{$foo['bar']['baz']}"
- "${foo['bar']['baz']}" // Syntax error
Calling methods is only supported for syntax 2.
- "$foo->bar()" // Treats ->bar as a property access, equivalent to
"{$foo->bar}()"
- "{$foo->bar()}"
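A minimal sketch of the property, method and nested-offset cases above. The class Foo and the array $data are hypothetical and only serve the illustration; only syntaxes 1 and 2 are exercised:

```php
<?php
// Hypothetical class with a property and a method that share a name.
class Foo
{
    public string $bar = 'property';

    public function bar(): string
    {
        return 'method';
    }
}

$foo = new Foo();
$data = ['bar' => ['baz' => 'nested']];

$simple = "$foo->bar()";            // "->bar" is read as a property fetch
$braced = "{$foo->bar()}";          // the whole method call is evaluated
$nested = "{$data['bar']['baz']}";  // nested offsets need syntax 2

echo $simple, "\n"; // property()
echo $braced, "\n"; // method
echo $nested, "\n"; // nested
```

The first line shows why the simple syntax is easy to misread: the parentheses end up as literal text after the property value.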
Arbitrary expressions work for none of the syntaxes.
-
- "{$foo + 2}" // Syntax error
- "{Foo::bar()}" // Interpreted as string
The most functional of these syntaxes is 2 but even here only a small
subset of expressions are accepted. The distinction between syntax 3
and 4 is very confusing and we should probably deprecate and remove
syntax 3 (as it does pretty much the same as syntax 2). Sadly, the
only syntax that accepts arbitrary expressions (4) is also the least
useful.
As to allowing arbitrary string interpolation, we have three options:
- Deprecate syntax 4, remove it, and change its functionality in the future
- Use some new syntax (e.g. "\(1 + 2)", "{1 + 2}", "$(1 + 2)", etc.)
with the given BC break
- Do nothing
Adding new syntax will break many regular expressions and embedded
jQuery code, although they are easily fixed by escaping the \ or $.
Deprecating, removing and then changing ${} would take many years. I
don't know which of these is the better approach.
Any thoughts or different ideas?
Ilija
Hi internals
> I've been thinking about ways to improve string interpolation.
Absolutely overwhelmed by the feedback (:P) I've decided to create a small POC:
https://github.com/php/php-src/compare/master...iluuu1994:string-interpolation
The POC uses the following syntax:
echo "Static method call: #{Foo::bar()}";
Two questions arose:
- String prefix
To mitigate the BC break we could require strings that use the new
interpolation to be prefixed.
// Continues behaving the same
echo "#{Foo::bar()}";
// Actually makes use of the new interpolation
echo $"#{Foo::bar()}";
The main downside is that we would have yet another type of string: single
quotes with no interpolation, double quotes with some interpolation,
and fully interpolated strings ($""), each with its heredoc
counterpart. It's unfortunate, but we might prefer this approach to
mitigate the BC break.
Let me know if you prefer a prefix or no prefix. If the answers are
inconclusive this might become a secondary vote in the RFC.
- Escaping
It's not quite obvious how escaping should behave.
$foo = 'foo';
echo "\#{$foo}";
// Could print
//> 1. #foo
// or
//> 2. #{$foo}
We could 1. make the backslash escape just the hash and interpret
{$foo} as usual, or 2. make it escape both.
- Option 1 is more consistent with the rest of the language: backslash
normally just escapes the next character
- Option 1 requires two backslashes when result 2 is desired ("#{$foo}")
- Option 2 makes it impossible to achieve result 1 with just braces
and would have to be written as "##{$foo}"
Let me know which option makes more sense to you.
Ilija
Hi Ilija,
I think this could work, and thank you very much for the effort put into
clarifying the current interpolation situation.
Using a prefix for strings that have a saner interpretation reminds me of
f-strings in Python 3. I suppose you know about them already, but I'll
leave a link here anyway: https://www.python.org/dev/peps/pep-0498/
Using $ for the prefix might not be the best option as it would look like
this:
echo $"next is {$i + 1}";
Maybe f could be used, similar to how it is in Python, but I haven't given
it much thought. I'm a casual Python user, so I need to understand more of
the history of how they arrived at it.
Replying from my phone, sorry if my answer is too short, incomplete and
with possible spelling mistakes.
Regards,
Alex
Hi Alex
> I've been thinking about ways to improve string interpolation.
> Using a prefix for strings that have a saner interpretation reminds me of
> f-strings in Python 3. I suppose you know about them already, but I'll
> leave a link here anyway: https://www.python.org/dev/peps/pep-0498/
> Maybe f could be used, similar to how it is in Python, but I haven't given
> it much thought. I'm a casual Python user, so I need to understand more of
> the history of how they arrived at it.
According to the PEP, the f stands for formatted string. The $ I chose
comes from C#. I don't feel strongly about either option; since IDEs
will help with syntax highlighting, it probably doesn't matter too
much. I might look into other languages and create a short
overview/comparison.
Ilija
Hi Ilija,
> I've been thinking about ways to improve string interpolation. String
> interpolation in PHP is currently pretty limited to say the least.
Thanks for putting together some thoughts on this.
My instinct is that rather than having a whole new syntax that's
basically a tweak on "${expression}", we could take the opportunity to
bring more features into the mix. If you think about it, string
interpolation is basically an inline template syntax, so a lot of the
features that are useful in template engines would be useful there.
Specifically:
- modifiers; e.g. sprintf-style, f"Measure #0.2f{$value}" or
template-style, f"Hello #{$name|escape}"
- custom rules, like JavaScript's tagged templates [1], e.g.
html"Hello #{$name}" == "Hello " . htmlspecialchars($name)
Put these together, and you can have variables escaped by default, and
opted out via a modifier:
$tag = 'h1';
echo html"<#raw{$tag}>Hello #{$_GET['name']}</#raw{$tag}>";
If we took JS tagged templates as the inspiration, that would de-sugar
to something like:
echo render_html_string(
    stringParts: [ '<', '>Hello ', '</', '>' ],
    expressions: [ $tag, $_GET['name'], $tag ],
    modifiers: [ ['raw'], [], ['raw'] ]
);
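To make the de-sugared call concrete, here is one way such a function could behave. This is only a sketch under assumptions: render_html_string is a hypothetical helper, the 'raw' modifier is taken to mean "skip escaping", and $name stands in for $_GET['name'] so the example runs on its own (named arguments need PHP 8.0 or later):

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical runtime helper for the de-sugared html"..." string.
 * Escapes every expression unless its modifier list contains 'raw'.
 */
function render_html_string(array $stringParts, array $expressions, array $modifiers): string
{
    $out = '';
    foreach ($stringParts as $i => $part) {
        $out .= $part;
        // There is one fewer expression than there are string parts.
        if (!array_key_exists($i, $expressions)) {
            continue;
        }
        $value = (string) $expressions[$i];
        $isRaw = in_array('raw', $modifiers[$i] ?? [], true);
        $out .= $isRaw ? $value : htmlspecialchars($value);
    }
    return $out;
}

$tag = 'h1';
$name = '<b>World</b>'; // stands in for $_GET['name']

echo render_html_string(
    stringParts: [ '<', '>Hello ', '</', '>' ],
    expressions: [ $tag, $name, $tag ],
    modifiers: [ ['raw'], [], ['raw'] ]
), "\n";
// prints: <h1>Hello &lt;b&gt;World&lt;/b&gt;</h1>
```

Keeping the literal parts, the values and the modifiers in separate arrays means the tag function never has to re-parse the string, which is the same split JavaScript's tagged templates make.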
It will take a while to figure out the details, but I think a powerful
feature like this is more worthy of adding new syntax.
[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
Hi Rowan
> I've been thinking about ways to improve string interpolation. String
> interpolation in PHP is currently pretty limited to say the least.
Thanks for your feedback!
> we could take the opportunity to bring more features into the mix
> - modifiers; e.g. sprintf-style, f"Measure #0.2f{$value}" or
> template-style, f"Hello #{$name|escape}"
sprintf-style modifiers are an interesting idea. But I'm not sure if
there would be a huge advantage over using number_format.
I see two issues with template-style filters. 1. The syntax is
ambiguous: $name|escape could mean "pass $name to the filter escape"
or "bitwise-or $name and the constant escape". We could disambiguate
with a different symbol ($name|>escape). 2. This strongly overlaps
with the pipe operator (https://wiki.php.net/rfc/pipe-operator-v2).
Pipe operators would probably automatically solve this use case
(although the given filters still have to be added).
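The ambiguity can be seen with today's PHP. In this sketch the constant named escape is hypothetical and exists only to show the clash: $name|escape is already a valid expression, a bitwise OR of two strings applied byte by byte:

```php
<?php
// Hypothetical constant named like a filter, only to show the clash.
const escape = ' ';

$name = 'ABC';

// Already valid PHP: bitwise OR of two strings, byte by byte.
// 'A' (0x41) | ' ' (0x20) gives 'a' (0x61); the rest of the longer
// string is copied unchanged.
$result = $name|escape;

var_dump($result); // string(3) "aBC"
```

A parser that accepts arbitrary expressions inside the interpolation would have to pick one of the two meanings, which is why a distinct symbol would be needed.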
> - custom rules, like JavaScript's tagged templates [1], e.g.
> html"Hello #{$name}" == "Hello " . htmlspecialchars($name)
Escaping is a nice idea but I'm not sure we want to go that route.
String interpolation is still a bad fit for many templating things,
like rendering lists (although possible with implode and array_map) or
if statements (although possible with ternary and : ''). For some
things you might potentially not expect escaping at all (script tags,
style attributes, etc). I'm not completely opposed to the idea but
it's not something I'd personally be interested to work on.
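For reference, the list and conditional workarounds mentioned above look roughly like this with today's syntax, where the expressions have to be concatenated rather than embedded. The variable names are made up for the illustration:

```php
<?php
$items = ['apple', 'pear'];
$title = 'Fruit';
$showTitle = true;

// Conditional part via a ternary with '', list via array_map + implode.
$html = ($showTitle ? "<h2>{$title}</h2>" : '')
    . '<ul>'
    . implode('', array_map(fn (string $item): string => "<li>{$item}</li>", $items))
    . '</ul>';

echo $html, "\n";
// prints: <h2>Fruit</h2><ul><li>apple</li><li>pear</li></ul>
```

With interpolation of arbitrary expressions, the ternary and the implode call could sit inside the string itself, but they would not get any shorter.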
> If we took JS tagged templates as the inspiration, that would de-sugar
> to something like:
Interesting, I didn't know about this feature. Honestly, I can't think
of many use cases on the spot. It's very possible this is just my
limited imagination.
Luckily, if we decide not to add tagged templates at first, there's
nothing stopping us from adding them at a later point in time. The
same goes for the html tagged templates above.
Ilija
Hi Ilija,
> sprintf-style modifiers are an interesting idea. But I'm not sure if
> there would be a huge advantage over using number_format.
> [...]
> Escaping is a nice idea but I'm not sure we want to go that route.
> String interpolation is still a bad fit for many templating things...
The above reactions basically mirror my initial reaction to your initial
e-mail: an extended string interpolation feature sounds nice, but how does
it differ from string concatenation and function calls? That's why I
started thinking about additional features to bring in: I was trying to
think of the use cases this new syntax would be for, and what new things it
would make possible (or rather, easier).
If the proposal is to add a completely new string syntax, I feel like it
needs a stronger justification than being able to write
$"Hello #{Foo::bar()}\n" rather than "Hello " . Foo::bar() . "\n".
Regards,
Rowan Tommins
[IMSoP]