Hi all,
If $foo is not defined, statements such as $foo += 1 and $foo .= 'blah'
raise "undefined variable" Warnings in PHP 8, and will throw Errors in
PHP 9. However, the very similar looking $foo[] = 1 succeeds silently.
This seems odd to me, as "append to string" and "append to array" seem
like very similar operations, with most of the same use cases and
possible bugs.
From an implementation point of view, this is presumably because they
are defined as different Op Codes - ASSIGN_OP for .= and ASSIGN_DIM for
[]=, I believe. But that doesn't explain why ASSIGN_DIM behaves this way.
A historical explanation might relate to Perl's "autovivification"
feature. However, PHP does not implement the same rules as Perl: for
instance, Perl will create array dimensions just by reading them, which
PHP does not; and Perl has a completely different type system, with
cases like "a scalar becomes a reference to a hash unless already a
reference to a list". Note also that I'm not talking about
multi-dimensional arrays with missing keys here, only undefined local
variables, as were the subject of the recent RFC.
The observable behaviour for most operators in PHP is the same:
- if the variable is undefined, consider the value to be null
- coerce the value to the appropriate type; if the value is null, this
  gives the relevant "empty" value, such as '', 0, or []
- apply the operator
There isn't anything particularly special with $foo[] = 'bar' in this
case, except that it's the only operator that doesn't raise a Warning
(and planned Error) at step 1. The same goes for all the other uses of
the [] syntax I can think of, like $foo['key'] = 'bar', or $ref =& $foo[].
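To make the comparison concrete, here is a minimal sketch that counts the diagnostics each statement raises. The helper name diagnosticsFor() is made up for illustration, and the counts in the final comment are what I would expect on PHP 8.x only; if the planned PHP 9 change lands, the first two closures would throw an Error instead.

```php
<?php
// Hypothetical helper: run $fn and return the messages of any
// notices/warnings raised while it executes.
function diagnosticsFor(callable $fn): array
{
    $seen = [];
    set_error_handler(function (int $errno, string $errstr) use (&$seen): bool {
        $seen[] = $errstr;
        return true; // suppress PHP's default handling
    });
    try {
        $fn();
    } finally {
        restore_error_handler();
    }
    return $seen;
}

// Each closure has its own scope, so $foo is undefined in every one.
$concat = diagnosticsFor(function () { $foo .= 'blah'; });
$add    = diagnosticsFor(function () { $foo += 1; });
$append = diagnosticsFor(function () { $foo[] = 1; });
$keyed  = diagnosticsFor(function () { $foo['key'] = 'bar'; });

printf(
    "concat: %d, add: %d, append: %d, keyed: %d\n",
    count($concat), count($add), count($append), count($keyed)
);
// On PHP 8.x this should report 1, 1, 0, 0.
```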
For example, consider the following simple validation code
[https://3v4l.org/pP5CU]:

    $requiredFields = ['name', 'age', 'hair_colour'];
    // Note: $errorString is not initialised
    foreach ( $requiredFields as $field ) {
        if ( ! isset($_POST[$field]) ) {
            $errorString .= "Missing required field '$field'. ";
        }
    }
    echo $errorString;

This gives an "Undefined variable" Notice / Warning / Error, depending
on the version. That's reasonable, as failing to initialise $errorString
might cause problems if this code is integrated into a loop or larger
function later.
However, switch the code from building up a string to building up an
array, and the result is identical, but the Notice / Warning / Error
goes away [https://3v4l.org/ojZ1O]:

    // Note: $errorArray is not initialised
    foreach ( $requiredFields as $field ) {
        if ( ! isset($_POST[$field]) ) {
            $errorArray[] = "Missing required field '$field'. ";
        }
    }
    echo implode('', $errorArray);

Can anyone give a compelling reason why we should keep this
inconsistency, or should I raise an RFC to bring it in line with other
undefined variable accesses?
Regards,
--
Rowan Tommins
[IMSoP]
I think it's worth raising an RFC. There are other inconsistencies as
well now, depending on how the undefined variable comes to exist,
particularly around arrays.
For example:

    $x = null;
    $a = $x['nope'];
    echo $a;

There should be an error here, but $a gets the value null and only a
warning is issued. Yet, this is already an error:

    $x = new class { public string|null $nope; };
    $a = $x->nope;
    echo $a;

So, depending on how you json_decode(), you may or may not have issues.
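The two snippets above can be checked side by side with a small sketch like the following. The variable names $warnings and $thrown are only for illustration, and it needs PHP 8.0 or later because of the string|null union type.

```php
<?php
$warnings = [];
set_error_handler(function (int $errno, string $errstr) use (&$warnings): bool {
    $warnings[] = $errstr;
    return true; // suppress PHP's default handling
});

// Reading an offset of null: a diagnostic is raised, but execution continues.
$x = null;
$a = $x['nope'];

// Reading an uninitialised typed property: an Error is thrown.
$y = new class { public string|null $nope; };
$thrown = null;
try {
    $b = $y->nope;
} catch (Error $e) {
    $thrown = $e;
}

restore_error_handler();

var_dump($a);                       // NULL
var_dump(count($warnings));         // one diagnostic from the offset read
var_dump($thrown instanceof Error); // the property read threw
```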
On 2022-03-29 at 21:44, Rowan Tommins wrote:
> Can anyone give a compelling reason why we should keep this
> inconsistency, or should I raise an RFC to bring it in line with other
> undefined variable accesses?
I think it deserves an RFC. Then we also capture your excellent
explanation above!
Regards //Björn L
> There are other inconsistencies as well now, depending on how the
> undefined variable comes to exist
Absolutely, but we shouldn't try to do too much in a single RFC; we
wouldn't want it to be rejected for the wrong reasons ^^
On Wed, 30 Mar 2022 at 12:17, Björn Larsson via internals <
internals@lists.php.net> wrote:
> I think it deserves an RFC. Then we also capture your excellent
> explanation above!
On 29 March 2022 at 21:44, Rowan Tommins rowan.collins@gmail.com wrote:
> Hi all,
> If $foo is not defined, statements such as $foo += 1 and $foo .= 'blah' raise "undefined variable" Warnings in PHP 8, and will throw Errors in PHP 9. However, the very similar looking $foo[] = 1 succeeds silently.
> This seems odd to me, as "append to string" and "append to array" seem like very similar operations, with most of the same use cases and possible bugs.
Hi,
There are various subcases to consider:
(1) $x[] = 42; and $x['y'] = 42; where $x is undefined
(2) $x[] = 42; and $x['y'] = 42; where $x is null
(3) $x['y'][] = 42; and $x['y']['w'] = 42; where $x is an array, and $x['y'] is undefined.
Of course, I agree that (1) should be an error.
Case (3) is similar to (1), but I think it is more controversial to change. I bet that there are various places in my code that take advantage of that feature, although personally I don’t mind writing $x['y'] ??= [ ]; when needed.
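For reference, a rough sketch of the three subcases as they behave today. The capture() helper is hypothetical, and the claim that none of them raises a diagnostic is what I would expect on PHP 8.x, where only autovivification from false is deprecated.

```php
<?php
// Hypothetical helper: run $fn, returning its result together with
// the messages of any notices/warnings/deprecations it raised.
function capture(callable $fn): array
{
    $diagnostics = [];
    set_error_handler(function (int $errno, string $errstr) use (&$diagnostics): bool {
        $diagnostics[] = $errstr;
        return true; // suppress PHP's default handling
    });
    try {
        $result = $fn();
    } finally {
        restore_error_handler();
    }
    return [$result, $diagnostics];
}

// (1) $x is undefined
[$r1, $d1] = capture(function () { $x[] = 42; $x['y'] = 42; return $x; });

// (2) $x is null
[$r2, $d2] = capture(function () { $x = null; $x[] = 42; $x['y'] = 42; return $x; });

// (3) $x is an array, and $x['y'] is undefined
[$r3, $d3] = capture(function () { $x = []; $x['y'][] = 42; $x['y']['w'] = 42; return $x; });

var_dump($r1 === $r2);                         // same array either way
var_dump(count($d1), count($d2), count($d3));  // no diagnostics on PHP 8.x
```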
—Claude
We should probably also consider whether there are other places where we're comfortable with ??= working correctly. I'm not sure offhand whether it should be acceptable in cases (1) or (2), but it's a question we should think through and document our decisions on.
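As a starting point for that discussion, this sketch shows how ??= behaves today against an undefined variable, a null variable, and a missing key. The variable names are illustrative; since ??= performs an isset-style check, I would not expect any of these to raise a diagnostic on current PHP versions.

```php
<?php
$diagnostics = [];
set_error_handler(function (int $errno, string $errstr) use (&$diagnostics): bool {
    $diagnostics[] = $errstr;
    return true; // suppress PHP's default handling
});

// Undefined variable: ??= initialises it before the append.
$fromUndefined = (function () { $x ??= []; $x[] = 1; return $x; })();

// Null variable: treated the same way as undefined.
$fromNull = (function () { $x = null; $x ??= []; $x[] = 1; return $x; })();

// Missing key in an existing array.
$fromMissingKey = (function () { $x = []; $x['y'] ??= []; $x['y'][] = 1; return $x; })();

restore_error_handler();

var_dump($fromUndefined, $fromNull, $fromMissingKey, count($diagnostics));
```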
--Larry Garfield
Hi Rowan,
Not really a "compelling reason why we should keep this inconsistency", but
I have occasionally relied on array autovivification for sub-dimensions,
e.g.:

    function f(iterable $xs) {
        $map = []; // initialization!
        foreach ($xs as $x) {
            // $map[foo($x)] ??= []; not needed
            $map[foo($x)][] = bar($x); // autovivification
        }
        // Then e.g.:
        foreach ($map as $foo => $bars) {
            foreach ($bars as $bar) {
                /* ... */
            }
        }
    }

(adapted from my https://externals.io/message/114595#114611 message in the
"Disable autovivification on false" thread).
On the other hand, I agree that $undefined[] = $x looks like a bug... are
both cases the exact same opcode? (If yes, I wouldn't really mind updating
my code, especially for consistency with other "append" operators like .=
or +=, and that could even be an opportunity to rewrite it in a more
"functional" style...)
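As a sketch of what such an update might look like, here is a grouping function with the sub-array initialised explicitly. The name groupBy() and the $keyOf / $valueOf callables are hypothetical stand-ins for foo() and bar() above; this is only one possible shape, not a proposal.

```php
<?php
// Group the values of $xs by a computed key, without relying on
// autovivification of the sub-arrays.
function groupBy(iterable $xs, callable $keyOf, callable $valueOf): array
{
    $map = [];
    foreach ($xs as $x) {
        $key = $keyOf($x);
        $map[$key] ??= [];            // explicit initialisation
        $map[$key][] = $valueOf($x);
    }
    return $map;
}

$grouped = groupBy(
    [1, 2, 3, 4],
    fn (int $x): string => $x % 2 ? 'odd' : 'even',
    fn (int $x): int => $x * 10
);

var_dump($grouped); // 'odd' => [10, 30], 'even' => [20, 40]
```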
Regards,
--
Guilliam Xavier
On Wed, 30 Mar 2022 at 15:33, Guilliam Xavier guilliam.xavier@gmail.com
wrote:
> Not really a "compelling reason why we should keep this inconsistency", but
> I have occasionally relied on array autovivification for sub-dimensions,
I rely on this often when remapping data for analysis. These scripts are
run a handful of times and discarded. Autovivification keeps the code short.
Here's a snippet I wrote yesterday:

    $out = [];
    while ($row = $res->fetchAssociative()) {
        // ...
        $docId = $row['document_id'];
        if (!isset($out[$docId])) {
            $out[$docId] = [
                'application_id' => $row['application_id'],
                'document_id' => $docId,
                'filename' => $row['filename'],
            ];
        }
        $out[$docId]['labels'][$row['document_rejection_reason_id']] = true;
    }

Naturally I would prefer to keep this behaviour for arrays.
Peter
> On the other hand, I agree that $undefined[] = $x looks like a
> bug... are both cases the exact same opcode?
Some very good points raised in this thread about how many different
closely-related cases there are here. I shall have to think clearly
about which cases we want to change, and look at the implementation to
see how easily those can be separated out from the others.
It sounds like there are at least some cases that might pass an RFC vote
to change, though.
Regards,
--
Rowan Tommins
[IMSoP]