Hi,
I've written a new draft RFC to address the NULL
coercion problems:
https://wiki.php.net/rfc/null_coercion_consistency
This is due to the result of the Allow NULL
quiz:
https://quiz.craigfrancis.co.uk/
14 votes for Fatal Type Errors irrespective of strict_types=1
;
13 votes for NULL
coercion when not using strict_types=1
;
8 votes to update some parameters to allow NULL;
I appreciate some want to force strict type checking on everyone, but I
want to make sure we have this properly documented, with names, and
explanations.
Breaking changes should be justified - if they aren't, they only
make upgrading difficult and frustrating (bad for security).
Craig
Hi,
I've written a new draft RFC to address the
NULL
coercion problems:https://wiki.php.net/rfc/null_coercion_consistency
This is due to the result of the Allow
NULL
quiz:https://quiz.craigfrancis.co.uk/
14 votes for Fatal Type Errors irrespective of
strict_types=1
;
13 votes forNULL
coercion when not usingstrict_types=1
;
8 votes to update some parameters to allow NULL;I appreciate some want to force strict type checking on everyone, but I
want to make sure we have this properly documented, with names, and
explanations.Breaking changes should be justified - if they aren't, they only
make upgrading difficult and frustrating (bad for security).Craig
Hello,
While the RFC extensively documents the problem space, it's rather short on what
exactly it proposes.
In particular, does this propose changing user-defined functions under
strict_types=0 to accept null for scalar types?
Eg., this will be allowed (under strict_types=0):
function x(string $y, int $z) {
...
}
x(null, null); //no error, no warning
Regards,
Mel
In particular, does this propose changing user-defined functions under
strict_types=0 to accept null for scalar types?Eg., this will be allowed (under strict_types=0):
function x(string $y, int $z) { ... } x(null, null); //no error, no warning
tbh my focus has been on the problems that have come up with internal
functions.
With user defined functions, I think it's up for debate (still a draft),
but I think those NULL's should be coerced to the specified type (as
documented), where I don't think PHP should be doing type checking under
strict_types=0.
Craig
In particular, does this propose changing user-defined functions under
strict_types=0 to accept null for scalar types?With user defined functions, I think it's up for debate (still a draft),
but I think those NULL's should be coerced to the specified type (as
documented), where I don't think PHP should be doing type checking under
strict_types=0.
I've updated the RFC to note that user-defined functions don't cause a
backwards compatibility issue; but I have added an example to highlight
the coercion inconstancy:
function user_function(string $s, int $i, float $f, bool $b) {
var_dump($s, $i, $f, $b);
}
user_function('1', '1', '1', '1');
// string(1) "1" / int(1) / float(1) / bool(true)
user_function(2, 2, 2, 2);
// string(1) "2" / int(2) / float(2) / bool(true)
user_function(3.3, 3.3, 3.3, 3.3);
// string(3) "3.3" / int(3) / float(3.3) / bool(true)
user_function(false, false, false, false);
// string(0) "" / int(0) / float(0) / bool(false)
user_function(NULL, NULL, NULL, NULL);
// Uncaught TypeError x4?
Hi,
Thanks for the draft.
On Sat, Apr 9, 2022 at 11:30 AM Craig Francis craig@craigfrancis.co.uk
wrote:
On Fri, 8 Apr 2022 at 19:07, Craig Francis craig@craigfrancis.co.uk
wrote:In particular, does this propose changing user-defined functions under
strict_types=0 to accept null for scalar types?With user defined functions, I think it's up for debate (still a draft),
but I think those NULL's should be coerced to the specified type (as
documented), where I don't think PHP should be doing type checking under
strict_types=0.I've updated the RFC to note that user-defined functions don't cause a
backwards compatibility issue; but I have added an example to highlight
the coercion inconstancy:
You've updated the Documentation section (also: did you mean
"inconsistency" rather than "inconstancy"?) but still not the Proposal
(BTW all those sections between "Introduction" and "Proposal" would
probably better be sub-sections of a section named "Problem" or "Current
State" or something).
And for the question: (currently) the RFC is named "NULL Coercion
Consistency" and the Proposal says "Must keep the spirit of the original
RFC, and keep user-defined and internal functions consistent.", which (to
me) implies not only reverting the 8.1 deprecation on internal functions
but also "changing user-defined functions under strict_types=0 to
[coerce] null for scalar type[ declaration]s" indeed.
In any case, that should be written clear in the RFC (either in "Proposal"
or "Open Issues").
Regards,
--
Guilliam Xavier
On Mon, 11 Apr 2022 at 13:05, Guilliam Xavier guilliam.xavier@gmail.com
wrote:
You've updated the Documentation section (also: did you mean
"inconsistency" rather than "inconstancy"?) but still not the Proposal
(BTW all those sections between "Introduction" and "Proposal" would
probably better be sub-sections of a section named "Problem" or "Current
State" or something).And for the question: (currently) the RFC is named "NULL Coercion
Consistency" and the Proposal says "Must keep the spirit of the original
RFC, and keep user-defined and internal functions consistent.", which (to
me) implies not only reverting the 8.1 deprecation on internal functions
but also "changing user-defined functions under strict_types=0 to
[coerce] null for scalar type[ declaration]s" indeed.
In any case, that should be written clear in the RFC (either in "Proposal"
or "Open Issues").
Thank you Guilliam, I've applied all of your suggestions (with a slight
tweak to use strict_types=1
, just because I find it easier to read).
While I am open to for user-defined functions to keep the type check for
NULL
when not in strict_types=1
, it does feel like a bug, and I think it
would be better to be consistent (happy to discuss on or off list).
Craig
Hi,
I've written a new draft RFC to address the
NULL
coercion problems:
The statement that:
But the implementation caused a Type Error when coercing
NULL
for everyone
(even when not using strict_types=1), this seems more of an over-sight
is utterly wrong and was a conscious design choice based on the widely
accepted view that nullable by default like Java is a mistake and defeats
the purpose.
Even the competing RFC from Zend et al [1] proposed this behaviour
A new set of coercion rules will apply to both user-land type hints and
internal type hints. The guiding principals behind these new rules are:
- If the type of the value is an exact match to the type requested by
the hint - allow.- If the value can be coerced to the type requested by the hint
without data loss and without creation of likely unintended data - allow.- In all other cases - reject.
And later to describe the internal function changes:
This RFC proposes to bring the rule-set described in the last section to
internal functions as well, through updates to the zend_parse_parameters()
function.
There was obviously the caveat about NULL
being coerced for internals
functions, but other than usage, another reasons stated:
However, since this requires substantial auditing of internal functions -
especially ones that have default values but don't explicitly declare
themselves as accepting NULLs - it's outside the scope of this RFC and
will be revisited for 7.1. Note that if the
https://wiki.php.net/rfc/nullable_types
https://wiki.php.net/rfc/nullable_types_rfc is accepted it will further
reduce this discrepancy, by allowing user-land functions and internal
functions the same level of granularity in terms of accepting or rejecting
NULL
values for function arguments.
This auditing was to a large extent performed in PHP 8.0 with the
introduction of the stubs.
You can disagree with some of it, but this again brings us back to amending
functions to have nullable arguments, not a blanket change.
I appreciate some want to force strict type checking on everyone, but I
want to make sure we have this properly documented, with names, and
explanations.Breaking changes should be justified - if they aren't, they only
make upgrading difficult and frustrating (bad for security).
The breaking change was justified and explained, and voted unanimously in
favour with 46 votes.
The burden of justification is now on your end.
Moreover, the breaking change has also been announced with ample time to
fix the issues until the next major version.
As with the previous one, this RFC is getting a strong no from my end,
changing the type system into a broken and inconsistent way, and reversing
a unanimously voted RFC needs strong justification.
Best,
George Peter Banyard
But the implementation caused a Type Error when coercing
NULL
for everyone(even when not using strict_types=1), this seems more of an over-sight
is utterly wrong and was a conscious design choice based on the widely
accepted view that nullable by default like Java is a mistake and defeats
the purpose.
Bit hash... anyway, I've replaced that paragraph (it wasn't clear enough),
I just wanted to note how developers not using strict_types=1
would see
it as an over-sight that coercion works for string/int/float/bool but not
null (despite what the documentation says), and how it's introduced a type
check (that they didn't ask for).
I assume you're using strict_types=1
, and will be unaffected by this, so
with all due respect, I'm not considering how you view or use NULL, nor do
I care what Java does, I'm simply focused on how most developers use NULL
in PHP.
"2. If the value can be coerced to the type requested by the hint without
data loss and without creation of likely unintended data - allow"
Yep, as noted under the Documentation heading, and copying from the manual,
“PHP will coerce values of the wrong type into the expected scalar type
declaration if possible”... which it can, and does; for example, coercion
to a string "null is always converted to an empty string".
https://www.php.net/manual/en/language.types.declarations.php#language.types.declarations.strict
https://www.php.net/manual/en/language.types.string.php
[...] but this again brings us back to amending functions to have nullable
arguments, not a blanket change.
As noted in the Introduction, second paragraph, unfortunately this won't
work (especially considering the ~335 parameters affected by this)... just
taking htmlspecialchars()
as the most simple example, I asked a few
developers who use strict_types=1
, and they did not want $string
to
be a nullable argument, because NULL
being passed to this argument suggests
there's a bug elsewhere in their code.
This view was also reflected in the quiz (the names are links to view their
answers):
https://quiz.craigfrancis.co.uk/
The breaking change was justified and explained, and voted unanimously in
favour with 46 votes.
The burden of justification is now on your end.
My whole RFC is the justification, from the lack of discussion of the
impact (where I will note that I am in favour of its intention of
consistency), the roughly 85% of scripts that do not use strict_types=1
,
the almost certainly less than 33% developers who use static analysis, how
tools do not and will not fix this problem, how this goes against the
documentation, and all of the examples.
Keep in mind, if you're using strict_types=1
, great, you're not affected
(I'm not either)... however, I work with quite a few developers, and it's
causing them a lot of problems, and their current approach is to either
disable the deprecation warnings, or to stay with 8.0, simply because they
do not have the time to make the changes... I will note that one team has
been trying to update their code base, it's about 6 months later, and they
still keep adding strval()
to silence these deprecation warnings (yeah, I
know, not good, but it's the easy solution).
changing the type system into a broken and inconsistent way, and reversing
a unanimously voted RFC needs strong justification.
I really appreciate your time to read my draft RFC, and I know there is no
way I can change your mind, but can you explain how would this be "broken
and inconsistent"?
Just take NULL
to String coercion as an example, the documentation clearly
and simply says “null is always converted to an empty string”, this needs
to be updated to end with "... except when being passed to a function, in
which case NULL
will result in a Type Error" (I think that's "broken and
inconsistent").
Craig
Keep in mind, if you're using
strict_types=1
, great, you're not affected
(I'm not either)... however, I work with quite a few developers, and it's
causing them a lot of problems, and their current approach is to either
disable the deprecation warnings, or to stay with 8.0, simply because they
do not have the time to make the changes... I will note that one team has
been trying to update their code base, it's about 6 months later, and they
still keep addingstrval()
to silence these deprecation warnings (yeah, I
know, not good, but it's the easy solution).
This doesn't only apply to end user developers, but also to library and
toolchain developers who need to maintain code covering a range of PHP
versions, and who also need to make these trade-offs.
It's those maintainers that have to deal with the impact of this type of
change, whether needing to apply changes to their libraries to
"suppress" the deprecation notice; or dealing with end-user developers
raising the same issues time and again about it because they don't want
to see deprecation notices appearing in their logs. Even though this was
only a deprecation, most end-user developers don't recognise that
difference from logged errors when using libraries, so the impact is
felt most heavily by library and toolchain maintainers. This 8.1
deprecation was the one that created the most work, and the most stress
for me; and even with an average of 85% code coverage with tests, I'm
still not certain that I've resolved every instance using the dirty $x ?? ''
approach when calling PHP core functions. In many ways, readying
code for the 8.1 release because of this deprecation was more painful
and time consuming than for the 8.0 release. With all the new
deprecations being added for 8.2, I'm looking forward to it even less.
That's also why I wish there was a better risk assessment made for every
deprecation RFC; becase all too often there's little or none; and why I
get frustrated whenever a question is raised about an RFC, and the
person raising these concerns is too often shouted down for raising "a
storm in a teacup".
--
Mark Baker
|. \ -3
|J/ PHP |
|| | __ |
|| |m| |m|
I LOVE PHP
This doesn't only apply to end user developers, but also to library and
toolchain developers who need to maintain code covering a range of PHP
versions
Yep, and thank you for commenting.
Library authors are the ones receiving pressure at the moment, and while
users of the library sometimes help (e.g. pull requests), it's a fairly
thankless and time consuming task, with little/no value for scripts not
using strict_types=1
(you're not the only one simply adding ?? ''
everywhere).
Just one thing to add, this is from people who are complaining about a
deprecation notice (which can and is being ignored), now think of the
typical non-library code when it becomes a Fatal Error.
Craig
Just take
NULL
to String coercion as an example, the documentation
clearly
and simply says “null is always converted to an empty string”, this needs
to be updated to end with "... except when being passed to a function, in
which caseNULL
will result in a Type Error" (I think that's "broken and
inconsistent").
You are taking parts of the documentation out of context, and omitting
the start of the whole "Converting to string" section:
"A value can be converted to a string using the (string) cast or the
strval()
function. String conversion is automatically done in the scope
of an expression where a string is needed. This happens when using the
echo or print functions, or when a variable is compared to a string. The
sections on Types and Type Juggling will make the following clearer."
In the same section it is also defined how resources and arrays get
converted to strings. Following your argument, you should be able to
pass arrays and resources to a string parameter because there is a clear
definition in the documentation of what then happens.
I unfortunately really don't like the direction of your RFC, because I
see null as a real type and because it increases the divide between
strict_types and not using strict_types. I have never used strict_types
- in modern PHP code I find it unnecessary, if you use parameter,
property and return types. I am glad PHP has decreased the difference
between using strict_types or not, so your direction of trying to
increase the difference seems like the wrong way to go. Ideally I would
rather want to see a future where strict_types can be removed/reconciled.
You are taking parts of the documentation out of context, and omitting
the start of the whole "Converting to string" section:"A value can be converted to a string using the (string) cast or the
strval()
function. String conversion is automatically done in the scope
of an expression where a string is needed. This happens when using the
echo or print functions, or when a variable is compared to a string. The
sections on Types and Type Juggling will make the following clearer."
I'm sorry, I've read this several times now, and I'm not sure what this
adds. My RFC simply quotes the paragraphs that explain how null is coerced
(other than the small abbreviation for booleans, those paragraphs are
included in their entirety), I don't think it needs any more context than
that, it's not like I'm mis-representing how coercion works (and is used)
in PHP.
I could expand my first quote to include "... For example, a function that
is given an int for a parameter that expects a string will get a variable
of type string", but I don't think it's that useful.
In the same section it is also defined how resources and arrays get
converted to strings. Following your argument, you should be able to
pass arrays and resources to a string parameter because there is a clear
definition in the documentation of what then happens.
No, my RFC is about the problems this change has caused, how it is making
it difficult to upgrade. I'm only referencing the documentation to confirm
how NULL
has always been coerced.
I unfortunately really don't like the direction of your RFC, because I
see null as a real type and because it increases the divide between
strict_types and not using strict_types. I have never used strict_types
- in modern PHP code I find it unnecessary, if you use parameter,
property and return types. I am glad PHP has decreased the difference
between using strict_types or not, so your direction of trying to
increase the difference seems like the wrong way to go. Ideally I would
rather want to see a future where strict_types can be removed/reconciled.
You can see NULL
however you like, but most developers do not share that
view. NULL
has been passed to these functions, since, well, forever; and
changing code to manually convert NULL
to the relevant type is incredibly
time consuming, and of questionable value (e.g. when developers simply add
strval()
everywhere).
Either way, I'm not trying to "increase the difference" (this is how it has
always worked). If you want type checking, good, it can be really useful,
and I recommend it with static analysis... but it's enabled with
strict_types=1
.
Craig
You are taking parts of the documentation out of context, and omitting the start of the whole "Converting to string" section: "A value can be converted to a string using the (string) cast or the `strval()` function. String conversion is automatically done in the scope of an expression where a string is needed. This happens when using the echo or print functions, or when a variable is compared to a string. The sections on Types and Type Juggling will make the following clearer."
I'm sorry, I've read this several times now, and I'm not sure what
this adds. My RFC simply quotes the paragraphs that explain how null
is coerced (other than the small abbreviation for booleans, those
paragraphs are included in their entirety), I don't think it needs any
more context than that, it's not like I'm mis-representing how
coercion works (and is used) in PHP.
Mentioning the documentation as a reason to be "consistent" (which comes
up again and again in your arguments with this RFC) just seems like a
bogus reason to me. It is nitpicking about specific sentences in the
documentation without refering to the surrounding context. It would be
nicer to argue according to real technical arguments instead of
arguments about sentences taken out of context in the documentation.
You can see
NULL
however you like, but most developers do not share
that view.NULL
has been passed to these functions, since, well,
forever; and changing code to manually convertNULL
to the relevant
type is incredibly time consuming, and of questionable value (e.g.
when developers simply addstrval()
everywhere).
"Most developers do not share that view". I find statements such as
these as a tall order - did you interview the majority of developers
around the world? Are you suggesting you are talking as a representative
of a large population of PHP developers? How do you think you got such a
role? I would prefer some humility and not assume you are an elected
leader of an underrepresented group of developers and even knowing the
intention behind their code.
"NULL has been passed to these functions, since, well, forever" - this
also goes for wrong values being passed to functions since forever
leading to security issues and bugs. That is the whole point of
developing a language: Reducing the surface for bugs, improving parts of
it where a lot of bugs have happened historically and not making the
same mistakes as other languages (and learning from the good parts of
other languages). More people writing code in a certain way does not
make that code better or safer.
Mentioning the documentation as a reason to be "consistent" (which comes
up again and again in your arguments with this RFC) just seems like a
bogus reason to me. It is nitpicking about specific sentences in the
documentation without refering to the surrounding context. It would be
nicer to argue according to real technical arguments instead of
arguments about sentences taken out of context in the documentation.
I don't, do I?
I've checked multiple times, and I only use the documentation to note how
NULL
coercion works. I'm using the word "consistent" to be "consistent with
scalar types"... as in, a string/int/float/bool can be coerced, but NULL
coercion has now been deprecated in some cases, as shown with the examples.
"Most developers do not share that view". I find statements such as
these as a tall order - did you interview the majority of developers
around the world? Are you suggesting you are talking as a representative
of a large population of PHP developers? How do you think you got such a
role?
My intro says "Roughly 85% scripts do not use strict_types=1", and "Roughly
33% of developers use static analysis" (source of numbers noted in my
RFC)... while that does not guarantee anything, it's a fairly good
indication that most developers do not care about types (in regards to
coercion), or specifically, how you "see null as a real type".
As an aside, I would like to include more stats (maybe WordPress install
numbers?), but I couldn't think of any which were easy to source (ref
reliability), or be that useful to the discussion.
I would prefer some humility and not assume you are an elected
leader of an underrepresented group of developers and even knowing the
intention behind their code.
I'm simply describing the situation, and I have suggested two ways to avoid
the upgrade problems (the suggestion to update some arguments to be
nullable has been rejected by written feedback, and via my questionnaire).
"NULL has been passed to these functions, since, well, forever" - this
also goes for wrong values being passed to functions since forever
leading to security issues and bugs.
I say in the RFC that "Arrays, Resources, and Objects (without
__toString()) cannot be coerced (for fairly obvious reasons)".
For example htmlspecialchars(array())
will, as of 8.0, throw a Type Error
(and a warning before), that does represent a bug (it should not happen),
and I'm very happy with that situation. But NULL
coercion is different, it
has worked well for years, and it is used a lot (see the examples in my
RFC).
Craig
My intro says "Roughly 85% scripts do not use strict_types=1", and "Roughly
33% of developers use static analysis" (source of numbers noted in my
RFC)... while that does not guarantee anything, it's a fairly good
indication that most developers do not care about types (in regards to
coercion), or specifically, how you "see null as a real type".As an aside, I would like to include more stats (maybe WordPress install
numbers?), but I couldn't think of any which were easy to source (ref
reliability), or be that useful to the discussion.
I have never used strict_types in any code I have ever written, and I
care about types and type coercions. Yet I do not like the strict_types
distinction and I am glad that I do not need to use it, and I think we
are not that far away from even reconciling these two systems. I do not
mind that the string "5" can be passed to an int parameter, while
passing the string "hello" will throw a type exception no matter what
mode you use. With some adjustments to certain coercions the advantage
of strict_types would be even more neglible - as you can pass "hello" to
a boolean parameter without problems currently, and I don't like that.
Looking at usage numbers and assuming that supports your point seems
like a big leap to me. You are assuming these developers are making a
choice while it seems quite possible that a lot of code has been written
without thinking about all the implications, which is where so many bugs
are coming from. In the last months reviewing code I noticed quite a few
developers are copy-pasting code like crazy, or using generated code
from other tools - both of which are often not well reasoned about. So
this has little to do with developers not caring about methods of type
coercions and more about the language permitting such code, leading to
many follow-up problems.
Also, this "problem" only affects the internal functions, yet you want
to change it for all functions. If you use a framework and pass a value
to a method of that framework, it would already have created a type
error with null and a non-nullable argument for quite some time. I think
you are making this out to be a bigger problem than it is. Sure, in a
codebase with lots of custom code where no incoming variable is checked
if it even exists there will be problems, but even then it is easily
fixable. In a codebase based on a common framework it is much less of a
problem, as you will probably be using lots of framework components and
not mainly internal functions, and there you already have the
"limitation" of not being able to pass null to a non-nullable method.
I have never used strict_types in any code I have ever written, and I care
about types and type coercions. Yet I do not like the strict_types
distinction and I am glad that I do not need to use it, and I think we are
not that far away from even reconciling these two systems. I do not mind
that the string "5" can be passed to an int parameter, while passing the
string "hello" will throw a type exception no matter what mode you use.
With some adjustments to certain coercions the advantage of strict_types
would be even more neglible - as you can pass "hello" to a boolean
parameter without problems currently, and I don't like that.
Yep, I agree with everything you have said there, and it's what George
seems to be working towards (which I also agree with):
https://github.com/Girgias/unify-typing-modes-rfc
Looking at usage numbers and assuming that supports your point seems like
a big leap to me. You are assuming these developers are making a choice
while it seems quite possible that a lot of code has been written without
thinking about all the implications,
Actually, that's exactly what I see happening - NULL
is passed to these
internal functions without any thought, and unlike arrays/resources/etc
which cannot be sensibly coerced, NULL
coercion has been fine, and has
worked for (as long as I can tell) forever.
Back to the most simple example I can think of:
$search = filter_input(INPUT_GET, 'q'); // Or any of the framework examples.
echo 'Results for ' . htmlspecialchars($search);
It does not matter if the request explicitly set "/?q=" in the URL or not;
filter_input()
and frameworks show there is a difference by returning
NULL
or an Empty String... but most of the time, no thought is given to
that difference, developers just expect it to work.
which is where so many bugs are coming from. In the last months reviewing
code I noticed quite a few developers are copy-pasting code like crazy, or
using generated code from other tools - both of which are often not well
reasoned about. So this has little to do with developers not caring about
methods of type coercions and more about the language permitting such code,
leading to many follow-up problems.
Yep, and that's why I'm in favour of stopping the broken/weird coercions...
it's just that NULL
is different, and it's fine (as noted, there are some
developers who use it part of their strict type checking, but that seems to
be fairly unusual).
Also, this "problem" only affects the internal functions, yet you want to
change it for all functions.
My RFC is focused on the backwards compatibility problems for internal
functions... user defined functions are only included for consistency
reasons (keeping the spirit of the original RFC), because they probably
should coerce NULL, like how they coerce integers to strings, etc.
If you use a framework and pass a value to a method of that framework, it
would already have created a type
error with null and a non-nullable argument for quite some time.
Yep, and that has caused... I wouldn't say problems, but minor confusion?
Craig
Yep, I agree with everything you have said there, and it's what George
seems to be working towards (which I also agree with):
Yet your RFC goes exactly against the quoted document by making the
differences between strict_types larger.
I also cannot explain why
NULL
should be rejected, other than for those
developers who seeNULL
as an invalid value and also use
strict_types=1
... as in, when a team of developers spends a few hundred
hours addingstrval()
everywhere, what does that actually achieve? what do
they tell the client? does it make the code easier to read? does it avoid
any bugs? or is it only for 8.1 compatibility?
I don't get why you would add strval everywhere. Why are you getting
null everywhere? In most of the examples in your RFC "null" is
specifically chosen as a default value and could easily be an empty
string (why do something like "$_POST['value'] ?? null" if you actually
want an empty string?). For the framework examples, the second argument
of these functions is the default value - it does not have to be null.
And "filter_input" I have not seen in a codebase yet, probably because
of its weird double function as retrieving an input variable and also
filtering it, with both null, false and the result as a possible return
value.
The few cases where I encountered the deprecation notice it was always a
mistake and easy to fix, and I was glad that was pointed out to me.
There is another 3.5 years until PHP 9 is likely to come out, which is a
lot of time for people to adjust their codebase. I could even see an
argument for not promoting it to a fatal error in 9.0 if so many people
need more time. Removing the deprecation and changing fundamental
aspects about the type system in PHP by auto-coercing null just so
people do not need to update their codebase seems like the worst
possible way forward.
So far, your RFC does not mention the arguments made against your
proposed changes in an understandable way. George Peter Banyard wrote
some extensive arguments and you have only included one sentence without
any of his arguments and try to refute it in the very next sentence as
not really an argument, and it was mentioned by him that coercing null
in other languages like Java has lead to problems. Comparisons to other
languages could be helpful, as NULL
is not just a value in PHP - NULL
in
MySQL for example also cannot be coerced and is its own distinct value
(it even has its own syntax for comparisons).
I am still bothered by the language of the RFC in general where you
often write things like ".. it hasn't been the case for everyone else",
"most developers are not in a position" and so on. Assuming you know
what everyone is doing and how everyone is doing it in an RFC does not
seem constructive. All the numbers you cite are also circumstantial and
not about the supposed problem discussed in the RFC - for example, you
assume people not using static analysis are in favor of your RFC, even
though this is pure speculation. Compared to all the other RFCs in the
last 3 years I read through I don't only disagree with this RFC, I also
think it is not very well reasoned about and does not convey the
discussions that were held about the topic on this mailing list. It
mainly contains your opinion, which seems insufficient for an RFC.
I don't get why you would add strval everywhere. Why are you getting null
everywhere?
As to adding strval($var)
, or (string) $var
, or $var ?? ""
everywhere... that's because we (or frameworks) cannot simply change the
defaults for these "Common sources of NULL", nor can everyone simply add
strval()
around these sources, because NULL
can still be useful, with
possible NULL
checks later (e.g. if you need to distinguish between a
missing value and an empty string)... the only safe/easy way to do it is at
the sinks, and there are lots of them.
Unfortunately we cannot magically update all of the existing code in the
world... these changes have to be done manually (note the lack of tooling),
and the time taken should be justified (like how the removal of magic
quotes was justified).
In most of the examples in your RFC "null" is specifically chosen as a
default value and could easily be an empty string (why do something like
"$_POST['value'] ?? null" if you actually want an empty string?).
You could, but these are common and useful patterns/sources... and most
importantly, NULL
has been the default for years.
For the framework examples, the second argument of these functions is the
default value - it does not have to be null.
Yes, but we are talking about backwards compatibility... and while most
developers could always specify the default (second argument) to be an
empty string, that's not always appropriate (frameworks default to NULL
so
developers can tell when a value has not been provided, which can be useful
in some cases).
And "filter_input" I have not seen in a codebase yet, probably because of
its weird double function as retrieving an input variable and also
filtering it, with both null, false and the result as a possible return
value.
Sorry, but it is used, and it can return NULL.
The few cases where I encountered the deprecation notice it was always a
mistake and easy to fix, and I was glad that was pointed out to me.
Lucky you... and as noted with Rowan, getting to look at your code again
can be useful. I also note that "making each change is fairly easy", but
you will have much better luck with static analysis (where variables are
reliably checked if they can be nullable). But the important bit is that
these problems "are difficult to find" and "there are many of them".
If you want a very simple public example:
https://github.com/PHPCompatibility/PHPCompatibility/commit/ab7b0b82d95c82e62f9cffb0f1960f3ccbf0805e
Or, if you want to do some digging:
https://twitter.com/mwop/status/1441044164880355342
There is another 3.5 years until PHP 9 is likely to come out, which is a
lot of time for people to adjust their codebase. I could even see an
argument for not promoting it to a fatal error in 9.0 if so many people
need more time.
If it's deprecated, that is an intent to break... and if no other solutions
present themselves, and the vote for this RFC fails... why would it be
delayed? It will then be clear the Internals developers want this (they
would have made an informed choice, and put their name to it).
Removing the deprecation and changing fundamental aspects about the type
system in PHP by auto-coercing null just so people do not need to update
their codebase seems like the worst possible way forward.
Many times NULL
coercion is fine, like how the integer 5 can be coerced to
the string "5"... and it's worth remembering this is how it's always worked
(when not using strict_types=1
).
So far, your RFC does not mention the arguments made against your proposed
changes in an understandable way.
Are you going to suggest any improvements? what have I missed? I'm trying
to keep it short, because I know long RFC's can be a problem.
George Peter Banyard wrote some extensive arguments and you have only
included one sentence without any of his arguments and try to refute it in
the very next sentence as not really an argument,
The "one sentence" you refer to, I assume that's "Userland scalar types
[...] did not include coercion from NULL
for very good reasons"?
Unfortunately the email does not say what these "good reasons" are, so I
could only go with the Scalar Type Declarations RFC, where I summarised the
only paragraph that talks about NULL
(in regards to weak type checks). I
think “to be consistent with our existing type declarations” is a pretty
good quote summary of that paragraph, and I think it's fair for me to add
"no further details given". The comment that NULL
was "never accepted for
scalar type declarations" has already been discussed, and does not add
anything extra.
https://wiki.php.net/rfc/scalar_type_hints_v5#behaviour_of_weak_type_checks
As to the 2 emails from George... I read them very carefully, several
times, I have covered those details in my RFC, and I did reply... but just
to be absolutely sure I haven't missed anything:
https://externals.io/message/117501#117515
- The Type Error when coercing
NULL
seeming to be an oversight, I replied
with an explanation, and tweaked the wording of that paragraph:
-
Java's "nullable by default" is one way
NULL
is used, I don't think this
adds anything new. -
The RFC from Zend, I covered in my reply, specifically the second rule
"If the value can be coerced", whichNULL
can be (I will note that some
developers seeNULL
as an invalid value, but that's already mentioned in my
RFC). -
"amending functions to have nullable arguments", it was the basis of my
previous RFC, I've noted in this in this RFC, and in a few of my replies.
As an aside, no one seems to agree thathtmlspecialchars()
should
explicitly acceptNULL
(due to the reason noted in the second paragraph),
let alone the other ~334 parameters. -
"The breaking change was justified and explained", I've covered in my
RFC and reply, the justification was about consistency between internal and
external functions (which I talk about in my RFC, and agree with in
spirit), and the impact was not explained (ref "the lack of discussion of
the impact", and the single email from Craig Duncan). -
"the breaking change has also been announced with ample time to fix the
issues", see the "Temporary Solutions" section in my RFC, on how this is
being ignored, and the complete lack of tooling to make these changes.
https://externals.io/message/117501#117523
-
"strict_types were a mistake", yep, that's fine, I just don't think it
adds anything to my RFC. -
"making coercive typing mode more sensible", going through the quoted
RFC from Girgias, that seems to be fine, and I agree with. -
"Userland scalar types [...] did not include coercion from
NULL
for very
good reasons.", as noted above, I added this to the Open Issues on the 15th
April. -
"general consensus that internals and userland should behave
identically", I agree with in spirit, and has been included since my very
first draft - where I said we must "keep user-defined and internal
functions consistent". -
"align internal behaviour with the accepted better design choice of
userland", I'm not sure I can quote this, as "better" needs to be described
and explained. In short, it's not adding anything that hasn't already been
noted above.
I don't think the rest of the second email raises any other points that are
relevant to my RFC (it's more about how PHP is developed).
I am still bothered by the language of the RFC in general where you often
write things like ".. it hasn't been the case for everyone else", "most
developers are not in a position" and so on. Assuming you know what
everyone is doing and how everyone is doing it in an RFC does not seem
constructive. All the numbers you cite are also circumstantial and not
about the supposed problem discussed in the RFC.
I am human, and I wrote it in a way that I hope is understandable.
That said, I'm getting the feeling you are the one intentionally quoting
"out of context" (1st quote can only apply to those using strict_types=1
,
so by definition "it hasn't been the case for everyone else"; 2nd is about
the problem of how "most developers are not in a position" to use "very
strict Static Analysis", which I back up with the low percentage usage of
strict_types=1
and the quoted JetBrains developer survey).
Admittedly I've only got the 2 stats in my RFC intro, but they are
relevant, and I did say to you on 14 Apr 2022 "I would like to include more
stats (maybe WordPress install numbers?), but I couldn't think of any which
were easy to source (ref reliability), or be that useful to the discussion"
... do you have anything to add here?
- for example, you assume people not using static analysis are in favor of
your RFC, even though this is pure speculation.
What? Where?
Compared to all the other RFCs in the last 3 years I read through I don't
only disagree with this RFC, I also think it is not very well reasoned
about and does not convey the discussions that were held about the topic on
this mailing list. It mainly contains your opinion, which seems
insufficient for an RFC.
I have spent considerable amount of time replying to your points, and as
noted above, I have included peoples comments in my RFC. It is also my
second attempt at solving the problem due to feedback. If you can give me
something/anything to contribute to the RFC, I will happily add/update it;
or if you can provide any other solutions to this problem (please don't
pretend it doesn't exist), then I will happily discuss or even recommend
them.
Craig
There is another 3.5 years until PHP 9 is likely to come out, which is a
lot of time for people to adjust their codebase. I could even see an
argument for not promoting it to a fatal error in 9.0 if so many people
need more time.If it's deprecated, that is an intent to break... and if no other solutions
present themselves, and the vote for this RFC fails... why would it be
delayed? It will then be clear the Internals developers want this (they
would have made an informed choice, and put their name to it).
A deprecation notice is fairly harmless. If in two years many projects
and many developers say that they need more time to fix these
deprecations and promoting them to errors would cause a lot of problems,
then it would be easy to prolong the period where it is only a
deprecation with a small RFC. By then more people will know if they are
impacted, many frameworks will have updated, and there will be a clearer
picture if this is such a big deal or not. Right now this is not clear -
I doubt most projects are using PHP 8.1, not even all
frameworks/libraries are compatible to PHP 8.1.
Are you going to suggest any improvements? what have I missed? I'm trying
to keep it short, because I know long RFC's can be a problem.
An RFC should cover the discussions held on this mailing list. From the
RFC howto:
"Update your RFC to document all the issues and discussions. Cover both
the positive and negative arguments."
Do you honestly believe you have done that? I have tried to discuss some
counterpoints and alternatives to your proposal, but none are mentioned
in the RFC. I also don't see the discussion points of other people in
the RFC. None of the alternatives to your proposal are mentioned in the
RFC - like changing the internal functions to accept null instead. There
have been quite a few suggestions and arguments made so far, and I don't
see them in the RFC.
I have discussed RFCs with a few people on this mailing list, sometimes
with very different opinions about a topic, and not always was there a
resolution to the topic at hand. Yet the discussions with you have been
the most frustrating so far, because you say you are open to arguments
and proposals, yet you do not seem to consider them at all - yet you
really seem to believe you are totally open-minded. I have been
impressed by a few people on this mailing list who I disagreed with
wholeheartedly, because I noticed with how much care they tried to relay
disagreeing arguments and other proposals in the RFC and how balanced
the RFCs were at the end. Your RFC seems totally incomplete to me and
mainly based on your made-up opinion. But at this point I don't think I
will get through to you in any way, which is why I will step out of this
conversation. If the RFC comes to a vote in the current conditions
though I will raise an objection that it does not represent the
discussions held on this mailing list and should be disregarded because
of that.
There is another 3.5 years until PHP 9 is likely to come out, which is a lot of time for people to adjust their codebase. I could even see an argument for not promoting it to a fatal error in 9.0 if so many people need more time.
If it's deprecated, that is an intent to break... and if no other solutions present themselves, and the vote for this RFC fails... why would it be delayed? It will then be clear the Internals developers want this (they would have made an informed choice, and put their name to it).
A deprecation notice is fairly harmless. If in two years many projects and many developers say that they need more time to fix these deprecations and promoting them to errors would cause a lot of problems, then it would be easy to prolong the period where it is only a deprecation with a small RFC. By then more people will know if they are impacted, many frameworks will have updated, and there will be a clearer picture if this is such a big deal or not. Right now this is not clear - I doubt most projects are using PHP 8.1, not even all frameworks/libraries are compatible to PHP 8.1.
You're right, I am working with a few development teams that are simply not upgrading to 8.1 at the moment, they don't have the time to fix this issue... two are using set_error_handler()
to ignore this specific deprecation notice, and one team has moved over, but they are still finding these issues ~6 months later.
Are you going to suggest any improvements? what have I missed? I'm trying to keep it short, because I know long RFC's can be a problem.
An RFC should cover the discussions held on this mailing list. From the RFC howto:
"Update your RFC to document all the issues and discussions. Cover both the positive and negative arguments."
Do you honestly believe you have done that? I have tried to discuss some counterpoints and alternatives to your proposal, but none are mentioned in the RFC. I also don't see the discussion points of other people in the RFC. None of the alternatives to your proposal are mentioned in the RFC - like changing the internal functions to accept null instead. There have been quite a few suggestions and arguments made so far, and I don't see them in the RFC.
I'll ask again, what have I missed? as in, please, give me details.
Last time you vaguely said that "George Peter Banyard wrote some extensive arguments", I spent at least an hour going though every single point (again), and this time I listed them out for you, and you ignored that, not even an apology.
The only point you made in that last paragraph is "changing the internal functions to accept null", it's already there, under the heading "Rejected Features", which includes the link to the RFC I created, and the reason it was rejected. That reason is also noted in the second paragraph of the Introduction.
I have discussed RFCs with a few people on this mailing list, sometimes with very different opinions about a topic, and not always was there a resolution to the topic at hand. Yet the discussions with you have been the most frustrating so far, because you say you are open to arguments and proposals, yet you do not seem to consider them at all - yet you really seem to believe you are totally open-minded. I have been impressed by a few people on this mailing list who I disagreed with wholeheartedly, because I noticed with how much care they tried to relay disagreeing arguments and other proposals in the RFC and how balanced the RFCs were at the end. Your RFC seems totally incomplete to me and mainly based on your made-up opinion. But at this point I don't think I will get through to you in any way, which is why I will step out of this conversation. If the RFC comes to a vote in the current conditions though I will raise an objection that it does not represent the discussions held on this mailing list and should be disregarded because of that.
I have no idea how to respond to this, I have spent a considerable amount of time carefully reading your 5 emails, responding to every single point you raised (which you simply ignore), ensuring all points are in my RFC, and this is your attitude? no apology, no thanks, just vague complaints and insults?
If you have any objection that is not already noted in the RFC, please, let me know what it is... maybe "Andreas does not like the wording in this RFC, or the stats provided, but does not provide any alternative"?
Craig
You're right, I am working with a few development teams that are simply not upgrading to 8.1 at the moment, they don't have the time to fix this issue... two are using
set_error_handler()
to ignore this specific deprecation notice, and one team has moved over, but they are still finding these issues ~6 months later.
Taking time to find and make fixes is the whole point of deprecation
notices - if nothing else changes, they can expect to have another 3.5
years before a version of PHP is released where these become errors.
Upgrading to PHP 8.1 as soon as possible but only tackling the
deprecations when time allows is absolutely the right thing to do.
Regards,
--
Rowan Tommins
[IMSoP]
Taking time to find and make fixes is the whole point of deprecation notices - if nothing else changes, they can expect to have another 3.5 years before a version of PHP is released where these become errors. Upgrading to PHP 8.1 as soon as possible but only tackling the deprecations when time allows is absolutely the right thing to do.
That's true for things that need to be fixed, like the deprecation and removal of magic quotes, but this is a lot of work for... I can't provide an answer to that.
In the mean time, I've only convinced 3 teams to upgrade to 8.1, and two of them are using set_error_handler()
to specifically ignore this error (the log files were filled with far too many "passing null to parameter" deprecation messages, whereas adding #[ReturnTypeWillChange] wasn't too bad).
Craig
On Mon, 11 Apr 2022 at 19:22, Craig Francis craig@craigfrancis.co.uk
wrote:
But the implementation caused a Type Error when coercing
NULL
foreveryone (even when not using strict_types=1), this seems more of an
over-sightis utterly wrong and was a conscious design choice based on the widely
accepted view that nullable by default like Java is a mistake and defeats
the purpose.Bit hash... anyway, I've replaced that paragraph (it wasn't clear enough),
I just wanted to note how developers not usingstrict_types=1
would see
it as an over-sight that coercion works for string/int/float/bool but not
null (despite what the documentation says), and how it's introduced a type
check (that they didn't ask for).I assume you're using
strict_types=1
, and will be unaffected by this, so
with all due respect, I'm not considering how you view or use NULL, nor do
I care what Java does, I'm simply focused on how most developers useNULL
in PHP.
If I indeed exclusively used strict_types I wouldn't care, but I don't
because IMHO strict_types were a mistake and are harmful and induce some
self inflicted problems.
I've spent a large amount of time making coercive typing mode more sensible
and aligning the behaviour as close to reasonably possible with
strict_types so that the possibility of dropping strict_types all together
wouldn't be outrageous.
I've written about this elsewhere and on this list already but main points
are located here: https://github.com/Girgias/unify-typing-modes-rfc
Userland scalar types, however they would have been introduced, did not
include coercion from NULL
for very good reasons.
Wanting to change this behaviour needs more justification than a paragraph
in docs, as the docs should reflect the reality of the implementation.
The second aspect is the general consensus that internals and userland
should behave identically.
Which is why the deprecation is here in the first place, to align internal
behaviour with the accepted better design choice of userland.
Your RFC is going against these two pillars.
Moreover, the fact that we are deprecating this and not just changing the
behaviour from one version to another is that we do recognize PHP
developers use NULL
in this way, and we are giving them advance notice of
the change.
Also a deprecation notice should, IMHO, never be promoted to an exception.
This is maybe something we need to get better at teaching end users so that
they don't pressure library maintainers (or themselves) to fix all of them
when they still have a couple of years.
Be that either on the php.net docs or elsewhere, to teach end users to
either not do this, or at least exclude the vendor directory from their
error handler.
However, and I cannot stress this enough, we cannot just stop using this
tool available to us to steer the language into being better.
Because then the other two choices are doing hard breaks with no notice in
a major version, or not breaking anything. Both things which are
unreasonable to expect.
There are behaviours of PHP that will probably never be "fixable", but this
should not prevent us from improving the aspects we can.
And if we only focused on how developers use PHP we would still have
register globals, magic quotes, etc.
Clearly it seems you won't listen to why these changes were made and want
to reverse course, therefore good luck convincing enough people to accept
such an RFC.
I think I've said all there is to say and won't interact further with any
such proposals.
Regards,
George P. Banyard
https://wiki.php.net/rfc/null_coercion_consistency
I've spent a large amount of time making coercive typing mode more
sensible and aligning the behaviour as close to reasonably possible with
strict_types so that the possibility of dropping strict_types all together
wouldn't be outrageous.
I've written about this elsewhere and on this list already but main points
are located here: https://github.com/Girgias/unify-typing-modes-rfc
Thank you George, I do appreciate your work on this, I agree with pretty
much everything you have written, and I really do want PHP to move in that
direction.
The only thing I disagree with is NULL
coercion, because it has worked
perfectly well for years, and is used so often.
Userland scalar types, however they would have been introduced, did not
include coercion from
NULL
for very good reasons.
Wanting to change this behaviour needs more justification than a paragraph
in docs, as the docs should reflect the reality of the implementation.
Some people see NULL
as an invalid value, which can sometimes be useful in
a strict_types=1
environment, but that's the only "good reason" I can
find... it's why I changed my mind about making some parameters nullable,
and took a different approach with this RFC.
But most developers do not use NULL
like that, and that's why it's passed
so often to the ~335 function parameters I've noted (most of them are on
the list because they realistically receive user values, which can be NULL
in most frameworks).
The second aspect is the general consensus that internals and userland
should behave identically.
Yep, they really should... it's why I said we "Must keep the spirit of the
original RFC, and keep user-defined and internal functions consistent".
Moreover, the fact that we are deprecating this and not just changing the
behaviour from one version to another is that we do recognize PHP
developers useNULL
in this way, and we are giving them advance notice of
the change.
Also a deprecation notice should, IMHO, never be promoted to an exception.
Unfortunately deprecation notices are being ignored (as noted in my RFC,
with examples on how); and there is the intention to use Type Error
Exceptions in 9.0 (the thing I'm trying to avoid).
I'm approaching this from a security point of view - the last thing I want
to do in ~5 years is auditing code/systems where they have stuck with 8.x
because it's too hard to upgrade.
There are behaviours of PHP that will probably never be "fixable", but this
should not prevent us from improving the aspects we can.
And if we only focused on how developers use PHP we would still have
register globals, magic quotes, etc.
I want the language to be better as well, and I was very glad to see the
back of Register Globals and Magic Quotes (I know they were a pain to
remove, but it was well justified, as they caused so many problems).
Clearly it seems you won't listen to why these changes were made and want
to reverse course, therefore good luck convincing enough people to accept
such an RFC.
I have been listening/reading (it's why I've taken about 2 hours to read
what you have written, and reply). I have also been discussing this with
different people (on and off list)... unlike the original RFC discussion,
where it was only Craig Duncan that mentioned this issue:
https://externals.io/message/112327#112531
And it's because I've been listening to people that I changed my mind, and
created this new RFC (as noted above).
I think I've said all there is to say and won't interact further with any
such proposals.
That's a shame, as I'd like to discuss solutions to this problem you would
be happy with.
Craig
https://wiki.php.net/rfc/null_coercion_consistency
I've spent a large amount of time making coercive typing mode more
sensible and aligning the behaviour as close to reasonably possible with
strict_types so that the possibility of dropping strict_types all together
wouldn't be outrageous.
I've written about this elsewhere and on this list already but main points
are located here: https://github.com/Girgias/unify-typing-modes-rfcThank you George, I do appreciate your work on this, I agree with pretty
much everything you have written, and I really do want PHP to move in that
direction.The only thing I disagree with is
NULL
coercion, because it has worked
perfectly well for years, and is used so often.Userland scalar types, however they would have been introduced, did not
include coercion from
NULL
for very good reasons.
Wanting to change this behaviour needs more justification than a paragraph
in docs, as the docs should reflect the reality of the implementation.Some people see
NULL
as an invalid value, which can sometimes be useful in
astrict_types=1
environment, but that's the only "good reason" I can
find... it's why I changed my mind about making some parameters nullable,
and took a different approach with this RFC.But most developers do not use
NULL
like that, and that's why it's passed
so often to the ~335 function parameters I've noted (most of them are on
the list because they realistically receive user values, which can beNULL
in most frameworks).The second aspect is the general consensus that internals and userland
should behave identically.
Yep, they really should... it's why I said we "Must keep the spirit of the
original RFC, and keep user-defined and internal functions consistent".Moreover, the fact that we are deprecating this and not just changing the
behaviour from one version to another is that we do recognize PHP
developers useNULL
in this way, and we are giving them advance notice of
the change.
Also a deprecation notice should, IMHO, never be promoted to an exception.Unfortunately deprecation notices are being ignored (as noted in my RFC,
with examples on how); and there is the intention to use Type Error
Exceptions in 9.0 (the thing I'm trying to avoid).I'm approaching this from a security point of view - the last thing I want
to do in ~5 years is auditing code/systems where they have stuck with 8.x
because it's too hard to upgrade.There are behaviours of PHP that will probably never be "fixable", but this
should not prevent us from improving the aspects we can.
And if we only focused on how developers use PHP we would still have
register globals, magic quotes, etc.I want the language to be better as well, and I was very glad to see the
back of Register Globals and Magic Quotes (I know they were a pain to
remove, but it was well justified, as they caused so many problems).Clearly it seems you won't listen to why these changes were made and want
to reverse course, therefore good luck convincing enough people to accept
such an RFC.I have been listening/reading (it's why I've taken about 2 hours to read
what you have written, and reply). I have also been discussing this with
different people (on and off list)... unlike the original RFC discussion,
where it was only Craig Duncan that mentioned this issue:https://externals.io/message/112327#112531
And it's because I've been listening to people that I changed my mind, and
created this new RFC (as noted above).I think I've said all there is to say and won't interact further with any
such proposals.
That's a shame, as I'd like to discuss solutions to this problem you would
be happy with.Craig
I don't have a vote, so my opinion only goes so far. But here's my 2¢.
I see null as a real type
This confuses me. I see "false" and "true" becoming their own types,
so what is a boolean? The language currently has no way to express
aliases of types, otherwise I'd say a boolean is an alias "true|false"
but it doesn't. What happens if a function returns the "false" type
and I pass it to a function that only allows "bool"? I expect it to
work. So naturally, there is some coercion in there as well, and it
would be great if null got the same treatment where old codebases
benefit without taking away the advantage for newer codebases.
I see this RFC as a balance in that regard, because real life is
messy. I really hope it passes and good luck!
Robert Landers
Software Engineer
Utrecht NL
On Thu, 14 Apr 2022 at 08:31, Robert Landers landers.robert@gmail.com
wrote:
I see null as a real type
This confuses me...
Andreas is probably the best person to explain their view; oddly I see NULL
as it's own type as well, because are there are times where it's useful to
determine the difference between NULL
and an Empty String; but when looking
at examples like user values (e.g. GET/POST/COOKIE) being checked (e.g.
strlen()
or preg_match()
), encoded (e.g. shown on a web page with
htmlspecialchars()
), or simply modified (e.g. trim()
), I don't see why
those possible NULL
values should lead to a Fatal Type Error (unless you
are talking about an environment that uses NULL
as an invalid value, but I
suspect that's quite rare).
I see this RFC as a balance in that regard, because real life is
messy. I really hope it passes and good luck!
Thank you Robert, it's right the balance I'm trying to find.
Craig
I've written a new draft RFC to address the
NULL
coercion problems:
I'm sympathetic to the general problem you're trying to solve, but I'm not convinced by the argument that this is about consistency, because user defined functions have been consistently rejecting null for non-nullable parameters since PHP 7.0, and even before that for arrays and objects. Consistency is a good argument for small changes that eliminate unusual edge cases, but I think far-reaching changes should be able to stand on their own merits - regardless of whether it's consistent, do we want the language to work this way?
To me, the main defence of type coercion is that PHP operates in a world full of strings - URLs are strings, HTTP requests are strings, and a lot of API responses are strings - so making it easy to work with strings like '42' as numbers makes a lot of sense. It's not clear to me that this extends to null.
I think a large part of the problem here is that null can mean many different things in different contexts - "unknown", "not provided", "invalid input", "default", "not applicable", etc.
These differences are subtle, but lead to different expectations of behaviour:
- Treat null as a specific case with its own meaning, distinct from any other valid value. This is what explicitly nullable parameters and union types allow.
- Treat null the same as any other out of range value, and raise an error. This is what happens in user-defined functions in PHP, and in built-in functions expecting non-scalar arguments. Compare also out of range actions like division by zero.
- Treat null as a special state that propagates through expressions, because any operation with an unknown input has an unknown output. This is the approach to null taken by SQL, and by IEEE floating point with NaN. It is also the basis of the ?-> operator, and of things like Optional.map in Swift.
- Treat null as a generic default value which can be filled in implicitly based on requirements. This is the interpretation currently taken by internal functions for scalar arguments, and what you are proposing to make standard. It is also, as you point out, the way PHP treats null in some other contexts, such as many operators.
Interestingly, one of your examples mentions filter_input, which takes the "propagate" approach, and htmlspecialchars, which doesn't. It would often be more useful to retain the information that a value is null than to have it silently converted to an empty string as a side-effect of some other operation.
Perhaps it would be useful to have some function-call equivalent of the ?-> operator. I'm not sure what this would look like for normal function calls, but it would be easy to add if we had a pipe operator, e.g.:
If this was equivalent to htmlspecialchars($foo)
$foo |> htmlspecialchars(...)
Then this could be equivalent to ($foo === null ? null : htmlspecialchars($foo))
$foo ?|> htmlspecialchars(...)
I'm not set against this RFC, but I'm not quite convinced by the case it makes, and think there may still be other options to explore.
Regards,
--
Rowan Tommins
[IMSoP]
On 8 April 2022 18:34:52 BST, Craig Francis craig@craigfrancis.co.uk
wrote:I've written a new draft RFC to address the
NULL
coercion problems:I'm sympathetic to the general problem you're trying to solve, but I'm not
convinced by the argument that this is about consistency, because user
defined functions have been consistently rejecting null for non-nullable
parameters since PHP 7.0, and even before that for arrays and objects.
Thanks for looking at this Rowan.
I am open to discussing different solutions (this is my second attempt),
and any suggestions are welcome.
In regards to consistency, while I think my RFC does that, it's not my main
motivation for creating this RFC - which is to address the backwards
compatibility issues.
But to expand on consistency, I appreciate user defined functions (since
7.0) didn't coerce NULL, and I do want internal/user defined functions to
be consistent with each other (why I agree with the spirit of the original
RFC), it's just odd that string/int/float/bool can be coerced, but NULL
cannot be coerced in this context and can be in other contexts, e.g.
$i = 0;
$n = NULL;
if ($i == $n) { // Fine
}
print($i); // Fine
print($n); // Fine
printf('%s', $i); // Fine
printf('%s', $n); // Fine
echo 'ConCat ' . $i . $n; // Fine
echo htmlspecialchars($i); // Fine
echo htmlspecialchars($n . ''); // Fine
echo htmlspecialchars($n); // Bad???
Consistency is a good argument for small changes that eliminate unusual
edge cases, but I think far-reaching changes should be able to stand on
their own merits - regardless of whether it's consistent, do we want the
language to work this way?
Good question, while I mention consistency, my focus is on backwards
compatibility, and how PHP has accepted NULL
to these ~335 parameters
since, well, forever... my fear is that the upgrade to PHP 9 will be so
difficult (as noted with the lack of tooling to solve this), I'll be
auditing PHP code that's stuck on 8.x for many years to come.
https://twitter.com/mwop/status/1441044164880355342
To me, the main defence of type coercion is that PHP operates in a world
full of strings - URLs are strings, HTTP requests are strings, and a lot of
API responses are strings - so making it easy to work with strings like
'42' as numbers makes a lot of sense. It's not clear to me that this
extends to null.
Yep, PHP does work in a world of strings, and for simplicity
string coercion works fairly well... but I think I can include NULL
in that
definition, because HTTP requests do not guarantee values have been
provided, where PHP (via filter_input) and frameworks (Laravel, Symfony,
CakePHP, CodeIgniter, etc), have used NULL
to note when a value has not
been provided... and many developers have simply not cared about that
distinction (e.g. when passing to htmlspecialchars, trim, strlen,
preg_match, etc).
I think a large part of the problem here is that null can mean many
different things in different contexts - "unknown", "not provided",
"invalid input", "default", "not applicable", etc.These differences are subtle, but lead to different expectations of
behaviour:
- Treat null as a specific case with its own meaning, distinct from any
other valid value. This is what explicitly nullable parameters and union
types allow.- Treat null the same as any other out of range value, and raise an error.
This is what happens in user-defined functions in PHP, and in built-in
functions expecting non-scalar arguments. Compare also out of range actions
like division by zero.- Treat null as a special state that propagates through expressions,
because any operation with an unknown input has an unknown output. This is
the approach to null taken by SQL, and by IEEE floating point with NaN. It
is also the basis of the ?-> operator, and of things like Optional.map in
Swift.- Treat null as a generic default value which can be filled in implicitly
based on requirements. This is the interpretation currently taken by
internal functions for scalar arguments, and what you are proposing to make
standard. It is also, as you point out, the way PHP treats null in some
other contexts, such as many operators.
Thank you, that's a really good explanation.
The developers I work with would assume the last definition, in the same
way they can pass in an integer to urlencode()
, where they just don't
think about it being implicitly coerced into a string, it just works
(TM)... that's not to say the strict_types=1
style of checking, and NULL
being treated as an invalid value, doesn't provide value for some
developers (but as shown in the stats, they are in the minority).
Interestingly, one of your examples mentions filter_input, which takes the
"propagate" approach, and htmlspecialchars, which doesn't. It would often
be more useful to retain the information that a value is null than to have
it silently converted to an empty string as a side-effect of some other
operation.
Those are interesting points.
I think filter_input()
is a bit different, as it's a source of data (not
exactly propagating); whereas all of the other functions, they either
process that data (and return a sting, and do not propagate) or work with
it (e.g. preg_match and strcmp returning a bool, strlen returning an
integer, etc).
Retaining NULL
is interesting, but it would be a new thing...
Perhaps it would be useful to have some function-call equivalent of the ?->
operator. I'm not sure what this would look like for normal function calls,
but it would be easy to add if we had a pipe operator, e.g.:If this was equivalent to htmlspecialchars($foo)
$foo |> htmlspecialchars(...)Then this could be equivalent to ($foo === null ? null :
htmlspecialchars($foo))
$foo ?|> htmlspecialchars(...)
... adding that syntax might be useful in some contexts (when the developer
wants to make NULL
easier to propagate); but that would be a new feature,
and I don't think it solves the backwards compatibility issue we currently
face, and what I'm trying to address in this RFC.
I'm not set against this RFC, but I'm not quite convinced by the case it
makes, and think there may still be other options to explore.
I'm open to any suggestions, I just want to avoid the situation where Fatal
Type Errors happen seemingly randomly (as in, someone who has never used
type checks before, and they use NULL
without any thought about it).
I'm even open to the idea of someone coming up with a tool to auto-update
existing code, but so far I've only heard people say that is the way it
should be done (with absolutely no detail on how it would even begin to
work).
I also cannot explain why NULL
should be rejected, other than for those
developers who see NULL
as an invalid value and also use
strict_types=1
... as in, when a team of developers spends a few hundred
hours adding strval()
everywhere, what does that actually achieve? what do
they tell the client? does it make the code easier to read? does it avoid
any bugs? or is it only for 8.1 compatibility?
Anyway, thanks again Rowan, I really appreciate your thoughts on this.
Craig
The developers I work with would assume the last definition
I think you've somewhat missed my point. I wasn't talking about people's
habits or preferences, I was talking about different scenarios where
null is used to mean different things.
Yes, some people prefer languages that "fails early" and some are more
interested in "do what I mean", but not everything is about that.
I also cannot explain why
NULL
should be rejected ... does it avoid
any bugs?
Yes, sometimes. Imagine an array of values provided by the user; during
validation, those which were optional and not provided get set to null.
You then loop through and display those which were provided:
foreach ( $fields as $name => $value ) {
if ( $value !== null ) {
echo "<strong>$name:</strong> $value <br>\n";
}
}
Then, you realise you forgot about escaping, and decide to run
everything through htmlspecialchars()
:
$htmlFields = array_map('htmlspecialchars', $fields);
foreach ( $htmlFields as $name => $value ) {
if ( $value !== null ) {
echo "<strong>$name:</strong> $value <br>\n";
}
}
Spot the bug? $value will now never be null, because htmlspecialchars()
will silently turn the nulls into empty strings.
I've also seen the opposite problem: a string function was removed
because it was no longer needed, and the code broke because values,
including nulls, were no longer being cast to string.
Is protecting against this worth the backwards compatibility cost of
changing the behaviour, and requiring extra code in other scenarios?
Possibly not. But that's different from not having any benefit.
Regards,
--
Rowan Tommins
[IMSoP]
The developers I work with would assume the last definition
I think you've somewhat missed my point. I wasn't talking about people's
habits or preferences, I was talking about different scenarios where
null is used to mean different things.
Yep, and I'm thankful you listed them... I'm just trying to focus on how
PHP has worked (implicitly based on requirements), and how changing that
causes problems (and will be an even bigger problem when it becomes a Fatal
Type Error).
Yes, some people prefer languages that "fails early" and some are more
interested in "do what I mean", but not everything is about that.
Agreed, the only thing I'd add... failing early with NULL
might help debug
some problems (assuming there is a problem), but I believe static analysis
and unit tests are much better at doing this (e.g. static analysis is able
to see if a variable can be NULL, and apply those stricter checks, as noted
by George with Girgias RFC).
In contrast, failing early at runtime, on something that is not actually a
problem, like the examples in my RFC, creates 2 problems (primarily
upgrading; but also adding unnecessary code to explicitly cast those
possible NULL
values to the correct type, which isn't needed for the other
scalar types).
I also cannot explain why
NULL
should be rejected ... does it avoid
any bugs?Yes, sometimes. Imagine an array of values provided by the user; during
validation, those which were optional and not provided get set to null.
You then loop through and display those which were provided:foreach ( $fields as $name => $value ) {
if ( $value !== null ) {
echo "<strong>$name:</strong> $value <br>\n";
}
}Then, you realise you forgot about escaping, and decide to run
everything throughhtmlspecialchars()
:$htmlFields = array_map('htmlspecialchars', $fields);
foreach ( $htmlFields as $name => $value ) {
if ( $value !== null ) {
echo "<strong>$name:</strong> $value <br>\n";
}
}Spot the bug? $value will now never be null, because
htmlspecialchars()
will silently turn the nulls into empty strings.
Thank you... but I will add, while htmlspecialchars()
rejecting NULL
would get you to look at the code again, I wouldn't say it's directly
picking up the problem, and it relies on there being a NULL
value
in $fields for this to be noticed (if that doesen't happen, you're now in a
position where random errors will start happening in production).
This is where I'd note the value of unit tests, as they are much better
placed to check this feature.
Bit of a tangent, I'm uncomfortable that $name
is not being HTML encoded,
which takes us to context aware templating engines, and how you can
identify these mistakes via the is_literal
RFC or the literal-string
type.
I've also seen the opposite problem: a string function was removed
because it was no longer needed, and the code broke because values,
including nulls, were no longer being cast to string.Is protecting against this worth the backwards compatibility cost of
changing the behaviour, and requiring extra code in other scenarios?
Possibly not. But that's different from not having any benefit.
That's fair, adding in an extra check might give some benefit in some
cases, but I don't think it's reliable, or worth the cost in updating
existing code (with no tooling to help) and the additional code that will
be needed to cast these possible NULL
values to the relevant type (which
isn't needed for the other scalar types).
That said, if you (or anyone) have any better ideas on how to address this
problem, please let me know.
Craig
In contrast, failing early at runtime, on something that is not actually a
problem, like the examples in my RFC, creates 2 problems (primarily
upgrading; but also adding unnecessary code to explicitly cast those
possibleNULL
values to the correct type, which isn't needed for the other
scalar types).
Null is not a scalar type[1], though. This is the reason why it is
not coerced for userland functions using scalar type hints with
coercive typing.
[1] https://www.php.net/is_scalar
--
Christoph M. Becker
Null is not a scalar type[1], though. This is the reason why it is
not coerced for userland functions using scalar type hints with
coercive typing.
But why not? and how many developers actually care how PHP has defined a
scalar type and how that relates to coercion specifically (and only) with
functions?
Bit of a tangent, I could argue that NULL
is closer to a simple
string/int/float/bool, than a complex array/resource/object (well, an
object without a __toString method)... or in other words, NULL
is a simple
value that is coerced perfectly well in other contexts, and has
been coerced for internal functions since, well, forever.
Anyway, the problem I'm trying to solve is the backwards compatibility
issue this is causing, and how it will affect upgrading when it becomes a
Fatal Type Error... if you have any suggestions on how to solve this,
please let me know, for now this is the best solution I've got.
Craig
On Wed, 20 Apr 2022 at 18:02, Craig Francis craig@craigfrancis.co.uk
wrote:
I'm just trying to focus on how PHP has worked
You keep repeating this mantra, but user-defined functions with declared
parameter types have never accepted nulls, in any mode, unless explicitly
marked with "?" or "= null". The fact that internal functions have
parameter parsing behaviour that is almost impossible to implement in
userland, and often not even consistent between functions, is a wart of
engine internals, not a design decision.
All of that, and the "consistency" in the title of your RFC, is a complete
distraction from the real questions:
- given a null input, and a non-nullable parameter, what should the
run-time do? - what is the best way to help users update their code to be compatible
with that new behaviour?
Agreed, the only thing I'd add... failing early with
NULL
might help
debug some problems (assuming there is a problem), but I believe static
analysis and unit tests are much better at doing this (e.g. static analysis
is able to see if a variable can be NULL, and apply those stricter checks,
as noted by George with Girgias RFC).
I think we should institute a "swear jar" - whenever someone says "static
analysis could do this better", they have to donate €1 to the PHP
Foundation. :P
More seriously, 1) your own RFC claims that fewer than 33% of developers
use static analysis; and 2) if PHP had a compile step with mandatory static
analysis, we would still have to decide whether passing NULL
to these
functions should be rejected as an error by the static analyser.
In contrast, failing early at runtime, on something that is not actually a
problem, like the examples in my RFC, creates 2 problems (primarily
upgrading; but also adding unnecessary code to explicitly cast those
possibleNULL
values to the correct type, which isn't needed for the other
scalar types).
I've been trying not to nit-pick, because I think over-all you do have a
valid point, but several of your examples do not need any extra code to
handle nulls; and those that do need maybe a handful of characters. For
instance, $search = ($_GET['q'] ?? ''); is both shorter and clearer than
$search = ($_GET['q'] ?? NULL);
Thank you... but I will add, while
htmlspecialchars()
rejectingNULL
would get you to look at the code again, I wouldn't say it's directly
picking up the problem, and it relies on there being aNULL
value
in $fields for this to be noticed (if that doesen't happen, you're now in a
position where random errors will start happening in production).
There are always going to be errors that depend on the value of input, and
unless you have god-like skill at writing tests, some of those will only
happen in production. Saying "this shouldn't happen" doesn't answer the
question of what to do when it does.
Bit of a tangent, I'm uncomfortable that
$name
is not being HTML
encoded, which takes us to context aware templating engines, and how you
can identify these mistakes via theis_literal
RFC or the
literal-string
type.
This isn't real code, it's an illustrative example; but if it makes you
feel more comfortable, imagine $name has some deliberate HTML markup in it,
so needs to be echo'd raw.
That said, if you (or anyone) have any better ideas on how to address this
problem, please let me know.
I honestly would be more interested in having some string functions return
null for null input than changing existing behaviour to accept null for
non-nullable parameters (interestingly, until PHP 8.0, htmlspecialchars()
could return null, e.g. if given an array). Unfortunately, that would be a
different kind of compatibility break, so I'm not sure it fully solves the
problem.
Regards,
Rowan Tommins
[IMSoP]
Am 21.04.2022 um 16:09 schrieb Rowan Tommins rowan.collins@gmail.com:
All of that, and the "consistency" in the title of your RFC, is a complete
distraction from the real questions:
- given a null input, and a non-nullable parameter, what should the
run-time do?- what is the best way to help users update their code to be compatible
with that new behaviour?
You are leaving out option 3 (which is not part of the RFC but should still be on the table IMHO):
3) Leave the behavior but change the parameter definition to nullable to match the implementation.
For instance, $search = ($_GET['q'] ?? ''); is both shorter and clearer than
$search = ($_GET['q'] ?? NULL);
Your version is lossy as you cannot distinguish between "empty query" and "no query was submitted" any longer.
While this is normally easily fixed checking other flags it needs additional work and cannot be done with a simple search and replace.
Regards,
- Chris
You are leaving out option 3 (which is not part of the RFC but should still be on the table IMHO):
3) Leave the behavior but change the parameter definition to nullable to match the implementation.
Those weren't options, they were questions, or problem statements. Also, as he's repeatedly pointed out, Craig suggested making the parameters explicitly nullable, and received clear feedback that that wouldn't be popular.
For instance, $search = ($_GET['q'] ?? ''); is both shorter and clearer than
$search = ($_GET['q'] ?? NULL);Your version is lossy as you cannot distinguish between "empty query" and "no query was submitted" any longer.
The example passes the variable to htmlspecialchars, so this is exactly what will happen anyway - it returns an empty string for a null input. If you want to distinguish between null and empty string, you must avoid passing it to such a function, so the proposed error is a good thing. The whole argument in favour of coercing rests on code that doesn't need to distinguish between null and empty string.
Note that as well as being shorter, this is more appropriate than using strval()
or (string), which will accept other types as well as null.
Regards,
--
Rowan Tommins
[IMSoP]
I'm just trying to focus on how PHP has worked
You keep repeating this mantra, but user-defined functions with declared parameter types have never accepted nulls, in any mode, unless explicitly marked with "?" or "= null".
Thanks Rowan, and you're right that user defined functions haven't worked like that, but that's not really what I'm focused on - which is how PHP has always worked in regards to internal functions (the bit that has changed, and is causing problems), and how NULL
coercion works in other contexts.
I mention user defined functions because I still want user and internal functions to work in the same way (one form of consistency).
The fact that internal functions have parameter parsing behaviour that is almost impossible to implement in userland, and often not even consistent between functions, is a wart of engine internals, not a design decision.
Bit of a tangent, but do you have some examples? would be nice to clean some of these up, or at least have them in mind as we discuss this RFC.
All of that, and the "consistency" in the title of your RFC, is a complete distraction from the real questions:
- given a null input, and a non-nullable parameter, what should the run-time do?
- what is the best way to help users update their code to be compatible with that new behaviour?
I'm happy to change the title, but consistency does apply in different ways... as in, ensuring user and internal function parameters work in the same way (even if that way might need to change slightly); how other simple values like string/int/float/bool values can still be coerced for parameters, but NULL
won't be; and how NULL
coercion does work in other contexts (string contact, ==
comparison, sprintf, print, array keys).
To answer your questions,
-
I think
NULL
inputs should be coerced for non-nullable parameters, when not usingstrict_types=1
, and where static analysis tools provide the additional/stricter checks (like how they can reject a string '5' for an integer parameter). With nullable parameters being different by acceptingNULL
without coercion. -
With this approach, developers won't need to update their code... in contrast, the approach of
NULL
causing a Type Error for internal functions, developers will need to change their code, and I cannot find any tools that help with this, except very strict static analysis to find but not update them (it's difficult tracing every variable from source to sink).
Agreed, the only thing I'd add... failing early with
NULL
might help debug some problems (assuming there is a problem), but I believe static analysis and unit tests are much better at doing this (e.g. static analysis is able to see if a variable can be NULL, and apply those stricter checks, as noted by George with Girgias RFC).I think we should institute a "swear jar" - whenever someone says "static analysis could do this better", they have to donate €1 to the PHP Foundation. :P
Cool, I've got 23 credits left this month :-)
So I'll spend 1 more... I think it's fair to say that developers using strict_types=1
are more likely to be using static analysis; and if strict_types=1
is going to eventually disappear, those developers won't lose any functionality with the stricter checking being done by static analysis, which can check all possible variable types (more reliable than runtime), and (with the appropriate level of strictness) static analysis can do things like rejecting the string '5' being passed to an integer parameter and null being passed to a non-nullable parameter.
More seriously, 1) your own RFC claims that fewer than 33% of developers use static analysis; and 2) if PHP had a compile step with mandatory static analysis, we would still have to decide whether passing
NULL
to these functions should be rejected as an error by the static analyser.
Not sure how to respond to that... do you think PHP will have a compile step, with mandatory static analysis? Considering how static analysis tools today have several levels, and the ability to create a baseline (to ignore problems with old code), are you suggesting that everyone would have the strictest checks enabled, where no coercion happens at all?
In contrast, failing early at runtime, on something that is not actually a problem, like the examples in my RFC, creates 2 problems (primarily upgrading; but also adding unnecessary code to explicitly cast those possible
NULL
values to the correct type, which isn't needed for the other scalar types).I've been trying not to nit-pick, because I think over-all you do have a valid point, but several of your examples do not need any extra code to handle nulls; and those that do need maybe a handful of characters. For instance, $search = ($_GET['q'] ?? ''); is both shorter and clearer than $search = ($_GET['q'] ?? NULL);
Thank you; and you're right, if you write new code today, you could do that, but that assumes you don't need to tell the difference between an empty value vs a missing value (I find this check is rare, but they do happen, and is why most frameworks use NULL
for that distinction, WordPress being the only exception I've found so far).
But, updating existing code, while that would make automated updates easier, it's likely to cause problems, because you're editing the value source, with no idea about checks later on (like your example which looks for NULL)... and that's why an automated update of existing code would have more luck updating the sinks rather than the sources (e.g. it knows which sinks are expecting a string, so it can add in a strval($var)
, or (string) $var
, or $var ?? ""
).
Thank you... but I will add, while
htmlspecialchars()
rejectingNULL
would get you to look at the code again, I wouldn't say it's directly picking up the problem, and it relies on there being aNULL
value in $fields for this to be noticed (if that doesen't happen, you're now in a position where random errors will start happening in production).There are always going to be errors that depend on the value of input, and unless you have god-like skill at writing tests, some of those will only happen in production. Saying "this shouldn't happen" doesn't answer the question of what to do when it does.
Agreed, it's just that in the example given, runtime checks seemed particularly fragile (as in, the NULL
check is a feature to affect the output, and while I don't expect all code to be tested, I would expect explicit features to be, which would ensure a NULL
value is provided and checked, instead of relying on the developer remembering to test a NULL
value).
Bit of a tangent, I'm uncomfortable that
$name
is not being HTML encoded, which takes us to context aware templating engines, and how you can identify these mistakes via theis_literal
RFC or theliteral-string
type.This isn't real code, it's an illustrative example; but if it makes you feel more comfortable, imagine $name has some deliberate HTML markup in it, so needs to be echo'd raw.
(/me shudders, with memories of a project that did that, and trying to remember which variables contained HTML).
That said, if you (or anyone) have any better ideas on how to address this problem, please let me know.
I honestly would be more interested in having some string functions return null for null input than changing existing behaviour to accept null for non-nullable parameters (interestingly, until PHP 8.0,
htmlspecialchars()
could return null, e.g. if given an array). Unfortunately, that would be a different kind of compatibility break, so I'm not sure it fully solves the problem.
I've talked to a few developers who see NULL
being passed to htmlspecialchars()
as something they want to be warned about, because they see NULL
as an invalid value (with my RFC, they would still keep that check with strict_types=1
and/or static analysis).
Also, have a look at the quiz results:
https://quiz.craigfrancis.co.uk/
It kinda confirms those discussions, while I don't think many people carefully went though the list of parameters (a bit more on the quantitative rather than qualitative side), if you focus on the people who selected option 3 first (preferring a Fatal Error for NULL), it's only Kamil and Tim who were ok with htmlspecialchars()
accepting NULL, but they were not open to many of the other 334 parameters.
That said, your suggestion of "?|>" could work in addition to my RFC, so the NULL
value can be kept when explicitly asked to preserve (avoiding that backwards compatibility issue).
Craig
The fact that internal functions have parameter parsing behaviour that is almost impossible to implement in userland, and often not even consistent between functions, is a wart of engine internals, not a design decision.
Bit of a tangent, but do you have some examples? would be nice to clean some of these up, or at least have them in mind as we discuss this RFC.
Fundamentally, the internal parameter handling system (ZPP) is
completely separate from the way function signatures work in userland,
and evolved based on a different set of requirements. The emphasis of
ZPP is on unwrapping zval structs to values that can be manipulated
directly in C; so, for instance, it has always had support for integer
parameters. Since 7.0, userland signatures have evolved an essentially
parallel set of features with an emphasis on designing a consistent and
useful dynamic typing system.
Increasingly, ZPP is being aligned with the userland language, which
also allows reflection information to be generated based on PHP stubs.
For instance:
- Making rejected parameters throw TypeError rather than raise a Warning
and return null - Giving optional parameters an explicit default in the signature rather
than inspecting the argument count - Using union types, rather than ad hoc if/switch on zval type
The currently proposed change to how internal functions handle nulls in
9.0 is just another part of that process - the userland behaviour is
well-established, and we're making the ZPP behaviour match.
Off the top of my head, I don't know what other inconsistencies remain,
but my point was that in every case so far, internal functions have been
adapted to match userland, not vice versa.
So I'll spend 1 more... I think it's fair to say that developers using
strict_types=1
are more likely to be using static analysis; and ifstrict_types=1
is going to eventually disappear, those developers won't lose any functionality with the stricter checking being done by static analysis, which can check all possible variable types (more reliable than runtime), and (with the appropriate level of strictness) static analysis can do things like rejecting the string '5' being passed to an integer parameter and null being passed to a non-nullable parameter.
There's an unhelpful implication here, and in your discussion of
testing, that PHP users can be divided into two camps: those who check
program correctness with static analysis tools, unit tests, etc; and
those who don't care about program correctness.
Instead, how about we think about those who are writing new code and
want PHP to tell them early when they do something silly; and those who
are maintaining large code bases and have to deal with compatibility
problems. Neither of these groups is helped enough by static analysers -
as you've rightly pointed out elsewhere, static checks are not
reliable in a dynamic language, and are not likely to be built-in any
time soon.
I'm by no means the strongest advocate of strictness in PHP - I think
there is a risk of throwing out good features with the bad. But I would
love to see strict_types=1 become unnecessary - not because "everyone's
running static analysers anyway, so who cares" but because the default
behaviour provides a good balance of safety and usability.
That makes me very hesitant to use the strict_types modes as a crutch
for this compatibility break - it only puts off the question of what we
think the sensible behaviour actually is.
Thank you; and you're right, if you write new code today, you could do that, but that assumes you don't need to tell the difference between an empty value vs a missing value
As I've said multiple times now, as soon as you pass it to a function
that doesn't have specific handling for nulls, you lose that distinction
anyway. There is literally zero difference in behaviour between "$foo =
htmlspecialchars($_GET['foo'] ?? null)" and "$foo =
htmlspecialchars($_GET['foo'] ?? '')".
Telling users when they've passed null to a non-nullable parameter is
precisely about preserving that distinction: if you want null to mean
something specific, treating it as a string is a bug.
But, updating existing code, while that would make automated updates easier, it's likely to cause problems, because you're editing the value source, with no idea about checks later on (like your example which looks for NULL)... and that's why an automated update of existing code would have more luck updating the sinks rather than the sources (e.g. it knows which sinks are expecting a string, so it can add in a
strval($var)
, or(string) $var
, or$var ?? ""
).
That's a fair point, although "sinks" are often themselves the next
"source", which is what makes static analysis possible as often as it is.
Despite all of the above, I am honestly torn on this issue. It is a
disruptive change, and I'm not a fan of errors for errors' sake; but I
can see the value in the decision made back in 7.0 to exclude nulls by
default.
Regards,
--
Rowan Tommins
[IMSoP]
<off topic>Off the top of my head, I don't know what other inconsistencies remain,
but my point was that in every case so far, internal functions have been
adapted to match userland, not vice versa.
Internal functions error if you pass excessive arguments to a non-variadic function. User-space functions just ignore the extras. This is an inconsistency that has caused me considerable grief in the past year.
I know Joe has said he wants userspace to become more strict like internal on this one. I will not predict what actually happens.
</off topic>
--Larry Garfield
On Tue, Apr 26, 2022 at 12:18 AM Larry Garfield larry@garfieldtech.com
wrote:
<more><off topic>Off the top of my head, I don't know what other inconsistencies remain,
but my point was that in every case so far, internal functions have been
adapted to match userland, not vice versa.Internal functions error if you pass excessive arguments to a non-variadic
function. User-space functions just ignore the extras. This is an
inconsistency that has caused me considerable grief in the past year.I know Joe has said he wants userspace to become more strict like internal
on this one. I will not predict what actually happens.</off topic>
A few internal functions have parameters with an UNKNOWN
default value in
the stubs, e.g.
https://github.com/php/php-src/blob/php-8.1.5/Zend/zend_builtin_functions.stub.php#L35
function get_class(object $object = UNKNOWN): string {}
documented with a question mark at
https://www.php.net/manual/en/function.get-class.php
get_class(object $object = ?): string
or
https://github.com/php/php-src/blob/php-8.1.5/ext/standard/basic_functions.stub.php#L1558
function mt_rand(int $min = UNKNOWN, int $max = UNKNOWN): int {}
documented with two signatures at
https://www.php.net/manual/en/function.mt-rand.php
`mt_rand()`: int
mt_rand(int $min, int $max): int
</more>
--
Guilliam Xavier
function mt_rand(int $min = UNKNOWN, int $max = UNKNOWN): int {}
documented with two signatures at
https://www.php.net/manual/en/function.mt-rand.php`mt_rand()`: int mt_rand(int $min, int $max): int
This is actually a really pertinent example: you might expect
mt_rand(null, null) to give the same behaviour as mt_rand()
, but
actually it will always return 0, because it silently coerces to
mt_rand(0, 0). That's exactly the kind of unexpected behaviour type
checking aims to protect against.
There used to be a lot more functions with pseudo-defaults like this,
but a lot were made to accept null in PHP 8.0. e.g.
mb_convert_encoding('foo', 'ASCII', null) now acts like
mb_convert_encoding('foo', 'ASCII') and looks up a run-time default; but
in previous versions it acted as mb_convert_encoding('foo', 'ASCII', '')
which is not a valid call.
Regards,
--
Rowan Tommins
[IMSoP]
function mt_rand(int $min = UNKNOWN, int $max = UNKNOWN): int {}
documented with two signatures at
https://www.php.net/manual/en/function.mt-rand.php`mt_rand()`: int mt_rand(int $min, int $max): int
This is actually a really pertinent example: you might expect mt_rand(null, null) to give the same behaviour as
mt_rand()
, but actually it will always return 0, because it silently coerces to mt_rand(0, 0). That's exactly the kind of unexpected behaviour type checking aims to protect against.
First, thanks Guilliam for the examples.
And Rowan, I agree it's a good example, and I've added it to the Open Issues.
But I'm wondering, is it only one function? and assuming it's a problem, could we use Z_PARAM_LONG_OR_NULL()
and specifically throw an exception when either parameter is NULL, like the max < min
check? On the basis that I'd rather have one extra check for this function, and keep NULL
coercion working everywhere else (i.e. where it's fine).
As an aside, under the Future Scope:
https://wiki.php.net/rfc/null_coercion_consistency#future_scope
I'd noted some functions that could do with some similar changes, but that's more about copying the example of $separator
in explode()
and the “cannot be empty” error, so NULL
or an Empty String is rejected (rather than just rejecting NULL).
There used to be a lot more functions with pseudo-defaults like this, but a lot were made to accept null in PHP 8.0. e.g. mb_convert_encoding('foo', 'ASCII', null) now acts like mb_convert_encoding('foo', 'ASCII') and looks up a run-time default; but in previous versions it acted as mb_convert_encoding('foo', 'ASCII', '') which is not a valid call.
Thanks, that's another good example, but I'm going to be a pain, and would suggest that the specific function was correctly updated to work with NULL
(and anyone who did pass NULL
to that argument, would have had the "must specify at least one encoding" error).
Craig
<off topic>Internal functions error if you pass excessive arguments to a non-variadic
function. User-space functions just ignore the extras. This is an
inconsistency that has caused me considerable grief in the past year.I know Joe has said he wants userspace to become more strict like internal
on this one. I will not predict what actually happens.</off topic>
Just to note our off-list discussion, this is shown with:
function example($a) {
var_dump(func_get_args());
}
$a = example('A', 'B'); // Fine
$b = strlen('A', 'B'); // Expects exactly 1 argument, 2 given
It's due to the old (pre-variadic) approach of using func_get_args()
,
func_num_args()
, etc...
https://www.php.net/manual/en/functions.arguments.php#functions.variable-arg-list
"It is also possible to achieve variable-length arguments by using
func_num_args()
, func_get_arg()
, and func_get_args()
functions. This
technique is not recommended as it was used prior to the introduction of
the ... token."
I'd also note that using these functions (instead of "...$var") aren't so
good for IDE's or Static Analysis, as the function signature does not
clearly define the parameters exist (let alone type)... so these func_*
functions might be candidates for deprecation, if anyone cares enough about
them :-)
Craig
The fact that internal functions have parameter parsing behaviour that is almost impossible to implement in userland, and often not even consistent between functions, is a wart of engine internals, not a design decision.
Bit of a tangent, but do you have some examples? would be nice to clean some of these up, or at least have them in mind as we discuss this RFC.Fundamentally, the internal parameter handling system (ZPP) is completely separate from the way function signatures work in userland, and evolved based on a different set of requirements. The emphasis of ZPP is on unwrapping zval structs to values that can be manipulated directly in C; so, for instance, it has always had support for integer parameters. Since 7.0, userland signatures have evolved an essentially parallel set of features with an emphasis on designing a consistent and useful dynamic typing system.
Increasingly, ZPP is being aligned with the userland language, which also allows reflection information to be generated based on PHP stubs. For instance:
- Making rejected parameters throw TypeError rather than raise a Warning and return null
- Giving optional parameters an explicit default in the signature rather than inspecting the argument count
- Using union types, rather than ad hoc if/switch on zval type
The currently proposed change to how internal functions handle nulls in 9.0 is just another part of that process - the userland behaviour is well-established, and we're making the ZPP behaviour match.
Off the top of my head, I don't know what other inconsistencies remain, but my point was that in every case so far, internal functions have been adapted to match userland, not vice versa.
Thank you Rowan, that's really helpful.
So I'll spend 1 more... I think it's fair to say that developers using
strict_types=1
are more likely to be using static analysis; and ifstrict_types=1
is going to eventually disappear, those developers won't lose any functionality with the stricter checking being done by static analysis, which can check all possible variable types (more reliable than runtime), and (with the appropriate level of strictness) static analysis can do things like rejecting the string '5' being passed to an integer parameter and null being passed to a non-nullable parameter.There's an unhelpful implication here, and in your discussion of testing, that PHP users can be divided into two camps: those who check program correctness with static analysis tools, unit tests, etc; and those who don't care about program correctness.
Instead, how about we think about those who are writing new code and want PHP to tell them early when they do something silly; and those who are maintaining large code bases and have to deal with compatibility problems. Neither of these groups is helped enough by static analysers - as you've rightly pointed out elsewhere, static checks are not reliable in a dynamic language, and are not likely to be built-in any time soon.
I'm by no means the strongest advocate of strictness in PHP - I think there is a risk of throwing out good features with the bad. But I would love to see strict_types=1 become unnecessary - not because "everyone's running static analysers anyway, so who cares" but because the default behaviour provides a good balance of safety and usability.
That makes me very hesitant to use the strict_types modes as a crutch for this compatibility break - it only puts off the question of what we think the sensible behaviour actually is.
Yep, I agree, and now I know what George is planning, with the gradual removal of strict_types=1
, you're right that should not be how the two groups are defined.
I would prefer PHP itself to be strict enough to identify and complain about actual mistakes/problems, but be tolerant and allow reasonable forms of type coercion (e.g. your example with a string holding an integer being coerced to an actual integer as needed). That's where I see static analysis providing strict checking for those who want it (e.g. they can choose to not allow any coercion).
The only issue I have is when NULL
is passed to functions like urlencode()
, I do not see that as a problem, and I think NULL
coercion should continue to also work in that context (I really don't see why it justifies a fatal Type Error for everyone, as not everyone treats NULL
as an invalid value).
I've made some tweaks to my RFC in an attempt to better reflect that.
Thank you; and you're right, if you write new code today, you could do that, but that assumes you don't need to tell the difference between an empty value vs a missing value
As I've said multiple times now, as soon as you pass it to a function that doesn't have specific handling for nulls, you lose that distinction anyway. There is literally zero difference in behaviour between "$foo = htmlspecialchars($_GET['foo'] ?? null)" and "$foo = htmlspecialchars($_GET['foo'] ?? '')".
You're right that changing the NULL
coalesce at that point would be fine, because the value is only going to htmlspecialchars()
- it's why I think automated tools will have much more luck taking this approach (even if it seems redundant)
But we're usually looking at variables for user input (often nullable), that are set earlier in the script, then that nullable variable is used multiple times later.
Forgive this primitive example, but this shows $name
being used in three different ways, where an automated tool cannot simply change line 1 so it doesn't return a NULL, because it would break the Register link.
$name = $request->get('name');
if (trim($name) === '') { // Contains non-whitespace characters; so not "", " ", NULL, etc
$where_sql[] = 'name LIKE ?';
$where_val[] = $name;
}
echo '
<form action="./" method="get">
<label>
Search
<input type="search" name="name" value="' . htmlspecialchars($name) . '">
</label>
<input type="submit" value="Go">
</form>';
if ($name !== NULL) {
$register_url = '/admin/accounts/add/?name=' . urlencode($name);
echo '
<p><a href="' . htmlspecialchars($register_url) . '">Add Account</a></p>';
}
In this example, why does trim()
and htmlspecialchars()
justify a fatal Type Error?
Telling users when they've passed null to a non-nullable parameter is precisely about preserving that distinction: if you want null to mean something specific, treating it as a string is a bug.
I don't think that represents a bug, are we are talking about a system that takes user input (so often nullable), supports coercion with other simple types, and supports NULL
coercion in other contexts (e.g. string concat).
But, updating existing code, while that would make automated updates easier, it's likely to cause problems, because you're editing the value source, with no idea about checks later on (like your example which looks for NULL)... and that's why an automated update of existing code would have more luck updating the sinks rather than the sources (e.g. it knows which sinks are expecting a string, so it can add in a
strval($var)
, or(string) $var
, or$var ?? ""
).That's a fair point, although "sinks" are often themselves the next "source", which is what makes static analysis possible as often as it is.
Yep, it can be complicated... and that's why I like your "$foo ?|> htmlspecialchars(...)" suggestion, because existing code (like the example above) expects functions like trim()
to always return a string (rather than returning a NULL
when provided with a NULL), and that will continue to work as expected/documented, but you get to introduce a new syntax to allow NULL
to propagate though expressions.
Despite all of the above, I am honestly torn on this issue. It is a disruptive change, and I'm not a fan of errors for errors' sake; but I can see the value in the decision made back in 7.0 to exclude nulls by default.
Thanks Rowan, I'll just add that I think the fatal Type Error for NULL
will be much more disruptive, and I would rather relax that requirement for user defined functions, so NULL
coercion works in all contexts.
Craig
Forgive this primitive example, but this shows
$name
being used in three different ways, where an automated tool cannot simply change line 1 so it doesn't return a NULL, because it would break the Register link.$name = $request->get('name'); if (trim($name) === '') { // Contains non-whitespace characters; so not "", " ", NULL, etc $where_sql[] = 'name LIKE ?'; $where_val[] = $name; } echo ' <form action="./" method="get"> <label> Search <input type="search" name="name" value="' . htmlspecialchars($name) . '"> </label> <input type="submit" value="Go"> </form>'; if ($name !== NULL) { $register_url = '/admin/accounts/add/?name=' . urlencode($name); echo ' <p><a href="' . htmlspecialchars($register_url) . '">Add Account</a></p>'; }
In this example, why does
trim()
andhtmlspecialchars()
justify a fatal Type Error?
Honestly, I fail to see how the inconsistent null handling in this
example is anything other than a bug:
- If strings containing only spaces are considered empty, it's probably
a mistake that they're not trimmed everywhere. But of course adding
$name = trim($name) would remove all distinction between null and ''. - If null is equivalent to an empty string when deciding whether to run
the SQL, why is it not also equivalent to an empty string when rendering
the register link? The current logic means that clicking "Go" causes a
register link to appear, with an empty name on the query string, which
doesn't seem like useful functionality. - On the other hand, if
NULL
is considered a different state, why is
there no behaviour distinguishing it around the SQL logic? It seems
likely that an else clause would be added there, perhaps "else { echo
'You must enter a name.'; }" In the current code, that would appear for
both null and empty strings, which is probably a mistake.
I think that actually demonstrates quite nicely that most code would
benefit from treating "string" and "?string" as strictly different
types, and either defaulting to an empty string explicitly and early, or
considering null at every step. Simultaneously relying on null values
being preserved, and them being silently coerced, leads to fragile code
where it's not clear where nulls are deliberately handled as empty
strings, and where they've simply been forgotten about.
Telling users when they've passed null to a non-nullable parameter is precisely about preserving that distinction: if you want null to mean something specific, treating it as a string is a bug.
I don't think that represents a bug, are we are talking about a system that takes user input (so often nullable), supports coercion with other simple types, and supports
NULL
coercion in other contexts (e.g. string concat).
Read the sentence you replied to again: IF you want to treat null as
distinct from '', THEN failing to do so is a bug. If, on the other hand,
you just want to take nullable user input and handle it CONSISTENTLY as
a string, then it's fine to explicitly default to '' AT SOURCE.
Despite all of the above, I am honestly torn on this issue. It is a disruptive change, and I'm not a fan of errors for errors' sake; but I can see the value in the decision made back in 7.0 to exclude nulls by default.
Thanks Rowan, I'll just add that I think the fatal Type Error for
NULL
will be much more disruptive, and I would rather relax that requirement for user defined functions, soNULL
coercion works in all contexts.
I think the main difference between our positions is that I believe that
if PHP's type system was designed from scratch today, null would not be
silently coerced in these situations. So while I agree that the change
will be disruptive, I disagree with your position that it brings no benefit.
But I'm wondering, is it only one function? and assuming it's a problem, could we use
Z_PARAM_LONG_OR_NULL()
and specifically throw an exception when either parameter is NULL, like themax < min
check? On the basis that I'd rather have one extra check for this function, and keepNULL
coercion working everywhere else (i.e. where it's fine).
Well, since it's one of three examples in Guilliam's e-mail, the answer
to the first question seems rather trivially "no", unless I'm missing
something? As for the second question, certainly we could add specific
prohibitions to null on a case by case basis, but that's basically
equivalent to your previous suggestion of explicitly allowing null on a
case by case basis, and doesn't really answer the question of what the
default behaviour should be - especially bearing in mind that any
default should apply to both built-in and user-defined functions.
Regards,
--
Rowan Tommins
[IMSoP]
Forgive this primitive example, but this shows
$name
being used in three different ways, where an automated tool cannot simply change line 1 so it doesn't return a NULL, because it would break the Register link.$name = $request->get('name'); if (trim($name) === '') { // Contains non-whitespace characters; so not "", " ", NULL, etc $where_sql[] = 'name LIKE ?'; $where_val[] = $name; } echo ' <form action="./" method="get"> <label> Search <input type="search" name="name" value="' . htmlspecialchars($name) . '"> </label> <input type="submit" value="Go"> </form>'; if ($name !== NULL) { $register_url = '/admin/accounts/add/?name=' . urlencode($name); echo ' <p><a href="' . htmlspecialchars($register_url) . '">Add Account</a></p>'; }
In this example, why does
trim()
andhtmlspecialchars()
justify a fatal Type Error?Honestly, I fail to see how the inconsistent null handling in this example is anything other than a bug:
My example was only trying to show how an automated tool cannot simply update the $name
source (line 1), because it will not be able to easily tell what happens later (especially if the code is split into multiple files)... whereas an automated tool will find it much easier to do this at the sinks:
- if (trim($name) === '') {
+ if (trim((string) $name) === '') {
Which is how the default NULL
from getConfigData()
was handled with PHPCompatibility:
https://github.com/PHPCompatibility/PHPCompatibility/commit/ab7b0b82d95c82e62f9cffb0f1960f3ccbf0805e https://github.com/PHPCompatibility/PHPCompatibility/commit/ab7b0b82d95c82e62f9cffb0f1960f3ccbf0805e
Or maybe:
- [...] value="' . htmlspecialchars($name) . '">
+ [...] value="' . htmlspecialchars($name ?? '') . '">
Which is how "Vaimo Composer Patches" might "fix" the NULL
from extractSingleValue()
:
https://github.com/vaimo/composer-patches/pull/96/files https://github.com/vaimo/composer-patches/pull/96/files
https://github.com/vaimo/composer-patches/blob/cbceb44dfad9e37f18e319125b036a10496ea094/src/Patch/SourceLoaders/PatchesSearch.php#L234 https://github.com/vaimo/composer-patches/blob/cbceb44dfad9e37f18e319125b036a10496ea094/src/Patch/SourceLoaders/PatchesSearch.php#L234
This is how Laravel Blade handles NULL:
https://github.com/laravel/framework/blob/ab1506091b9f166b312b3990d07b2e21d971f2e6/src/Illuminate/Support/helpers.php#L119 https://github.com/laravel/framework/blob/ab1506091b9f166b312b3990d07b2e21d971f2e6/src/Illuminate/Support/helpers.php#L119
But, on the basis that PHP considers it a problem if NULL
is passed to these functions, and should not be allowed (blocking will make better code, somehow)... maybe someone should ask for this PR to be reverted? It's only one library, so they should be able to easily fix all of the code affected by this, right? :-)
https://github.com/laravel/framework/pull/36262 https://github.com/laravel/framework/pull/36262
Interestingly Symphony Twig (which uses NULL
for undefined $context
values), preserves NULL
when passed to twig_escape_filter()
... but that's typically provided directly to echo
, where NULL
is accepted and coerced to an empty string :-)
echo twig_escape_filter($this->env, ($context["v"] ?? null), "html", null, true);
https://github.com/twigphp/Twig/blob/b4d6723715da57667cca851051eba3786714290d/src/Extension/EscaperExtension.php#L195 https://github.com/twigphp/Twig/blob/b4d6723715da57667cca851051eba3786714290d/src/Extension/EscaperExtension.php#L195
On the basis that PHP considers it a problem if NULL
is passed to these functions, maybe echo(string ...$expressions)
and print(string $expression)
should also deprecate (and eventually fatal error) when they receive NULL? That would be fun :-)
None of these library changes (to support 8.1) are making their code easier to read, nor are they improving the code quality.
Anyway, I'm going on a pointless tangent... the important bit is that many frameworks already (and will probably continue to) return NULL
by default, doing so has been fine (and can be useful in some cases)... and now that NULL
is going to be rejected, I can only see that creating an upgrade problem.
- If strings containing only spaces are considered empty, it's probably a mistake that they're not trimmed everywhere. But of course adding $name = trim($name) would remove all distinction between null and ''.
Bit of a tangent, but someone can use trim()
to check if a value is worth saving (or using the SELECT query), but will still want to preserve exactly what the user wrote (e.g. taking your name as an example, searching for "Tom" vs " Tom" will return your record for both, but not my step-brother Tom).
Either way, why should trim()
trigger a fatal Type Error if it receives NULL? Does that actually represent problem, considering it's fine receiving an integer?
- If null is equivalent to an empty string when deciding whether to run the SQL, why is it not also equivalent to an empty string when rendering the register link? The current logic means that clicking "Go" causes a register link to appear, with an empty name on the query string, which doesn't seem like useful functionality.
While I wasn't intending it to be a real example... maybe imagine the original code did not run the DB query when the field is left blank (or only contains whitespace characters), probably for performance reasons... and later, after someone complains about duplicate accounts, they put in a small change to ensure the admin did at least one search, but for anti-frustration reasons, they allowed the admin to just press [enter]?
- On the other hand, if
NULL
is considered a different state, why is there no behaviour distinguishing it around the SQL logic? It seems likely that an else clause would be added there, perhaps "else { echo 'You must enter a name.'; }" In the current code, that would appear for both null and empty strings, which is probably a mistake.I think that actually demonstrates quite nicely that most code would benefit from treating "string" and "?string" as strictly different types, and either defaulting to an empty string explicitly and early, or considering null at every step. Simultaneously relying on null values being preserved, and them being silently coerced, leads to fragile code where it's not clear where nulls are deliberately handled as empty strings, and where they've simply been forgotten about.
It wasn't supposed to demonstrate anything other than the problem an automated tool would have.
But, on the basis that many developers don't seem to be "considering null at every step" (going on the roughly 15% of scripts using strict_types=1
)... should frameworks default to returning an empty string instead? If so, how much existing code will break if that was to happen? Or should all of these developers specify an empty string as their default every time they ask for a user-value? In either case, it's a lot of work to fix the things that break, and I still cannot see the value in doing so.
Telling users when they've passed null to a non-nullable parameter is precisely about preserving that distinction: if you want null to mean something specific, treating it as a string is a bug.
I don't think that represents a bug, we are talking about a system that takes user input (so often nullable), supports coercion with other simple types, and supports
NULL
coercion in other contexts (e.g. string concat).Read the sentence you replied to again: IF you want to treat null as distinct from '', THEN failing to do so is a bug. If, on the other hand, you just want to take nullable user input and handle it CONSISTENTLY as a string, then it's fine to explicitly default to '' AT SOURCE.
But we're talking about existing code, and exiting frameworks, who use NULL
by default; and it's been coerced perfectly fine (until 8.1, or for user defined functions that have specified a non-nullable type).
Maybe the frameworks should have taken WordPress as their example, and used an empty string by default?
And we're back to the question, what is actually wrong with passing NULL
to the mentioned functions? I can only see the argument of possibly preserving NULL
(with different backwards compatibility issues, and I suspect better handled with your "?|>" suggestion); or those developers who prefer a very strict environment, with no type coercion, where they can treat NULL
as an invalid value (but probably should be using static analysis to reliably check their code never passes NULL
to these functions or performs any coercion).
Despite all of the above, I am honestly torn on this issue. It is a disruptive change, and I'm not a fan of errors for errors' sake; but I can see the value in the decision made back in 7.0 to exclude nulls by default.
Thanks Rowan, I'll just add that I think the fatal Type Error for
NULL
will be much more disruptive, and I would rather relax that requirement for user defined functions, soNULL
coercion works in all contexts.I think the main difference between our positions is that I believe that if PHP's type system was designed from scratch today, null would not be silently coerced in these situations. So while I agree that the change will be disruptive, I disagree with your position that it brings no benefit.
But what is that benefit? I'm sorry, but I really don't see it.
I'm going on the basis that you're ok with numbers in strings being coerced to integers/floats (which I also see as being useful, because you're right, most inputs are strings)... but you're not ok with NULL
being coerced (which is also common, because values aren't guaranteed to be provided by the user, and NULL
is typically the default).
But I'm wondering, is it only one function? and assuming it's a problem, could we use
Z_PARAM_LONG_OR_NULL()
and specifically throw an exception when either parameter is NULL, like themax < min
check? On the basis that I'd rather have one extra check for this function, and keepNULL
coercion working everywhere else (i.e. where it's fine).Well, since it's one of three examples in Guilliam's e-mail, the answer to the first question seems rather trivially "no", unless I'm missing something?
Sorry, I can only see two examples... get_class()
, which is more of an example of UNKNOWN being used (and NULL
has not been "allowed as of PHP 7.2")... and mt_rand()
, which I mentioned in the RFC, because you provided a good example of NULL
being a potential problem, not that I can find anyone doing this publicly:
https://grep.app/search?q=mt_rand%28NULL&filter[lang][0]=PHP https://grep.app/search?q=mt_rand%28NULL&filter%5Blang%5D%5B0%5D=PHP
https://grep.app/search?q=mt_rand%5Cs%2A%5C%28%5Cs%2ANULL®exp=true&filter[lang][0]=PHP https://grep.app/search?q=mt_rand%5Cs%2A%5C%28%5Cs%2ANULL®exp=true&filter%5Blang%5D%5B0%5D=PHP
https://www.google.com/search?q=%22mt_rand+NULL+NULL%22 https://www.google.com/search?q=%22mt_rand+NULL+NULL%22
As for the second question, certainly we could add specific prohibitions to null on a case by case basis, but that's basically equivalent to your previous suggestion of explicitly allowing null on a case by case basis, and doesn't really answer the question of what the default behaviour should be - especially bearing in mind that any default should apply to both built-in and user-defined functions.
I don't want oddities either, I was just exploring the possibility of a developer accidentally writing mt_rand(NULL, NULL)
, in the current environment (where the NULL's are coerced to the integer 0); and if it's a problem worth protecting against, how we could handle it.
Craig
But what is that benefit? I'm sorry, but I really don't see it.
I started drafting a longer reply, but honestly I don't think we're
getting anywhere. Every attempt to explain the benefit seems to end in
one of two ways:
- an endless back and forth nit-picking hypothetical situations where it
might or might not be useful - an outright dismissal that "people who want it strict can go over
there and use strict_types=1 and/or static analysis"
To me, it's always about trade-offs: the benefit of strict checks
exists for everyone, and the question is whether they want to pay the
cost or not. As long as we can't agree on that fundamental point,
there's no point continuing the discussion.
I'm going on the basis that you're ok with numbers in strings being
coerced to integers/floats (which I also see as being useful, because
you're right, most inputs are strings)... but you're not ok withNULL
being coerced (which is also common, because values aren't guaranteed
to be provided by the user, andNULL
is typically the default).
I will reply to this point, though, because I think it's a genuinely
interesting thing to ponder.
One significant difference is that not only is it often not useful to
distinguish an input of 123 from '123', it's often not possible. There
is literally no way for an HTTP URL or header to contain an integer,
rather than a string representation of one, because it's not a binary
protocol.
On the other hand, you might well receive an empty string as input where
you're expecting an integer. Notably, that is not coerced
automatically to zero; the code has to explicitly decide if that should
trigger distinct behaviour (such as a validation error) or be treated as
a default value. Not receiving a field you expected feels very similar,
so similar behaviour feels reasonable.
Regards,
--
Rowan Tommins
[IMSoP]
But what is that benefit? I'm sorry, but I really don't see it.
I started drafting a longer reply, but honestly I don't think we're getting anywhere. Every attempt to explain the benefit seems to end in one of two ways:
- an endless back and forth nit-picking hypothetical situations where it might or might not be useful
- an outright dismissal that "people who want it strict can go over there and use strict_types=1 and/or static analysis"
To me, it's always about trade-offs: the benefit of strict checks exists for everyone, and the question is whether they want to pay the cost or not. As long as we can't agree on that fundamental point, there's no point continuing the discussion.
I hope I don't come across like that; I'm really trying to understand the benefits and costs (I tend to use small examples to help myself understand).
I'm not trying to be dismissive with the use of static analysis. I just think PHP should be tolerant of some things (e.g. string '5' to int 5, and null to empty string), but I also recognise some developers prefer a very strict environment that does not do any type coercion (that's where I think static analysis works really well, as it can enforce extra checks, including type checks for all variables from all sources to all sinks).
That said, I do see value in some Type Errors, like how I updated my RFC a couple of weeks ago with some examples "like substr($string, “offset”)
and htmlspecialchars(array())
as being clearly problematic" (thanks again George).
I'm also fine with substr('abc', $offset)
rejecting an Empty String or NULL
for $offset
(I'll note that $offset
was never added to my list of 335 parameters).
Under "Future Scope" I've given 4 example parameters that probably should reject an Empty String or NULL
(because they do represent problems, similar to how $separator
in explode()
already has the “cannot be empty” fatal error).
And finally, I can see how mt_rand(NULL, NULL)
could be a problem (someone assuming NULL
represents a default value, but it's coerced to the integer 0), but as I noted in my previous email, I cannot find anyone doing this, and after re-checking my lists and having a re-look though the manual, I think it's the only one that benefits from the rejection of NULL
coercion.
Taking that as my rough position on type coercion, I don't see a benefit from blocking NULL
coercion (more below).
Whereas, blocking NULL
coercion does introduce an upgrade cost (not made easier due to the lack of tooling); and the continuing cost to some developers using the noted frameworks or filter_input()
(e.g. always specifying an empty string default, or always manually casting NULL
to a string)... the other cost is the weirdness in how NULL
coercion still works for echo()/print(), string concatenation, == comparisons, arithmetics, sprintf, etc.
I'm going on the basis that you're ok with numbers in strings being coerced to integers/floats (which I also see as being useful, because you're right, most inputs are strings)... but you're not ok with
NULL
being coerced (which is also common, because values aren't guaranteed to be provided by the user, andNULL
is typically the default).I will reply to this point, though, because I think it's a genuinely interesting thing to ponder.
One significant difference is that not only is it often not useful to distinguish an input of 123 from '123', it's often not possible. There is literally no way for an HTTP URL or header to contain an integer, rather than a string representation of one, because it's not a binary protocol.
On the other hand, you might well receive an empty string as input where you're expecting an integer. Notably, that is not coerced automatically to zero; the code has to explicitly decide if that should trigger distinct behaviour (such as a validation error) or be treated as a default value. Not receiving a field you expected feels very similar, so similar behaviour feels reasonable.
I'm someone who will try to justify some very strict coding styles - like no inline JavaScript, use of Trusted Types, the use of literal-string for SQL/HTML/etc, and in some cases the use of application/xhtml+xml (these have easily provable benefits, but they can also be tricky, so few developers use them).
With your example, I'm probably fine with an Empty String or NULL
not being coerced to int 0 (as in, I could see how it might represent a problem, although I wouldn't care if it did get coerced to 0).
But a lot of existing PHP code simply takes user input (which can be NULL), and passes it to these functions with no expectation of a fatal error (not good if it's in mid-way though processing data).
If we were talking about a desktop application, where the UI was defined and displayed by that application, then a missing field would represent a problem in that application, but we're typically talking about the web with PHP, where the data often comes from an un-trusted browser... i.e. the user/browser/extension/network can be doing something odd, all the way down to how a standard HTML checkbox works (unchecked does not provide a field).
It's because of these oddities, and the way NULL
has historically worked, many developers simply don't see the difference between an empty or missing field (like other sources of NULL).
That's why I don't see any benefit to blocking NULL
coercion in this context, as an Empty String or NULL
are often seen as the same - e.g. a programmer is simply checking if a name was provided, checking an email address contains the '@' character, checking if a message is too long, trimming the whitespace from a value, getting a record with an id/slug/name/ref, adding the value to a url, showing the search term, etc.
Craig
Den 2022-04-08 kl. 19:34, skrev Craig Francis:
Hi,
I've written a new draft RFC to address the
NULL
coercion problems:https://wiki.php.net/rfc/null_coercion_consistency
This is due to the result of the Allow
NULL
quiz:https://quiz.craigfrancis.co.uk/
14 votes for Fatal Type Errors irrespective of
strict_types=1
;
13 votes forNULL
coercion when not usingstrict_types=1
;
8 votes to update some parameters to allow NULL;I appreciate some want to force strict type checking on everyone, but I
want to make sure we have this properly documented, with names, and
explanations.Breaking changes should be justified - if they aren't, they only
make upgrading difficult and frustrating (bad for security).Craig
Hi,
Have been busy for a while and haven't followed all the details around
this RFC, but now back on track.
One code pattern to upgrade your code to PHP 8.1 and 9.0 is to use the
null coalescing operator like rtrim($string ?? '',...).
If one then has a legacy codebase that works flawlessly I would say that
this path is the preferred one since it requires a minimum of work. To
dig into your code base and address why the parameter ends up like null
is probably more cumbersome. OTOH, one doesn't reap the benefits of the
"Deprecate passing null to non-nullable arguments of internal functions"
RFC since it requires to much work.
One could argue that the "Deprecating passing null" RFC is nice, but has
a flaw when applied for existing code. Applying the null coalescing code
pattern is an easy way to make your code 8.1 compatible, but not sure if
it adds value to the application itself.
Now if this RFC is the best way to improve things if one think this is
something of an issue with the original RFC worth addressing I don't
know. But at least worth having a discussion about! The purist within
me sighs a little when applying the code pattern above...
r//Björn L
Den 2022-04-08 kl. 19:34, skrev Craig Francis:
Hi,
I've written a new draft RFC to address theNULL
coercion problems:
https://wiki.php.net/rfc/null_coercion_consistency
...One code pattern to upgrade your code to PHP 8.1 and 9.0 is to use the
null coalescing operator like rtrim($string ?? '',...).If one then has a legacy codebase that works flawlessly I would say that
this path is the preferred one since it requires a minimum of work. To
dig into your code base and address why the parameter ends up like null
is probably more cumbersome. OTOH, one doesn't reap the benefits of the
"Deprecate passing null to non-nullable arguments of internal functions"
RFC since it requires to much work.One could argue that the "Deprecating passing null" RFC is nice, but has
a flaw when applied for existing code. Applying the null coalescing code
pattern is an easy way to make your code 8.1 compatible, but not sure if
it adds value to the application itself.Now if this RFC is the best way to improve things if one think this is
something of an issue with the original RFC worth addressing I don't
know. But at least worth having a discussion about! The purist within
me sighs a little when applying the code pattern above...
Thanks Björn,
My main concern is about backwards compatibility.
If this RFC fails, I think the only way for existing code to avoid the issue will be via an automated tool, but for it to do this safely, I agree with your suggestion, I think it will need to use null coalescing (or casting via (string) $var
or strval($var)
) at the variable sinks... but the idea of doing that to the ~335 parameters I've listed, is, well, ugly to say the least... but I'd much rather do that, instead of having projects stuck on 8.x (where they will inevitably stop receiving security patches).
I would like to know about the future benefits, and this is where my discussion with Rowan has been interesting (trying to work out the benefits vs costs)... but, tbh, I'm not really seeing a future benefit for breaking NULL
coercion with internal functions (sorry). I do see benefits for consistency, and rejecting weird forms of type coercion (e.g. htmlspecialchars(array())
), and I've tried to factor these into my RFC, but considering how NULL
and an Empty String can often be seen as interchangeable (many developers giving no thought to the difference), I'm not sure how these fatal errors will help.
Craig
On Fri, Apr 8, 2022 at 10:35 AM Craig Francis craig@craigfrancis.co.uk
wrote:
Hi,
I've written a new draft RFC to address the
NULL
coercion problems:https://wiki.php.net/rfc/null_coercion_consistency
This is due to the result of the Allow
NULL
quiz:https://quiz.craigfrancis.co.uk/
14 votes for Fatal Type Errors irrespective of
strict_types=1
;
13 votes forNULL
coercion when not usingstrict_types=1
;
8 votes to update some parameters to allow NULL;I appreciate some want to force strict type checking on everyone, but I
want to make sure we have this properly documented, with names, and
explanations.Breaking changes should be justified - if they aren't, they only
make upgrading difficult and frustrating (bad for security).Craig
This RFC seems to be trying to force all PHP developers, regardless of what
they think, to treat null as "whatever the default value is within the
type context of execution", which is probably the most dangerous and
bug-prone usage of null in PHP code. This would make it almost impossible
for most programs to treat null instead as "an undefined value which must
be explicitly set", which is another usage I commonly see in code.
Given that, I think there needs to be much more justification of this
change personally.
Jordan
https://wiki.php.net/rfc/null_coercion_consistency https://wiki.php.net/rfc/null_coercion_consistency
This RFC seems to be trying to force all PHP developers, regardless of what they think, to treat null as "whatever the default value is within the type context of execution", which is probably the most dangerous and bug-prone usage of null in PHP code. This would make it almost impossible for most programs to treat null instead as "an undefined value which must be explicitly set", which is another usage I commonly see in code.
Not what I'm going for... but anyway, to get an idea of your position, do you think the string '15' should be coerced to a number, or should it fatal error as well? e.g.
$my_number = round($request->get('my_number'));
Craig
Hey Craig
Not what I'm going for... but anyway, to get an idea of your position, do
you think the string '15' should be coerced to a number, or should it fatal
error as well? e.g.$my_number = round($request->get('my_number'));
Passing '15' for int
should throw a type error: this is why people
(slowly, steadily) opt into strict types.
I mostly refrained from commenting on this RFC thus far exactly because I'm
already "out of the woods", and I already add strict types everywhere.
Still, when upgrading legacy PHP apps, I would have to deal with the
problem, at which point I prefer looking for type errors in internal
function calls, than looking for changes in null coercion (the one proposed
here) everywhere.
Practically, my idea is:
- coercion rules are old and well known
- changing coercion rules around
null
drags in complexity - focus should be in migrating to strict types instead
Regardless of the output of this RFC, people upgrading to 8.0 or 8.1 will
already experience BC breaks in internal function input types, and adapt to
them.
This RFC added on top of that further destabilises the upgrade path to 8.2
or other versions (wherever introduced).
For me, this makes the RFC a net negative value for the language and its
consumers, as it adds a problem.
Hey Craig
Hi Marco,
Thanks for your thoughts.
Not what I'm going for... but anyway, to get an idea of your position, do you think the string '15' should be coerced to a number, or should it fatal error as well? e.g.
$my_number = round($request->get('my_number'));
Passing '15' for
int
should throw a type error: this is why people (slowly, steadily) opt into strict types.
I don't think everyone agrees :-)
I mostly refrained from commenting on this RFC thus far exactly because I'm already "out of the woods", and I already add strict types everywhere.
Still, when upgrading legacy PHP apps, I would have to deal with the problem, at which point I prefer looking for type errors in internal function calls, than looking for changes in null coercion (the one proposed here) everywhere.
Practically, my idea is:
- coercion rules are old and well known
- changing coercion rules around
null
drags in complexity- focus should be in migrating to strict types instead
Regardless of the output of this RFC, people upgrading to 8.0 or 8.1 will already experience BC breaks in internal function input types, and adapt to them.
I will note that my RFC does not change NULL
coercion (why I quoted the documentation), and I'm not against the other BC breaks where certain type coercions are clearly problematic.
It was 8.1 which introduced this specific BC problem (with ignorable deprecation notices)... and while that change was to begin alignment with user defined functions, it's the user defined functions that have been the oddity (i.e. do not use NULL
coercion, unlike other contexts, such as concatenation, == comparisons, arithmetics, sprintf, print, echo, array keys). Personally I think it should have been user defined functions that worked with the "old and well known" coercion rules, and doing so would have reduced complexity (i.e. working in the same way).
I'm also not against increasing strictness, and some of the type errors (ref previous emails, and noted in my RFC), but I think it's more of a balance, where an overly strict environment creates different problems (e.g. making the language much more verbose, and can be unnecessarily hard to learn/use).
But my focus is on upgrading existing code, and this one is difficult to find and update (e.g. the lack of tooling to help).
Craig
I will note that my RFC does not change
NULL
coercion (why I quoted the documentation), and I'm not against the other BC breaks where certain type coercions are clearly problematic.It was 8.1 which introduced this specific BC problem (with ignorable deprecation notices)... and while that change was to begin alignment with user defined functions, it's the user defined functions that have been the oddity (i.e. do not use
NULL
coercion, unlike other contexts, such as concatenation, == comparisons, arithmetics, sprintf, print, echo, array keys). Personally I think it should have been user defined functions that worked with the "old and well known" coercion rules, and doing so would have reduced complexity (i.e. working in the same way).
...
But my focus is on upgrading existing code, and this one is difficult to find and update (e.g. the lack of tooling to help).
It is exactly user-defined functions that this RFC introduces breakage for.
The behaviour to throw on null in user-defined functions exists since PHP
7.0, and is being relied on. Changing these now would introduce behaviour changes
that are harder to find than new type errors.
Using strict typing isn't an option either/would be just as much work as auditing
the changes this would introduce.
It may be that user-defined functions should have accepted null to begin with in
your opinion, but that still makes it a breaking change now.
Regards,
Mel
It is exactly user-defined functions that this RFC introduces breakage for.
The behaviour to throw on null in user-defined functions exists since PHP
7.0, and is being relied on. Changing these now would introduce behaviour changes
that are harder to find than new type errors.
Using strict typing isn't an option either/would be just as much work as auditing
the changes this would introduce.It may be that user-defined functions should have accepted null to begin with in
your opinion, but that still makes it a breaking change now.
Indeed. I suggest to add a note to "Backward Incompatible Changes"
section of the RFC. This changes the contract, but does not throw new
errors in existing code.
Now my 2 cents, reading docs and RFCs (I'm not the internals developer):
The https://wiki.php.net/rfc/scalar_type_hints_v5 says that NULL
was
excluded "in order to be consistent with our existing type declarations
for classes, callables and arrays".
So, to me it sounds like the main reason was that NULL
can't be cast to
array/object/callable, not that it was expected to be treated as a new
type in the future, nor more strictness was intended.
At the time excluding NULL
made sense as it was a new feature anyway,
and it didn't apply to internal functions, so it was fully sign-in
feature. Although I must say that "it should be possible for existing
userland libraries to add scalar type declarations without breaking
compatibility" wasn't really true, because of NULL.
You could call the proposed NULL-to-scalar coercion in weak-type checks
a new feature that makes
https://wiki.php.net/rfc/deprecate_null_to_scalar_internal_arg
obsolete/invalid, still, there are good reasons to do this. Imo, the BC
break is much less problematic than the one we're trying to solve here.
--
Aleksander Machniak
Kolab Groupware Developer [https://kolab.org]
Roundcube Webmail Developer [https://roundcube.net]
PGP: 19359DC1 # Blog: https://kolabian.wordpress.com
It is exactly user-defined functions that this RFC introduces breakage for.
The behaviour to throw on null in user-defined functions exists since PHP
7.0, and is being relied on. Changing these now would introduce behaviour changes
that are harder to find than new type errors.
Using strict typing isn't an option either/would be just as much work as auditing
the changes this would introduce.
It may be that user-defined functions should have accepted null to begin with in
your opinion, but that still makes it a breaking change now.Indeed. I suggest to add a note to "Backward Incompatible Changes" section of the RFC. This changes the contract, but does not throw new errors in existing code.
Thanks Aleksander,
While I'm emailing Mel off-list about a more realistic example, I've added a simple example to demonstrate a BC break:
https://wiki.php.net/rfc/null_coercion_consistency#backward_incompatible_changes https://wiki.php.net/rfc/null_coercion_consistency#backward_incompatible_changes
function my_function(string $my_string) {
var_dump($my_string);
}
try {
my_function('A'); // string(1) "A"
my_function(1); // string(1) "1"
my_function(1.2); // string(3) "1.2"
my_function(true); // string(1) "1"
my_function(false); // string(0) ""
my_function(NULL); // Throw Type Error
} catch (TypeError $e) {
// Do something important?
}
I believe the example Mel is looking at involves a function that should return an integer; and later, a second function should only be run when the returned value is !== 0
(not != 0
or > 0
, due to "style guides telling people to unconditionally prefer triple equal")... that's all working fine, but sometimes the first function can return NULL
(I assume that indicates a problem), where it's useful to have the second function throw an exception when this happens (so no more processing happens, and it can be investigated). While I think there are better ways of handling this, I'll work with Mel to get this example into my RFC.
Personally I think these are quite unusual, and don't match the size of the BC impact we face when internal functions (that have been accepting and coercing NULL
since, well, forever) start throwing fatal type errors in 9.0.
Craig
On Sat, May 7, 2022 at 1:38 AM Craig Francis craig@craigfrancis.co.uk
wrote:
Not what I'm going for... but anyway, to get an idea of your position, do
you think the string '15' should be coerced to a number, or should it fatal
error as well? e.g.$my_number = round($request->get('my_number'));
'15' is not an undefined value.
If a program uses the paradigm that null represents the equivalent of
something like unset($var)
, then the program would not expect coercion to
happen at all if a specific type is demanded, for the same reason that a C
program wouldn't expect an out-of-range memory address to silently evaluate
as an int 0. For programs that use this paradigm, the type system that PHP
already uses which has a difference between ?int
and int
, means that
there is an explicit piece of code noting the exceptions to this rule about
how null is treated.
What exactly would be the purpose of ?int
if this RFC was passed? To pass
the value as null instead of 0? That's it? What about int|float
? Which
one would it be coerced to? This pretty radically changes how typing itself
would work in existing user programs. What I'm saying is that for such a
radical BC break you need to provide some compelling justification, and I'm
concerned from your replies in this thread and the content of the RFC that
you don't actually understand the extent of a BC-break you're proposing.
Jordan
On Sat, May 7, 2022 at 2:11 PM Jordan LeDoux jordan.ledoux@gmail.com
wrote:
I'm concerned from your replies in this thread and the content of the RFC
that you don't actually understand the extent of a BC-break you're
proposing.
Sorry, that was a tad more combative than I intended. What I'm trying to
say is that I'm concerned there may be a blindspot here about the impact
which could significantly impact developers in ways that aren't being
accounted for.
Jordan
What exactly would be the purpose of
?int
if this RFC was passed? To pass
the value as null instead of 0? That's it?
Yes. No change here.
What about
int|float
? Which one would it be coerced to?
What happens if you pass FALSE
to such an argument? int(0). The same
would happen with NULL.
This pretty radically changes how typing itself
would work in existing user programs. What I'm saying is that for such a
radical BC break you need to provide some compelling justification.
If coercion is possible it should be done as it is done for
bool/int/float/string. Inside the function you can still trust the
variable type is as expected. The only difference is that it will not
throw an error on NULL. Right now you can still use your function
"incorrectly" and not get an error, e.g. if you pass FALSE
to it (I
understand NULL
might be more common). And, if you use strict_types=1
nothing changes for you.
I don't see this as a radical change. Definitely not more radical than
making the code to throw errors where they were not thrown before (no
matter how long the deprecation period was). So, imo the BC-break
argument is not that strong in this case.
The consistency argument is also not that strong (but here probably more
in favor of the change) because as it was mentioned already current
behavior is consistent with some things and inconsistent with others.
The essential question is whether we want more of "weak-typing coercion"
or not. More would mean that there's no breakage for existing code in
PHP9 (essentially only 8.1 would be "broken" (because of the deprecation
notice) so people could just skip it). No need for ugly code like "$var
?? ''" everywhere is also a win.
Ok, this was my take, now I will shut up.
--
Aleksander Machniak
Kolab Groupware Developer [https://kolab.org]
Roundcube Webmail Developer [https://roundcube.net]
PGP: 19359DC1 # Blog: https://kolabian.wordpress.com
What happens if you pass
FALSE
to such an argument? int(0). The same
would happen with NULL.
The thing that everyone seems to be glossing over with these coercion
examples is that in order to have any scalar value, it must be explicitly
set somewhere. That might be inside an internal function with a
pass-by-ref, it might be on a regular old assignment line, it might be as
the result of a function return... but somewhere the scalar must be set to
that variable.
This is not the case with null. If you use the unset() function on a
variable for example, it will var_dump as null and it will pass an
is_null()
check and it will pass a $var === null and it will return
false for an isset() check. Null loses these qualities if it is cast to
any scalar.
False being cast to int(0) at least doesn't change whether or not it will
return true or false for isset(). Null fundamentally has the meaning that a
variable is uninitialized, no matter what other ways a program happens to
use it.
Jordan
This is not the case with null. If you use the unset() function on a
variable for example, it will var_dump as null and it will pass an
is_null()
check and it will pass a $var === null and it will return
false for an isset() check. Null loses these qualities if it is cast to
any scalar.
Fortunately the writing is on the wall for the undefined cases, but that
doesn't take away from your argument.
Den 2022-05-08 kl. 08:52, skrev Aleksander Machniak:
What exactly would be the purpose of
?int
if this RFC was passed? To
pass
the value as null instead of 0? That's it?Yes. No change here.
What about
int|float
? Which one would it be coerced to?What happens if you pass
FALSE
to such an argument? int(0). The same
would happen with NULL.This pretty radically changes how typing itself
would work in existing user programs. What I'm saying is that for such a
radical BC break you need to provide some compelling justification.If coercion is possible it should be done as it is done for
bool/int/float/string. Inside the function you can still trust the
variable type is as expected. The only difference is that it will not
throw an error on NULL. Right now you can still use your function
"incorrectly" and not get an error, e.g. if you passFALSE
to it (I
understandNULL
might be more common). And, if you use strict_types=1
nothing changes for you.I don't see this as a radical change. Definitely not more radical than
making the code to throw errors where they were not thrown before (no
matter how long the deprecation period was). So, imo the BC-break
argument is not that strong in this case.The consistency argument is also not that strong (but here probably more
in favor of the change) because as it was mentioned already current
behavior is consistent with some things and inconsistent with others.The essential question is whether we want more of "weak-typing coercion"
or not. More would mean that there's no breakage for existing code in
PHP9 (essentially only 8.1 would be "broken" (because of the deprecation
notice) so people could just skip it). No need for ugly code like "$var
?? ''" everywhere is also a win.
It's not only ugly code ;) To make your program/application/library 8.1
compatible using that codepattern requires en effort, but brings close
to zero improvement, except being 8.1 compatible. So the net effect is
negative.
r//Björn L
It's not only ugly code ;) To make your program/application/library 8.1
compatible using that codepattern requires en effort, but brings close
to zero improvement, except being 8.1 compatible. So the net effect is
negative.
Important correction: that change is to make your codebase PHP 9.0
compatible, not PHP 8.1 compatible.
That is the point of deprecation notices, to give advance warning that
something will become incompatible in the future; when you act on that
notice is up to you, and not in any sense about "compatibility" with the
version that raises the notice.
(For my thoughts on the rest of what you're saying, see every other
message I've sent to this thread.)
Regards,
--
Rowan Tommins
[IMSoP]
It's not only ugly code ;) To make your program/application/library 8.1
compatible using that codepattern requires en effort, but brings close
to zero improvement, except being 8.1 compatible. So the net effect is
negative.Important correction: that change is to make your codebase PHP 9.0
compatible, not PHP 8.1 compatible.That is the point of deprecation notices, to give advance warning that
something will become incompatible in the future; when you act on that
notice is up to you, and not in any sense about "compatibility" with the
version that raises the notice.(For my thoughts on the rest of what you're saying, see every other
message I've sent to this thread.)Regards,
--
Rowan Tommins
[IMSoP]--
To unsubscribe, visit: https://www.php.net/unsub.php
Important correction: that change is to make your codebase PHP 9.0
compatible, not PHP 8.1 compatible.
If you are writing library code, it is 8.1 compatible to avoid people
opening issues about spamming their logs.
If you are writing library code, it is 8.1 compatible to avoid people
opening issues about spamming their logs.
Yes, I understand that such complaints are common, but we should be
working to provide documentation and tooling to those users, rather than
accepting their misinterpretation and repeating it on this list.
Otherwise, we might as well just promote all deprecation notices to
fatal errors immediately.
I will open a new thread about this later.
Regards,
--
Rowan Tommins
[IMSoP]
Not what I'm going for... but anyway, to get an idea of your position, do you think the string '15' should be coerced to a number, or should it fatal error as well? e.g.
$my_number = round($request->get('my_number'));
'15' is not an undefined value.
If a program uses the paradigm that null represents the equivalent of something like
unset($var)
Hi Jordan,
Some programers can do that, but it's not how everyone sees it.
NULL
is not the same as unset()
, it is a value in itself, and many programmers simply give no thought to the distinction between an Empty String and NULL. e.g. when using the noted frameworks, NULL
is provided for user values, and developers will often treat it the same as an Empty String... if $name == ''
, then show an error saying your name is required; if strlen($name) > 200
then your name is too long, etc.
What exactly would be the purpose of
?int
if this RFC was passed? To pass the value as null instead of 0?
I'll talk about int
below; but just so I can focus on the nullability part of this question, I'll use ?string
...
Yes, that's all I'm proposing.
For example ?string
would not change (it will continue to provide the NULL
value to the function), whereas string
would coerce NULL
to an Empty String, e.g.
function accept_nullable_string(?string $my_string) {
var_dump($my_string);
}
function accept_string(string $my_string) {
var_dump($my_string);
}
accept_nullable_string(3); // string(1) "3"
accept_nullable_string('2'); // string(1) "2"
accept_nullable_string(true); // string(1) "1"
accept_nullable_string(false); // string(0) ""
accept_nullable_string(NULL); // `NULL`
accept_string(3); // string(1) "3"
accept_string('2'); // string(1) "2"
accept_string(true); // string(1) "1"
accept_string(false); // string(0) ""
accept_string(NULL); // Currently TypeError, change to string(0) ""
That's it?
Yep, but the point is about avoiding the upgrade problem when internal functions will stop coercing NULL, and instead throw type errors (these are hard to find, and will break code in seemly random places).
What about
int|float
? Which one would it be coerced to?
Good question,
Short answer; today, if you pass the string '4' to a function that uses int|float
, then it receives int(4)
:
function accept_number(int|float $my_int) {
var_dump($my_int);
}
accept_number('4'); // int(4)
Longer answer...
First I'll note the documentation says (for string to integer conversion) "If the string is numeric or leading numeric then it will resolve to the corresponding integer value, otherwise it is converted to zero (0)."
https://www.php.net/manual/en/language.types.integer.php#language.types.integer.casting https://www.php.net/manual/en/language.types.integer.php#language.types.integer.casting
But an Empty String is not coerced to int(0)
for user defined functions, or arithmetic (e.g. 5 + ""
complains after PHP 7.1)... whereas, an empty string is converted to 0 when using intval($var)
, and (int) $var
.
While I think this can be a bit harsh, I can see how it might avoid some ambiguity/issues, e.g. $offset = ''; $a = substr('abc', $offset)
, although I have seen substr('abc', NULL, $length)
a few times.
If PHP was to throw an exception for NULL
to integer coercion with internal functions, while I recognise this would still create a BC break (e.g. round($nullable)
), I'd might be fine with it, because an Empty String to Integer is also rejected.
That said, and back to your question about int|float
, because the string '4' is coerced to int(4)
, I'd prefer NULL
to be coerced to int(0)
.
This pretty radically changes how typing itself would work in existing user programs.
I wouldn't say this is radical, considering it's just removing a single type error (I suspect it's rare for code to rely on NULL
coercion triggering an exception)... in contrast, internal functions have always worked with null coercion, and works in other contexts like string concatenation ('A' . $nullable
), == comparisons, the print()/echo() functions, etc.
What I'm saying is that for such a radical BC break you need to provide some compelling justification, and I'm concerned from your replies in this thread and the content of the RFC that you don't actually understand the extent of a BC-break you're proposing.
As noted previously, and the example I've added to the "Backward Incompatible Changes" section, I don't think this is a radical BC break, it's fairly small change designed to make NULL
coercion work as documented, in a constant way (all contexts), and most importantly, avoid the upgrade problems for PHP 9.0 (please keep in mind the lack of tooling to help with this, where the only safe and easy way to handle this would be to add NULL
coalescing throughout the code base, which is not a good idea).
Craig
Hi,
I've written a new draft RFC to address the
NULL
coercion problems:
As a voter, I'm in favor of this RFC.
I suggest to rename "Documentation" section title to "Scalars coercion
inconsistency" or sth like that, as this section isn't really much about
the documentation, rather the current state.
--
Aleksander Machniak
Kolab Groupware Developer [https://kolab.org]
Roundcube Webmail Developer [https://roundcube.net]
PGP: 19359DC1 # Blog: https://kolabian.wordpress.com
Hi,
I've written a new draft RFC to address theNULL
coercion problems:
https://wiki.php.net/rfc/null_coercion_consistencyAs a voter, I'm in favor of this RFC.
I suggest to rename "Documentation" section title to "Scalars coercion inconsistency" or sth like that, as this section isn't really much about the documentation, rather the current state.
Good point Aleksander, I started with 5 quotes from the Documentation, but you're right, the examples have grown a bit since then, and the second part does show the Current State, so I've gone with that heading (hope that's ok).
Craig
I've written a new draft RFC to address the
NULL
coercion problems:
https://wiki.php.net/rfc/null_coercion_consistency https://wiki.php.net/rfc/null_coercion_consistency
For those against my RFC, can you take a quick look at this patch for Laravel:
https://github.com/laravel/framework/pull/36262/files#diff-15b0a3e2eb2d683222d19dfacc04c616a3db4e3d3b3517e96e196ccbf838f59eR118 https://github.com/laravel/framework/pull/36262/files#diff-15b0a3e2eb2d683222d19dfacc04c616a3db4e3d3b3517e96e196ccbf838f59eR118
If passing NULL
to htmlspecialchars()
represents a problem, as used in templates like <p>Hi {{ $name }}</p>
, presumably this patch should be reverted?
Craig
For those against my RFC, can you take a quick look at this patch for Laravel:
https://github.com/laravel/framework/pull/36262/files#diff-15b0a3e2eb2d683222d19dfacc04c616a3db4e3d3b3517e96e196ccbf838f59eR118 https://github.com/laravel/framework/pull/36262/files#diff-15b0a3e2eb2d683222d19dfacc04c616a3db4e3d3b3517e96e196ccbf838f59eR118
If passing
NULL
tohtmlspecialchars()
represents a problem, as used in templates like<p>Hi {{ $name }}</p>
, presumably this patch should be reverted?
I notice that the docblock didn't previously list null as valid input,
so this was only working by mistake, as the commit message admits. If
you copied the documented union to the native type declaration on the
parameter, it would immediately reject nulls, because that has always
been the behaviour of user-defined functions. Static analysis tools will
also have complained that null was not a valid input according to the
documentation.
I also note that the commit message says "On PHP >= 8.1, an error is
thrown if null
is passed to htmlspecialchars
." which is of course
not true for native PHP, only if you make the highly dubious decision to
promote deprecations to errors.
As an anecdote, I was recently working on a bug involving nulls causing
unintended behaviour in an API. As part of the clean-up, I changed a
function signature from getByIdentifer($identifier) to
getByIdentifer(string $identifier), and during testing, got an error
because I'd missed a scenario where null was passed. This was a good
result - it prevented the broken behaviour and alerted me to the case
that needed fixing. If the parameter had instead been silently coerced
to an empty string, I would have got neither an error nor the original
null behaviour, causing a new bug that might have taken much longer to
track down.
If your RFC as drafted was implemented, I could only have achieved the
desired check by turning on strict_types=1 for whole sections of the
application, which would probably require dozens of other fixes, or
writing an ugly manual check:
function getByIdentifier(?string $identifier) {
if ( $identifier === null ) {
throw new TypeError("Function doesn't actually accept nulls");
}
// ...
}
As I have said previously, I share your concern about the impact of the
currently planned change, but I don't think loosening the existing type
rules on user-defined functions is a good solution.
Regards,
--
Rowan Tommins
[IMSoP]
For those against my RFC, can you take a quick look at this patch for Laravel:
If passing
NULL
tohtmlspecialchars()
represents a problem, as used in templates like<p>Hi {{ $name }}</p>
, presumably this patch should be reverted?I notice that the docblock didn't previously list null as valid input, so this was only working by mistake, as the commit message admits. If you copied the documented union to the native type declaration on the parameter, it would immediately reject nulls, because that has always been the behaviour of user-defined functions. Static analysis tools will also have complained that null was not a valid input according to the documentation.
To confirm the order of events:
First, the Docblock originally said this function did not accept NULL, but at runtime it accepted/coerced NULL
to an empty string. This is exactly how htmlspecialchars()
worked pre 8.1. Where developers using static analysis tools can choose to treat NULL
as an invalid value, and those tools could report nullable variables as an error (via strict type checking).
Second, the Docblock and implementation was updated to allow NULL, because NULL
is common value (a backwards compatibility issue), so this HTML encoding function, used by Blade templates by default, now accepts NULL.
This seems to undermine the argument that NULL
should not be passed to htmlspecialchars()
.
I did suggest updating the ~335 function parameters to be nullable; but that was rejected because some developers prefer to treat NULL
as an invalid value. To keep those developers happy, and avoid the backwards compatibility issue, it seems easier to allow NULL
to be coerced (like other contexts, e.g. string concat).
I also note that the commit message says "On PHP >= 8.1, an error is thrown if
null
is passed tohtmlspecialchars
." which is of course not true for native PHP, only if you make the highly dubious decision to promote deprecations to errors.
While I'd put the word "error" down as a typo, the intention is for 9.0 to throw a type error.
And while user-defined functions are part of the conversation (for consistency reasons), I'm trying to find the benefits of breaking NULL
coercion for internal functions (because, if there is an overall benefit, that Laravel Blade patch should be reverted).
As an anecdote, I was recently working on a bug involving nulls causing unintended behaviour in an API. As part of the clean-up, I changed a function signature from getByIdentifer($identifier) to getByIdentifer(string $identifier), and during testing, got an error because I'd missed a scenario where null was passed. This was a good result - it prevented the broken behaviour and alerted me to the case that needed fixing. If the parameter had instead been silently coerced to an empty string, I would have got neither an error nor the original null behaviour, causing a new bug that might have taken much longer to track down.
If your RFC as drafted was implemented, I could only have achieved the desired check by turning on strict_types=1 for whole sections of the application, which would probably require dozens of other fixes, or writing an ugly manual check:
function getByIdentifier(?string $identifier) {
if ( $identifier === null ) {
throw new TypeError("Function doesn't actually accept nulls");
}
// ...
}
It sounds like you got lucky - you have a function that has a problem with NULL
(but I assume it's fine with an empty string?), and during your testing you happened to pass NULL
to this function. As noted before, static analysis is considerably better at these types of checks, because it's able check if variables can contain NULL. They can also perform other checks as well (important when your code seems to care about NULL
vs an empty string).
As I have said previously, I share your concern about the impact of the currently planned change, but I don't think loosening the existing type rules on user-defined functions is a good solution.
If you have a better solution, please let me know.
I don't think breaking NULL
coercion for internal functions helps (the change seems to involve loads of tiny updates, probably by littering projects with $val ?? ''
, strval($val)
, or (string) $val
, with hard to define benefits); in contrast, allowing NULL
coercion for user-defined functions would at least keep user-defined and internal functions consistent, with NULL
coercion continuing to work in the same way as other contexts (e.g. string concat, == comparisons, sprintf, etc); or how coercion works between string/int/float/bool.
That said, I would still like to know about the benefits of rejecting NULL
for htmlspecialchars()
, in the context of that Laravel Blade patch; because, if it is beneficial (vs the BC break), they should revert that patch... shouldn't they?
Craig
On Thu, May 26, 2022 at 7:21 AM Craig Francis craig@craigfrancis.co.uk
wrote:
That said, I would still like to know about the benefits of rejecting
NULL
forhtmlspecialchars()
, in the context of that Laravel Blade patch;
because, if it is beneficial (vs the BC break), they should revert that
patch... shouldn't they?
No, they should not. Laravel made an explicit choice to handle null values
in its escaping function for its templating framework. That is their
choice to make. They should not be forced to implicitly handle null values
because the language decides to add implicit type coercion to user defined
functions (because consistency seems to be so important to some), and they
should not be forced to reject null values because underlying language
functions reject them as well.
Every implicit type coercion is a bug in hiding IMO. If something doesn’t
explicitly support null, you shouldn’t have been passing null to it in the
first place. Relying on magic language behaviors to prevent type errors is
not a long term sustainable code pattern. If a function, by explicit
design, can safely work with null values then that parameter should be
properly declared nullable. If a function, by explicit design, does not
support null values then a type error is a hugely acceptable response.
Yes, that means that there is a lot of work involved for folks between now
and the release of PHP 9.0 to either refactor code to avoid type errors or
to get the language to expand the supported types in appropriate functions,
but IMO that effort from all parties is worth it in the long run to ensure
language consistency (I really don’t think userland PHP and
internal/extension PHP should have vastly different behaviors, and type
coercion support based on where a function is designed is one of those
holes that needs filled) and to ensure all APIs explicitly declare what
types of data they support.
- Michael Please pardon any errors, this message was sent from my iPhone.
That said, I would still like to know about the benefits of rejecting
NULL
forhtmlspecialchars()
, in the context of that Laravel Blade patch; because, if it is beneficial (vs the BC break), they should revert that patch... shouldn't they?No, they should not. Laravel made an explicit choice to handle null values in its escaping function for its templating framework. That is their choice to make. They should not be forced to implicitly handle null values because the language decides to add implicit type coercion to user defined functions (because consistency seems to be so important to some), and they should not be forced to reject null values because underlying language functions reject them as well.
Erm, I'm simply asking - If there is a good reason for throwing an exception when NULL
is passed to htmlspecialchars()
, then that reason would also apply to Laravel Blade for it's HTML encoding.
If you know what that reasons is, you should be able to make that case to them, and get that patch reverted.
I believe the only time it's helpful is when working in a strict type environment, which is best checked with static analysis, as those extra type checks can highlight unexpected values. But most (I'm assuming ~85%) developers simply do not use strict type checking, and don't seem to consider the sources of NULL
(see the "Common sources of NULL" I've listed in the RFC), and for those developers, spending days editing their code to manually convert those NULL
values, that does not seem justified.
To respond to your comment, Laravel doesn't provide a native type declaration (it's in a DocBlock), so they don't have any type coercion for their e()
function:
Every implicit type coercion is a bug in hiding IMO.
That's fine, but strict static analysis is so much better at finding these issues (i.e. can this variable be NULL, vs a runtime check to see if the value is NULL
this time)... and if you want runtime checks as well, that's where strict_types=1
comes in.
If a function, by explicit design, can safely work with null values then that parameter should be properly declared nullable.
htmlspecialchars()
can and always has safely worked with NULL, the same with the other 335 parameters I've listed. But when I suggested changing these parameters to be nullable, it was not well received.
Yes, that means that there is a lot of work involved for folks between now and the release of PHP 9.0 to either refactor code to avoid type errors or to get the language to expand the supported types in appropriate functions, but IMO that effort from all parties is worth it in the long run to ensure language consistency (I really don’t think userland PHP and internal/extension PHP should have vastly different behaviors, and type coercion support based on where a function is designed is one of those holes that needs filled) and to ensure all APIs explicitly declare what types of data they support.
I don't think userland and internal functions should have different behaviours either (my RFC does say "keep user-defined and internal functions consistent").
To achieve that, for those not using strict_types=1
, I'm simply suggesting that NULL
coercion continues to work for internal functions, and we update user-defined functions to allow NULL
coercion. This would also help developers add native types to their functions. Then we have a setup which is basically the same as how the string "6" can be coerced to the integer 6, and how string concatenation with NULL
works, how '' == $nullable
works, how
sprintf('Hi %s', $nullable)works, how
echo $nullable` works, etc.
Craig
That said, I would still like to know about the benefits of rejecting
NULL
forhtmlspecialchars()
, in the context of that Laravel Blade patch; because, if it is beneficial (vs the BC break), they should revert that patch... shouldn't they?No, they should not. Laravel made an explicit choice to handle null values in its escaping function for its templating framework. That is their choice to make. They should not be forced to implicitly handle null values because the language decides to add implicit type coercion to user defined functions (because consistency seems to be so important to some), and they should not be forced to reject null values because underlying language functions reject them as well.
Erm, I'm simply asking - If there is a good reason for throwing an exception when
NULL
is passed tohtmlspecialchars()
, then that reason would also apply to Laravel Blade for it's HTML encoding.
Laravel’s e()
helper function is more than a wrapper around htmlspecialchars()
which supports values beyond a string, and has elected to support a null value being passed into that function with explicit handling for it. I’ve seen similar patches land in other packages as well. Whether it is to prevent that package from triggering the deprecation regarding null values itself or the package owners choosing to explicitly handle the null case so users don’t need to deal with it doesn’t really matter, the point is folks have made that decision and it is not problematic in any way IMO. You’re basically saying that a framework choosing to apply the same htmlspecialchars($string ?? ‘’)
patch that applications are suggested to use (for those who don’t care to differentiate between null and empty string before something reaches an escaping function) is in the wrong by arguing for Laravel to revert that patch.
If a function, by explicit design, can safely work with null values then that parameter should be properly declared nullable.
htmlspecialchars()
can and always has safely worked with NULL, the same with the other 335 parameters I've listed. But when I suggested changing these parameters to be nullable, it was not well received.
The arginfo for the htmlspecialchars()
function, and if I’m reading correctly (I’m no C developer so I could very well be misinterpreting things) the internal php_html_entities()
function which htmlspecialchars()
calls, does not declare null as a supported value for the $string
parameter. Meaning that depending on whether your code uses strict_types or not, either calling the function with a null value triggers a type error or requires implicit type coercision to convert the value to a type that the function does support. This means to me that the htmlspecialchars()
function does NOT support null values and that passing null through only works because of the reliance on implicit type coercision. That, to me, is a valid reason for a type error to be raised; the code very explicitly requires a string and is not written to process a null value.
A blanket suggestion to make every string parameter which implicitly supports a null value nullable because they previously supported null through type coercion IMO should be shot down. Instead, parameters which are emitting a deprecation notice because they are receiving an unsupported null value should be reviewed, and if that function can by design support null values, those parameters should be updated to reflect nullability. IMO, htmlspecialchars()
is not a function which should expliciitly support a null value and the present deprecation is correct.
Yes, that means that there is a lot of work involved for folks between now and the release of PHP 9.0 to either refactor code to avoid type errors or to get the language to expand the supported types in appropriate functions, but IMO that effort from all parties is worth it in the long run to ensure language consistency (I really don’t think userland PHP and internal/extension PHP should have vastly different behaviors, and type coercion support based on where a function is designed is one of those holes that needs filled) and to ensure all APIs explicitly declare what types of data they support.
I don't think userland and internal functions should have different behaviours either (my RFC does say "keep user-defined and internal functions consistent").
To achieve that, for those not using
strict_types=1
, I'm simply suggesting thatNULL
coercion continues to work for internal functions, and we update user-defined functions to allowNULL
coercion.
On the record, I personally disagree with the strict_types declaration’s existence purely because it makes the runtime inconsistent based solely on what file something was called from, plus there are way to bypass a developer’s explicit opt-in to strict_types being enabled which makes it a “mostly strict but there’s still a gotcha” declaration at best. I don’t think this type of deep rooted behavioral difference improves PHP in any way, but that ship has long sailed.
With that said though, I disagree with exposing the internal type coercion behavior to userland code. I think the userland behavior as it is today is the correct behavior, even if I look at implicit type coercision as a bug which will cause hard-to-identify production issues for somebody at some point down the road. Let’s also be real here, a not-so-insignificant number of PHP builds are powered by platforms which have minimal if any automated testing and even less static analysis, meaning a static analyzer is not going to be of much help to those builds to begin with because they’re using non-analyzed code as a foundation. Changing the language in a way that makes these already fragile platforms even more suseptable to hidden type-related bugs is a bad idea. Personally speaking, I’ve been bitten by enough type-coercion related bugs in my career (and continue to address new issues arise thanks to a combination of poorly formed legacy data in the database plus open source packages becoming stricter in the types they support) that I would rather advocate for the language moving in a direction that makes type-related issues harder to hit and it to be noisy when it does encounter such issues; the changes in PHP 8.1 IMO are a step in the right direction.
[...] If there is a good reason for throwing an exception when
NULL
is passed tohtmlspecialchars()
, then that reason would also apply to Laravel Blade for it's HTML encoding.Laravel’s
e()
helper function is more than a wrapper aroundhtmlspecialchars()
which supports values beyond a string, and has elected to support a null value being passed into that function with explicit handling for it. I’ve seen similar patches land in other packages as well. Whether it is to prevent that package from triggering the deprecation regarding null values itself or the package owners choosing to explicitly handle the null case so users don’t need to deal with it doesn’t really matter, the point is folks have made that decision and it is not problematic in any way IMO. You’re basically saying that a framework choosing to apply the samehtmlspecialchars($string ?? ‘’)
patch that applications are suggested to use (for those who don’t care to differentiate between null and empty string before something reaches an escaping function) is in the wrong by arguing for Laravel to revert that patch.
Sorry, I've read this a few times, and I just don't understand what you mean... but let's leave it, I cannot think how else to re-write my question to get an answer.
htmlspecialchars()
can and always has safely worked with NULL, the same with the other 335 parameters I've listed. But when I suggested changing these parameters to be nullable, it was not well received.The arginfo for the
htmlspecialchars()
function, and if I’m reading correctly (I’m no C developer so I could very well be misinterpreting things) the internalphp_html_entities()
function whichhtmlspecialchars()
calls, does not declare null as a supported value for the$string
parameter. Meaning that depending on whether your code uses strict_types or not, either calling the function with a null value triggers a type error or requires implicit type coercision to convert the value to a type that the function does support. This means to me that thehtmlspecialchars()
function does NOT support null values and that passing null through only works because of the reliance on implicit type coercision. That, to me, is a valid reason for a type error to be raised; the code very explicitly requires a string and is not written to process a null value.A blanket suggestion to make every string parameter which implicitly supports a null value nullable because they previously supported null through type coercion IMO should be shot down. Instead, parameters which are emitting a deprecation notice because they are receiving an unsupported null value should be reviewed, and if that function can by design support null values, those parameters should be updated to reflect nullability. IMO,
htmlspecialchars()
is not a function which should expliciitly support a null value and the present deprecation is correct.
Erm, I really don't wish to be rude, but have you read my RFC, and the problem I'm trying to solve... in short, the backwards compatibility issue for PHP 9.0.
Developers will need to spend a lot of time updating their code, which is important to avoid the fatal type errors, but I simply do not understand what the benefit for them will be.
Yes, that means that there is a lot of work involved for folks between now and the release of PHP 9.0 to either refactor code to avoid type errors or to get the language to expand the supported types in appropriate functions, but IMO that effort from all parties is worth it in the long run to ensure language consistency (I really don’t think userland PHP and internal/extension PHP should have vastly different behaviors, and type coercion support based on where a function is designed is one of those holes that needs filled) and to ensure all APIs explicitly declare what types of data they support.
I don't think userland and internal functions should have different behaviours either (my RFC does say "keep user-defined and internal functions consistent").
To achieve that, for those not using
strict_types=1
, I'm simply suggesting thatNULL
coercion continues to work for internal functions, and we update user-defined functions to allowNULL
coercion.On the record, I personally disagree with the strict_types declaration’s existence purely because it makes the runtime inconsistent based solely on what file something was called from, plus there are way to bypass a developer’s explicit opt-in to strict_types being enabled which makes it a “mostly strict but there’s still a gotcha” declaration at best. I don’t think this type of deep rooted behavioral difference improves PHP in any way, but that ship has long sailed.
I believe George Peter Banyard is trying to work on this.
With that said though, I disagree with exposing the internal type coercion behavior to userland code. I think the userland behavior as it is today is the correct behavior, even if I look at implicit type coercision as a bug which will cause hard-to-identify production issues for somebody at some point down the road. Let’s also be real here, a not-so-insignificant number of PHP builds are powered by platforms which have minimal if any automated testing and even less static analysis, meaning a static analyzer is not going to be of much help to those builds to begin with because they’re using non-analyzed code as a foundation. Changing the language in a way that makes these already fragile platforms even more suseptable to hidden type-related bugs is a bad idea. Personally speaking, I’ve been bitten by enough type-coercion related bugs in my career (and continue to address new issues arise thanks to a combination of poorly formed legacy data in the database plus open source packages becoming stricter in the types they support) that I would rather advocate for the language moving in a direction that makes type-related issues harder to hit and it to be noisy when it does encounter such issues; the changes in PHP 8.1 IMO are a step in the right direction.
Sorry, but I'm finding this hard to follow, it sounds like you don't want any type coercion... e.g. the string "5" to int 5 should just not happen without the developer explicitly converting the type (e.g. intval)... I disagree, but we can leave it at that.
Craig
On Thu, May 26, 2022 at 5:21 AM Craig Francis craig@craigfrancis.co.uk
wrote:
It sounds like you got lucky - you have a function that has a problem with
NULL
(but I assume it's fine with an empty string?), and during your
testing you happened to passNULL
to this function. As noted before, static
analysis is considerably better at these types of checks, because it's
able check if variables can contain NULL. They can also perform other
checks as well (important when your code seems to care aboutNULL
vs an
empty string).
Nearly all code has a problem with null. It very much feels like the
original effort to deprecate null calls decided to resolve this by saying
"let's have the language help developers improve their code so it doesn't
have these problems in the first place", and this effort is trying to
resolve this by saying "let's have the language support the buggy code in
ways that makes it work".
At my job, my task for the last three weeks has literally been upgrading
our internal codebase for 8.1, and the biggest set of logs I'm dealing with
is exactly what you're talking about here: null's passed to internal
functions. Every single case I've looked at so far has been traced to code
that was written incorrectly, where some code somewhere was not properly
guarding its values, and error cases were slipping through.
Jordan
It sounds like you got lucky - you have a function that has a problem with
NULL
(but I assume it's fine with an empty string?), and during your testing you happened to passNULL
to this function. As noted before, static analysis is considerably better at these types of checks, because it's able check if variables can contain NULL. They can also perform other checks as well (important when your code seems to care aboutNULL
vs an empty string).Nearly all code has a problem with null.
Erm, but it doesn't... does it?
I know I keep going on about this very simply example, but it represents a fairly typical style of programming PHP, and I just do not understand what the problem with it is:
$search = $request->input('q'); // Laravel, returns `NULL` when 'q' is not defined.
echo 'Results for ' . htmlspecialchars($search);
But forget about it, hopefully someone else can find a solution to the problem.
It very much feels like the original effort to deprecate null calls decided to resolve this by saying "let's have the language help developers improve their code so it doesn't have these problems in the first place", and this effort is trying to resolve this by saying "let's have the language support the buggy code in ways that makes it work".
At my job, my task for the last three weeks has literally been upgrading our internal codebase for 8.1, and the biggest set of logs I'm dealing with is exactly what you're talking about here: null's passed to internal functions. Every single case I've looked at so far has been traced to code that was written incorrectly, where some code somewhere was not properly guarding its values, and error cases were slipping through.
For one of the teams I work with (the only one trying to make the relevant changes), this is also their "biggest" problem... but they are having exactly the opposite experience, a considerable amount of hours have gone into finding and changing their code, and not a single change was for code that was "written incorrectly" (I suppose that depends on what you think "correct" code is)... the other teams I work with are either suppressing this notice, or simply not upgrading to 8.1.
Craig
First, the Docblock originally said this function did not accept NULL, but at runtime it accepted/coerced
NULL
to an empty string. This is exactly howhtmlspecialchars()
worked pre 8.1. Where developers using static analysis tools can choose to treatNULL
as an invalid value, and those tools could report nullable variables as an error (via strict type checking).
The same events can be given a very different narrative:
Originally, this function was not intended to accept null. The
documentation clearly stated the accepted types, and any static analysis
tool would reject any value not in that list. The author of the function
chose to treat NULL
as an invalid value.
Because the list of valid types wasn't enforced at run-time, it was
accidentally possible to pass null. Many developers who weren't using
additional tooling came to rely on this bug, so as a compromise, the
authors decided to change the documented behaviour of nulls.
At no point did anybody look at the function and say "I can safely pass
null to this, as long as I remember never to use static analysis tools".
They accidentally passed null, and by luck got a useful result.
I also note that the commit message says "On PHP >= 8.1, an error is thrown if
null
is passed tohtmlspecialchars
." which is of course not true for native PHP, only if you make the highly dubious decision to promote deprecations to errors.
While I'd put the word "error" down as a typo, the intention is for 9.0 to throw a type error.
My point is that there is no action required on PHP 8.1. The commit
message should have said, "On PHP >= 9.0, an error will be thrown if
null
is passed to htmlspecialchars
."
And while user-defined functions are part of the conversation (for consistency reasons), I'm trying to find the benefits of breaking
NULL
coercion for internal functions (because, if there is an overall benefit, that Laravel Blade patch should be reverted).
This is a complete non sequitur. It's perfectly possible for different
scenarios to support different decisions, and what Laravel decides that
particular function should accept is based on their own assessment of
the trade-offs. PHP is free to make an entirely different decision based
on entirely different trade-offs.
Meanwhile, the benefits have been explained repeatedly in this thread.
You may not agree that they are worth the cost, and as I've repeatedly
said, I have some sympathy for that. But please stop trying to take the
conversation back to the very beginning by implying that you've asked a
question and not received an answer.
Regards,
--
Rowan Tommins
[IMSoP]
First, the Docblock originally said this function did not accept NULL, but at runtime it accepted/coerced
NULL
to an empty string. This is exactly howhtmlspecialchars()
worked pre 8.1. Where developers using static analysis tools can choose to treatNULL
as an invalid value, and those tools could report nullable variables as an error (via strict type checking).The same events can be given a very different narrative:
Originally, this function was not intended to accept null. The documentation clearly stated the accepted types, and any static analysis tool would reject any value not in that list. The author of the function chose to treat
NULL
as an invalid value.Because the list of valid types wasn't enforced at run-time, it was accidentally possible to pass null. Many developers who weren't using additional tooling came to rely on this bug, so as a compromise, the authors decided to change the documented behaviour of nulls.
At no point did anybody look at the function and say "I can safely pass null to this, as long as I remember never to use static analysis tools". They accidentally passed null, and by luck got a useful result.
I'm really sorry, but I'm not sure what your point is... I was asking why passing NULL
to htmlspecialchars()
represents a problem, but we should just leave it, we're not getting anywhere.
I also note that the commit message says "On PHP >= 8.1, an error is thrown if
null
is passed tohtmlspecialchars
." which is of course not true for native PHP, only if you make the highly dubious decision to promote deprecations to errors.
While I'd put the word "error" down as a typo, the intention is for 9.0 to throw a type error.My point is that there is no action required on PHP 8.1. The commit message should have said, "On PHP >= 9.0, an error will be thrown if
null
is passed tohtmlspecialchars
."
But deprecation messages do prompt developers to make changes (action is required).
For those who do "no action", when they try upgrading to 9.0, they will have a bit of a problem (with no easy fix).
And while user-defined functions are part of the conversation (for consistency reasons), I'm trying to find the benefits of breaking
NULL
coercion for internal functions (because, if there is an overall benefit, that Laravel Blade patch should be reverted).This is a complete non sequitur. It's perfectly possible for different scenarios to support different decisions, and what Laravel decides that particular function should accept is based on their own assessment of the trade-offs. PHP is free to make an entirely different decision based on entirely different trade-offs.
Sorry, but I'm not following... if there is a benefit/reason for PHP to reject NULL
for htmlspecialchars()
, and I'm just too stupid to see what it is, I would have assumed that benefit/reason would also apply to the HTML encoding function e()
in Laravel.
Meanwhile, the benefits have been explained repeatedly in this thread. You may not agree that they are worth the cost, and as I've repeatedly said, I have some sympathy for that. But please stop trying to take the conversation back to the very beginning by implying that you've asked a question and not received an answer.
Sorry, I'm just not getting it, I have read and re-read all of your emails many times, and I'm clearly not clever enough to understand.
Anyway, I've send a separate email about my RFC, because I cannot find a solution... I've tried my best, and I'm done.
Craig
On Fri, May 27, 2022 at 9:36 PM Craig Francis craig@craigfrancis.co.uk
wrote:
I know I keep going on about this very simply example, but it represents a
fairly typical style of programming PHP, and I just do not understand what
the problem with it is:$search = $request->input('q'); // Laravel, returns `NULL` when 'q' is not defined. echo 'Results for ' . htmlspecialchars($search);
If you want to go with this specific example, exactly as written, here is
the exact issue I have with it.
As already mentioned, htmlspecialchars()
is documented and the
implementing code requires a string argument. Passing a null value to it
requires explicitly writing non-typesafe code, and requires understanding
of how the PHP language handles type coercion, and how those rules are
different based on whether the file calling that function has the
strict_types
declaration. So on a code review, I have to understand the
context of the value passed through and know what context the language is
operating in to determine if non-string values are allowed to avoid errors,
and understand the semantics of PHP’s type coercion system to make sure the
output is something expected based on the input. Compare that to knowing
the function explicitly only accepts a string value and you know you are
working with a string through appropriate validation; for me, even in a
loosely typed environment, I’d rather follow the documented parameter types
instead of having to worry about all the other magic involved to make null
work.
Though, that input variable would never reach output in any of my projects
to begin with if it were null or an empty string; both would land on a code
path treated as no search being performed. So that example for me is also
way oversimplified because it doesn’t match a real world workflow IMO.
On Fri, May 27, 2022 at 9:36 PM Craig Francis craig@craigfrancis.co.uk
wrote:
Sorry, but I'm not following... if there is a benefit/reason for PHP to
rejectNULL
forhtmlspecialchars()
, and I'm just too stupid to see what
it is, I would have assumed that benefit/reason would also apply to the
HTML encoding functione()
in Laravel.
No, that would not automatically apply to Laravel’s helper function, or any
escaping functions in Twig, or escaping functions used in platforms without
templating frameworks like WordPress. If it’s not OK for functions that
can call htmlspecialchars()
to gracefully handle null to make sure those
trigger the same type error as the core language, then along the same
lines, I would argue that patches which add null coalescing or explicit
typecasting to that parameter in libraries or applications should also be
rejected because then they’re masking an error the language purposefully
elected to emit.
--
- Michael Please pardon any errors, this message was sent from my iPhone.
As an anecdote, I was recently working on a bug involving nulls causing
unintended behaviour in an API. As part of the clean-up, I changed a
function signature from getByIdentifer($identifier) to
getByIdentifer(string $identifier), and during testing, got an error
because I'd missed a scenario where null was passed. This was a good
result - it prevented the broken behaviour and alerted me to the case
that needed fixing. If the parameter had instead been silently coerced
to an empty string, I would have got neither an error nor the original
null behaviour, causing a new bug that might have taken much longer to
track down.If your RFC as drafted was implemented, I could only have achieved the
desired check by turning on strict_types=1 for whole sections of the
application, which would probably require dozens of other fixes, or
writing an ugly manual check:function getByIdentifier(?string $identifier) {
if ( $identifier === null ) {
throw new TypeError("Function doesn't actually accept nulls");
}
// ...
}
For this specific example, shouldn't it rather [already] be like this anyway?
function getByIdentifier(string $identifier) {
if ( $identifier === '' ) {
throw new InvalidArgumentException("Empty identifier");
}
// ...
}
(but I guess you could find actual examples where an empty string is
"valid" and null is not, indeed)
As I have said previously, I share your concern about the impact of the
currently planned change, but I don't think loosening the existing type
rules on user-defined functions is a good solution.
I agree the "issue" looks like different PoV of benefit VS cost, plus
the reticence about going back on the general trend of "more
strictness" (even with strict_types=0).
I'm just worried to see people rather disabling deprecation notices
(via error_reporting, or a custom error handler, or even patching
PHP's source and recompiling their own) than "fixing" the code
(granted, that's not specific to this RFC, but somewhat "highlighted" here)
FWIW, to avoid the "risks" of expr ?? ''
(possibly hiding undefined)
and (string)expr
/ strval(expr)
(potentially casting "too much"
without errors), I've already seen custom functions like
function null_to_empty_string(?string $string_or_null): string
{ return $string_or_null === null ? '' : $string_or_null; }
(but also its "opposite" empty_string_to_null(string $string): ?string)
Regards,
--
Guilliam Xavier
For this specific example, shouldn't it rather [already] be like this anyway?
function getByIdentifier(string $identifier) {
if ( $identifier === '' ) {
throw new InvalidArgumentException("Empty identifier");
}
// ...
}(but I guess you could find actual examples where an empty string is
"valid" and null is not, indeed)
The actual code in this case ended up in a generic routine that used
isset() to choose which SQL to generate. An empty string would generate
a WHERE clause that matched zero rows, but a null would omit the WHERE
clause entirely, and match all rows. So an extra pre-validation on the
string format might be useful for debugging, but wouldn't result in
materially different results.
I'm just worried to see people rather disabling deprecation notices
(via error_reporting, or a custom error handler, or even patching
PHP's source and recompiling their own) than "fixing" the code
(granted, that's not specific tothis RFC, but somewhat "highlighted" here)
Indeed, I think we have a general problem with how deprecations are
communicated and acted on in general. I have been thinking about how to
improve that, other than "never change anything" or "never warn people
we're going to change anything", and will try to write up my ideas soon.
function null_to_empty_string(?string $string_or_null): string
{ return $string_or_null === null ? '' : $string_or_null; }(but also its "opposite" empty_string_to_null(string $string): ?string)
That's actually an interesting observation. It's probably quite common
to treat empty strings as null when going from input to storage; and to
treat null as empty string when retrieving again. Importantly, databases
generally don't treat them as equivalent, so forgetting that
translation can be a real cause of bugs. I often advocate for string
columns in databases to allow either null or empty string, but not both
(by adding a check constraint), so that such bugs are caught earlier.
To go back to Craig's favourite example, that could be a genuine problem
caused by passing null to htmlspecialchars()
- if we intended that null
to be stored as such in the database, we've silently converted it into a
non-equivalent empty string. (Yes, escaping should be done on output not
input, but it's not completely infeasible that that combination might
happen.)
Regards,
--
Rowan Tommins
[IMSoP]
The actual code in this case ended up in a generic routine that used
isset() to choose which SQL to generate. An empty string would generate
a WHERE clause that matched zero rows, but a null would omit the WHERE
clause entirely, and match all rows. So an extra pre-validation on the
string format might be useful for debugging, but wouldn't result in
materially different results.
Maybe the routine could use e.g. array_key_exists()
rather than
isset()? (anyway, sometimes you actually want isset() null behavior,
or use a null default for a parameter as a "not passed" argument...)
That's actually an interesting observation. It's probably quite common
to treat empty strings as null when going from input to storage; and to
treat null as empty string when retrieving again. Importantly, databases
generally don't treat them as equivalent,
Yeah, I only know Oracle to do something as... "clever" as storing an
empty VARCHAR '' as NULL
(for "optimization" IIRC) -_-
so forgetting that
translation can be a real cause of bugs. I often advocate for string
columns in databases to allow either null or empty string, but not both
(by adding a check constraint), so that such bugs are caught earlier.
Same (sometimes you have no choice but allow NULL, e.g. an optional
foreign key, but the referenced primary key is not nullable and should
generally also reject '')
--
Guilliam Xavier
I've written a new draft RFC to address the
NULL
coercion problems:
https://wiki.php.net/rfc/null_coercion_consistency
I give up.
I'm clearly not clever enough to understand what the benefits are for breaking NULL
coercion... considering NULL
has been coerced for internal functions since forever, and continues to work in other contexts.
If anyone wants to work on a solution to the problem, feel free to either edit my RFC, or create your own RFC.
I believe my RFC documents every position put forward, and includes all of details I think are relevant.
I fear a vote now will only result in rejection; and once people put their names down for a certain position, they will never change their mind.
That said, if someone did continue, and got their RFC accepted, I should be able to organise some funding for the implementation.
For some background, I paid for the is_literal()
implementation before that RFC vote, as that patch is useful irrespective of the result (it's being used in a few projects now, and works incredibly well; thanks again to Joe Watkins for writing, and Máté Kocsis for testing). In this case, two of my clients are considering the cost of modifying their code (by adding a load of ?? ''
everywhere), and they would rather avoid that (time consuming, and makes their code more complicated). I suggested putting aside a budget to either do that modification, or fund the implementation if my RFC was to pass. I'm not sure how big or complicated the task would be, but I noted the R11 suggested day rate of $500 (~£390), and the implementation could take a few weeks.
If anyone does want to email me about this, I won't respond until at least June 6th.
Craig
I've written a new draft RFC to address the
NULL
coercion problems:
https://wiki.php.net/rfc/null_coercion_consistencyI give up.
Don't give up. You have my Yes vote.
Imo, the RFC:
- fixes real upgrade problem (very important),
- improves consistency in a better way than the solution introduced in 8.1.
- does not change strict_types behavior
Some people don't care with what arguments their functions are called
with. As long as the value can be coerced to the expected type the
function will do what it is supposed to do. Other people have
strict_types. All people don't want to be forced to modify a working code.
So, this is another "battle" between strict and non-strict camps. I'd
like to see which is the majority these days. I hope that even some
strict-code proponents can see this makes sense.
--
Aleksander Machniak
Kolab Groupware Developer [https://kolab.org]
Roundcube Webmail Developer [https://roundcube.net]
PGP: 19359DC1 # Blog: https://kolabian.wordpress.com
I've written a new draft RFC to address the
NULL
coercion problems:
https://wiki.php.net/rfc/null_coercion_consistency
I give up.Don't give up. You have my Yes vote.
Imo, the RFC:
- fixes real upgrade problem (very important),
- improves consistency in a better way than the solution introduced in 8.1.
- does not change strict_types behavior
Some people don't care with what arguments their functions are called with. As long as the value can be coerced to the expected type the function will do what it is supposed to do. Other people have strict_types. All people don't want to be forced to modify a working code.
So, this is another "battle" between strict and non-strict camps. I'd like to see which is the majority these days. I hope that even some strict-code proponents can see this makes sense.
I'm sorry Aleksander.
While I think this is the best approach as well, I don't have a vote, and I don't have the time/energy to respond to emails where I'm clearly not making any progress, or doing a good job of putting the case forward (for reference, each email usually takes a least an hour, as I want to make sure I've completely understood what the person is saying, and yay dyslexia).
Anyway, I'm going away for a week (leaving in 1 hour), and when I get back, I really need to focus on other things (life, client work, and things related to is_literal()
/literal-string
which I am making some good progress on).
Thank you for your comments though... and maybe you would be able to convince the other voters it's worth it?
Keep in mind, even my quiz from a few months ago had split the room on this issue:
https://quiz.craigfrancis.co.uk/
Craig
In this case, two of my clients are considering the cost of modifying their code (by adding a load of
?? ''
everywhere), and they would rather avoid that (time consuming, and makes their code more complicated).
Alternatively, they may wish to define their own functions that do take
null and then run a find / replace or rector rule to convert them over.
I would still recommend ?? because it's the most future proof.
We have passed two RFCs in recent months which will eliminate
default-nulls in 2 of the 3 major variable sources in the engine (the
remaining one being arrays), and that too may eventually find itself
promoted.
If it's a framework, I would suggest adding methods for explicitly
getting a string that defaults to '' instead of null.
I've written a new draft RFC to address the
NULL
coercion problems:
https://wiki.php.net/rfc/null_coercion_consistencyI give up.
For everyone affected by this... Rector has a "solution" via NullToStrictStringFuncCallArgRector.
It has a list of 275 functions (442 parameters), and every time these functions are used, and Rector isn't sure a variable cannot be NULL, it will add a string type cast, e.g.
- <?= htmlspecialchars($var) ?>
+ <?= htmlspecialchars((string) $var) ?>
So yeah, it's very messy, and while I think it makes the code worse, at least it's one way to avoid the fatal errors (coming in 9.0).
Oh, and developers can easily re-introduce more "problems" (because they might not be aware which variables can be NULL), so Rector will need to re-run.
I'm also tempted to create a composer library to provide a second solution, by using a namespace to re-define these functions, e.g.
<?php
namespace null_coercion_fixer;
function strlen(?string $string): int {
return \strlen(\strval($string));
}
?>
Have fun,
Craig