Hello everyone,
I’d like to align on the approach to validating arguments for built-in
functions (usually for flag inputs). Some ongoing discussions in PRs:
- https://github.com/php/php-src/pull/15647
- https://github.com/php/php-src/pull/15883
- https://github.com/php/php-src/pull/17859
In some cases, changes introduced ValueError immediately in the next
version, without a deprecation phase. To ensure a consistent approach, I
propose the following:
- Introduce a deprecation notice in the next minor version.
- Raise a ValueError in the following minor version.
If needed, I can create RFC, but as described by a few people in the
discussions, we can avoid it having the consensus. What do you think?
Kind regards,
Jorg Sowa
My vote is on straight to ValueError without RFC. Deprecations should
only be limited to meaningful changes in functionality, not fixing
bugs or undefined behaviour. Lack of argument validation is not a
behaviour users would have relied on in past.
Hello everyone,
I’d like to align on the approach to validating arguments for built-in functions (usually for flag inputs). Some ongoing discussions in PRs:
- https://github.com/php/php-src/pull/15647
- https://github.com/php/php-src/pull/15883
- https://github.com/php/php-src/pull/17859
In some cases, changes introduced ValueError immediately in the next version, without a deprecation phase. To ensure a consistent approach, I propose the following:
- Introduce a deprecation notice in the next minor version.
- Raise a ValueError in the following minor version.
If needed, I can create RFC, but as described by a few people in the discussions, we can avoid it having the consensus. What do you think?
My opinion on this topic is well known, validation errors should go straight to a ValueError.
And if not a ValueError then it should be an E_WARNING.
Raising a deprecation is frankly non-sensical.
Validation errors exist to inform the person programing that there is an error with the call site that is preventable very easily.
If we start emitting E_DEPRECATED
instead of E_WARNING/throwing a ValueError, we are incentivising people to add throwing error handlers for E_DEPRECATED,
something that I (and I know others too) lament constantly, as a deprecation is not an error.
Best regards,
Gina P. Banyard
Am 11.03.2025 um 07:32 schrieb Gina P. Banyard internals@gpb.moe:
I’d like to align on the approach to validating arguments for built-in functions (usually for flag inputs). Some ongoing discussions in PRs:
My opinion on this topic is well known, validation errors should go straight to a ValueError.
And if not a ValueError then it should be an E_WARNING.
Raising a deprecation is frankly non-sensical.
I'll sound a bit like a broken record :-)
The very first example shows why we should have an E_WARNING
phase to smoothen BC changes:
Code like
$a = array_filter(["a", "", "b"], strlen(...), 666);
which currently ignores the bogus value and allows the code to continue would suddenly stop with en Exception.
Fixing the code is trivial and a warning helps identify those code locations but decoupling the moment of upgrading PHP from having to fix all code is IMHO still valuable. Especially when using third-party packages which might have a delay in fixing it.
I'm still waiting on some packages to fix new warnings for PHP 8.4 even though we've upgraded to the new version already.
Conclusion:
Yes, I'd very much like the 1) warning 2) ValueError approach from the original mail to be the default.
Regards,
- Chris
Am 11.03.2025 um 07:32 schrieb Gina P. Banyard internals@gpb.moe:
I’d like to align on the approach to validating arguments for built-in functions (usually for flag inputs). Some ongoing discussions in PRs:
My opinion on this topic is well known, validation errors should go straight to a ValueError.
And if not a ValueError then it should be an E_WARNING.
Raising a deprecation is frankly non-sensical.I'll sound a bit like a broken record :-)
The very first example shows why we should have an
E_WARNING
phase to smoothen BC changes:
Code like
$a = array_filter(["a", "", "b"], strlen(...), 666);
which currently ignores the bogus value and allows the code to continue would suddenly stop with en Exception.Fixing the code is trivial and a warning helps identify those code locations but decoupling the moment of upgrading PHP from having to fix all code is IMHO still valuable. Especially when using third-party packages which might have a delay in fixing it.
I'm still waiting on some packages to fix new warnings for PHP 8.4 even though we've upgraded to the new version already.Conclusion:
Yes, I'd very much like the 1) warning 2) ValueError approach from the original mail to be the default.Regards,
- Chris
Well, when you make it an exception, it usually gets upgraded/fixed faster :)
— Rob
Well, when you make it an exception, it usually gets upgraded/fixed faster :)
Not necessarily. When people see long lists of breaking changes in a release, they may just put off upgrading / marking the library compatible.
I think we should be very wary of how far we bend the difference between "minor" and "major" releases.
For these changes, I'd like to hear the argument against starting with a Warning. Is there any significant burden to waiting until 9.0 for these to become errors?
Rowan Tommins
[IMSoP]
Well, when you make it an exception, it usually gets upgraded/fixed faster :)
Not necessarily. When people see long lists of breaking changes in a release, they may just put off upgrading / marking the library compatible.
I think we should be very wary of how far we bend the difference between "minor" and "major" releases.
For these changes, I'd like to hear the argument against starting with a Warning. Is there any significant burden to waiting until 9.0 for these to become errors?
Rowan Tommins
[IMSoP]
Consistency.
Just staring at ext/pcntl and the git blame you can see that some functions validate the signal number (or flags) since PHP 8.0, others do not or only recently (commit from November 2023).
The reason being that the code was inconsistent in when it would actually warn in the first place.
The other one is data loss concerns, many functions that forward strings to C APIs don't check that they do not contain nul bytes, and thus the C API receives a truncated string.
Or C API require a C int, which is 32bit compared to a PHP int which can be 64 bit, so truncation may happen for large negative/positive values.
It also means that we need to do multiple passes, on the same code path, resulting in somewhat of a code churn and possibly not using a common abstraction.
Considering that we didn't even have the time to properly remove some deprecations from PHP 7 with the PHP 8.0.0 release, nor convert all relevant E_WARNINGs to Value/Type errors, I expect that we would be missing some of them again when PHP 9 comes around.
(and this is ignoring the fact that PHP does not follow semver, and I'm starting to really believe that any "semver-like" versioning system just does not work for a PL, especially not one like PHP which has an insanely vast API surface.)
Moreover, changing a silent error condition to throwing a ValueError on a programming error that really never should be happening in the first place is not a disruptive BC break IMHO.
And I will be repeating what I've been saying again and again, none of these issues would exist if bundled extensions did not exist.
The fact that these extensions are tied to the "core PHP language" release cadence is.... incredibly annoying for everyone, from maintainers to users.
Taking the example of ext/pcntl again, if it were a standalone extension, having it follow semver is a way more reasonable proposition.
Because we could just release 2 versions the same day, a x.y+1.0 introducing a warning, and a x+1.0.0 which would convert them into proper errors.
Meaning as a user, you could be running whatever PHP version and have the stricter behaviour, or upgrade PHP while still holding the ext/pcntl version behind while you deal with the issues.
(Bonus points, the issue where PIE is currently unable to install bundled extension [1] would be solved with unbundling too.)
Best regards,
Gina P. Banyard
Am 11.03.2025 um 12:54 schrieb Gina P. Banyard internals@gpb.moe:
Taking the example of ext/pcntl again, if it were a standalone extension, having it follow semver is a way more reasonable proposition.
Because we could just release 2 versions the same day, a x.y+1.0 introducing a warning, and a x+1.0.0 which would convert them into proper errors.
Meaning as a user, you could be running whatever PHP version and have the stricter behaviour, or upgrade PHP while still holding the ext/pcntl version behind while you deal with the issues.
I live in a world where a) people don't always control the PHP version they are using (hosted websites) and b) are using more and more packages which have to stay compatible for (e.g. security) updates to be applicable. My (sad) experience is that semver does not solve this, it just provides the concept of differentiating different types of changes and their respective compatibility assumptions.
To be honest the pcntl example sounds like worst of both worlds:
- Maintainers would have two (and later maybe even more) versions to maintain (for how long?). I'd rather have one version at a time and if I do not want to forget switching the warning to an exception at a specific point there are ample ways of doing this, e.g. a TODO/FIXME/whatever you want to call it marker somewhere which is automatically reported once the given criteria (time, version, ...) is met.
- Would this be the only difference between these two version? If Yes then why have them in the first place (if you want to be strict about warnings then just convert them to an exception, that's easy enough). If No then you open yet another can of worms, e.g. one package requiring x.y+1 and another package requiring x+1.0. Been there, hated it :-)
Regards,
- Chris
Am 11.03.2025 um 16:29 schrieb Christian Schneider cschneid@cschneid.com:
...(if you want to be strict about warnings then just convert them to an exception, that's easy enough)...
Reading it back I maybe wasn't clear: I mean if you as an application developer want to be strict then you can use an error handler to catch warnings and throw an exception yourself.
- Chris
It also means that we need to do multiple passes, on the same code path, resulting in somewhat of a code churn and possibly not using a common abstraction.
Considering that we didn't even have the time to properly remove some deprecations from PHP 7 with the PHP 8.0.0 release, nor convert all relevant E_WARNINGs to Value/Type errors, I expect that we would be missing some of them again when PHP 9 comes around.
I wonder if in some cases, we could automate this within the source code
- e.g. with a macro like WARNING_OR_ERROR(message,
first_version_for_error) which checks a compile-time constant for the
current version. I realise that still requires code cleanup later, but
it means we can't miss our chance when the appropriate version does come
around.
and this is ignoring the fact that PHP does not follow semver
This is true, but only in the sense that something under the MIT license
isn't under the BSD license. Officially, PHP's release policy is
extremely similar to SemVer.
https://semver.org/ says:
Major version X (X.y.z | X > 0) MUST be incremented if any backward
incompatible changes are introduced to the public API. It MAY also
include minor and patch level changes
Meanwhile https://github.com/php/policies/blob/main/release-process.rst
says:
Major Version Number
x.y.z to x+1.0.0
- Backward compatibility can be broken
- API compatibility can be broken (internals and userland)
Minor Version Number
x.y.z to x.y+1.z
- Backward compatibility must be kept
- API compatibility must be kept (userland)
Now, what exactly constitutes "backward compatibility" and "userland API
compatibility" can be debated, but there is currently no agreed policy
allowing "small compatibility breaks".
The fact that everyone acts as though there is such a policy, probably
means a rewrite of that document is overdue. A policy that nobody
follows is no use to either contributors or users.
Maybe we could work on some criteria that could be applied (and
publicised to users) about what is and isn't allowed in minor versions.
For instance:
- Code that already causes fatal behaviour might cause different fatal
behaviour (e.g. throwing an Error instead of raising an E_ERROR) - Code that directly violates documented types might start throwing
TypeError - Code that produced
NULL
as an error case only for invalid inputs might
start throwing ValueError
Some of the cases linked at the start of this thread might then meet the
agreed criteria, but some might not.
For instance, it looks like pathinfo does give reliable outputs if the
flag argument is treated as a bitmask, even if doing so makes no logical
sense. It's not strictly the wrong type, and the behaviour isn't
obviously attempting to indicate an error, or doing something
immediately fatal.
So to allow that as well, we'd need some broader criteria, like:
- Code that is clearly an error on the part of the programmer might
starting throwing Errors
But that's probably too broad and debatable to be meaningful policy.
Another alternative is to scrap x.y.z versioning completely, and give
every annual release equal status. This is the approach that PostgreSQL
has taken, I believe.
We'd probably still want some kind of deprecation policy - some changes
should be deprecated for X releases before removal/change. Which brings
us back to some kind of criteria for which changes need that, so doesn't
really solve the problem.
--
Rowan Tommins
[IMSoP]
It also means that we need to do multiple passes, on the same code path, resulting in somewhat of a code churn and possibly not using a common abstraction.
Considering that we didn't even have the time to properly remove some deprecations from PHP 7 with the PHP 8.0.0 release, nor convert all relevant E_WARNINGs to Value/Type errors, I expect that we would be missing some of them again when PHP 9 comes around.I wonder if in some cases, we could automate this within the source code
- e.g. with a macro like WARNING_OR_ERROR(message,
first_version_for_error) which checks a compile-time constant for the
current version. I realise that still requires code cleanup later, but
it means we can't miss our chance when the appropriate version does come
around.
I don't think this is an improvement for the maintenance side.
and this is ignoring the fact that PHP does not follow semver
This is true, but only in the sense that something under the MIT license
isn't under the BSD license. Officially, PHP's release policy is
extremely similar to SemVer.[...]
Now, what exactly constitutes "backward compatibility" and "userland API
compatibility" can be debated, but there is currently no agreed policy
allowing "small compatibility breaks".The fact that everyone acts as though there is such a policy, probably
means a rewrite of that document is overdue. A policy that nobody
follows is no use to either contributors or users.
Yes, rewriting this policy would be a good idea.
There have been BC breaks in every minor release going all the way back to PHP 4.
Maybe we could work on some criteria that could be applied (and
publicised to users) about what is and isn't allowed in minor versions.[...]
We'd probably still want some kind of deprecation policy - some changes
should be deprecated for X releases before removal/change. Which brings
us back to some kind of criteria for which changes need that, so doesn't
really solve the problem.
Indeed, none of this really solves the problem.
I think it is fair that the Core language has more stringent requirements on BC than anything else.
The standard library being second in line for this sort of considerations.
But every other bundled extension should be allowed to have way more leeway in what said extension maintainer wants to do.
Making a distinction between Type and Value errors is meaningless to me,
if PHP had a stronger type system, e.g. dependent types,
then, a lot of things we currently consider ValueErrors would be TypeErrors.
If we were also designing some of these functions from scratch today, a lot of parameters wouldn't integers, but enums.
The only exception to this are bitmask parameters, as we don't have Enum Sets (or Intersection Types as values) which would replace those.
And again, ValueErrors are only ever added when it easy to check if the condition is satisfied, and it very clearly points to a programming error.
I still find it baffling that telling someone that the code they wrote, even if it is decades old, is incorrect is somehow bad.
Because the alternative is letting users have bugs in their software that they can ignore.
Best regards,
Gina P. Banyard
Hi
Am 2025-03-20 17:00, schrieb Gina P. Banyard:
And again, ValueErrors are only ever added when it easy to check if
the condition is satisfied, and it very clearly points to a programming
error.I still find it baffling that telling someone that the code they wrote,
even if it is decades old, is incorrect is somehow bad.
Because the alternative is letting users have bugs in their software
that they can ignore.
I agree with that (and Kamil, who said the same thing). Passing an
undocumented value that does something is a clear programmer error
that would also break when adding new values, which is generally
considered a backwards compatible change. Pointing out this error as an
actual error before it causes a silent behavioral change is a good
thing.
The round()
function would be a good example. PHP 8.4 both added a
validation of the $mode
parameter if an integer value is given and
also added new rounding modes (just as an enum in the gold release,
though). Before PHP 8.4 it was possible to use round($value, 5)
by
accident and it would be interpreted as PHP_ROUND_HALF_UP
. Now if we
would've added a new const PHP_ROUND_WHATEVER = 5;
constant this code
would silently have changed its behavior. As a user I certainly would be
interested in finding out about this mistake. A clear ValueError is easy
to fix in a backwards compatible way, but incorrectly rounded values can
remain undetected for so long that it becomes impossible to fix them,
since the stored data might already be poisoned, including all backups.
Best regards
Tim Düsterhus
And if not a ValueError then it should be an E_WARNING.
Raising a deprecation is frankly non-sensical.
Validation errors exist to inform the person programing that there is
an error with the call site that is preventable very easily.
If we start emittingE_DEPRECATED
instead of E_WARNING/throwing a
ValueError, we are incentivising people to add throwing error handlers
for E_DEPRECATED,
something that I (and I know others too) lament constantly, as a
deprecation is not an error.
I dedicated a whole section in a previous RFC to making essentially this
same argument:
https://wiki.php.net/rfc/deprecate-bareword-strings#e_warning_vs_e_deprecated
The underlying problem is that "deprecation" isn't really a "severity",
it's more a "category" of messages. It should maybe be a separate flag
combined with the severity, so that
E_DEPRECATION_NOTICE=E_NOTICE|E_DEPRECATED, and
E_DEPRECATION_WARNING=E_WARNING|E_DEPRECATED
That way we could signal the difference between "this is bad, but the
behaviour will remain well-defined" (e.g. file I/O warnings, which would
require a completely new API to eliminate) and "this is bad, and there
are concrete plans to change it" (e.g. it will become an error, or start
doing something different, in the next major version).
--
Rowan Tommins
[IMSoP]