Hello internals,
The RFC for the clamp function is now open and under discussion, you now have 2 weeks
to discuss, suggest improvements and open issues before voting is considered.
Any and all feedback is welcomed.
The RFC is available for viewing here: https://wiki.php.net/rfc/clamp https://wiki.php.net/rfc/clamp
The implementation is available in a PR here: https://github.com/php/php-src/pull/7191 https://github.com/php/php-src/pull/7191
Thank you,
Kim Hallberg
Hi Kim Hallberg,
The RFC for the clamp function is now open and under discussion, you now have 2 weeks
to discuss, suggest improvements and open issues before voting is considered.Any and all feedback is welcomed.
The RFC is available for viewing here: https://wiki.php.net/rfc/clamp
The implementation is available in a PR here: https://github.com/php/php-src/pull/7191
https://wiki.php.net/rfc/howto mentions:
Listen to the feedback, and try to answer/resolve all questions.
Update your RFC to document all the issues and discussions.
Cover both the positive and negative arguments. Put the RFC URL into all your replies.
So I think the major objections are that:
-
This is easy to implement in userland and there's negligible performance benefit
(in code that is a performance sensitive loop, it may still be worse compared to$num < $min ? $min : ($num > $max : $max : $num)
when validity of types and min/max are known, especially with the JIT) -
People not being sure if they'd ever use it personally, especially with the ability to customize parameter order and permitted argument types in userland
-
It's inconsistent with
min()
andmax()
, which support any comparable type such as GMP(https://www.php.net/manual/en/class.gmp)
(arbitrary precision numbers), DateTime, etc., and that may lead to surprises.
Although PHP's comparison operator and https://www.php.net/manual/en/function.min.php have many, many inconsistencies, already(Though special casing GMP is probably a bad idea due to it being optional and having an incompatible license with core for packagers)
-
I'm not sure what this is meant to do with the float
NAN
(Not A Number) from the RFC description, but that's solvable
php > var_dump(min(gmp_init('123'), gmp_init('456')));
object(GMP)#1 (1) {
["num"]=>
string(3) "123"
}
php > var_dump(max(new DateTime('@0'), new DateTime()));
object(DateTime)#2 (3) {
["date"]=>
string(26) "2021-06-23 23:44:47.302531"
["timezone_type"]=>
int(3)
["timezone"]=>
string(3) "UTC"
}
php > echo json_encode(max([0,2],[0,1]));
[0,2]
The RFC should probably link to this RFC announcement thread https://externals.io/message/115076 as well.
Thanks,
Tyson
So I think the major objections are that:
- This is easy to implement in userland and there's negligible performance benefit
(in code that is a performance sensitive loop, it may still be worse compared to$num < $min ? $min : ($num > $max : $max : $num)
when validity of types and min/max are known, especially with the JIT)- People not being sure if they'd ever use it personally, especially with the ability to customize parameter order and permitted argument types in userland
- It's inconsistent with
min()
andmax()
, which support any comparable type such as GMP(https://www.php.net/manual/en/class.gmp)
(arbitrary precision numbers), DateTime, etc., and that may lead to surprises.
Although PHP's comparison operator and https://www.php.net/manual/en/function.min.php have many, many inconsistencies, already(Though special casing GMP is probably a bad idea due to it being optional and having an incompatible license with core for packagers)
4. I'm not sure what this is meant to do with the floatNAN
(Not A Number) from the RFC description, but that's solvable
I'd strongly prefer an actual benchmark for context and accuracy - is it actually faster or slower than the most efficient userland implementation and by how much?
E.g. in an optimized NTS build withCFLAGS=-O2
, opcache enabled(zend_extension=opcache, opcache.enable=1,opcache.enable_cli=1),
and no debug configure flags, how many calls per second can be made on variable values of $num for both?
(I'd assume over twice as fast as calling both min/max from another function, but possibly slower than efficient_clamp, but haven't run this)For userland implementations that did use min/max, they probably weren't performance sensitive for the application.
function efficient_clamp(int|float $num, int|float $min, int|float $max): int|float { return $num < $min ? $min : ($num > $max ? $max : $num); }
Will combine both emails into one here. Will tackle your first point since that has to with performance as your second email also does.
Haven’t tested it with opcache enabled so cannot comment on the performance, but for my general testing the internal version is more than 50% faster than the min/max version. Slower than the “efficient” version in some cases and faster in others. That is to be expected though since the efficient version doesn’t check for a valid range, if a user mixes up the min and max values the efficient clamp function doesn’t work as intended and will always return either the min/max version and never the clamped value.
For your second point, that could be said for most internal functions, str_contains
, str_starts_with
and str_ends_with
could all three be implemented in userland with customises parameter orders, as can most other internals function. Frankly I don’t see this as much of an objection or issue as a phrase used to gate keep what can become a new standardised internal function.
Thirdly, yes I will agree it’s inconsistent with min/max functions. That is mainly - as you stated, min/max takes mixed arguments and works with all comparable types. Though it technically isn’t inconsistent with most userland implementations as they mostly accept numerical values such as ints or floats into their clamp function, meaning only ints and floats are used anyway.
Though I could see a version of clamping being useful in the DatePeriod
class to check if a DateTime
is in range of a certain date range, but that’s OT for this RFC so.
As for your last and forth point, I’ll be updating the RFC with NaN behaviour, sticking with what Nikita suggested to throw when NaN is added to min or max, and return NaN is it’s added to value.
Thanks, Kim.
Hello Kim Hallberg,
The RFC for the clamp function is now open and under discussion, you now have 2 weeks
to discuss, suggest improvements and open issues before voting is considered.
From https://wiki.php.net/rfc/clamp -
Current userland implementations are handled in several ways, some of which use min and max to check the bound,
which is costly and slow when called often.
Because userland implementations are for the most part not cost-effective when called multiple times,
a language implementation is desired.
I'd strongly prefer an actual benchmark for context and accuracy - is it actually faster or slower than the most efficient userland implementation and by how much?
E.g. in an optimized NTS build with CFLAGS=-O2
, opcache enabled(zend_extension=opcache, opcache.enable=1,opcache.enable_cli=1),
and no debug configure flags, how many calls per second can be made on variable values of $num for both?
(I'd assume over twice as fast as calling both min/max from another function, but possibly slower than efficient_clamp, but haven't run this)
For userland implementations that did use min/max, they probably weren't performance sensitive for the application.
function efficient_clamp(int|float $num, int|float $min, int|float $max): int|float {
return $num < $min ? $min : ($num > $max ? $max : $num);
}
Thanks,
Tyson
czw., 24 cze 2021 o 02:15 tyson andre tysonandre775@hotmail.com
napisał(a):
Hello Kim Hallberg,
The RFC for the clamp function is now open and under discussion, you now
have 2 weeks
to discuss, suggest improvements and open issues before voting is
considered.From https://wiki.php.net/rfc/clamp -
Current userland implementations are handled in several ways, some of
which use min and max to check the bound,
which is costly and slow when called often.
Because userland implementations are for the most part not
cost-effective when called multiple times,
a language implementation is desired.I'd strongly prefer an actual benchmark for context and accuracy - is it
actually faster or slower than the most efficient userland implementation
and by how much?
E.g. in an optimized NTS build withCFLAGS=-O2
, opcache
enabled(zend_extension=opcache, opcache.enable=1,opcache.enable_cli=1),
and no debug configure flags, how many calls per second can be made on
variable values of $num for both?
(I'd assume over twice as fast as calling both min/max from another
function, but possibly slower than efficient_clamp, but haven't run this)For userland implementations that did use min/max, they probably weren't
performance sensitive for the application.function efficient_clamp(int|float $num, int|float $min, int|float $max): int|float { return $num < $min ? $min : ($num > $max ? $max : $num); }
Do we really need it?
IMO the performance of this is not gonna change anything much.
We're talking about nanoseconds here possibly.
I'm looking forward to Hello World RFC - that's also often implemented in
userland.
function hello(?string $who = 'World', bool $return = false, bool $newline
= true): ?string {
$str = "Hello {$who}!" . ($newline ? `PHP_EOL` : '');
if ($return) return $str;
echo $str;
}
I get an impression that we constantly add things into standard library
which are from a language perspective irrelevant
and that all could be developed in userland with no harm.
Cheers,
Michał Marcin Brzuchalski
Le 24/06/2021 à 11:44, Michał Marcin Brzuchalski a écrit :
I get an impression that we constantly add things into standard library
which are from a language perspective irrelevant
and that all could be developed in userland with no harm.Cheers,
Michał Marcin Brzuchalski
Hello,
I respectfully do not agree with this last statement. A simple method
such as this one entirely belong to the standard library. I wouldn't
want to end-up with a NPM-like ecosystem where I'd have to choose
between clamp-php, php-clamp, php-bound-int, int-php-bound, etc...
The more the standard library will be complete, the less projects will
need dependencies, which is better for your projects long term
maintenance. You often end-up in dependency resolution hell because
library A use library B, which use foo/clamp which is not compatible
with another C library you use, or because it doesn't support PHP core
version X or Y.
A simple function such as clamp in standard library makes much sense,
anything like it must go into standard library, this way the userland
ecosystem can focus on domain related stuff, or more complex topics,
without worrying too much about this kind of detail. Having a great and
complete standard library lowers community fragmentation, and that's for
the best !
Moreover, it also ensures that all developers speak the same language
regarding primitives provided by the standard library, and makes
communication more efficient within community.
Regards,
--
Pierre
Le 24/06/2021 à 11:44, Michał Marcin Brzuchalski a écrit :
I get an impression that we constantly add things into standard library
which are from a language perspective irrelevant
and that all could be developed in userland with no harm.Cheers,
Michał Marcin Brzuchalski
Hello,
I respectfully do not agree with this last statement. A simple method such as this one entirely belong to the standard library. I wouldn't want to end-up with a NPM-like ecosystem where I'd have to choose between clamp-php, php-clamp, php-bound-int, int-php-bound, etc...
The more the standard library will be complete, the less projects will need dependencies, which is better for your projects long term maintenance. You often end-up in dependency resolution hell because library A use library B, which use foo/clamp which is not compatible with another C library you use, or because it doesn't support PHP core version X or Y.
A simple function such as clamp in standard library makes much sense, anything like it must go into standard library, this way the userland ecosystem can focus on domain related stuff, or more complex topics, without worrying too much about this kind of detail. Having a great and complete standard library lowers community fragmentation, and that's for the best !
Moreover, it also ensures that all developers speak the same language regarding primitives provided by the standard library, and makes communication more efficient within community.
This. All of this.
-Mike
Hello internals,
The RFC for the clamp function is now open and under discussion, you now
have 2 weeks
to discuss, suggest improvements and open issues before voting is
considered.Any and all feedback is welcomed.
The RFC is available for viewing here: https://wiki.php.net/rfc/clamp <
https://wiki.php.net/rfc/clamp>
The implementation is available in a PR here:
https://github.com/php/php-src/pull/7191 <
https://github.com/php/php-src/pull/7191>Thank you,
Kim Hallberg
The RFC is missing a precise description of how this function works with
NaN and negative zero. The expected behavior is that if min or max are NaN,
an exception is thrown, if num if NaN then NaN is returned, and the
behavior wrt negative zero is unspecified. This makes for an IEEE754-2008
style clamping operation. Making negative zero smaller than positive zero
would be an IEEE754-2019 style clamp.
I'm not really convinced that this is a worthwhile addition, but also not
particularly opposed to it.
Regards,
Nikita
The RFC is missing a precise description of how this function works with NaN and negative zero. The expected behavior is that if min or max are NaN, an exception is thrown, if num if NaN then NaN is returned, and the behavior wrt negative zero is unspecified. This makes for an IEEE754-2008 style clamping operation. Making negative zero smaller than positive zero would be an IEEE754-2019 style clamp.
I'm not really convinced that this is a worthwhile addition, but also not particularly opposed to it.
Regards,
Nikita
Will update the RFC with a more precise description. Didn’t know there existed IEEE standards for clamping operations. Adding exceptions for NaN values into min/max is something I agree would be expected behaviour since a non number in the range don’t really make sense.
As for the negative zero behaviour though, I’m not sure how to address that since frankly it’s out of my league, and when would a range of -0, 0 make sense to have? Cannot see a value being placed between that expect for 0, so a clamp(0, -0, 0)
would return 0 then?
Thanks, Kim.
Hello internals,
The RFC for the clamp function is now open and under discussion, you now have 2 weeks
to discuss, suggest improvements and open issues before voting is considered.Any and all feedback is welcomed.
The RFC is available for viewing here: https://wiki.php.net/rfc/clamp https://wiki.php.net/rfc/clamp
The implementation is available in a PR here: https://github.com/php/php-src/pull/7191 https://github.com/php/php-src/pull/7191Thank you,
Kim Hallberg
Since the voting window has been missed, the proposed version will be updated to PHP 8.2 instead.
Thanks, Kim.
Hello internals,
The RFC for the clamp function is now open and under discussion, you now have 2 weeks
to discuss, suggest improvements and open issues before voting is considered.Any and all feedback is welcomed.
The RFC is available for viewing here: https://wiki.php.net/rfc/clamp
The implementation is available in a PR here: https://github.com/php/php-src/pull/7191Thank you,
Kim Hallberg
Since the voting window has been missed, the proposed version will be updated to PHP 8.2 instead.
Thanks, Kim.