Hello Internals,
Is there any particular reason why the substr() function doesn't accept a
null $length like mb_substr() does? It seems the behavior to read through
the end of the string can only be controlled by the presence or absence of
the $length parameter: https://3v4l.org/YpuO1
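
For readers who cannot open the link, the discrepancy can be sketched roughly as follows. This is an illustrative snippet, not the contents of the 3v4l paste; the sample string is made up, and the result of the last call depends on the PHP version in use (on versions where substr() does not accept null, the null is coerced to 0 and an empty string comes back).

```php
<?php
// Illustrative input; any single-byte string shows the same behavior.
$line = 'abcdef';

// Omitting $length reads through the end of the string.
var_dump(substr($line, 2)); // string(4) "cdef"

// mb_substr() treats an explicit null $length like an omitted one.
// Guarded because the mbstring extension is optional.
if (function_exists('mb_substr')) {
    var_dump(mb_substr($line, 2, null)); // string(4) "cdef"
}

// substr() with an explicit null $length: the result is version-dependent,
// so no expected output is shown here.
var_dump(substr($line, 2, null));
```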
I discovered this discrepancy between the two methods while attempting to
create a specialized string wrapper class with a method like this:
public function getSubstring(int $start, ?int $length = null): string
{
    if ($this->isMultibyte) {
        return mb_substr($this->line, $start, $length, $this->encoding);
    } else {
        return substr($this->line, $start, $length);
    }
}
This method would not work as expected without additional boilerplate like:
public function getSubstring(int $start, ?int $length = null): string
{
    if ($this->isMultibyte) {
        return mb_substr($this->line, $start, $length, $this->encoding);
    } elseif ($length === null) {
        return substr($this->line, $start);
    } else {
        return substr($this->line, $start, $length);
    }
}
Or:
public function getSubstring(int $start, ?int $length = null): string
{
    if ($this->isMultibyte) {
        return mb_substr($this->line, $start, $length, $this->encoding);
    } else {
        return substr($this->line, $start,
            $length ?? (strlen($this->line) - $start));
    }
}
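
The same workaround can also be pulled out of the class into a standalone function. The sketch below is only an illustration: substr_or_rest() is a hypothetical name, not an existing PHP function, and the false check assumes a PHP version where substr() can still return false for out-of-range offsets.

```php
<?php
/**
 * Hypothetical helper, not part of PHP: a byte-based substr() that treats
 * a null $length as "read through the end of the string".
 */
function substr_or_rest(string $string, int $start, ?int $length = null): string
{
    // Branch on null explicitly so the result does not depend on how the
    // running PHP version coerces null for substr()'s third argument.
    $result = $length === null
        ? substr($string, $start)
        : substr($string, $start, $length);

    // Some PHP versions return false for out-of-range offsets; normalize
    // that to an empty string to honor the declared return type.
    return $result === false ? '' : $result;
}

var_dump(substr_or_rest('abcdef', 2));       // string(4) "cdef"
var_dump(substr_or_rest('abcdef', 2, null)); // string(4) "cdef"
var_dump(substr_or_rest('abcdef', 2, 3));    // string(3) "cde"
```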
Are there any historical reasons preventing substr() from accepting a null
$length like mb_substr() does? I'd be happy to write the RFC and take a
stab at the implementation if there's interest in such a change.
Regards,
Colin O'Dell
colinodell@gmail.com
This is a fairly common deficiency in the implementation of internal
functions. Part of the reason is that historically zend_parse_parameters
did not support the ! modifier for integers, and partly these are just
implementation oversights.
Feel free to send a PR to fix this for PHP 8, no RFC needed. If you're
interested in addressing this for more functions, do a grep for "= UNKNOWN"
on the codebase, which gives you (with a few exceptions) a list of
functions that currently handle null parameters incorrectly and should
ideally be fixed for PHP 8.
Nikita
Thank you so much for that guidance! I wasn't sure about the RFC as it's
technically a behavioral change, but I agree with you (and Christoph) and
have therefore submitted a pull request with this change:
https://github.com/php/php-src/pull/4840
I'll also take a look at those other functions and see if I can help
adjust them as well.
Cheers,
Colin O'Dell
colinodell@gmail.com
> Is there any particular reason why the substr() function doesn't accept a
> null $length like mb_substr() does? It seems the behavior to read through
> the end of the string can only be controlled by the presence or absence of
> the $length parameter: https://3v4l.org/YpuO1
The reason for this behavioral inconsistency is that when mb_substr()'s
behavior was changed [1], substr()'s was not. Because of the subtle BC
break, in my opinion, this shouldn't have been changed in a revision (it
happened in 5.4.8); a major version is more suitable for such changes.
> I'd be happy to write the RFC […]
It seems to me that an RFC would be overkill for this change, but it has
to be discussed.
Tentatively, I'm +0.5 on this change.
[1] https://github.com/php/php-src/commit/352a1956b60059f9792cac840d57b184c7305667
--
Christoph M. Becker