Hello,
I recently emailed the group about submitting an RFC for str_begins()
and str_ends() functions. The RFC has now been officially submitted and
is viewable at:
https://wiki.php.net/rfc/add_str_begin_and_end_functions
The github PR may be found at:
https://github.com/php/php-src/pull/2049
Hope to be hearing about this,
Will
Feeling "meh" on it (neither for nor against), but I would consider
consistency with other str*() functions by making case-insensitivity
live in separate functions rather than as a parameter. e.g.
str_begins(), str_ibegins(), str_ends(), str_iends()
-Sara
+1 for having functions for case insensitivity.
I'm not sure if we should have "s". i.e. str_begin"s".
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Sara Golemon wrote:
Feeling "meh" on it (neither for nor against), but I would consider
consistency with other str*() functions by making case-insensitivity
live in separate functions rather than as a parameter. e.g.
str_begins(), str_ibegins(), str_ends(), str_iends()
I guess the "i" prefix is not applicable when the name has underscores.
In that case, the functions should be: strbegins, stribegins, strends, striends.
In any case, I think a third parameter is better, keeping the underscore.
Yasuo Ohgaki wrote:
+1 for having functions for case insensitivity.
I'm not sure if we should have "s". i.e. str_begin"s".
I think the "s" is good here.
It sounds better to me, but I don't know if it is right in English.
In JS, for instance, we have startsWith, which has an "s" too.
2016-08-01 21:06 GMT-03:00 Yasuo Ohgaki yohgaki@ohgaki.net:
+1 for having functions for case insensitivity.
I'm not sure if we should have "s". i.e. str_begin"s".
--
David Rodrigues
Hi David,
On Tue, Aug 2, 2016 at 10:36 AM, David Rodrigues david.proweb@gmail.com wrote:
Sara Golemon wrote:
Feeling "meh" on it (neither for nor against), but I would consider
consistency with other str*() functions by making case-insensitivity
live in separate functions rather than as a parameter. e.g.
str_begins(), str_ibegins(), str_ends(), str_iends()
I guess the "i" prefix is not applicable when the name has underscores.
In that case, the functions should be: strbegins, stribegins, strends, striends.
In any case, I think a third parameter is better, keeping the underscore.
This is a difficult issue.
String function names are currently inconsistent.
It is better to stick to the CODING_STANDARDS naming convention for new
function names. Therefore, new string functions are better named
str_*() unless the result is too strange.
e.g.
http://php.net/manual/en/function.str-replace.php
http://php.net/manual/en/function.str-ireplace.php
I would like to fix function name inconsistencies by adding aliases in
the near future.
https://wiki.php.net/rfc/consistent_function_names
It might be okay to have "s" in function names, but if we want to be
consistent,
str_replace -> str_replaces
str_ireplace -> str_ireplaces
IMO, the following names are better for consistency.
str_begin
str_ibegin
str_end
str_iend
In addition, str_replace() has search_value first, so the signatures might be
boolean str_begin(string $search_value, string $str [, boolean $case_sensitive = true])
boolean str_end(string $search_value, string $str [, boolean $case_sensitive = true])
However, strstr() (and other str functions without "_", e.g.
strpos/stripos/strrpos/strripos) has search_value as the 2nd
parameter. If we follow this format, the current signature is fine.
It may be better to sort out and fix consistency issues first, then add
new functions. Otherwise, we may introduce more consistency issues.
Regards,
BTW, having "i" is more readable.
str_ibegin("searchthis", $str);
is more readable than
str_begin("searchthis", $str, TRUE);
as the programmer does not have to know what the TRUE means.
It's a small thing, but small things add up.
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Yasuo Ohgaki wrote:
It might be okay to have "s" in function names, but if we want to be
consistent,
str_replace -> str_replaces
str_ireplace -> str_ireplaces
IMO, following names are better for consistency.
str_begin
str_ibegin
str_end
str_iend
I think those names mean something different: "str_begin" sounds like an
imperative "make this string begin with X"; "str_begins" is more of an
assertion "the string begins with X". Ruby would spell it with a ? at
the end. It's also the same form, grammatically, as the common "isFoo".
Note that this logic holds for "str_replace", which is an imperative -
you are not saying "tell me if X replaces Y", you are saying "please
replace X with Y".
Regards,
--
Rowan Collins
[IMSoP]
Everyone has raised important considerations. For me, the most important
thing is maintaining consistency with the existing PHP string library. I
do not want these functions to feel "tacked" on, as if they were
haphazardly added to PHP. If these functions are added to the language,
it should feel as if they have always been a part of the language (even
if they haven't been). This consistency is important in order to ensure
these functions ADD to PHP instead of just cluttering it up.
Having separate functions for case sensitivity makes sense; that is much
more consistent with the existing string library. I think the proposal
should be amended to separate those two functionalities. I think
having an "s" at the end of the function names reads better, but
omitting the "s" fits better with the existing function names and does
not read badly. Therefore, I am in favor of dropping the "s".
As far as str_begin vs strbegin, I think str_begin is more readable.
Therefore, I think it would be better to implement:
boolean str_begin(string $search_value, string $str)
boolean str_ibegin(string $search_value, string $str)
boolean str_end(string $search_value, string $str)
boolean str_iend(string $search_value, string $str)
This is much more consistent with the existing string library.
Regards,
Will
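For illustration only, the four signatures proposed above could be approximated in userland roughly as follows. This is a sketch under stated assumptions, not the implementation in the PR: the "sketch_" prefix is a hypothetical naming choice so the helpers cannot clash with the proposed names, the empty-needle behaviour is an assumption, and all comparisons are byte-based.

```php
<?php
// Hypothetical userland sketch of the proposed functions (needle first,
// as in the signatures above). The "sketch_" names are illustrative only.

function sketch_str_begin(string $search_value, string $str): bool
{
    // Compare only the first strlen($search_value) bytes of both strings.
    return strncmp($str, $search_value, strlen($search_value)) === 0;
}

function sketch_str_ibegin(string $search_value, string $str): bool
{
    // Same comparison, ignoring case.
    return strncasecmp($str, $search_value, strlen($search_value)) === 0;
}

function sketch_str_end(string $search_value, string $str): bool
{
    $len = strlen($search_value);
    if ($len === 0) {
        return true;   // assumption: every string ends with the empty string
    }
    if ($len > strlen($str)) {
        return false;  // the needle cannot be longer than the haystack
    }
    // A negative offset makes substr_compare() start $len bytes from the end.
    return substr_compare($str, $search_value, -$len) === 0;
}

function sketch_str_iend(string $search_value, string $str): bool
{
    $len = strlen($search_value);
    if ($len === 0) {
        return true;
    }
    if ($len > strlen($str)) {
        return false;
    }
    // The fifth argument switches substr_compare() to case-insensitive mode.
    return substr_compare($str, $search_value, -$len, $len, true) === 0;
}
```

With these helpers, sketch_str_begin('http://', 'http://example.com') is true and sketch_str_iend('.COM', 'http://example.com') is true as well.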
as programmer does not have to know that's the TRUE means.
s/that's/what's/
I shouldn't write mails while writing code :(
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi!
David Rodrigues wrote:
I guess the "i" prefix is not applicable when the name has underscores.
In that case, the functions should be: strbegins, stribegins, strends, striends.
In any case, I think a third parameter is better, keeping the underscore.
Please, not stribegins. We have enough functions with weird names :)
I am ambivalent on the question of whether to have an additional argument or
two functions, I guess with a slight preference for the argument.
--
Stas Malyshev
smalyshev@gmail.com
I've updated the RFC to reflect the discussion here and on github. You
may see it at
https://wiki.php.net/rfc/add_str_begin_and_end_functions . You can see
the github PR at https://github.com/php/php-src/pull/2049 .
The motivation for these changes was to maximize consistency between the
proposed functions and existing PHP string functions. The goal is to
make these functions feel natural and add functionality to the language
without cluttering it up.
Thanks,
Will
Generally, +1. A few thoughts.
First, the RFC refers to these working on "characters". I assume you mean
ASCII characters and that these actually work strictly on bytes. Working on
"characters" would be more in line with a multi-byte extension. Would you
please clarify this point?
Second, and related to the multi-byte issue: do the case insensitive
versions honor case-folding in a multi-byte fashion? Either way, it's
probably a good idea to separate the vote between the sensitive and
insensitive versions because this is fundamentally a different, and perhaps
more contentious, question.
Third, perhaps these functions could provide more information than just
yes/no. Return boolean TRUE if and only if the needle completely
begins/ends the haystack, otherwise return an INT representing the length in
common. Yes, that'll probably be a trap for new developers who don't honor
===, but that could be illuminated in the docs. Formally:
boolean|int str_begin(string $needle, string $haystack)
boolean|int str_end(string $needle, string $haystack)
For example:
str_begin('http://', 'http://example.com') === true
str_begin('http://', 'https://example.com') === 4
Finally, since the RFC will fuel the final documentation, it might be a
good idea to use needle/haystack terminology in the function signatures for
some kind of consistency.
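A rough sketch of that boolean|int idea, for discussion only. The function name is hypothetical, the comparison is byte-based, and nothing here reflects the RFC or the PR.

```php
<?php
// Hypothetical sketch: true on a full prefix match, otherwise the number
// of leading bytes that needle and haystack have in common.

/** @return bool|int */
function sketch_str_begin_len(string $needle, string $haystack)
{
    $max = min(strlen($needle), strlen($haystack));
    $common = 0;
    // Walk both strings until the first differing byte.
    while ($common < $max && $needle[$common] === $haystack[$common]) {
        $common++;
    }
    // Only a match over the whole needle counts as "begins with".
    return $common === strlen($needle) ? true : $common;
}
```

Callers would have to tell true apart from a length such as 0 or 4 with ===, which is exactly the trap mentioned above.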
The bulk of the time I'm applying this to the SQL query that is going to
return a set of results rather than direct to a string. In that case
it's STARTING 'xYZ'. Because the need has not arisen I've only just
noticed - after 20 odd years - there is no matching ENDING. Although
normally one needs to build a phantom field to index the data, so I do
have ONE case of reversed_field STARTING 'ZYX'.
Is STARTING just a Firebird SQL thing, or is it more generally available?
I did a few Google searches, but as usual when searching for things like
'starting' one gets hundreds of pages on 'running' the software and its
other connotations.
I suspect like PHP the other methods of doing things take the strain, so
certainly LIKE 'XYZ%' and '%XYZ' are probably the 'generic' solution but
suffer from slower search times, especially when looking for the ENDING
string.
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Lester Caine wrote:
Is STARTING just a Firebird SQL thing, or is it more generally available?
I did a few Google searches, but as usual when searching for things like
'starting' one gets hundreds of pages on 'running' the software and its
other connotations.
I've never come across it in Postgres, MS SQL Server, or MySQL.
Generally LIKE 'abc%' is the recommended approach (and will I think hit
the index in many cases, because the DBMS can optimize the case of a
prefix match if it knows at planning time). A "starting" keyword would
certainly be useful if it was there. :)
It doesn't quite fill the same need as a PHP function, of course,
because you might be checking user input, or API results, or all sorts
of things that won't, or haven't yet, hit the database. Currently the
common idiom for that is the ugly strpos($string, 'abc') === 0
Regards,
Rowan Collins
[IMSoP]
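A small illustration of that idiom and of why the strict comparison matters. The variable names are made up for the example.

```php
<?php
// The strpos() idiom for "begins with", and the loose-comparison pitfall.
$string = 'abcdef';

// strpos() returns the byte offset of the first match, or false if none.
$startsWithAbc = strpos($string, 'abc') === 0;  // match at offset 0
$startsWithDef = strpos($string, 'def') === 0;  // match exists, but at offset 3

// With a loose comparison, "not found" (false) is treated like offset 0.
$notFound         = strpos($string, 'xyz');     // false
$looseSaysPrefix  = ($notFound == 0);           // true, which is wrong
$strictSaysPrefix = ($notFound === 0);          // false, as intended
```

Note also that strpos() keeps scanning the rest of the string when the prefix does not match at offset 0, which is wasted work for a pure prefix test.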
PHP is never going to be loading millions of records into memory and
searching them. That is the job of a database, and while LIKE 'abc%' can
be optimised to use an index and speed up results, if the 'abc%' is
supplied as a parameter it is generally not possible to prepare the
query using an index, whereas STARTING always knows the matching string is
the first characters of the index. While PHP and SQL share a number of
alternatives, the SQL versions will pay a premium in search time if an
index can't be used.
I was just wondering if str_starting and str_ending matched better with
other string handling options.
--
Lester Caine - G8HFL
Hello,
I only saw you mention strpos, preg_match and substr as (slower)
alternatives. However, there's already a function called substr_compare
which is meant for just this kind of comparison and which is more
general than your RFC.
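function str_begins($a, $b) {
    // $a is the full string, $b the prefix to look for.
    return substr_compare($a, $b, 0, strlen($b)) === 0;
}
function str_ends($a, $b) {
    // An empty $b would give offset 0 and compare the whole string,
    // so treat it as a match. A negative offset counts from the end of $a.
    return $b === '' || substr_compare($a, $b, -strlen($b)) === 0;
}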
--
Lauri Kenttä
Thanks for pointing out substr_compare(), of which I had not even been
aware. And indeed, substr_compare() is perfectly suitable to verify
whether a string starts or ends with a certain substring, so, in my
opinion, there is no need for the other functions to be added to
ext/standard.
--
Christoph M. Becker
Firstly, the argument ordering is the wrong way round for a string function. String functions — especially search-related ones — are haystack, needle (see strpos, strstr, strcspn, strpbrk, etc).
Secondly, I feel like this RFC does need to include that it’s a BC break by introducing new global functions. A quick search shows that SugarCRM[1] already implements str_begin and str_end functions and there’s likely to be other projects that do too.
[1]: https://github.com/sugarcrm/sugarcrm_dev/blob/ae189cfa4ed4edd6a4e1e0d9d1d5ec66f46a0b74/include/utils.php#L2082-L2090
Simon Welsh
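To make the BC concern concrete: a userland library that declares str_begin() unconditionally would fail with a fatal "Cannot redeclare" error once a built-in of that name exists. A common defensive pattern is a function_exists() guard, sketched below. The body is a hypothetical stand-in and not SugarCRM's code, and the (haystack, needle) order is an assumption made for the example.

```php
<?php
// Hypothetical userland helper, declared only if the engine (or another
// library) has not already claimed the name.
if (!function_exists('str_begin')) {
    function str_begin(string $haystack, string $needle): bool
    {
        // Byte-wise comparison of the first strlen($needle) bytes.
        return strncmp($haystack, $needle, strlen($needle)) === 0;
    }
}
```

The guard avoids the fatal error, but it would not help if a built-in with the same name took its arguments in a different order: existing calls would then silently change meaning.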
You are correct, functions like strpos and strstr do follow (haystack,
needle) but functions like str_replace follow the format (needle,
haystack). Because I did these functions with the underscore, it made
sense to make the functions follow the format found in other str_*
functions. If the functions were changed to be strbegin, stribegin,
strend, and striend, then it would make sense to follow the (haystack,
needle) format. However, I think adding the underscore greatly improves
the readability of these functions. And if the functions are named with
an underscore, I think it should follow the format found in the other
underscore functions.
Good call on the BC break, I had not thought about it breaking userland
functions with the same name.
-Will
str_replace and str_ireplace are the only str_* functions that don’t take the full string (haystack) as the first argument. str_pad, str_repeat, str_split and str_word_count all take the full string first even if there are other compulsory arguments.
Also, these functions are replacements for current usage of strpos/strrpos/substr_compare, so I feel like the argument ordering should match those rather than another function that isn’t closely related in functionality.
--
Simon Welsh
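For reference, the argument orders being compared can be seen side by side in a short sketch. The variable names and sample values are made up for the example.

```php
<?php
$haystack = 'https://example.com';

// Search-style functions take the full string first, then the needle.
$pos    = strpos($haystack, '://');                  // byte offset of the match
$tail   = strstr($haystack, '://');                  // match and everything after it
$prefix = substr_compare($haystack, 'https', 0, 5);  // 0 means "equal"

// str_pad() and str_repeat() also start with the string being worked on.
$padded   = str_pad('7', 3, '0', STR_PAD_LEFT);
$repeated = str_repeat('ab', 2);

// str_replace() is the odd one out: search and replacement come first,
// the subject string last.
$replaced = str_replace('https', 'http', $haystack);
```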