My apologies if I am bringing up a topic that has been discussed before;
this is my first time wading into the PHP developers lists and I couldn't
find anything particularly relevant with the search.
Here is a bug I submitted over the weekend (
http://bugs.php.net/bug.php?id=54387) with an attached patch that adds a
str_slice() function into PHP. This function is just a very simple string
slicing function, with the logical interface of str_slice(string, start,
[end]). It is of course meant to replace substr() as an interface for
string slicing.
I detailed the reasons I submitted the patch in the bug a little bit, but
the main reason is that I think the substr() function is really overly
confusing and just not an intuitive method of string slicing, which is
exceedingly common functionality. I realize we don't want to go around
adding lots of random little functions into the language that don't offer
much, but the problem with that is that if we have a function like substr()
with an unusual and unintuitive interface, it becomes unchangeable due to
legacy issues and then you can never improve. I think this particular
functionality is important enough to offer an updated interface. In the bug
I also pointed to two related bugs that would be essentially fixed with this
patch.
-Dan
I just love substr() and I think all other languages got it wrong ;)
Seriously... it behaves the same as implementations in other languages as
long as values are positive, right? How is that counter-intuitive? How
do other languages handle negative values?
I think when the values are positive everything is mostly great. I think
when the values are negative is where the main problems are. Both the C
function strncpy() and the C++ string substr() function only support
positive values for length AFAIK.
I just think it is very unintuitive for the first parameter always to be a
position, and the 2nd parameter to be a length if the value is positive, and
a position if the value is negative.
substr('string', 1, 2); // Goes from position 1 to position 3
substr('string', -2, -1); // Goes from position -2 to position -1
So here is the same kind of thing in Python, which uses [start, end):
string[1:2] ==> 't'
string[-2:-1] ==> 'n'
And Ruby, which uses [start, end]:
"string"[1..2] ==> 'tr'
"string"[-2..-1] ==> 'ng'
Both of these languages use positions for positive and negative values. In
addition, in both of these languages if you slice a string impossibly, both
of them return an empty string as opposed to false, which just seems more
intuitive to me.
I don't think this function is particularly novel, I just think both
returning an empty string on impossible slicing and slicing based on
positions are improvements, and combined I think this function is noticeably
more durable and readable than substr().
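To make the position-based semantics described above concrete, here is a minimal userland sketch. It is not the code from the patch attached to the bug; the name str_slice_sketch and the choice of Python-style half-open [start, end) positions are assumptions made for illustration only.

```php
<?php
// Hypothetical sketch, NOT the patch from bug 54387.
// Assumption: positions are half-open [start, end), negative values count
// from the end of the string, and impossible slices yield '' (never false).
function str_slice_sketch(string $s, int $start, ?int $end = null): string
{
    $len = strlen($s);
    if ($end === null) {
        $end = $len;
    }
    // Translate negative positions into offsets from the start.
    if ($start < 0) {
        $start += $len;
    }
    if ($end < 0) {
        $end += $len;
    }
    // Clamp both positions into the range [0, $len].
    $start = max(0, min($start, $len));
    $end = max(0, min($end, $len));
    if ($start >= $end) {
        return '';
    }
    return substr($s, $start, $end - $start);
}

var_dump(str_slice_sketch('string', 1, 2));   // string(1) "t"
var_dump(str_slice_sketch('string', -2, -1)); // string(1) "n"
var_dump(str_slice_sketch('string', 10, 12)); // string(0) ""
```

With these assumptions both arguments are always positions, so the reader never has to switch between "length" and "position" depending on the sign.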
-Dan
PHP's substr() is awesome, and that comes from a person who codes in at least
5 different languages daily. Parsing is a problem in many real-world
problems and substr currently works great for that purpose. You work with
two parameters: offset and length of parsing. Since the meaning of a negative
offset/length when sub-stringing is intuitively undefined, PHP has reserved
these ranges for two common use cases: offsets from the end of the string and
truncation length.
"I just think it is very unintuitive for the first parameter always to be a
position, and the 2nd parameter to be a length if the value is positive, and
a position if the value is negative."
The first parameter is either an offset from start or offset from end - the
second parameter is either a length or a truncation length. I don't see why
this would be unintuitive? Perhaps you get confused by other languages that
just work with offsets.
In most real world parsing scenarios I've worked with offsets and lengths,
so the current substr definition gets my job done fastest without doubt.
Let's have a real world example "Parse through the data in chunks of 64
bytes at a time." In PHP this is simple, just take the current offset and a
length of 64. In python you'd have to add 64 to the current offset and put
into the second offset parameter = you have to think more and write more
code instead of just working with the length directly.
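The 64-byte chunking case above could be sketched roughly as follows. The function name chunks_by_length is made up for illustration, and the data is assumed to already be in memory as a string; PHP's built-in str_split() covers the same job.

```php
<?php
// Hypothetical helper: walk a buffer in fixed-size chunks using
// substr()'s offset + length interface.
function chunks_by_length(string $data, int $size): array
{
    if ($size < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1');
    }
    $chunks = [];
    $len = strlen($data);
    for ($offset = 0; $offset < $len; $offset += $size) {
        // The last chunk may be shorter than $size.
        $chunks[] = substr($data, $offset, $size);
    }
    return $chunks;
}

$chunks = chunks_by_length(str_repeat('a', 130), 64);
var_dump(array_map('strlen', $chunks)); // lengths 64, 64 and 2
```

With a position-based slice the loop body would pass $offset and $offset + $size instead, which is the extra addition being objected to.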
"Returning FALSE
when start + length parameters are invalid. This is annoying
because when using this function you always have to deal with this FALSE
case if you need a string."
Guess what this code outputs?
var_dump((string) \substr("foo", 5, 6));
Now try this and you'll understand why this is basically never a problem
that substr outputs false and why you don't even have to think about it:
var_dump(\substr("foo", 5, 6) == "", (string) false, false == "");
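For readers trying the two lines above, a small annotated sketch of what they rely on. One caveat: on the PHP versions current at the time of this thread substr("foo", 5, 6) returns false, while from PHP 8.0 on it returns an empty string, so the comments below are written to hold on either version.

```php
<?php
// false before PHP 8.0, "" from PHP 8.0 on.
$result = substr("foo", 5, 6);

// Casting to string gives "" either way.
var_dump((string) $result === ""); // bool(true)

// Loose comparison with "" is true either way.
var_dump($result == "");           // bool(true)

// The coercion rules the argument relies on:
var_dump((string) false);          // string(0) ""
var_dump(false == "");             // bool(true)
```

The strict comparison $result === "" is the one case where the two versions differ, which is the situation the original bug report is concerned with.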
Welcome to PHP. To be honest this criticism pretty much falls in the "from
person that comes from another language X and is annoyed that every little
detail isn't exactly the same"-category. Just make your own substr() function
that uses the behavior you expect if you don't like the native
version. Although that's bad practice - the best solution is to get used to
it. And if you have an urge to write about your experience with a new
language I suggest you do it in a blog instead of posting it in the
internals mailing list...
~Hannes
IMHO substr is just fine. It does what you expect and behaves well
on edge cases.
What I believe is that we need more high-level string abstractions (and that
includes functions as well).
substr, strpos and the like work just fine to access strings by offsets,
but when you need to work with substrings the code gets really messy.
How do you test that a string starts with another string?
$string = 'string';
$start = 'str';
0 === strpos($string, $start); // to test that it starts with $start
0 !== strpos($string, $start); // to test that it does not start with $start
Yeah, it's not too complicated... but it's not legible at a glance. How does
Java do it?
String string = "string";
String str = "str";
string.startsWith(str);
And how do you test that a string ends with another string?
$str === substr($string, -1 * strlen($str));
Tricky, isn't it? How about Java?
string.endsWith(str);
I don't speak Ruby or Python... how do you do this in such languages?
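The two PHP checks above can be wrapped in small helpers so the intent is readable at a glance. This is only a sketch: the names starts_with_sketch and ends_with_sketch are hypothetical, and treating an empty prefix or suffix as a match is an assumption.

```php
<?php
// Hypothetical helpers; names and empty-string behaviour are assumptions.
function starts_with_sketch(string $subject, string $prefix): bool
{
    // Compare only the first strlen($prefix) bytes.
    return strncmp($subject, $prefix, strlen($prefix)) === 0;
}

function ends_with_sketch(string $subject, string $suffix): bool
{
    $n = strlen($suffix);
    if ($n === 0) {
        // substr($subject, -0) would return the whole string, so handle
        // the empty suffix explicitly.
        return true;
    }
    return $n <= strlen($subject) && substr($subject, -$n) === $suffix;
}

var_dump(starts_with_sketch('string', 'str')); // bool(true)
var_dump(ends_with_sketch('string', 'ing'));   // bool(true)
var_dump(ends_with_sketch('string', 'str'));   // bool(false)
```

The explicit empty-suffix branch is there because the one-liner with -1 * strlen($str) compares against the whole string when $str is empty.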
Martin Scotta
Parsing is a problem in many real-world
problems and substr currently works great for that purpose.
That's funny because the first thing I thought when I read the
original mail was "oh that would be great for parsing." In fact, I've
just grep'ed through some code from a rich text parser I've been
working on and at first glance there are ~5 occurrences of substr()
that I would replace with str_slice(). There are also 15+ occurrences
of substr() that would remain untouched. It's not black-and-white:
sometimes you want N characters starting from pos X, and other times
you want to cut the text between pos X and pos Y, and that's where
str_slice() would be welcome.
This thread shouldn't be a criticism of substr(), it would be
pointless. Its signature and behaviour will never change, unless
perhaps around April 1st as a practical joke on the millions of
websites it would break. The question is: is str_slice() useful, does
it fill a need and should it be included into PHP?
-JD
"This thread shouldn't be a criticism of substr(), it would be
pointless. Its signature and behaviour will never change, unless
perhaps around April 1st as a practical joke on the millions of
websites it would break."
That's a really good joke actually. I can imagine how angry people would
get. Maybe PHP could post it as a fake news update tomorrow?
And yeah, I wouldn't mind an extra string slicing function although I think
substr is sufficient... and if you're adding extra string functions there
are more useful ones. These are the string functions in my PHP framework
that I use regularly:
// Already mentioned:
starts_with($subject, $prefix) // returns bool
ends_with($subject, $tail) // returns bool
// Generating codes/ids:
random_str($length = 16) // returns any string
random_hex_str($length = 16) // returns any 0-9a-f
random_alphanum_str($length = 16, $case_sensitive = true) // returns any a-z0-9/A-Z0-9
from_index($index) // 0 returns a, 1 returns b, 2 => c .. n => z, n + 1 => aa, n + 2 => ab ...
// like base64 but only uses 0-9a-z (it encodes / as ad, + as ac and a as ab)
base64_alphanum_encode($data)
base64_alphanum_decode($str)
// hex encodes string (e.g. "\x11\x1c" => "111c")
hex_encode($str)
hex_decode($str)
email_validate($email) // regex email validation (the actual regex is 390 characters and has never failed me... don't ask me how it works though)
http_url_validate($url)
in_range($string, $min = -1, $max = -1) // returns true if string has a certain length (easier to read and faster to write)
quote($string) // for exporting any string to javascript, escaping any escape or control characters automatically (e.g. \n) so "foo"bar" becomes exactly that
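As an illustration of the from_index() mapping in the list above (0 => a, 25 => z, 26 => aa, 27 => ab), here is one possible implementation. It is a guess at the behaviour from the one-line description, not the framework's actual code, and the name from_index_sketch is hypothetical.

```php
<?php
// Hypothetical sketch of the index-to-letters mapping described above.
// Requires PHP 7.0 or later for intdiv().
function from_index_sketch(int $index): string
{
    if ($index < 0) {
        throw new InvalidArgumentException('Index must not be negative');
    }
    $out = '';
    do {
        // Take the lowest "digit" as a letter a-z and prepend it.
        $out = chr(ord('a') + $index % 26) . $out;
        // Subtract 1 because there is no zero digit: 'aa' follows 'z'.
        $index = intdiv($index, 26) - 1;
    } while ($index >= 0);
    return $out;
}

var_dump(from_index_sketch(0));  // string(1) "a"
var_dump(from_index_sketch(25)); // string(1) "z"
var_dump(from_index_sketch(26)); // string(2) "aa"
var_dump(from_index_sketch(27)); // string(2) "ab"
```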
~Hannes
var_dump(\substr("foo", 5, 6) == "", (string) false, false == "");
Welcome to PHP. To be honest this criticism pretty much falls in the
"from person that comes from another language X and is annoyed that
every little detail isn't exactly the same"-category. Just make your
own substr() function that uses the behavior you expect if you don't
like the native version. Although that's bad practice - the best
solution is to get used to it. And if you have an urge to write about
your experience with a new language I suggest you do it in a blog
instead of posting it in the internals mailing list...
I agree with what you're saying but I think you're being a little harsh,
experience with a new language should be something that matters to
internals.
Back to Dan, you're hitting type conversion so not sure I understand "you
always have to deal with this FALSE case":
var_dump(substr('', -1) == '0'); // TRUE
var_dump(substr('', -1) === '0'); // FALSE
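A short sketch of the conversion behind the two lines above. Note the version dependence: substr('', -1) returns false before PHP 8.0 and '' from PHP 8.0 on, so the first of the two results above only holds on the older versions. The comparisons below are spelled out with literal values so that the comments hold on any version.

```php
<?php
// false before PHP 8.0, '' from PHP 8.0 on.
$tail = substr('', -1);

// Comparing a bool with a string converts the string to bool,
// and '0' converts to false.
var_dump(false == '0');   // bool(true)

// Two non-numeric strings are compared as strings.
var_dump('' == '0');      // bool(false)

// Strict comparison never matches, whichever value substr() returned.
var_dump($tail === '0');  // bool(false)
```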
Equally to Dan, it doesn't seem like a great way to start the conversation
by saying the "interface is inconsistent and confusing".
There's a perception issue here since you're already used to using language
X,Y,Z.
That said, the patch seems to add value to PHP, so why not add str_slice() if it
can solve consistency issues for some users? str_slice() could likely be a
faster alternative to substr() in userland parsers.
Also, I've seen plenty of bugs caused by type conversion & substr() in
userland.
Hello!
While I personally like PHP's substr() an awful lot and doubt I would
use the str_slice() method, I thought I'd mention that I think what
you're proposing is much like the string.substring(from, to) method in
JavaScript (and PHP's current substr() function is an awful lot like
JavaScript's string.substr(start, length) method).
With that in mind, if this function was to be implemented, I think
that naming it substring() instead of str_slice() might make it easier
for people to pick up out of the box, since PHP developers often have
quite a bit of overlap with JavaScript.
Chad
On 30.03.2011 17:54, Chad Fulton wrote:
While I personally like PHP's substr() an awful lot
With that in mind, if this function was to be implemented, I think
that naming it substring() instead of str_slice() might make it easier
for people to pick up out of the box, since PHP developers often have
quite a bit of overlap with Javascript
A really bad idea!
After that you would have both substr() and substring(), and we should avoid
doing things like "mysql_escape_string" and "mysql_real_escape_string" again.
The example I picked in my patch was a little contrived, however I do think
it is a useful benefit for functions to work in ways people expect, even in
edge cases. There are a lot of people out there who do not know the
difference between == and ===, and I think the fact that str_slice() has one
less potential bug to worry about is a tangible improvement.
As for adding other string functions, I agree, I think there are a lot of
them that would be great to add. starts_with & ends_with for sure. I just
didn't want to write a really large patch for a whole host of functions
before having the discussion about this one.
I also don't mean to attack PHP. I think the interface for this function is
heavily borrowed from C, and I think because PHP has different goals than C
(ease of use being one of them), that improvements can be made from this
perspective. I think from a functionality point of view for an expert user,
the C function interfaces are great.
-Dan
As for adding other string functions, I agree, I think there are a lot of
them that would be great to add. starts_with & ends_with for sure.
Both str_startswith and str_endswith have been suggested in the past:
http://marc.info/?t=121647230100001&r=1&w=2
I recently got around to merging them into a largely unfinished extension
so they are archived somewhere safe: https://github.com/mj/php-ext-str
- Martin
I see str_contains() on the TODO there. I've always wanted in_string() so am glad to see a similar item. Using strpos()
for this task feels dirty, much like using arrpos() for arrays would ;)
Regards,
Philip
From time to time I have wondered if it made sense to add a new operator
"in" that works on variables of different type and could replace
in_string/str_contains:
if ("a" in "abc") { ... }
if ("a" in array("a", "b", "c")) { ... }
if ("a" in $obj) { /* true if $obj->__contains("a") returned true */ }
I suspect there is a massive potential for WTF issues in there and that
most people would hate this feature. Which is why I am only thinking
out loud here -- I have zero intentions to suggest this as a future
enhancement for PHP. ;-)
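Since "in" is not valid PHP syntax, the three cases above can only be approximated with an ordinary function. The sketch below is purely illustrative: contains_sketch is a made-up name, and __contains() is the hypothetical method from the example above, not an existing PHP magic method.

```php
<?php
// Hypothetical stand-in for the proposed "in" operator.
function contains_sketch($needle, $haystack): bool
{
    if (is_string($haystack)) {
        // Substring test; non-string and empty needles never match here.
        return is_string($needle) && $needle !== ''
            && strpos($haystack, $needle) !== false;
    }
    if (is_array($haystack)) {
        // Strict membership test to avoid type juggling surprises.
        return in_array($needle, $haystack, true);
    }
    if (is_object($haystack) && method_exists($haystack, '__contains')) {
        // __contains() is the hypothetical hook from the message above.
        return (bool) $haystack->__contains($needle);
    }
    return false;
}

var_dump(contains_sketch("a", "abc"));                // bool(true)
var_dump(contains_sketch("a", array("a", "b", "c"))); // bool(true)
var_dump(contains_sketch("z", "abc"));                // bool(false)
```

Even this small sketch has to pick answers for empty needles and for strict versus loose array matching, which hints at the room for surprises mentioned above.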
- Martin
would we have mb_in also? :)
Tyrael
How would str_contains() be different from strstr()?
-Rasmus
They differ in the return type:
boolean str_contains(string, string);
string strstr(string, string);
Martin Scotta
How would str_contains() be different from strstr()?
They differ in the return type
$instr = (bool)strstr($string1, $string2);
done. No need for a new function.
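One caveat with the cast above, shown as a small sketch: strstr() returns the rest of the haystack starting at the needle, and the string "0" converts to false, so the cast can report "not found" for a needle that is actually present.

```php
<?php
// strstr() returns the haystack from the first match onwards.
$tail = strstr('10', '0');
var_dump($tail);                        // string(1) "0"

// The string "0" is falsy, so the cast hides the match.
var_dump((bool) $tail);                 // bool(false)

// Comparing against false explicitly avoids the problem.
var_dump(strstr('10', '0') !== false);  // bool(true)
```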
Brian.
Well, to be clearer:
bool str_contains( haystack, needle [, case_insensitive = false ] )
string str[i]str( haystack, needle [, $before_needle = false ] )
The main differences:
- Return value is a bool
- No ordinal value conversions (I assume)
- No === usage worries/requirements
- Intuitive name
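A userland sketch of the signature above might look like this. The function is named str_contains_sketch to avoid clashing with any built-in of the same name (PHP 8.0 and later ship a native str_contains()), and returning true for an empty needle is an assumption.

```php
<?php
// Hypothetical userland version of the proposed signature.
function str_contains_sketch(
    string $haystack,
    string $needle,
    bool $case_insensitive = false
): bool {
    if ($needle === '') {
        // Assumption: every string contains the empty string.
        return true;
    }
    $pos = $case_insensitive
        ? stripos($haystack, $needle)
        : strpos($haystack, $needle);
    // strpos() may return 0 for a match at the start, so compare strictly.
    return $pos !== false;
}

var_dump(str_contains_sketch('string', 'str'));       // bool(true)
var_dump(str_contains_sketch('string', 'STR'));       // bool(false)
var_dump(str_contains_sketch('string', 'STR', true)); // bool(true)
```

The caller gets a plain bool and never has to remember the === check against false.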
Regards,
Philip
- Intuitive name
Argh! Everyone should be forced to learn a bit of C. Like many PHP
functions, the name and argument order is right out of libc. If you type
"man strstr" at your (non-Windows) prompt you get a nice little
description of what it does.
-Rasmus
On 31.03.2011 17:52, Rasmus Lerdorf wrote:
Argh! Everyone should be forced to learn a bit of C. Like many PHP
functions, the name and argument order is right out of libc. If you type
"man strstr" at your (non-Windows) prompt you get a nice little
description of what it does.
And if you install pman you just do "pman strstr" and get PHP-specific
documentation.
--
Sebastian Bergmann Co-Founder and Principal Consultant
http://sebastian-bergmann.de/ http://thePHP.cc/
Although I disagree that adding a C prerequisite to PHP ends this discussion, pman is a related tool, which can be installed like so:
pear install doc.php.net/pman
Then 'pman in_array' or similar will open the local PHP man pages in the shell. However, the pman files have not been updated for a while, so we'll look into this. Oh, guess we should create a 'pman pman' too.
Regards,
Philip
Am I right in thinking pman and man are not for Windows ... hmmm.
Sounds like half a job to me!
--
Richard Quadling
Twitter : EE : Zend
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY
P.S. It's TFI Friday and April 1st. Oh the joy!
I think it's time to stop thinking in terms of "functions" and move forward
to "abstractions"
$s1 = 'string';
$s1->contains($s2);
$s1->indexOf($s2) === strpos($s1, $s2);
Why can't the strings be exposed as pseudo-objects ? users can choose to use
them as a regular strings or by calling methods on it.
Martin Scotta
This is actually something I have been toying with a bit. Adding the
ability to call methods on both strings and arrays. I still don't like
the idea of making them real objects as the overhead and the amount of
code that would need to be changed in the core and in every extension is
daunting. Anybody out there have a patch along these lines?
-Rasmus
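For readers who want to try the call style, here is a rough userland approximation. It is only a sketch: the class name StrSketch is hypothetical, and an engine-level feature as discussed above would not need the explicit wrapping shown here.

```php
<?php
// Hypothetical userland wrapper; an engine-level feature would need no explicit wrapping.
final class StrSketch
{
    /** @var string */
    private $value;

    public function __construct(string $value)
    {
        $this->value = $value;
    }

    public function contains(string $needle): bool
    {
        return $needle === '' || strpos($this->value, $needle) !== false;
    }

    /** @return int|false same result as strpos() */
    public function indexOf(string $needle)
    {
        return strpos($this->value, $needle);
    }

    // Lets the wrapper still be used where a plain string is expected.
    public function __toString(): string
    {
        return $this->value;
    }
}

$s1 = new StrSketch('string');
var_dump($s1->contains('rin'));                            // bool(true)
var_dump($s1->indexOf('rin') === strpos('string', 'rin')); // bool(true)
echo $s1, "\n";                                            // string
```

The wrapper allocates a real object per string, which is exactly the overhead the message above wants to avoid, so it shows the API shape rather than a viable implementation strategy.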
On a somewhat related note (and going back a little to my original patch),
languages like python and ruby allow slicing on array/string objects with
$string_or_array[start:end] syntax. I think this would be really useful
syntax in PHP as well (and would of course make the initial patch I
submitted obsolete). The one thing I was hesitant about with my patch was
polluting the global function space for PHP, and I think adding new and
specific syntax to provide these things is much better.
I have no idea how feasible it is given PHP's internal structure, but if it
was possible I would be glad to try to write a patch for it.
-Dan
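To make the start/end idea concrete, here is a sketch of a slice function with Python-like semantics. This is not the code from the patch: the name str_slice_sketch, the clamping of out-of-range indexes, and the handling of negative indexes are all assumptions made for illustration.

```php
<?php
// Hypothetical sketch with Python-like semantics; the real patch may behave differently.
function str_slice_sketch(string $s, int $start, ?int $end = null): string
{
    $len = strlen($s);

    // Negative indexes count back from the end of the string.
    if ($start < 0) {
        $start = max(0, $len + $start);
    }
    if ($end === null) {
        $end = $len;
    } elseif ($end < 0) {
        $end = max(0, $len + $end);
    }

    // Clamp both indexes to the string, then bail out on an empty range.
    $start = min($start, $len);
    $end = min($end, $len);
    if ($end <= $start) {
        return '';
    }

    return substr($s, $start, $end - $start);
}

var_dump(str_slice_sketch('abcdef', 1, 3));  // string(2) "bc"
var_dump(str_slice_sketch('abcdef', -2));    // string(2) "ef"
var_dump(str_slice_sketch('abcdef', 0, -1)); // string(5) "abcde"
var_dump(str_slice_sketch('abcdef', 4, 2));  // string(0) ""
```

Because the result is always a string, callers can compare it with === without worrying about a false return value.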
I think it's time to stop thinking in terms of "functions" and move
forward to "abstractions"
$s1 = 'string';
$s1->contains($s2);
$s1->indexOf($s2) === strpos($s1, $s2);
Why can't the strings be exposed as pseudo-objects ? users can choose to
use them as a regular strings or by calling methods on it.
This is actually something I have been toying with a bit. Adding the
ability to call methods on both strings and arrays. I still don't like
the idea of making them real objects as the overhead and the amount of
code that would need to be changed in the core and in every extension is
daunting. Anybody out there have a patch along these lines?
Sounds interesting. A few thoughts:
The new "methods" could be implemented as functions that accept the string or array as their first argument, thereby allowing them to be called as functions too.
If the new methods are functions, maybe they should be defined in the \string and \array namespaces. This would allow a fresh start in naming string and array functions, and allow addressing argument ordering issues. It would of course also open endless discussions on which functions to include and how to name them.
If the new string and array functions were defined in some namespaces, would it be possible to allow extending the string and array "classes" by defining more functions in those namespaces in userland and/or extensions?
Should the new string functions be multibyte character set aware?
Should the above be generalized so that the -> operator can be used on any simple type, and if the called "method" exists as a function in the type's namespace, it should be called? This would break the task into two parts: 1) changing PHP to attempt calling functions in a certain namespace for each simple type; 2) writing extensions that create functions for each type, for which string and/or array would be obvious starting points.
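The two parts could be sketched in userland roughly as follows. Everything here is hypothetical: the namespace Sketch\Str stands in for the suggested \string, and call_method() only imitates the lookup that the -> operator would have to perform inside the engine.

```php
<?php
// Hypothetical namespace; the thread suggests \string, a different name is used here.
namespace Sketch\Str;

// Part 2: plain functions that take the subject as their first argument.
function contains(string $subject, string $needle): bool
{
    return $needle === '' || \strpos($subject, $needle) !== false;
}

function starts_with(string $subject, string $prefix): bool
{
    return \strncmp($subject, $prefix, \strlen($prefix)) === 0;
}

// Part 1 (stand-in): look the "method" up as a function in the type's namespace.
function call_method(string $subject, string $method, ...$args)
{
    $function = __NAMESPACE__ . '\\' . $method;
    if (!\function_exists($function)) {
        throw new \BadFunctionCallException("No string method named $method");
    }
    return $function($subject, ...$args);
}

var_dump(contains('haystack', 'st'));                    // bool(true)
var_dump(call_method('haystack', 'starts_with', 'hay')); // bool(true)
```

Since the lookup is by name at call time, any userland file or extension that defines another function in the same namespace would add a "method" without touching the dispatcher.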
This would be great - especially if it would provide multibyte (or
only UTF-8 for all I care) support. Of course one potential pain point
is numbers vs strings.
Looking at JavaScript which acts quite similarly to the proposed
approach, the Number methods [1] are not many and not often useful,
but still they exist. What happens if you do "3.3"->ceil() ? If ceil()
exists only in \numeric, you could say autoconvert, but if it exists
in both \string and \numeric we have a problem, so the only sensible
way is to drop type juggling for those autoboxed objects imo. Which is
also what JS does btw. If you have an ambiguous value on your hands,
you should cast it (x.toString() or parseInt(x)) first and then call
what you want to call on it. Another interesting thing to note is that JS
doesn't allow method calls directly on numbers - 3.toPrecision(2) for
example is a syntax error, you have to put it in a var first. I guess
it's not a problem since you should just inline the value you want
anyway, the Number methods don't do super fancy things.
Anyway, I hope this helps, my point is just that we shouldn't forget
to check out how other languages solved the issue before hitting the
wall ourselves. Not saying we have to do it all like JS, but they have
something that works, so it's worth studying.
Cheers
[1] https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Number#Methods
--
Jordi Boggiano
@seldaek :: http://seld.be/
As Stan Vass points out the methods should probably be implemented for scalars and for arrays, and handling of types should be done in the same way as the existing string and number functions. So maybe only \scalar and \array. Alternatively scalars should look for methods in multiple namespaces to keep the function naming consistent.
Yes, as I wrote the above I did realize that it has some resemblance to JavaScript's prototypal inheritance, so it is a good idea to look there for inspiration. But the above is not supposed to be a suggestion for introducing a new object model into PHP.
We could use the library organization of JS, in that purely math-related
methods like 'ceil', 'sin' etc., constants like PI, E etc. will be in
namespace \math and take any scalar numeric input. As a lot of them take
more than one argument, and the arguments are sometimes commutative, it
makes less sense to present those ones in particular as methods.
The few number methods that make sense as methods can then be in the scalar
API. Almost like JS but not quite. Since PHP's interpretation of types is
slightly different from JS's, we can surely take a page from JS, which is good
(and I support it), but it needs to be nudged more towards PHP's PoV.
Re. number literals, (3).toPrecision(2) works in JS, I guess they wanted to
avoid some ambiguity in the parser, ex. 3.0.toPrecision(2).
Stan Vass
If I'm not mistaken you're talking about something like the Extension
Methods in C# ( http://msdn.microsoft.com/en-us/library/bb383977.aspx )
which, by the way, seems like a great idea.
The other thing that comes to mind is Scala's implicits (
http://www.codecommit.com/blog/ruby/implicit-conversions-more-powerful-than-dynamic-typing
) It solves the same problem, while being really powerful, because of
Scala's type system.
The problem of conflicting implicits (or whatever it'll be called) of
course should be left to the developer, much like aliasing for traits.
--
Pas
This is actually something I have been toying with a bit. Adding the
ability to call methods on both strings and arrays. I still don't like
the idea of making them real objects as the overhead and the amount of
code that would need to be changed in the core and in every extension is
daunting. Anybody out there have a patch along these lines?
-Rasmus
Glad to see pseudo-object methods for scalars/arrays brought up again. It'll
improve workflow a lot. I want to ask/suggest two clarifications on this:
- You said 'strings and arrays', I hope that's 'scalars and arrays'. As the
current PHP semantics work, and with very minor exceptions, int/float works
as a numeric string when passed to a string function, and numeric strings
work when passed to a number function.
I.e. those should work:
$a = 1234; /* int */ echo $a->substr(1); // '234'
$a = '1.23456'; /* string */ $a->format(2); // '1.23'
This means the extension methods for primitives should be split in two
groups:
arrays (lists and maps) - one API set
scalars (bool, int, float, strings) - one API set
- Because the language currently has no scalar hinting (I know it's
coming), IDE autocompletion of the extension methods may not work very well
(unless the IDE assumes everything is a scalar unless hinted otherwise).
Both hinting, incl. for method/function return types, and extension methods
should be introduced together, as they make each other more useful than
either is independently.
Let me know what you think.
Stan Vass
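The method calls in the message above are hypothetical, but the type juggling they rely on can be checked with ordinary function calls. A small sketch, assuming PHP's default (non-strict) typing mode and assuming that a ->format() method would map to number_format(), which the message does not state:

```php
<?php
// Runs in PHP's default (non-strict) typing mode; number_format() is only
// assumed to be what a hypothetical ->format() would map to.
$a = 1234;                       // int passed to a string function
var_dump(substr($a, 1));         // string(3) "234"

$b = '1.23456';                  // numeric string passed to a number function
var_dump(number_format($b, 2));  // string(4) "1.23"
```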
What about a marker interface hierarchy like this?
interface Mixed { }
interface Numeric { }
interface Callable extends Mixed { }
interface Resource extends Mixed { }
interface Scalar extends Mixed { }
interface Array extends Mixed { } // unexpected T_ARRAY
interface Object extends Mixed { }
interface Boolean extends Scalar { }
interface String extends Scalar, Numeric { }
interface Number extends Scalar, Numeric { }
interface Float extends Number { }
interface Integer extends Number { }
( http://graph.gafol.net/etSevVPSJ )
This way the core will know which methods may be called on each $var.
Advantages:
- cannot create instances
- cannot extend these interfaces (private to core)
- very fine-grained type hints
- core implementation is not exposed
Martin Scotta
How would str_contains() be different from strstr()?
They differ in the return type
$instr = (bool)strstr($string1, $string2);
done. No need for a new function.
God forbid anyone use (bool)strstr("something0", "0") !
Brian.
I recently got around to merging them into a largely unfinished extension
so they are archived somewhere safe: https://github.com/mj/php-ext-str
I see str_contains() on the TODO there. I've always wanted in_string() so am glad to see a similar item.
Using strpos() for this task feels dirty, much like using arrpos() for arrays would ;)
How would str_contains() be different from strstr()?
I think my main objective was to have a utility function that has the
advantages of strpos (i.e. faster and less memory usage than strstr)
while having a strict boolean return value so that I don't need to
remember to use === all the time.
- Martin
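The two traps mentioned in this thread, the (bool)strstr() cast and the need for === with strpos(), can be reproduced with built-in functions only. The snippet is a plain demonstration rather than a proposed API:

```php
<?php
// strstr() returns the tail of the haystack starting at the match, and the
// string "0" is falsy in PHP, so the cast reports a miss.
var_dump(strstr('something0', '0'));           // string(1) "0"
var_dump((bool) strstr('something0', '0'));    // bool(false)

// strpos() returns offset 0 for a match at the start, which is also falsy.
var_dump(strpos('abc', 'a'));                  // int(0)
var_dump((bool) strpos('abc', 'a'));           // bool(false)

// The strict comparison is the form that has to be remembered.
var_dump(strpos('abc', 'a') !== false);        // bool(true)
var_dump(strpos('something0', '0') !== false); // bool(true)
```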
If substr() really was so bad, then surely we'd see userland
implementations of str_slice() in every project?
Jevon
I think most users of a language take what they are given for basic
functions like this instead of always rolling their own. I admit that both
of the changes I am suggesting here are minor, but taken together I do think
it is a significant and tangible difference (and improvement). I think if
you are going by the metric of people rolling their own, then the functions
"starts_with" and "ends_with" have probably been implemented many many times
by people in various projects (see
http://stackoverflow.com/questions/834303/php-startswith-and-endswith-functions,
especially notice the solutions using strrev or regex searching). However
in that previous discussion that Martin linked to, it looks like it was shot
down because the functions were deemed too trivial. A funny sidenote is
that this message (http://marc.info/?l=php-internals&m=121667513510896&w=2)
in that previous discussion has the exact bug in it which I mentioned in my
bug (if (substr($path, -strlen($extension)) == $extension) is TRUE
for $path = '' and $extension = '0').
I guess the core of my argument is that the current PHP string functions, which
certainly provide all functionality one could want, are not optimized for
simplicity or user friendliness. If PHP wants to optimize for the
conciseness of the core library, then I understand an argument to keep out
functions like this. However, if PHP wants to add in functions that are
simpler to use and have more user friendly interfaces, I think it would
benefit by adding in a few of the functions mentioned here (slice,
ends_with, starts_with). The reason I led with str_slice, is because I
think str_slice makes ends_with and starts_with trivial with no potential
corner cases, so if I could pick one function to add, I would pick
str_slice.
-Dan
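A sketch of how the corner case above can be sidestepped in userland. The function names are hypothetical (the _sketch suffix avoids clashing with any built-in of a similar name), and treating an empty prefix or suffix as a match is an assumption. As far as I know, what substr() returns for an empty string has differed between PHP versions, so the sketch checks lengths first and compares with === instead of relying on that return value:

```php
<?php
// Hypothetical helpers; names and the empty-needle rule are assumptions.
function starts_with_sketch(string $haystack, string $prefix): bool
{
    // strncmp() compares only the first strlen($prefix) bytes.
    return strncmp($haystack, $prefix, strlen($prefix)) === 0;
}

function ends_with_sketch(string $haystack, string $suffix): bool
{
    $n = strlen($suffix);
    if ($n === 0) {
        return true;  // assumption: every string ends with the empty string
    }
    if ($n > strlen($haystack)) {
        return false; // avoids calling substr() with an out-of-range offset
    }
    // Strict comparison: no loose conversion between '', '0' and false.
    return substr($haystack, -$n) === $suffix;
}

var_dump(ends_with_sketch('', '0'));                // bool(false)
var_dump(ends_with_sketch('index.php', '.php'));    // bool(true)
var_dump(starts_with_sketch('index.php', 'index')); // bool(true)
```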