Hi internals!
I've created a draft RFC and patch to change the behavior of non-strict
string to string comparison to be binary safe (as the strict comparison
operator does):
https://wiki.php.net/rfc/binary_string_comparison
When comparing two numeric strings, the operands will be equal only if
their string representations are identical. Otherwise the first operand
will be greater if its byte at the first non-matching position is higher,
and lower if that byte is lower.
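To make the difference concrete, here is a minimal sketch (not taken from the RFC patch) that contrasts today's loose comparison with a byte-wise one. strcmp() is used only as a stand-in for the proposed behaviour, and compare_both() is a hypothetical helper name. The scalar type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: compares two strings with the current loose operators
// and byte-wise (strcmp() stands in for the behaviour the RFC proposes).
function compare_both(string $a, string $b): array
{
    return [
        'loose_equal' => $a == $b,             // numeric strings are compared as numbers
        'loose_less'  => $a < $b,
        'bytes_equal' => strcmp($a, $b) === 0, // equal only if the bytes are identical
        'bytes_less'  => strcmp($a, $b) < 0,   // decided by the first differing byte
    ];
}

var_dump(compare_both("1e3", "1000")); // numerically equal, but different bytes
var_dump(compare_both("10", "9"));     // 10 > 9 numerically, but "1" sorts before "9"
```

Only pairs whose result does not depend on the PHP version were chosen here; whitespace and hex cases behave differently across versions.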
As a side effect, it makes string comparison much faster, forces
developers to write what they really mean (no need to guess), and pushes
them to cast/filter input once, which also helps performance.
At the C level, the function zendi_smart_strcmp will be unused and marked as
deprecated.
Thanks,
Marc
If I understand your goal correctly, you seem to want to change a very
fundamental (and ancient) behavior of the language even though
mechanisms already exist to do what you describe as the "changed
behavior".
What exactly is wrong with ===, strcmp(), etc.?
-Sara
Type juggling is a (major) feature of PHP which would effectively be
neutered by this change. As Sara mentioned we already have tools to achieve
binary and string comparison.
Yes, this would be a very problematic change at this point. It isn't
something you could audit your code for easily and make it future-proof.
Something that used to work will simply not and it will be nearly
impossible to understand why.
I don't necessarily disagree that this is perhaps the way it should have
worked from the beginning, but without an upgrade path I don't think we
can do this. The only mention in your RFC towards this is:
"This can be easily resolved by explicitly casting one of the
operands to an integer or float respectively define the sorting
algorithm."
That is not an upgrade path. Going through and auditing every single
comparison in a large code base to see if one side should perhaps be
cast to a different type is not feasible.
-Rasmus
The question isn't "What's wrong with ===, strcmp()?" but "What's wrong
with ==, <, >?".
We have a standard way to compare two operands, but currently we do some
magic to solve something that doesn't need to be solved.
If you want to compare two pears, we currently convert the pears
into apples, compare the two apples, and tell you to use a special
function if you want to compare two pears. Why?
Nothing in a string-to-string comparison provides a numeric context that
would justify comparing the strings numerically.
hi,
And the answer is: they are not strict, which is why === exists.
Cheers,
Pierre
@pierrejoye | http://www.libgd.org
Non-strict comparison should do conversion to make both operands
comparable. It should not convert both operands into a third unrelated
type that wasn't mentioned.
Btw., the RFC doesn't make == and === behave the same, because == still
does type juggling, but only if the operands are not of the same type.
strcmp() doesn't behave the same way either, as it first converts both
operands into strings.
I don't want to start a flame war, but Perl gets it right: you have 2 sets of
operators, 'eq' and '=='.
The choice to have a type-juggling '==' was made years ago; that should not change.
If you want '<=' with strings, use strcmp().
There is also a speed cost to type juggling (PHP 5.3.3, 64-bit CentOS):
$a = "1234";
$b = "ghij";
$a == $b is about 64% slower than $a === $b
strcmp($a, $b) is about 163% slower than $a === $b
$a = "abcd";
$b = "ghij";
$a == $b is about 24% slower than $a === $b
strcmp($a, $b) is about 168% slower than $a === $b
$a = "abcd";
$b = "4567";
$a == $b is about 20% slower than $a === $b
strcmp($a, $b) is about 176% slower than $a === $b
$a = "1234";
$b = "4567";
$a == $b is about 20% slower than $a === $b
strcmp($a, $b) is about 162% slower than $a === $b
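The script behind these numbers was not posted, so the following is only a rough sketch of how such a measurement could be repeated. time_comparisons() and the iteration count are assumptions, the syntax assumes PHP 7.1 or later, and the absolute timings will differ by PHP version and machine.

```php
<?php
// Hypothetical helper: times $iterations runs of ==, === and strcmp() on one pair.
function time_comparisons(string $a, string $b, int $iterations = 200000): array
{
    $timings = [];

    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $result = ($a == $b);
    }
    $timings['=='] = microtime(true) - $start;

    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $result = ($a === $b);
    }
    $timings['==='] = microtime(true) - $start;

    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $result = strcmp($a, $b);
    }
    $timings['strcmp'] = microtime(true) - $start;

    return $timings; // elapsed seconds per operator
}

// The same four operand pairs as in the figures above.
foreach ([["1234", "ghij"], ["abcd", "ghij"], ["abcd", "4567"], ["1234", "4567"]] as [$a, $b]) {
    printf("%s vs %s: %s\n", $a, $b, json_encode(time_comparisons($a, $b)));
}
```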
--
Alain Williams
Linux/GNU Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
+44 (0) 787 668 0256 http://www.phcomp.co.uk/
Parliament Hill Computers Ltd. Registration Information: http://www.phcomp.co.uk/contact.php
#include <std_disclaimer.h>
Still it is a key property of the language which we can't simply change.
Also mind this: All input data are strings and some databases also
return data as string. So code like
if ($_GET['id'] > 0)
or
if ($db->fetchRow()[0] == 12)
which is common will break. Maybe using different sets of operators
would have been better, but that should have happened 20 years ago not
now.
Also mind consistency: if you change this you probably also have to
change other places where implicit conversion happens, e.g. array keys.
In the end you get a completely new, incompatible language and can throw
away most libraries and so on.
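For reference, this is how the implicit conversion of array keys behaves today. It is a small illustration of current behaviour only; nothing in this thread says the RFC would touch it.

```php
<?php
$map = [];
$map["1"]  = 'numeric string key'; // stored under the integer key 1
$map[1]    = 'integer key';        // overwrites the entry above
$map["01"] = 'leading zero';       // not a canonical integer, stays a string key

var_dump(array_keys($map)); // two keys remain: int 1 and string "01"
```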
So in case you have a Tardis, DeLorean with Flux Capacitor or any other
time machine I'm happy to support you in the past, but not this late.
johannes
Those two cases will actually not be affected; it's strictly string<=>string comparisons that are being discussed here.
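A quick way to see which comparisons fall into that category is to look at the operand types. The helper below is hypothetical and only mirrors the distinction made in this thread: the proposal concerns comparisons where both sides are strings.

```php
<?php
// Hypothetical helper: true when both operands are strings, i.e. the only
// kind of comparison the proposal is about.
function is_string_to_string($left, $right): bool
{
    return is_string($left) && is_string($right);
}

$id = "42"; // stands in for $_GET['id'], which always arrives as a string

var_dump(is_string_to_string($id, 0));        // false: string vs int literal
var_dump(is_string_to_string("12", 12));      // false: string vs int literal
var_dump(is_string_to_string("12", "12.0"));  // true: both sides are strings
```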
Meaning that simple code you find everywhere, in every second tutorial,
    foreach ($db->query("SELECT id, title FROM entries") as $row) {
        echo "<tr><td";
        if ($row[0] == $_GET['highlight_id']) {
            echo " background='#ff0000'";
        }
        echo ">".htmlentities($row[1])."</td></tr>";
    }
will suddenly fail. How wonderful! (irony)
johannes
ps. yes, the example might be done nicer and better, it still represents
a common pattern.
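One way to make such a comparison independent of how == treats numeric strings is to validate both sides as integers first. This is only a sketch of that idea: ids_match() is a hypothetical helper, and it assumes the filter extension (enabled by default) and PHP 7 or later.

```php
<?php
// Hypothetical helper: compare a database id with user input as integers.
function ids_match(string $dbId, string $input): bool
{
    $a = filter_var($dbId, FILTER_VALIDATE_INT);
    $b = filter_var($input, FILTER_VALIDATE_INT);

    // filter_var() returns false when the value is not a valid integer,
    // so invalid input never matches anything.
    return $a !== false && $b !== false && $a === $b;
}

var_dump(ids_match("7", "7"));   // true
var_dump(ids_match("7", "8"));   // false
var_dump(ids_match("7", "abc")); // false
```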
Just to make this more fun: assume $db is PDO; then what happens will
depend on the driver (and for some drivers even on the configuration,
e.g. the setting of PDO::ATTR_EMULATE_PREPARES with MySQL).
johannes
I don't understand exactly what you mean here. This RFC has nothing to do
with the DB layer or PDO.
Do you have any example where a DB returns integers differently?
In your example the comparison only fails if the GET variable is an integer
that is not written the way a human normally would: prefixed with
whitespace (but not suffixed), prefixed with a 0, or written as a real
number or in hex.
I'm very sure the changed behavior doesn't cause big real-life BC issues.
I will run some test suites on it.
On 20/08/14 11:12, Marc Bennewitz wrote:
> Do you have any example where a DB returns integers differently?
php -r '$p = new PDO("mysql:user=root");
$s=$p->prepare("SELECT 1"); $s->execute();
var_dump($s->fetchAll());
$p->setAttribute(PDO::ATTR_EMULATE_PREPARES,false);
$s=$p->prepare("SELECT 1"); $s->execute();
var_dump($s->fetchAll());'
array(1) {
[0]=>
array(2) {
[1]=>
string(1) "1"
[2]=>
string(1) "1"
}
}
array(1) {
[0]=>
array(2) {
[1]=>
int(1)
[2]=>
int(1)
}
}
--
Regards,
Mike
Thanks for the explanation, but this is nothing that runs into issues
because:
- In case 1 the result number will be a string "1" that is equal to
another string "1". Only if "1" is compared to something like
"01", " 1", "0x1" will the result no longer be the same.
- In case 2 the DB layer already returns an integer, and on comparing
this to a string the string will be converted to an integer, too;
nothing is changed here by the RFC.
To make Johannes' example fail, the DB layer has to return the
integer as a string that is not formatted in the standard human way.
Marc
On Mon, Aug 18, 2014 at 11:30 PM, Johannes Schlüter <johannes@schlueters.de>
wrote:
> Meaning that simple code you find everywhere, in every second tutorial
> foreach ($db->query("SELECT id, title FROM entries") as $row) {
>     echo "<tr><td";
>     if ($row[0] == $_GET['highlight_id']) {
>         echo " background='#ff0000'";
>     }
>     echo ">".htmlentities($row[1])."</td></tr>";
> }
> will suddenly fail. How wonderful! (irony)
Not necessarily and certainly not by definition; reasons for failure are
less obvious such as (but not limited to):
"0" == "0.0"
"11" == " 11" (but note that "11" == "11 " currently yields false)
"0" == ""
I'm not arguing for or against this behaviour change, but I found it
necessary to clear up some apparent confusion as to what repercussions this
proposal carries.
Another approach of attempting to solve the common issue of comparing big
numbers with '==' is to only enforce string-wise comparison if a number
cast would cause precision loss.
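As a rough illustration of that alternative (not part of the RFC or its patch), the sketch below falls back to a byte-wise comparison only when a digits-only string does not survive a round trip through int. Both function names are hypothetical, the check deliberately ignores floats and signs, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: true if a digits-only string changes when cast to int
// and back, i.e. the number does not fit into a native integer.
function loses_int_precision(string $s): bool
{
    if (!ctype_digit($s)) {
        return false; // this sketch only considers plain digit strings
    }
    $normalized = ltrim($s, '0'); // ignore leading zeros for the round trip
    if ($normalized === '') {
        $normalized = '0';
    }
    return (string)(int)$s !== $normalized;
}

// Hypothetical helper: loose comparison, unless a cast would lose precision.
function hybrid_equals(string $a, string $b): bool
{
    if (loses_int_precision($a) || loses_int_precision($b)) {
        return $a === $b; // byte-wise comparison for numbers that are too big
    }
    return $a == $b; // existing type-juggling comparison otherwise
}

var_dump(hybrid_equals("007", "7"));
var_dump(hybrid_equals("99999999999999999999998", "99999999999999999999999"));
```

Small numbers keep the familiar juggling behaviour, while the two 23-digit values are compared byte by byte and therefore reported as different.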
--
Tjerk
That's a good point, too
On Tue, Aug 19, 2014 at 11:07 AM, Tjerk Meesters <tjerk.meesters@gmail.com>
wrote:
> "0" == "0.0"
> "11" == " 11" (but note that "11" == "11 " currently yields false)
> "0" == ""
I had mistakenly assumed that "0" == "" would currently yield true, but it
doesn't. My apologies for that. The other two statements are still valid,
though. So are these:
"0" == "0x0"
"0" == "00"
--
Tjerk
This seems to me a major breakage, and not a necessary one, as strict
comparison (===) or strcmp() are available.
Julien Pauli
I might as well point out that I am also not in favour of this. In PHP, we usually do the Perl-like thing and pretend ints, floats and numeric strings are the same. This RFC reduces the extent to which we do that, meaning PHP would be less consistent. For that reason, I cannot support it.
--
Andrea Faulds
http://ajf.me/