Hi Nikita,
Thanks for your feedback.
I'll fix the textual errors you mentioned.
- "To compare two numeric strings as numbers, they need to be cast to
floats." This may lose precision for integers. It is better to cast to
numbers (int or float), with the canonical way being +$x. But I guess
that won't work under strict_operators. Maybe we should have a (number)
cast (it already exists internally...)
Good point. While in most cases you know if you're working with floats or
integers, adding a way to cast to either an int or float would be nice.
Maybe preferably through a function like numberval($x) or simply number($x),
so the (type) syntax is reserved for actual types. That would be an RFC on
its own though.
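To make the idea concrete, here is a rough sketch of what such a function could look like in userland. The name numberval() is only the suggestion from above and does not exist in PHP; rejecting non-numeric input with a TypeError is my assumption, not something the RFC specifies.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): cast a numeric string to int or
 * float, whichever represents it, instead of always casting to float.
 *
 * @return int|float
 */
function numberval(string $x)
{
    // Assumption: non-numeric input is an error rather than silently 0.
    if (!is_numeric($x)) {
        throw new TypeError("Not a numeric string: " . $x);
    }

    // Adding 0 lets PHP pick int for integer strings and float otherwise.
    return $x + 0;
}

var_dump(numberval("42"));   // int(42)
var_dump(numberval("1.5"));  // float(1.5)
```

Unlike a (float) cast, this keeps integer strings as integers, so large values do not get rounded.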
- This has already been mentioned by others: Having $str1 < $str2 perform
a strcmp() style comparison under strict_operators is surprising. I think
that overall the use of lexicographical string comparisons is quite rare
and should be performed using an explicit strcmp() call. More likely than
not, writing $str1 < $str2 is a bug and should generate a TypeError. Of
course, equality comparisons like $str1 == $str2 should still work, similar
to the distinction you make for arrays.
Ok, fair. I'll change it so <,<=,>,>=,<=> comparison on a string throws a
TypeError, similar to arrays, resources, and objects.
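With that change, code would have to state which kind of comparison it means. A small sketch using only existing PHP functions (the variable names are just for illustration):

```php
<?php
$a = "10";
$b = "9";

// Numeric intent: cast explicitly, then compare as numbers.
$numericLess = (int)$a < (int)$b;   // false, since 10 > 9

// Lexicographic intent: say so with strcmp().
$lexLess = strcmp($a, $b) < 0;      // true, since "1" sorts before "9"

var_dump($numericLess); // bool(false)
var_dump($lexLess);     // bool(true)
```

The two results differ for the same operands, which is why guessing the intent of $str1 < $str2 is error-prone.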
- If I understand correctly, under this RFC "foo" == 0 will throw a
TypeError, but ["foo"] == [0] will return false. Generally the behavior of
the recursive comparison here is that it's the same as strict == but all
errors become not-equal instead. Correct? I'm not sure how I feel about
this, as it seems to introduce one more set of semantics next to the weak
==, strict == and === semantics there already are.
The syntax would be $a == $b (or $a == [0]), where $a and $b are a string
and an int in one case and both arrays in the other. In the second case, we
can't throw a TypeError as both operands are of the same type.
- I also find it somewhat odd that you can't write something like "$obj !=
null" anymore, only "$obj !== null".
To check against null, it's better to use !==. For objects (and resources)
using != null is ok, but for other types, it's currently not. For example,
[] == null gives true.
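A few cases to illustrate the difference, using plain PHP without strict_operators:

```php
<?php
// Loose comparison with null matches more than just null.
$emptyArrayLoose  = ([] == null);   // true
$emptyStringLoose = ("" == null);   // true
$zeroLoose        = (0 == null);    // true

// Identity comparison only matches null itself.
$emptyArrayStrict = ([] === null);  // false

// For objects both forms give the same answer.
$obj = new stdClass();
$objLoose  = ($obj != null);        // true
$objStrict = ($obj !== null);       // true

var_dump($emptyArrayLoose, $emptyStringLoose, $zeroLoose);
var_dump($emptyArrayStrict, $objLoose, $objStrict);
```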
- I think the "solution" to the last three points is a) only support
numbers in relational operators (<,<=,>,>=,<=>) and throw TypeErrors
otherwise (maybe modulo provisions for object overloading) and b) allow
comparing any types in == and !=, without throwing a TypeError. The
question "Are 42 and 'foobar' equal?" has a pretty clear answer: "No they
aren't", so there is no need to make this a TypeError (while the question
"Is 42 larger than 'foobar'?" has no good answer.) I believe doing
something like this would roughly match how Python 3 works. (Edit: I see
now that this is mentioned in the FAQ, but I think it would be good to
reconsider this. It would solve most of my problems with this proposal.)
Besides the argument in the FAQ, having == and != do a type check means
there are a lot more cases where the behavior silently changes, rather than
a TypeError being thrown. Currently "foobar" == 0 returns true, but this
would make it return false. The same goes for 1 == true, "0" == 0 and
"0" == false. To keep the cases where the behavior changes to a minimum,
it's better to throw TypeErrors for == and !=.
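As a quick check of three of these cases in plain PHP (no strict_operators involved; the array names are just for illustration): the loose comparisons all return true today, while a type-checking comparison would flip every one of them to false, as === does.

```php
<?php
// Mixed-type comparisons that return true with today's ==.
$loose = [
    '1 == true'    => (1 == true),
    '"0" == 0'     => ("0" == 0),
    '"0" == false' => ("0" == false),
];

// The same operands under a type-checking comparison.
$strict = [
    '1 === true'    => (1 === true),
    '"0" === 0'     => ("0" === 0),
    '"0" === false' => ("0" === false),
];

var_dump($loose);  // all true
var_dump($strict); // all false
```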
- String increment seems like a pretty niche use case, and I believe that
many people find the overflow behavior quite surprising. I think it may be
better to forbid string increment under strict_operators.
Ok
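For anyone following along, this is the overflow behavior in question, shown with plain PHP string increments:

```php
<?php
$a = "a";
$a++;          // "b"

$z = "z";
$z++;          // "aa": overflows into an extra character

$az = "Az";
$az++;         // "Ba": the carry keeps the case of each position

$zz = "Zz";
$zz++;         // "AAa"

var_dump($a, $z, $az, $zz);
```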
- A similar argument can be made for the use of &, | and ^ on strings.
While I have some personal fondness for these, in practical terms these are
rarely used and may be indicative of a bug. I think both for string
increment and string and/or/xor it may be better to expose these as
functions so their use is more explicit.
These operators make it very easy to work with binary data as strings in
PHP. In other languages you have to work with byte arrays, which is a major
pain. They're also very intuitive; "wow" & "xul" is the same as
chr(ord('w') & ord('x')) . chr(ord('o') & ord('u')) . chr(ord('w') & ord('l')).
I think these should stay.
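The equivalence is easy to verify with a short script (the variable names are just for illustration):

```php
<?php
// Bitwise AND applied directly to two strings works byte by byte.
$viaOperator = "wow" & "xul";

// The same result spelled out per character.
$viaBytes = chr(ord('w') & ord('x'))
          . chr(ord('o') & ord('u'))
          . chr(ord('w') & ord('l'));

var_dump($viaOperator);               // string(3) "ped"
var_dump($viaOperator === $viaBytes); // bool(true)
```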
Regards,
Nikita
Arnold