Hi all,
hex2bin raises E_WARNING
for odd length hex strings from PHP 5.4.1.
http://jp2.php.net/manual/en/function.hex2bin.php
However, just returning FALSE
is easier for programmers.
Current behavior requires additional length check for bad hex to prevent
E_WARNING.
$str = 'abZ';
if (strlen($str)%2) {
// BAD hex - invalid length
} else if (hex2bin($str) {
// BAD hex - invalid char
}
Instead, it would better that hex2bin()
just returns FALSE
for invalid hex,
IMHO.
$str = 'abZ';
if (strlen($str)%2) {
// BAD hex - invalid length or char
}
I can understand current behavior as BC. However, it would better cleanup
bad spec sooner or later.
PHP 5.4: Raise E_ERROR
and return FALSE
for invalid length.
PHP 5.5: Just return FALSE.
or
PHP 5.4: Keep current behavior
PHP 5.4: Riase E_ERROR
and return FALSE
for invalid length.
PHP 5.5: Remove E_ERROR.
Any ideas?
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Typo
2013/6/27 Yasuo Ohgaki yohgaki@ohgaki.net
PHP 5.4: Keep current behavior
PHP 5.4: RiaseE_ERROR
and returnFALSE
for invalid length.
PHP 5.5: RemoveE_ERROR
PHP 5.4: Keep current behavior
PHP 5.5: Riase E_ERROR
and return FALSE
for invalid length.
PHP 5.6: Remove E_ERROR
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi all,
hex2bin raises
E_WARNING
for odd length hex strings from PHP 5.4.1.
http://jp2.php.net/manual/en/function.hex2bin.php
However, just returningFALSE
is easier for programmers.
Why is it easier? If you pass an odd length string to hex2bin you have
malformed input, which is usually a bug on the programmers side. Not having
a warning would make the issue harder to debug.
Nikita
Hi Nikita,
2013/6/27 Nikita Popov nikita.ppv@gmail.com
Why is it easier? If you pass an odd length string to hex2bin you have
malformed input, which is usually a bug on the programmers side. Not having
a warning would make the issue harder to debug.
It's good to have uniformed error handling.
Therefore, E_WARNING
for 'bad chars' is alternative.
hex2bin('abcZ'); // E_WARNING
for invalid char
hex2bin('abc'); // E_WARNING
for invalid length
(Users can handle errors via custom error handler)
or
hex2bin('abcZ'); // return FALSE
since there is invalid char
hex2bin('abc'); // return FALSE
since it has invalid length
(Users should catch errors via their validation/error handling code)
I think either raising error or returning false is OK, but it's not good
have(mix) both for a function. It's not mandatory, though.
It would be nice if there is error/exception handling guideline for
module authors for uniform error/exception handling. IMHO.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi
2013/6/26 Yasuo Ohgaki yohgaki@ohgaki.net:
PHP 5.4: Keep current behavior
PHP 5.4: RiaseE_ERROR
and returnFALSE
for invalid length.
PHP 5.5: Remove E_ERROR.
Never ever will we raise an E_ERROR. E_ERROR
means that we left the
Engine in a state from which we cannot recover from, standard library
function should never prevent execution of a script as this will
introduce inconsistency, if we leave the Engine in a 'recoverable'
state, then we should use E_RECOVERABLE_ERROR, but this is far from
that. We already fixed a few of those in the past, I believe when 5.3
was shipped, we had pretty much killed all of those along when the new
parameter parsing API was being standardized.
I agree that this E_WARNING
is sort of serve, and we should just
return false for invalid values, as we don't need to be THAT detailed,
maybe an E_NOTICE
could work better here even in case something fails
and the developer must know WHY this happened, I believe we do this in
a few places around the core and its extensions.
--
regards,
Kalle Sommer Nielsen
kalle@php.net
2013/6/27 Kalle Sommer Nielsen kalle@php.net
2013/6/26 Yasuo Ohgaki yohgaki@ohgaki.net:
PHP 5.4: Keep current behavior
PHP 5.4: RiaseE_ERROR
and returnFALSE
for invalid length.
PHP 5.5: Remove E_ERROR.Never ever will we raise an E_ERROR.
E_ERROR
means that we left the
Oops.
I meant E_WARNING
which is current hex2bin()
raising for invalid length.
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi Kalle,
Thank you for notifying E_ERROR/E_WARNING mistake.
2013/6/27 Kalle Sommer Nielsen kalle@php.net
We already fixed a few of those in the past, I believe when 5.3
was shipped, we had pretty much killed all of those along when the new
parameter parsing API was being standardized.
hex2bin()
is PHP 5.4 function and the behavior was change by PHP 5.4.1
so that raise E_WARNING
for invalid length.
I agree that this
E_WARNING
is sort of serve, and we should just
return false for invalid values
I prefer returning FALSE. Raising E_WARING/E_NOTICE for all invalid hex
would work, too.
maybe an
E_NOTICE
could work better here even in case something fails
and the developer must know WHY this happened, I believe we do this in
a few places around the core and its extensions.
Yes, there are.
We would be better to have error/exception handling guideline
for consistent behavior for future releases.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi all,
I've sent pull request for PHP-5.5 branch.
https://github.com/php/php-src/pull/369
It's simple 1 liner removes E_WARNING
for invalid length.
Are there any objections?
If not I'll merge it to PHP-5.5 and master, then update docs.
--
Yasuo Ohgaki
yohgaki@ohgaki.net
2013/6/27 Yasuo Ohgaki yohgaki@ohgaki.net
Hi all,
hex2bin raises
E_WARNING
for odd length hex strings from PHP 5.4.1.
http://jp2.php.net/manual/en/function.hex2bin.php
However, just returningFALSE
is easier for programmers.Current behavior requires additional length check for bad hex to prevent
E_WARNING.$str = 'abZ';
if (strlen($str)%2) {
// BAD hex - invalid length
} else if (hex2bin($str) {
// BAD hex - invalid char
}Instead, it would better that
hex2bin()
just returnsFALSE
for invalid
hex, IMHO.$str = 'abZ';
if (strlen($str)%2) {
// BAD hex - invalid length or char
}I can understand current behavior as BC. However, it would better cleanup
bad spec sooner or later.PHP 5.4: Raise
E_ERROR
and returnFALSE
for invalid length.
PHP 5.5: Just return FALSE.or
PHP 5.4: Keep current behavior
PHP 5.4: RiaseE_ERROR
and returnFALSE
for invalid length.
PHP 5.5: Remove E_ERROR.Any ideas?
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi all,
I've sent pull request for PHP-5.5 branch.
https://github.com/php/php-src/pull/369It's simple 1 liner removes
E_WARNING
for invalid length.Are there any objections?
Yes, I object to removing the error in the same breath we're arguing for
consistency. 5.4.0 through 5.4.3 (no error), 5.4.4 through 5.4.16 (error),
5.5.0 (error), 5.5.1 (no error)??? What kind of consistency is this?
I thought you wanted to add an extra error for malformed hex, which I would
have been fine with, but removing the error entirely? The error is useful.
It informs the user that they may have buggy code since the function is
clearly documented to expect even length hex encoded strings.
If not I'll merge it to PHP-5.5 and master, then update docs.
Hi Sherif,
I would like to have consistent behavior at least within a function.
2013/6/27 Sherif Ramadan theanomaly.is@gmail.com
I thought you wanted to add an extra error for malformed hex, which I
would have been fine with, but removing the error entirely? The error is
useful. It informs the user that they may have buggy code since the
function is clearly documented to expect even length hex encoded strings.
It is good to have additional errors for invalid inputs. It would be
beneficial for many users.
I've checked some conversion(decoder) functions quickly.
Functions raise E_WARNING
/ E_NOTICE
for invalid inputs
unserialize()
convert_uudecode()
xmlreader module
Functions simply return FALSE
for invalid inputs
base64_decode()
pg_unescape_bytea()
mb_decode_mimiheader()
mb_decode_numericentity()
mb_convert_kana()
mb_convert_encoding()
mb_convert_variables()
Note: mbstring functions raise errors for invalid encoding, otherwise
simply return FALSE.
Functions do not check validity
quoted_printable_decode()
xml_utf8_decode() - replaces bad chars to '?'
Functions have separate error message function
json_decode()
- returns FALSE, but can get errors via json_last_error()
Decoding errors are usually a bug or some kind of attack, so I agree to add
E_NOTICEs. Exception is decoders that supposed to accept external inputs.
e.g. base64_decode()
and mbstring functions.
I think pg_unescape_bytea() should raise E_WARNING, so I'll add it later.
Adding E_WARNING
to it will never be BC issue. It's obvious bug.
hex2bin()
will not be used for handling external inputs almost always, so
raising E_WARNING
make sense.
I've updated pull request. (Added E_WARNING
for bad hex)
https://github.com/php/php-src/pull/369
Everyone is OK with this?
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Sent from my iPhone
Hi Sherif,
I would like to have consistent behavior at least within a function.
2013/6/27 Sherif Ramadan theanomaly.is@gmail.com
I thought you wanted to add an extra error for malformed hex, which I
would have been fine with, but removing the error entirely? The error is
useful. It informs the user that they may have buggy code since the
function is clearly documented to expect even length hex encoded strings.It is good to have additional errors for invalid inputs. It would be
beneficial for many users.
I've checked some conversion(decoder) functions quickly.Functions raise
E_WARNING
/E_NOTICE
for invalid inputs
unserialize()
convert_uudecode()
xmlreader moduleFunctions simply return
FALSE
for invalid inputs
base64_decode()
pg_unescape_bytea()
mb_decode_mimiheader()
mb_decode_numericentity()
mb_convert_kana()
mb_convert_encoding()
mb_convert_variables()
Note: mbstring functions raise errors for invalid encoding, otherwise
simply return FALSE.Functions do not check validity
quoted_printable_decode()
xml_utf8_decode() - replaces bad chars to '?'Functions have separate error message function
json_decode()
- returns FALSE, but can get errors viajson_last_error()
Decoding errors are usually a bug or some kind of attack, so I agree to add
E_NOTICEs. Exception is decoders that supposed to accept external inputs.
e.g.base64_decode()
and mbstring functions.I think pg_unescape_bytea() should raise E_WARNING, so I'll add it later.
AddingE_WARNING
to it will never be BC issue. It's obvious bug.
hex2bin()
will not be used for handling external inputs almost always, so
raisingE_WARNING
make sense.I've updated pull request. (Added
E_WARNING
for bad hex)https://github.com/php/php-src/pull/369
Everyone is OK with this?
The thread started with the assertion that it raises a warning and the commits first remove the warning and then adds it again later, so isn't the whole PR a noop? :)
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi Tjerk,
2013/6/27 Tjerk Meesters tjerk.meesters@gmail.com
The thread started with the assertion that it raises a warning and the
commits first remove the warning and then adds it again later, so isn't the
whole PR a noop? :)
The issue is inconsistent behavior of hex2bin against invalid inputs.
Both removing and adding E_WARNING
provide consistency.
Adding E_WARNING
is better than removing as a result discussion, IMO.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.netHi
Hi Tjerk,
2013/6/27 Tjerk Meesters tjerk.meesters@gmail.com
The thread started with the assertion that it raises a warning and the
commits first remove the warning and then adds it again later, so isn't the
whole PR a noop? :)The issue is inconsistent behavior of hex2bin against invalid inputs.
Both removing and addingE_WARNING
provide consistency.
I didn't look close enough apparently :)
Adding
E_WARNING
is better than removing as a result discussion, IMO.
Given the sizeable number of functions that don't raise warnings, should
this behaviour then be extended to those as well, e.g. base64_decode()
,
mb_*()?
Of course, doing so puts the onus on the developer to validate their inputs
first to prevent warnings, but personally I feel the gauge on likelihood to
user input exposure conflicts with consistency concerns.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.netHi
--
Tjerk
2013/6/28 Tjerk Meesters tjerk.meesters@gmail.com
Adding
E_WARNING
is better than removing as a result discussion, IMO.Given the sizeable number of functions that don't raise warnings, should
this behaviour then be extended to those as well, e.g.base64_decode()
,
mb_*()?Of course, doing so puts the onus on the developer to validate their
inputs first to prevent warnings, but personally I feel the gauge on
likelihood to user input exposure conflicts with consistency concerns.
Not like function deals with internal data, function that is supposed to
deal with external inputs is better if it returns FALSE
without error.
Since we don't want to handle attack/invalid inputs as errors always. One
may treat invalid data as user mistake.
It depends of use case of the specific function, if triggering error is
better or not.
Anyway, mixing them up for a function should not be done, IMHO.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi all,
I've sent pull request for PHP-5.5 branch.
https://github.com/php/php-src/pull/369It's simple 1 liner removes
E_WARNING
for invalid length.Are there any objections?
Yes, I object to removing the error in the same breath we're arguing for
consistency. 5.4.0 through 5.4.3 (no error), 5.4.4 through 5.4.16 (error),
5.5.0 (error), 5.5.1 (no error)??? What kind of consistency is this?I thought you wanted to add an extra error for malformed hex, which I would
have been fine with, but removing the error entirely? The error is useful.
It informs the user that they may have buggy code since the function is
clearly documented to expect even length hex encoded strings.
In my opinion generating E_WARNING
is too severe. Passing improper
string to hex2bin does not necessary mean "buggy code". If the string is
eg. from external input, why should I check if it's a well formed hex
string if the function does this anyway? I think E_NOTICE
+ returning
false is enough.
And about consistency... I don't know the rationale behind adding
E_ERROR
in recent versions, but if the situation is bad, it should be
fixed, not proliferated to next versions. Yes, it'd better if the change
was better thought, but the milk has spilled.
--Leszek
On Fri, Jun 28, 2013 at 5:21 AM, Leszek Krupiński leafnode@gmail.comwrote:
Hi all,
I've sent pull request for PHP-5.5 branch.
https://github.com/php/php-**src/pull/369https://github.com/php/php-src/pull/369It's simple 1 liner removes
E_WARNING
for invalid length.Are there any objections?
Yes, I object to removing the error in the same breath we're arguing for
consistency. 5.4.0 through 5.4.3 (no error), 5.4.4 through 5.4.16 (error),
5.5.0 (error), 5.5.1 (no error)??? What kind of consistency is this?I thought you wanted to add an extra error for malformed hex, which I
would
have been fine with, but removing the error entirely? The error is useful.
It informs the user that they may have buggy code since the function is
clearly documented to expect even length hex encoded strings.In my opinion generating
E_WARNING
is too severe. Passing improper string
to hex2bin does not necessary mean "buggy code". If the string is eg. from
external input, why should I check if it's a well formed hex string if the
function does this anyway? I thinkE_NOTICE
+ returning false is enough.And about consistency... I don't know the rationale behind adding
E_ERROR
in recent versions, but if the situation is bad, it should be fixed, not
proliferated to next versions. Yes, it'd better if the change was better
thought, but the milk has spilled.
To be fair, the only reason we ended up adding an E_WARNING
in the first
place is because someone raised a concern that they had the wrong
expectation about what the function does. As far as I can tell, there is no
reason that the user would believe hex2bin (a pretty descriptive function
name) should accept anything other than proper hex notation. Meaning
that returning false on failure is fair when the string is not valid
hexadecimal encoding.
The issue of having odd or even length hexits, however, is a different
matter entirely. People (myself included) misunderstood what the function
does. The misconception being that this function took a hexadecimal
number and converted it to a binary number, which is not what it
does. In such a case it is perfectly acceptable to have something like "f"
translate to a valid ASCII character, for example. As is with the case of
pack("h1". "f"), or ord(base_convert('f', 16, 10))...
This is a fruitless endeavor to complain about consistency in PHP where we
are the ones actually creating unnecessary inconsistencies that can't be
justified from an objective point-of-view, in my own personal opinion.
Hi Sherif,
The problem is that current code 'returns false for invalid hex' while it
'returns false and E_WARNING
for invalid length'.
We may choose behavior for invalid inputs
- return false always
- return false and raise
E_WARNING
always
Mixing them up is not good, especially in the same function. IMHO. PHP has
enough inconsistency. We should try to remove inconsistency much as
possible. Besides the change will not break code as long as it is used as
it should be.
Considering the use case of hex2bin()
, it will not handle external inputs
in most cases. Therefore, it's better to raise errors invalid inputs. Do
you prefer not to raise errors?
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi Sherif,
The problem is that current code 'returns false for invalid hex' while it
'returns false andE_WARNING
for invalid length'.We may choose behavior for invalid inputs
- return false always
- return false and raise
E_WARNING
alwaysMixing them up is not good, especially in the same function. IMHO. PHP has
enough inconsistency. We should try to remove inconsistency much as
possible. Besides the change will not break code as long as it is used as
it should be.
If we tried to fix all of PHP's functions to either return false always or
return false and raise errors always we'd have to fix every single PHP
function there is since there's absolutely no consistency there as it is.
Different functions handle failure differently, and not all functions
report every possible error they may encounter. Take functions like
file_get_contents, for example, where it wraps other API functions that
could RETURN_FAILURE and/or trigger the error handler, and
file_get_contents isn't even documented to raise those errors in some cases.
I see these as two separate problems. The first problem is clearly a design
problem (but we can't really do much about that in the short term). The
second is a documentation problem. Beyond that there are users that don't
clearly understand the behavior of some functions due to the previous two
reasons.
Considering the use case of
hex2bin()
, it will not handle external inputs
in most cases. Therefore, it's better to raise errors invalid inputs. Do
you prefer not to raise errors?
Considering the fact that we have a discussion in the bug report mentioned
earlier (and it was an extensive discussion because not only were our users
confused but even our core developers were confused if you read the
responses in that bug report), I would prefer that we leave the function
the way it is.
The error was added for a very specific reason. People thought it was doing
what pack('h*', $hex) does, and clearly it wasn't. So the documentation was
also updated to make note of this and explain it to the user, along with
the helpful error message to indicate that hex2bin('f') would not work. I
don't think there was ever any expectation for the user to do something
like hex2bin('Hello World') and expect any kind of sensible result. The
error really does not inform them of anything useful.
</my $0.000002>
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Hi Sherif,
2013/6/29 Sherif Ramadan theanomaly.is@gmail.com
If we tried to fix all of PHP's functions to either return false always or
return false and raise errors always we'd have to fix every single PHP
function there is since there's absolutely no consistency there as it is.
We don't have to change them all. Whether function should raise error or
just return false determined by use case. For instance, raising error in
base64_decode()
would be annoying.
Different functions handle failure differently, and not all functions
report every possible error they may encounter. Take functions like
file_get_contents, for example, where it wraps other API functions that
could RETURN_FAILURE and/or trigger the error handler, and
file_get_contents isn't even documented to raise those errors in some cases.I see these as two separate problems. The first problem is clearly a
design problem (but we can't really do much about that in the short term).
The second is a documentation problem. Beyond that there are users that
don't clearly understand the behavior of some functions due to the previous
two reasons.
Third problem. There should be error/exception handling guideline for
developers.
Considering the use case of
hex2bin()
, it will not handle external inputs
in most cases. Therefore, it's better to raise errors invalid inputs. Do
you prefer not to raise errors?Considering the fact that we have a discussion in the bug report mentioned
earlier (and it was an extensive discussion because not only were our users
confused but even our core developers were confused if you read the
responses in that bug report), I would prefer that we leave the function
the way it is.The error was added for a very specific reason. People thought it was
doing what pack('h*', $hex) does, and clearly it wasn't. So the
documentation was also updated to make note of this and explain it to the
user, along with the helpful error message to indicate that hex2bin('f')
would not work.
I respect the discussion. However, it does not matter to new comers.
Raising E_WARING for bad length while simply return false for bad hex does
not make sense.
I don't think there was ever any expectation for the user to do something
like hex2bin('Hello World') and expect any kind of sensible result. The
error really does not inform them of anything useful.
There is different view for this. Supplying invalid string to hex2bin()
is
obvious bug or attack. It would worth to notify users there is obvious
problem.
I didn't want to take vote for such small change. I hope you've convinced.
Regards,
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Sherif Ramadan wrote:
I see these as two separate problems. The first problem is clearly a design
problem (but we can't really do much about that in the short term).
One thing that came to mind when this first popped up was "Why isn't the string
automatically zero padded?" Check my own uses I see that I've always had an even
number of characters but that seems to be more by luck than design ...
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Hi Sherif,
The problem is that current code 'returns false for invalid hex' while it
'returns false andE_WARNING
for invalid length'.We may choose behavior for invalid inputs
- return false always
- return false and raise
E_WARNING
always
For the record, I'm okay with having E_WARNING
in both cases.
Nikita