Dear list,
I recently ran into big problems with crc32() and ip2long(), both of which I
was using in the same codebase.
I know these issues have been debated at length in the past, but this
really needs to be fixed.
Anytime you persist these values (to any external medium, files or
databases) you're sitting on a time bomb.
I realize some of you have countless technical arguments why these
functions "work as they're supposed to", but php is a high-level language,
and these functions do not work consistently across platforms.
It can't be the developer's responsibility to write unit-tests for
return-values on internal functions - nor should we need to write elaborate
wrapper-functions for these functions to get them to work consistently.
There are dozens (if not nearing 100) different user-land solutions to this
problem, so it's not like the need isn't there - anyone who has ever used
these functions probably needed a work-around. The need for an enormous red
WARNING label, and elaborate explanation on the crc32() documentation page
says it all - nothing this simple, that has been standardized for this
long, should require an elaborate explanation, complicated work-arounds or
for that matter a lot of thought on the developer's side.
Since a signed 32-bit integer value is the lowest common denominator,
that's what the functions ought to return, so that at least the return
value is consistent across platforms, and you can decide (for example)
whether to persist it to a signed or unsigned INT in a database, and expect
it to work the same everywhere. (Databases at large, and at least MySQL,
correctly persist either signed or unsigned INT values across platforms.)
The simplest work-around I have been able to come up with so far, is this:
var_dump(unpack('l', pack('l', ip2long('255.255.255.0'))));
var_dump(unpack('l', pack('l', crc32('123456789_00_0'))));
Forcing the value into a smaller (on some platforms) 32-bit integer, and then
unpacking it, provides a consistent value on 32-bit and 64-bit systems, and
on Windows.
Of course there is backwards compatibility to consider for this broken
behavior, so I propose the simplest solution is to just add a new pair of
replacement functions. You don't need to deprecate the existing functions,
because they work as prescribed, however useless they may be for any
practical applications.
The new functions and backwards compatible implementations for older
versions of php might look like this:
/**
 * @param string
 * @return int a signed (32-bit) integer value
 */
function ip2int($ip_string) {
    return unpack('l', pack('l', ip2long($ip_string)));
}
/**
 * @param int a signed integer value
 * @return string
 */
function int2ip($ip_int) {
    return long2ip($ip_int);
}
/**
 * @param string
 * @return int a signed integer value
 */
function crc32i($string) {
    return unpack('l', pack('l', crc32($string)));
}
int2ip() would just be an alias for long2ip().
I spent almost a full day fighting with these functions and testing
work-arounds, and I bet every php developer who encounters a need for one
of these functions will some day sooner or later go through the same.
Userland solutions are not solutions to fundamental problems that affect
everyone who uses the functions.
Arguing that this behavior is "correct" by some technical definition, is
futile - the behavior is problematic for practical reasons, so technical
circumstances don't really matter here. Core functions need to actually
work consistently and predictably for as many users as possible -
optimizing for C developers and people with deep technical knowledge of
operating system and compiler specifics does not make sense for a language
like php.
Please look for reasons to agree rather than disagree out of spite.
As said, I know this has been debated at length in the past, and always
with the same outcome - but the simple fact is that these functions don't
work for the end-users, and they do not provide proper cross-platform
support.
No one cares how integers work internally, in C, in the CPU, or in the VM,
and it's not relevant.
There is no need to put anyone through all this unnecessary hardship.
These functions need to work for php developers.
- Rasmus Schultz
No replies probably means no one cares. oh well.
For the record, the examples I posted are wrong - the correct way to
convert the long values consistently appears to be:
list($v) = array_values(unpack('l', pack('l', ip2long('255.255.255.0'))));
I missed the fact that array_values() returns an array, and for some reason
the return-value from unpack() is base-1 with apparently no way to change
the key to a 0.
As an aside, the information I posted on the manual pages in the comments
is wrong, and the site currently offers no way to edit or remove a
comment... dangit...
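For reference, a minimal sketch of the wrappers from the first message with the correction above applied. The helper name to_int32() is hypothetical, and ip2int() and crc32i() are the proposed names from this thread, not existing PHP functions. It assumes pack('l') keeps only the low 32 bits of the value, which is the behavior the work-around relies on.

```php
<?php
// Hypothetical helper: reinterpret the low 32 bits of an integer as a
// signed 32-bit value. unpack() returns an array whose first key is 1.
function to_int32($value) {
    $parts = unpack('l', pack('l', $value));
    return $parts[1];
}

// Proposed name from this thread, not a PHP built-in.
function ip2int($ip_string) {
    $long = ip2long($ip_string);
    // ip2long() returns false for an invalid address; pass that through.
    return $long === false ? false : to_int32($long);
}

// Proposed name from this thread, not a PHP built-in.
function crc32i($string) {
    return to_int32(crc32($string));
}

var_dump(ip2int('255.255.255.0')); // int(-256)
var_dump(ip2int('127.0.0.1'));     // int(2130706433)
```

Values below 2^31 come back unchanged; values from 2^31 upwards come back as the corresponding negative number, so the result should fit a signed INT column on either platform.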
No replies probably means no one cares. oh well.
For the record, the examples I posted are wrong - the correct way to
convert the long values consistently appears to be:
list($v) = array_values(unpack('l', pack('l', ip2long('255.255.255.0'))));
I had spotted the error, but didn't want to reply because I don't
really understand what you are getting at.
The core issue is that PHP doesn't provide a 32-bit unsigned integer
on a 32-bit platform ... and/or that the size of integer changes
depending on the platform. But I doubt that is going to change any
time soon. Crippling 64-bit systems due to old, legacy 32-bit
platforms is shortsighted.
What's wrong with the manual's approach?
$checksum = sprintf("%u", crc32("The quick brown fox jumped over the lazy dog."));
Are you going to do further mathematical operations on it? You can
take that string and stuff it into a uint32 field in a db without
an issue.
At the end of the day, there's no getting around that PHP programmers
need to be aware of the difference between 32-bit and 64-bit systems
... it affects far more than these two particular functions.
But if these two functions are particularly bothersome, a better "fix"
IMO is just:
$crc = crc32("12341234", CRC32_UINT32_STRING);
Where the second parameter is CRC32_INT (default & current behavior),
CRC32_INT32 (always negative if high bit is set), CRC32_UINT32_STRING,
CRC32_HEX_STRING
Forgive the poor names.
--
Matthew Leverton
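A userland sketch of the flag idea above, for illustration only. crc32() takes no second parameter in PHP; the function name crc32_format() is hypothetical, and the constants are defined here using the names suggested above. It relies on hash('crc32b') producing the same checksum as crc32(), formatted as eight hex digits.

```php
<?php
// Hypothetical mode constants, named after the suggestion in this thread.
const CRC32_INT = 0;           // whatever crc32() returns on this platform
const CRC32_INT32 = 1;         // signed 32-bit, negative if the high bit is set
const CRC32_UINT32_STRING = 2; // unsigned decimal string
const CRC32_HEX_STRING = 3;    // eight lowercase hex digits

// Hypothetical wrapper, not a PHP built-in.
function crc32_format($string, $mode = CRC32_INT) {
    $crc = crc32($string);
    switch ($mode) {
        case CRC32_INT32:
            $parts = unpack('l', pack('l', $crc)); // first key is 1
            return $parts[1];
        case CRC32_UINT32_STRING:
            return sprintf('%u', $crc);
        case CRC32_HEX_STRING:
            return hash('crc32b', $string);
        default:
            return $crc;
    }
}

var_dump(crc32_format('12341234', CRC32_UINT32_STRING));
var_dump(crc32_format('12341234', CRC32_HEX_STRING));
```

The two string modes should not depend on the platform's integer size; only the default mode keeps the platform-dependent behavior.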
Matthew,
To give another example:
var_dump(array(sprintf('%u', ip2long('127.0.0.1')) => 'foo', sprintf('%u', ip2long('255.255.255.0')) => 'bar'));
array(2) {
[2130706433]=>
string(3) "foo"
["4294967040"]=>
string(3) "bar"
}
The keys are now two different types (string vs int) on 32-bit platforms,
which leads to problems with strict comparison.
Another issue is persistence - if you save these to unsigned integer
columns in database, and your data access layer converts them back to
integers on load, the values are going to change again depending on what
platform you're on.
To quote the specific case where I encountered this, I have an audit trail
system that logs a lot of user activity - as an optimization, I hash and
store certain keys using crc32() numeric values, since storing a 32-bit
number is much cheaper (in terms of storage) than storing strings, as well
as giving much faster queries.
I encountered all of the problems mentioned while debugging an error in
this system, and it seems completely bonkers to have to spend this much
time on something that ought to be totally trivial, and would have been, if
these functions returned consistent results on all platforms.
Why would you consider a consistent function "crippled"?
Look at the sheer number of comments on the crc32 manual page and tell me
if you still think this works well for anybody... many of the comments
(including mine!) aren't even correct and don't lead to predictable
results...
This should be easy but it is extremely hard.
In my opinion, the function is "crippled" as it is...
array(2) {
[2130706433]=>
string(3) "foo"
["4294967040"]=>
string(3) "bar"
}
The keys are now two different types (string vs int) on 32-bit platforms,
which leads to problems with strict comparison.
Prefix your keys with any alpha char to enforce a consistent string type.
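A small sketch of that prefixing idea, assuming a hypothetical helper named ip_key() and an arbitrary 'ip_' prefix. Because the key no longer looks like a decimal integer, PHP should keep it as a string on both 32-bit and 64-bit builds.

```php
<?php
// Hypothetical helper: build an array key that is always a string.
// Note that ip2long() returns false for an invalid address, which
// sprintf('%u') would render as "0".
function ip_key($ip_string) {
    return 'ip_' . sprintf('%u', ip2long($ip_string));
}

$map = array(
    ip_key('127.0.0.1')     => 'foo',
    ip_key('255.255.255.0') => 'bar',
);

var_dump(array_keys($map)); // both keys are strings
```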
Another issue is persistence - if you save these to unsigned integer columns
in database, and your data access layer converts them back to integers on
load, the values are going to change again depending on what platform you're
on.
I've never had that issue with any database layer I've ever worked
with. Not saying that no such db layer exists, but if you use PHP with
big integers, then you should treat them as strings in and out. If you
want to do math on them, then you have to use one of the bigint
libraries. If you want them for display purposes, keep them as
strings.
To quote the specific case where I encountered this, I have an audit trail
system that logs a lot of user activity - as an optimization, I hash and
store certain keys using crc32() numeric values, since storing a 32-bit
number is much cheaper (in terms of storage) than storing strings, as well
as giving much faster queries.
Again, you should be able to store them from a string int to database
int without any issues. If so, I'd suggest fixing this at the database
access layer.
Why would you consider a consistent function "crippled"?
It's crippled in the sense that it prevents people who are using
modern hardware from intelligently processing the return value. I have
a 64-bit system, and it returns a negative number for a 32-bit CRC?
Look at the sheer number of comments on the crc32 manual page and tell me if
you still think this works well for anybody... many of the comments
(including mine!) aren't even correct and don't lead to predictable
results...
A lot of PHP functions have tons of user error from people who haven't
bothered to read the existing manual entry or other comments.
This should be easy but it is extremely hard.
There should never be the expectation that things "just work" without
having to understand the core features and limitations of the
language. To me, the examples you are giving are just two cases of the
larger problem of 32 vs 64 bit. There really is no getting around the
fact that scripts with integers behave differently depending on the
system.
So while I don't mean to sound dismissive of your complaints (because
they are valid), I just don't see how two bandaids over specific
instances of a larger problem do much good. Although, to be pragmatic,
I offered what I feel is a better solution than your extra functions.
--
Matthew Leverton
Matthew,
Yes, all of these problems can be solved - I am well aware of that. I am
also painfully aware of how much time it can take to solve them reliably.
I just would like to see a solution rather than a bunch of work-arounds -
not for my own sake, my problem is solved, but for the sake of every poor
fool who's going to fall into these traps.
That's all.
Hi!
Yes, all of these problems can be solved - I am well aware of that. I am
also painfully aware of how much time it can take to solve them reliably.
I just would like to see a solution rather than a bunch of work-arounds -
not for my own sake, my problem is solved, but for the sake of every poor
fool who's going to fall into these traps.
Sorry if I missed it in the thread - so what is your proposal? Is there
a patch/RFC/something else for it?
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
No, just this thread of e-mails.
What I'm suggesting is simply a set of alternative functions to ip2long()
and crc32() that return consistent values on all platforms, e.g. 32-bit
signed integer values - a couple of new functions and a couple of quick
updates to the documentation explaining why you might want to use them,
that's all.
Hi!
What I'm suggesting is simply a set of alternative functions to
ip2long() and crc32() that return consistent values on all platforms,
e.g. 32-bit signed integer values - a couple of new functions and a
couple of quick updates to the documentation explaining why you might
want to use them, that's all.
Why 32-bit signed values and not something else - like binhex or some
other form? What are these values to be used for?
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
if I wanted strings or something else, that would be simple enough -
sprintf() will do the job.
in my case, I needed a scalar value that I can actually persist to the
database.
Hi,
I ran into 32-bit problems, too, when working with >2GB files (in this
case, raw DVD ISO images) on a 32-bit system (I couldn't find a
reliable(!!) way to read a 4-byte absolute offset and seek to it).
Of course, this warning is mentioned in the manpage, but really, not
having at least a class for unsigned 32-bit ints (e.g. a Uint32 or a
Uint64 for 64-bit platforms) which native functions like seek and
friends coerce into a 32/64-bit uint makes PHP useless for anything
that involves access to files or memory offsets (a problem from
another project) > 2GB.
From a technical point of view: would such a UInt32 class actually be
implementable, and at what cost to BC?
Marco
The seek offset is always signed. In C you have either
int fseek (FILE *stream, long int offset, int whence)
which is signed.
Or you can use
int fseeko (FILE *stream, off_t offset, int whence)
or
off_t lseek(int fd, off_t offset, int whence)
for descriptors.
If you look at glibc, you will find these definitions:
#define __OFF_T_TYPE __SYSCALL_SLONG_TYPE
__STD_TYPE __OFF_T_TYPE __off_t;
typedef __off_t off_t;
__SYSCALL_SLONG_TYPE is a signed type as well (S Long :)).
If you need to seek past a 31-bit offset on a 32-bit system, you can seek
twice. You just need to set whence to SEEK_CUR for the second seek...
Jakub
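A rough PHP sketch of the "seek twice" suggestion above, generalized to any number of relative steps. The function name seek_in_steps() and the default step size are hypothetical. Whether relative seeks actually get past 2GB on a given 32-bit build depends on the platform's large-file support, so treat this as an illustration of the idea rather than a guaranteed fix.

```php
<?php
// Hypothetical helper: reach an absolute offset using relative seeks that
// each fit in a signed 32-bit integer. $offset may be an int or, on 32-bit
// builds, a float holding a whole number larger than PHP_INT_MAX.
function seek_in_steps($handle, $offset, $step = 0x40000000) {
    if (fseek($handle, 0, SEEK_SET) !== 0) {
        return false;
    }
    $remaining = $offset;
    while ($remaining > 0) {
        $chunk = (int) min($remaining, $step);
        if (fseek($handle, $chunk, SEEK_CUR) !== 0) {
            return false;
        }
        $remaining -= $chunk;
    }
    return true;
}

// Small demonstration with a temporary file and a deliberately tiny step.
$demo = tmpfile();
fwrite($demo, str_repeat('x', 100));
var_dump(seek_in_steps($demo, 70, 32)); // bool(true)
var_dump(ftell($demo));                 // int(70)
```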
Perhaps this illustrates the problem better:
$value = sprintf('%u', ip2long('255.255.255.0'));
var_dump($value, (int) $value);
$a = array();
$a[$value] = 'foo';
var_dump($a);
Output on 64-bit:
string(10) "4294967040"
int(4294967040)
array(1) {
[4294967040]=>
string(3) "foo"
}
No problem.
But - output on 32-bit:
string(10) "4294967040"
int(2147483647)
array(1) {
["4294967040"]=>
string(3) "foo"
}
In this example, $value and (int) $value lead to incompatible results -
and that's assuming your database access layer will let you store a string
in an integer column in the first place. Even if it does, you have the same
problem again when you get the value back from the database and cast it to
an integer. If you query against the database using integer values you
computed, you have problems. And so on...
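One way to sidestep the cast in the example above is to convert between the unsigned decimal string and a signed 32-bit integer explicitly. This is only a sketch; both function names are hypothetical. It goes through floats, which represent every whole number in the 32-bit range exactly, so it should behave the same on 32-bit and 64-bit builds.

```php
<?php
// Hypothetical helper: "4294967040" -> -256, without relying on (int) casts
// of out-of-range strings.
function uint32_string_to_int32($unsigned) {
    $float = (float) $unsigned;
    if ($float >= 2147483648.0) {
        $float -= 4294967296.0; // subtract 2^32 when the high bit is set
    }
    return (int) $float;
}

// Hypothetical helper: -256 -> "4294967040".
function int32_to_uint32_string($signed) {
    if ($signed >= 0) {
        return (string) $signed;
    }
    return sprintf('%.0f', $signed + 4294967296.0); // add 2^32
}

var_dump(uint32_string_to_int32('4294967040')); // int(-256)
var_dump(int32_to_uint32_string(-256));         // string(10) "4294967040"
```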
Hi,
No replies probably means no one cares. oh well.
For the record, the examples I posted are wrong - the correct way to
convert the long values consistently appears to be:
list($v) = array_values(unpack('l', pack('l', ip2long('255.255.255.0'))));
I recognise this from a comment on one of my SO answers here: http://stackoverflow.com/questions/13419921/how-should-a-crc32-be-stored-in-mysql/13420281#13420281
Instead of using list() and array_values() you can use current() on the
return value to get the first array element from unpack()'s result.
echo current(unpack('l', pack('l', crc32(...))));
Admittedly only makes it marginally nicer to look at :)