Dear list,
I recently ran into big problems with crc32() and ip2long() both of which I 
was using in the same codebase.
I know these issues have been debated at length in the past, but this 
really needs to be fixed.
Anytime you persist these values (to any external medium, files or 
databases) you're sitting on a time bomb.
I realize some of you have countless technical arguments why these 
functions "work as they're supposed to", but php is a high-level language, 
and these functions do not work consistently across platforms.
It can't be the developer's responsibility to write unit-tests for 
return-values on internal functions - nor should we need to write elaborate 
wrapper-functions for these functions to get them to work consistently.
There are dozens (if not nearing 100) different user-land solutions to this 
problem, so it's not like the need isn't there - anyone who has ever used 
these functions probably needed a work-around. The need for an enormous red 
WARNING label, and elaborate explanation on the crc32() documentation page 
says it all - nothing this simple, that has been standardized for this 
long, should require an elaborate explanation, complicated work-arounds or 
for that matter a lot of thought on the developer's side.
Since a signed 32-bit integer value is the lowest common denominator, 
that's what the functions ought to return, so that at least the return 
value is consistent across platforms, and you can decide (for example) 
whether to persist it to a signed or unsigned INT in a database, and expect 
it to work the same everywhere. (Databases at large, and at least MySQL, 
correctly persist either signed or unsigned INT values across platforms.)
The simplest work-around I have been able to come up with so far, is this:
var_dump(unpack('l', pack('l', ip2long('255.255.255.0'))));
var_dump(unpack('l', pack('l', crc32('123456789_00_0'))));
Forcing the value into a 32-bit integer (which is smaller than the native 
integer on some platforms), and then unpacking it, provides a consistent 
value on 32-bit and 64-bit systems, and on Windows.
Of course there is backwards compatibility to consider for this broken 
behavior, so I propose the simplest solution is to just add a new pair of 
replacement functions. You don't need to deprecate the existing functions, 
because they work as prescribed, however useless they may be for any 
practical applications.
The new functions and backwards compatible implementations for older 
versions of php might look like this:
/**
 * @param string
 * @return int a signed (32-bit) integer value
 */
function ip2int($ip_string) {
    return unpack('l', pack('l', ip2long($ip_string)));
}

/**
 * @param int a signed integer value
 * @return string
 */
function int2ip($ip_int) {
    return long2ip($ip_int);
}

/**
 * @param string
 * @return int a signed integer value
 */
function crc32i($string) {
    return unpack('l', pack('l', crc32($string)));
}
int2ip() would just be an alias for long2ip().
I spent almost a full day fighting with these functions and testing 
work-arounds, and I bet every php developer who encounters a need for one 
of these functions will some day sooner or later go through the same.
Userland solutions are not solutions to fundamental problems that affect 
everyone who uses the functions.
Arguing that this behavior is "correct" by some technical definition, is 
futile - the behavior is problematic for practical reasons, so technical 
circumstances don't really matter here. Core functions need to actually 
work consistently and predictably for as many users as possible - 
optimizing for C developers and people with deep technical knowledge of 
operating system and compiler specifics does not make sense for a language 
like php.
Please look for reasons to agree rather than disagree out of spite.
As said, I know this has been debated at length in the past, and always 
with the same outcome - but the simple fact is that these functions don't 
work for the end-users, and they do not provide proper cross-platform 
support.
No one cares how integers work internally, in C, in the CPU, or in the VM, 
and it's not relevant.
There is no need to put anyone through all this unnecessary hardship.
These functions need to work for php developers.
- Rasmus Schultz
 
No replies probably means no one cares. oh well.
For the record, the examples I posted are wrong - the correct way to 
convert the long values consistently appears to be:
list($v) = array_values(unpack('l', pack('l', ip2long('255.255.255.0'))));
I missed the fact that unpack() returns an array, and for some reason 
its return value is 1-based, with apparently no way to change 
the key to 0.
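For illustration, the corrected conversion can be wrapped in a small helper that returns a plain integer. This is only a sketch: to_int32() is a made-up name, not a PHP built-in, and it assumes pack('l') keeps just the low 32 bits of the value it is given.

```php
<?php
// Hypothetical helper (not a PHP built-in): reduce an integer to its
// signed 32-bit interpretation via a pack()/unpack() round trip.
function to_int32($value)
{
    // pack('l') keeps only the low 32 bits; unpack('l') reads them back
    // as a signed 32-bit integer in machine byte order.
    $unpacked = unpack('l', pack('l', $value));

    // unpack() returns an array; the first unnamed element has key 1.
    return $unpacked[1];
}

var_dump(to_int32(ip2long('255.255.255.0'))); // int(-256)
var_dump(to_int32(crc32('123456789')));       // int(-873187034)
```

Reading key 1 directly avoids the list()/array_values() step, at the cost of relying on how unpack() numbers unnamed elements.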
As an aside, the information I posted on the manual pages in the comments 
is wrong, and the site currently offers no way to edit or remove a 
comment... dangit...
I had spotted the error, but didn't want to reply because I don't 
really understand what you are getting at.
The core issue is that PHP doesn't provide a 32-bit unsigned integer 
on a 32-bit platform ... and/or that the size of integer changes 
depending on the platform. But I doubt that is going to change any 
time soon. Crippling 64-bit systems due to old, legacy 32-bit 
platforms is shortsighted.
What's wrong with the manual's approach?
$checksum = sprintf("%u", crc32("The quick brown fox jumped over the lazy dog."));
Are you going to do further mathematical operations on it? You can 
take that string and stuff it into a uint32 field in a db without 
an issue.
At the end of the day, there's no getting around that PHP programmers 
need to be aware of the difference between 32-bit and 64-bit systems 
... it affects far more than these two particular functions.
But if these two functions are particularly bothersome, a better "fix" 
IMO is just:
$crc = crc32("12341234", CRC32_UINT32_STRING);
Where the second parameter is CRC32_INT (default & current behavior), 
CRC32_INT32 (always negative if high bit is set), CRC32_UINT32_STRING, or 
CRC32_HEX_STRING.
Forgive the poor names.
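A userland approximation of that idea might look like the sketch below. Everything in it is hypothetical: crc32() takes no second parameter today, so the function crc32_format() and the CRC32_FMT_* constants are stand-ins invented for illustration.

```php
<?php
// Hypothetical format flags; none of these constants exist in PHP.
const CRC32_FMT_INT = 0;           // whatever crc32() returns natively
const CRC32_FMT_INT32 = 1;         // signed 32-bit integer
const CRC32_FMT_UINT32_STRING = 2; // unsigned decimal string
const CRC32_FMT_HEX_STRING = 3;    // 8-digit lowercase hex string

// Hypothetical wrapper around crc32() that selects the output format.
function crc32_format($data, $format = CRC32_FMT_INT)
{
    $crc = crc32($data);

    switch ($format) {
        case CRC32_FMT_INT32:
            // Round trip through a packed signed 32-bit value.
            $unpacked = unpack('l', pack('l', $crc));
            return $unpacked[1];
        case CRC32_FMT_UINT32_STRING:
            return sprintf('%u', $crc);
        case CRC32_FMT_HEX_STRING:
            return sprintf('%08x', $crc);
        default:
            return $crc;
    }
}

var_dump(crc32_format('123456789', CRC32_FMT_INT32));         // int(-873187034)
var_dump(crc32_format('123456789', CRC32_FMT_UINT32_STRING)); // string(10) "3421780262"
var_dump(crc32_format('123456789', CRC32_FMT_HEX_STRING));    // string(8) "cbf43926"
```

Only the default format depends on the platform's integer size; the other three are meant to give the same result everywhere.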
-- 
Matthew Leverton
Matthew,
To give another example:
var_dump(array(sprintf('%u', ip2long('127.0.0.1')) => 'foo', sprintf('%u', ip2long('255.255.255.0')) => 'bar'));
array(2) { 
[2130706433]=> 
string(3) "foo" 
["4294967040"]=> 
string(3) "bar" 
}
The keys are now two different types (string vs int) on 32-bit platforms, 
which leads to problems with strict comparison.
Another issue is persistence - if you save these to unsigned integer 
columns in database, and your data access layer converts them back to 
integers on load, the values are going to change again depending on what 
platform you're on.
To quote the specific case where I encountered this, I have an audit trail 
system that logs a lot of user activity - as an optimization, I hash and 
store certain keys using crc32() numeric values, since storing a 32-bit 
number is much cheaper (in terms of storage) than storing strings, as well 
as giving much faster queries.
I encountered all of the problems mentioned while debugging an error in 
this system, and it seems completely bonkers to have to spend this much 
time on something that ought to be totally trivial, and would have been, if 
these functions returned consistent results on all platforms.
Why would you consider a consistent function "crippled"?
Look at the sheer number of comments on the crc32 manual page and tell me 
if you still think this works well for anybody... many of the comments 
(including mine!) aren't even correct and don't lead to predictable 
results...
This should be easy but it is extremely hard.
In my opinion, the function is "crippled" as it is...
array(2) {
[2130706433]=>
string(3) "foo"
["4294967040"]=>
string(3) "bar"
}
The keys are now two different types (string vs int) on 32-bit platforms,
which leads to problems with strict comparison.
Prefix your keys with any alpha char to enforce a consistent string type.
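A minimal sketch of that suggestion, assuming an arbitrary 'ip' prefix chosen only for this example: once a key contains a non-numeric character, PHP no longer converts it to an integer, so every key stays a string regardless of integer size.

```php
<?php
$labels = array();
$addresses = array('127.0.0.1' => 'foo', '255.255.255.0' => 'bar');

foreach ($addresses as $ip => $label) {
    // The 'ip' prefix is an arbitrary choice for illustration; any
    // alphabetic prefix keeps PHP from casting the key to an integer.
    $labels['ip' . sprintf('%u', ip2long($ip))] = $label;
}

var_dump(array_keys($labels));
```

The trade-off is that the prefix has to be stripped again before the number can be used as a number.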
Another issue is persistence - if you save these to unsigned integer columns
in database, and your data access layer converts them back to integers on
load, the values are going to change again depending on what platform you're
on.
I've never had that issue with any database layer I've ever worked 
with. Not saying that no such db layer exists, but if you use PHP with 
big integers, then you should treat them as strings in and out. If you 
want to do math on them, then you have to use one of the bigint 
libraries. If you want them for display purposes, keep them as 
strings.
To quote the specific case where I encountered this, I have an audit trail
system that logs a lot of user activity - as an optimization, I hash and
store certain keys using crc32() numeric values, since storing a 32-bit
number is much cheaper (in terms of storage) than storing strings, as well
as giving much faster queries.
Again, you should be able to store them from a string int to database 
int without any issues. If so, I'd suggest fixing this at the database 
access layer.
Why would you consider a consistent function "crippled"?
It's crippled in the sense that it prevents people who are using 
modern hardware from intelligently processing the return value. I have 
a 64-bit system, and it returns a negative number for a 32-bit CRC?
Look at the sheer number of comments on the crc32 manual page and tell me if
you still think this works well for anybody... many of the comments
(including mine!) aren't even correct and don't lead to predictable
results...
A lot of PHP functions have tons of user error from people who haven't 
bothered to read the existing manual entry or other comments.
This should be easy but it is extremely hard.
There should never be the expectation that things "just work" without 
having to understand the core features and limitations of the 
language. To me, the examples you are giving are just two cases of the 
larger problem of 32 vs 64 bit. There really is no getting around the 
fact that scripts with integers behave differently depending on the 
system.
So while I don't mean to sound dismissive of your complaints (because 
they are valid), I just don't see how two bandaids over specific 
instances of a larger problem do much good. Although, to be pragmatic, 
I offered what I feel is a better solution than your extra functions.
-- 
Matthew Leverton
Matthew,
Yes, all of these problems can be solved - I am well aware of that. I am 
also painfully aware of how much time it can take to solve them reliably.
I just would like to see a solution rather than a bunch of work-arounds - 
not for my own sake, my problem is solved, but for the sake of every poor 
fool who's going to fall into these traps.
That's all.
Hi!
Yes, all of these problems can be solved - I am well aware of that. I am
also painfully aware of how much time it can take to solve them reliably.
I just would like to see a solution rather than a bunch of work-arounds -
not for my own sake, my problem is solved, but for the sake of every poor
fool who's going to fall into these traps.
Sorry if I missed it in the thread - so what is your proposal? Is there 
a patch/RFC/something else for it?
-- 
Stanislav Malyshev, Software Architect 
SugarCRM: http://www.sugarcrm.com/ 
(408)454-6900 ext. 227
No, just this thread of e-mails.
What I'm suggesting is simply a set of alternative functions to ip2long() 
and crc32() that return consistent values on all platforms, e.g. 32-bit 
signed integer values - a couple of new functions and a couple of quick 
updates to the documentation explaining why you might want to use them, 
that's all.
Hi!
What I'm suggesting is simply a set of alternative functions to
ip2long() and crc32() that return consistent values on all platforms,
e.g. 32-bit signed integer values - a couple of new functions and a
couple of quick updates to the documentation explaining why you might
want to use them, that's all.
Why 32-bit signed values and not something else - like binhex or some 
other form? What are these values to be used for?
-- 
Stanislav Malyshev, Software Architect 
SugarCRM: http://www.sugarcrm.com/ 
(408)454-6900 ext. 227
If I wanted strings or something else, that would be simple enough - 
sprintf() will do the job.
In my case, I needed a scalar value that I can actually persist to the 
database.
Hi,
I ran into 32-bit problems, too, when working with >2GB files (in this 
case, raw DVD ISO images) on a 32-bit system (I couldn't find a 
reliable(!!) way to read a 4-byte absolute offset and seek to it). 
Of course, this warning is mentioned in the manpage, but really, not 
having at least a class for unsigned 32-bit ints (e.g. a Uint32 or a 
Uint64 for 64-bit platforms) which native functions like seek and 
friends coerce into a 32/64-bit uint makes PHP useless for anything 
that involves access to files or memory offsets (a problem from 
another project) > 2GB. 
From a technical point of view: would such a UInt32 class actually be 
implementable, and at what cost to BC?
Marco
The seek offset is always signed. In C you have either
int fseek (FILE *stream, long int offset, int whence)
which is signed.
Or you can use 
int fseeko (FILE *stream, off_t offset, int whence) 
or 
off_t lseek(int fd, off_t offset, int whence) 
for descriptors.
If you look at glibc, you will find these definitions: 
#define __OFF_T_TYPE        __SYSCALL_SLONG_TYPE 
__STD_TYPE __OFF_T_TYPE __off_t; 
typedef __off_t off_t;
__SYSCALL_SLONG_TYPE is a signed type as well (S Long :)).
If you need to seek beyond a 31-bit offset on a 32-bit system, then you can 
seek twice. You just need to set the whence to SEEK_CUR for the second seek...
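In PHP terms, the same trick might be sketched as below. seek_in_steps() is a hypothetical helper, and the sketch assumes the PHP build and the underlying stream can actually address positions past 2GB, which is not guaranteed on every 32-bit build. The offset may be passed as a float when it does not fit in a native integer.

```php
<?php
// Hypothetical helper: reach $offset by chaining relative seeks, each no
// larger than $step, so no single fseek() call needs a huge offset.
function seek_in_steps($handle, $offset, $step = 0x7FFFFFFF)
{
    // Start from the beginning of the file.
    if (fseek($handle, 0, SEEK_SET) !== 0) {
        return false;
    }

    while ($offset > 0) {
        // Each relative step fits in a signed 32-bit integer.
        $chunk = (int) min($offset, $step);
        if (fseek($handle, $chunk, SEEK_CUR) !== 0) {
            return false;
        }
        $offset -= $chunk;
    }

    return true;
}

// Small demonstration with a tiny step size, so it runs on any file.
$handle = tmpfile();
fwrite($handle, str_repeat('x', 20));
var_dump(seek_in_steps($handle, 10, 4)); // bool(true)
var_dump(ftell($handle));                // int(10)
fclose($handle);
```

Note that ftell() has the same integer-size limits, so reading the resulting position back is only meaningful while it still fits in a native integer.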
Jakub
Perhaps this illustrates the problem better:
$value = sprintf('%u', ip2long('255.255.255.0')); 
var_dump($value, (int) $value);
$a = array(); 
$a[$value] = 'foo'; 
var_dump($a);
Output on 64-bit:
string(10) "4294967040" 
int(4294967040) 
array(1) { 
[4294967040]=> 
string(3) "foo" 
}
No problem.
But - output on 32-bit:
string(10) "4294967040" 
int(2147483647) 
array(1) { 
["4294967040"]=> 
string(3) "foo" 
}
In this example, $value and (int) $value lead to incompatible results, and 
that's assuming your database access layer will let you store a string in an 
integer column in the first place. Even if it does, when you get the value 
back from the database and cast it to an integer, you have the same problem 
again. If you query against the database using integer values you computed, 
you have problems. And so on...
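One way to sidestep the cast, sketched here under the assumption that the value comes back from the database as an unsigned decimal string, is to fold it into the signed 32-bit range using float arithmetic, which is exact for numbers this small. uint32_string_to_int32() is a hypothetical name.

```php
<?php
// Hypothetical helper: map an unsigned 32-bit decimal string (as produced
// by sprintf('%u', ...)) to the equivalent signed 32-bit integer.
function uint32_string_to_int32($unsigned)
{
    // Floats represent every integer below 2^53 exactly, so this is
    // safe on both 32-bit and 64-bit builds.
    $number = (float) $unsigned;

    // Values with the high bit set wrap around to the negative range.
    if ($number >= 2147483648.0) {
        $number -= 4294967296.0;
    }

    return (int) $number;
}

var_dump(uint32_string_to_int32('4294967040')); // int(-256)
var_dump(uint32_string_to_int32('2130706433')); // int(2130706433)
```

The result fits in a native integer on every platform, so array keys and strict comparisons behave the same on 32-bit and 64-bit systems.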
Hi,
No replies probably means no one cares. oh well.
For the record, the examples I posted are wrong - the correct way to
convert the long values consistently appears to be:
list($v) = array_values(unpack('l', pack('l', ip2long('255.255.255.0'))));
I recognise this from a comment on one of my SO answers here: http://stackoverflow.com/questions/13419921/how-should-a-crc32-be-stored-in-mysql/13420281#13420281
Instead of using list() and array_values() you can use current() on the return value to get the first array element from unpack()'s result.
echo current(unpack('l', pack('l', crc32(...)))); 
Admittedly only makes it marginally nicer to look at :)