signed long hash index for PHP7?

10 years ago by Andrea Faulds

> Instead of doing that, why not simply disallow negative array indices to be
> used as integers?
>
> In other words, negative indices are treated as if you had used strings so
> that it doesn't upset the last numeric index kept in the array
> structure.
>
> From what I can tell, it should be a pretty simple patch.
>
> Thoughts?

That would make sense, but doesn't solve all edge cases as your maximum array
index is still more than 2 times the largest positive integer on 32-bit.

Perhaps we should completely change behaviour and forbid negative indices and
store indexes that are too large as strings? That would be the sanest way to go
IMO, solves all the edge cases, and makes ["999999999999999999999999999"] and
[999999999999999999999999999] consistent, resolving that long-existing
discrepancy.

Andrea Faulds

10 years ago by Tjerk Meesters

>> Instead of doing that, why not simply disallow negative array indices to be
>> used as integers?
>>
>> In other words, negative indices are treated as if you had used strings so
>> that it doesn't upset the last numeric index kept in the array
>> structure.
>>
>> From what I can tell, it should be a pretty simple patch.
>>
>> Thoughts?
>
> That would make sense, but doesn't solve all edge cases as your maximum array
> index is still more than 2 times the largest positive integer on 32-bit.

Is that by design, a bug or something else entirely? Could you explain this edge case with some code?

> Perhaps we should completely change behaviour and forbid negative indices and
> store indexes that are too large as strings? That would be the sanest way to go
> IMO, solves all the edge cases, and makes ["999999999999999999999999999"] and
> [999999999999999999999999999] consistent, resolving that long-existing
> discrepancy.

Forbidding negative indices is a bit harsh and imho quite unnecessary; turning “out of range” indices into strings should work just fine afaict. Is there a reason why it shouldn’t?

A compromise could be to allow string keys that would otherwise have converted into a negative integer, but disallow negative int/float explicitly.

> --
> Andrea Faulds

10 years ago by Andrea Faulds

>> That would make sense, but doesn't solve all edge cases as your maximum array
>> index is still more than 2 times the largest positive integer on 32-bit.
>
> Is that by design, a bug or something else entirely? Could you explain this edge case with some code?

On a 32-bit platform, the maximum signed long is 0x7FFFFFFF, but the maximum unsigned long is 0xFFFFFFFF, slightly more than twice as big.

For example, this does what you’d expect on my machine (OS X 64-bit Intel Core i5):

andreas-air:~ ajf$ php -r '$x = [0xFFFFFFFF => 1]; $x[] = 2; var_dump($x);'
array(2) {
  [4294967295]=>
  int(1)
  [4294967296]=>
  int(2)
}

On my 32-bit Ubuntu VM (which I use precisely to test this kind of issue when working on bigints), however, it wraps around:

ajf@andrea-VirtualBox:~$ php -r '$x = [0xFFFFFFFF => 1]; $x[] = 2; var_dump($x);'
array(2) {
  [-1]=>
  int(1)
  [0]=>
  int(2)
}

I think we should probably use an unsigned long internally, but prevent negative values.

> Forbidding negative indices is a bit harsh and imho quite unnecessary;

Actually, I missed the bit of your email suggesting treating them as strings the first time I read it. I’d be fine with that.

> turning “out of range” indices into strings should work just fine afaict. Is there a reason why it shouldn’t?

Well… there is one issue. Basically, some array functions treat integer and string keys completely differently.

> A compromise could be to allow string keys that would otherwise have converted into a negative integer, but disallow negative int/float explicitly.

It’d be a complete BC break, but we could make negative indices work like they do in Python and grab the (length + index)th item (e.g. -1 returns the item at index 4 in a list of 5, -2 returns the item at index 3, and so on). However, because our arrays are weird semi-indexed semi-hashmap things, this probably isn’t good, as it’d prevent you from using strings like “-1” as keys. Alas, I can dream.

To actually respond to your suggestion, I don’t like the idea of blocking -1 but allowing “-1”. In PHP, numeric strings, integers and floats are supposed to be equivalent, and I’m already unhappy that large integer indexes and large numeric string indexes work differently. Whatever we do, I’d like PHP 7’s arrays to treat integer, float and numeric string indexes consistently.

Thinking about it a little more, if we use a long for indexes, we don’t even need to make them strings. It would fit the principle of least astonishment IMO if any valid PHP int is a valid index and won’t be a string. I was going to say that negative indexes don’t work right internally, but then I realised they could work fine for indexing into the buckets if we just cast them to unsigned longs internally (hence getting the 2’s complement representation on modern CPUs) for indexing and hashing, but only expose signed longs to the outside world, including through the API.

So in summary, I think we should use signed longs for indexes (or at least whatever type PHP’s basic int is), and anything outside that type’s range should be treated as a string. This would make numeric strings and ints consistent, would solve all the weird overflow issues, and is the most intuitive approach IMO.

--
Andrea Faulds
http://ajf.me/

10 years ago by Lester Caine

> That would make sense, but doesn't solve all edge cases as your maximum array
> index is still more than 2 times the largest positive integer on 32-bit.

Are we still looking at a situation where how a program performs is
platform specific? An array index of 'bigint' would still not be usable
on 32bit hardware?

--
Lester Caine - G8HFL

Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
