Good evening,
Since I don’t want this to languish as a ‘Draft’ forever, despite the patch being incomplete, I am finally putting the Big Integer Support RFC “Under Discussion”.
The RFC can be found here: https://wiki.php.net/rfc/bigint
The patch is, as I mentioned, incomplete. Additionally, there are quite a few matters left to be decided (see Open Questions). However, I think I should put this formally under discussion now.
Any help with the patch (largely just updating extensions and the mountains of tests these changes break, though later I will need to deal with opcache) would be appreciated.
Thanks!
Andrea Faulds
http://ajf.me/
Hi!
Since I don’t want this to languish as a ‘Draft’ forever, despite the
patch being incomplete, I am finally putting the Big Integer Support
RFC “Under Discussion”.
The RFC can be found here: https://wiki.php.net/rfc/bigint
This introduces a new type, IS_BIGINT. However, given that GMP now
supports arithmetic operations, I wonder if it wouldn't be easier to do
it in a slightly different way: specifically, create a hook that is
called when an operation is about to overflow or underflow, and let GMP
hook in there and produce a GMP number (I'm not sure about the exact
details of how to actually do it, so it's just an idea for now, but if
it makes sense we can try to work out the technical details).
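For illustration, the user-visible effect might be something like this (purely a
sketch of the idea - nothing here exists yet, so the "with the hook" comment is
hypothetical):

<?php
var_dump(PHP_INT_MAX + 1);
// today:         float(9.2233720368548E+18) - precision is already lost
// with the hook: object(GMP) wrapping 9223372036854775808 - the exact value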
Of course, this would require some rough edges to be polished, such as
what happens if you try to use it as an int, or convert it, etc., but
these questions arise with IS_BIGINT too, and additionally we already
have conversion handlers for objects, which aren't consistently used in
all cases but can be made so. The benefit is that we're not creating
anything completely new, we're just improving how objects work.
This would also allow anybody who doesn't like GMP big integers to
easily implement their own module to replace them.
Moreover, this would also make it possible to make support for bigints
optional - i.e., if you don't need bigints, you don't have to carry GMP
and thus are not bound by its license.
What do you think?
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
I'm not sure what this would solve. Sure, you could just use objects instead of a new type, but both present exactly the same challenges. Adding a new type isn't hard in itself. The problem is updating everything which handles numbers and their associated tests. This doesn't make my job any easier. It also wouldn't cover a few places that a new type can, like constants. Another problem is this means that bigints are a separate thing from ints, meaning users have to worry about a new type which sometimes behaves differently. This isn't good. Under this RFC's proposal, however, bigints are a mere implementation detail. So far as the user cares, there are just ints.
Making it optional destroys most of the benefits of the RFC. Instead of reducing platform differences, it adds a massive new one. Now developers have to check whether or not bigints are enabled and have two different code paths. That's much worse than the status quo.
--
Andrea Faulds
http://ajf.me/
Hi!
I'm not sure what this would solve. Sure, you could just use objects
instead of a new type, but both present exactly the same challenges.
Adding a new type isn't hard in itself. The problem is updating
everything which handles numbers and their associated tests. This
Exactly. Since objects are convertible to numbers (and to anything, in
fact) we get a double profit here - we make objects work better and we
achieve big integer support. And we don't need to handle a new type in
places where we don't need numbers.
doesn't make my job any easier. It also wouldn't cover a few places
that a new type can, like constants. Another problem is this means
I'm not sure I see much of a case for bigint constants. It would be
pretty hard for me to come up with a case where you need such a
constant, and if you do, you could just have a string constant and
convert it to GMP at runtime.
that bigints are a separate thing from ints, meaning users have to
worry about a new type which sometimes behaves differently. This
isn't good. Under this RFC's proposal, however, bigints are a mere
implementation detail. So far as the user cares, there are just
ints.
No, they are not an implementation detail - they are a whole new type,
which means every extension and every piece of PHP code aware of types
now needs to know about it and needs special code to handle it. I.e. you
pass it to mysql - mysql needs to handle this type. You pass it to SOAP
- SOAP needs to handle this type. Etc. But if it's an object, they
already deal with objects, one way or another.
Making it optional destroys most of the benefits of the RFC. Instead
of reducing platform differences, it adds a massive new one. Now
I'm not saying we have to make it optional. I'm just saying it's possible.
developers have to check whether or not bigints are enabled and have
two different code paths. That's much worse than the status quo.
I don't see why you'd have two code paths. If you need bigints and they
are not there, then you just fail, like with any extension your code
needs and is not installed. If it's there, you just continue working.
All the code existing now doesn't need bigints, and even in the future
most code won't need it. But for some code it would just work like
before, only with unlimited range now.
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
I don't see why you'd have two code paths. If you need bigints and they
are not there, then you just fail, like with any extension your code
needs and is not installed. If it's there, you just continue working.
All the code existing now doesn't need bigints, and even in the future
most code won't need it. But for some code it would just work like
before, only with unlimited range now.
'bitinteger!'
I'm still waiting to see how we handle 'BIGINT' under this rfc since
that is something every database driver does need to handle.
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
I'm still waiting to see how we handle 'BIGINT' under this rfc since
that is something every database driver does need to handle.
If you mean 64-bit ints, this RFC enables them to work on 32-bit too with exactly the same semantics. No more float overflow. On a 64-bit machine, they’re IS_LONG internally, and on 32-bit machines they’re IS_BIGINT, but the user doesn’t need to worry, they both act the same.
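For example, a sketch of the intended behaviour (not something the current patch guarantees for every driver yet):

<?php
// A typical BIGINT column value that does not fit in a 32-bit long:
$id = 9123372036854775807;

// Today a 32-bit build parses this literal as a float and loses precision;
// under the RFC it is an int on 32-bit and 64-bit alike:
var_dump($id);         // int(9123372036854775807)
var_dump(is_int($id)); // bool(true)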
Assuming I actually get round to updating the DB drivers.
Andrea Faulds
http://ajf.me/
Hi!
I'm not sure what this would solve. Sure, you could just use objects
instead of a new type, but both present exactly the same challenges.
Adding a new type isn't hard in itself. The problem is updating
everything which handles numbers and their associated tests. This
Exactly. Since objects are convertable to numbers (and to anything, in
fact) we get double profit here - we make objects work better and we
achieve big integer support. And we don't need to handle new type where
we don't need numbers
Handling a new type in cases where we don’t need numbers isn’t really a problem.
doesn't make my job any easier. It also wouldn't cover a few places
that a new type can, like constants. Another problem is this means
I'm not sure I see much case for bigint constants. Would be pretty hard
for me to come up with a case where you need such a constant, and if you
do, you could just have a string constant and convert it to GMP in runtime.
Still, it’s inconvenient. More for developers to worry about.
that bigints are a separate thing from ints, meaning users have to
worry about a new type which sometimes behaves differently. This
isn't good. Under this RFC's proposal, however, bigints are a mere
implementation detail. So far as the user cares, there are just
ints.
No, they are not implementation detail - they are whole new type, which
means every extension and every piece of PHP code aware of types now
needs to know about it and needs special code to handle it.
No, only extensions. It is completely transparent to userland. That’s the whole point.
I.e. you
pass it to mysql - mysql needs to handle this type. You pass it to SOAP
- SOAP needs to handle this type. Etc. But if it's an object, they
already deal with objects, one way or another.
Yes, but they don’t handle large integer objects already. So you pass it a GMP object, the extension converts it to a string, that string then overflows when converted back to a number, and you end up with a float. Which isn’t what you wanted. Or, it handles it as a string, which is also not ideal, as while a string and an int may be the same thing to some extensions, they are not to others.
developers have to check whether or not bigints are enabled and have
two different code paths. That's much worse than the status quo.
I don't see why you'd have two code paths. If you need bigints and they
are not there, then you just fail, like with any extension your code
needs and is not installed.
It’s not about “extensions your code needs”. If you need ext/gmp, you can already require it. This RFC is about removing cross-platform integer handling differences.
All the code existing now doesn't need bigints, and even in the future
most code won't need it. But for some code it would just work like
before, only with unlimited range now.
No, but existing code does have to handle float overflow. If you allow that to optionally be int overflow, you now need to worry about handling both.
--
Andrea Faulds
http://ajf.me/
Hi!
Still, it’s inconvenient. More for developers to worry about.
I still have no idea why one would need a bigint constant, could you
give an common example where you would do that?
No, only extensions. It is completely transparent to userland.
That’s the whole point.
I'm not sure how it can be completely transparent if it's a different
type. Is it still identifying as int? If so, this is dangerous, as some
functions may not be able to accept big integers when accepting int
arguments, but checks for is_int(), etc. would pass.
Yes, but they don’t handle large integer objects already. So you pass
it a GMP object, it converts to a string, then that overflows and you
end up with a float when it converts it to a number. Which isn’t what
you wanted. Or, it handles it as a string, which is also not ideal,
as while a string and an int may be the same thing to some
extensions, they are not to others.
If it's not, the extension has to handle it, the same way it has to
handle bigint anyway if it makes a difference for it. The point is that
many common cases are already covered, e.g. if the extension just needs
a string, or if the bigint actually represents a small int, etc.
It’s not about “extensions your code needs”. If you need ext/gmp, you
can already require it. This RFC is about removing cross-platform
integer handling differences.
But nothing changes there - it is still removing the diffs and it still
requires GMP. The only change is you're not paying for it if you don't
need it.
No, but existing code does have to handle float overflow. If you
allow that to optionally be int overflow, you now need to worry about
handling both.
What's "float overflow"? I'm not sure I'm getting your point here. You
don't need to handle anything - if your code doesn't care about big
ints, you just do math as usual. If it does, then you have to check big
ints are there, then do math as usual but be aware that int can be now
of two different types. I don't see any difference from the RFC here.
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
Hi!
Still, it’s inconvenient. More for developers to worry about.
I still have no idea why one would need a bigint constant, could you
give an common example where you would do that?
The main point is: why should you prohibit it? The point of bigints is to remove cross-platform integer differences. Why shouldn’t that apply here? Why should I have to conditionally do different things on 64-bit and 32-bit?
No, only extensions. It is completely transparent to userland.
That’s the whole point.
I'm not sure how it can be completely transparent if it's a different
type. Is it still identifying as int?
Yes.
In this case, this is dangerous as
some functions may not be able to accept big integers when accepting int
arguments, but checks for is_int, etc. would pass.
We already have this danger for another type: boolean. phpng got rid of IS_BOOL in favour of IS_TRUE and IS_FALSE. If we can update everything to handle the IS_BOOL change, surely we can update everything to handle bigints, too.
If it's not, the extension has to handle it, the same way it has to
handle bigint anyway if it makes difference for it. The point is many
common cases are already covered, e.g. if the extension just needs a
string, or if the bigint actually represents a small int, etc.
Many common cases are easily covered by a new type anyway. You overestimate the effort; I have already done the work, and it’s not much. Objects make nothing easier.
No, but existing code does have to handle float overflow. If you
allow that to optionally be int overflow, you now need to worry about
handling both.
What's "float overflow"?
Beyond PHP_INT_MAX, integers magically become floats in PHP. They have done so for a long time.
I'm not sure I'm getting your point here. You
don't need to handle anything - if your code doesn't care about big
ints, you just do math as usual.
Then get weird results when someone passes a large number in.
If it does, then you have to check big
ints are there, then do math as usual but be aware that int can be now
of two different types. I don't see any difference from the RFC here.
The main point of the RFC is to make integers completely consistent across platforms and to remove the need to worry about overflow. Adding optional overflow to GMP means you still have to worry about it. It doesn’t solve anything. You can already use GMP for applications which explicitly need to use large numbers. This RFC doesn’t exist for that purpose.
--
Andrea Faulds
http://ajf.me/
Hi!
We already have this danger for another type: boolean. phpng got rid
of IS_BOOL in favour of IS_TRUE and IS_FALSE. If we can update
everything to handle the IS_BOOL change, surely we can update
everything to handle bigints, too.
No, it's not the same thing at all. For bool, you still have only true
and false. For bigint, your function now should be able to handle
arbitrarily large integers internally, but what if it has fixed
resources that assume integers have a fixed range? For extensions that's
commonplace, but even in user code it can happen. That means any call
you make to an internal function with an int argument could now fail
because the internal function is unable to support bigint, and you can't
even guard against this since your code cannot distinguish a regular int
from a bigint. I don't think that is a good situation.
Then get weird results when someone passes a large number in.
Why would you get weird results? You describe some vague dangers but I
didn't see any concrete example of what is different.
The main point of the RFC is to make integers completely consistent
across platforms and to remove the need to worry about overflow.
Which does not change with my proposal.
Adding optional overflow to GMP means you still have to worry about
it. It doesn’t solve anything. You can already use GMP for
You seem to misunderstand what my proposal is. It doesn't add any
additional overflow, it just changes from using a separate type
masquerading as int to using objects. All the rest stays the same.
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
We already have this danger for another type: boolean. phpng got rid
of IS_BOOL in favour of IS_TRUE and IS_FALSE. If we can update
everything to handle the IS_BOOL change, surely we can update
everything to handle bigints, too.
No, it's not the same thing at all. For bool, you still have only true
and false. For bigint, your function now should be able to handle
infinite integers internally, but what if it has fixed resources that
assume integers have fixed range?
You throw an error. Just as plenty of functions already can’t handle ridiculously large integer arguments.
For extensions, it's a commonplace,
but even for user code that can happen. That means, any call that you do
to an internal function with int argument now could fail since the
internal function is unable to support bigint, and you can't even guard
for this since your code can not distinguish regular int from bigint. I
don't think it is a good situation.
If a function can’t support a large integer argument, this is usually for an obvious reason. I am not tormented daily in Python by the fact that I can’t seek by 2^69 bytes in a file, and I doubt any PHP developer would be.
Then get weird results when someone passes a large number in.
Why would you get weird results? You describe some vague dangers but I
didn't see any concrete example of what is different.
Integers beyond PHP_INT_MAX (2^63 - 1 on 64-bit systems, 2^31 - 1 on 32-bit systems) overflow to floats and lose accuracy. Then, if they’re cast back to integers, they are silently truncated and wrap around.
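For example, on a current 64-bit build:

<?php
$n = PHP_INT_MAX;            // int(9223372036854775807)
var_dump($n + 1);            // float(9.2233720368548E+18) - silently a float now
var_dump($n + 1 == $n + 2);  // bool(true) - the float can no longer tell them apart
var_dump((int) ($n + 1));    // platform-dependent; typically wraps to a large negative int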
The main point of the RFC is to make integers completely consistent
across platforms and to remove the need to worry about overflow.
Which does not change with my proposal.
No, it does: There are now integers, and objects that represent large integers, which behave differently.
--
Andrea Faulds
http://ajf.me/
Hi!
You throw an error. Just as plenty of functions already can’t handle
ridiculously large integer arguments.
The problem is, if your function can handle the int range and you checked
for is_int() and everything worked fine - now it's broken, because
is_int() no longer implies a fixed range and there's no way to check
whether you're dealing with a fixed-range number or an infinite-range
number.
No, it does: There are now integers, and objects that represent large
integers, which behave differently.
IS_INT and IS_BIGINT would necessarily behave differently too - since
some functions may support both and some only integers. Again, no change
here.
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
You throw an error. Just as plenty of functions already can’t handle
ridiculously large integer arguments.
The problem is, if your function can handle the int range and you checked
for is_int() and everything worked fine - now it's broken, because
is_int() no longer implies a fixed range and there's no way to check
whether you're dealing with a fixed-range number or an infinite-range
number.
Yes, but you can check if the function errors. I don’t really think this
is a massive problem. People will probably not realistically expect that
all functions can accept really large numbers, whether that range cuts
off at 2**64-1 or something more arbitrary. It’s a problem if abs() or
sign() don’t work for bigints. It isn’t if str_repeat() doesn’t, because
a similarly-sized non-bigint would error too.
No, it does: There are now integers, and objects that represent large
integers, which behave differently.
IS_INT and IS_BIGINT would necessarily behave differently too - since
some functions may support both and some only integers.
All functions would “support” both for integer arguments. But some might choose to reject bigints which are larger than the internal integer type the function uses… much like a function written for PHP currently might reject longs larger than the internal integer type the function uses.
Again, no change
here.
The point is the degree to which they can act the same. Objects can only go so far.
Andrea Faulds
http://ajf.me/
I promise not to mail the list for every change I make to this RFC. ;)
But I do have quite a big one to announce. Previously, some issues with the GNU Multiple Precision Arithmetic Library (GMP) had been discovered. In particular, it is not liberally licensed (it’s LGPL), it supports only one global set of custom allocators, which causes segfaults in other libraries that use GMP because PHP defines its own allocators, and it immediately calls an un-hookable abort() in certain failure cases.
I was unaware of any good alternatives, however today I was pointed by Chris Wright (DaveRandom) on StackOverflow towards a new possibility: LibTomMath. It is liberally licensed (dual-licensed as Public Domain and WTFPL), written in pure C, packaged for multiple platforms, and it lacks the immediate abort() problem to the best of my knowledge. Plus, it will not cause any segfaults when we use custom allocators, as I do not believe PHP uses any libraries which use LibTomMath at present. If you’re worried about whether it’s battle-tested, it’s used by another dynamic language, Tcl.
Because it appears to solve all three major issues with GMP, I am currently porting my bigint branch to use it. This is possible because the entire implementation of bigints is abstracted, meaning you can swap out back-ends. If we wished to, we could quite simply allow the choice of GMP at compile-time, or indeed any other back-end.
I should note that LibTomMath certainly isn’t perfect. I don’t believe it is optimised to the same degree GMP is. That being said, again, it does seem to solve all the major problems I had with GMP. So I have few qualms in making the patch use it, especially given that it is easy to swap out the back-end.
I’ve updated the RFC to reflect this new state of affairs: https://wiki.php.net/rfc/bigint
Thoughts?
Andrea Faulds
http://ajf.me/
Hi Andrea,
Why don't you use the ability of operator overloading? (It's been in the
engine since 5.6.)
BIGINT doesn't have to be completely transparent. If users would like to
work with BIGINT, let them create PHP objects explicitly and then use
operator overloading, e.g.
<?php
function print_powers_of_two($bits) {
    $bit = BIGINT(1);            // BIGINT() is hypothetical: an explicit big integer constructor
    $last = BIGINT(2) ** $bits;
    while ($bit < $last) {
        $bit *= 2;
        echo "$bit\n";
    }
}
print_powers_of_two(256);
?>
Your solution would allow writing the same thing without BIGINT, but not
for free.
I expect it'll cause some slowdown for all PHP scripts, independently of
whether they use BIGINT or not.
I'll try to take a deeper look into the patch later...
Could you provide some benchmark results, comparing your patch with master?
Thanks. Dmitry.
Hi Andrea,
Why don't you use the ability of operator overloading? (it's in the engine since 5.6).
I've already answered this in this thread, but I'll answer it again if I must.
BIGINT don't have to be completely transparent. If user would like to work with BIGINT, let them crate PHP objects explicitly and then use operator overloading. e.g.
Well, they already can. ext/gmp exists.
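Explicit code along these lines already works today thanks to ext/gmp’s operator overloading (since 5.6), roughly:

<?php
$n = gmp_init(PHP_INT_MAX);
$n = $n * $n + 1;             // arithmetic operators work on GMP objects
var_dump($n instanceof GMP);  // bool(true)
echo gmp_strval($n), "\n";    // exact result, no float rounding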
Your solution would allows writing the same without BIGINT, but not for free.
I expect, it'll make some slowdown for all PHP scripts, independently, if they use BIGINT or not.
I'll try to take a deeper look into the patch later...
Could you provide some benchmark results, comparing your patch with master?
So, the point of this RFC is basically to make PHP a language where, like Python, Haskell, Prolog or (de jure but not de facto) Dart, integers can be arbitrarily large and you never have to worry about overflow. Instead of applications which definitely need bigints using them explicitly, all applications can now support integers of any size transparently, essentially for free. It also makes the language more intuitive in a way. Plus, it's one less cross-platform difference so code is more portable.
You're right it might not actually be free, though. I'll need to run some benchmarks - will do later today if I remember. It shouldn't be any slower than master, though. All it does is change what we do in our usual overflow checks, which we already had. Now, once you've overflowed and got a bigint, obviously they're slower than floats. However if you need floating-point performance you can explicitly cast to double and deliberately lose accuracy.
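In other words, something like this (describing the intended behaviour under the RFC, not what master does):

<?php
$exact = PHP_INT_MAX * 2;  // under the RFC: int(18446744073709551614), a bigint internally
$fast  = (float) $exact;   // explicitly opt back into faster, lossy float arithmetic
var_dump($fast);           // float(1.844674407371E+19)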
Andrea Faulds
http://ajf.me/
I expect, it'll make some slowdown for all PHP scripts, independently, if they use BIGINT or not.
I'll try to take a deeper look into the patch later...
Could you provide some benchmark results, comparing your patch with master?
I finally made the requested benchmarks. There’s barely a noticeable difference, though the bigint branch is apparently marginally faster (most likely from getting rid of fast_increment_function):
          master        bigint
          0.344788074   0.339091063
          0.34658289    0.361176014
          0.376623154   0.346175194
          0.35006094    0.359763861
          0.352533817   0.341754198
          0.354025841   0.357409
          0.360356092   0.379124165
          0.367921829   0.351316929
          0.370724916   0.373735189
          0.351090908   0.346349001
          0.355952978   0.356275797
average   0.357332858   0.355651855
(Times in seconds, smaller is better.)
Script:
<?php
$start = microtime(true);
for ($i = 0; $i < 1000000; $i++) {
    $a = 2 * 3;
    $b = $a - 3;
    $c = $a * $b;
    $d = $c / $c;
}
$end = microtime(true);
echo "took ", $end - $start, " secs\n";
?>
I ran the script several times, then took the results and put them into Excel to produce the above table with its averages.
So common scripts are either unaffected, or will run ever-so-slightly faster.
Andrea Faulds
http://ajf.me/
Just to be clear, though, that didn’t tell the whole story. With that number of iterations, there’s no speed difference that isn’t within the margin of error. However, up the iterations by 100x and the bigint branch is consistently very slightly slower. Remove the body of the loop so it’s just for ($i = 0; $i < 100000000; $i++) {} and the bigint branch is consistently very slightly faster. No idea why either of these is the case.
So, apparently, the bigint branch both makes things slower and makes them faster! But it’s not a big enough difference for me to be worried about it. The differences that do exist might disappear if the fast_* functions can have their inline asm rewritten and be uncommented. Currently, master has custom asm for these, while the bigint branch has to use the probably slower C implementations because I don’t understand x86 or x64 asm and am unable to rewrite it.
--
Andrea Faulds
http://ajf.me/
Hi Andrea,
Synthetic benchmarks do not always reflect the impact on real-life
performance.
Unfortunately, I wasn't able to run any big real-life apps with your bigint
branch, because it misses support for commonly used extensions
(ext/session, ext/json, ext/pdo).
I ran bench.php and it's a bit slower with bigint.
master 1.210 sec
bigint 1.330 sec
I also measured the number of executed instructions using valgrind
--tool=callgrind (less is better)
master 1,118M
bigint 1,435M
Maybe part of this difference is caused by missing the latest master
improvements, but anyway, introducing a new core type can't be done for free.
I also was able to run qdig, and it showed about 2% slowdown.
[master] $ sapi/cgi/php-cgi -T 1000 /var/www/html/bench/qdig/index.php >
/dev/null
Elapsed time: 3.327445 sec
[bigint] $ sapi/cgi/php-cgi -T 1000 /var/www/html/bench/qdig/index.php >
/dev/null
Elapsed time: 3.382823 sec
It would be great to measure the difference on wordpress, drupal, ZF...
Thanks. Dmitry.
Hi!
Hi Andrea,
The synthetic benchmarks are not always reflect the impact on real-life performance.
Unfortunately, I wasn't able to run any big real-life apps with your bigint branch, because it misses support for commonly used extensions (ext/session, ext/json, ext/pdo).
Yes, that’s unfortunate. ext/json is first on my list to update once I’m done with ext/standard, I particularly want large integers in JSON to decode to bigints (though allow disabling this if you desire). I really should’ve finished porting ext/standard months ago, I’ve been dragging my heels on that one.
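For instance, the behaviour I’m aiming for (not implemented yet, so the "intended" comment below is the goal rather than what the branch currently does):

<?php
$json = '{"id": 9223372036854775808}';  // one more than PHP_INT_MAX

var_dump(json_decode($json)->id);
// today:    float(9.2233720368548E+18), or a string with JSON_BIGINT_AS_STRING
// intended: int(9223372036854775808), i.e. an exact bigint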
I ran bench.php and it's a bit slower with bigint.
master 1.210 sec
bigint 1.330 sec
I also measured the number of executed instructions using valgrind --tool=callgrind (less is better)
master 1,118M
bigint 1,435M
May be part of this difference is caused by missing latest master improvements, but anyway, introducing new core type, can't be done for free.
I’m not really sure about whether a new core type can’t be free. For switch statements, if they’re compiled to a jump table, they shouldn’t be any slower when a new case is added. But I’m not certain on that, I don’t spend much time reading generated asm.
Does bench.php do any float operations? I’m not sure from reading the source, but I think it might end up having ints overflow and become floats in master or bigints in my branch. If that’s the case, it would obviously be slower as bigints trade performance for accuracy. This particular issue can’t really be helped. Although these apps, if they want floats, can just ask for them explicitly by marking their numbers with a dot.
Another source of slowdown is, as previously mentioned, the asm functions not being updated and hence me having to disable them. Particularly for things like multiplication, addition and so on, the C code we have is far less efficient. I believe the asm code simply checks for an overflow flag after the operation, which should be very fast. On the other hand, the C code converts the ints to doubles, does a double operation, sees if the result of that is greater than PHP_INT_MAX
converted to a double, and then does the operation if it won’t overflow. This means that, until the asm code is updated, all integer operations may be significantly slower, which is unfortunate. However, I think that if the asm were to be updated, the slowdown for integer ops would completely, or at least mostly, disappear.
I also was able to run qdig, and it showed about 2% slowdown.
[master] $ sapi/cgi/php-cgi -T 1000 /var/www/html/bench/qdig/index.php > /dev/null
Elapsed time: 3.327445 sec
[bigint] $ sapi/cgi/php-cgi -T 1000 /var/www/html/bench/qdig/index.php > /dev/null
Elapsed time: 3.382823 sec
It would be great to measure the difference on wordpress, drupal, ZF…
The reasons for the dig slowdown are likely the same.
I’ve so far been scared to touch the asm… but actually, I don’t think it could be that hard. It’s not doing something especially complex. The bigint API looks fairly stable now and I’m unlikely to change it much further, so there’s little worry about having to change the asm a second time. The main problem with asm, I suppose, is testing it. I do have a 32-bit Ubuntu VM set up, but I’d also need to set up Windows VMs, and possibly others (don’t we have PowerPC in the source just now?).
I might experiment with it tonight, or sometime later this week.
Thanks.
--
Andrea Faulds
http://ajf.me/
Does bench.php do any float operations? I’m not sure from reading the
source, but I think it might end up having ints overflow and become floats
in master or bigints in my branch.
bench.php does some math on long and floats, but I don't think overflow is
involved.
Another source of slowdown is, as previously mentioned, the asm functions
not being updated and hence me having to disable them. Particularly for
things like multiplication, addition and so on, the C code we have is far
less efficient. I believe the asm code simply checks for an overflow flag
after the operation, which should be very fast.
yes, this may be a reason.
The reasons for the dig slowdown are likely the same.
2% is not a big difference (it may be even a measurement mistake), but more
tests should be done.
The main problem with asm, I suppose, is testing it. I do have a
32-bit Ubuntu VM set up, but I’d also need to set up Windows VMs, and
possibly others (don’t we have PowerPC in the source just now?).
change asm for 32-bit Linux and add TODO marks for others. I don't test PHP
on PPC as well.
Thanks. Dmitry.
Hey Dmitry,
change asm for 32-bit Linux and add TODO marks for others. I don't test PHP
on PPC as well.
After procrastinating about this for a long time, I finally went and updated the overflow checks today and ran bench.php.
I still haven’t touched the inline asm, I’ve just removed it, since clang and GCC (only in GCC 5.0, sadly) have checked arithmetic intrinsics. If someone wants to, they can rewrite the inline asm for compilers that have no overflow-checking intrinsics, but this is good enough for now, at least for the purposes of performance checking on my machine. I’m using clang, by the way. If you want to replicate these results, you’ll probably also need it, since GCC 5.0 isn’t out yet, unfortunately.
I compiled the bigint-libtommath branch (theoretically this was just a branch, but actually all the new changes have gone there, I’ll merge it into the bigint branch once LibTomMath port is done), and the current master branch.
For bigint-libtommath, I used ./configure --enable-debug --enable-phpdbg --disable-all --enable-bigint-gmp
Because of the --enable-bigint-gmp flag, it’s using the GMP backend, not the LibTomMath one. I’m doing this since there’s still one or two small things I haven’t finished implementing for LibTomMath, e.g. the binary bitwise ops have the wrong behaviour just now.
For master, I used ./configure --enable-debug --enable-phpdbg --disable-all
Then, I ran bench.php four times, and each time I ran it first on ./php-bigint-gmp, then on ./php-bigint-master.
On each run, the bigint branch turned out faster, as well as overall:
            bigint        master
            6.593         6.659
            6.424         6.661
            6.414         6.588
            6.381         6.673
AVERAGE     6.453         6.64525
DIFFERENCE  -0.19225      0.19225
RATIO       0.971069561   1.0297923446
So master is 2.9% slower! Full output here: https://gist.github.com/TazeTSchnitzel/759c1513b442571f5e26
I can’t actually explain why bigints would be faster. It might just be because I got rid of fast_increment_function in favour of just checking whether op1 == ZEND_LONG_MAX in zend_vm_execute.h, ditto for fast_decrement_function. Maybe using overflow intrinsics is faster than inline asm. Maybe it’s something completely different. I honestly don’t know.
The result surprised me, as I expected bigints to be slower, so I redid it. Again, bigints came out on top:
            bigint        master
            6.55          6.779
            6.353         6.738
            6.326         6.674
            6.144         6.177
AVERAGE     6.34325       6.592
DIFFERENCE  -0.24875      0.24875
RATIO       0.9622648665  1.0392149135
This time master was around 3.9% slower. Full log here: https://gist.github.com/TazeTSchnitzel/59c190b86c9dd5b20570
If we combine the two runs:
            bigint        master
            6.593         6.659
            6.424         6.661
            6.414         6.588
            6.381         6.673
            6.55          6.779
            6.353         6.738
            6.326         6.674
            6.144         6.177
AVERAGE     6.398125      6.618625
DIFFERENCE  -0.2205       0.2205
RATIO       0.9666849232  1.0344632216
master’s 3.4% slower.
Just to check I named the files correctly:
oa-res-26-240:php-src ajf$ ./php-master -r 'var_dump(PHP_INT_MAX * 2);'
float(1.844674407371E+19)
oa-res-26-240:php-src ajf$ ./php-bigint-gmp -r 'var_dump(PHP_INT_MAX * 2);'
int(18446744073709551614)
Yes, it’s definitely the bigint branch.
So, at least by these preliminary results, the bigint branch would appear to be faster than master. This is merely bench.php, but it’s still a good sign. :)
Thanks!
Andrea Faulds
http://ajf.me/
I'm really surprised by the results :)
I'll try to find time for bigint next week and play with it a bit.
Thanks. Dmitry.
Hi Andrea,
Where can I get the code?
Thanks. Dmitry.
Hey Dmitry,
The bigint-libtommath branch was merged back into the bigint branch since I figured there was no point keeping them separate, even if the LibTomMath backend isn’t quite complete.
So, the pull request is here: https://github.com/php/php-src/pull/876
Or, the branch directly: https://github.com/TazeTSchnitzel/php-src/tree/bigint
When configuring, you can use --enable-bigint-gmp to use GMP for bigints. Otherwise it will use LibTomMath. GMP is probably faster, and it has all operations implemented (I still need to do bitwise ops for LibTomMath). For GMP, you’ll need to have the library installed.
Thanks.
Andrea Faulds
http://ajf.me/
Oh, it's still in draft state.
Too many extensions are missing: ext/session, ext/json, ext/pdo.
Only very simple tests may be done now, and they can't predict the impact on
real-life applications.
Thanks. Dmitry.
We may as well try to help here.
This patch is anything but simple. I really do not want to see
Andrea go through the pain we had with the 64-bit patch. So let's
organize ourselves to avoid that.
Step 1:
Which extensions do we consider critical to actually get a clue
about the impact?
I see session, standard ( ;) ), json off the top of my head. Which others?
Let's help Andrea port these exts and do the others once we know if
the RFC is accepted or not.
Cheers,
Pierre
ext/session and ext/json are required by most apps.
Actually I stopped attempts to build it when I saw compilation errors in
ext/session.
Thanks. Dmitry.
ext/session and ext/json are required by most apps.
Right.
The question is: Do you see any other we must have before discussing
that any further?
I don't know which ones are supported or not.
Of course we need some extension to connect to a database: mysql, mysqli or
pdo_mysql.
Thanks. Dmitry.
Hey Dmitry,
ext/session and ext/json are required by most apps.
Actually I stopped attempts to build it when I saw compilation errors in ext/session.
Thanks. Dmitry.
Oh dear, does ext/session not build? :/
So far I've only built the branch with --disable-all.
In the case of most extensions, the main source of compilation errors will be changes to certain Zend Engine functions. In particular, is_numeric_string_ex needs to support bigints now and has an extra parameter. I don't think I changed very many other functions.
Porting extensions should for the most part be relatively simple. Most extensions are just sets of functions and use zpp. If they're using the 'l' specifier (Z_PARAM_LONG) they'll continue to work. In most cases there is no need to update an 'l' parameter to support bigints. The length of a string can't exceed PHP's max integer size, for example. Of course, there are some functions where it would have a clear benefit to add bigint support.
The main problem with extensions is 'z'
BTW: why not wrap big integers into a special IS_OBJECT?
It would keep everything working out of the box (without BIGINT), and would
make more than half of the changes unnecessary.
In the past we made similar decision for closures.
Thanks. Dmitry.
Hi Dmitry,
BTW: why not to wrap big integers into special IS_OBJECT?
It would keep everything working out of the box (without BIGINT), and would allow to eliminate more than half of unnecessary changes.
In the past we made similar decision for closures.
In retrospect that might have been a good idea. Though objects can't quite do everything our primitive types can. To get bigints to work that way, you'd need to improve the support for objects a lot. You'd still need to update virtually every zval-accepting extension. The signature of is_numeric_string_ex would still have to change. You would need to make constants support objects, too. You'd still need to change a lot of things, unfortunately.
At this stage, switching to using objects is probably a waste of time.
Thanks.
--
Andrea Faulds
http://ajf.me/
(Sorry, accidentally sent too early)
The main problem with most extensions is the 'z' format specifier which accepts any value. If it accepts IS_LONG then it needs to accept IS_BIGINT too. In many cases you can just convert the bigint to a long and maybe reject it or wrap it if it won't fit, if the function doesn't need to support large integers:
case IS_BIGINT:
    if (!zend_bigint_can_fit_long(Z_BIG_P(some_zval))) {
        /* the value is too big for a native long, so warn and bail out */
        zend_error(E_WARNING, "$some_zval too large");
        RETURN_FALSE;
    } else {
        lval = zend_bigint_to_long(Z_BIG_P(some_zval));
    }
    break;
Something like that would work in most cases. There is also convert_to_long.
I probably should have focussed more on extension support; maybe I'll start trying to port some of them, there's not that much Zend stuff left to do really. I would have ported ext/json, but there's now the jsond RFC.
Any help would be appreciated. I am panicking a bit as there's not that long to go before PHP7 feature freeze, assuming Zeev's timetable is actually followed. Though I think this feature should be doable: as I said, there's not much Zend stuff left to do, and most extensions should be quite simple to port.
Thanks.
Andrea Faulds
http://ajf.me/
Hey everyone,
Anatol (aka welting) has done some excellent work, and a lot more extensions now build on the bigint branch, even if not all of them are fully ported:
https://wiki.php.net/rfc/bigint#todo
This should mean that testing “real-world applications” for performance is now possible.
Thanks!
--
Andrea Faulds
http://ajf.me/
Hi,
I’m a little worried that nobody has responded to this yet. Feature freeze is looming… :(
Andrea Faulds
http://ajf.me/