I've had some trouble with large arrays in my PHP programs causing
corruption of the heap. I tracked the problem back to PHP 4's 16-bit
reference count. If you have more than 64K references to a given zval, the
counter will overflow, then when the references are freed, the object will
be double-freed, causing a segfault. Dangling references are also possible,
allowing reading and writing of subsequently allocated blocks.
No doubt the PHP dev team are aware of this already, since it was fixed in
PHP 5, by using a 32-bit type instead. My question is: is there any
intention to backport this simple but important bugfix to PHP 4? Many PHP
users are still using PHP 4, and it's not a very well advertised fact that
it does not properly support arrays with more than 64K entries.
-- Tim Starling
Hello Tim,
that's a major API break so we would need to make that 4.5. Which is kind
of out of question atm. Just change to 5.
marcus
Saturday, January 7, 2006, 5:47:27 AM, you wrote:
I've had some trouble with large arrays in my PHP programs causing
corruption of the heap. I tracked the problem back to PHP 4's 16-bit
reference count. If you have more than 64K references to a given zval, the
counter will overflow, then when the references are freed, the object will
be double-freed, causing a segfault. Dangling references are also possible,
allowing reading and writing of subsequently allocated blocks.
No doubt the PHP dev team are aware of this already, since it was fixed in
PHP 5, by using a 32-bit type instead. My question is: is there any
intention to backport this simple but important bugfix to PHP 4? Many PHP
users are still using PHP 4, and it's not a very well advertised fact that
it does not properly support arrays with more than 64K entries.
-- Tim Starling
Best regards,
Marcus
At 06:47 07/01/2006, Tim Starling wrote:
I've had some trouble with large arrays in my PHP programs causing
corruption of the heap. I tracked the problem back to PHP 4's 16-bit
reference count. If you have more than 64K references to a given zval, the
counter will overflow, then when the references are freed, the object will
be double-freed, causing a segfault. Dangling references are also possible,
allowing reading and writing of subsequently allocated blocks.No doubt the PHP dev team are aware of this already, since it was fixed in
PHP 5, by using a 32-bit type instead. My question is: is there any
intention to backport this simple but important bugfix to PHP 4? Many PHP
users are still using PHP 4, and it's not a very well advertised fact that
it does not properly support arrays with more than 64K entries.
Tim,
Your analysis was correct until the last sentence - PHP surely does
support arrays with more than 64K entries. It just doesn't support
the same entry being linked from more than 64K locations (which is
much, much more rare occurrence).
Zeev
Zeev Suraski wrote:
Tim,
Your analysis was correct until the last sentence - PHP surely does
support arrays with more than 64K entries. It just doesn't support the
same entry being linked from more than 64K locations (which is much,
much more rare occurrence).
Well yes, I was just stirring. It's more common than you might think though.
My own case was somewhat similar to this:
<?
class CacheEntry {
var $name;
var $stuff = array();
function CacheEntry( $name ) {
$this->name = $name;
}
}
$cache = array();
for ( $i=0; $i<100000; $i++) {
$cache[] = new CacheEntry( "Cache entry $i" );
}
?>
...which will segfault on exit due to the excess references to the empty
array object created for the initialisation of the unused member variable
$stuff. Generally speaking, arrays of objects with initialised but unused
member variables will cause problems. Our code (MediaWiki) has plenty of
such arrays, the reason we don't see this more often is because the arrays
usually contain less than 64K entries. The size of the arrays depend on user
input.
-- Tim Starling
At 00:00 09/01/2006, Tim Starling wrote:
Zeev Suraski wrote:
Tim,
Your analysis was correct until the last sentence - PHP surely does
support arrays with more than 64K entries. It just doesn't support the
same entry being linked from more than 64K locations (which is much,
much more rare occurrence).Well yes, I was just stirring. It's more common than you might think though.
I'm not saying that it's four-leafed clover, just that it's
significantly more rare than plain arrays that contain 64K
(different) entries...
We did decide to increase the refcount size in PHP 5, but as Marcus
said, releasing a binary-compat-breaking version of 4 at this stage
is pretty much out of the question.
Zeev
It seems that the problem you're encountering could be fixed
trivially by moving the initialization of $stuff into the constructor.
However, it's difficult to tell from the provided code whether $stuff
is intended as a static member... but if it's not, seems like a very
simple fix...
I don't personally know about PHP4 internals, but it would seem that
the reason it breaks as-is is that ALL CacheEntry objects initially
report to the same (statically created) empty array. So, moving it
into the constructor would at least create n empty arrays rather than
segfaulting.
Right? Then you wouldn't even need to change your runtime environment
(ie PHP version) to solve the problem.
Alan
Zeev Suraski wrote:
Tim,
Your analysis was correct until the last sentence - PHP surely does
support arrays with more than 64K entries. It just doesn't
support the
same entry being linked from more than 64K locations (which is much,
much more rare occurrence).Well yes, I was just stirring. It's more common than you might
think though.
My own case was somewhat similar to this:<?
class CacheEntry {
var $name;
var $stuff = array();function CacheEntry( $name ) { $this->name = $name; }
}
$cache = array();
for ( $i=0; $i<100000; $i++) {
$cache[] = new CacheEntry( "Cache entry $i" );
}
?>...which will segfault on exit due to the excess references to the
empty
array object created for the initialisation of the unused member
variable
$stuff. Generally speaking, arrays of objects with initialised but
unused
member variables will cause problems. Our code (MediaWiki) has
plenty of
such arrays, the reason we don't see this more often is because the
arrays
usually contain less than 64K entries. The size of the arrays
depend on user
input.-- Tim Starling
Alan Pinstein wrote:
It seems that the problem you're encountering could be fixed trivially
by moving the initialization of $stuff into the constructor.However, it's difficult to tell from the provided code whether $stuff
is intended as a static member... but if it's not, seems like a very
simple fix...I don't personally know about PHP4 internals, but it would seem that
the reason it breaks as-is is that ALL CacheEntry objects initially
report to the same (statically created) empty array. So, moving it into
the constructor would at least create n empty arrays rather than
segfaulting.Right? Then you wouldn't even need to change your runtime environment
(ie PHP version) to solve the problem.
The easy way to solve it was to limit the array to less than 64K entries,
which I did before I made my original post. The problem only showed up
because of a bug causing the cache to grow in an unconstrained way. Thanks
for the suggestion, but the reason I started this thread was because I
wanted to make sure that this doesn't happen again, regardless of bad
programming.
Upgrading to PHP 5 is certainly a reasonable solution, it's been on our
wishlist for a while. We will probably drop support for PHP 4 in MediaWiki
1.6 or 1.7, so that we can take advantage of the new features. Upgrading
won't be a trivial task, we're apparently one of the larger PHP
installations in the world, with about 100 servers running PHP, and there's
a number of custom extensions to check as well. Still, it has to be done
some time.
-- Tim Starling
Upgrading to PHP 5 is certainly a reasonable solution, it's been on our
wishlist for a while. We will probably drop support for PHP 4 in MediaWiki
1.6 or 1.7, so that we can take advantage of the new features. Upgrading
We've used MediaWiki at work with PHP 5.1.x (from CVS) for a long time
without any problems. I'd suggest you just start another branch of it
e.g. 2.0 which is for PHP 5.1 only. Then just don't upgrade 1.x anymore. :)
People will just have to upgrade. They just don't know it's for their
own good yet..
--Jani