Ok, I am no C coder and I don't claim to be very smart about low level
parts of PHP. But, IMO, this is a bug.
http://bugs.php.net/bug.php?id=36924
I have created a test script that shows the insanity of this bug:
<?php
$str = "This is a medium length string";
$start = memory_get_usage();
for ($x = 1; $x <= 100; $x++) {
    echo $str . ": (3.5): " . time() . "\n";
}
$first_growth = number_format(memory_get_usage() - $start);
$start = memory_get_usage();
for ($x = 1; $x <= 100; $x++) {
    echo "$str: (3.5): " . time() . "\n";
}
$growth = number_format(memory_get_usage() - $start);
echo "first growth: $first_growth\nsecond growth: $growth\n";
?>
Now, you turn that into a script that is going to loop millions of
times, building strings to build data and run for hours. The next thing
you know, your cron job is using 450MB of memory and you try to figure
out what YOU did wrong. Hours later you find this little jewel waiting
for you and realize you did not do anything wrong. You just used the
language.
Everyone ignored my last email about memory waste and large arrays. I
can only assume that you will ignore this one too. I am starting to see
why Stefan became so bitter.
I have always tried to squash the "myth" that putting variables inside
strings in PHP was bad, but now, I think I will flip on that.
I tested as far back as PHP 4.3.9 and as new as 5.2.1.
--
Brian Moon
http://dealnews.com/
It's good to be cheap =)
Ok, I am no C coder and I don't claim to be very smart about low level
parts of PHP. But, IMO, this is a bug.
http://bugs.php.net/bug.php?id=36924
I have created a test script that shows the insanity of this bug:
<?php
$str = "This is a medium length string";
$start = memory_get_usage();
for ($x = 1; $x <= 100; $x++) {
    echo $str . ": (3.5): " . time() . "\n";
}
$first_growth = number_format(memory_get_usage() - $start);
$start = memory_get_usage();
for ($x = 1; $x <= 100; $x++) {
    echo "$str: (3.5): " . time() . "\n";
}
$growth = number_format(memory_get_usage() - $start);
echo "first growth: $first_growth\nsecond growth: $growth\n";
?>
Number of iterations | First growth | Second growth | Peak usage
100                  | 480 bytes    | ~5Kb          | ~73Kb
100 000              | 556 bytes    | ~32Kb         | ~100Kb
1 000 000            | 556 bytes    | ~32Kb         | ~100Kb
Tested with PHP 5.2.1 and 5.2.2-dev on Linux (with --disable-debug).
Now, you turn that into a script that is going to loop millions of
times, building strings to build data and run for hours. The next thing
you know, your cron job is using 450MB of memory and you try to figure
out what YOU did wrong. Hours later you find this little jewel waiting
for you and realize you did not do anything wrong. You just used the
language.
Yup, the language stole ~32Kb of your memory and used it to speed up the allocation of small chunks.
Everyone ignored my last email about memory waste and large arrays. I
can only assume that you will ignore this one too.
I am starting to see why Stefan became so bitter.
I have always tried to squash the "myth" that putting variables inside
strings in PHP was bad, but now, I think I will flip on that.
I tested as far back as PHP 4.3.9 and as new as 5.2.1.
--
Wbr,
Antony Dovgal
Antony Dovgal wrote:
Yup, the language stole ~32Kb of your memory and used it to speed up the
allocation of small chunks.
Ok, in my attempt to send a sane-looking example, I cleaned the problem out of it.
<?php
$str = "This is a medium length string";
$start = memory_get_usage();
for ($x = 1; $x <= 1000000; $x++) {
    $var = $str . " " . $x . ": (3.5): " . time() . "\n";
}
$first_growth = number_format(memory_get_usage() - $start);
$start = memory_get_usage();
for ($x = 1; $x <= 1000000; $x++) {
    $var = "$str $x: (3.5): " . time() . "\n";
}
$growth = number_format(memory_get_usage() - $start);
echo "first growth: $first_growth\nsecond growth: $growth\n";
?>
The introduction of $x into the string makes the difference.
first growth: 892
second growth: 3,955,068
--
Brian Moon
http://dealnews.com/
It's good to be cheap =)
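The comparison above can also be written as a small reusable harness. This is a minimal sketch, not code from the thread: it assumes PHP 7 or later (closures and scalar type hints did not exist in the versions tested here), and the helper name measure_growth is hypothetical. The numbers it prints will vary by build and platform, as the replies show.

```php
<?php
// Hypothetical helper (not from the thread): runs $build $iterations times
// and returns the change in memory_get_usage() across the loop, in bytes.
// The last string built is still held when the second reading is taken,
// as in the thread's scripts.
function measure_growth(callable $build, int $iterations): int
{
    $before = memory_get_usage();
    for ($x = 1; $x <= $iterations; $x++) {
        $var = $build($x); // overwritten on every pass
    }
    return memory_get_usage() - $before;
}

$str = "This is a medium length string";

// Variant 1: explicit concatenation.
$concat = function (int $x) use ($str): string {
    return $str . " " . $x . ": (3.5): " . time() . "\n";
};

// Variant 2: variables interpolated inside the double-quoted string.
$interp = function (int $x) use ($str): string {
    return "$str $x: (3.5): " . time() . "\n";
};

printf("first growth: %s\n", number_format(measure_growth($concat, 100000)));
printf("second growth: %s\n", number_format(measure_growth($interp, 100000)));
```

Wrapping each variant in a closure adds call overhead that the original loops do not have, so treat the output as a rough comparison only.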
The introduction of $x into the string makes the difference.
first growth: 892
second growth: 3,955,068
Strange, I get:
first growth: 820
second growth: 56
Which build did you test?
--
Stanislav Malyshev, Zend Products Engineer
stas@zend.com http://www.zend.com/
Brian Moon wrote:
Antony Dovgal wrote:
Yup, the language stole ~32Kb of your memory and used it to speed up
the allocation of small chunks.
Ok, in my attempt to send a sane-looking example, I cleaned the problem out of it.
<?php
$str = "This is a medium length string";
$start = memory_get_usage();
for ($x = 1; $x <= 1000000; $x++) {
    $var = $str . " " . $x . ": (3.5): " . time() . "\n";
}
$first_growth = number_format(memory_get_usage() - $start);
$start = memory_get_usage();
for ($x = 1; $x <= 1000000; $x++) {
    $var = "$str $x: (3.5): " . time() . "\n";
}
$growth = number_format(memory_get_usage() - $start);
echo "first growth: $first_growth\nsecond growth: $growth\n";
?>
The introduction of $x into the string makes the difference.
first growth: 892
second growth: 3,955,068
I get:
first growth: 704
second growth: 32,264
with current PHP_5_2 checkout. I don't have a 5.1.x handy with memory
limits compiled in.
-Rasmus
Rasmus Lerdorf wrote:
I get:
first growth: 704
second growth: 32,264
with current PHP_5_2 checkout. I don't have a 5.1.x handy with memory
limits compiled in.
Starting to think this is more prevalent on Mac OS X. My huge numbers
are coming from 5.2.0, on Mac OS X. On my Linux boxes I am getting
smaller numbers, more like what you guys are reporting so I assume you
are using Linux.
Then there is this confusing result:
$ php test.php
first growth: 860
second growth: -12,145,628
That is on 5.2.0 on Linux. To get -12MB, it had to be that big or
bigger to start with at some point. Anyone know what that is about?
Passing true to memory_get_usage only clouds things further, as I get all
0 for memory growth on all platforms, making me think that
memory_get_usage() without the true param (>=5.2) is useless.
However, despite all those tests (which you can never trust to show real
world perfectly) I changed the script I was writing to use NO vars in
strings. It is a script that processes queued items. As long as there
are items in the queue, it keeps running. The strings I was building
were mostly SQL queries. It is now holding at 11MB and not growing.
Before, I was having to kill it after an hour as it would grow to over
100MB of memory and growing. That was all determined by looking at ps.
Also, what was taking hours, now only took 15 minutes while I was
writing this email. All of this was on Linux. My fear is that Mac OS X
is making an issue very obvious that is more subtle on Linux, but does
exist. I will see if I can come up with a good example on Linux that
makes this more prevalent. It may have to run for an hour, which I
understand is not the primary target of PHP. I am a big proponent of
that philosophy. But, there is a CLI and I am gonna use it. I would
like it to be sane. =)
--
Brian Moon
http://dealnews.com/
It's good to be cheap =)
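For a long-running queue processor like the one described above, logging memory at intervals from inside the script gives a record to compare against ps. The following is a hedged sketch only: it assumes PHP 7 or later, and next_item, process_queue, and the items table are hypothetical stand-ins, not the actual script discussed here.

```php
<?php
// Hypothetical stand-in for fetching one queued item; returns null when empty.
function next_item(array &$queue)
{
    return array_pop($queue);
}

// Works through the queue and records memory readings every $every items.
function process_queue(array $queue, int $every): array
{
    $readings = [];
    $done = 0;
    while (($item = next_item($queue)) !== null) {
        // Stand-in for the real work: a query string built by concatenation.
        $sql = "UPDATE items SET processed = 1 WHERE id = " . (int) $item;
        $done++;
        if ($done % $every === 0) {
            $readings[] = [
                'done'  => $done,
                'usage' => memory_get_usage(),      // as reported to the script
                'real'  => memory_get_usage(true),  // as obtained from the system
                'peak'  => memory_get_peak_usage(), // highest reported usage so far
            ];
        }
    }
    return $readings;
}

foreach (process_queue(range(1, 50000), 10000) as $r) {
    printf(
        "%d items: usage=%s real=%s peak=%s\n",
        $r['done'],
        number_format($r['usage']),
        number_format($r['real']),
        number_format($r['peak'])
    );
}
```

If the usage column keeps climbing between readings while the work per item stays the same, that points at the script or the engine rather than at measurement noise.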
Brian Moon wrote:
Rasmus Lerdorf wrote:
I get:
first growth: 704
second growth: 32,264
with current PHP_5_2 checkout. I don't have a 5.1.x handy with memory
limits compiled in.
Starting to think this is more prevalent on Mac OS X. My huge numbers
are coming from 5.2.0, on Mac OS X. On my Linux boxes I am getting
smaller numbers, more like what you guys are reporting so I assume you
are using Linux.
Nope, mine was from OSX:
3:45pm shiny:~> php uu
first growth: 704
second growth: 32,264
3:45pm shiny:~> php -v
PHP 5.2.2-dev (cli) (built: Feb 18 2007 00:11:29)
Copyright (c) 1997-2007 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies
3:45pm shiny:~> uname -a
Darwin shiny.lerdorf.com 8.8.0 Darwin Kernel Version 8.8.0: Fri Sep 8
17:18:57 PDT 2006; root:xnu-792.12.6.obj~1/RELEASE_PPC Power Macintosh
powerpc
-Rasmus
Starting to think this is more prevalent on Mac OS X. My huge numbers
are coming from 5.2.0, on Mac OS X. On my Linux boxes I am getting
smaller numbers, more like what you guys are reporting so I assume you
are using Linux.
I vaguely remember there was some problem with calculating exact memory
usage some time ago. I am not sure it was 5.2.0, but it might be. I would first
verify what happens in 5.2.1.
Passing true to memory_get_usage only clouds things further as I get all
0 for memory growth on all platforms making me think that
memory_get_usage() without the true param (>=5.2) is useless.
Why is it useless? 0 means your script didn't change real memory usage -
but it wasn't supposed to - since PHP pre-allocates memory blocks
anyway, so to change real memory usage you have to allocate something
big and hold it.
Stanislav Malyshev, Zend Products Engineer
stas@zend.com http://www.zend.com/
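The difference between the two readings can be seen by taking both before and after a small and a large allocation. This is a minimal sketch under assumptions: PHP 7 or later with the default Zend allocator, and a memory_limit comfortably above 16 MB. The snapshot helper is hypothetical. A small allocation will often leave the real figure unchanged because it is served from memory the engine already holds, while a large string that stays referenced should raise both.

```php
<?php
// Hypothetical helper: returns both readings memory_get_usage() can give.
function snapshot(): array
{
    return [
        'used' => memory_get_usage(),     // memory handed out to the script
        'real' => memory_get_usage(true), // memory obtained from the system
    ];
}

$before = snapshot();

$small = str_repeat('x', 1024);            // 1 KB, kept in $small
$afterSmall = snapshot();

$big = str_repeat('x', 16 * 1024 * 1024);  // 16 MB, kept in $big
$afterBig = snapshot();

printf(
    "small: used +%s, real +%s\n",
    number_format($afterSmall['used'] - $before['used']),
    number_format($afterSmall['real'] - $before['real'])
);
printf(
    "big:   used +%s, real +%s\n",
    number_format($afterBig['used'] - $before['used']),
    number_format($afterBig['real'] - $before['real'])
);
```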
Rasmus Lerdorf wrote:
I get:
first growth: 704
second growth: 32,264
with current PHP_5_2 checkout. I don't have a 5.1.x handy with memory
limits compiled in.
Starting to think this is more prevalent on Mac OS X. My huge numbers
are coming from 5.2.0, on Mac OS X.
You should really upgrade to 5.2.1 first.
On my Linux boxes I am getting
smaller numbers, more like what you guys are reporting so I assume you
are using Linux.
--
Wbr,
Antony Dovgal
I think what is happening is that the Engine caches small-sized memory
blocks and does not really free them when they are deallocated, even
when they are not referenced anymore. The cache size is limited, so I
don't think you need to be concerned. If you still think it is a big
problem for you, you could set the USE_ZEND_ALLOC environment variable to 0
and have the engine use regular malloc/free. Note that this would
probably harm your performance.
Stanislav Malyshev, Zend Products Engineer
stas@zend.com http://www.zend.com/
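When comparing runs with and without the engine's allocator, it helps to have the script print which mode was requested, so the numbers are not mixed up later. A small sketch, assuming PHP 7 or later: the allocator_mode helper is hypothetical, and the rule it encodes (any value that parses to integer 0 switches the allocator off) is an assumption about the engine, not something stated in this thread.

```php
<?php
// Hypothetical helper: describes the allocator mode implied by the value of
// the USE_ZEND_ALLOC environment variable (false means the variable is unset).
// Assumption: a value that parses to integer 0 disables the Zend allocator.
function allocator_mode($value): string
{
    if ($value === false) {
        return 'zend allocator (default)';
    }
    return ((int) $value === 0) ? 'system malloc/free' : 'zend allocator';
}

echo "requested allocator: ", allocator_mode(getenv('USE_ZEND_ALLOC')), "\n";
```

On a Unix shell the variable can be set for a single run, for example USE_ZEND_ALLOC=0 php test.php, and the line above then shows up next to the memory figures.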