Can someone spot why this code
(tested in both 5.2.5 and 5.3)
<?php
function curl($post) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "www.fdhfkdsslak.bogus");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    if ($post) {
        curl_setopt($ch, CURLOPT_POST, 1);
        // Build a constant-size query string: a=0&a=1&...&a=74&
        for ($args = '', $i = 0; $i < 75; $i++) $args .= "a=$i&";
        curl_setopt($ch, CURLOPT_POSTFIELDS, $args);
        unset($args);
    }
    curl_exec($ch);
    curl_close($ch);
}
echo "start ".memory_get_usage()."\n";
for ($i = 0; $i < 10; $i++) {
    curl(0);
    echo "GET ".memory_get_usage()."\n";
}
for ($i = 0; $i < 10; $i++) {
    curl(1);
    echo "POST ".memory_get_usage()."\n";
}
?>
outputs:
start 326616
GET 327256
GET 327276
GET 327276
GET 327276
GET 327276
GET 327276
GET 327276
GET 327276
GET 327276
GET 327276
POST 327516
POST 327588
POST 327652
POST 327712
POST 327892
POST 328064
POST 328228
POST 328384
POST 328528
POST 328628
The leak size isn't constant even though my post data size is constant,
and the returned data (a dns error, presumably) is the same. Seems like
there is something weird in our postfield handling code in the curl
extension. Note also that it stabilizes somehow, although this is an
extremely simplified version of a long-running command-line script that
seems to grow on each post with wildly varying post data.
-Rasmus
It's not a solution, Rasmus, but here's more data, taken from
5.2.4. The results were exactly the same, in the same order, after
running your code ten times.
start 57140
GET 57820
GET 57820
GET 57820
GET 57820
GET 57820
GET 57820
GET 57820
GET 57820
GET 57820
GET 57820
POST 58244
POST 58492
POST 58728
POST 58948
POST 59060
POST 59164
POST 59264
POST 59360
POST 59452
POST 59452
After changing the for() loop in the curl() function to this:
<?php
for ($args = '', $i = 0; $i < 75; $i++) {
    $args .= $i == 0 ? "a=$i" : "&a=$i";
}
?>
the results stabilize more quickly:
start 57688
GET 58340
GET 58340
GET 58340
GET 58340
GET 58340
GET 58340
GET 58340
GET 58340
GET 58340
GET 58340
POST 58844
POST 59088
POST 59416
POST 59692
POST 59800
POST 59904
POST 59904
POST 59904
POST 59904
POST 59904
--
</Daniel P. Brown>
Note that changing it to pass an array of post args instead of passing
it a string makes the leak go away.
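A minimal sketch of the array form, for comparison with the string form in the original script. The helper names build_post_fields() and curl_post_array() are hypothetical, and the keys are a0..a74 rather than a repeated "a" because a PHP array cannot hold duplicate keys. Note that passing an array makes curl send the body as multipart/form-data, so the request is not byte-for-byte equivalent to the string version.

```php
<?php
// Hypothetical helper (not from the thread): build the POST payload as an
// array instead of a query string. Keys are a0..a74 because array keys
// must be unique.
function build_post_fields($count) {
    $fields = array();
    for ($i = 0; $i < $count; $i++) {
        $fields["a$i"] = (string)$i;
    }
    return $fields;
}

// Hypothetical helper: POST the array and return whatever curl_exec() returns.
function curl_post_array($url, $fields) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_POST, 1);
    // An array value is encoded by curl as multipart/form-data.
    curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}
```

Calling curl_post_array("www.fdhfkdsslak.bogus", build_post_fields(75)) in place of curl(1) in the test loop would exercise the array code path.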
This block of code in interface.c deals with the string case:
    } else {
        char *post = NULL;
        convert_to_string_ex(zvalue);
        post = estrndup(Z_STRVAL_PP(zvalue), Z_STRLEN_PP(zvalue));
        zend_llist_add_element(&ch->to_free.str, &post);
        error = curl_easy_setopt(ch->cp, CURLOPT_POSTFIELDS, post);
        error = curl_easy_setopt(ch->cp, CURLOPT_POSTFIELDSIZE, Z_STRLEN_PP(zvalue));
    }
It converts the value to a string, estrndups it with the right args, and
adds it to the to_free.str list. Then in the destructor for the resource
we have:
    zend_llist_clean(&ch->to_free.str);
So I am obviously missing something. That code looks fine.
-Rasmus
For me, the script Rasmus posted outputs:
start 120400
GET 122624
GET 122624
GET 122624
GET 122624
GET 122624
GET 122624
GET 122624
GET 122624
GET 122624
GET 122624
POST 124968
POST 125928
POST 126608
POST 127272
POST 127920
POST 128552
POST 129168
POST 129768
POST 130352
POST 130920
When I call memory_get_usage() with $real_usage set to true, the results are constant.
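A small sketch of the two counters being compared here. memory_get_usage() without arguments reports the bytes currently handed out by PHP's own allocator (emalloc), while memory_get_usage(true) reports the memory reserved from the system. The helper name usage_pair() is hypothetical.

```php
<?php
// Hypothetical helper: sample both counters at once.
function usage_pair() {
    return array(
        'emalloc' => memory_get_usage(),     // bytes in use by PHP's allocator
        'real'    => memory_get_usage(true), // bytes reserved from the system
    );
}

$before = usage_pair();
$data = str_repeat('x', 1000);   // a small allocation held in $data
$after = usage_pair();

echo "emalloc: {$before['emalloc']} -> {$after['emalloc']}\n";
echo "real:    {$before['real']} -> {$after['real']}\n";
```

Because the system-level figure grows in larger steps, a small per-iteration increase in the emalloc figure can leave the real figure unchanged for a long time.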
--
"Progress is possible only if we train ourselves to think about programs
without thinking of them as pieces of executable code.” - Edsger W.
Dijkstra
Cristian Rodríguez R.
Platform/OpenSUSE - Core Services
SUSE LINUX Products GmbH
Research & Development
http://www.opensuse.org/
Yes, I'm not saying there is a malloc leak. I haven't seen that, but
the emalloc leak means that eventually a script that repeatedly sends
post requests is going to hit the memory limit no matter how much
cleanup it does.
Note that you don't actually need to send the request. It looks like
repeatedly doing:
$ch = curl_init();
curl_setopt($ch, CURLOPT_POSTFIELDS, $args);
curl_close($ch);
is enough to do it. Still looking at the code. Seems like
zend_llist_clean(&ch->to_free.str); isn't doing the right thing somehow.
-Rasmus
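The reduced case described above could be scripted roughly as follows. This is only a sketch: the helper names build_args() and set_and_close() are hypothetical, and the string is built once outside the loop so that any growth is attributable to the handle's lifecycle rather than to string construction.

```php
<?php
// Build the same 75-pair query string used in the original script.
function build_args() {
    for ($args = '', $i = 0; $i < 75; $i++) {
        $args .= "a=$i&";
    }
    return $args;
}

// Set POSTFIELDS on a fresh handle and close it without sending anything.
function set_and_close($args) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_POSTFIELDS, $args);
    curl_close($ch);
}

// Only run the loop when the curl extension is available.
if (function_exists('curl_init')) {
    $args = build_args();
    for ($i = 0; $i < 10; $i++) {
        set_and_close($args);
        echo "iteration $i: " . memory_get_usage() . "\n";
    }
}
```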
Is this really related to curl?
php -n -r 'for($i=0;$i<10;$i++) { for($args="",$j=0;$j<75;$j++) $args .= "a=$j&"; unset($args); echo memory_get_usage()."\n"; }'
55872
56100
56172
56240
56304
56364
56420
56420
56420
56420
/ manuel
Yeah, I noticed that as well as I was simplifying things down further
and further. I have gotten to the point where my simplification has
lost track of the real problem, I think. My original complicated code
is leaking hundreds of K per iteration, and I assumed the smaller leak
in the simplified version was representative of that, but I don't think
it is. That would also explain why I haven't been able to find a bug in
this curl code I have been looking at.
-Rasmus
Hi!
The output of Manuel's one-liner looks like a cache filling up, not a
leak, unless the growth fails to stop at some boundary.
Stanislav Malyshev, Zend Software Architect
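One way to tell a warm-up effect from a leak is to run many more iterations and check whether the reported usage stops growing. The sketch below assumes that "stops growing" can be approximated by comparing the last sample with the midpoint sample; the helper names memory_samples(), has_plateaued() and work() are hypothetical.

```php
<?php
// Hypothetical helper: call $fn $iterations times, sampling memory after each.
// The sample array is preallocated so that storing samples does not itself
// grow the array during the run.
function memory_samples($fn, $iterations) {
    $samples = array_fill(0, $iterations, 0);
    for ($i = 0; $i < $iterations; $i++) {
        call_user_func($fn);
        $samples[$i] = memory_get_usage();
    }
    return $samples;
}

// Hypothetical heuristic: usage has plateaued if the final sample is no
// higher than the sample taken halfway through the run.
function has_plateaued($samples) {
    $n = count($samples);
    $mid = (int)($n / 2);
    return $samples[$n - 1] <= $samples[$mid];
}

// The string-building loop from the one-liner, without any curl calls.
function work() {
    for ($args = '', $j = 0; $j < 75; $j++) {
        $args .= "a=$j&";
    }
    unset($args);
}

$samples = memory_samples('work', 200);
echo has_plateaued($samples) ? "plateau\n" : "still growing\n";
```

A run that keeps reporting growth as the iteration count is raised would point to a real leak rather than a cache reaching its working size.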