Hi,
When I posted yesterday's patch to add stream support to include_path
(http://news.php.net/php.internals/36031) I mentioned that I suspected
benchmarking would reveal it to be slow. My primary goal is to have
no impact on current users who are using a traditional include_path,
with a secondary goal of improving performance for those who use the new
syntax. Today I ran callgrind on the thing, with some surprising results.
With the patch, include is faster for our traditional users than it is
now.
With the patch, include_once with >1000 unique files is about 3% slower
- not the whole execution, just include_once
With the patch, include_once with 1 unique file included 10000 times is
insignificantly slower (about 0.4%)
For these reasons, I'm really encouraged :). The next step is to
absolutely ensure correctness and then see if the streams part of
include_path can be optimized at all (or if it needs it).
Details
I just ran callgrind on this script:
<?php
set_include_path('.:/usr/local/lib/php:/home/cellog/workspace/php5/ext/phar');
for ($i = 0; $i < 100000; $i++) {
    include 'extra.php';
}
The empty file "extra.php" (zero byte) is in
"/home/cellog/workspace/php5/ext/phar/extra.php" ensuring that we
traverse include_path to find it.
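To illustrate why the file sits in the last include_path entry, the lookup order can be sketched in userland PHP. This is only an illustration: find_in_include_path() is a hypothetical helper, not a PHP function and not the engine's implementation. It assumes nothing beyond entries being tried in order until one contains the file.

```php
<?php
// Hypothetical helper (not part of PHP): returns the first path under an
// include_path entry where $file exists, or null if no entry contains it.
function find_in_include_path($file, $includePath)
{
    foreach (explode(PATH_SEPARATOR, $includePath) as $dir) {
        $candidate = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $file;
        if (is_file($candidate)) {
            return $candidate; // earlier entries win
        }
    }
    return null;
}

// With the file only in the last entry, every earlier entry is checked
// and rejected first.
var_dump(find_in_include_path('extra.php', get_include_path()));
```

Placing extra.php in the final entry therefore forces the longest possible search on each include.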
To my great shock, the script runs faster with my patch: without the
patch, it executes significantly more instruction cycles in
php_stream_open_for_zend_ex.
Note that this does not measure the cost of *_once. *_once is a lot
harder to measure, so I created 10,000 files (yikes) via this script:
<?php
for ($i = 1; $i <= 10000; $i++) {
    file_put_contents('test' . $i, '');
}
and then ran this test script:
<?php
set_include_path('.:/usr/local/lib/php:/home/cellog/workspace/php5/poop');
for ($i = 1; $i <= 10000; $i++) {
    include_once 'test' . $i;
}
callgrind reported that php_resolve_path was about twice as slow with the
patch as without it, resulting in a 3% degradation of include_once
performance relative to the current version (which is much faster than
5.2.x, incidentally).
Finally, to test the _once aspect of include_once, I ran this script:
<?php
set_include_path('.:/usr/local/lib/php:/home/cellog/workspace/php5/ext/phar');
for ($i = 0; $i < 100000; $i++) {
    include_once 'extra.php';
}
This script highlights the most common use case of
include_once/require_once: attempting to include the same file multiple
times. The difference in performance was insignificant, with callgrind
reporting a total execution portion of 75.12% for CVS, and 75.57% with
my patch.
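For readers less familiar with what the "_once" part does in that loop, here is a small self-contained sketch. The temporary file and the "runs" counter are made up for the illustration and are not part of the benchmark.

```php
<?php
// Temporary file whose body bumps a global counter each time it executes.
$file = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($file, '<?php $GLOBALS["runs"]++;');

$GLOBALS['runs'] = 0;
for ($i = 0; $i < 3; $i++) {
    include_once $file; // body executes on the first iteration only
}
$onceRuns = $GLOBALS['runs'];

$GLOBALS['runs'] = 0;
for ($i = 0; $i < 3; $i++) {
    include $file; // body executes on every iteration
}
$plainRuns = $GLOBALS['runs'];

unlink($file);
echo "include_once: $onceRuns, include: $plainRuns\n"; // include_once: 1, include: 3
```

In the include_once loop, every iteration after the first has to recognise that the file was already included, and that repeated check is what this benchmark exercises.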
So, it looks like the biggest performance hit would be for users
including more than 1000 different files, and would result in
approximately 3% slower performance of include_once. I'm curious how
many of our readers have a PHP setup that includes close to this many
files, because it seems rather unlikely to me that anyone would include
more than a few hundred in a single process.
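For anyone who wants to check their own setup, a minimal sketch using get_included_files() follows. The helper name included_file_count() and the log message are only illustrative.

```php
<?php
// Illustrative helper: number of files this process has included so far
// (the initially executed script is part of the list).
function included_file_count()
{
    return count(get_included_files());
}

// Log the final count when the script or request finishes.
register_shutdown_function(function () {
    error_log('files included in this process: ' . included_file_count());
});
```

Dropping this into a bootstrap file should show how close a real application comes to the 1000-file range.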
The surprising news is that users who are using "include" would see a
performance improvement from my patch, so I recommend that portion be
committed regardless of other actions. This improvement probably
results from removing an include_path search in plain_wrapper.
Thanks,
Greg
what about including(_once) by absolute path?
--
Alexey Zakhlestin
http://blog.milkfarmsoft.com/
Alexey Zakhlestin wrote:
what about including(_once) by absolute path?
Hi,
There is no difference from the existing code for include_once: the
first line of php_resolve_path checks IS_ABSOLUTE_PATH, and all of
the code I wrote follows that line. Calls to include (no _once) would
add 1 function call to php_resolve_path, an insignificant difference.
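The control flow described above can be sketched in userland PHP. This is not the C code from the patch: is_absolute_path() and resolve_path() are hypothetical names, the absolute-path test assumes Unix-style paths, and the real php_resolve_path may do more than this.

```php
<?php
// Hypothetical stand-in for the engine's IS_ABSOLUTE_PATH check
// (assumption: Unix-style paths only).
function is_absolute_path($path)
{
    return $path !== '' && $path[0] === '/';
}

// Hypothetical resolver: absolute paths return before include_path is
// consulted, so the include_path walk never runs for them.
function resolve_path($file, $includePath)
{
    if (is_absolute_path($file)) {
        return $file; // early exit, no search
    }
    foreach (explode(PATH_SEPARATOR, $includePath) as $dir) {
        $candidate = rtrim($dir, '/') . '/' . $file;
        if (is_file($candidate)) {
            return $candidate;
        }
    }
    return null;
}

var_dump(resolve_path('/home/cellog/workspace/php5/ext/phar/extra.php', get_include_path()));
```

With the check at the top, the cost for an absolute path is one function call and one comparison, whatever include_path contains.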
Greg