Background: PHP has a not-often-considered feature, the stat cache. That is, the runtime caches the result of the OS stat() call for files, so that subsequent reads of the same file's metadata can be faster. However, it's even less widely known that it's a single-file cache. It literally only applies when you perform two file-information operations on the same file in rapid succession, without any other file reads in between.
For more info: https://tideways.com/profiler/blog/the-php-stat-cache-explained
Because it's so rarely relevant, it can be quite a surprise in the cases where it is relevant, causing weird and hard-to-explain caching bugs in applications.
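
To make the kind of surprise concrete, here is a minimal sketch, assuming a PHP 8.x CLI with the stat cache still present. The temp-file prefix and variable names are made up for illustration. The second filesize() call may be answered from the cache rather than from the OS, which is why no value is claimed for it here.

```php
<?php
// Hypothetical scratch file; the prefix is arbitrary.
$file = tempnam(sys_get_temp_dir(), 'statcache_');

file_put_contents($file, 'abc');
$first = filesize($file);       // asks the OS and populates the stat cache

file_put_contents($file, 'abcdef');
$maybeStale = filesize($file);  // same path, no other stat in between: may come from the cache

clearstatcache();               // drop the cached stat entry
$fresh = filesize($file);       // asks the OS again

printf("first=%d maybeStale=%d fresh=%d\n", $first, $maybeStale, $fresh);

unlink($file);
```

If the cache were removed as proposed, the second and third calls would presumably always agree, and the clearstatcache() call would be unnecessary.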
The cache also dates from 20 years ago, when Rasmus added it (and the realpath cache) in Yahoo's forked PHP 4, and then it got integrated into PHP 5. However, hard drives are vastly faster than they were then, and operating systems are vastly more efficient than they were then.
There's been some discussion about making the cache disable-able, though the consensus now seems to be leaning toward getting rid of it outright:
https://github.com/php/php-src/pull/17178
Arnaud ran some quick benchmarks and found that disabling it has a less than 1% impact on Symfony and WordPress.
https://github.com/php/php-src/pull/17178#issuecomment-2554323572
Before we go any further, is there appetite among the voting population to remove it? clearstatcache() and similar functions would get stubbed out as no-ops, but otherwise we'd just hand the responsibility back to the OS where it belongs. So far that looks like it would make an almost unmeasurable performance difference while removing some surprising complexity.
Would you support such a removal?
What additional data would you need to make the case for such removal?
--
Larry Garfield
larry@garfieldtech.com
At least on the platform I'm supporting (IBM i), filesystem calls can be
quite slow. I know it's similar on Windows too. That said, I think
getting rid of the stat cache is probably the right call. It's better to
do this at the OS or application levels, where they know more about the
workload (either because they have a system view, or the app knows what
it needs to keep). I haven't measured this yet though.
> Before we go any further, is there appetite among the voting population to remove it? clearstatcache() and similar functions would get stubbed out as no-ops, but otherwise we'd just hand the responsibility back to the OS where it belongs, which seems so far like it would be almost an unmeasurable performance difference but remove some surprise complexity.
> Would you support such a removal?
I still think the stat cache should be deprecated first. That gives
users a chance to reconsider calling multiple stat-related functions
instead of doing a single stat() call. See my previous comment[1] for
some further details.
[1] https://github.com/php/php-src/pull/5894#issuecomment-2546473892
Christoph
Hi,
On Fri, Dec 20, 2024 at 10:37 PM Christoph M. Becker cmbecker69@gmx.de
wrote:
> I still think the stat cache should be deprecated first. That gives
> users a chance to reconsider calling multiple stat-related functions
> instead of doing a single stat() call. See my previous comment[1] for
> some further details.
I don't think we should force users to update their code because of a negligible
perf impact. Most of the time this won't play any role in perf anyway, as
for applications that actually do something, most of the time is spent
waiting for IO. So I really don't see a reason for deprecation in this
case.
Regards
Jakub
> I still think the stat cache should be deprecated first. That gives
> users a chance to reconsider calling multiple stat-related functions
> instead of doing a single stat() call. See my previous comment[1] for
> some further details.
> [1] https://github.com/php/php-src/pull/5894#issuecomment-2546473892
> Christoph
What exactly would deprecation look like here? My plan was to just rip the cache out, and update clearstatcache() to be a no-op, but issue a deprecation message "Hey, this doesn't do anything anymore." And then we can remove the function itself in like PHP 10 or something, because it doesn't hurt anything to leave it be.
I don't see there being much value to a period of "hey, this is going to do nothing in the future", when users couldn't do anything about it. That just gives them a deprecation notice they cannot fix, if they're in one of the very few situations where manually clearing the cache is useful. That doesn't seem great.
--
Larry Garfield
> What exactly would deprecation look like here? My plan was to just rip the cache out, and update clearstatcache() to be a no-op, but issue a deprecation message "Hey, this doesn't do anything anymore." And then we can remove the function itself in like PHP 10 or something, because it doesn't hurt anything to leave it be.
> I don't see there being much value to a period of "hey, this is going to do nothing in the future", when users couldn't do anything about it. That just gives them a deprecation notice they cannot fix, if they're in one of the very few situations where manually clearing the cache is useful. That doesn't seem great.
I believe the whole point of the stat cache is to optimize multiple
consecutive calls to stat-related functions on the same file name.
E.g. code like
    $mtime = filemtime($filename);
    $fsize = filesize($filename);
would be a relevant example. Such code could be changed in userland to
    $stat = stat($filename);
    $mtime = $stat["mtime"];
    $fsize = $stat["size"];
where the stat cache would be irrelevant. Of course, users who are not
aware that there may be a difference in performance won't even think
about that. As such, a deprecation message could be triggered whenever
the stat cache is hit, possibly pointing also to the file:line where the
cache had been populated. The usefulness of this is based on the
assumption that it's pretty unlikely that the stat cache is hit from
unrelated code paths.
If a general deprecation is not desired (and that seems to be the case),
I'm also fine with a PR/patch that users could apply themselves, similar
to what Nikita did back then when string to number comparisons changed[1].
Note that clearstatcache() should not be no-opped altogether; clearing
(parts of) the realpath cache still seems useful.
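
As an illustration of that second role, here is a minimal sketch, assuming a PHP 8.x CLI without open_basedir restrictions (which would disable the realpath cache). It uses only the existing realpath_cache_get(), realpath_cache_size() and clearstatcache() functions; the variable names are illustrative.

```php
<?php
// Resolve a path so the realpath cache has something to hold.
$resolved = realpath(sys_get_temp_dir());

$entriesBefore = count(realpath_cache_get()); // number of cached path entries
$bytesBefore   = realpath_cache_size();       // memory used by the realpath cache

// Passing true asks PHP to clear the realpath cache as well as the stat cache.
clearstatcache(true);

$bytesAfter = realpath_cache_size();

printf(
    "entries before=%d, bytes before=%d, bytes after=%d\n",
    $entriesBefore,
    $bytesBefore,
    $bytesAfter
);
```

This is the behaviour that would be lost if clearstatcache() were turned into a complete no-op, rather than one that only stops touching the stat cache.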
[1] https://github.com/php/php-src/pull/3917
Christoph
On Fri, Dec 20, 2024 at 8:29 PM Larry Garfield larry@garfieldtech.com
wrote:
> Before we go any further, is there appetite among the voting population to
> remove it? clearstatcache() and similar functions would get stubbed out as
> no-ops, but otherwise we'd just hand the responsibility back to the OS
> where it belongs, which seems so far like it would be almost an
> unmeasurable performance difference but remove some surprise complexity.
> Would you support such a removal?
> What additional data would you need to make the case for such removal?
This gets a +1 from me. I've had bugs that I suspected were caused by this
cache, but I was never able to confirm it until putting clearstatcache() in
production. That's not a workflow I'd like to follow, and it has wasted
enough of my time.
On 20.12.2024 at 20:26, Larry Garfield wrote:
> Would you support such a removal?
+1 from me.
Here is an example of how the stat-cache can lead to interesting
situations in testing:
https://github.com/sebastianbergmann/phpunit/issues/5996#issuecomment-2422018481
> There's been some discussion about making the cache disable-able, though the consensus now seems to be leaning toward getting rid of it outright:
Just to fill in more context, which wasn't originally obvious to me: that PR thread replaces one from 2021 https://github.com/php/php-src/pull/5894 which was discussed on the list before without consensus: https://externals.io/message/115912.
That in turn links to a feature request from all the way back in 2004: https://bugs.php.net/bug.php?id=28790
I have no doubt there are various other duplicates and discussions; clearly this has always been a contentious topic.
Regards,
Rowan Tommins
[IMSoP]