Hi,
I have just reported #72666 (touch() works differently on plain paths
and file:// paths with regard to clearing the stat cache) and I would
gladly provide a PR with a fix and unit tests. However, before I start
working on this (well - it would be an easy fix), I would like to
question whether the stat cache is a good idea and whether it's still
(or ever was) needed these days.
File systems right now are really good at quickly providing the
information: On MacOS I measure only 10% loss in performance when
doing nothing but calling stat() and clearstatcache() compared to just
calling stat(). Over NFS on Linux, it's 20%.
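For anyone who wants to repeat this kind of measurement, here is a minimal sketch of such a micro-benchmark. It is not the script behind the numbers above: the helper name timeStatCalls(), the scratch file and the iteration count are all assumptions made for illustration, and results will vary with file system and PHP version.

```php
<?php
// Hypothetical helper: time $iterations stat() calls on $path,
// optionally calling clearstatcache() after each one.
function timeStatCalls($path, $iterations, $clearCache)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        stat($path);
        if ($clearCache) {
            clearstatcache();
        }
    }
    return microtime(true) - $start;
}

// Scratch file created only for this sketch.
$path = tempnam(sys_get_temp_dir(), 'statbench_');
$iterations = 50000; // arbitrary; raise it for more stable timings

$plain     = timeStatCalls($path, $iterations, false);
$withClear = timeStatCalls($path, $iterations, true);

printf("stat() only:               %.4f s\n", $plain);
printf("stat() + clearstatcache(): %.4f s\n", $withClear);

unlink($path);
```

The first run is mostly served from the stat cache after the initial call, the second one has to go to the file system every time, so the difference between the two is roughly what the cache saves in this best-case loop.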
Additionally, I wonder what the stat cache is actually helpful for -
as far as I understand it, this only helps if you repeatedly call
stat() (or a related function) on the same file within the same
request which probably isn't very common application behaviour to
begin with.
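To make the "repeated call on the same file" case concrete, here is a small sketch of how the cache shows up from userland. The scratch file name is hypothetical, and whether the middle filesize() call returns the old or the new size depends on which operations clear the cache in a given PHP version, so no particular value is claimed for it.

```php
<?php
// Scratch file created only for this sketch.
$path = tempnam(sys_get_temp_dir(), 'statcache_');
file_put_contents($path, 'abc');

// First stat-family call on this path: asks the file system
// and remembers the result in the stat cache.
$first = filesize($path);

// Grow the file through a separate handle.
$fp = fopen($path, 'a');
fwrite($fp, 'defg');
fclose($fp);

// Second call on the same path: may be answered from the cache,
// in which case it still reports the old size.
$maybeStale = filesize($path);

// After clearing the cache the file system is asked again.
clearstatcache(true, $path);
$fresh = filesize($path);

printf("first: %d, before clear: %d, after clear: %d\n",
    $first, $maybeStale, $fresh);

unlink($path);
```

Only the second filesize() call can benefit from the cache here, and it is also the one that can return outdated information.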
As such, before I fix touch() to call php_clear_stat_cache(), maybe
it's worth reconsidering the whole thing and removing it from PHP
itself at least for later versions, though I guess there I would be in
firm RFC territory which I'd be willing to write if I had RFC writing
credentials.
What do you think?
Philip
Hi Philip,
-----Original Message-----
From: Philip Hofstetter [mailto:phofstetter@sensational.ch]
Sent: Monday, July 25, 2016 9:55 AM
To: PHP internals <internals@lists.php.net>
Subject: [PHP-DEV] stat cache / still needed these days?
Hi,
I have just reported #72666 (touch() works differently on plain paths and file://
paths with regards to cleaning the stat cache) and I would gladly provide a PR
with a fix and unit tests. However, before I start working on this (well - it would
be an easy fix), I would like to question whether the stat cache is a good idea
and whether it's still (or ever was) needed these days.
File systems right now are really good at quickly providing the
information: On MacOS I measure only 10% loss in performance when doing
nothing but calling stat() and clearstatcache() compared to just calling stat().
Over NFS on Linux, it's 20%.
Additionally, I wonder what the stat cache is actually helpful for - as far as I
understand it, this only helps if you repeatedly call stat() (or a related
function) on the same file within the same request which probably isn't very
common application behaviour to begin with.
As such, before I fix touch() to call php_clear_stat_cache(), maybe it's worth
reconsidering the whole thing and removing it from PHP itself at least for later
versions, though I guess there I would be in firm RFC territory which I'd be willing
to write if I had RFC writing credentials.
What do you think?
It is not only complete paths that fill the cache, but also all of their sub-paths. For /a/b/c/d, say, the individual sub-parts /a/b/c, /a/b and /a are cached as well. So the cache helps not just when the same file is accessed again, but also when siblings sharing a parent directory are accessed. This saves a lot of recursive function calls during path resolution, especially in the TS (thread-safe) build, but also in the NTS one.
Even disregarding that, I don't really see any functional gain in removing it: in the end the path lands in some C runtime I/O function as the "last authority". From experience, paths hardly ever change within the same request, so the common case profits from the path caching.
Regards
Anatol