Hello,
Following on from the work to introduce a read-only file cache in
#16551, I'd like to propose the next logical step and request feedback
on a change to improve opcache's portability in modern containerized
environments.
This is a pre-RFC discussion to gather initial thoughts on the
approach before I proceed with writing a full RFC document.
A draft PR implementing the change can be found here:
https://github.com/php/php-src/pull/19123
The Problem
Currently, the opcache file cache is versioned using zend_system_id.
While effective for ensuring safety, this ID is highly sensitive to
minor changes in the build environment, such as updates to system
libraries (e.g., libc) or even patch-level changes in the PHP version
itself.
This fragility makes pre-generated, read-only opcaches difficult to
use reliably on platforms like Google App Engine/Cloud Run, AWS
Lambda, or any environment that uses immutable Docker images with
automatic base image updates. These updates, while beneficial for
security, can change the zend_system_id and needlessly invalidate an
otherwise perfectly valid opcache, negating the performance benefits
of pre-warming.
The Proposed Solution
To address this, I propose introducing a more stable "ABI hash" for
versioning the file cache.
During the build process, the change would calculate a CRC32 checksum
of the header files that define opcache's core data structures (e.g.,
zend_op, zval, zend_string, zend_function, zend_class_entry).
This hash would then be used as part of the cache key instead of the
more volatile zend_system_id.
The result is a cache key that changes only when the underlying data
structures, i.e. the Application Binary Interface (ABI), actually
change. This typically happens only between minor PHP versions (e.g.,
8.5 vs 8.6), not on every patch release or system library update.
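To make the build step concrete, here is a minimal sketch of how such a hash could be computed. It is written in Python purely for illustration and is not the code from the draft PR; the abi_hash function and the ABI_HEADERS list are hypothetical names, and the set of headers actually hashed may differ.

```python
import zlib
from pathlib import Path

# Hypothetical header list for illustration only; the draft PR may hash a
# different set of files.
ABI_HEADERS = [
    "Zend/zend_types.h",
    "Zend/zend_compile.h",
    "Zend/zend.h",
]


def abi_hash(paths):
    """Return a CRC32, as 8 hex digits, over the contents of the given files."""
    crc = 0
    # Sort the paths so the result does not depend on the order in which the
    # build system happens to list the headers.
    for path in sorted(str(p) for p in paths):
        # zlib.crc32 accepts a running checksum, so the files are chained
        # together as if they were one byte stream.
        crc = zlib.crc32(Path(path).read_bytes(), crc)
    return f"{crc & 0xFFFFFFFF:08x}"
```

Note that a checksum over raw header bytes, as in this sketch, also changes on comment-only or whitespace-only edits to those headers, so it is stricter than a true layout comparison.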
Benefits
- Improved Portability: Opcache file caches become portable across
  builds with identical ABIs, allowing them to be reused across
  different patch versions of PHP (e.g., 8.3.5 -> 8.3.6) and different
  base images.
- Enhanced Performance on Managed Platforms: This significantly
  improves the effectiveness of pre-warmed, read-only caches, leading to
  consistently better cold-start performance and reduced CPU usage in
  containerized environments.
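As a rough illustration of the portability benefit, the load-time check could reduce to comparing the stored hash with that of the running build. The sketch below is Python for illustration only; CacheHeader, its fields, and can_reuse are hypothetical and do not reflect the real opcache file header layout.

```python
from dataclasses import dataclass


@dataclass
class CacheHeader:
    # Hypothetical fields; the real opcache file cache header differs.
    php_version: str  # version of the build that wrote the cache
    abi_hash: str     # ABI hash recorded when the cache was generated


def can_reuse(cache, runtime_abi_hash):
    """Accept a cache file when its ABI hash matches the running build.

    The PHP version is deliberately not compared, so a cache written by
    one patch release stays valid under another with the same ABI hash.
    """
    return cache.abi_hash == runtime_abi_hash
```

Under this assumption, a cache generated by 8.3.5 would be accepted by an 8.3.6 runtime as long as both builds report the same hash, and rejected otherwise.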
This is the logical continuation of the work on read-only file caches,
making them a truly viable solution for high-performance, scalable
applications.
I would greatly appreciate any feedback on this approach.
Is an ABI hash a feasible solution?
Are there any potential pitfalls or alternative approaches I should consider?
Thank you for your time.
Samuel Melrose
sam@melroseandco.uk
+44 (0) 7754096383