Hello!
Most of PHP's symbols are case insensitive. This means extensions that
need to do things with function and method names end up lowercasing
the names and hashing the lowercased copies, often doing extra memory
allocations along the way. Since case-insensitive symbols are
language-dictated behavior, it makes sense to expose the correctly
cased (normalized) symbols to extensions. In PHP 8.0 (and possibly
older, I did not check), the engine is already interning the lowercased
name of user-defined functions; it's just not made available to
extensions.
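As a rough illustration of the per-lookup work described above, here is a minimal standalone sketch in C. It does not use the Zend API: `sketch_hash`, `sketch_lowercase_dup`, and `sketch_lookup_key` are hypothetical names, and the DJB-style hash is only a stand-in for whatever string hash the engine uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the engine's string hash (DJB-style, times 33). */
static uint64_t sketch_hash(const char *s, size_t len) {
    uint64_t h = 5381;
    for (size_t i = 0; i < len; i++) {
        h = h * 33 + (unsigned char)s[i];
    }
    return h;
}

/* Allocate an ASCII-lowercased copy of name; the caller must free() it. */
static char *sketch_lowercase_dup(const char *name, size_t len) {
    assert(name != NULL);
    char *lc = malloc(len + 1);
    if (lc == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < len; i++) {
        char c = name[i];
        lc[i] = (c >= 'A' && c <= 'Z') ? (char)(c + ('a' - 'A')) : c;
    }
    lc[len] = '\0';
    return lc;
}

/* The pattern an extension repeats per lookup: allocate, lowercase,
 * hash, free. Returns 0 if the allocation fails. */
static uint64_t sketch_lookup_key(const char *name) {
    size_t len = strlen(name);
    char *lc = sketch_lowercase_dup(name, len);
    if (lc == NULL) {
        return 0;
    }
    uint64_t key = sketch_hash(lc, len);
    free(lc);
    return key;
}
```

Every call pays for an allocation, a pass to lowercase, and a pass to hash. A lowercased, already hashed name exposed by the engine would let an extension skip all three.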
In my ideal world, we'd actually switch all symbols to be case
sensitive. However, that won't be happening for PHP 8 due to BC.

So, instead, I propose adding an .lcname member (or some other name
indicating it's been normalized to the preferred PHP case) to at least
zend_op_array and zend_class_entry, but preferably for internal
functions too. Note that many internal functions will already be
lowercase, so the data can be shared.

I could make this change in the main engine, but I strongly suspect it
will not play correctly with opcache.
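To make the "data can be shared" point concrete, here is a minimal standalone sketch of an entry that carries both the declared name and a normalized name. The `sketch_symbol` struct and its helpers are hypothetical and much simpler than zend_op_array or zend_class_entry; plain malloc'd strings stand in for the engine's interned strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a function or class entry. */
typedef struct sketch_symbol {
    char *name;   /* name as declared in source */
    char *lcname; /* normalized (lowercased) name; may alias name */
} sketch_symbol;

/* True if s contains no ASCII uppercase letters. */
static bool sketch_is_lower(const char *s) {
    for (; *s != '\0'; s++) {
        if (*s >= 'A' && *s <= 'Z') {
            return false;
        }
    }
    return true;
}

/* Fill in sym from the declared name. When the declared name is already
 * lowercase, lcname points at the same buffer instead of a second copy. */
static bool sketch_symbol_init(sketch_symbol *sym, const char *declared) {
    assert(sym != NULL && declared != NULL);
    size_t len = strlen(declared);
    sym->name = malloc(len + 1);
    if (sym->name == NULL) {
        return false;
    }
    memcpy(sym->name, declared, len + 1);
    if (sketch_is_lower(sym->name)) {
        sym->lcname = sym->name; /* shared: no extra string memory */
        return true;
    }
    sym->lcname = malloc(len + 1);
    if (sym->lcname == NULL) {
        free(sym->name);
        sym->name = NULL;
        return false;
    }
    for (size_t i = 0; i <= len; i++) { /* i == len copies the terminator */
        char c = declared[i];
        sym->lcname[i] = (c >= 'A' && c <= 'Z') ? (char)(c + ('a' - 'A')) : c;
    }
    return true;
}

/* Release the strings, freeing lcname only when it is a separate buffer. */
static void sketch_symbol_destroy(sketch_symbol *sym) {
    if (sym->lcname != sym->name) {
        free(sym->lcname);
    }
    free(sym->name);
    sym->name = NULL;
    sym->lcname = NULL;
}
```

In this sketch an entry for "strlen" costs one extra pointer and no extra string data, while "Memcached" needs a second buffer. With interned strings, that second buffer would already exist wherever the engine has interned the lowercased name.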
I just realized I didn't ask any specific questions. Oops:
- Can anyone think of issues except increased memory due to
increasing the size of the struct? Since the strings were previously
interned, I don't think the strings themselves will have much effect
on memory usage (but we can measure this).
- Anyone else who thinks this would be useful?
Hi Levi Morrison,
I don't have a personal use case for this, and no common operations come to mind, but I could be persuaded.
The lack of examples might be why there's been no response.
I assume the overhead is probably negligible for classes, and larger for functions.
What fields of zend_op_array did you mean?
What parts of the engine or extensions would use the lowercase string?
I see a few places it's used for compilation in php-src itself but nothing that seems performance critical.
What are examples of functionality/functions of extensions that you expect would see a performance improvement?
Why would they need to convert the strings to lowercase rather than use the casing of the declaration
(e.g. using "memcached" instead of "Memcached" in "class Memcached{...}")?
E.g. if something had already looked up the zend_class_entry *ce
for a class name, then ce->name
would be a string (and a unique pointer) such as "Memcached" that uniquely identifies that class's name
(unless the code is unexpectedly redeployed later with different casing).
- Class names are still case sensitive in some ways, e.g. the Composer autoloader is case sensitive
- Changing case should be rare
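The pointer-identity idea above can be sketched with a toy intern pool. This is only an illustration under assumptions: `sketch_pool`, `sketch_intern`, and `sketch_same_name` are hypothetical and much simpler than the engine's interned-string table, which this sketch does not try to reproduce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_POOL_CAP 64

/* Hypothetical fixed-size intern pool with linear search. */
typedef struct sketch_pool {
    char *items[SKETCH_POOL_CAP];
    size_t count;
} sketch_pool;

/* Return the unique stored copy of s, adding it on first use.
 * Returns NULL if the pool is full or allocation fails. */
static const char *sketch_intern(sketch_pool *pool, const char *s) {
    assert(pool != NULL && s != NULL);
    for (size_t i = 0; i < pool->count; i++) {
        if (strcmp(pool->items[i], s) == 0) {
            return pool->items[i];
        }
    }
    if (pool->count == SKETCH_POOL_CAP) {
        return NULL;
    }
    size_t len = strlen(s);
    char *copy = malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, s, len + 1);
    pool->items[pool->count++] = copy;
    return copy;
}

/* Identity check on interned names: one pointer comparison,
 * with no lowercasing and no hashing. */
static bool sketch_same_name(const char *a, const char *b) {
    return a == b;
}

/* Free every stored string and reset the pool. */
static void sketch_pool_free(sketch_pool *pool) {
    for (size_t i = 0; i < pool->count; i++) {
        free(pool->items[i]);
    }
    pool->count = 0;
}
```

Two lookups of "Memcached" yield the same pointer, so comparing them is a single pointer comparison. "memcached" interns to a different pointer, which is the caveat about code that spells the name with different casing.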
Regards,
Tyson