Hello everyone,
I was wondering whether it would make sense to store small strings (length <= 7)
directly inside the zval struct. That would avoid allocating a separate
zend_string, and with it the costly pointer indirection and refcounting for
such strings.
The idea would be to add a new struct, struct { uint8_t len; char val[7]; } sval,
to the _zend_value union in order to embed it directly into the zval struct,
and to use a type flag (zval.u1.v.type_flags) such as IS_SMALL_STRING to
distinguish between a regular heap-allocated zend_string and the directly
embedded compact representation.
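To make the layout concrete, here is a minimal sketch of how such an embedded
representation might look. All names (small_str, sketch_value, sketch_zval,
SKETCH_IS_SMALL_STRING, sketch_set_small) are hypothetical, simplified
stand-ins and not the real definitions from zend_types.h; the flag value is
an arbitrary assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical compact string: 1 length byte + up to 7 payload bytes. */
typedef struct { uint8_t len; char val[7]; } small_str;

/* The whole point: it must fit into the 8 bytes of the value union. */
static_assert(sizeof(small_str) == 8, "small_str must be exactly 8 bytes");

/* Simplified stand-in for the _zend_value union (assumption). */
typedef union {
    int64_t   lval;
    double    dval;
    void     *ptr;   /* would point to a heap-allocated zend_string */
    small_str sval;  /* proposed embedded representation */
} sketch_value;

#define SKETCH_IS_STRING       1  /* hypothetical type tag */
#define SKETCH_IS_SMALL_STRING 2  /* hypothetical type flag */

/* Simplified stand-in for the zval struct (assumption). */
typedef struct {
    sketch_value value;
    uint8_t      type;
    uint8_t      type_flags;
} sketch_zval;

/* Stores s inline if it fits. Returns 1 on success, 0 if the string is
 * too long and would need a regular heap-allocated string instead. */
static int sketch_set_small(sketch_zval *z, const char *s, size_t len)
{
    if (len > 7) {
        return 0;
    }
    /* Zero all 8 bytes first so that unused payload bytes are defined. */
    memset(&z->value, 0, sizeof z->value);
    z->value.sval.len = (uint8_t)len;
    memcpy(z->value.sval.val, s, len);
    z->type = SKETCH_IS_STRING;
    z->type_flags = SKETCH_IS_SMALL_STRING;
    return 1;
}
```

Note that the payload is not NUL-terminated when len is 7, so any code that
hands the bytes to C string functions would have to copy them out first.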
Small strings are quite common IMHO. In fact, after quickly sampling my
company's PHP code base, I found well over 50% of the strings to be of
length <= 7. Embedding them would save a lot of memory allocations as well as
pointer indirections, and could also bypass the refcounting logic. Also,
comparing small strings for equality would become a trivial operation (just
comparing two pre-aligned 64-bit integers), so there would be no more need to
keep small strings interned.
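The equality check could be sketched roughly as follows. This assumes that
unused payload bytes are always zeroed when a small string is created,
otherwise two equal strings could differ in their padding. The helper names
(small_str_init, small_str_bits, small_str_equals) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical compact string, same layout as proposed above. */
typedef struct { uint8_t len; char val[7]; } small_str;

static_assert(sizeof(small_str) == sizeof(uint64_t),
              "small_str must be comparable as one 64-bit word");

/* Initializes *out from s. Returns 1 on success, 0 if len > 7. */
static int small_str_init(small_str *out, const char *s, size_t len)
{
    if (len > 7) {
        return 0;
    }
    memset(out, 0, sizeof *out);  /* zero padding: required for equality */
    out->len = (uint8_t)len;
    memcpy(out->val, s, len);
    return 1;
}

/* Reads the 8 bytes as one integer. memcpy avoids aliasing problems and
 * compilers typically turn it into a single load. */
static uint64_t small_str_bits(const small_str *s)
{
    uint64_t bits;
    memcpy(&bits, s, sizeof bits);
    return bits;
}

/* Equality is one integer comparison; the length byte is part of the
 * word, so strings of different length never compare equal. */
static int small_str_equals(const small_str *a, const small_str *b)
{
    return small_str_bits(a) == small_str_bits(b);
}
```

Because the length is stored alongside the bytes, the comparison stays
binary-safe: "foo" (length 3) and "foo\0" (length 4) are distinct values.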
Of course, it would no longer be possible to persistently store the hash
value of a small string. However, calculating the hash value of a small
string is less costly anyway, because fewer characters mean fewer iterations,
so that might not be an issue in practice.
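As a rough illustration of the bounded cost, here is a sketch of hashing a
small string on demand with a simple "times 33" hash in the DJB style. Which
hash function would actually be used for small strings is an assumption
here, and small_str_hash is a hypothetical helper; the point is only that
the loop runs at most 7 times.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical compact string, same layout as proposed above. */
typedef struct { uint8_t len; char val[7]; } small_str;

/* Recomputes the hash on every call, since there is no room left to
 * cache it. The loop body executes at most 7 times. */
static uint64_t small_str_hash(const small_str *s)
{
    uint64_t hash = 5381;  /* conventional DJB starting value */
    for (uint8_t i = 0; i < s->len; i++) {
        hash = hash * 33 + (unsigned char)s->val[i];
    }
    return hash;
}
```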
I don't see such an idea in https://wiki.php.net/php-7.1-ideas and I was
wondering: Has anybody experimented with that approach yet? Is it worth
discussing?
Please let me know your thoughts,
Ben
--
Benjamin Coutu
ben.coutu@zeyos.com
ZeyOS, Inc.
http://www.zeyos.com/
Hi,
Such a "small strings" implementation was done in PICK Systems' PICK
Basic, which was popular in the early 1980s.
( https://en.wikipedia.org/wiki/Pick_operating_system )
I found many features in PHP that were also present in Pick Basic, which
made me adopt PHP with great enthusiasm: it is a RAD language, strings and
integers convert implicitly without errors or notices, there is no need to
take care of any kind of allocation, etc.
I wrote a specialized high-performance version of the Pick Basic
compiler/bytecode interpreter for a former French computer company that had
its own Pick system fork.
Small strings, besides the speed improvement, were very useful given the very
small memory sizes available at that time, and the cost of memory.
Perhaps PHP could also profit from that?
Pascal KISSIAN