Newsgroups: php.internals
Date: Wed, 14 Sep 2016 14:37:38 +0200
Message-ID: <009201d20e84$c717bfa0$55473ee0$@lool.fr>
Subject: Re: Directly embed small strings in zvals
From: php-mailing-list@lool.fr ("Pascal KISSIAN")

Hello everyone,

I was wondering if it would make sense to store small strings (length <= 7) directly inside the zval struct, thereby avoiding the need to allocate a separate zend_string, and also avoiding any costly indirection and refcounting for such strings.
The idea would be to add a new struct ``struct { uint8_t len; char val[7]; } sval`` to the _zend_value union type in order to embed it directly into the zval struct, and to use a type flag (zval.u1.v.type_flags) such as IS_SMALL_STRING to distinguish between a regular heap-allocated zend_string and the directly embedded compact representation.

Small strings are quite common IMHO. In fact, quickly sampling my company's PHP code base, I found well over 50% of the strings to be of length <= 7. It would save a lot of memory allocations as well as pointer indirection, and could also bypass refcounting logic. Also, comparing small strings for equality would become a trivial operation (just comparing two pre-aligned 64-bit integers), so there would be no more need to keep small strings interned.

Of course it would no longer be possible to persistently store the hash value of a small string, though calculating the hash value for small strings is less costly anyway because fewer characters means fewer iterations, so that might not be an issue in practice.

I don't see such an idea in https://wiki.php.net/php-7.1-ideas and I was wondering: Has anybody experimented with that approach yet? Is it worth discussing?

Please let me know your thoughts,

Ben

--
Benjamin Coutu
ben.coutu@zeyos.com
ZeyOS, Inc.
http://www.zeyos.com

Hi,

Such a "small strings" implementation was done in PICK Systems' PICK Basic, which was popular in the early 1980s (https://en.wikipedia.org/wiki/Pick_operating_system).

I found many features in PHP that were present in Pick Basic, and they made me adopt PHP with great enthusiasm: a RAD language, implicit string and integer conversion without errors or notices, no need to take care of any kind of allocation, etc.

I wrote a specific high-performance version of the Pick Basic compiler/bytecode interpreter for a former French computer company that had its own Pick system fork.
Besides the speed improvement, small strings were very useful given the very small amount of memory available at that time, and its cost. Perhaps PHP could also profit from that?

Pascal KISSIAN
http://pascal.kissian.net
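
For readers who want to see the proposal from the quoted message in concrete terms, here is a minimal standalone sketch in C. It is not PHP's actual zval layout or API. The names demo_value, demo_zval, DEMO_IS_STRING, DEMO_IS_SMALL_STRING, demo_set_small and demo_small_equals are all hypothetical, and the void pointer merely stands in for a zend_string pointer. It only illustrates the two claims made above: a string of up to 7 bytes fits in the 8-byte value slot, and two such strings can be compared with a single 64-bit integer comparison, provided the unused bytes are zeroed first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type tags; these are not real PHP constants. */
#define DEMO_IS_STRING       1  /* payload points to a heap-allocated string */
#define DEMO_IS_SMALL_STRING 2  /* payload holds the string bytes directly   */
#define DEMO_SMALL_MAX       7  /* longest string that fits in the payload   */

/* 8-byte payload, loosely modelled on the union described in the message. */
typedef union {
    uint64_t raw;                                /* payload viewed as one integer    */
    void    *ptr;                                /* stand-in for a zend_string ptr   */
    struct { uint8_t len; char val[7]; } sval;   /* embedded small string            */
} demo_value;

typedef struct {
    demo_value value;
    uint8_t    type;   /* one of the DEMO_IS_* tags */
} demo_zval;

/* Store s (len bytes) inline. Returns 1 on success, 0 if it does not fit. */
static int demo_set_small(demo_zval *z, const char *s, size_t len)
{
    if (len > DEMO_SMALL_MAX) {
        return 0;  /* caller would fall back to a heap-allocated string */
    }
    z->value.raw = 0;                  /* zero the padding so equal strings have equal bits */
    z->value.sval.len = (uint8_t)len;
    memcpy(z->value.sval.val, s, len);
    z->type = DEMO_IS_SMALL_STRING;
    return 1;
}

/* Equality of two small strings: a single 64-bit integer comparison. */
static int demo_small_equals(const demo_zval *a, const demo_zval *b)
{
    return a->type == DEMO_IS_SMALL_STRING
        && b->type == DEMO_IS_SMALL_STRING
        && a->value.raw == b->value.raw;
}
```

The length byte is part of the compared integer, so "ab" and "ab\0" (length 3) stay distinct. Zeroing the payload before writing is what makes the integer comparison valid. Without it, leftover bytes past the end of the string could make equal strings compare as different.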