Newsgroups: php.internals
Date: Wed, 19 Oct 2016 12:45:00 +0200
To: Xinchen Hui, Dmitry Stogov, Nikita Popov
Cc: PHP Internals
Message-ID: <20161019104501.453BD5FAB8@mx.zeyos.com>
Subject: [PHP-DEV] Exploit fully packed array/hash property
From: ben.coutu@zeyos.com (Benjamin Coutu)

Hello everyone,

I've identified a few more array/hash use cases where it might make sense to introduce special short-circuit logic for packed arrays.

Specifically, there is an additional property of certain packed arrays (apart from being packed, obviously) that we can utilize: a packed array with nNumUsed equal to nNumOfElements must be consecutively indexed from zero onward without any gaps (no IS_UNDEF). Let's call this special pure form of a packed array a "fully packed array".

Earlier I posted on how this convenient property could speed things up for array_slice, even if preserve_keys=true (for the offset=0 case), see https://marc.info/?l=php-internals&m=147048569215717&w=2

I therefore propose to introduce the following two new macros in /Zend/zend_types.h in order to make it easier for developers to introduce special packed (or fully packed) array logic:

    #define HT_IS_PACKED(ht) ((ht)->u.flags & HASH_FLAG_PACKED)
    #define HT_IS_FULLY_PACKED(ht) (HT_IS_PACKED(ht) && (ht)->nNumUsed == (ht)->nNumOfElements)

With this we can easily speed things up (and, more importantly, preserve the packed array characteristics) for the common case of array_slice($packed_array, $offset=0, $length, $preserve_keys=true) by using the following snippet for https://github.com/php/php-src/blob/master/ext/standard/array.c#L2906 :

    if (preserve_keys ? (offset == 0 && HT_IS_FULLY_PACKED(Z_ARRVAL_P(input))) : HT_IS_PACKED(Z_ARRVAL_P(input))) ... ZEND_HASH_FILL_PACKED ...
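
To make the property concrete, here is a minimal standalone sketch of the idea. It does not use the real Zend headers: demo_ht, DEMO_FLAG_PACKED and the demo_* helpers are hypothetical stand-ins that only model the three fields the argument depends on, and the flag value is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the Zend HashTable. Only the
 * fields relevant to the argument are modeled; the flag value is arbitrary. */
#define DEMO_FLAG_PACKED (1u << 2)

typedef struct {
    uint32_t flags;          /* models ht->u.flags */
    uint32_t nNumUsed;       /* slots in use, including IS_UNDEF holes */
    uint32_t nNumOfElements; /* live elements only */
} demo_ht;

#define DEMO_IS_PACKED(ht) (((ht)->flags & DEMO_FLAG_PACKED) != 0)
#define DEMO_IS_FULLY_PACKED(ht) \
    (DEMO_IS_PACKED(ht) && (ht)->nNumUsed == (ht)->nNumOfElements)

/* Models deleting an element that is not the last one: the slot stays
 * counted in nNumUsed (it becomes a hole), the element count drops. */
void demo_delete_middle(demo_ht *ht)
{
    assert(ht != NULL && ht->nNumOfElements > 0);
    ht->nNumOfElements--;
}

/* Mirrors the condition proposed for array_slice: with preserve_keys the
 * result can only stay packed if we slice a gap-free array from offset 0;
 * without preserve_keys any packed input qualifies. */
bool demo_slice_can_stay_packed(const demo_ht *ht, long offset, bool preserve_keys)
{
    assert(ht != NULL);
    return preserve_keys
        ? (offset == 0 && DEMO_IS_FULLY_PACKED(ht))
        : DEMO_IS_PACKED(ht);
}
```

In this model a packed table with nNumUsed = 3 and nNumOfElements = 3 is fully packed. After demo_delete_middle() the counts are 3 and 2, so it is still packed but no longer fully packed, and the preserve_keys fast path no longer applies.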

Another example where this could come in handy involves the encoding of JSON. For every array that it encodes, json_encode must first determine whether the array contains string keys or is not consecutively indexed, so that it can decide whether to use a JSON object or a plain JSON array. That is accomplished through php_json_determine_array_type in /ext/json/json_encoder.c. That code currently has to iterate through the array until it finds a string key or a non-consecutive index (which in the worst case means iterating through the entire array).

Now, for fully packed arrays we can short-circuit things and jump directly to the conclusion that it is a real plain old array, if it is packed and we can prove it has no gaps. That can of course easily be done via the new HT_IS_FULLY_PACKED macro. We can improve this code for a large number of arrays with the following simple snippet for https://github.com/php/php-src/blob/master/ext/json/json_encoder.c#L38 :

    if (HT_IS_FULLY_PACKED(myht)) return PHP_JSON_OUTPUT_ARRAY;

I'll be happy to make an effort to screen the entire code base in search of more cases where this could be useful once this is picked up by a lead core developer (maybe Xinchen?) who is willing to commit something along the lines of the above.

Any thoughts?

Benjamin

-- 
Benjamin Coutu
ben.coutu@zeyos.com

ZeyOS, Inc.
http://www.zeyos.com
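
P.S. For illustration, the json_encode short circuit described above can be sketched as a small standalone program. This is only a model under stated assumptions, not the actual json_encoder.c code: demo_array, demo_key and demo_determine_type are hypothetical names, holes are not represented as buckets, and the keys_visited counter exists only to show how much work the scan does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum demo_json_type { DEMO_JSON_ARRAY, DEMO_JSON_OBJECT };

/* Hypothetical simplified key: an integer index, or a string key when
 * is_string is set (the string itself is irrelevant here). */
typedef struct {
    bool is_string;
    uint64_t h;
} demo_key;

/* Hypothetical simplified array: only live keys are stored, in order. */
typedef struct {
    bool packed;
    uint32_t nNumUsed;
    uint32_t nNumOfElements;
    const demo_key *keys; /* nNumOfElements entries */
} demo_array;

enum demo_json_type demo_determine_type(const demo_array *arr, size_t *keys_visited)
{
    assert(arr != NULL && keys_visited != NULL);
    *keys_visited = 0;

    /* Proposed short circuit: packed and gap-free means keys 0..n-1. */
    if (arr->packed && arr->nNumUsed == arr->nNumOfElements) {
        return DEMO_JSON_ARRAY;
    }

    /* Fallback: linear scan until a string key or an out-of-sequence
     * index is found. */
    uint64_t expected = 0;
    for (uint32_t i = 0; i < arr->nNumOfElements; i++) {
        (*keys_visited)++;
        if (arr->keys[i].is_string || arr->keys[i].h != expected) {
            return DEMO_JSON_OBJECT;
        }
        expected++;
    }
    return DEMO_JSON_ARRAY;
}
```

In this model a fully packed array is classified without visiting a single key, while a non-packed array with keys 0 and 1 still needs the full scan to reach the same answer.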