Newsgroups: php.internals
Xref: news.php.net php.internals:87397
Date: Thu, 30 Jul 2015 21:00:23 +0800
Subject: Re: [PHP-DEV] Introduction and some opcache SSE related stuff
From: laruence@php.net (Xinchen Hui)
To: Joe Watkins
Cc: "Andone, Bogdan", "internals@lists.php.net", Dmitry Stogov

Hey:

On Thu, Jul 30, 2015 at 8:24 PM, Joe Watkins wrote:
> Hi Andone,
>
> I'm not sure why nobody has replied to you yet, we've all looked at the
> PR and spent a lot of the day yesterday discussing it.
>
> I've CC'd Dmitry, he doesn't always read internals, so this should get
> his attention.

Sorry for the late response. Dmitry is on vacation now, so he probably
won't be able to reply soon.

Anyway, is the performance improvement seen consistently? Have you tested
it with a profiling tool? Is IR reduced, or are cache misses reduced?

Thanks

>
> Lastly, very cool ... I look forward to some more cleverness ...
>
> Cheers
> Joe
>
> On Wed, Jul 29, 2015 at 3:22 PM, Andone, Bogdan wrote:
>
>> Hi Guys,
>>
>> My name is Bogdan Andone and I work for Intel in the area of SW
>> performance analysis and optimizations.
>> We would like to actively contribute to Zend PHP project and to involve
>> ourselves in finding new performance improvement opportunities based on
>> available and/or new hardware features.
>> I am still in the source code digesting phase but I had a look at the
>> fast_memcpy() implementation in opcache extension which uses SSE intrinsics:
>>
>> If I am not wrong fast_memcpy() function is not currently used, as I
>> didn't find the "-msse4.2" gcc flag in the Makefile. I assume you probably
>> didn't see any performance benefit so you preserved generic memcpy() usage.
>>
>> I would like to propose a slightly different implementation which uses
>> _mm_store_si128() instead of _mm_stream_si128(). This ensures that copied
>> memory is preserved in data cache, which is not bad as the interpreter will
>> start to use this data without the need to go back one more time to memory.
>> _mm_stream_si128() in the current implementation is intended to be used for
>> stores where we want to avoid reading data into the cache and the cache
>> pollution; in the opcache scenario it seems that preserving the data in cache
>> has a positive impact.
>>
>> Running php-cgi -T10000 on WordPress4.1/index.php I see ~1% performance
>> increase for the new version of fast_memcpy() compared with the generic
>> memcpy(). Same result using a full load test with http_load on a Haswell EP
>> 18 cores.
>>
>> Here is the proposed pull request:
>> https://github.com/php/php-src/pull/1446
>>
>> Related to the SW prefetching instructions in fast_memcpy()... they are
>> not really useful in this place. Their benefit is almost negligible as the
>> address requested for prefetch will be needed at the next iteration (few

Then maybe we don't need this in fast_memcpy? I mean, it may be used
widely if it is proven to be faster, which would be outside this context.

Thanks

>> cycles later), while the time needed to get data from RAM is >100 cycles
>> usually. Nevertheless... they don't hurt and it seems they still have a
>> very small benefit so I preserved the original instruction and I added a
>> new prefetch request for the destination pointer.
>>
>> Hope it helps,
>> Bogdan
>>

-- 
Xinchen Hui
@Laruence
http://www.laruence.com/
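
For readers following the thread, the difference between a cached store and a streaming store described in the quoted proposal can be illustrated with a small C sketch. This is not the code from the pull request or from opcache: copy_cached_sse2() is a hypothetical name, and the requirements that both pointers are 16-byte aligned and that the size is a multiple of 64 are simplifying assumptions. It uses only SSE2 intrinsics, which are available by default on x86-64 compilers.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_load_si128, _mm_store_si128 */
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 */
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the php-src implementation).
 * Copies `size` bytes in 64-byte blocks with regular (cached) stores.
 * Assumptions: dest and src are 16-byte aligned, size is a multiple of 64.
 */
void copy_cached_sse2(void *dest, const void *src, size_t size)
{
    __m128i *d = (__m128i *)dest;
    const __m128i *s = (const __m128i *)src;
    const __m128i *end = (const __m128i *)((const char *)src + size);

    assert(((uintptr_t)dest & 15) == 0);
    assert(((uintptr_t)src & 15) == 0);
    assert((size & 63) == 0);

    for (; s < end; s += 4, d += 4) {
        /* Prefetch hints for the next 64-byte block of source and
         * destination. They are only hints and never fault. */
        _mm_prefetch((const char *)(s + 4), _MM_HINT_T0);
        _mm_prefetch((const char *)(d + 4), _MM_HINT_T0);

        __m128i x0 = _mm_load_si128(s + 0);
        __m128i x1 = _mm_load_si128(s + 1);
        __m128i x2 = _mm_load_si128(s + 2);
        __m128i x3 = _mm_load_si128(s + 3);

        /* _mm_store_si128 writes through the cache hierarchy, so the
         * copied data is likely still cached when it is read next.
         * _mm_stream_si128 would instead issue non-temporal stores
         * that try to bypass the cache. */
        _mm_store_si128(d + 0, x0);
        _mm_store_si128(d + 1, x1);
        _mm_store_si128(d + 2, x2);
        _mm_store_si128(d + 3, x3);
    }
}
```

Replacing each _mm_store_si128() call with _mm_stream_si128() gives the streaming variant for comparison; streaming stores are weakly ordered, so such a loop is usually followed by _mm_sfence(). Which variant is faster depends on whether the copied data is read again soon, which is exactly what the profiling questions in this thread are about.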