From: ajf@ajf.me (Andrea Faulds)
To: internals@lists.php.net
Newsgroups: php.internals
Date: Sun, 8 Oct 2017 04:47:00 +0100
Message-ID: <1B.6B.16800.8BF99D95@pb1.pair.com>
Subject: Parallelised run-tests.php (patch)

Hi there,

Do you spend HOURS every day AGONISING over how long run-tests.php takes to complete? Have you long since ABANDONED every test directory besides Zend/tests? Does it feel like your eight CPU cores are pointlessly WASTED by single-threaded PHP test execution? Are you SADDENED every day at how high-quality the run-tests.php code is, DREAMING of an even worse mess? Then I've got just the trick for you!

…*ahem*. Okay, enough terrible salesmanship.

I felt like parallelising run-tests.php, so I did it. If you give it the flag -jX, it'll spawn X worker processes and throw batches of tests at them, and those worker processes will send back the test results for the parent process to display and collate.

To avoid potential problems with tests that step on each other's toes (e.g.
access the same file or port or whatever), it gives each worker its own directory of tests, and the directories are then executed in parallel. Obviously, this means things aren't as parallel as they could be for very large test directories (looking at you, Zend/tests/…), though that's a solvable problem. :)

Per a suggestion by Rasmus in an earlier thread, the parent hands out large directories to children first, to reduce the time spent at the end of the run waiting on children occupied with huge directories.

And what a difference it makes. On my Ubuntu 16.04 VirtualBox VM running inside Windows 10 Pro x64 on my Ryzen 1700 8-core 16-thread PC, here's vanilla run-tests.php against master on a --disable-all build:

=====================================================================
TEST RESULT SUMMARY
---------------------------------------------------------------------
Exts skipped    :   68
Exts tested     :    7
---------------------------------------------------------------------

Number of tests : 15613              8481
Tests skipped   :  7132 ( 45.7%) --------
Tests warned    :     0 (  0.0%) (  0.0%)
Tests failed    :    30 (  0.2%) (  0.4%)
Expected fail   :    31 (  0.2%) (  0.4%)
Tests passed    :  8420 ( 53.9%) ( 99.3%)
---------------------------------------------------------------------
Time taken      :   317 seconds
=====================================================================

And here's run-tests.php -j16:

=====================================================================
TEST RESULT SUMMARY
---------------------------------------------------------------------
Exts skipped    :   68
Exts tested     :    7
---------------------------------------------------------------------

Number of tests : 15613              8481
Tests skipped   :  7132 ( 45.7%) --------
Tests warned    :     0 (  0.0%) (  0.0%)
Tests failed    :    35 (  0.2%) (  0.4%)
Expected fail   :    31 (  0.2%) (  0.4%)
Tests passed    :  8415 ( 53.9%) ( 99.2%)
---------------------------------------------------------------------
Time taken      :    80 seconds
=====================================================================

That's
about a quarter of the single-threaded time (80 seconds versus 317). It could get even better still if there were more test directories to work with; currently the number of live workers declines as the test run goes on, because all the small directories get finished quickly and you're left waiting on the Zend/tests-like behemoths. One idea might be some sort of special file you could add to a directory to tell run-tests.php that it is safe to parallelise, in which case it would break the directory into chunks at runtime for you. Though I don't think we should have huge numbers of files in a single directory anyway, if possible.

Okay, but what's the catch?

Well, the code isn't the most… elegant thing. I did make run-tests.php faster, but I didn't refactor it to be less of a global-variables-dependent mess than it is at present, except for moving all the top-level code into a single function for the sake of my own sanity. Therefore, the worker child processes each get their own copy of the global state to initialise them. It's a bit hacky, but this is basically what UNIX fork() does, so maybe I shouldn't feel so bad. ;)

On the other hand, the remainder of the design is quite elegant! It's all clean message-passing over STDIN and STDOUT.

This isn't fully tested yet. I haven't actually investigated those 5 additional test failures in the -j16 run, and my implementation probably breaks some run-tests.php features right now. I also haven't bothered to try this on a non --disable-all build yet. That said, I'm impressed by how well it actually works right now and how effectively it tolerates errors. The child processes will even happily kill themselves if the parent dies.

The code is here: https://github.com/php/php-src/pull/2822

Happy Halloween.

--
Andrea Faulds
https://ajf.me/
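
P.S. To illustrate why handing out the large directories first helps, here is a minimal sketch in Python. It is not the code from the patch (which is PHP inside run-tests.php); the function name `simulate`, the directory names and the per-directory costs are all made up for the example. It assumes each directory goes as a whole to whichever worker frees up first, and compares the total wall-clock time for two hand-out orders.

```python
import heapq

def simulate(dir_costs, workers, largest_first=True):
    """Return the total wall-clock time when whole directories are handed,
    one at a time, to whichever worker becomes free first.

    dir_costs: list of (directory, cost) pairs; the costs are hypothetical
    run times, not measurements from a real test run.
    """
    queue = list(dir_costs)
    if largest_first:
        # Hand out the most expensive directories first.
        queue.sort(key=lambda pair: pair[1], reverse=True)

    # Min-heap of the times at which each worker next becomes free.
    free_at = [0] * workers
    heapq.heapify(free_at)

    for _name, cost in queue:
        start = heapq.heappop(free_at)         # earliest free worker
        heapq.heappush(free_at, start + cost)  # busy until start + cost

    # The run ends when the last worker finishes.
    return max(free_at)

# Hypothetical costs: three small directories and one behemoth.
costs = [("ext/a", 2), ("ext/b", 3), ("ext/c", 2), ("Zend/tests", 10)]

in_listed_order = simulate(costs, workers=2, largest_first=False)
biggest_first = simulate(costs, workers=2, largest_first=True)
print(in_listed_order, biggest_first)
```

With these made-up numbers and two workers, the behemoth started last finishes at 13, whereas started first it overlaps with all the small directories and the run ends at 10. The same effect is why the tail of a real run is dominated by whichever huge directory was picked up last.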