Newsgroups: php.internals,php.qa
Path: news.php.net
Xref: news.php.net php.internals:63507 php.qa:66760
Mailing-List: contact internals-help@lists.php.net; run by ezmlm
Received-SPF: pass (pb1.pair.com: domain gmail.com designates 209.85.214.42 as permitted sender)
Message-ID: <507ED1BE.8040903@gmail.com>
Date: Wed, 17 Oct 2012 16:41:50 +0100
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: Nuno Lopes <nlopess@php.net>
CC: internals@lists.php.net, PHP QA <php-qa@lists.php.net>
References: <4FB4E844.2070305@gmail.com> <9AF025709ED6452F9E5A91834F82F788@PC07655> <4FB9EE1E.3050701@gmail.com> <BCC3B4980516429D9BF61D910B09548D@PC07655> <506E9442.6050501@gmail.com> <038CBF95DEC5479DA771E23528A33FB3@pc07654> <507C7DE0.5030508@gmail.com>
In-Reply-To: <507C7DE0.5030508@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [PHP-DEV] Re: [PHP-QA] Parallel run-tests
From: aparachic@gmail.com (zoe slattery)

Nuno - just a PS to the last note. It is (mainly) the task allocation 
across processors which means that running tests in parallel on a 4-way 
machine is not 4 times as fast as running them in sequence.

Here are some results from a run on my 2-way Mac -
http://static.inky.ws/image/3257/image.jpg. The blocks of colour are
just representations of the time it takes a group to run - and
OpenOffice is allocating the colours randomly so they don't have any
significance. I've annotated the chart to show which groups are taking a
long time.

The net is that P0 runs its half of the tasks and then just hangs about 
waiting for P1 to finish :-/.
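
To make that concrete, here is a rough sketch in Python (not the
run-tests code itself). The group names are real test directories but
the timings are made up for illustration, and idle_report is a
hypothetical helper. It just shows that with a fixed allocation the wall
time is set by the busiest processor, and everyone else sits idle for
the difference.

```python
# Hypothetical per-group run times in seconds; NOT measured values.
group_seconds = {"array": 60, "file": 55, "strings": 20, "math": 10, "date": 15}


def idle_report(allocation, group_seconds):
    """Return (wall_time, {processor: idle_seconds}) for a fixed allocation."""
    busy = {
        proc: sum(group_seconds[g] for g in groups)
        for proc, groups in allocation.items()
    }
    # The run only ends when the busiest processor has finished.
    wall_time = max(busy.values())
    idle = {proc: wall_time - t for proc, t in busy.items()}
    return wall_time, idle


# Unlucky split: P1 happens to get both slow groups, P0 gets the rest.
allocation = {"P0": ["strings", "math", "date"], "P1": ["array", "file"]}
wall_time, idle = idle_report(allocation, group_seconds)
sequential = sum(group_seconds.values())

print("wall time:", wall_time, "sequential:", sequential, "idle:", idle)
```

With these invented numbers the two-processor run takes 115 seconds
against 160 in sequence, and P0 is idle for 70 of them.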

I have added a 'debug' flag to the code which will print information
about how tasks are allocated, if anyone wants to try it on a 4- or
8-way machine. Given that there are not many groups that take a long
time to run, the simplest thing seems to be to map these to specific
processors - that's easy enough and requires no difficult code. There
are more elegant solutions of course.
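
A minimal sketch of the two options, again in Python and again with
made-up timings. The 'pinned' map, the function names and the
round-robin rule for the remaining groups are all assumptions for
illustration, not what the scheduler currently does. The second
function simulates the "pop the next group from a shared list"
approach for comparison.

```python
import heapq

# Hypothetical per-group run times in seconds; NOT measured values.
group_seconds = {"array": 60, "file": 55, "strings": 20, "math": 10, "date": 15}

# Hypothetical allocation map: keep the known slow groups apart.
pinned = {"array": 0, "file": 1}


def allocate_with_pins(groups, pinned, n_procs):
    """Pinned groups go where the map says; the rest are dealt round-robin."""
    plan = [[] for _ in range(n_procs)]
    for group, proc in pinned.items():
        plan[proc].append(group)
    rest = [g for g in groups if g not in pinned]
    for i, group in enumerate(rest):
        plan[i % n_procs].append(group)
    return plan


def plan_wall_time(plan, group_seconds):
    """Wall time of a fixed plan is the load of the busiest processor."""
    return max(sum(group_seconds[g] for g in groups) for groups in plan)


def queue_wall_time(groups, group_seconds, n_procs):
    """Simulate processors taking the next group from a shared list when free."""
    free_at = [0] * n_procs  # min-heap of times at which each processor is free
    heapq.heapify(free_at)
    for group in groups:
        start = heapq.heappop(free_at)  # earliest free processor takes it
        heapq.heappush(free_at, start + group_seconds[group])
    return max(free_at)


groups = list(group_seconds)
plan = allocate_with_pins(groups, pinned, 2)
pinned_time = plan_wall_time(plan, group_seconds)
queue_time = queue_wall_time(groups, group_seconds, 2)
print(plan, pinned_time, queue_time)
```

With these numbers the pinned plan finishes in 95 seconds and the
shared list in 85, so the map is an improvement without being optimal.
The map has the advantage that it needs no communication between the
processes once the run has started.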

Zoe
> Hi Nuno
>
>> Hi,
>>
>> Here you have a dump of a run of PHP_HEAD in the gcov machine (almost 
>> 13k tests) without valgrind:
>> http://gcov.php.net/~nlopess/dump_PHP_HEAD_z4.txt
>>
>> It was run with -z 4.  However, the reported CPU usage is only 213% 
>> (instead of ~400%).
> I spent a lot of time this weekend trying to figure this out, not 
> completely successfully. The main problem is the way that groups of 
> tests are allocated between processors.  The allocation is done 
> randomly from a list of groups (directories of tests), so for example, 
> on my two-way machine, if the "array" and the "file" tests get
> allocated to the same processor then the whole run can take as long
> with two processors as with one. The second processor just
> finishes its task list and hangs around waiting for the first one to 
> complete.
>
> Ideally all the processors would do something like pop tasks from a 
> list, so they'd all be busy all the time. However, I think this might 
> be difficult (for me) to implement correctly (?). A simpler solution 
> might be to have an allocation map either hard coded or read from a 
> configuration file - the student who implemented the parallel code did 
> allow for this. Having observed the issue with 'array' and 'file' I 
> was going to try this next. Any thoughts on the right way to do this 
> would be helpful.
>>
>> As you can see in the dump, there are a few BORK'ed tests.
> BORK'ed tests don't worry me too much. This version BORKs on tests
> that have 'unrecognised sections'; the current implementation just
> ignores them, so there are a few tests (for example those which
> contain SERVER test sections) which run-tests just silently does
> nothing with. I don't really like that behaviour and think that BORKs
> are an improvement.
>
> I'll have a look through your results and make sure that it's nothing 
> more than that.
>>
>> BTW, sometimes I get this error when launching run-tests.php 
>> (non-deterministic):
>> Fatal error: Call to undefined method rtGroupResults::run() in 
>> src/taskScheduler/rtTaskSchedulerFile.php on line 225
> That is probably a bug - sorry. I added that class at the weekend, 
> I'll look into it.
>>
>> Nuno
>>
>>
>> ----- Original Message -----
>>> Hi Nuno - did you ever get a chance to look at the parallel version 
>>> of run-tests? I had some free time to work on it last week so 
>>> REDIRECT is now implemented, I have tested it using ext/pdo_mysql 
>>> and ext/pdo_sqlite.
>>>
>>> I'm testing using about 8000 phpt tests and getting the same results 
>>> as run-tests. There is still work to do on performance but I think 
>>> it might at this stage be worth trying a bigger sample.
>>>
>>> Zoe
>>>> Alright, thanks for the reply!
>>>> I'll try to have a look at the code and give it a try next weekend.
>>>>
>>>> Nuno
>>>>
>>>>
>>>> -----Original Message----- From: zoe slattery
>>>> Sent: Monday, May 21, 2012 8:26 AM
>>>> Subject: Re: [PHP-QA] Parallel run-tests
>>>> On 21/05/2012 06:45, Nuno Lopes wrote:
>>>>> Hi Zoe,
>>>>>
>>>>> Thanks for undertaking this project!
>>>>>
>>>>> I just have a few concerns about this:
>>>>> - The speedup seems a bit low to me. Maybe with a higher number of 
>>>>> extensions enabled it will improve?..
>>>> I don't think it will improve with a higher number of extensions.
>>>> It _may_ improve as a result of someone doing some work on the way
>>>> the tests are scheduled - but I wouldn't count on it. Looking at
>>>> improving performance is important and I wanted to get some
>>>> confirmation that it is faster; I think we have that level of
>>>> confirmation now. However, if I'm the only person working on this,
>>>> further performance work will come after complete implementation +
>>>> debugging.
>>>>> - Is there any developer documentation available?  If I wanted to 
>>>>> do a change or a bug fix today, how could I do it?
>>>> Lots - https://wiki.php.net/qa/runtests. I have just updated the 
>>>> "Development" section which has instructions on how to build and 
>>>> test the code. The 'Documentation' section has not been updated 
>>>> recently but I think it is still valid.
>>>>
>>>>> - Can it be packaged as a single drop-in file replacement for 
>>>>> run-tests.php? The deployment is very important to me.  IMHO, the 
>>>>> optimal solution would be to have a drop-in replacement for the 
>>>>> current script, so that most developers wouldn't even notice the 
>>>>> difference.
>>>> Yes - totally agree. I think I experimented with packaging it as a 
>>>> .phar (so it would be called run-tests.phar, not run-tests.php), 
>>>> but it's so long ago that I can't remember. I have added it to the 
>>>> to-do list (https://wiki.php.net/qa/runtests/todos).
>>>>> - From previous emails exchanged in this thread, it seems that 
>>>>> this new version requires a few extensions to run (gzip, soap??). 
>>>>> This is not acceptable. The script must be able to run with 
>>>>> --disable-all. Of course in that case the parallel version will fail.
>>>> No, it doesn't. That was a stupid bug and I fixed it last night :-)
>>>>
>>>> Zoe
>>>>>
>>>>> Nuno
>>>>>
>>>>>
>>>>> -----Original Message----- From: zoe slattery
>>>>> Sent: Thursday, May 17, 2012 1:00 PM
>>>>> To: internals@lists.php.net ; PHP QA
>>>>> Subject: [PHP-QA] Parallel run-tests
>>>>>
>>>>> Hi
>>>>>
>>>>> Over the past couple of weeks I have updated the parallel run-tests
>>>>> (fixed a couple of minor bugs in the PHP code and the build.xml);
>>>>> it's now almost at the point where I could go ahead and implement
>>>>> the last pieces.  Here is a summary and a few questions:
>>>>>
>>>>> 1. In rebasing the code to the dev't stream I found a number of
>>>>> tests with non-standard sections. My code checks test case
>>>>> structure and objects to anything non-standard; the current
>>>>> run-tests.php mainly ignores this kind of thing. I fixed up about
>>>>> 15 of these tests (see #62022) already - I'll fix the rest if there
>>>>> are no objections - I will open another bug report first.
>>>>>
>>>>> 2. If there is agreement to use this code it would make sense to
>>>>> replace the existing run-tests code with it, or rather, it would
>>>>> make no sense to try and maintain both versions. The new code is OO
>>>>> PHP and lives in http://git.php.net/repository/phpruntests.git. Is
>>>>> there any problem with it staying there long term and maybe copying
>>>>> a run-tests.phar into the PHP source directory? I have no idea what
>>>>> the right answer is; suggestions welcome.
>>>>>
>>>>> 3. I ran a couple of small tests on my dual core Mac yesterday. For
>>>>> a standard set of tests, the parallel code ran in 207 seconds,
>>>>> sequential in 293 seconds and the standard run-tests.php took 298
>>>>> seconds. This is an improvement but I suspect we could still do
>>>>> better by looking at the scheduling algorithm.
>>>>> At the moment it's very simple: we just assemble a list of
>>>>> directories with tests in and hand them out to processors till
>>>>> everything is done. Being able to handle tests that must be run in
>>>>> sequence (mysql, mysqli) will mean making some changes to this. So,
>>>>> perhaps we give an explicit list to p1 and let the scheduler
>>>>> distribute the rest of the tests? Or maybe we should have a
>>>>> 'process map' for all tests for extensions that are built by
>>>>> default? Again, suggestions welcome.
>>>>>
>>>>> 4. REDIRECTTEST still needs to be implemented; I understand how it
>>>>> works and this isn't (afaict) a major issue.
>>>>>
>>>>> 5. Testing. I'm able to do basic testing on Mac OS X and Linux. I
>>>>> really need access to an 8-way Linux system, or someone who has
>>>>> access and would be interested in looking at performance. Any
>>>>> volunteers? This is probably the most interesting part of the
>>>>> project :-)
>>>>>
>>>>> 6. Windows. I'm not in a position to do anything much with Windows
>>>>> except some very basic checks to make sure that the sequential
>>>>> version runs. The parallel code won't work because we used the
>>>>> pcntl extension; however, I know that Stefan and George were keen
>>>>> to design the code so that a Windows solution could be implemented
>>>>> if anyone thought of one. If anyone wants to pick up this aspect
>>>>> I'd be happy to get them started.
>>>>>
>>>>> Zoe 
>>
>>
>
>