Newsgroups: php.internals,php.qa
Message-ID: <507C7DE0.5030508@gmail.com>
Date: Mon, 15 Oct 2012 22:19:28 +0100
To: Nuno Lopes <nlopess@php.net>
CC: internals@lists.php.net, PHP QA <php-qa@lists.php.net>
References: <4FB4E844.2070305@gmail.com> <9AF025709ED6452F9E5A91834F82F788@PC07655> <4FB9EE1E.3050701@gmail.com> <BCC3B4980516429D9BF61D910B09548D@PC07655> <506E9442.6050501@gmail.com> <038CBF95DEC5479DA771E23528A33FB3@pc07654>
In-Reply-To: <038CBF95DEC5479DA771E23528A33FB3@pc07654>
Subject: Re: [PHP-QA] Parallel run-tests
From: zoe.slattery@gmail.com (zoe slattery)

Hi Nuno

> Hi,
>
> Here you have a dump of a run of PHP_HEAD in the gcov machine (almost 
> 13k tests) without valgrind:
> http://gcov.php.net/~nlopess/dump_PHP_HEAD_z4.txt
>
> It was run with -z 4.  However, the reported CPU usage is only 213% 
> (instead of ~400%).
I spent a lot of time this weekend trying to figure this out, not 
completely successfully. The main problem is the way that groups of 
tests are allocated between processors.  The allocation is done randomly 
from a list of groups (directories of tests), so, for example, on my 
two-way machine, if the "array" and the "file" tests get allocated to 
the same processor then the whole run can take as long with two 
processors as with one. The second processor just finishes its task list 
and hangs around waiting for the first one to complete.
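
To make the problem concrete, here is a rough sketch (in Python rather 
than PHP, only to keep it short). The per-group timings are invented for 
illustration and are not measurements from any real run; the function 
name is hypothetical too.

```python
# Hypothetical per-group run times in seconds (made up, not measured).
GROUP_SECONDS = {"array": 90, "file": 85, "strings": 20, "math": 10}


def makespan(allocation, durations):
    """Wall-clock time of a run: the total of the busiest processor."""
    return max(
        (sum(durations[g] for g in groups) for groups in allocation),
        default=0,
    )


# Both slow groups land on the same processor.
unlucky = [["array", "file"], ["strings", "math"]]
# The slow groups are split between the processors.
balanced = [["array", "math"], ["file", "strings"]]

print(makespan(unlucky, GROUP_SECONDS), makespan(balanced, GROUP_SECONDS))
```

With these invented numbers the unlucky split takes nearly as long as 
running all four groups on one processor, while the second processor 
sits idle for most of the run.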

Ideally all the processors would do something like pop tasks from a 
list, so they'd all be busy all the time. However, I think this might be 
difficult (for me) to implement correctly (?). A simpler solution might 
be to have an allocation map either hard coded or read from a 
configuration file - the student who implemented the parallel code did 
allow for this. Having observed the issue with 'array' and 'file' I was 
going to try this next. Any thoughts on the right way to do this would 
be helpful.
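
As a starting point for discussion, this is one way an allocation map 
could be generated rather than written by hand. It is only a sketch, in 
Python rather than PHP for brevity, and it assumes that rough per-group 
timings are available from an earlier run (they are invented here). The 
name build_allocation_map is hypothetical and is not part of the 
phpruntests code. It hands the longest groups out first, each to 
whichever processor would be free soonest, which is also roughly what 
processors popping tasks from a shared list would end up doing.

```python
import heapq


def build_allocation_map(durations, workers):
    """Greedy longest-first allocation of test groups to processors."""
    # Heap of (time at which the processor becomes free, processor index).
    heap = [(0, w) for w in range(workers)]
    heapq.heapify(heap)
    allocation = {w: [] for w in range(workers)}
    # Longest groups first; ties broken by name so the result is stable.
    for group in sorted(durations, key=lambda g: (-durations[g], g)):
        busy_until, w = heapq.heappop(heap)
        allocation[w].append(group)
        heapq.heappush(heap, (busy_until + durations[group], w))
    return allocation


# Hypothetical timings in seconds (made up, not measured).
timings = {"array": 90, "file": 85, "strings": 20, "math": 10}
print(build_allocation_map(timings, 2))
```

The resulting map could be written to a configuration file and read back 
by the scheduler, so the timings would only need refreshing occasionally.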
>
> As you can see in the dump, there are a few BORK'ed tests.
BORK'ed tests don't worry me too much. This version BORKs on tests that 
have 'unrecognised sections'. The current implementation just ignores 
them, so there are a few tests (for example those which contain SERVER 
test sections) which run-tests just silently does nothing with. I don't 
really like that behaviour and think that BORKs are an improvement.
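
For anyone who wants to find such tests before running them, the check 
amounts to something like the sketch below (Python rather than PHP, for 
brevity). The set of section names is an illustrative subset that I am 
assuming for the example, not the authoritative list used by either 
implementation, and the function name is hypothetical.

```python
import re

# Illustrative subset of phpt section names (an assumption, not the
# authoritative list from either run-tests implementation).
KNOWN_SECTIONS = {
    "TEST", "DESCRIPTION", "CREDITS", "SKIPIF", "INI", "ARGS", "ENV",
    "GET", "POST", "COOKIE", "STDIN", "FILE", "REDIRECTTEST",
    "EXPECT", "EXPECTF", "EXPECTREGEX", "CLEAN",
}

# A section header is a line of the form --NAME--.
SECTION_RE = re.compile(r"^--([A-Z_]+)--\s*$", re.MULTILINE)


def unrecognised_sections(phpt_text):
    """Return section names in a phpt file that are not in KNOWN_SECTIONS."""
    return [
        name for name in SECTION_RE.findall(phpt_text)
        if name not in KNOWN_SECTIONS
    ]


sample = "--TEST--\nexample\n--SERVER--\nFOO=bar\n--FILE--\n<?php echo 1; ?>\n--EXPECT--\n1\n"
print(unrecognised_sections(sample))
```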

I'll have a look through your results and make sure that it's nothing 
more than that.
>
> BTW, sometimes I get this error when launching run-tests.php 
> (non-deterministic):
> Fatal error: Call to undefined method rtGroupResults::run() in 
> src/taskScheduler/rtTaskSchedulerFile.php on line 225
That is probably a bug - sorry. I added that class at the weekend, I'll 
look into it.
>
> Nuno
>
>
> ----- Original Message -----
>> Hi Nuno - did you ever get a chance to look at the parallel version 
>> of run-tests? I had some free time to work on it last week so 
>> REDIRECT is now implemented, I have tested it using ext/pdo_mysql and 
>> ext/pdo_sqlite.
>>
>> I'm testing using about 8000 phpt tests and getting the same results 
>> as run-tests. There is still work to do on performance but I think it 
>> might at this stage be worth trying a bigger sample.
>>
>> Zoe
>>> Alright, thanks for the reply!
>>> I'll try to have a look at the code and give it a try next weekend.
>>>
>>> Nuno
>>>
>>>
>>> -----Original Message----- From: zoe slattery Sent: Monday, May 21, 
>>> 2012 8:26 AM
>>> Subject: Re: [PHP-QA] Parallel run-tests
>>> On 21/05/2012 06:45, Nuno Lopes wrote:
>>>> Hi Zoe,
>>>>
>>>> Thanks for undertaking this project!
>>>>
>>>> I just have a few concerns about this:
>>>> - The speedup seems a bit low to me. Maybe with a higher number of 
>>>> extensions enabled it will improve?..
>>> I don't think it will improve with a higher number of extensions. It 
>>> _may_ improve as a result of someone doing some work on how the 
>>> tests are scheduled - but I wouldn't count on it. Looking at 
>>> improving performance is important and I wanted to get some 
>>> confirmation that it is faster; I think we have that level of 
>>> confirmation now. However, if I'm the only person working on this, 
>>> further performance work will come after complete implementation + 
>>> debugging.
>>>> - Is there any developer documentation available?  If I wanted to 
>>>> do a change or a bug fix today, how could I do it?
>>> Lots - https://wiki.php.net/qa/runtests. I have just updated the 
>>> "Development" section which has instructions on how to build and 
>>> test the code. The 'Documentation' section has not been updated 
>>> recently but I think it is still valid.
>>>
>>>> - Can it be packaged as a single drop-in file replacement for 
>>>> run-tests.php? The deployment is very important to me.  IMHO, the 
>>>> optimal solution would be to have a drop-in replacement for the 
>>>> current script, so that most developers wouldn't even notice the 
>>>> difference.
>>> Yes - totally agree. I think I experimented with packaging it as a 
>>> .phar (so it would be called run-tests.phar, not run-tests.php), but 
>>> it's so long ago that I can't remember. I have added it to the to-do 
>>> list (https://wiki.php.net/qa/runtests/todos).
>>>> - From previous emails exchanged in this thread, it seems that this 
>>>> new version requires a few extensions to run (gzip, soap??). This 
>>>> is not acceptable. The script must be able to run with 
>>>> --disable-all. Of course in that case the parallel version will fail.
>>> No, it doesn't. That was a stupid bug and I fixed it last night :-)
>>>
>>> Zoe
>>>>
>>>> Nuno
>>>>
>>>>
>>>> -----Original Message----- From: zoe slattery
>>>> Sent: Thursday, May 17, 2012 1:00 PM
>>>> To: internals@lists.php.net ; PHP QA
>>>> Subject: [PHP-QA] Parallel run-tests
>>>>
>>>> Hi
>>>>
>>>> Over the past couple of weeks I have updated the parallel run-tests
>>>> (fixed a couple of minor bugs in the PHP code and the build.xml), it's
>>>> now almost at the point where I could go ahead and implement the last
>>>> pieces.  Here is a summary and a few questions:
>>>>
>>>> 1. In rebasing the code to the development stream I found a number of
>>>> tests with non-standard sections. My code checks test case structure
>>>> and objects to anything non-standard; the current run-tests.php mainly
>>>> ignores this kind of thing. I have already fixed up about 15 of these
>>>> tests (see #62022). I'll fix the rest if there are no objections, and
>>>> will open another bug report first.
>>>>
>>>> 2. If there is agreement to use this code it would make sense to
>>>> replace the existing run-tests code with it, or rather, it would make
>>>> no sense to try and maintain both versions. The new code is OO PHP and
>>>> it's in http://git.php.net/repository/phpruntests.git. Is there any
>>>> problem with it staying there long term and maybe copying a
>>>> run-tests.phar into the PHP source directory? I have no idea what the
>>>> right answer is, suggestions welcome.
>>>>
>>>> 3. I ran a couple of small tests on my dual core Mac yesterday. For a
>>>> standard set of tests, the parallel code ran in 207 seconds,
>>>> sequential in 293 seconds and the standard run-tests.php took 298
>>>> seconds. This is an improvement but I suspect we could still do better
>>>> by looking at the scheduling algorithm.
>>>> At the moment it's very simple: we just assemble a list of directories
>>>> with tests in and hand them out to processors till everything is done.
>>>> Being able to handle tests that must be run in sequence (mysql,
>>>> mysqli) will mean making some changes to this. So, perhaps we give an
>>>> explicit list to p1 and let the scheduler distribute the rest of the
>>>> tests? Or maybe we should have a 'process map' for all tests for
>>>> extensions that are built by default? Again, suggestions welcome.
>>>>
>>>> 4. REDIRECTTEST still needs to be implemented. I understand how it
>>>> works and this isn't (afaict) a major issue.
>>>>
>>>> 5. Testing. I'm able to do basic testing on Mac OS X and Linux. I
>>>> really need access to an 8 way Linux system, or someone who has access
>>>> and would be interested in looking at performance. Any volunteers?
>>>> This is probably the most interesting part of the project :-)
>>>>
>>>> 6. Windows. I'm not in a position to do anything much with Windows
>>>> except some very basic checks to make sure that the sequential version
>>>> runs. The parallel code won't work because we used pcntl(), however I
>>>> know that Stefan and George were keen to design the code so that a
>>>> Windows solution could be implemented if anyone thought of one. If
>>>> anyone wants to pick up this aspect I'd be happy to get them started.
>>>>
>>>> Zoe 
>
>