Hi, internals!
I've got a suggestion about refactoring our test suite. I'd like to
remove the XFAIL mechanism and mark all failing tests simply as FAIL.
The problem with XFAIL is that it diverts attention from tests that fail
because of not-yet-fixed bugs (most important) or not-yet-implemented
features (less important).
Failing tests should cause pain. They should bug you every day until
you go and fix them.
XFAILs now serve as pain-killers; we've got about 50 of them in the
repo, so devs (I assume) think this way: "It's failing, but it's
EXPECTED to fail, so let's leave it as is".
That's the wrong way to think. Either a test is correct, in which case a
failure means you should fix the code and leave the test failing until
the code is fixed, or the test is incorrect, in which case you should
fix the test or remove it completely.
The reasons for introducing XFAILs were described in this article:
http://zoomsplatter.blogspot.com/2008/06/why-xfail.html
I'll quote some thoughts from there:
The intention of XFAIL is to help people working on developing PHP. Consider first the situation where you (as a PHP implementer) are working through a set of failing tests. You do some analysis on one test but you can't fix the implementation until something else is fixed – however – you don't want to lose the analysis and it might be some time before you can get back to the failing test. In this case I think it's reasonable to add an XFAIL section with a brief description of the analysis. This takes the test out of the list of reported failures making it easier for you to see what is really on your priority list but still leaving the test as a failing test.
If you need something else to be fixed, leave your test failing; it will
annoy everybody, so that something else will get fixed faster. If you
"don't want to lose the analysis", you can keep a list of the tests you
need (run-tests.php has a -l option which can read/write test filenames
from a file) and use that. Failing tests should not be hidden.
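For readers who haven't seen it: the XFAIL mechanism under discussion is just an extra section in a .phpt test file, picked up by run-tests.php. A minimal sketch (the bug number and test body here are invented for illustration):

```
--TEST--
Bug #12345 (something is broken)
--XFAIL--
Fails until bug #12345 is fixed; see the analysis in the bug report.
--FILE--
<?php
var_dump(broken_function());
?>
--EXPECT--
int(42)
```

With the --XFAIL-- section present, the failure is reported as XFAIL instead of FAIL, which is exactly the "pain-killer" effect being debated.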
The second place that I can see that XFAIL might be useful is when a group of people are working on the same development project. Essentially one person on the project finds a missing feature or capability but it isn't something they can add immediately, or maybe another person has agreed to implement it. A really good way to document the omission is to write a test case which is expected to fail but which will pass if the feature is implemented. This assumes that there is general agreement that implementation is a good idea and needs to be done at some stage.
These "feature tests" can be put in a separate branch so they don't
pollute the main release branches until the feature is ready. Such tests
are usually called "acceptance tests": they mark whether a feature fully
implements its functionality or not. We could also introduce an
"Incomplete" state for these tests, like PHPUnit does.
What do you think?
--
Regards,
Shein Alexey
Hi!
I've got a suggestion about refactoring our tests suite. I'd like to
remove XFAIL institution and mark all failing tests just as FAIL.
XFAIL has a problem that it hides attention from failing tests
depending on not yet fixed bugs (most important), not yet implemented
features (less important).
Failed tests should make pain. They should bug you every day until you
go and fix them.
Please note that we were in that position and we moved from there. So to
move back, we need some argument about what's different this time from
the place we were a year ago.
XFAILs serve now as a pain-killers, we've got about 50 of them in the
repo, so devs (I assume) think this way: "It's failing, but it's
EXPECTED to fail, so let's leave it as is".
Leaving it as is was happening anyway. It's not like we had crowds of
devs descending on test fails but ignoring xfails. Most of the xfails
were plain fails before and sat there ignored for years. So the
difference was "running constantly with 50 fails" versus "having some
xfails but detecting new fails easily, since we normally have no, or
very few, fails".
The problem is exactly that there are no devs thinking like you imagine
them to think.
from a file) and use that. Failing tests should not be hidden.
They are not hidden. But they were not being fixed when they were just
fails - the only thing that happened was that we constantly ran with
tons of fails, so it was impossible to distinguish the situation of
"everything is fine" from "the build is FUBAR".
functions or not. We could also introduce "Incomplete" state like it's
done in PHPUnit for these tests.
So what's the difference between xfail and incomplete?
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
Please note that we were in that position and we moved from there. So to
move back, we need some argument about what's different this time from
the place we were a year ago.
could you elaborate on this part? where were we a year ago?
XFAILs serve now as a pain-killers, we've got about 50 of them in the
repo, so devs (I assume) think this way: "It's failing, but it's
EXPECTED to fail, so let's leave it as is".
Leaving it as is was happening anyway. It's not like we had crowds of
devs descending on test fails but ignoring xfails. Most of xfails were
fails and were sitting there ignored for years. So the difference was
"running constantly with 50 fails" or "having some xfails but detecting
new fails easily since we have no or very little fails normally".
The problem is exactly that there are no devs thinking like you imagine
them to think.
yeah, but as we've seen, the current approach makes it very easy to
"hide" even not-so-small issues (for example the bunch of date-related
XFAILs which you personally asked multiple times to be fixed before the
5.4 release).
I think that in its current form XFAIL hurts more than it helps.
from a file) and use that. Failing tests should not be hidden.
They are not hidden. But they were not being fixed when they were just
fails - only thing that happened is that we constantly run with tons of
fails, so it was impossible to distinguish situation of "everything is
fine" from "the build is FUBAR".
yeah, and the exact same thing happened with 5.3.7/CVE-2011-3189, even
though we had XFAILs.
I think that eliminating the failing tests and making the fails noisy
would be a better approach.
I think those spontaneous "TestFests" initiated by Rasmus really
helped, and now that we are on git/github there would be an even
greater audience.
functions or not. We could also introduce "Incomplete" state like it's
done in PHPUnit for these tests.
So what's the difference between xfail and incomplete?
http://www.phpunit.de/manual/3.2/en/incomplete-and-skipped-tests.html
currently XFAIL can mean either that the test is incomplete, or that
the functionality is missing / the code is broken but we expect that
for some reason.
Killing XFAIL and adding the Incomplete test feature would cover the
first case, but the second should be a failing test.
--
Ferenc Kovács
@Tyr43l - http://tyrael.hu
Hi!
could you elaborate on this part? where were we a year ago?
We had many failing tests - the ones that are now XFAILs - classified as
regular FAILs.
yeah, but as we did see, the current approach makes it very easy to
"hide" even the not so small issues (for example the bunch of date
related XFAILS which you personally asked multiple times to be fixed
before the 5.4 release).
And did that happen while they were FAILs? No, it did not. These fails
were still ignored.
I think that in it's current form XFAIL hurts more than it helps.
Hurts what? What is worse than before? Every problem you describe we
had before, plus ones then that we don't have now.
I think that eliminating the failing tests and making the fails noisy
would be a better approach.
Better in which regard? We know for a fact that having test fails does
not lead to people promptly fixing them. We simply had 50 test failures
for a year, and people stopped regarding 50 test failures as something
exceptional - we always had tons of test fails, so who cares if there's
one or two or ten more?
So if you propose going back to what we already had a year ago, you
still have to explain how the situation would be better than it was a
year ago - what exactly has changed?
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
On 30 March 2012 at 3:19, Stas Malyshev
smalyshev@sugarcrm.com wrote:
Hi!
I've got a suggestion about refactoring our tests suite. I'd like to
remove XFAIL institution and mark all failing tests just as FAIL.
XFAIL has a problem that it hides attention from failing tests
depending on not yet fixed bugs (most important), not yet implemented
features (less important).
Failed tests should make pain. They should bug you every day until you
go and fix them.
Please note that we were in that position and we moved from there. So to
move back, we need some argument about what's different this time from
the place we were a year ago.
XFAILs serve now as a pain-killers, we've got about 50 of them in the
repo, so devs (I assume) think this way: "It's failing, but it's
EXPECTED to fail, so let's leave it as is".
Leaving it as is was happening anyway. It's not like we had crowds of
devs descending on test fails but ignoring xfails. Most of xfails were
fails and were sitting there ignored for years. So the difference was
"running constantly with 50 fails" or "having some xfails but detecting
new fails easily since we have no or very little fails normally".
The problem is exactly that there are no devs thinking like you imagine
them to think.
The difference started with the 5.3.9 release, when we started to pay
much more attention to tests.
You can now clearly see the failing tests - it's not that huge a list -
so that's a big difference.
If the problem is only finding new fails, we can use Jenkins for that -
it already detects new fails in builds and can mail them here, so they
won't go unnoticed. It could also bug the committer who broke the
build - a nice feature to have.
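The "detect new fails" step Jenkins would perform can be sketched in a few lines of shell: diff the failed-test list of the current build against the previous one. File names and sample contents below are invented; run-tests.php can produce such lists (the -l option mentioned earlier in the thread reads/writes test filenames from a file).

```shell
#!/bin/sh
# Hypothetical post-build step: find tests that fail now but did not
# fail in the previous build, so only *new* regressions get reported.
# Sample data stands in for the failed-test lists of two builds.
printf 'ext/date/tests/bug54340.phpt\n' > prev_fails.txt
printf 'ext/date/tests/bug54340.phpt\next/spl/tests/bug60082.phpt\n' > curr_fails.txt

# comm needs sorted input; -13 prints lines unique to the second file,
# i.e. the tests that started failing in the current build.
sort prev_fails.txt > prev.sorted
sort curr_fails.txt > curr.sorted
comm -13 prev.sorted curr.sorted > new_fails.txt
cat new_fails.txt
```

Here the output would be the single regressed test, ext/spl/tests/bug60082.phpt; a notification job would mail the list only when it is non-empty.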
The main point I'm trying to make is that it's comfortable to live with
XFAILs. That's why they live for years. They don't create any pressure,
and we don't have a release rule of "no failing tests", so they go from
release to release until some hero comes along and fixes them. By
turning them into FAILs they become a common problem, because they
start to annoy everyone, and it's easier to collaborate on fixing them.
from a file) and use that. Failing tests should not be hidden.
They are not hidden. But they were not being fixed when they were just
fails - only thing that happened is that we constantly run with tons of
fails, so it was impossible to distinguish situation of "everything is
fine" from "the build is FUBAR".
They are not hidden, but they don't really bother anyone.
My setup (quite limited, though) on the master branch reports this:
Number of tests : 12100 8194
Tests skipped : 3906 ( 32.3%) --------
Tests warned : 0 ( 0.0%) ( 0.0%)
Tests failed : 3 ( 0.0%) ( 0.0%)
Expected fail : 35 ( 0.3%) ( 0.4%)
Tests passed : 8156 ( 67.4%) ( 99.5%)
We have 3 failing tests and 35 xfails - I don't see any "tons of fails"
here. Sorry if I sound like a broken record, but if we need to fix
those, we need to make more noise about them.
functions or not. We could also introduce "Incomplete" state like it's
done in PHPUnit for these tests.
So what's the difference between xfail and incomplete?
XFAIL means a test that is expected to fail: if it fails, then it's OK.
That's how I understand it. A failing test should not be OK - it's an
error. If you get used to not paying attention to failing tests, you're
in a dangerous situation. It's like the fairy tale about the boy who
cried "Wolf!" - by the end of the story nobody trusts him. That's why I
think we should restore trust in failing tests.
As for Incomplete, it doesn't seem to fit here much - it means the test
itself is not fully written or finished.
For example, if we have a plan for some release branch (say, 5.4) to
implement certain features, we can have failing/incomplete acceptance
tests for those (in a separate suite, for example), so the release is
just a matter of making all tests pass.
If a feature is quite big and can take several releases (traits come to
mind), it can always be put into a separate branch until it's ready.
--
Regards,
Shein Alexey
Hi!
The difference started from 5.3.9 release when we start to pay much
more attention to tests.
You now can cleanly see failing tests, it's not that huge list, so
it's big difference.
Yes, and removing XFAILs would kill that advantage.
The main idea I'm trying to say is that it's comfortable to live with
XFAILs. That's why they live by years. They don't get make any
pressure, we don't have a release rule "No failing tests", so they go
You talk about "making pressure", but when the date fails were sitting
in the tests as FAILs, they didn't create any "pressure" and nobody was
fixing them. And if we had a rule of "no failing tests", we'd have had
no releases for years now, because nobody is fixing those tests and the
bugs behind them. You want to fix them? Go ahead, no problem. But if
there's nobody to fix them - what's the use of putting them in FAILs
and preventing us from seeing the issues that are actually going to be
fixed?
We have 3 failing tests and 35 xfails, I don't see any tons of fails
here. Sorry, if I sound like a broken record, but if we need to fix
those, we need to make more noise about that.
OK, you made noise. Let's see how many of those 35 xfails get fixed in,
say, a month. How many would you predict?
XFAIL - expected to fail test. If it's fails - then it's ok. That's
how I understand it. Failing test should not be ok, it's an error. If
you get used to not paying attention on failing tests, you're in
dangerous situation. It's like a fairy tale about boy that cried
Nobody is paying attention already, so it's not an "if", it's a fact.
It's a sad fact, but still a fact. And it's not a result of the XFAILs,
because this situation predates XFAILs and existed before we moved such
tests to XFAILs.
About incomplete, well, it seems it doesn't suite here much, it's
about that test is not fully written or finished.
If your test is not finished, work on it in a fork. By the time the
feature gets merged into the main branches, it should be complete
enough to run the tests.
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
Hi,
As a distribution maintainer, I would like to distinguish between a
"failed test that is ok" and a "failed test that is not ok".
I think release versions should not have failing tests.
How about adding "Dev" and "Release" modes for tests, with "Dev" mode
as the default?
--
Yasuo Ohgaki
yohgaki@ohgaki.net
Stas Malyshev wrote:
And if we had rule of "no failing tests", we'd have no
releases for years now, because nobody is fixing those tests and bugs
behind them.
I was about to suggest that maybe PHP should have a rule: "no release with failing tests".
What's the point of a test that fails (or XFAILs)? Either something is broken - then it should be fixed. Or the test makes no sense - then it should be removed.
Yes, I realize this is a simplistic view and things are more complicated in practice. But since, as stated, PHP is paying more attention to tests now (which is a good thing!), maybe it's time to take this one step further.
bye, Dirk
Hi!
I was about to suggest that maybe PHP should have a rule: "no release
with failing tests".
In current situation, this rule would be a bit shorter: "no release".
What's the point of a test that fails (or XFAILs)? Either something
is broken - then it should be fixed. Or the test makes no sense -
then it should be removed.
You are completely right. Please fix it.
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
hi,
What's the point of a test that fails (or XFAILs)? Either something is broken - then it should be fixed. Or the test makes no sense - then it should be removed.
See the archives for the reasons; it has been discussed to death many
times.
To me, the main reason is:
A test fails but a fix is not easy or not possible right now. To keep
it on our radar, it is still added to our test suite and listed as an
"expected failure". This is very common practice.
That being said, as Stas wrote, feel free to fix them all (the bugs) :-)
Cheers,
Pierre
@pierrejoye | http://blog.thepimp.net | http://www.libgd.org
On 30 March 2012 at 5:55, Stas Malyshev
smalyshev@sugarcrm.com wrote:
Hi!
The difference started from 5.3.9 release when we start to pay much
more attention to tests.
You now can cleanly see failing tests, it's not that huge list, so
it's big difference.
Yes, and removing XFAILs would kill that advantage.
The main idea I'm trying to say is that it's comfortable to live with
XFAILs. That's why they live by years. They don't get make any
pressure, we don't have a release rule "No failing tests", so they go
You talk about "making pressure", but when date fails were sitting in
the tests as FAILs, they didn't make any "pressure" and nobody was
fixing them. And if we had rule of "no failing tests", we'd have no
releases for years now, because nobody is fixing those tests and bugs
behind them. You want to fix them? Go ahead, no problem. But if there's
nobody to fix them - what's the use to put them in FAILs and prevent us
from seeing issues that are going to be fixed?
They didn't create any pressure because they were not frequently
exposed on the list and IRC.
What I think should be done:
- Make daily notifications about failing tests on this mailing list and
IRC. This will create pressure and make sure that nobody forgets that
we still have problems that need to be solved.
BTW, it's really strange that we still don't have any notifications
about failed builds, while the phpdoc project does. I don't think those
guys are smarter than us :)
- Create an explicit distinction between release-stopper tests (let's
call them acceptance tests) and the usual functional/unit tests. For
example, we create an "acceptance" folder under each "tests/" folder
and put there all the tests that should never be broken. If those tests
are broken, a release can't be made.
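The daily notification from the first point could be generated by a trivial script. A sketch under an assumed layout - per-branch files listing the currently failing tests, one per line; the file names, branch set, and sample contents are all invented for illustration:

```shell
#!/bin/sh
# Hypothetical daily-status generator: one summary line per branch that
# still has failing tests. Sample data replaces real run-tests.php output.
printf 'ext/date/tests/bug1.phpt\next/date/tests/bug2.phpt\n' > fails_5.3.txt
printf 'ext/spl/tests/bug3.phpt\n' > fails_master.txt

for branch in 5.3 5.4 master; do
  f="fails_$branch.txt"
  # grep -c '' counts lines without the padding wc adds on some systems;
  # branches with an empty or missing list are skipped.
  [ -s "$f" ] && printf '%s: %s failing tests\n' "$branch" "$(grep -c '' "$f")"
done > report.txt
cat report.txt
```

The resulting report.txt could then be posted to the list and IRC by a cron job; in this sample run, 5.4 has no list and is simply omitted from the report.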
We have 3 failing tests and 35 xfails, I don't see any tons of fails
here. Sorry, if I sound like a broken record, but if we need to fix
those, we need to make more noise about that.
OK, you made noise. Let's see how many of those 35 xfails get fixed,
let's say, in a month. How many you would predict it would be?
That's not noise. See point 1 above. If we don't set up constant
notifications, people won't feel pressure.
Of course, it's easy to tune the spam filter in your mail client or ban
a bot on IRC - that's why I'm asking for agreement here, to make it a
part of the development process.
Guys, I respect you very much, all of you. I can feed my family because
of your work. I'm really trying to help. Please don't take it
personally, and let's try to find a solution together. I assume we at
least agree that we have a problem here.
XFAIL - expected to fail test. If it's fails - then it's ok. That's
how I understand it. Failing test should not be ok, it's an error. If
you get used to not paying attention on failing tests, you're in
dangerous situation. It's like a fairy tale about boy that cried
Nobody already is paying attention, so it's not an "if", it's a fact.
It's a sad fact, but still a fact. And it's not result of the XFAILs,
because this situation predates XFAILs and was there before we moved
such tests to XFAILs.
See above.
About incomplete, well, it seems it doesn't suite here much, it's
about that test is not fully written or finished.
If your test is not finished, do it in a fork. By the time the feature
gets merged into main branches, it should be complete enough to run the
tests.
Yes, it's a sane way too.
--
Regards,
Shein Alexey
That's not a noise. See p.1 above. If we don't setup constant
notifications, people won't feel pressure.
We do get constant notifications of bugs assigned to us. I don't
believe they have any impact on the fix rate.
We need a balance between carrot & stick here. The carrot being
extra hands & a general positive attitude (n.b. by caring about
this issue you are exhibiting both, so my comment is a general
one)
Chris
--
Email: christopher.jones@oracle.com
Tel: +1 650 506 8630
Blog: http://blogs.oracle.com/opal/
Hi, internals!
I've got a suggestion about refactoring our tests suite. I'd like to
remove XFAIL institution and mark all failing tests just as FAIL.
XFAIL has a problem that it hides attention from failing tests
depending on not yet fixed bugs (most important), not yet implemented
features (less important).
Failed tests should make pain. They should bug you every day until you
go and fix them.
XFAILs serve now as a pain-killers, we've got about 50 of them in the
repo, so devs (I assume) think this way: "It's failing, but it's
EXPECTED to fail, so let's leave it as is".
That's wrong thinking. Either tests are correct and if they fail you
should fix the code and leave them failed until the code is fixed, or,
if the tests are incorrect - fix the tests or remove them completely.
The XFAIL mechanism reflects the reality of open source: not all bugs
get fixed. We need a simple, low-maintenance way to ship a 'clean'
testsuite which exhibits minimal noise, so that users don't waste time
investigating known failures.
XFAIL also allows end users to see when something that used to work has
broken.
If the system is being overused, feel free to call people out on it.
I don't think it should be used for unimplemented features long term.
XFAIL is a simple mechanism. Anything different, like moving tests to a
special 'failed' directory, adds burden. I don't believe we have spare
cycles for this, but I would be happy to be proved wrong.
Chris
--
Email: christopher.jones@oracle.com
Tel: +1 650 506 8630
Blog: http://blogs.oracle.com/opal/
On 30 March 2012 at 22:16, Christopher Jones
christopher.jones@oracle.com wrote:
Hi, internals!
I've got a suggestion about refactoring our tests suite. I'd like to
remove XFAIL institution and mark all failing tests just as FAIL.
XFAIL has a problem that it hides attention from failing tests
depending on not yet fixed bugs (most important), not yet implemented
features (less important).
Failed tests should make pain. They should bug you every day until you
go and fix them.
XFAILs serve now as a pain-killers, we've got about 50 of them in the
repo, so devs (I assume) think this way: "It's failing, but it's
EXPECTED to fail, so let's leave it as is".
That's wrong thinking. Either tests are correct and if they fail you
should fix the code and leave them failed until the code is fixed, or,
if the tests are incorrect - fix the tests or remove them completely.
The XFAIL mechanism reflects the reality of open source that not all
bugs are fixed. We need a simple, low maintenance way to have a
'clean' testsuite shipped which exhibits minimal noise so that users
don't waste time investigating known failures.
I'm trying to solve two different problems here:
- Separate the clean test suite (new failed bugs) from the known failed
bugs (as you said) - XFAIL solves that.
- Keep devs' attention on known failures - XFAIL doesn't solve that.
You remember them only when you run the tests and choose to pay
attention to them.
What I propose is a single daily newsletter saying "Hey, guys! We still
have XFAIL bugs on 5.3 <list bugs>, 5.4 <list bugs> and master <list
bugs>. Bye!" That would create some pressure, especially if those bugs
have maintainers.
We do get constant notification of bugs assigned to us. I don't
believe it has any impact on the fix rate.
Hmm, that's different. You get a notification only if there's some
change on that bug (new comment / state change / patch, etc.). If a bug
hasn't changed for years, you won't get any notifications, so it's more
likely you'll forget about it.
XFAIL also allows end users to see when something has broken that used
to work.
Maybe, but it's not the best way, since it involves manually editing
the phpt source (FAIL -> XFAIL). A Jenkins build-failure notification
solves this better.
If the system is being overused, feel free to call people out on it.
I don't think it should be used for unimplemented features long term.
XFAIL is a simple mechanism. Anything different like moving tests to
a special 'failed' directory adds burden. I don't belive we have
extra cycles for this, but would be happy to be proved wrong.
Agreed, that's a lot of work; we need to try something else. The
problem here is "which bugs need to be solved for a release to be
made?". We need to separate those somehow. XFAIL doesn't really help
here, since it just means "bugs that are hard to solve" and doesn't
enforce any priority.
For the 5.4 release, Stas used the wiki to keep track of
release-blocking bugs.
--
Regards,
Shein Alexey
Hmm, that's different. You get a notification if there's some change
on that bug (new comment/state changed/patch etc.). If bug didn't
change for years, you won't get any notifications -> it's more likely
you forget about it.
That's not true. There is a weekly reminder email if you have
outstanding open bugs assigned to you. Although I haven't seen one for a
little while, so we may finally have given up on that since it was
completely ineffective.
-Rasmus
Hi!
That's not true. There is a weekly reminder email if you have
outstanding open bugs assigned to you. Although I haven't seen one for a
little while, so we may finally have given up on that since it was
completely ineffective.
Actually, this one I'd like to keep - though I'd prefer a monthly one.
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
2012/3/31 Stas Malyshev smalyshev@sugarcrm.com:
Hi!
That's not true. There is a weekly reminder email if you have
outstanding open bugs assigned to you. Although I haven't seen one for a
little while, so we may finally have given up on that since it was
completely ineffective.
Actually, this one I'd like to keep - though I'd prefer monthly one.
+1 for monthly.
Today, I assigned many bugs to myself.
No reminders at all is not good; weekly is too much.
--
Yasuo Ohgaki
yohgaki@ohgaki.net
hi,
+1 for monthly.
That's already the case, and I get them regularly.
Cheers,
Pierre
@pierrejoye | http://blog.thepimp.net | http://www.libgd.org
On 31 March 2012 at 12:34, Rasmus Lerdorf rasmus@lerdorf.com wrote:
Hmm, that's different. You get a notification if there's some change
on that bug (new comment/state changed/patch etc.). If bug didn't
change for years, you won't get any notifications -> it's more likely
you forget about it.
That's not true. There is a weekly reminder email if you have
outstanding open bugs assigned to you. Although I haven't seen one for a
little while, so we may finally have given up on that since it was
completely ineffective.
Ok, we have a weekly reminder to bug maintainers (which may not be
working). That's a bit different - I'm talking about a public email
about failed tests. If a bug is closed, the maintainer won't get a
notification, and if a bug is reopened, only the maintainer will get a
notification, not everybody on the list.
--
Regards,
Shein Alexey
Ok, we have a weekly reminder to bug maintainers (that maybe not
working).
Those are working. At least for me :-)
There was some trouble with mails for individual changes, but recently
I got a "you have been assigned" mail, too.
That's a bit different, I'm talking about public email about
failed tests - if bug is closed, maintainer won't get a notfication
and, if bug is reopened, only maintainer will get a notification, not
everybody on list.
Which can also be a lot; many tests, unfortunately, depend on the
environment, like the OS or external systems (library version, database
configuration, etc.), and it is hard to cover all those combinations
while still testing edge-case conditions.
johannes
Hi!
- Keep devs' attention on known failures - XFAIL doesn't solve that.
You remember about them when you run tests and if you want make
attention at them.
Which devs are you referring to? Why do you assume their attention
needs help?
What I propose is a single daily newsletter saying "Hey, guys! We
still have XFAIL bugs on 5.3 <list bugs>, 5.4 <list bugs> and master
<list bugs>. Bye!" That will make some pressure, especially if those
bugs have maintainers.
I would not subscribe to this and would not read this. Would you? Why?
We know we have technical debt. It's not a secret. What we need is not
more harassment but more people fixing that debt. Spamming the whole
list with messages that nobody would read is not the solution.
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
On 31 March 2012 at 12:50, Stas Malyshev
smalyshev@sugarcrm.com wrote:
Hi!
- Keep devs' attention on known failures - XFAIL doesn't solve that.
You remember about them when you run tests and if you want make
attention at them.
Which devs you are referring to? Why you assume their attention needs help?
Every developer on this list, including core and non-core. There are a
lot of people reading this list - that's clearly seen from the lengthy
feature discussions. While you're well aware of the current PHP
problems (I assume you're the best person to ask, since you're an RM),
others may not have a clue about them. By constantly publishing a
newsletter with failed/xfail bugs, you're telling them: "These are our
current problems. Maybe you could help us with them." This way we could
convert that discussion energy into some good patches.
What I propose is a single daily newsletter saying "Hey, guys! We
still have XFAIL bugs on 5.3 <list bugs>, 5.4 <list bugs> and master
<list bugs>. Bye!" That will make some pressure, especially if those
bugs have maintainers.
I would not subscribe to this and would not read this. Would you? Why?
I don't mean a separate mailing list, but a letter to this list,
internals. If I'm already here, I'd read it. If you think daily is too
much, let's make it weekly - but it should come every week, not just
once or twice. If it annoys you too much or you think it's useless for
you, you can always tune your spam filter. I'd read it, since I like
writing/fixing tests in my spare time; maybe the letter would contain
some tests I can easily fix (since I'm not a good C developer, I work
primarily on tests), or my investigation of a problem would help
somebody make a patch.
We know we have technical debt. It's not a secret. What we need is not
more harassment but more people fixing that debt. Spamming whole list
with messages that nobody would read is not the solution.
You can't interest more people if they are not aware of your problems.
That's why personal reminders won't work well here - if only the bug's
maintainer is notified, nobody else will recall that bug.
--
Regards,
Shein Alexey
On 31 March 2012 at 12:50, Stas Malyshev
smalyshev@sugarcrm.com wrote:
Hi!
- Keep devs' attention on known failures - XFAIL doesn't solve that.
You remember about them when you run tests and if you want make
attention at them.
Which devs you are referring to? Why you assume their attention needs help?
Every developer on this list including core and non-core. There are a
lot of people reading this list, that's clearly seen by lengthy
feature discussions. If you're well-aware of current PHP problems (I
assume you're the best person to ask about, since you're RM), others
may even don't have a glue about that. By constantly publishing
newsletter with failed / xfail bugs you're telling them "That's our
current problems. Maybe you could help us with them". This way we
could convert that discussing energy into some good patches.
Every developer on this list builds PHP at least daily and also runs
"make test" at least every few days, so they are well aware of the
status of the tests. I think you will find that there aren't as many
developers here as you think, and only the developers are actually
going to fix stuff.
An alert on a brand-new test failure, along with the commit that caused
it - that would be useful, and that is what the Jenkins work will
eventually bring us. A list of the same xfails that all of us are
painfully aware of isn't useful.
-Rasmus
> By constantly publishing
> newsletter with failed / xfail bugs you're telling them "That's our
> current problems. Maybe you could help us with them". This way we
> could convert that discussing energy into some good patches.
While many people will simply filter them out. At least that's my
experience with such automated mails in different projects ;-)
johannes
On 1 April 2012 at 0:27, Johannes Schlüter
<johannes@schlueters.de> wrote:
>> By constantly publishing
>> newsletter with failed / xfail bugs you're telling them "That's our
>> current problems. Maybe you could help us with them". This way we
>> could convert that discussing energy into some good patches.
> While many people will simply filter them out. At least that's my
> experience with such automated mails in different projects ;-)
> johannes
Okay, let's find out. I've created a poll here:
https://wiki.php.net/xfail_poll.
Please leave your vote. I'll close the poll in a week. Thank you.
--
Regards,
Shein Alexey
> On 1 April 2012 at 0:27, Johannes Schlüter
> <johannes@schlueters.de> wrote:
> [snip]
> Okay, let's find it out. I've created a poll here:
> https://wiki.php.net/xfail_poll.
> Please, leave your voice. I'll close the poll in a week. Thank you.
Is there anything in the Jenkins work that makes this discussion irrelevant
(or more relevant)? What other ways should we be running & reviewing test
failures?
--
Email: christopher.jones@oracle.com
Tel: +1 650 506 8630
Blog: http://blogs.oracle.com/opal/
On Sun, Apr 1, 2012 at 9:39 AM, Christopher Jones
<christopher.jones@oracle.com> wrote:
> [snip]
> Okay, let's find it out. I've created a poll here:
> https://wiki.php.net/xfail_poll.
> Please, leave your voice. I'll close the poll in a week. Thank you.
> Is there anything in the Jenkins work that makes this discussion
> irrelevant (or more relevant)? What other ways should we be running &
> reviewing test failures?
Currently XFAILs are handled as passing tests.
I'm working on a solution for having email notifications about new test
failures.
There are 3 options I'm evaluating:
1. One can set thresholds using the xunit plugin (
https://wiki.jenkins-ci.org/display/JENKINS/xUnit+Plugin) instead of the
built-in junit plugin, which makes it possible to fail the build if a
new test failure is introduced.
2. The email-ext plugin allows you to set a trigger for sending an email
on a "regression" (new test failure), and also allows sending an email
on an "improvement" (triggered when a previously failing test starts
passing).
3. Creating a custom post-build action, where we compare the current
build's test results with the previous report, and send out emails/mark
the build failed if a new test failure is introduced.
The problem with 1 is that it doesn't like our junit.xml. First I found
that it doesn't support nested test suites; that should be fixed by
https://issues.jenkins-ci.org/browse/JENKINS-8460. I upgraded the
plugin, but still no joy, so I have to debug further.
The problem with 2 is that the trigger only fires based on comparing the
previous and current build's test result numbers. So if you introduce a
new test failure and also fix a previously failing test, it won't fire
the "regression" trigger.
3 seems to be the best solution, but requires the most work on our part.
When we implement this (one way or another) we could change things so
that XFAILs are handled as test failures from the Jenkins POV: they will
be listed as such, but you can still see which ones have been failing
for ages and which ones are new.
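Option 3 (comparing by the *identity* of failing tests rather than by their count) could be sketched roughly like this. A minimal Python sketch, not the actual Jenkins job; the testcase/classname/name attributes follow the common junit.xml layout, and the real run-tests.php report may nest suites differently, as noted above:

```python
# Sketch of option 3: compare two junit reports by the identity of the
# failing tests, so a new failure is caught even when a simultaneous fix
# keeps the failure totals unchanged (the weakness of option 2).
import xml.etree.ElementTree as ET

def failing_tests(junit_xml: str) -> set:
    """Return the fully-qualified names of failed/errored testcases."""
    root = ET.fromstring(junit_xml)
    failed = set()
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            failed.add(case.get("classname", "") + "." + case.get("name", ""))
    return failed

def compare(previous_xml: str, current_xml: str) -> dict:
    prev, curr = failing_tests(previous_xml), failing_tests(current_xml)
    return {
        "regressions": sorted(curr - prev),   # newly failing: notify / fail build
        "improvements": sorted(prev - curr),  # started passing
    }

# Previous build: test 'a' fails; current build: 'a' is fixed but 'b' broke.
# The failure counts are equal (1 == 1), so a count-based trigger stays
# silent, yet the set comparison still reports the regression.
PREV = """<testsuite><testcase classname="ext.demo" name="a"><failure/></testcase>
<testcase classname="ext.demo" name="b"/></testsuite>"""
CURR = """<testsuite><testcase classname="ext.demo" name="a"/>
<testcase classname="ext.demo" name="b"><failure/></testcase></testsuite>"""

result = compare(PREV, CURR)
print(result["regressions"])   # ['ext.demo.b']
print(result["improvements"])  # ['ext.demo.a']
```

The same set-difference idea works whether the reports come from artifact files of consecutive builds or from the Jenkins API.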
Jenkins provides the following information regarding test failures:
You can see the test failures introduced with a particular build through
the build Status page (
http://ci.qa.php.net/job/php-src-5.3-matrix-build/architecture=x86,os=linux-debian-6.0/452/
for example).
Here you have a link to the Test Result page, plus the number of
failing tests, plus the sum of the test changes with that build (+-).
Under that line, Jenkins lists the test failures introduced with that
build, and under that you have "Show all failed tests >>>", which
expands the list to include all failing tests, not just the new ones.
Jenkins also provides us with a Test Result Trend on the project page (
http://ci.qa.php.net/job/php-src-5.3-matrix-build/ for example), where
you can toggle between showing all tests or just the failing ones.
You can also see the history of a build for a config through the History
link (
http://ci.qa.php.net/job/php-src-5.4-matrix-build/architecture=x86,os=linux-debian-6.0/651/testReport/history/?
for example).
You can also see the total test results for a specific build per config (
http://ci.qa.php.net/job/php-src-5.4-matrix-build/architecture=x86,os=linux-debian-6.0/651/testReport/junit/?)
or aggregated (
http://ci.qa.php.net/job/php-src-5.4-matrix-build/651/testReport/?).
You can also see a single test's result history by selecting a test and
clicking on the History link (
http://ci.qa.php.net/job/php-src-5.4-matrix-build/651/architecture=x86,os=linux-debian-6.0/testReport/php-src.ext.libxml/tests/004_phpt___libxml_set_streams_context__/history/?
for example). Here you can also see the status of that test in each
build, so you can see when it was introduced or fixed.
For example, you can see that the libxml_set_streams_context() test
started failing after build number 641, where the Changes page mentions
a libxml bugfix:
http://ci.qa.php.net/job/php-src-5.4-matrix-build/architecture=x86,os=linux-debian-6.0/641/changes
Having the email notification in place could make those kinds of changes
more visible.
--
Ferenc Kovács
@Tyr43l - http://tyrael.hu
On 1 April 2012 at 2:38, Alexey Shein <confik@gmail.com> wrote:
> On 1 April 2012 at 0:27, Johannes Schlüter
> <johannes@schlueters.de> wrote:
> [snip]
> Okay, let's find it out. I've created a poll here:
> https://wiki.php.net/xfail_poll.
> Please, leave your voice. I'll close the poll in a week. Thank you.
It seems nobody likes the idea. Sorry for taking your time and for any
inconvenience I've caused.
--
Regards,
Shein Alexey
On 1 April 2012 at 14:27, Ferenc Kovacs
<tyra3l@gmail.com> wrote:
> [snip]
> currently XFAILs are handled as passing tests.
> I'm working on a solution for having email notifications about new test
> failures.
Wow, that was great post :)
snip
> you can also see a single test result history via selecting a test and
> clicking on the History link (
> for
> example) here you can also see the status for that test in each build,
> so you can see when was that introduced or fixed.
> for example you can see that the libxml_set_streams_context()
> started failing after build number 641, where you can see the Changes
> mentions a libxml bugfix:
> http://ci.qa.php.net/job/php-src-5.4-matrix-build/architecture=x86,os=linux-debian-6.0/641/changes
The one problem I see here is that we keep only the last 100 builds, so
if a bug is old enough you can't tell the build/revision where it was
introduced.
Maybe we should not keep the built PHP binary in Jenkins, but only
lightweight stuff like junit logs, coverage reports (if we have them),
etc. This way builds become smaller, so we can keep more of them.
--
Regards,
Shein Alexey
> The one problem I see here is that we keep only 100 last builds, so if bug
> is old enough you can't know build/revision where it's introduced.
> Maybe we should not keep in jenkins build php binary but only lightweight
> stuff like junit logs, coverage reports (if we have them) and etc. This way
> build size become smaller, so we can keep more of them.
Yeah, that limit was set because of the first setup, where we had
separate jobs for building PHP and running the test suite.
With that setup, we had to mark the built PHP binary as an artifact to
be able to copy it and keep a reference to the binary in the test job.
Now that we do both the build and the test execution in one build, we no
longer need the PHP binary outside of the build, so the build artifact
would be only the junit.xml, which is much smaller, so we can raise that
limit.
Another thing I want to set up is the Build Keeper Plugin, which allows
us to automatically mark some builds as "keep forever".
I will look into this.
Ferenc Kovács
@Tyr43l - http://tyrael.hu
> The XFAIL mechanism reflects the reality of open source that not all
> bugs are fixed.
I wonder what that has to do with open source... besides maybe TeX,
there's no non-trivial bug-free software.
johannes