Test Scorecards

How do you add tests to a project that show a bug exists? How do you ensure that, once a bug is fixed, that fix is recorded? These questions have been present in my mind for several years. I want to use this space to explore an idea I have had for a while on how to address them.

Development in OpenStack Keystone evolved over the years to have (at least) three classes of tests. Unit tests were the worker bees, executed as the developer actively changed code. In-project functional tests were the slower but more complete check that the developer had not changed fundamental assumptions about how Keystone interacted with the other components. Out-of-project functional tests acted as a form of double-entry accounting, making sure that Keystone changes could not be “snuck through” paired with test changes. These last reside in a separate project called Tempest. This structure works to keep the project stable.

Let’s assume we have a bug in a subsystem. It gets recorded in the bug tracker as bug 56789. The user who reports the bug provides the steps to reproduce it. When it first gets recorded, it gets a state of “new” until one of the core members looks at it and triages the bug. If they get to the point where they can reproduce it, they change the status of the bug and, at some point, someone should offer a fix. Part of that fix is a tag in the commit message:

closes-bug: 56789

What if we want to apply automation to that check? What if we want to write code that performs that same validation and confirmation of the bug? We can’t check it into any repository in a running state, as it will then prevent the check and gate jobs in Zuul from passing. The best we can do is merge a disabled version of the test that is skipped until the fix goes in.
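As a minimal sketch of that “disabled test” workaround, here is how it might look with Python’s standard unittest module. The function healthcheck_status and its return values are hypothetical stand-ins for the buggy code, not anything from Keystone.

```python
import unittest


def healthcheck_status():
    """Hypothetical stand-in for the code that contains bug 56789."""
    return "degraded"  # the bug: this should return "ok"


class HealthCheckTestCase(unittest.TestCase):
    # The test documents the bug, but it is skipped so the gate still passes.
    @unittest.skip("bug 56789: re-enable once the fix merges")
    def test_set_healthcheck(self):
        self.assertEqual("ok", healthcheck_status())


def run_suite():
    """Run the test case and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(HealthCheckTestCase)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The run counts as successful, but only because the test never executes; nothing checks whether the bug is still present. unittest also has an expectedFailure decorator, which does run the test and reports an unexpected pass, though the expectation still lives in the test code rather than in a separate record.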

What I would like to see is a way to add a failing test to a commit. The “scorecard” referenced in the title of this article is a way to record the state of the tests, including the fact that this test fails. Something that looks like this:

+keystone.tests.unit.test_healthcheck.HealthCheckTestCase.test_set_healthcheck: Failed

The scorecard changes get submitted with each patch. If a test runs and it switches the scorecard from Failed to Passed, that would get recorded.

Tests that fail without a corresponding scorecard entry showing they should fail cause a break in the circuit.
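The check described here could be sketched roughly as follows. This assumes a scorecard file with one “test id: status” entry per line, which is my guess at a workable format; the function names parse_scorecard and unexpected_failures are made up for illustration, and the results dictionary stands in for whatever the test runner reports.

```python
FAILED = "Failed"
PASSED = "Passed"


def parse_scorecard(text):
    """Parse lines of the form '<test id>: <status>' into a dict."""
    card = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        test_id, sep, status = line.rpartition(":")
        if not sep:
            raise ValueError("malformed scorecard line: %r" % line)
        card[test_id.strip()] = status.strip()
    return card


def unexpected_failures(scorecard, results):
    """Return the ids of tests that failed without a 'Failed' entry.

    `results` maps test id -> "Passed" or "Failed" for the current run.
    A non-empty return value is what should break the build.
    """
    return sorted(
        test_id
        for test_id, status in results.items()
        if status == FAILED and scorecard.get(test_id) != FAILED
    )
```

A test that is listed as Failed and does fail is not reported; a failing test with no entry, or with a Passed entry, is.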

If a change gets submitted and it switches a test from Passed to Failed, that should get caught in code review. The submitter had better have a really good reason for introducing what appears to be a regression; if the test should never pass, it should be removed.

Adding a “Failed” entry should not be automatic; it is a deliberate decision on the part of the user that the test should not pass. The essential part of this process is automating the switch of a scorecard entry from Failed to Passed, so that the system itself records when a fix lands and any later regression shows up as a change to the scorecard.
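The asymmetry between the two directions might be sketched like this. The function name update_scorecard and the dictionary representation are assumptions made for illustration: the only automatic change is Failed to Passed, and nothing is ever switched to Failed by the tooling.

```python
def update_scorecard(scorecard, results):
    """Return (updated scorecard, list of promoted test ids).

    Entries move from "Failed" to "Passed" automatically when the test
    passes. "Failed" entries are never added or restored here; that is
    left as a deliberate, human-reviewed edit.
    """
    updated = dict(scorecard)  # leave the caller's scorecard untouched
    promoted = []
    for test_id, status in results.items():
        if scorecard.get(test_id) == "Failed" and status == "Passed":
            updated[test_id] = "Passed"
            promoted.append(test_id)
    return updated, sorted(promoted)
```

The updated scorecard would then be part of the patch under review, so the switch to Passed is visible alongside the fix that caused it.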

Although I have referenced Keystone in this article, this is not a Keystone-specific change. The mechanism for scorecards should be implemented in the test runners for any language, and could be part of the Continuous Integration process for any project.

What projects and tooling are there in the wild that support test scorecards?

Adam Young's Web Log