All about the new Test2 framework and how it will help your tests

The new Test2 framework has been released after a couple of years of development. I wanted to find out what this means for users of Test::Simple and Test::More, so I chatted with the project leader, Chad Granum (exodist).

Andy Lester: So Test2 has just been released after a couple of years of work, and a lot of discussion. For those of us who haven’t followed its development, what is Test2 and why is it a good thing?

Chad Granum: The big changes will be for people who write test modules. The old Test::Builder was tied to specific generation of TAP output. That’s been replaced with a flexible event system.

It all started when David Golden submitted a patch to change the indentation of a comment intended for humans who read the test. The change would help people, but meant nothing to the machine. I had to reject the patch because it broke a lot of downstream modules. Things broke because they tested that Test::Builder produced the message in its original form. I thought that was crazy, and wanted to make things easier to maintain, test, and improve.

Andy: Test::Builder’s internals were pretty fragile?

Chad: That is true, but that’s not the whole picture. The real problem was the tools people used to validate testing tools. Test::Builder::Tester was the standard, and it boiled down to giant string comparisons of TAP output, which mixes messages for the computer’s use, and messages for human use.

While most of the changes are under the hood, there are improvements for people who just want to write tests. Test2 has a built-in synchronization system for forking/threading. If you modify a test to load Test2::IPC before loading Test::More, then you can fork in your tests and it will work in sane/reasonable ways. Up until now, doing this required external tools such as Test::SharedFork, which had severe limitations.
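A minimal sketch of that load order, assuming a platform with a working fork(); the test names are illustrative only:

```perl
use strict;
use warnings;

use Test2::IPC;    # loaded before Test::More so results from forked children are synchronized
use Test::More;

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child process: this result is sent back to the parent
    ok(1, 'assertion from the child process');
    exit 0;
}

# Parent process: wait for the child before finishing
waitpid($pid, 0);
ok(1, 'assertion from the parent process');

done_testing;
```

The parent waits for the child before calling done_testing, so both results are counted in a single plan.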

Another thing I want to note is an improvement in how Test2 tracks file+line number for error reporting purposes. As you know, diagnostics are reported when a test fails, and they give you the filename and line number of the failure. Test::Builder used a global variable, $Test::Builder::Level, which people were required to localize and bump whenever they added a stack frame to their tool. This was confusing and easy to get wrong.

Test2 now uses a Context object. This object solves the problem by locking in the “context” (file + line) when the tool is first called. All nested tools will then find that context. The context object also doubles as the primary interface to Test2 for tool writers, which means it will not be obscure like the $Level variable was.

Andy: I just counted 1045 instances of $Test::Builder::Level in my codebase at work. Are you saying that I can throw them all away when I start using Test2?

Chad: Yes, if you switch to using Test2 in those tools you can stop counting your stack frames. That said, the $Level variable will continue to work forever for backwards compatibility.
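As a rough sketch of what a tool built on the Context object might look like, the two helpers below (is_even and all_even, both hypothetical names invented for this illustration) acquire a context with context() from Test2::API instead of adjusting $Test::Builder::Level. The nested call to is_even() finds the context already held by all_even(), so a failure would be reported at the line where the test file called all_even().

```perl
use strict;
use warnings;

use Test2::API qw/context/;
use Test::More;

# Hypothetical tool: passes when $got is an even integer.
sub is_even {
    my ($got, $name) = @_;

    my $ctx  = context();    # locks in the caller's file and line
    my $pass = $got % 2 == 0;
    $ctx->ok($pass, $name, ["  got: $got, which is not even"]);
    $ctx->release;           # every acquired context must be released

    return $pass;
}

# Hypothetical tool built on top of is_even(); no stack-frame counting needed.
sub all_even {
    my ($numbers, $name) = @_;

    my $ctx  = context();
    my $pass = 1;
    for my $n (@$numbers) {
        $pass = 0 unless is_even($n, "$name: $n");
    }
    $ctx->release;

    return $pass;
}

is_even(4, 'four is even');
all_even([2, 6, 10], 'small evens');

done_testing;
```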

Andy: Will the TAP output be the same? We’re still using an ancient install of Smolder as our CI tool and I believe it expects TAP to look a certain way.

Chad: Extreme care was taken to ensure that the TAP output did not change in any significant ways. The one exception is David Golden’s change that started all this:

ok 1 - random test
    # a subtest
    ok 1 - subtest result
    1..1
ok 2 - a subtest

This has changed to:

ok 1 - random test
# a subtest
    ok 1 - subtest result
    1..1
ok 2 - a subtest

That is the change that started all this, and had the potential to break CPAN.

Andy: So Test2 is all about possibilities for the future. It’s going to make it easier for people to create new Test:: modules. As the author of a couple of Test:: modules myself, I know that the testing of the tests is always a big pain. There’s lots of cut & paste from past modules that work and tweaking things until they finally pass the tests. What’s different between the old way of doing the module testing and now?

Chad: Test::Builder assumed TAP would be the final product, and did not give you any control over, or hooks into, anything between your tool and the TAP. As such, you had to test your final TAP output, which often included text you did not produce yourself. In Test2 we drop those assumptions: TAP is no longer assumed, and you have hooks at almost every step of the process between your tool and the final output.

Many of the actions Test::Builder would accomplish have been turned into Event objects. Test tools do their thing, and then fire events off to Test2 for handling. Eventually these events hit a formatter (TAP by default) and are rendered for a harness. Along with the hooks, there is a tool in Test2::API called intercept. It takes a codeblock; all events generated inside that codeblock are captured and returned, and they are not rendered and do not affect the global test state. Once you capture your events you can test them as data structures, and ignore ones that are not relevant to your tools.

The Test::Builder::Tester way may seem simpler at first, but that is deceptive. There is a huge loss of information. Also, if there are changes to how Test::Builder renders TAP, such as dropping the ‘-’, then everything breaks.

Using Test::Builder::Tester

use Test::Builder::Tester tests => 1;
use Test::More;

test_out("ok 1 - a passing test");
ok(1, 'a passing test');
test_test("Got expected line of TAP output");

Using intercept and basic Test::More tools

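# ok() comes from Test2::Tools::Basic, matching the sub name checked below;
# is() and is_deeply() come from Test::More
use Test2::Tools::Basic qw/ok/;
use Test::More import => [qw/is is_deeply/];
use Test2::API qw/intercept/;

my $events = intercept {
    ok(1, 'a passing test');
};

my $e = shift @$events;

ok($e->pass, "passing tests event");
is($e->name, "a passing test", "got event name");
# 42 stands in for the line number of the ok() call inside intercept
is_deeply(
    $e->trace->frame,
    [__PACKAGE__, __FILE__, 42, 'Test2::Tools::Basic::ok'],
    "Got package, file, line and sub name"
);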

Using Test2::Tools::Compare

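use Test2::Tools::Basic qw/ok/;
use Test2::Tools::Compare qw/like array event call prop/;
use Test2::API qw/intercept/;

like(
    intercept {
        ok(1, 'a passing test');
    },
    array {
        event Ok => sub {
            call pass => 1;
            call name => 'a passing test';

            prop file    => __FILE__;
            prop package => __PACKAGE__;
            prop line    => 42;    # stands in for the line of the ok() call
            prop subname => 'Test2::Tools::Basic::ok';
        };
    },
    'A passing test'
);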
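
Ignoring events that are not relevant to your tool

The following is only a sketch of the "ignore ones that are not relevant" idea, assuming Test::More's ok() and diag() and plain grep filtering; the test names are made up for illustration. It captures a mix of events, including a failure, and then checks only the Ok events.

```perl
use strict;
use warnings;

use Test2::API qw/intercept/;
use Test::More;

# The failing ok() is captured, so it does not fail this test file
my $events = intercept {
    ok(1, 'first check');
    diag('a message meant for humans');
    ok(0, 'second check');
};

# Keep only the pass/fail events and ignore diagnostics
my @oks = grep { $_->isa('Test2::Event::Ok') } @$events;

is(scalar @oks, 2, 'captured two Ok events');
is_deeply([map { $_->name } @oks], ['first check', 'second check'], 'event names');
is_deeply([map { $_->pass ? 1 : 0 } @oks], [1, 0], 'event results');

done_testing;
```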

Andy: What other features does Test2 include for users who aren’t creating Test:: modules?

Chad: Test2’s core, which is included in the Test-Simple distribution, does not have new features at the user level. However, Test2-Suite was released at the same time as Test2/Test-Simple, and it contains new versions of all the Test::More tools, and adds some things people have been requesting for years that were not possible with the old Test::Builder.

The biggest example would be “die/bail on fail”, which lets you tell the test suite to stop after the first failure. The old stuff could not do this because there was no good hook point, and important diagnostics would be lost.

It’s as simple as using one of these two modules:

use Test2::Plugin::DieOnFail;
use Test2::Plugin::BailOnFail;

The difference is that DieOnFail calls die under the hood, while BailOnFail sends a bail-out event, which aborts the current file and, depending on the harness, might stop the entire test run.

Andy: So how do I start using Test2? At my day job, our code base has 1,200 *.t files totalling 282,000 lines of code. Can I expect to install the new version of Test::Simple (version 1.302019) that includes Test2 and everything will “just work”?

Chad: For the vast majority of cases the answer is “yes”. Back-compatibility was one of the most significant concerns for the project. That said, some things did unfortunately break. A good guide to what breaks, and why, can be found in this document. Usually things that break do so because they muck about with the Test::Builder internals in nasty ways. Usually these modules had no choice due to Test::Builder’s limitations. When I found such occurrences I tried to add hooks or APIs to do those things in sane/reasonable ways.

Andy: Do I have to upgrade? Can I refuse to go up to Test-Simple 1.302019? What are the implications of that?

Chad: Well, nobody is going to come to you and force you to install the latest version. If you want to keep using your old version you can. You might run into trouble down the line if other Test:: tools you use decide to make use of Test2-specific features, at which point you would need to lock in old versions of those as well. You also would not be able to start using any new tools people build using Test2.

Andy: And the tools you’re talking about are Test:: modules, right? The command line tool prove and make test haven’t changed, because they’re part of Test::Harness?

Chad: Correct. Test::Harness has not been touched. It will work on any test files that produce TAP, and Test2 still produces TAP by default. That said, I do have a project in the works to create an alternative harness specifically for Test2 stuff, but it will never be a requirement to use it; things will always work on Test::Harness.

Andy: So if I’m understanding the Changes file correctly, Test-Simple 1.302012 was the last old-style version and 1.302014 is the new version with Test2?

Chad: No, Test-Simple-1.001014 is the last STABLE release of Test-Simple that did not have Test2, then Test-Simple-1.302015 was the first stable release to include Test2. There were a lot of development releases between the two, but no stable ones. The version numbers had to be carefully crafted to follow the old scheme, but we also had to keep it below 1.5xxxxx because of the previous maintainers’ projects, which used that number as well as 2.0. Some downstream users had code switched based on version number and expected an API that never came to be. Most of these downstream distributions have been fixed now, but we are using a “safe” version number just in case.

Andy: What has development for this been like? This has been in the works for, what, two years now? I remember talking to you briefly about it at OSCON 2014.

Chad: At the point we talked I had just been given Test-Simple, and did not have any plans to make significant changes. What we actually talked about was my project Fennec which was a separate Test::Builder based test framework. Some features from Fennec made their way into Test2, enough so that Fennec will be deprecated once I have a stable Test2::Workflow release.

Initially development started as a refactor of Test::Builder that was intended to be fairly small. The main idea was to introduce the events, and a way to capture them. From there it ballooned out as I fixed bugs, or made other changes necessary to support events.

At one point the changes were significant enough, and broke enough downstream modules that I made it a complete fork under the name Test-Stream. I figured it would be easier to make Test::Builder a compatibility wrapper.

In 2015, I attended the QA hackathon in Berlin, and my Test-Stream fork was a huge topic of conversation. The conversation resulted in a general agreement (not unanimous) that it would be nice to have these changes. There was also a list of requests (demands?) for the project before it could go stable. We called it the punch-list.

After the Berlin hackathon there was more interest in the project. Other toolchain people such as Graham Knop (Haarg), Daniel Dragan (bulk88), Ricardo Signes (rjbs), Matt Trout (mst), Karen Etheridge (ether), Leon Timmermans (leont), Joel Berger (jberger), Kent Fredric (kentnl), Peter Rabbitson (ribasushi), etc. started reviewing my code, making suggestions and reporting bugs. This was one of the most valuable experiences. The project as it is now is much different than it was in Berlin, and it is much better for the extra eyes and hands.

A month ago there was another QA hackathon, in Rugby, UK, and once again Test2 was a major topic. This time the general agreement was that it was ready now. The only new requirements on the table were related to making the broken downstream modules very well known, and also getting a week of extra cpan-testers results prior to release.

I must note that at both QA hackathons the decisions were not unanimous, but in both cases there was a very clear majority.

Andy: So what’s next? I see that you have a grant for more documentation. Tell me about that, and what can people do to help?

Chad: The Test2 core API is not small, and has more moving pieces than Test::Builder did. Right now there is plenty of technical/module documentation, but there is a lack of overview documentation. There is a need for a manual that helps people find solutions to their problems, and ties the various parts together. That is the first part of the manual: docs for tool authors.

Test2::Suite is also not small, but provides a large set of tools for people to use, some are improvements on old tools, some are completely new. The manual will have a second section on using these new tools. This second part of the manual will be geared towards people writing tests.

The best way for people to help would be to start using Test2::Suite in their tests, and Test2 in their test tools. People will undoubtedly find places where more documentation is needed, or where things are not clear. Reporting such documentation gaps would help me to write better documentation. (Test::More repo, Test2::Suite repo)

Apart from the documentation, I have two other Test2-related projects nearing completion: Test2-Workflow, which is an implementation of the tools from Fennec that are not a core part of Test2, and Test2-Harness, which is an optional alternative to Test::Harness. Both are pretty much code-complete on GitHub, but neither has the test coverage I feel is necessary before putting them on CPAN.

Andy: Thanks for all the work that’s gone into this, both to you and the rest of those who’ve contributed. It sounds like we’ll soon see more tools to make testing easier and more robust.