Heat Map Reporter for Minitest · Minitest Heat automatically collects failures and inspects backtraces to proactively identify patterns, then presents a heat map summary of the files and line numbers that most frequently appear to be the source of issues so you can get to the root of problems more efficiently.
The more time I spent on automated testing, the more I wanted a results formatter that would streamline the process of prioritizing and fixing failing tests. I was on the lookout for a formatter that would organize and present test results in a way that both recognized patterns and made it more efficient to find the true source of a failing test.
The problem I kept running into was that every formatter I found treated each test failure as an isolated issue. In practice, though, multiple failures could often be traced back to a single change in a seemingly unrelated area of the code, and I’d end up scrolling through test failures trying to spot a pattern.
It was always tedious, and it was often difficult to identify related failures. That was clearly a job for a computer, so I started tinkering with formatters to see if they could proactively group similar failures and do a better job of recognizing nuance in test results.
After some experimentation, I created a custom reporter for Minitest that tries to proactively identify the underlying source of a problem by inspecting and classifying each failure and customizing the information displayed based on the context and type of failure. It also presents a heat map summary to help you more quickly identify the individual areas that are likely causing the other errors.
Nuance in Test Results
The first detail that makes Minitest Heat a little different is how it recognizes nuance. Instead of just Pass/Fail/Exception, it goes into more detail. When a test passes, it also considers whether that test was fast, slow, or “painfully slow”. In the case of an exception, it considers whether the exception arose directly from the code in the test or from the code being tested.
For example, if an exception is triggered from the source code, that’s a special failure case worth investigating as a failure. If, however, an exception arises from the test code, that’s not necessarily a failure of the source code. It’s simply a “broken” test rather than a pure failure. Minitest Heat also supports setting thresholds for slow and “painfully slow” tests so that even a passing test can be flagged for attention if it’s dragging down the test suite.
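To make the distinction concrete, here’s a rough sketch of that kind of classification. The constants, method names, and thresholds are illustrative assumptions of mine, not Minitest Heat’s actual API:

```ruby
# Hypothetical sketch of runtime and failure classification; the names and
# thresholds here are illustrative, not Minitest Heat's actual internals.
SLOW_THRESHOLD = 1.0            # seconds
PAINFULLY_SLOW_THRESHOLD = 3.0  # seconds

def classify_runtime(seconds)
  return :painfully_slow if seconds >= PAINFULLY_SLOW_THRESHOLD
  return :slow if seconds >= SLOW_THRESHOLD

  :fast
end

# A test is "broken" (rather than a pure failure) if the first non-gem
# backtrace line points at test code instead of the source under test.
def broken_test?(backtrace, test_dir: "test/")
  first_project_line = backtrace.find { |line| !line.include?("/gems/") }
  first_project_line.to_s.include?(test_dir)
end

classify_runtime(0.2)                                    # => :fast
broken_test?(["test/order_test.rb:12:in `test_total'"])  # => true
```

Even this simple split changes how you read a red test run: a “broken” test usually means fixing the test, while a source-code exception means fixing the code.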
With these insights, test results have additional context that can help you prioritize how you approach fixing failed tests.
Nuance in Stack Traces
When exceptions arise, they can happen in your code or in a gem or other code that isn’t directly under your control. When Minitest Heat shows a stack trace, it automatically highlights the lines of code from your codebase so they stand out from other library or framework code.
In addition to highlighting your code in stack traces, it also reviews the files in the stack trace to let you know which of those files was most recently modified. That way, if a recent change caused the test failure, you spend less time swimming through stack traces and jump straight to the source of the problem.
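The recency check can be sketched roughly like this, filtering the backtrace to files under the project root and surfacing the newest one. This is an assumed implementation of the idea, not the gem’s actual code:

```ruby
require "tmpdir"
require "fileutils"

# Assumed logic, not Minitest Heat's implementation: keep only backtrace
# lines from the project, then find the most recently edited file among them.
def project_lines(backtrace, project_root)
  backtrace.select { |line| line.start_with?(project_root) }
end

def most_recently_modified(lines)
  lines.max_by { |line| File.mtime(line.split(":").first) }
end

# Demonstration with throwaway files standing in for a real codebase.
Dir.mktmpdir do |root|
  older = File.join(root, "user.rb")
  newer = File.join(root, "order.rb")
  FileUtils.touch(older, mtime: Time.now - 3600)
  FileUtils.touch(newer, mtime: Time.now)

  trace = [
    "#{older}:10:in `save'",
    "/gems/activerecord/lib/base.rb:99:in `create'",
    "#{newer}:5:in `total'"
  ]

  puts most_recently_modified(project_lines(trace, root))  # the order.rb line
end
```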
Since Minitest Heat recognizes nuance, it can also be more selective about what it reports, and it can prioritize those results based on the type of issue.
For example, exceptions are reported first. If there are any exceptions or failures, the results show the counts of skipped and slow tests, but the detailed output isn’t cluttered with their specifics; those details only appear when every test in the run passes.
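That prioritization amounts to a simple severity ordering. As a rough illustration (my own sketch, not the gem’s internal ranking):

```ruby
# Illustrative severity table (assumed, not Minitest Heat's actual ordering):
# lower numbers are reported first.
SEVERITY = {
  error:          0,  # exceptions raised by the source under test
  broken:         1,  # exceptions raised by the test itself
  failure:        2,  # plain assertion failures
  skipped:        3,
  painfully_slow: 4,
  slow:           5
}.freeze

def prioritize(issues)
  issues.sort_by { |issue| SEVERITY.fetch(issue[:type]) }
end

issues = [{ type: :slow }, { type: :error }, { type: :failure }]
prioritize(issues).map { |i| i[:type] }  # => [:error, :failure, :slow]
```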
Connecting Stack Traces
When there are exceptions, Minitest Heat looks at the stack trace and begins building a heat map of where the exceptions occurred. And when there are test failures, it similarly maps their locations to make it more obvious when failures are arising from similar places in the code.
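Conceptually, the heat map can be as simple as tallying how often each file and line number appears across all of the failure backtraces and then ranking the hot spots. A minimal sketch of that idea (assumed logic, not the gem’s code):

```ruby
# Minimal heat-map sketch: count occurrences of each file:line across all
# failure backtraces, then rank locations from hottest to coolest.
def heat_map(backtraces)
  tally = Hash.new(0)
  backtraces.each do |trace|
    trace.each { |location| tally[location] += 1 }
  end
  tally.sort_by { |_location, count| -count }
end

failures = [
  ["app/models/order.rb:42", "app/services/checkout.rb:17"],
  ["app/models/order.rb:42", "app/models/user.rb:8"],
  ["app/models/order.rb:42"]
]

heat_map(failures).first  # => ["app/models/order.rb:42", 3]
```

A location that shows up in most of the backtraces, like `order.rb:42` above, is a strong hint that one change broke several tests at once.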
Like most tools, Minitest Heat is a work in progress, but it’s definitely ready for prime time. I’ve been using it actively for some time now, and it’s been very stable and helpful. When I work on a project that doesn’t use it, I definitely miss it.