Building a Treemap Reporter for SimpleTest

Treemaps emerged in the early 1990s as a method of visualizing hierarchies, and have since been widely used throughout the software design community. But very little work is out there exploring visualizations of unit test relationships - it seems treemaps are more commonly found depicting code coverage - a good use for sure, but very metric, not very qualitative. Code coverage is not usually important to me when I'm using SimpleTest; I'm more interested in exploring a particular prototype of an API concept, and writing tests to discover whether I have an abstraction that works. Running the whole suite on every run generally gets in the way, so I'll usually run test cases one at a time when I'm working. But I sometimes need a higher-level view to spot failures. Running all tests in classic mode, I see a whole list of failures and exceptions, but as a visual person, I often find it hard to understand at a quick glance what is going on. Treemaps could provide a unique macroscopic view of the "shape" of a test suite.

Mapping Unit Tests

A good place to start is by analyzing various unit testing interfaces, paying close attention to cues that give away the underlying mental model conveyed by the manner in which the test results are displayed. For example, JUnit in Eclipse has thumpin' little bars that indicate that something is going on in realtime. Watch your tests inflate one by one, as they pump their way through the tree menu. SimpleTest talks back in text, saying: "Your tests are running. Here are the errors, here are the failing tests. Your tests finished. Here's the overall results. Pass or fail."

None of these approaches considers spatiality in terms of size, priority, or weighting. Looked at more qualitatively, the relationships between the number of assertions, methods, and test cases declared could provide a unique visual "fingerprint" for a test application. Treemaps displaying test results would communicate a weighted hierarchy - test cases would carry more weight if they contained more methods and more assertions. Long thin test cases with lots of short methods would stand out. Fat methods in short classes would look boxed in. Get the picture?

The Gradual Elimination of Failures

An early test results experiment

Capturing The Data Structures

The SimpleReporter provides an event-driven paint method interface that can be usurped to gain control over aggregated test data. My initial burrowings into the code uncovered a peculiarity in the Reporter implementation: getTestList() is confusingly documented, and appears to be accessing an array representing a running stack of test results. Intercepting the stack should provide information about 'where' we are in the test, i.e. the nested levels of group -> testcase -> method -> assertion.

Either I'm missing something really obvious about how to use this in my code, or else I'm just distracted by the need for explicitness, but I found it easier to just hack together a little stack wrapper class, so I could be clear about what I was doing in the handler methods (in pseudocode):

class TreemapStack ...

	void function push(TreemapNode $node);

	TreemapNode function peek();

	TreemapNode function pop();
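
For anyone who wants something to run, here is a minimal sketch of how that stack could look in plain PHP. It is not the code from my reporter: the array-backed storage, the null return on an empty stack, and the one-property TreemapNode stand-in are all assumptions made for illustration.

```php
<?php
// Minimal stand-in node so the stack can be exercised on its own (hypothetical).
class TreemapNode {
    public $name;
    public function __construct($name) { $this->name = $name; }
}

// Thin wrapper around an array; method names follow the pseudocode above.
class TreemapStack {
    private $nodes = [];

    public function push(TreemapNode $node) {
        $this->nodes[] = $node;
    }

    // Returns the top node without removing it, or null when the stack is empty.
    public function peek() {
        $count = count($this->nodes);
        return $count > 0 ? $this->nodes[$count - 1] : null;
    }

    // Removes and returns the top node, or null when the stack is empty.
    public function pop() {
        return array_pop($this->nodes);
    }
}

$stack = new TreemapStack();
$stack->push(new TreemapNode('group'));
$stack->push(new TreemapNode('testcase'));
$top = $stack->pop();      // the 'testcase' node
$parent = $stack->peek();  // the 'group' node stays on the stack
```

Returning null from peek() on an empty stack is what lets a handler tell that it has just popped the outermost group.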

Now instead of echoing the running results back to the browser, we capture them on the stack and use it to build a graph which consists of a single TreemapNode class. Each node simply holds a list of direct edge child nodes and provides accessor methods to the sub-graph (in pseudocode):

class TreemapNode ...

	Array[TreemapNode] function getChildren();

	void function putChild(TreemapNode $node);

	boolean function compare(TreemapNode $a, TreemapNode $b);

	integer function getSize();

	integer function getTotalSize();

	void function fail();

	boolean function isFailed();

	boolean function isLeaf();
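
A runnable sketch of such a node might look like the following. Several details are assumptions rather than a record of my implementation: getTotalSize() counts the leaves beneath a node (so a leaf itself reports 0), children come back sorted heaviest first, and compare() returns an integer in the usort() style rather than a boolean.

```php
<?php
class TreemapNode {
    private $name;
    private $children = [];
    private $failed = false;

    public function __construct($name) { $this->name = $name; }

    public function getName() { return $this->name; }

    public function putChild(TreemapNode $node) { $this->children[] = $node; }

    // Children sorted heaviest first (the sort order is an assumption).
    public function getChildren() {
        $sorted = $this->children;
        usort($sorted, [$this, 'compare']);
        return $sorted;
    }

    // Integer comparison for usort(): larger subtrees sort earlier.
    public function compare(TreemapNode $a, TreemapNode $b) {
        return $b->getTotalSize() <=> $a->getTotalSize();
    }

    // Number of direct children.
    public function getSize() { return count($this->children); }

    // Number of leaves below this node; a leaf itself reports 0.
    public function getTotalSize() {
        $total = 0;
        foreach ($this->children as $child) {
            $total += $child->isLeaf() ? 1 : $child->getTotalSize();
        }
        return $total;
    }

    public function fail() { $this->failed = true; }

    public function isFailed() { return $this->failed; }

    public function isLeaf() { return count($this->children) === 0; }
}

// A group with two test methods holding 1 and 3 assertions.
$group = new TreemapNode('group');
$small = new TreemapNode('testSmall');
$small->putChild(new TreemapNode('assertion'));
$big = new TreemapNode('testBig');
for ($i = 0; $i < 3; $i++) {
    $big->putChild(new TreemapNode('assertion'));
}
$group->putChild($small);
$group->putChild($big);

$children = $group->getChildren();
echo $group->getTotalSize(), "\n";   // 4
echo $children[0]->getName(), "\n";  // testBig
```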

The graph provides a sorted list by default, and has a recursive getTotalSize method used to calculate the weight of the subtrees for a node. It's pretty blah, but it is all that is really needed to implement a rudimentary treemap of test cases. First, we need to create a TreemapReporter. Because of the way the SimpleReporter is activated, the HTML output for the treemap needs to be dumped from the paintFooter method. This method calls divideMapNodes, a slice and dice algorithm that paints nested div tags, each with a percentage width and height. Note that the variables $this->_graph and $map are both instances of TreemapNode:

class TreemapReporter extends SimpleReporter {...
	function paintFooter($group) {
		$this->paintRectangleStart($this->_graph, 100, 100);
		$this->divideMapNodes($this->_graph);
		$this->paintRectangleEnd();
	}

	function divideMapNodes($map, $aspect=1) {
		// Flip the split direction at each level of nesting
		$aspect = !$aspect;
		$total = $map->getTotalSize();
		foreach($map->getChildren() as $node) {
			// A subtree is weighted by its total size, a leaf counts as one unit
			if (!$node->isLeaf()) {
				$dist = $node->getTotalSize() / $total * 100;
			} else {
				$dist = 1 / $total * 100;
			}
			// Slice along one axis only; the other axis fills the parent
			if ($aspect) {
				$horiz = $dist;
				$vert = 100;
			} else {
				$horiz = 100;
				$vert = $dist;
			}
			$this->paintRectangleStart($node, $horiz, $vert);
			$this->divideMapNodes($node, $aspect);
			$this->paintRectangleEnd();
		}
	}
...}
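
To see what numbers the slice and dice actually produces, here is a standalone sketch that collects the percentages instead of painting divs. It uses nested arrays in place of TreemapNode (a leaf is an empty array) and assumes, as the division above implies, that a node's total size is the number of leaves beneath it. The function names and the sample suite are made up for illustration.

```php
<?php
// Number of leaves beneath a node; a leaf is an empty array.
function totalSize(array $node) {
    $total = 0;
    foreach ($node as $child) {
        $total += count($child) === 0 ? 1 : totalSize($child);
    }
    return $total;
}

// Collects [name, width%, height%] relative to the parent rectangle,
// flipping the split direction at each level as divideMapNodes does.
function divide(array $node, $aspect = true, array &$out = []) {
    $aspect = !$aspect;
    $total = totalSize($node);
    foreach ($node as $name => $child) {
        $weight = count($child) === 0 ? 1 : totalSize($child);
        $dist = $weight / $total * 100;
        $out[] = $aspect ? [$name, $dist, 100] : [$name, 100, $dist];
        divide($child, $aspect, $out);
    }
    return $out;
}

$suite = [
    'LoginTest'  => [
        'testOk'  => ['a1' => []],
        'testBad' => ['a1' => [], 'a2' => []],
    ],
    'SearchTest' => [
        'testFind' => ['a1' => []],
    ],
];

$rects = divide($suite);
foreach ($rects as [$name, $width, $height]) {
    printf("%-10s %6.2f%% wide, %6.2f%% high\n", $name, $width, $height);
}
```

With three assertions in LoginTest and one in SearchTest, the top level is cut into horizontal bands of 75% and 25% height, and the methods inside each band are then cut vertically.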

Now all that remains is the implementation of the remaining SimpleTest paint methods, using the Start and End events to push and pop elements from the stack into the graph. When the end of the wrapping parent group test is reached in paintGroupEnd, the bottom node is assigned to TreemapReporter::_graph, where it can be accessed by other paint methods. The code works because if TreemapReporter::_stack->peek() is null, then the node just popped must have been at the bottom of the stack:

class TreemapReporter extends SimpleReporter {...
	function paintGroupEnd($message) {
		$node = $this->_stack->pop();
		$current = $this->_stack->peek();
		if ($current) {
			if ($node->isFailed()) $current->fail();
			$current->putChild($node);
		} else {
			$this->_graph = $node;
		}
		parent::paintGroupEnd($message);
	}
...}
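
The other handlers follow the same push and pop rhythm. The sketch below replays it in a standalone class so it can run without SimpleTest: the method names mirror the paint events, but the class does not extend SimpleReporter, the signatures are simplified, and SketchNode and RecorderSketch are hypothetical names.

```php
<?php
class SketchNode {
    public $name;
    public $children = [];
    public $failed = false;
    public function __construct($name) { $this->name = $name; }
}

class RecorderSketch {
    private $stack = [];
    public $graph = null;

    // Start events open a new node on the stack.
    private function open($name) {
        $this->stack[] = new SketchNode($name);
    }

    // End events fold the finished node into its parent, or keep it
    // as the finished graph once the stack has emptied.
    private function close() {
        $node = array_pop($this->stack);
        if (count($this->stack) > 0) {
            $current = $this->stack[count($this->stack) - 1];
            if ($node->failed) {
                $current->failed = true;
            }
            $current->children[] = $node;
        } else {
            $this->graph = $node;
        }
    }

    // Assertions become leaves of whatever node is currently on top.
    private function leaf($message, $failed) {
        $node = new SketchNode($message);
        $node->failed = $failed;
        $this->stack[] = $node;
        $this->close();
    }

    public function paintGroupStart($name, $size) { $this->open($name); }
    public function paintGroupEnd($name) { $this->close(); }
    public function paintCaseStart($name) { $this->open($name); }
    public function paintCaseEnd($name) { $this->close(); }
    public function paintMethodStart($name) { $this->open($name); }
    public function paintMethodEnd($name) { $this->close(); }
    public function paintPass($message) { $this->leaf($message, false); }
    public function paintFail($message) { $this->leaf($message, true); }
}

// Replay the events a small test run might fire.
$r = new RecorderSketch();
$r->paintGroupStart('All tests', 1);
$r->paintCaseStart('LoginTest');
$r->paintMethodStart('testOk');
$r->paintPass('expected true');
$r->paintMethodEnd('testOk');
$r->paintMethodStart('testBad');
$r->paintFail('expected 1, got 2');
$r->paintMethodEnd('testBad');
$r->paintCaseEnd('LoginTest');
$r->paintGroupEnd('All tests');
```

A single failed assertion marks every ancestor as failed on the way back up, which is what lets a whole region of the map change colour.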

The actual spitting out of divs is done by paintRectangleStart and paintRectangleEnd:

class TreemapReporter extends SimpleReporter {...

	function paintRectangleStart($node, $horiz, $vert) {
		// %% emits a literal percent sign; printf echoes the markup directly
		printf('<div title="%s: %s" class="%s" style="width:%s%%;height:%s%%">',
				$node->getName(), $node->getDescription(), $node->getStatus(), $horiz, $vert);
	}

	function paintRectangleEnd() {
		echo "</div>";
	}
...}

And on it goes...

Just a couple lines of CSS and a few graph oriented tweaks to the SimpleTest reporter mechanism, and we're off on the pathway towards test visualization! Here are some examples, moving between the micro and macro views of different test groupings:

Web test of a JSON/HTTP events service:

Custom reporter for test suite:

Fragment from tests for SimpleTest itself:

It's hard to go back:

But back we must go...

The TreemapReporter gets the job done, but it is problematic for several reasons:

  1. The treemap division algorithm is tied to a single reporter, which makes it hard to customize.
  2. The class currently occupies two distinct roles - it records information about the graph structure of the tests and it calculates the division of nodes.
  3. The display is hacky and brittle, relying on quirky CSS to work.

The solution we want is a structure that allows us to plug in multiple treemap algorithms to operate on the test node graph. We want to separate out the recording parts from the reporting parts.

This issue has come up for a number of people working on extensions to the SimpleTest reporter. In CVS, there is currently an ArrayReporter, which I always regarded as a bit of a smell, although I was never quite sure why. One of the suggestions that came up on the SimpleTest mailing list was renaming this class to ArrayRecorder - this makes sense, although it leads to more questions about the precise role that the SimpleReporter should perform.

Clearly, there are two different styles of reporting tests. I refer to them as the active style and the deferred style.

The active style is what SimpleTest is built around, and can be thought of as line-by-line reporting, where each message is sent to the output stream when it is triggered. This would be the standard way of reporting a testing process on any platform, but it is complicated when tests are triggered via HTTP - the output stream feels rather less immediate when running tests in a browser. It's still useful in many situations where errors and exceptions are propagating or when program execution is being halted with die and dump statements in the mix.

The deferred style involves a two stage process: first, building a data structure representing the tests as they run, and when the entire test suite is complete, passing the data structure on to a second stage of visual processing where information is extracted or calculated from the test results. This style is necessary for any kind of mapping or data aggregation of test results.

In terms of a test reporting API, the important difference between these two styles is that the active testing style requires an event driven interface (the reporter wants to be notified of testing events as they happen), whereas the deferred style requires an API that provides access to the structure of the tests (most likely, as a collection that implements a list or graph type).

Obviously, the active style can be used as a building block for the deferred style, simply by capturing the output events with a logger or collection builder. This is exactly what the TreemapReporter shown above is doing. Essentially, it is a recorder of test events. To better support this, the interface of SimpleReporter or SimpleScorer could drop getTestList as a public method or replace it with a proper Stack interface, and the framework could provide an additional Recorder interface that Reporters could implement in order to provide collection-like behavior.

If a reporter were a recorder, it would simply be used to capture test events in a data structure, and offer access to this data structure for the display code to tear apart. To try this out, we need to break apart the TreemapReporter and move the treemap rendering to a decorator. First we rename the class and provide methods to access the graph collection:

class TreemapRecorder {...

	function getGraph() {
		return $this->_graph;
	}

	function isComplete() {
		return ($this->_graph != null);
	}

...}

Something to think about: what should getGraph do when the graph is null? Throw an exception, or implement some kind of sensible default? It doesn't matter just now, and in the meantime, the isComplete method provides this check. Onwards...
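
For what it's worth, here is a sketch of the exception option, together with the kind of Recorder interface suggested earlier. Nothing like this ships with SimpleTest: the Recorder interface, the ArrayGraphRecorder class and its finish() method are all hypothetical, and a nested array stands in for the node graph.

```php
<?php
// Hypothetical interface; SimpleTest does not provide it.
interface Recorder {
    public function isComplete();
    public function getGraph();
}

class ArrayGraphRecorder implements Recorder {
    private $graph = null;

    // Stand-in for the bookkeeping done in paintGroupEnd.
    public function finish(array $graph) {
        $this->graph = $graph;
    }

    public function isComplete() {
        return $this->graph !== null;
    }

    // Refuses to hand out a half-built graph.
    public function getGraph() {
        if (!$this->isComplete()) {
            throw new LogicException('Test run has not finished yet');
        }
        return $this->graph;
    }
}

$recorder = new ArrayGraphRecorder();
try {
    $recorder->getGraph();
    $early = 'graph';
} catch (LogicException $e) {
    $early = 'refused';
}

$recorder->finish(['All tests' => []]);
$late = $recorder->getGraph();
```

Throwing makes a premature call fail loudly. Returning an empty graph would be the quieter default, at the cost of painting an empty map without complaint.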

SimpleTest provides a SimpleReporterDecorator which we can use to chain pluggable algorithms to the wrapped TreemapRecorder:

class TreemapReporter extends SimpleReporterDecorator {...

	function TreemapReporter() {
		$this->SimpleReporterDecorator(new TreemapRecorder());
	}

...}

I've never been totally comfortable with the Decorator pattern in PHP as it seems to often be used indecisively as a generic formalization when it is unclear where to use inheritance, and leads to reams of boilerplate wrapping code.

In fact, the Decorator pattern is useful in this particular case because the essence of the pattern comes from a need to add additional behaviour that the concrete class does not support. In this case, we are adding division and nesting of test nodes to the reporter. We need to capture the whole graph first before we can calculate extra visual properties to output the divs representing each test node.

All the extra behavior can be moved out of the class implementing SimpleReporter and into the wrapping decorator:

class TreemapReporter extends SimpleReporterDecorator {...

	function paintResultsHeader();

	function paintResultsFooter();

	function paintRectangleStart();

	function paintRectangleEnd();

	function divideMapNodes(TreemapNode $map);

...}

The only place where we need to change the moved code is to convert the internal graph instance variable to a method call. This was originally happening in the paintFooter method - let's change it to paintResults. Due to a quirk in the way the reporter is wrapped, we also need a concrete method in the decorator to trigger the footer display - another point to consider for the process of smoothing out the SimpleTest interface for extension developers:

class TreemapReporter extends SimpleReporterDecorator {...

	function paintResults() {
		$this->paintResultsHeader();
		$this->paintRectangleStart($this->_reporter->getGraph(), 100, 100);
		$this->divideMapNodes($this->_reporter->getGraph());
		$this->paintRectangleEnd();
		$this->paintResultsFooter();
	}

	function paintGroupEnd($group) {
		$this->_reporter->paintGroupEnd($group);
		if ($this->_reporter->isComplete()) {
			$this->paintResults();
		}
	}

...}

This code now executes the main test events on the TreemapRecorder and defers display to the TreemapReporter decorator. We can make the TreemapReporter an abstract base class and implement different treemap algorithms in subclasses. But first we need to understand a little more about the possibilities of presenting treemaps.
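
The shape of that base class can be sketched on its own, away from SimpleTest. Everything here is illustrative: TreemapRenderer, ListRenderer and IndentRenderer are made-up names, the graph is a nested array rather than a TreemapNode, and the output is returned as a string instead of echoed.

```php
<?php
abstract class TreemapRenderer {
    // Template method: shared framing, with the algorithm left to subclasses.
    public function render(array $graph) {
        return $this->header() . $this->divide($graph) . $this->footer();
    }

    protected function header() { return ''; }

    protected function footer() { return ''; }

    abstract protected function divide(array $graph);
}

// Emits a flat list of the non-leaf nodes.
class ListRenderer extends TreemapRenderer {
    protected function header() { return '<ul>'; }

    protected function footer() { return '</ul>'; }

    protected function divide(array $graph) {
        $html = '';
        foreach ($graph as $name => $children) {
            if (count($children) > 0) {
                $html .= '<li>' . htmlspecialchars($name) . '</li>';
                $html .= $this->divide($children);
            }
        }
        return $html;
    }
}

// Emits every node as indented plain text.
class IndentRenderer extends TreemapRenderer {
    protected function divide(array $graph, $depth = 0) {
        $text = '';
        foreach ($graph as $name => $children) {
            $text .= str_repeat('  ', $depth) . $name . "\n";
            $text .= $this->divide($children, $depth + 1);
        }
        return $text;
    }
}

$graph = ['All tests' => ['LoginTest' => ['testOk' => []]]];
$list = (new ListRenderer())->render($graph);
$text = (new IndentRenderer())->render($graph);
```

Both subclasses walk the same graph, so swapping the algorithm means swapping one class and leaving the recording side alone.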

More Possibilities For Web Presentation

The capabilities of CSS and HTML are limited in comparison to a real vector graphics API. On today's web, there are a great many more possibilities for display. Some uses of JavaScript or SVG may require a radically different approach to handling the output. The only way to find out is to jump straight in and implement something. Let's start with a jQuery treemap:

class JqueryTreemapReporter extends TreemapReporter {

	function _getCss() {
		$css = ".treemapView { color:white; }";
		$css .= ".treemapCell {background-color:green;font-size:10px;font-family:Arial;}
  		.treemapHead {cursor:pointer;background-color:#B34700}
		.treemapCell.selected, .treemapCell.selected .treemapCell.selected {background-color:#FFCC80}
  		.treemapCell.selected .treemapCell {background-color:#FF9900}
  		.treemapCell.selected .treemapHead {background-color:#B36B00}
  		.transfer {border:1px solid black}";
		return $css;
	}

	function paintResultsHeader() {
		$title = $this->_reporter->getTitle();
		echo "<html><head>";
		echo "<title>{$title}</title>";
		// Single-quoted PHP strings keep the HTML and JS double quotes intact
		echo '<style type="text/css">' . $this->_getCss() . '</style>';
		echo '<script type="text/javascript" src="jquery.js"></script>';
		echo '<script type="text/javascript" src="treemap.js"></script>';
		echo '<script type="text/javascript">' . "\n";
		echo '	window.onload = function() { jQuery("ul").treemap(800,600,{getData:getDataFromUL}); };
				function getDataFromUL(el) {
					var data = [];
					jQuery("li",el).each(function(){
					  var item = jQuery(this);
					  var row = [item.find("span.desc").html(),item.find("span.data").html()];
					  data.push(row);
					});
				return data;
				}';
		echo "</script></head>";
		echo "<body><ul>";
	}

	function paintRectangleStart($node) {
		echo '<li><span class="desc">' . basename($node->getDescription()) . '</span>';
		echo '<span class="data">' . $node->getTotalSize() . '</span>';
	}

	function paintRectangleEnd() {}

	function paintResultsFooter() {
		echo "</ul></body>";
		echo "</html>";
	}

	function divideMapNodes($map) {
		foreach($map->getChildren() as $node) {
			if (!$node->isLeaf()) {
				$this->paintRectangleStart($node);
				$this->divideMapNodes($node);
			}
		}
	}

}

Because the division of map nodes is being done on the JavaScript side, the implementation of these methods just involves outputting the necessary data structure.

Of course, this raises the question of whether the deferred style I've so far outlined is even necessary – why not maintain HTML or XML as the collection data-structure representing the tests? And with JS/CSS, keep the presentation layer as a cleanly-separated UI layer from the unit testing machinery itself?

One possible reason is that we gain a lot from representing the test results as an abstract data type. Having access to graph operations directly on the tests at run-time, we can split and feed the test results in all kinds of directions. Emails, tag clouds, Graphviz files. There's still quite a bit of work to do, but I think the basic patterns are starting to emerge from these code experiments.
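
As one small illustration of feeding the graph somewhere else, here is a sketch that turns a test tree into Graphviz DOT text. It is only a sketch under assumptions: the tree is a nested array keyed by node name rather than a TreemapNode graph, names are assumed to be unique and free of double quotes, and toDot is a made-up function.

```php
<?php
// Collects one DOT edge per parent/child pair, depth first.
function toDot(array $tree, $parent = null, array &$lines = []) {
    foreach ($tree as $name => $children) {
        if ($parent !== null) {
            $lines[] = sprintf('  "%s" -> "%s";', $parent, $name);
        }
        toDot($children, $name, $lines);
    }
    return $lines;
}

$tree = [
    'All tests' => [
        'LoginTest'  => ['testOk' => [], 'testBad' => []],
        'SearchTest' => ['testFind' => []],
    ],
];

$edges = toDot($tree);
$dot = "digraph tests {\n" . implode("\n", $edges) . "\n}\n";
echo $dot;
```

The resulting text can be saved to a file and handed to the dot command line tool to draw the suite as a tree.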