
Improving Architecture with Testable Code

Unit testing can have productive side-effects that manifest in the emergent design of classes and APIs. Testing objects in isolation from their interactions in the larger code-base pushes you to think more closely about their dependencies and responsibilities.

A class that can't function without a complex tangle of interlocking methods and internal state is hard to test. Often you can't even construct an instance of such a class without pulling in a large amount of infrastructure and chaining together components that are only incidentally related to the behavior of the class itself.

There are various patterns and techniques that make classes easier to test. The overarching idea is that the design of each individual class should be whittled down to its essential responsibilities. Dependencies are easier to manage when each class has a clearly defined role and is loosely coupled to other parts of the system, rather than tightly embedded.

For example, imagine that we are building a simple application that needs to pull a list of messages from an email server. Let's say the provider of this list is a class called MessageList, and it needs a MailServer in order to populate itself with a list of messages. A first-pass implementation of the constructor might look like this:

class MessageList {
	private $connection;
	function __construct() {
		$this->connection = new MailServer();
	}
}

Immediately, you can see that we have introduced a hidden dependency on the mail server into the message list. This means that any time we want to create a throwaway test instance of the class, we are implicitly opening a connection to the mail server, which may have all kinds of side-effects that are unrelated to the basic functionality of the list that we want to test. How can we create a message list that doesn't force a hard-coded connection to be established each time the class is instantiated?

The simple solution is to invert the dependency. Instead of MessageList being responsible for creating the connection, we make the code that creates a MessageList responsible for providing the connection:

class MessageList {
	private $connection;
	function __construct(MailServer $connection) {
		$this->connection = $connection;
	}
}

In a broader sense, what we have done here is start to move MessageList away from depending on the implementation of MailServer and towards depending on its interface. As long as the passed-in connection object supports the defined interface, we can expect MessageList to behave in a consistent way. We can validate these expectations with unit tests that focus on the behavior of the object in isolation.
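One way to take this a step further is to name the interface explicitly, so that MessageList no longer mentions a concrete server class at all. The following is only a minimal sketch: the MessageSource interface, the InMemorySource class, and the count() method are hypothetical names introduced for illustration and are not part of the original example.

```php
<?php
// Hypothetical interface: the single method MessageList is assumed to need.
interface MessageSource {
	public function getUnreadMessages(): array;
}

// MessageList depends only on the interface, not on a concrete mail server.
class MessageList {
	private $connection;

	public function __construct(MessageSource $connection) {
		$this->connection = $connection;
	}

	// Illustrative method (an assumption): number of unread messages.
	public function count(): int {
		return count($this->connection->getUnreadMessages());
	}
}

// Any class implementing the interface can be injected, including this
// hypothetical in-memory source that never touches the network.
class InMemorySource implements MessageSource {
	private $messages;

	public function __construct(array $messages) {
		$this->messages = $messages;
	}

	public function getUnreadMessages(): array {
		return $this->messages;
	}
}

$list = new MessageList(new InMemorySource(['first', 'second']));
```

With this arrangement a real mail server class would simply implement MessageSource as well, and MessageList would not need to change.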

Perhaps we need to test some kind of sorting or filtering behavior, and we don't care how or where the messages are coming from. We can create a fake implementation of the mail server that reads from a static file of messages, purely for the test scenario:

class FakeMailServer extends MailServer {
	// Override the constructor so the fake never opens a real connection.
	function __construct() {}
	function getUnreadMessages() {
		return parse_fixture(file_get_contents('MessageList.fixture'));
	}
}

Now we can pass in the fake object as if it were a real one. This implementation of getUnreadMessages uses an imaginary parse_fixture helper function to return a default set of results that we can make assertions against in our test:

class TestMessageListProvider extends UnitTest {
	function testSortByDateListsRecentMessagesFirst() {
		$list = new MessageList(new FakeMailServer());
		$list->sortByDate();
		// Assumes assertGreaterThan($a, $b) passes when $a > $b. PHPUnit's
		// version takes ($expected, $actual), so swap the arguments there.
		$this->assertGreaterThan($list->get(0)->date, $list->get(1)->date);
		$this->assertGreaterThan($list->get(1)->date, $list->get(2)->date);
	}
}

By inverting the dependency and making it a part of the interface, we have made the code more modular and easier to test. In becoming easier to test, the MessageList also gains a more clearly defined responsibility. It no longer needs to know how to create the connection to the mail server. All it needs to know is how to use the connection to access the list of unread messages. This design gives us the flexibility to use the fake object to avoid the complexity of connecting to a real mail server, and ensures that our tests focus only on the behavior that actually matters for the object (in this case, sorting lists of messages in the right order).
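To make the whole flow concrete, here is a self-contained sketch that can be run as a plain script. Everything the article leaves open is an assumption made for illustration: the body of MailServer, the hard-coded messages (standing in for the fixture file), the lazy load() helper, and the implementations of sortByDate() and get(). Dates are ISO formatted strings so they can be compared as text.

```php
<?php
// Hypothetical stand-in for the real server: using it directly would
// require a live connection, which we simulate with an exception.
class MailServer {
	public function getUnreadMessages() {
		throw new RuntimeException('would open a real connection');
	}
}

// Test double: returns fixed messages instead of talking to a server.
class FakeMailServer extends MailServer {
	public function getUnreadMessages() {
		return [
			(object) ['subject' => 'oldest', 'date' => '2020-01-01'],
			(object) ['subject' => 'newest', 'date' => '2020-03-01'],
			(object) ['subject' => 'middle', 'date' => '2020-02-01'],
		];
	}
}

class MessageList {
	private $connection;
	private $messages = null;

	// The connection is injected rather than created here.
	public function __construct(MailServer $connection) {
		$this->connection = $connection;
	}

	// Sort in descending date order, so the newest message comes first.
	public function sortByDate() {
		$this->load();
		usort($this->messages, function ($a, $b) {
			return strcmp($b->date, $a->date);
		});
	}

	public function get($index) {
		$this->load();
		return $this->messages[$index];
	}

	// Fetch messages from the injected connection on first use only.
	private function load() {
		if ($this->messages === null) {
			$this->messages = $this->connection->getUnreadMessages();
		}
	}
}

$list = new MessageList(new FakeMailServer());
$list->sortByDate();
echo $list->get(0)->subject, "\n"; // prints: newest
```

Because the messages are fetched lazily, constructing a MessageList has no side-effects at all, which keeps throwaway test instances cheap.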

I'm a firm believer that dependency injection should not be considered rocket science, nor be bound by more framework infrastructure than is necessary. Even an approach as naive as the one described here can effectively propagate loose coupling through a system. The testing process serves as a method of discovering the modular architecture inherent in the problem space, rather than placing the burden of low-level coupling decisions on up-front design.