
How mocking can be a recipe for disaster

Testing is a Mere Trifle

As a software developer, how would you advise someone to test whether a dessert is good enough to serve to their friends and family?

Got your answer? OK. No changing your mind.

In a Thanksgiving episode of Friends, Rachel Green decides to make an English trifle. These desserts are traditionally made with custard, sponge fingers, and raspberry jelly (jello, if you are American).

The pages of her recipe book are stuck together, and she ends up mixing in beef, sauteed with peas and onions. She doesn’t question this, thinking that the British have odd traditional food, and it’ll probably work out.

The result “tastes like feet”.

Hands up if you answered “I would taste it,” or a variant thereof.

Good. You can invite me over anytime.

Did you answer “I would compare the steps taken to the steps in the recipe?”

Sorry – I’d love to come over, but I’m really swamped with Twitter posts I need to catch up on.

This is one example of a tautological test. “I compare the steps I planned to take to the steps I ended up taking” gives you no confidence that you had a good plan in the first place.

In general, a test is tautological when it repeats part of the implementation of the system under test. These tests end up adding very little value.

Unfortunately, software developers write this sort of test all the time, and don’t even know they are doing it.

Why do we use TDD?

Here are four commonly expressed reasons:

  1. To increase confidence that the code works
  2. To help us make better design decisions
  3. To make refactoring easy
  4. To help prevent regression bugs

Take a look at this code…

public class CarService
{
    private readonly ISearchQueryFactory _queryFactory;
    private readonly IRepository _repository;

    public CarService(ISearchQueryFactory queryFactory, IRepository repository)
    {
        _queryFactory = queryFactory;
        _repository = repository;
    }

    public List<Car> FindAll()
    {
        var query = _queryFactory.Create<Car>(); 
        return _repository.Search(query);
    }
}

This example presupposes we have a Repository that takes a SearchQuery<T> object, to which we may add search filters, and lets us Search the database for matches. Since we want all cars as results, in this case we use the default, unfiltered query.
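For reference, the post doesn’t show these abstractions, so here is a minimal sketch of what they might look like (the member shapes are my assumption, not taken from the original):

```csharp
using System.Collections.Generic;

// Sketch only: a query object to which filters could be added;
// the default instance matches everything.
public class SearchQuery<T> { }

public interface ISearchQueryFactory
{
    SearchQuery<T> Create<T>();
}

public interface IRepository
{
    List<T> Search<T>(SearchQuery<T> query);
}
```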

…and the code that tests it.

[Test]
public void FindAllShouldSearchForAllCars()
{
    var mockFactory = new Mock<ISearchQueryFactory>();
    var mockRepository = new Mock<IRepository>();

    var testee = new CarService(mockFactory.Object,
        mockRepository.Object);

    testee.FindAll();

    mockFactory.Verify(x => x.Create<Car>(), Times.Once);
    mockRepository.Verify(x => 
        x.Search(It.IsAny<SearchQuery<Car>>()),
        Times.Once);
}

This is a modified example from an excellent introduction to the tautological test antipattern by Fabio Pereira, which I am reusing to look at alternatives.

When we discussed this example in my office, at least initially, the room divided into two camps – was this test good or not?

Let’s rip it to shreds.

This test gives the coder no confidence that the implementation is correct (1) – because the recipe in the code is repeated in the test.

var query = _queryFactory.Create<Car>();
return _repository.Search(query);
mockFactory.Verify(x => x.Create<Car>());
mockRepository.Verify(x => 
    x.Search(It.IsAny<SearchQuery<Car>>()));

From a TDD point of view it does (emptily) perform the function of being a failing test that forces certain code to be written – but unlike black box TDD, this test gives no scope for the problem to be solved in any other way. This test cannot help a good design grow out of the requirements. (2)

A worthwhile question then is – if you knew you would write this code this way anyway, why did you bother with a test?

How about refactoring? Well, if we try to implement the code another way, the test will fail, even though the production code still works. And if we try to change the test first, we still hit a red light. This is terrible – we should be able to refactor production code while keeping the lights green the whole time! For each change we must edit both the code and the associated test(s) – incidentally, this also breaks the single responsibility principle. (3) And we can’t have any confidence that our refactoring is side-effect-free if it requires the tests to be rewritten – so we have no protection against regression bugs. (4)
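To make that concrete: suppose the factory’s Create<Car>() does nothing more than return a default, unfiltered query (an assumption on my part, but a plausible one). Then the following refactor of FindAll changes the observable behaviour not at all, yet the mockist test fails:

```csharp
public List<Car> FindAll()
{
    // Same observable behaviour: an unfiltered search for all cars.
    // But the test's Verify(x => x.Create<Car>(), Times.Once) now fails,
    // because the factory is never called -- even though nothing the
    // caller can see has changed.
    var query = new SearchQuery<Car>();
    return _repository.Search(query);
}
```

Any test that asserted only on the returned cars would stay green through this change.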

And yeah, that eliminates all four of our reasons why we work with TDD in the first place.

So we could write tests like this and claim to be performing TDD. But if we do, we get none of the benefits of TDD whatsoever. Rather the tests have negative value – reducing the flexibility of the code; slowing down maintenance.

There must be a way of testing this feature that isn’t tautological, right?

Sure!

Let’s start with an integration test.

[Test]
public void FindAllShouldReturnAllCars()
{
    // Assumes the test database contains exactly one car, with Id 1.
    var expectedCars = 
        new List<Car> { new Car() { Id = 1 } };

    var factory = new SearchQueryFactory();
    var repository = new Repository();

    var testee = new CarService(factory, repository);
    var results = testee.FindAll();
    
    CollectionAssert.AreEqual(expectedCars, results);
}

Well this is fine. This tests that the code works, and there’s no ugly knowledge floating around in the test about the internal implementation.

In terms of being black box, this is ideal, and will have the best possible chance of remaining robust even if the implementation changes.

Supposing that the repository talks to a database, then the big issue with this test is that it does not isolate the system under test from its external dependencies. We can’t know that the test database will always be available; it may have maintenance costs of its own; and external resources may be slow to access, which reduces the effectiveness of our test suite as a fast-feedback system.

By stubbing just the external dependencies, we can resolve these speed and maintenance issues (but we trade away the knowledge that the system works correctly even across the architectural boundary).

[Test]
public void FindAllShouldReturnAllCars()
{
    var expectedCars = 
        new List<Car> { new Car() { Id = 1 } };

    var factory = new SearchQueryFactory();
    var stubRepo = new Mock<IRepository>();
    stubRepo.Setup(
        x => x.Search(It.IsAny<SearchQuery<Car>>()))
        .Returns(expectedCars);
            
    var testee = new CarService(factory, stubRepo.Object);
    var results = testee.FindAll();

    CollectionAssert.AreEqual(expectedCars, results);
}

I call this test a “classicist test” because Martin Fowler contrasts classic TDD with mockist TDD in his vital article Mocks Aren’t Stubs.

“The classical TDD style is to use real objects if possible and a double if it’s awkward to use the real thing” – Fowler

When I demoed this example to my colleagues, Steven asked “What about isolation? This test isn’t a proper unit test.”

I probably would have agreed before reading Fowler’s article – but he points out that unit testing has not always meant the same thing to all people. The classic TDDer’s method is to assume all the other systems are working correctly and to write a test for the next unit in the stack based on that assumption.

“[when multiple tests break…] Classicists don’t express this as a source of problems. Usually the culprit is relatively easy to spot by looking at which tests fail and the developers can tell that other failures are derived from the root fault.”  – Fowler

The advantages this test has are speed and black-box requirements testing – meaning it is reasonably robust against implementation change – though a little white-box knowledge has leaked through now that we are providing the return values from the stub.

We can go further.

[Test]
public void FindAllShouldSearchForAllCars()
{
    var expectedCars = 
        new List<Car> { new Car() { Id = 1 } };

    var stubFactory = new Mock<ISearchQueryFactory>();
    stubFactory.Setup(x => x.Create<Car>())
        .Returns(new SearchQuery<Car>());

    var stubRepo = new Mock<IRepository>();
    stubRepo.Setup(
        x => x.Search(It.IsAny<SearchQuery<Car>>()))
        .Returns(expectedCars);

    var testee = new CarService(stubFactory.Object,
        stubRepo.Object);

    var results = testee.FindAll();

    CollectionAssert.AreEqual(expectedCars, results);
}

Now that all the dependencies have been stubbed, this is a true isolation test. But there’s a trade-off – each time a stub is introduced, a little more white-box knowledge leaks into the test.

It’s not as bad as mocking, though – each time a mock is introduced, considerably more white-box knowledge leaks through.

Compare this test to the mockist test – the mockist test has absolutely no advantages. And in my experience, far more often than not, a test with mocks can be rewritten to use stubs. (I’ll look at some cases where this isn’t true in a future blog.)

Give me one kind of test I can always write that always works.

No can do.

If you need a rule to take away, then: habitual mocking is an unnecessary practice that prevents us from achieving the benefits of TDD. Sometimes there’s no other choice, though.

But there is no one best way. Understand the trade-offs of integration, classic, isolation and mockist testing, and choose whatever works best in each circumstance.

Blindly following a dogma is just a way of making sure you never innovate.

Tautological tests don’t need to include mocks – just assuming what your implementation is can be enough to make blunders.

How to spot the Tautological Test Anti-pattern

When I was starting out with TDD, it took me a while to figure out how important it is that my tests be free of any logic. I remember once writing a test that was supposed to check a method for calculating the net total and tax on an invoice – and it duplicated the code it was supposed to test.

When it turned out that something had been missed, my test hadn’t protected me at all. And it was a pain to rewrite!

I’d have been better off taking one of my client’s handwritten invoices and turning it into a test. Then I could just keep coding until my results agreed with theirs.

It’s easy to write tests that don’t protect against errors in your assumptions. Until I recently spotted a bunch of tests with this problem, the best advice I could give junior devs was: “Ehhhh, this feels wrong.”

If we want to move beyond “feelings” and toward encouraging best practices for our whole team, we’ll need a language to describe the problem, and a way of spotting it at code review.

Suppose you are reviewing the test for a “Sum” method, intended to add two numbers. You put in A and B, and it gives you the sum A + B.

Unfortunately our coder has got the code wrong. They’ve added A + A in their method and in their test. Their test passes because it is a copy of the code-under-test.

public class Calculator
{
    public int Sum(int inputA, int inputB)
    {
        return inputA + inputA; // the bug: A + A instead of A + B
    }
}

[TestFixture]
public class CalculatorTests
{
    [Test]
    public void Sum_Test()
    {
        var inputA = 1;
        var inputB = 3;

        var result = new Calculator().Sum(inputA, inputB);

        var expected = inputA + inputA; // the same bug, copied into the test

        Assert.AreEqual(expected, result);
    }
}

The first issue is that the test is incorrect; it can only pass if the Sum logic produces an incorrect answer. The fact that it does pass means the code it is testing is wrong too! But this isn’t the lesson you want your reviewee to take away.

How did the bug get there? Why is it hard to see?

The test is tautological.

Which means it repeats the essence of the code it should be testing.

That test should never have included a calculation to produce the value of the ‘expected’ variable. That makes the test as complicated as the method it is testing – ask the developer: are you really going to write supertests to check the logic in your tests?

In The Art of Unit Testing, Roy Osherove warns the reader to “avoid logic in tests”. He gives loops, ifs, threads and so on as examples of things to avoid. I would go much further than he does in this regard.

Here’s some advice that would have helped me when I started out:

  • Never calculate an expected value to check against within your test. This is almost always duplication of the production code you are trying to test.
  • Logic driving the assertions in your test code is always a smell – if the WHEN step of your test doesn’t require the logic, then you can guarantee that your THEN step is doing too much.
  • Always work from real examples whenever you can. Seek a stakeholder who can provide concrete examples, or if you have no other choice, prepare your data by hand – not automatically.
  • Never write test code that assumes it knows how the method under test should be implemented – that is what we mean by tautological tests, and the sooner you can recognise them, the better.
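Putting that advice to work on the Sum example above: the fix is not just to correct the production bug, but to replace the calculated ‘expected’ with a value worked out by hand (1 + 3 is 4):

```csharp
public class Calculator
{
    public int Sum(int inputA, int inputB)
    {
        return inputA + inputB; // bug fixed: A + B, not A + A
    }
}

[TestFixture]
public class CalculatorTests
{
    [Test]
    public void Sum_Test()
    {
        var result = new Calculator().Sum(1, 3);

        // The expected value is worked out by hand, not calculated in
        // the test -- so the test can no longer mirror a bug in the code.
        Assert.AreEqual(4, result);
    }
}
```

Had the original test been written this way, the A + A bug would have failed it immediately.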

Read more about tautological tests and why testing using mocks is like serving your Friends sauteed beef in custard and sponge cake.

The 2 Inch Widget Factory – A parable of Quality Assurance

Once upon a time there was a factory that manufactured 2 inch widgets. It was actually even called “The 2 Inch Widget Factory” because the owners believed in letting people know what to expect.

Like all modern factories, the 2 Inch Widget Factory had a strict quality control process. Once a day, the factory workers would be marched to the production line and instructed to run off a batch of widgets and confirm they were all indeed 2 inches long. And each morning, after the all clear, they were given the go ahead for the day’s work to continue.

So, with top-notch quality control processes like that, it’d be hard to imagine a set of circumstances under which a wrong-sized widget could get out.

Nevertheless, complaints rolled in anyway, and everyone at the factory had a bit of a laugh about their stupid buyers, who couldn’t recognise a 2 inch widget when they saw one.

Eventually however, a manager stepped in and told the workers that they’d have to go through the motions of investigating the complaints, even though the customers had to be mistaken, just to put the matter to bed.

So it came to pass that a factory worker, a customer, and a quality control expert got together and talked about the problem.

“It’s awful,” the customer said, “we just can’t use these widgets – look,” and she held up a ruler next to the widget, “it’s 1.8 inches long and it just doesn’t fit our sprockets.”

The factory worker was amazed. “That’s amazing,” he said, “every single day we check, and I’ll tell you one thing for certain. The widgets we send you are exactly the same size as the 2 inch molds we cast them from.”

Testing is a tricky thing. If you rely on tests that are specced out by the same folks who write the code, you won’t ever catch a bug born of a misunderstood feature. None of us want to ship bad software. We care too much.

But thankfully, the answer is easy – there’s a simple method we call “the Three Amigos” – it makes things clear even before you start to code and allows your QAs to verify exactly what correct behaviour looks like.  And you barely have to do a thing!  To hear more about that, you’ll have to read my upcoming blogs.

But till then: if, as a developer, you ever find yourself working without your stakeholder to write your acceptance tests, stop. If you ever write tests that refer to the way something is implemented rather than to the outcome that is expected, for goodness’ sake, STOP.

Spare a thought for the customers of the defective 2 inch widget factory.