Performance and TDD

TDD works wonders for developing code that meets functional requirements. But what about non-functional requirements? Let’s take a look at one that most users care about: performance.

Most TDD examples are necessarily small, so that the author can see the process through to completion. That reduces the chances for intricacies like performance aspects. But it’s not impossible. Let’s try this with the WordWrap kata, where we have to ensure lines aren’t too long by inserting newline characters in appropriate places.

As usual, we start with input validation:

class WhenWrappingWords {

    @Test
    void shouldHandleEmptyText() {
        assertWrap(null, "");
    }

    private void assertWrap(String text, String expected) {
        assertThat(text, Wrapper.wrap(text, 5), is(expected));
    }

}

Which is easy enough:

public class Wrapper {

    public static String wrap(String text, int length) {
         return "";
    }

}

Next, the degenerate case, where the text doesn’t require newlines:

    @Test
    void shouldNotWrapShortText() {
        assertWrap("short", "short");
    }
    public static String wrap(String text, int length) {
        if (text == null) {
             return "";
        }
        return text;
    }

Now we get to the meat: longer texts require wrapping:

    @Test
    void shouldWrapLongText() {
        assertWrap("toolong", "toolo\nng");
    }
    private static final char NL = '\n';

    public static String wrap(String text, int length) {
        if (text == null) {
            return "";
        }
        if (text.length() <= length) {
            return text;
        }
        return text.substring(0, length) + NL 
            + text.substring(length);
    }

But, if possible, we should wrap at word boundaries rather than in the middle of a word:

    @Test
    void shouldPreferToWrapAtWhitespace() {
        assertWrap("too long", "too\nlong");
    }
    public static String wrap(String text, int length) {
        if (text == null) {
            return "";
        }
        if (text.length() <= length) {
            return text;
        }
        var index = text.lastIndexOf(' ', length);
        if (index < 0) {
            return text.substring(0, length) + NL
                + text.substring(length);
        }
        return text.substring(0, index) + NL
            + text.substring(index + 1);
    }

And finally, we should wrap into multiple lines if needed:

    @Test
    void shouldWrapVeryLongTextMultipleTimes() {
        assertWrap("toolongtext", "toolo\nngtex\nt");
        assertWrap("too long text", "too\nlong\ntext");
    }
    public static String wrap(String text, int length) {
        if (text == null) {
            return "";
        }
        if (text.length() <= length) {
            return text;
        }
        var index = text.lastIndexOf(' ', length);
        if (index < 0) {
            return text.substring(0, length) + NL 
                + wrap(text.substring(length), length);
        }
        return text.substring(0, index) + NL 
            + wrap(text.substring(index + 1), length);
    }

Which we can clean up a bit:

    public static String wrap(String text, int length) {
        if (text == null) {
            return "";
        }
        if (text.length() <= length) {
            return text;
        }
        var index = text.lastIndexOf(' ', length);
        var skip = 1;
        if (index < 0) {
            index = length;
            skip = 0;
        }
        return text.substring(0, index) + NL 
            + wrap(text.substring(index + skip), length);
    }

Now let’s consider the performance of this code. Can we use it to format a book? A novel has around 100,000 words and an English word consists of 5.1 letters on average. Let’s say we want to wrap lines at 80 characters:

    private static final int NUM_WORDS_IN_BOOK = 100_000;
    private static final float AVG_NUM_CHARS_PER_WORD = 5.1f;
    private static final int MAX_BOOK_LINE_LENGTH = 80;
    private static final int NUM_TRIES = 10;
    private static final float MAX_WRAPPING_MS = 1000;

    private final Random random = new SecureRandom();

    @Test
    void shouldWrapBook() {
        var time = 0;
        for (var i = 0; i < NUM_TRIES; i++) {
            var text = randomStringOfBookLength();
            var start = System.currentTimeMillis();
            Wrapper.wrap(text, MAX_BOOK_LINE_LENGTH);
            var stop = System.currentTimeMillis();
            time += stop - start;
        }
        assertThat(1.0f * time / NUM_TRIES, 
            lessThanOrEqualTo(MAX_WRAPPING_MS));
    }

    private String randomStringOfBookLength() {
        var numCharsInBook = (int) (NUM_WORDS_IN_BOOK * ( 1 + AVG_NUM_CHARS_PER_WORD));
        var result = new StringBuilder(numCharsInBook);
        for (var i = 0; i < numCharsInBook; i++) {
            result.append(randomChar());
        }
        return result.toString();
    }

    private char randomChar() {
        if (random.nextFloat() < 1.0 / (1 + AVG_NUM_CHARS_PER_WORD)) {
            return ' ';
        }
        return (char) (random.nextInt(26) + 'a');
    }

Normally, you’d use the Java Microbenchmark Harness to investigate the performance of an algorithm like this. I don’t want to introduce new tech for this already long post, however, so this test will have to do. Note that we have to run multiple tries, since we’re using randomness.

Running this test gives a stack overflow, so clearly we need to do something about that.

In this case, it would be easy to replace the recursion with a while loop, so we could just go do that and see the test pass. In the real world, however, things usually aren’t that simple.

This is where the Strategy pattern can come in handy. With multiple implementations of the strategy interface, we can run our tests against all of them. We can develop alternative implementations from scratch, using TDD, or copy some code into a new implementation and start modifying it. Once we’re satisfied with the results, we can keep the best implementation and remove the others.

But hang on, we used TDD to get to this implementation, so how is doing that again going to give us a different result?

Well, when we did it the first time, we weren’t focused on performance. We shouldn’t have been, since premature optimization is the root of all evil. Now that we have proof that our performance isn’t good enough, things are different. Let’s see how that plays out.

The implementation of the first two tests can remain the same:

    public static String wrap(String text, int length) {
        if (text == null) {
            return "";
        }
        return text;
    }

To make shouldWrapLongText() pass, we need to pay more attention to performance this time. We don’t want to use substring() and add two Strings together, since that involves copying characters. So let’s use a StringBuilder instead:

    public static String wrap(String text, int length) {
        if (text == null) {
            return "";
        }
        var result = new StringBuilder(text);
        if (result.length() > length) {
            result.insert(length, NL);
        }
        return result.toString();
    }

This still means we have to copy some arrays around to make room for the newline. We can avoid that by allocating enough capacity from the start:

    public static String wrap(String text, int length) {
        if (text == null) {
            return "";
        }
        var capacity = text.length() + text.length() / length;
        var result = new StringBuilder(capacity);
        result.append(text);
        if (result.length() > length) {
            result.insert(length, NL);
        }
        return result.toString();
    }

This would normally be looking ahead a bit too much for my taste, but since we already implemented the algorithm once, we know for sure we’re going to need this, so I’m cool with it.

Next let’s make shouldPreferToWrapAtWhitespace() pass:

    public static String wrap(String text, int length) {
        if (text == null) {
            return "";
        }
        var result = new StringBuilder(text.length() + text.length() / length);
        result.append(text);
        if (result.length() > length) {
            var spaceIndex = text.lastIndexOf(' ', length);
            if (spaceIndex < 0) {
                result.insert(length, NL);
            } else {
                result.setCharAt(spaceIndex, NL);
            }
        }
        return result.toString();
    }

Finally, we can generalize the if to a while to make the last test pass:

    public static String wrap(String text, int length) {
        if (text == null) {
            return "";
        }
        var capacity = text.length() + text.length() / length;
        var result = new StringBuilder(capacity);
        result.append(text);
        var columnEnd = length;
        while (columnEnd < result.length()) {
            var spaceIndex = result.lastIndexOf(" ", columnEnd);
            if (spaceIndex < columnEnd - length) {
                result.insert(columnEnd, NL);
                columnEnd += length + 1;
            } else {
                result.setCharAt(spaceIndex, NL);
                columnEnd = spaceIndex + 1 + length;
            }
        }
        return result.toString();
    }

This passes all our tests, including the one about performance.

The above do-over may seem like a wasteful approach: why wouldn’t we do it “right” from the start? Like I said earlier, we didn’t because we didn’t know that our implementation wasn’t going to perform well. But what if we did know from the start that performance was important?

We could’ve written our tests in a different order, tackling the test for performance earlier in the process. That would’ve prevented us from getting to green with recursion in this example, saving us a bit of time. In a real-world scenario, it might have saved a lot more time. Yet again, we see that the order of tests is important.

I would argue, however, that not much time was lost with the initial approach. I still believe that the proper order is make it pass, make it right, make it fast. One of the reasons TDD works so well is the explicit distinction between making the test green and then refactoring. Doing one thing at a time is solid advice when it comes to addressing performance as well.

I’ll accept a little bit of rework, knowing that I’ll win back that time and more in all the cases where the “right” solution is also fast enough and I don’t waste time on premature optimization.

Please Join the Discussion