How To Remove Friction From Your Version Control Experience

ErrorLast week, I spend several days fixing a bug that only surfaced in a distributed environment.

I felt pressure to fix it quickly, because our continuous integration build was red, and we treat that as a “stop the line” event.

Then I came across a post from Tomasz Nurkiewicz who claims that breaking the build is not a crime.

Tomasz argues that a better way to organize software development is to make sure that breaking changes don’t affect your team mates. I agree.

Broken Builds Create Friction

Breaking changes from your co-workers are a form of friction, since they take away time and focus from your job. Tomasz’ setup has less friction than ours.

But I feel we can do better still. In a perfect Frictionless Development Environment (FDE), all friction is removed. So what would that look like with regard to version control?

With current version control systems, there is lots of friction. I complained about Perforce before because of that.

Git is much better, but even then there are steps that have to be performed that take away focus from the real goal you’re trying to achieve: solving the customer’s problem using software.

For instance, you still have to create a new topic branch to work on. And you have to merge it with the main development line. In a perfect world, we wouldn’t have to do that.

Frictionless Version Control

version-controlSo how would a Frictionless Development Environment do version control for us?

Knowing when to create a branch is easy.

All work happens on a topic branch, so every time you start to work on something, the FDE could create a new branch.

The problem is knowing when to merge. But even this is not as hard as it seems.

You’re done with your current work item (user story or whatever you want to call it) when it’s coded, all the tests pass, and the code is clean.

So how would the FDE know when you’re done thinking of new tests for the story?

Well, if you practice Behavior-Driven Development (BDD), you start out with defining the behavior of the story in automated tests. So the story is functionally complete when there is a BDD test for it, and all scenarios in that test pass.

Now we’re left with figuring out when the code is clean. Most teams have a process for deciding this too. For instance, code is clean when static code analysis tools like PMD, CheckStyle, and FindBugs give no warnings.

Some people will argue that we need a minimum amount of code coverage from our tests as well. Or that the code needs to be reviewed by a co-worker. Or that Fortify must not find security vulnerabilities. That’s fine.

pipelineThe basic point is that we can formally define a pipeline of processes that we want to run automatically.

At each stage of the pipeline can we reject the work. Only when all stages complete successfully, are we done.

And then the FDE can simply merge the branch with the main line, and delete it. Zero friction from version control.

What do you think?

Would you like to lubricate your version control experience? Do you think an automated branching strategy as outlined above would work?


How Friction Slows Us Down

FrictionI once joined a project where running the “unit” tests took three and a half hours.

As you may have guessed, the developers didn’t run the tests before they checked in code, resulting in a frequently red build.

Running the tests just gave too much friction for the developers.

I define friction as anything that resist the developer while she is producing software.

Since then, I’ve spotted friction in numerous places while developing software.

Friction in Software Development

Since friction impacts productivity negatively, it’s important that we understand it. Here are some of my observations:

  • Friction can come from different sources.
    It can result from your tool set, like when you have to wait for Perforce to check out a file over the network before you can edit it.
    Friction can also result from your development process, for example when you have to wait for the QA department to test your code before it can be released.
  • Friction can operate on different time scales.
    Some friction slows you down a lot, while others are much more benign. For instance, waiting for the next set of requirements might keep you from writing valuable software for weeks.
    On the other hand, waiting for someone to review your code changes may take only a couple of minutes.
  • Friction can be more than simple delays.
    It also rears its ugly head when things are more difficult then they ought to be.
    In the vi editor, for example, you must switch between command and insert modes. Seasoned vi users are just as fast as with editors that don’t have that separation. Yet they do have to keep track of which mode they are in, which gives them a higher cognitive load.

Lubricating Software Development

LubricationThere has been a trend to decrease friction in software development.

Tools like Integrated Development Environments have eliminated many sources of friction.

For instance, Eclipse will automatically compile your code when you save it.

Automated refactorings decrease both the time and the cognitive load required to make certain code changes.

On the process side, things like Agile development methodologies and the DevOps movement have eliminated or reduced friction. For instance, continuous deployment automates the release of software into production.

These lubricants have given us a fighting chance in a world of increasing complexity.

Frictionless Software Development

It’s fun to think about how far we could take these improvements, and what the ultimate Frictionless Development Environment (FDE) might look like.

My guess is that it would call for the combination of some of the same trends we already see in consumer and enterprise software products. Cloud computing will play a big role, as will simplification of the user interaction, and access from anywhere.

What do you think?

What frictions have you encountered? Do you think frictions are the same as waste in Lean?

What have you done to lubricate the frictions away? What would your perfect FDE look like?

Please let me know your thoughts in the comments.

The verdict on Perforce

At work, I’m now forced to use the Perforce version control system, since that’s what our company has standardized upon. I’ve had some bad feelings about that from the start (based on reading about it), but I’ve hold off on publicizing them so I could give Perforce a fair chance. After all, I have been wrong before ;). So now that I’ve worked with it for several months, here’s my verdict.

The Perforce slogan is Perforce, The Fast Software Configuration Management System. So they’re basically claiming that they are faster than their competitors. How does this claim hold up?

That question is not so easy to answer, since their competitors are not a homogeneous bunch. But let’s look at one category of competitors: the distributed version control systems. The most well-known of these are Git, Bazaar, and Mercurial. Interestingly, Git calls itself The fast version control system and Mercurial’s slogan is Work easier, Work faster.

Distributed version control systems work locally, meaning they don’t need a network connection. Contrast this with Perforce, that needs a network connection for everything. And I mean everything, to a ridiculous level. For instance, even the help command needs a network connection:

$ p4 help
Perforce client error:
        Connect to server failed; check $P4PORT.
        TCP connect to perforce failed.
        perforce: host unknown.

Now, obviously network access is going to slow things down a lot, so it’s difficult to see how Perforce can still beat their competitors on speed. And my experience has been very clear: it doesn’t! In fact, Perforce is very slow. Now obviously, that depends on your network bandwidth, so your mileage may vary.

But it gets worse. My primary interface to Perforce is not the command line client, but the Eclipse integration, P4WSAD. And although Perforce claims that this is the best of both worlds, my opinion is that this is a piece of crap. There, I said it. P4WSAD makes my life as a developer a hell.

Perforce makes all files read-only by default. Only once you’ve checked out a file, will it become writable. And, you’ve guessed it, that requires network access. In practice, this means that everytime I want to change some source file, I have to wait until P4SWAD checks out the file, which can take up to five seconds! This is extremely annoying, because it completely breaks my flow. And if you think one file is bad, try doing a refactoring that affects multiple files… It is reason enough for me to not ever want to work with Perforce again.

BTW, it is interesting to note that none of the aforementioned distributed version control systems appear in Perforce’s comparison with its competitors.

More connection troubles
Now if all this slowness actually bought me some nice features, that could change the story, right? Well, yes, it could. But it doesn’t.

The same cause for slowness, accessing the network for everything, is also limiting what you can do.

For instance, I have a long commute in the train, so I like to work there. And guess what, I don’t have an internet connection there. Not to worry, Perforce has a workaround called offline mode. This basically means that P4WSAD will nag you for confirmation every time you try to change a file.

It also means that it looses track of which files were changed, so that when you get back online, you forget to submit some files and break the build. That has happened to me quite a few times now, because the reconcile feature is not available in P4SWAD. You need to use the Perforce Visual Client (P4V) for that. So now I need to use two tools to get my work done.

Another limitation of P4WSAD is that it will block a refactoring affecting a file that you haven’t already modified since you went offline. This means you have to hunt down all the places where, say, a method to be renamed is used, and force a “checkout” of all those places by changing something in the file. Only then can you do your refactoring. Very annoying.

Perforce claims to support transactions, which is a must for a source code control system. We don’t want our automated build to pick up part of a set of changed files and break because of that!

Unfortunately, transactions in Perforce only work when they work. In other words, when an error occurs, it’s very well possible that Perforce will have comitted only a subset of the files in the “transaction”. This is not a really big deal, as it doesn’t happen all that often, but still.

Perforce is completely file based; it doesn’t track directories. So it’s impossible to add an empty directory to a repository, for instance. Also, when someone removes a directory, Perforce by default will leave empty directories on people’s file systems when they synchronize. There is a setting to fix that, but it’s set to the wrong value by default. I consider this only a minor flaw, but it’s annoying nonetheless.

Would I recommend Perforce to anybody? Not really. I think there are better alternatives out there. Free ones, mind you. So save yourself some money and check out (pun intended) Git, Mercurial, or Bazaar.

Root Cause Analysis

I’ve moved on to a new project recently. It’s quite different from the previous one. Before I worked on a monolythic web application, now we’re using OSGi. As a result, our project consists of a lot of sub-projects (OSGi bundles) which makes it very inconvenient to use Ant. So we’ve switched to Gradle. Our company has also standardized on Perforce, where we used Subversion before.

To summarize, a lot has changed and I’m not really up to speed yet. This is evidenced by the fact that I broke our build 4 times in 5 days. As an aspiring software craftsman, I feel really bad about that. So why did this happen so often? And can I do anything to prevent it from happening again? Enter Root Cause Analysis.

Root cause analysis is performed to not just treat the symptoms, but cure the disease:

The key to effective problem solving is to first make sure you understand the problem that you are
trying to solve – why it needs to be solved, how you will know when you’ve solved it, and what the
root cause is.
Often symptoms show up in one place while the actual cause of the problem is somewhere
completely different. If you just “solve” the symptom without digging deeper it is highly likely that
problem will just reappear later in a different shape.

Henrik Kniberg has written about one way of doing root cause analysis: using Cause-Effect diagrams. Using this method, I ended up with the following:

Build Failure Cause Effect Diagram

I started out with the problem: Build Failure, in the orange rectangle. I then repeatedly asked myself why this is a bad thing and added the effects in the red rectangles. Just repeat this until you find something that conflicts with your goal. Finally, I repeatedly added causes in blue rectangles. You can stop when you’ve found something you can fix, but in general it’s good to keep asking a bit deeper than feels comfortable. This is where the Five Whys technique comes in handy.

As you can see, my main problem is impatience. For those who know me, that won’t come as a surprise 😉 However, in this case this personal flaw of mine gets in the way of my goal of making customers happy.

With the causes identified, it’s time to think up some countermeasures. Pick some causes that you can fix, and think of a way to treat them. The ones I’m going to work on are marked with a star in the figure above:

  • Use a checklist when submitting code to the source code repository. This will prevent me from making silly mistakes, such as forgetting to add a new file
  • Take the time to learn the tools better, in particular Gradle and Groovy
  • In general, try to be more patient

The last one is the hard one, of course. Wish me luck!

Update 2010-08-01: I have created a checklist with things to consider before submitting code to our Perforce repository and used this checklist all week. I broke the build twice this week, both times because of special circumstances that were not accounted for on the checklist. So I think I’m improving, but I’m not there yet.