Alex headshot

AlBlue’s Blog

Macs, Modularity and More

Should every commit compile?

2013 Eclipse Git

In a recent conversation on Twitter, the question of whether all individual git commits should compile* was discussed.

There’s broadly two answers to this:

  • Yes
  • No

and which one you choose is more a function of your workflow and practices than a definitive answer one way or another.

To explain what I mean by the ‘no’ answer above; if you have a series of commits in a repository (let’s call them a→b→c to denote them), provided that the final commit c compiles* successfully or not, does it really matter whether a or b do as well?

For a local repository that you’re working on, it doesn’t matter too much; after all, local git history is mutable. It’s when you start pushing to remote branches – particularly ones others can see – that rewriting history is generally frowned upon.

These two patterns collide when it comes to shared code reviews. You’d like to get early feedback on an implementation, so you share it with others via some kind of remote repository/branch; but when you do, the expectation is that you’ll build upon it by adding further commits (c→d→e) instead of replacing ones you’ve already pushed (a→b'→c').

These two patterns are exemplified by GitHub’s pull requests and Gerrit’s reviews.

A GitHub branch or pull request is typically additive; newer commits amend changes until the feature is complete, upon which the pull request is merged. In this case what’s being reviewed is not necessarily the history that got to that point, but the resulting effect this change has on the branch it is being merged into. The result of the pull effect, not how it got there, is the thing being measured.

Gerrit’s reviews on the other hand allow you to review commits on a commit-by-commit basis; in other words, reviewing the history. Implicitly this ends up reviewing the resulting change as well, but through a bottom-up rather than top-down view. If an issue is found, the original commit needs to be edited and rebased before being pushed again.

In both cases it’s possible to set up a trigger, such that when a change occurs the compiler* is invoked and marks the change as verified or not; but in Gerrit’s case, it will do this for every commit in the history whereas in GitHub’s case it will just do this for the branch as a whole.

There are a couple of reasons for this differences, which are more to do with environment than anything else.

Firstly, GitHub’s site mainly consists of Ruby and JavaScript back-ends. In these cases there is no ‘compile’ step (hence the compiles* notations above); and as a result, there’s less merit in determining if an individual commit is successful or not. Gerrit on the other hand came out of Google, and is used for the Android source tree, which does use a compiled language (and so does have a ‘compile’ step).

Secondly, the ability for each commit to be compiled or not is of assistance when using git bisect. For those that haven’t used it before (or read the git tip of the week on the subject), git bisect allows you to run a program to determine ‘good’ or ‘bad’ and then perform a binary search to find out when ‘good’ changes to ‘bad’.

In the case of code at least compiling correctly, it’s possible to run the output in some form of test harness or against a test case to determine whether the resulting code is good or not; but if the code haphazardly doesn’t compile correctly then the problem can be more challenging to resolve.

As a result, those who use git bisect tend to value the benefits of each individual commit compiling correctly, whereas those who either don’t need to do a compile step or are forward-looking with roll forwards tend to not need it.

So the summary is; it’s really up to taste. If you’re using a compiled language, and you want to be able to use git bisect effectively, then having every commit compile can be a valuable tool. But if you’re not using a compiled language, and don’t have a sequence of tests that you can run, then ignoring the interim state and focussing on the final result can be more agile.

Updated Thanks to @martiell for pointing out I meant git bisect instead of git blame.