AlBlue’s Blog

Macs, Modularity and More

Git Tip of the Week: Rebasing

2011, git, gtotw, tip

This week's Git Tip of the Week is about rewriting history. You can subscribe to the feed if you want to receive new instalments automatically.


Rewriting history

One of the philosophical differences between Git and Mercurial is whether history should be allowed to be re-written or not. When a commit is made, the commit hash represents a point on that history – and subsequent commits then rely on that hash for integrity and representation of parental links.

Rewriting history can thus be dangerous: if you change a commit in the past, every commit that descends from it (including the current one) is left with a hash that no longer matches. Instead, you need to re-commit each of those changes against the new commit, which gives each of them a new hash value.
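To see the knock-on effect for yourself, here is a minimal sketch that rewords a parent commit and then compares the child's hash before and after. The throwaway repository, the file name and the identity are made up for illustration, and it assumes a `sed` that accepts `-i.bak`; `GIT_SEQUENCE_EDITOR` and `GIT_EDITOR` are used only so that the rebase can run without opening an editor.

```shell
#!/bin/sh
# Sketch: rewording a parent commit changes the child's hash as well.
set -e
repo=$(mktemp -d)            # throwaway repository (illustrative only)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "Base"
echo one > file.txt && git add file.txt && git commit -q -m "Parent"
echo two >> file.txt && git commit -q -am "Child"

child_before=$(git rev-parse HEAD)
tree_before=$(git rev-parse 'HEAD^{tree}')

# The child commit object records its parent's hash on a 'parent' line.
git cat-file -p HEAD

# Reword only the parent: the sequence editor marks the first entry as
# 'reword', and the message editor appends text to its subject line.
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/reword/'" \
GIT_EDITOR="sed -i.bak '1s/\$/ (reworded)/'" \
git rebase -q -i HEAD~2

child_after=$(git rev-parse HEAD)
tree_after=$(git rev-parse 'HEAD^{tree}')
echo "child before: $child_before"
echo "child after:  $child_after"
```

The child's content and message are untouched (its tree hash is the same), yet its commit hash differs because the parent it points to has changed.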

However, dangerous is relative. It's not always the case that changing the history is bad – if that history is local, and hasn't been seen by anyone else, then the only person it's affecting is you. As long as you know what you're doing (and who you will affect) then changing the local history is no different than undoing your local changes in a text editor and re-saving.
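One way to check which commits are still local is to compare your branch with its upstream. The sketch below is only an illustration: it builds a throwaway bare repository to stand in for a remote, and the commit messages are invented.

```shell
#!/bin/sh
# Sketch: list commits that exist locally but have not been pushed yet.
set -e
remote=$(mktemp -d)          # stands in for a shared repository
work=$(mktemp -d)            # our local clone
git init -q --bare "$remote"
git clone -q "$remote" "$work"
cd "$work"
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "Published"
git push -q -u origin HEAD   # publish and set the upstream branch
git commit -q --allow-empty -m "Local only"

# @{u} names the upstream of the current branch; anything listed here
# has not been seen by anyone else and only affects us if rewritten.
unpushed=$(git rev-list --count '@{u}..HEAD')
echo "$unpushed commit(s) not yet pushed:"
git log --oneline '@{u}..HEAD'
```

Anything that does not appear in that list has already been published, so rewriting it will affect whoever has fetched it.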

(Side note: Mercurial has the concept of 'patch queues' which are the equivalent of local mutable history – but you end up with two separate repository concepts instead of a single concept of history as in Git.)

So, the question is not so much whether rewriting history is dangerous as whether you understand the effects once these changes are exposed to others (e.g. via pushing to GitHub). Sometimes, it's necessary to publicly break the commit hashes – for example, someone accidentally committed a large binary, or copyrighted code which shouldn't be present (or even a password which shouldn't have been committed) – but in these cases, making the change often involves a public notification to warn others.

Rebasing

So, what is rebasing? Well, rebasing is Git's concept of changing (recent) local history. In essence, it is an undo/replay option that you can use to make changes as if they were done in the past.

A Git rebase unwinds history to a particular point (typically specified as HEAD~n, where n is a small number of commits back from the current one), and then replays the same changes on top of the code at that point. If the changes are replayed unmodified, then the resulting commits will be the same as before.
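As a quick check of that last point, the following sketch runs an interactive rebase but accepts the generated list untouched, then compares the hash before and after. The repository and file names are hypothetical; `GIT_SEQUENCE_EDITOR=true` simply means "save the list without editing it".

```shell
#!/bin/sh
# Sketch: replaying unmodified commits leaves the hashes unchanged.
set -e
cd "$(mktemp -d)"            # throwaway repository (illustrative only)
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

for n in 1 2 3; do
    echo "line $n" >> file.txt
    git add file.txt
    git commit -q -m "Commit $n"
done

before=$(git rev-parse HEAD)

# 'true' exits successfully without touching the to-do list, so every
# entry stays as a plain 'pick' in its original order.
GIT_SEQUENCE_EDITOR=true git rebase -q -i HEAD~2

after=$(git rev-parse HEAD)
echo "before: $before"
echo "after:  $after"
```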

However, it's more normal to want to adjust the commit(s) in some way, for example:

  • Reword – change the commit message to something else (e.g. to add a bug reference)
  • Edit – to make changes to the commit itself (e.g. fix a typo in the code)
  • Pick – to include that commit in the history
  • Squash – to condense that commit with the previous one and make them one (and concatenate the log entries)
  • Fixup – to condense that commit with the previous one and make them one (and discard this commit's log entry)

As well as these options, it is also possible to re-order the commits simply by re-ordering the lines in the list of changes.
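The difference between squash and fixup is easiest to see side by side. This is a rough sketch rather than a recommended workflow: the `new_repo` helper, the file name and the commit messages are all invented, and `sed -i.bak` is assumed to be available to edit the list in place of a real editor.

```shell
#!/bin/sh
# Sketch: squash keeps both log entries, fixup keeps only the first.
set -e

# Hypothetical helper: create a throwaway repository with three commits.
new_repo() {
    cd "$(mktemp -d)"
    git init -q .
    git config user.name "Example User"
    git config user.email "user@example.com"
    echo base > file.txt && git add file.txt && git commit -q -m "Base"
    echo feature >> file.txt && git commit -q -am "Add feature"
    echo fix >> file.txt && git commit -q -am "Fix feature"
}

# squash: change the second entry from 'pick' to 'squash'; GIT_EDITOR=true
# accepts the combined message that Git proposes.
new_repo
GIT_SEQUENCE_EDITOR="sed -i.bak '2s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -q -i HEAD~2
squash_msg=$(git log -1 --format=%B)
squash_count=$(git rev-list --count HEAD)

# fixup: the same fold, but the second commit's message is discarded.
new_repo
GIT_SEQUENCE_EDITOR="sed -i.bak '2s/^pick/fixup/'" \
    git rebase -q -i HEAD~2
fixup_msg=$(git log -1 --format=%B)
fixup_count=$(git rev-list --count HEAD)

echo "squash message:"; echo "$squash_msg"
echo "fixup message:";  echo "$fixup_msg"
```

In both cases three commits become two; only the resulting log entry differs.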

Example

Let's build up a repository with some changes we'd like to make:


$ git init example
Initialized empty Git repository in example/.git
$ cd example
$ git commit --allow-empty -m "Initial Commit"
$ echo Helo World > README.txt
$ git add README.txt
$ git commit -m "Typo" README.txt
$ echo Second > Second.txt
$ git add Second.txt
$ git commit -m "Second" Second.txt
$ echo Frst > First.txt
$ git add First.txt
$ git commit -m "First" First.txt
$ echo First > First.txt
$ git add First.txt
$ git commit -m "First" First.txt
$ git log --oneline
07e9061 First
756281e First
13aba60 Second
7bf9271 Typo
82f9a21 Initial Commit

What we'd like to do is fix the typo made in the first commit, join the two First commits into one, and move the Second commit so that it comes after First. To do this, we kick off an interactive rebase, which will give us an editor:


$ git rebase -i 82f9a21
@ pick 7bf9271 Typo
@ pick 13aba60 Second
@ pick 756281e First
@ pick 07e9061 First

This list is a sequence of cherry-picks, oldest first, which will replay the history using the specific changes listed. We can re-order the lines, and change the action on each, to replay history in a different manner:


@ edit 7bf9271 Typo
@ pick 756281e First
@ fixup 07e9061 First
@ pick 13aba60 Second

Git will then rewind history to 82f9a21, replay the Typo commit, and stop there, handing control back to our shell so that we can make changes:


$ echo Hello World > README.txt
$ git add README.txt
$ git commit -m "Readme" README.txt
$ git rebase --continue

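Here, we've finished editing and told Git to carry on with the rest of the rebase operation. We could insert more commits if we wanted to, but we've just committed the current state as is. Note that a plain git commit records the fix as a new commit on top of Typo; git commit --amend would fold it into the Typo commit itself. You might also see: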


error: could not apply 07e9061... First
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' and run 'git rebase --continue'
Could not apply 07e9061... First

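This happens when a commit is replayed on top of a tree that no longer matches the one it was originally made against (for example, the same line of the same file has changed, or the file is missing), and Git is asking if the result is OK. We can simply add that file and continue, or (given that 07e9061 is a complete replacement for 756281e) abort and leave the earlier commit out of the list in the first place: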


$ git rebase --abort
$ git rebase -i 82f9a21
...
@ edit 7bf9271 Typo
@ pick 07e9061 First
@ pick 13aba60 Second

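By removing the commit from the list, it is as if that commit never happened. Bear in mind that a later commit which modifies something the removed commit introduced (as 07e9061 does with First.txt, which 756281e created) can itself stop with a conflict; if it does, git add the file and run git rebase --continue, and the rebase will then run through to the end.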
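For reference, the first to-do list from the walkthrough (edit, pick, fixup, pick) can also be replayed without an editor. This is a sketch under a few assumptions: the to-do list is written to a temporary file and copied into place via `GIT_SEQUENCE_EDITOR`, the hashes are looked up rather than typed in (yours will differ from the ones shown above), and `git commit --amend` is used so that the fix lands in the Typo commit itself rather than in a new commit.

```shell
#!/bin/sh
# Sketch: replay the walkthrough's edit / pick / fixup / pick sequence.
set -e
repo=$(mktemp -d)            # throwaway repository (illustrative only)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "Initial Commit"
echo Helo World > README.txt && git add README.txt && git commit -q -m "Typo"
echo Second > Second.txt && git add Second.txt && git commit -q -m "Second"
echo Frst > First.txt && git add First.txt && git commit -q -m "First"
echo First > First.txt && git add First.txt && git commit -q -m "First"

# Look up the hashes instead of hard-coding them.
typo=$(git rev-parse HEAD~3)
second=$(git rev-parse HEAD~2)
first_a=$(git rev-parse HEAD~1)
first_b=$(git rev-parse HEAD)

# Write the re-ordered to-do list and have Git copy it into place.
todo=$(mktemp)
printf 'edit %s\npick %s\nfixup %s\npick %s\n' \
    "$typo" "$first_a" "$first_b" "$second" > "$todo"
GIT_SEQUENCE_EDITOR="cp '$todo'" git rebase -i HEAD~4

# The rebase has stopped after replaying Typo; fix it and carry on.
echo Hello World > README.txt
git add README.txt
git commit -q --amend -m "Readme"
GIT_EDITOR=true git rebase --continue

git log --oneline
```

The resulting history has four commits (Initial Commit, Readme, First, Second), with the two First commits folded into one.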

Summary

Rebasing allows you to re-write history in an automated manner, instead of having to unwind and manually replay the changes yourself. It's often used with git rebase -i HEAD~5 (or some other small number) to fix up changes in your local history before merging or pushing to a central repository.

Rebasing also allows the transplantation of entire sections of a tree, which we'll talk about another time.

Finally, remember that Git doesn't throw away commit data when you rebase. If you're working against a branch, you've got the branch's reflogs to fall back on (until those entries eventually expire and the old commits are garbage collected); what Git allows you to do is effortlessly rebuild new commit trees (whilst keeping the old commit trees around in your local repository) until you're happy with the result.
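As a small illustration of falling back on the reflog, the sketch below folds two commits together and then finds the pre-rebase tip again. The repository contents and the before-rebase branch name are invented for the example, and it assumes reflogs are enabled (the default for a normal, non-bare repository).

```shell
#!/bin/sh
# Sketch: recover the pre-rebase state of a branch from its reflog.
set -e
cd "$(mktemp -d)"            # throwaway repository (illustrative only)
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo base > file.txt && git add file.txt && git commit -q -m "Base"
echo a >> file.txt && git commit -q -am "Work in progress"
echo b >> file.txt && git commit -q -am "More work"

branch=$(git symbolic-ref --short HEAD)
old_tip=$(git rev-parse HEAD)

# Rewrite history: fold the last commit into the one before it.
GIT_SEQUENCE_EDITOR="sed -i.bak '2s/^pick/fixup/'" git rebase -q -i HEAD~2
new_tip=$(git rev-parse HEAD)

# The branch's reflog still records where it pointed before the rebase.
git reflog show "$branch"
previous=$(git rev-parse "$branch@{1}")

# Keep a handle on the old history (hypothetical branch name).
git branch before-rebase "$previous"
git log --oneline before-rebase
```

From here, git reset --hard before-rebase would put the branch back exactly as it was.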


Come back next week for another instalment in the Git Tip of the Week series.