AlBlue’s Blog

Macs, Modularity and More

Git Tip of the Week: Git Notes

Gtotw 2011 Git

This week's Git Tip of the Week is about git notes. You can subscribe to the feed if you want to receive new instalments automatically.


At a recent talk for the London Java Community (recorded video is available via the link), I presented Git and Gerrit (based on the successful screencasts I have done previously). One of the things I demonstrated was the use of git notes, so I thought writing about them and explaining what they are made sense.

When files are committed into a Git repository, they are addressed by a hash of the contents. The same is true of trees and commits. One of the benefits of this structure is that the objects cannot be modified after they have been committed (since doing so would change that hash).
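To see the content addressing in action, here is a minimal sketch using git hash-object, which computes the ID Git would give a blob without storing it. The sample strings are just placeholders of mine.

```shell
set -eu

# Identical content always hashes to the same object ID ...
a=$(printf 'hello\n' | git hash-object --stdin)
b=$(printf 'hello\n' | git hash-object --stdin)

# ... while changing a single character gives a completely different ID.
c=$(printf 'hello!\n' | git hash-object --stdin)

echo "first:    $a"
echo "second:   $b"
echo "modified: $c"
```

The first two IDs match and the third does not, which is why an object cannot be edited in place: the edited content would simply be a different object.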

However, sometimes it is desirable to be able to add metadata to a commit after it has already been committed. There are three ways of doing this:

  1. Amend the commit message to add in the additional metadata, accepting this will change the branch.
  2. Create a merge node with a more detailed commit, and push that (so that the previous commit is retained and can be fast forwarded).
  3. Add additional metadata in the form of git notes.

Of these three options, only the last one will not change the current branch.
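The difference between the first and last options can be sketched in a throwaway repository. Everything here (the temporary directory, the identity, the file name) is a placeholder chosen for illustration.

```shell
set -eu

# Scratch repository with a placeholder identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "Initial"
before=$(git rev-parse HEAD)

# Option 3: attach a note. The commit ID is untouched.
git notes add -m "ToDo: Fix stuff"
after_note=$(git rev-parse HEAD)

# Option 1: amend the message. The commit gets a new ID.
git commit -q --amend -m "Initial (ToDo: Fix stuff)"
after_amend=$(git rev-parse HEAD)

echo "before:      $before"
echo "after note:  $after_note"
echo "after amend: $after_amend"
```

The first two IDs are identical; only the amend produces a new one, which is what forces the branch to change.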

Git Notes

Git Notes are, in effect, a separate ‘branch’ of the repository (stored at .git/refs/notes). They don't show up in the git branch command (which lists .git/refs/heads by default). Although you could check the notes branch out and update it manually, Git provides a command that does this for you: git notes.


(master) $ git log --oneline
056ca11 More Stuff Again
9defb31 MoreStuff
0c7ff4f Additional
19b6cdf Initial
(master) $ git notes show
(master) $ git notes add -m "ToDo: Fix stuff"
(master) $ git notes show
ToDo: Fix stuff
(master) $ git log
commit 056ca11c01b47e2bfe1e51178b65c80bbdeef7b0
…

    More Stuff Again

Notes:
    ToDo: Fix stuff

When you look at the output of git log, it checks to see if there is an associated note, and if so, prints it out as if it were an appendix to the commit. Furthermore, the notes are mutable and can be updated over time:


(master) $ git notes add --force -m "ToDone: Fixed stuff"
Overwriting existing notes for object 056ca11c01b47e2bfe1e51178b65c80bbdeef7b0
(master) $ git notes show
ToDone: Fixed stuff

The advantage of the notes is that they can be updated without changing the commit message (and therefore the hash) of the item that they are referring to. Of course, this can be used for good as well as bad, so bear the mutability in mind if you need to depend on the notes' contents.
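Overwriting with --force is not the only way to change a note. As a rough sketch (again in a scratch repository with a placeholder identity and made-up note text), git notes append adds to an existing note and git notes remove deletes it:

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "More Stuff Again"

git notes add -m "ToDo: Fix stuff"

# Append a second paragraph to the existing note instead of replacing it.
git notes append -m "Reviewed: not yet"
appended=$(git notes show)
echo "$appended"

# Remove the note again; the commit itself is unaffected.
git notes remove
if git notes show >/dev/null 2>&1; then
  removed=no
else
  removed=yes
fi
echo "note removed: $removed"
```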

Gits all the way down …

Actually, a better title might have been “objects all the way down”, but I liked this one better.

Since Git is a content-addressable database, the notes themselves are Git objects. You can view the history of the notes branch using git log, and even check it out. But how are the notes stored?


(master) $ git log --oneline notes/commits
d6ac2b2 Notes added by 'git notes add'
5eb0ee5 Notes added by 'git notes add'
(master) $ git checkout notes/commits
Note: checking out 'notes/commits'.

You are in 'detached HEAD' state. You can look around, make experimental
…
HEAD is now at d6ac2b2... Notes added by 'git notes add'
((d6ac2b2...)) $ ls
056ca11c01b47e2bfe1e51178b65c80bbdeef7b0
((d6ac2b2...)) $ cat 056ca11c01b47e2bfe1e51178b65c80bbdeef7b0
ToDone: Fixed stuff
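You don't have to check the notes ref out to look inside it. The following sketch reads the same information with plumbing commands; it assumes a scratch repository holding a single note, so the notes tree is flat (a repository with many notes may split the names into subdirectories).

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "More Stuff Again"
git notes add -m "ToDone: Fixed stuff"
commit=$(git rev-parse HEAD)

# The notes ref lives under refs/notes, not refs/heads.
git for-each-ref refs/notes

# Each entry in the notes tree is named after the object it annotates.
entry=$(git ls-tree -r --name-only refs/notes/commits)
echo "$entry"

# Read the note blob straight out of that tree.
note=$(git cat-file -p "refs/notes/commits:$commit")
echo "$note"
```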

The branch contains a list of notes, with file names referenced by the commit (or other object) ID that they correspond to. We can make a change here and update our notes:


((d6ac2b2...)) $ echo Note: Git notes are just objects >> 056ca11c01b47e2bfe1e51178b65c80bbdeef7b0
((d6ac2b2...)) $ git commit -a -m "Note added by me"
[detached HEAD 89e6afa] Note added by me
 1 files changed, 1 insertions(+), 0 deletions(-)
((89e6afa...)) $ git checkout master
Warning: you are leaving 1 commit behind, not connected to
any of your branches:

  89e6afa Note added by me
…
(master) $ git log HEAD^..HEAD
commit 056ca11c01b47e2bfe1e51178b65c80bbdeef7b0
…
    More Stuff Again

Notes:
    ToDone: Fixed stuff

So, we added a new commit and then switched back to master; but as the warning message told us, this has left the commit behind. We really need to update the refs/notes/commits reference if we want to see the new values:


(master) $ git update-ref refs/notes/commits 89e6afa
(master) $ git log HEAD^..HEAD
commit 056ca11c01b47e2bfe1e51178b65c80bbdeef7b0
…
    More Stuff Again

Notes:
    ToDone: Fixed stuff
    Note: Git notes are just objects

Here, git update-ref points refs/notes/commits at the commit 89e6afa… (after resolving the abbreviation to a full 40-character hash and checking that the object exists).
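That resolution step can be checked with a small sketch. The ref name refs/notes/demo is hypothetical and only used so that the real notes ref is left alone; 0123456 is an arbitrary abbreviation that does not match any object in the scratch repository.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Initial"

full=$(git rev-parse HEAD)
short=$(git rev-parse --short HEAD)

# Hand update-ref the abbreviated hash; the ref stores the full one.
git update-ref refs/notes/demo "$short"
stored=$(git rev-parse refs/notes/demo)
echo "given:  $short"
echo "stored: $stored"

# A hash that does not resolve to an object is rejected.
if git update-ref refs/notes/demo 0123456 2>/dev/null; then
  rejected=no
else
  rejected=yes
fi
echo "bogus hash rejected: $rejected"
```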

Conventions

Just a quick note on conventions: since the notes are essentially on their own branch, their content doesn't get merged when you merge between branches. If you want to merge git notes, then writing them as Key: Value pairs on separate lines is the way to achieve git note merging nirvana. The merging options for git notes allow for appending of notes (i.e. similar to cat noteV1 noteV2) or sorting and uniquifying the data (i.e. cat noteV1 noteV2 | sort | uniq).
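As a sketch of why the one-pair-per-line convention helps, here are two sets of notes for the same commit merged with the cat_sort_uniq strategy of git notes merge. The ref refs/notes/theirs and the reviewer names are made up for the example.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "More Stuff Again"

# Our notes, on the default refs/notes/commits.
printf 'Reviewed-by: Alice\nTested-by: Carol\n' | git notes add -F -

# Somebody else's notes for the same commit, kept under a separate ref.
printf 'Reviewed-by: Bob\nTested-by: Carol\n' |
  git notes --ref=refs/notes/theirs add -F -

# Merge theirs into ours: concatenate, sort and drop duplicate lines.
git notes merge -s cat_sort_uniq refs/notes/theirs
merged=$(git notes show)
echo "$merged"
```

Because every line is a self-contained Key: Value pair, the duplicate Tested-by line collapses into one and both reviewers survive the merge.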

However, the notes don't have to be textual, nor do they have to be something which is mergeable. They don't even need to be on the notes/commits ref; you can create notes based on any reference.

In fact, this is how Gerrit works (which I've written about before). Gerrit stores its review information in the Git repository under notes/review. Ordinarily, this doesn't show up (the git log only shows notes in the notes/commits refspace), but you can make it do so if you want:


(BARE:master) $ git show refs/notes/review
commit bb7cba258eaaf4851b20b66c7ef56775f0cb4367
…
    Update notes for submitted changes

    * Goodbye world

diff --git a/f7f38314247063271631cfddf560ea99214cd438 b/…
@@ -0,0 +1,7 @@
+Code-Review+2: Alex Blewitt
+Verified+1: Jenkins
+Submitted-by: Alex Blewitt
+Submitted-at: Thu, 20 Oct 2011 20:11:16 +0100
+Reviewed-on: http://localhost:9080/7
+Project: SkillsMatter
+Branch: refs/heads/master
(BARE:master) $ git log HEAD^..HEAD
commit f7f38314247063271631cfddf560ea99214cd438
…
    Goodbye world

    Change-Id: I692f8de08938f22da9d6e26005ba44c95a1479d7
(BARE:master) $ git log --show-notes=* HEAD^..HEAD
commit f7f38314247063271631cfddf560ea99214cd438
…
    Goodbye world

    Change-Id: I692f8de08938f22da9d6e26005ba44c95a1479d7

Notes (review):
    Code-Review+2: Alex Blewitt
    Verified+1: Jenkins
    Submitted-by: Alex Blewitt
    Submitted-at: Thu, 20 Oct 2011 20:11:16 +0100
    Reviewed-on: http://localhost:9080/7
    Project: SkillsMatter
    Branch: refs/heads/master
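If you want to try this without a Gerrit server, the following sketch fakes a review note in a scratch repository (the note text is invented) and displays it with the --notes option of git log. It also sets notes.displayRef, which I believe is the configuration equivalent for showing extra notes refs by default.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Goodbye world"

# Store a note under refs/notes/review rather than refs/notes/commits.
git notes --ref=review add -m "Code-Review+2: Example Reviewer"

# By default git log only looks at refs/notes/commits.
plain=$(git log -1)

# Ask for the review notes explicitly.
with_review=$(git log -1 --notes=review)
echo "$with_review"

# Or make every notes ref show up in this repository by default.
git config notes.displayRef 'refs/notes/*'
configured=$(git log -1)
```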

In this case, I reviewed the commit (with a +2 from me, and a +1 from Jenkins) and it's stored in the Git repository, along with everything else. Normally, it's not received by the user when pulling or cloning; but it is a permanent record on the repository (and will be visible if you e.g. do a git clone --mirror). However, if you want to fetch the notes as well, you can do so by adding a fetch refspec to the remote in .git/config:


[remote "origin"]
	fetch = +refs/notes/*:refs/notes/*
	fetch = +refs/heads/*:refs/remotes/origin/*
	url = ssh://localhost:29418/SkillsMatter.git
	push = refs/heads/master:refs/for/master

The first fetch refspec (+refs/notes/*:refs/notes/*) allows me to pull any/all reviews from the repository and make them available in my local clone.
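The same refspec can be added from the command line. This sketch uses two scratch repositories on the local disk in place of a real Gerrit server, with an invented review note, to show that the notes are missing after a plain clone and arrive once the refspec is in place.

```shell
set -eu

work=$(mktemp -d)

# A stand-in for the server, with one commit and one review note.
git init -q "$work/origin"
cd "$work/origin"
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Goodbye world"
git notes --ref=review add -m "Verified+1: Example CI"

# A plain clone does not bring refs/notes/* along.
git clone -q "$work/origin" "$work/clone"
cd "$work/clone"
if git notes --ref=review show HEAD >/dev/null 2>&1; then
  before=present
else
  before=absent
fi
echo "after clone: note is $before"

# Add the notes refspec and fetch again.
git config --add remote.origin.fetch '+refs/notes/*:refs/notes/*'
git fetch -q origin
after=$(git notes --ref=review show HEAD)
echo "after fetch: $after"
```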

Exercise for the reader …

Since Git notes can contain any blob, and they are not cloned by default (unless you specifically request them), you can create a distribution and check it into a repository. Instead of storing it in refs/notes/commits, store it in refs/notes/dist and have your build system export the generated binary as a Git note pointing to the tag. That way, if you want to check out the pre-built bundle for a given tag, you can look up the tag you want in refs/notes/dist and extract the full binary.

Of course, you don't really need to use git notes to store any blob in the repository in any case; there's no reason why you couldn't have a refs/dists tree, with one file per tag.
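For anyone attempting the exercise, here is one possible sketch rather than a finished solution. It assumes a lightweight tag named v1.0 and a stand-in file called bundle.bin (both invented here), writes the file into the object database with git hash-object -w, and attaches that blob as a note with git notes add -C so that the bytes are reused as they are.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "Release"
git tag v1.0

# A stand-in for the build output, including some non-text bytes.
printf 'fake\000binary\001bundle\n' > bundle.bin

# Store the file as a blob, then attach that blob as the note for the tag.
blob=$(git hash-object -w bundle.bin)
git notes --ref=dist add -C "$blob" v1.0

# Later: find the note object for the tag and extract it again.
note_id=$(git notes --ref=dist list v1.0)
git cat-file blob "$note_id" > extracted.bin

cmp bundle.bin extracted.bin && echo "bundle extracted intact"
```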

Git notes demonstrate that Git is not just a source code control system, like Hg or Bzr. Instead, it's a content-addressable file system, which just happens to be able to represent trees and files (blobs) in an easy way. As a result, Git will always be capable of being extended with functionality like Gerrit and git notes, because it is not limited in what it can store in a repository. Yet cloning the repository can still be efficient, since the data you pull in a clone is only the objects reachable from the commits you fetch. Review notes (and/or binary distributions) therefore need never be part of a cloned repository, even if they are persisted and available in the same Git back-end.


Come back next week for another instalment in the Git Tip of the Week series.