Last week I wrote about the behaviour of pulling tracked branches; this week, it's worth taking a dive in to find out what a tracked branch is.
When you first use Git, you learn that updating from master involves a git pull (or git fetch). Both of these reach out to the remote repository and get the content that you're interested in, with the git pull variant then doing either a merge or a rebase as appropriate.
But how does Git know what to pull when you invoke git pull? Where should it pull it from? What makes a branch you have checked out locally differ from one you have pulled from a remote repository?
Remotes and Refspecs
Firstly, a (local) git repository can have many remotes. Each remote is a name for a repository at the remote end, which corresponds to a URL and a refspec. (In fact, remotes can have a second URL; one is used for fetching, whilst the other is used for pushing. This permits anonymous fetches but authenticated pushes.) When fetching and pulling, you need to specify which repository you're talking about; this defaults to origin if not specified.
You can specify what the refspec is when interacting with a remote repository. This is the set of branches that will be updated if you interact with that repository. This is normally of the form +refs/heads/*:refs/remotes/origin/*, where refs/heads are the pointers to your local branches, and refs/remotes are the remote branches.
An optional + prefix on a fetch refspec tells Git to update the ref even when the update is not a fast-forward. And whilst older versions of Git don't accept partial wildcards (like refs/for/qa*), you can always have sub paths (like refs/for/qa/*). You can also use the reference HEAD to refer to the commit that the current branch is on as a source for the refspec, which can be useful for pushes.
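To make the remote and its refspec a little more concrete, here is a minimal sketch that builds a throwaway upstream repository and a clone of it, then prints what the clone recorded for origin. The directory names, the Demo identity and the ssh://git@example.com/demo.git push address are all made up for illustration; only the git commands themselves are real.

```shell
set -e
work=$(mktemp -d)

# Throwaway upstream repository with a single commit on master
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/master
git -C "$work/upstream" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m Start

git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"

# The fetch refspec that clone wrote for the default remote
fetch_spec=$(git config --get remote.origin.fetch)
echo "fetch refspec: $fetch_spec"

# Optional second URL, used only for pushes (hypothetical address)
git remote set-url --push origin ssh://git@example.com/demo.git
push_url=$(git config --get remote.origin.pushurl)
echo "push URL: $push_url"
```

The fetch URL is left untouched by the last step, so fetches still go to the original location while pushes would go to the second one.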
However, each branch also has the concept of what it is tracking. As well as the branch(es) that will be affected by a fetch/pull/push, tracking says which branch is upstream of which.
Normally, branches checked out of a remote repository are automatically set up as tracking branches. If you check out EGit, you'll get a master branch that tracks origin/master (the remote is called origin if you didn't specify a default repository identifier). Any changes you pull into your master come (by default) from EGit's master.
However, what if you wanted to spin off another branch for experimental purposes, and keep that updated? If you do git checkout -b experimental, it diverges from your local master at that point in time. You either need to pull changes through master and then rebase, or remember where your merge point was.
Instead, you can set up your experimental branch to track another one. This means you can fetch and pull, as if you were pulling from a remote repository, and consume changes as the other branch moves on. This is useful if you have a long-running UAT branch which needs to be refreshed periodically from a moving target; setting it up as a tracked branch means that the only thing you need to do is git pull, and you're up to date.
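As a rough illustration of that UAT scenario, the sketch below creates a scratch repository, adds a branch called uat (a hypothetical name) that tracks the local master, moves master on by one commit, and then refreshes uat with a bare git pull. The --ff-only flag is only there to keep the pull from attempting a merge commit in the demo.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name Demo
git config user.email demo@example.com
git commit -q --allow-empty -m Start

# Long-running branch that tracks the local master
git branch -q --track uat master

# master moves on
git commit -q --allow-empty -m Second

# Refresh uat: no arguments needed, the upstream is taken from the config
git checkout -q uat
git pull -q --ff-only

uat_tip=$(git rev-parse uat)
master_tip=$(git rev-parse master)
echo "uat:    $uat_tip"
echo "master: $master_tip"
```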
So, how do you set up a branch for tracking? Well, when you check out a branch from a remote master, it gets set up automatically. In fact, a tracked branch is simply one that's explicitly mentioned in the .git/config file, since that lists what its remote is and where to merge from:
    $ git clone upstream clone
    Cloning into clone...
    done.
    $ cd clone
    $ tail .git/config
    [branch "master"]
        remote = origin
        merge = refs/heads/master
The way to read this is that master is a local branch, which tracks refs/heads/master on the remote origin. Any pulls that happen on master will result in a merge (or rebase) from origin/master.
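Rather than reading .git/config by eye, you can ask Git for the same information. The following is a small sketch against a throwaway upstream and clone (the paths and the Demo identity are invented for the example); it reads the two config keys directly and then asks Git to resolve the upstream of master.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/master
git -C "$work/upstream" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m Start
git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"

# The raw entries from the [branch "master"] section
remote=$(git config --get branch.master.remote)
merge=$(git config --get branch.master.merge)

# The same information, resolved to the remote-tracking branch name
upstream=$(git rev-parse --abbrev-ref 'master@{upstream}')

echo "master tracks $merge on $remote (seen locally as $upstream)"
```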
What if we wanted to set up our experimental branch? If we just do git checkout -b experimental, it won't be tracked:
    $ git checkout -b experimental
    Switched to a new branch 'experimental'
    $ grep branch .git/config
    [branch "master"]
We can flag it as tracked using the --track option of git checkout (or its shorter -t form):
    $ git checkout master
    Switched to branch 'master'
    $ git branch -d experimental
    Deleted branch experimental (was 4a3fa88).
    $ git checkout --track -b experimental
    $ tail -3 .git/config
    [branch "experimental"]
        remote = .
        merge = refs/heads/master
    $ git pull
    From .
     * branch            master     -> FETCH_HEAD
    Already up-to-date.
Hang on, what's the remote = . doing in here? Well, that's a special shorthand meaning this repository, much like it means this directory in filesystem access. What we have here is master, and not origin/master; in other words, it's a local branch tracking another local branch. There are times when this is useful, but what if you want to track the remote one directly instead of having to pull through a local copy?
    $ git checkout master
    Switched to branch 'master'
    $ git branch -d experimental
    Deleted branch experimental (was 4a3fa88).
    $ git checkout -b experimental origin/master
    Branch experimental set up to track remote branch master from origin.
    Switched to a new branch 'experimental'
    $ tail -3 .git/config
    [branch "experimental"]
        remote = origin
        merge = refs/heads/master
Now we have a branch experimental which is tracking origin/master. When we have an update in the upstream repository, and do a pull, we see it updating both origin/master and experimental:
    $ git pull
    remote: Counting objects: 3, done.
    remote: Compressing objects: 100% (2/2), done.
    remote: Total 2 (delta 0), reused 0 (delta 0)
    Unpacking objects: 100% (2/2), done.
    From upstream
       4a3fa88..55eb534  master     -> origin/master
    Updating 4a3fa88..55eb534
    Fast-forward
     0 files changed, 0 insertions(+), 0 deletions(-)
     create mode 100644 third
The update shows origin/master being updated to the new value. The subsequent step is the updating and fast-forward of the local experimental branch. But what of the local master?
    $ git log --oneline experimental
    55eb534 Third
    4a3fa88 Second
    ff8536c Start
    $ git log --oneline master
    4a3fa88 Second
    ff8536c Start
So although both experimental and master are tracking the same upstream branch, they can be updated and processed independently. This is useful when you want to advance the state of one branch (perhaps for experimentation purposes) but don't want to change the local state of the other.
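One way to see that independence without reading logs is to count how many commits each local branch is missing from its upstream. The sketch below is only an illustration: it recreates the situation with a throwaway upstream and clone (names and the Demo identity are invented), pulls on experimental alone, and then compares both local branches with origin/master.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/master
git -C "$work/upstream" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m Start
git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"

# Second local branch tracking the same remote branch as master
git checkout -q --track -b experimental origin/master

# A new commit appears upstream
git -C "$work/upstream" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m Third

# Pull on experimental only; the local master is left alone
git pull -q --ff-only

master_behind=$(git rev-list --count master..origin/master)
experimental_behind=$(git rev-list --count experimental..origin/master)
echo "master is behind by $master_behind, experimental by $experimental_behind"
```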
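It's possible to add upstream tracking information to an existing local branch after the fact, in recent versions of git. If we'd checked out the experimental branch as in the first step, and didn't want to delete/re-create it (perhaps because we'd made some local changes), then we can add it afterwards. (The --set-upstream flag used below is the spelling from the Git of the time; it was later deprecated and then removed in favour of git branch --set-upstream-to=origin/master experimental2, or -u for short.)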
    $ git checkout master
    Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
    $ git checkout -b experimental2
    Switched to a new branch 'experimental2'
    $ git branch --set-upstream experimental2 origin/master
    Branch experimental2 set up to track remote branch master from origin.
    $ tail -3 .git/config
    [branch "experimental2"]
        remote = origin
        merge = refs/heads/master
So even if you have existing branches, it's possible to wire them up to be tracking branches after the fact. You can also use this if you want to change which branch you're tracking (say, swapping a local branch for a remote one or vice versa) by re-running the command.
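On current versions of Git the equivalent option is --set-upstream-to (or -u). The sketch below, again against a throwaway upstream and clone with invented names, wires an existing branch up to origin/master and then re-runs the command to swap the upstream to the local master instead.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/master
git -C "$work/upstream" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m Start
git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"

# An existing branch that we now want to wire up
git checkout -q -b experimental2

# Add the upstream after the fact
git branch --set-upstream-to=origin/master experimental2
first=$(git rev-parse --abbrev-ref 'experimental2@{upstream}')

# Re-run the command to swap the remote branch for the local one
git branch --set-upstream-to=master experimental2
second=$(git rev-parse --abbrev-ref 'experimental2@{upstream}')
second_remote=$(git config --get branch.experimental2.remote)

echo "first upstream: $first, second upstream: $second (remote = $second_remote)"
```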
It's also worth mentioning that there is a --no-track option of git checkout, which can be used to prevent the tracking of branches upon checkout if that's desired. This is sometimes useful if you are consuming a feature or bugfix branch and you don't want/need to pull from it in the future.
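A short sketch of --no-track in action: the branch name bugfix and the scratch repositories are hypothetical, and the check at the end simply confirms that no [branch "bugfix"] remote was written to the config.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/master
git -C "$work/upstream" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m Start
git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"

# Start from the remote branch, but record no upstream for the new branch
git checkout -q --no-track -b bugfix origin/master

# git config exits non-zero when the key is missing, hence the || true
bugfix_remote=$(git config --get branch.bugfix.remote || true)
echo "bugfix remote: '${bugfix_remote}'"
```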
Lastly, the default behaviour is configured with the branch.autosetupmerge config option. If this option is false, then branches are never tracked by default. If the option is true (the default), then branches are tracked if they are created from a remote branch, and not tracked if they are created from a local one. If the option is always, then branches are always set up as tracked branches, regardless of whether they are local or remote. These effectively specify the defaults, but they can be overridden on a branch-by-branch basis using the --track and --no-track command line flags of the git checkout or git branch commands.
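To see the option at work, the following sketch flips branch.autosetupmerge in a throwaway clone and creates one branch under each setting. The branch names from-local and from-remote are made up for the example, and the setting is applied to the scratch repository only, not to your global config.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/master
git -C "$work/upstream" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m Start
git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"

# always: even a branch created from a local branch gets an upstream
git config branch.autosetupmerge always
git branch -q from-local master
always_remote=$(git config --get branch.from-local.remote)

# false: not even a branch created from a remote branch is tracked
git config branch.autosetupmerge false
git branch -q from-remote origin/master
false_remote=$(git config --get branch.from-remote.remote || true)

echo "always: remote = '$always_remote'; false: remote = '$false_remote'"
```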
Come back next week for another instalment in the Git Tip of the Week series.