
AlBlue’s Blog

Macs, Modularity and More

Git Tip of the Week: Git Archive

Gtotw 2011 Git

This week's Git Tip of the Week is about archives. You can subscribe to the feed if you want to receive new instalments automatically.


If you want to extract the contents of a Git repository, perhaps to make it available for a source download somewhere, then you can of course zip (or tar) up the contents of the repository with a command line tool.

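However, there's another way of doing this with a Git repository, using the git archive command. This takes the contents of a commit (or any other tree, such as a tag or branch) and generates a zip (or tar) file from it. It reads from the repository rather than from the files in your working directory.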

One key advantage of using Git to perform the archive rather than a generic archiving tool is that you avoid accidentally capturing the (large) .git directory, or any work-in-progress content. For example, if you have just run a build, then zip (or tar) will include the build output as well, whereas git archive only packages content that has been committed.
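To see the difference, here is a minimal sketch that sets up a throwaway repository, leaves some untracked build output lying around, and then lists what git archive would package. The file names, the build/ directory and the committer identity are all made up for illustration.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example"
echo "hello" > README
git add README
git commit -q -m "Initial commit"

# Simulate a build: untracked output that was never committed.
mkdir build
echo "binary" > build/output.o

# List the entries git archive writes for HEAD.
listing=$(git archive --format tar HEAD | tar tf -)
echo "$listing"
```

Only the committed README should show up in the listing; neither build/ nor .git is part of the commit, so neither ends up in the archive.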

Another advantage is that you can extract the content of the repository at an arbitrary revision. HEAD is the usual choice, but you can name any commit, tree, branch or tag, which makes it useful for generating a source tarball from a given tag (even if that tag isn't what you currently have checked out). For example, let's say we wanted to generate a source bundle from the EGit repository:


(master) $ git archive --format tar v1.0.0.201106090707-r | gzip -9 > /tmp/egit-v1.0.0.tgz
(master) $ tar tzf /tmp/egit-v1.0.0.tgz | head
.eclipse_iplog
.gitattributes
EGIT_INSTALL
LICENSE
README
SUBMITTING_PATCHES
org.eclipse.egit-feature/
org.eclipse.egit-feature/.gitignore
org.eclipse.egit-feature/.project
org.eclipse.egit-feature/.settings/
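If you don't have the EGit repository to hand, the same idea can be tried on a scratch repository. This is only a sketch: the tag name v1.0, the file names and the committer identity are invented for the example. It tags a first commit, adds more work on top, and then compares an archive of the tag with an archive of HEAD.

```shell
set -e
# Scratch repository; all names here are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example"

echo "first" > a.txt
git add a.txt
git commit -q -m "First release"
git tag v1.0                      # lightweight tag on the first commit

echo "second" > b.txt
git add b.txt
git commit -q -m "Work after the release"

# Archive the tagged revision and the current HEAD, then list each.
tagged=$(git archive --format tar v1.0 | tar tf -)
latest=$(git archive --format tar HEAD | tar tf -)
echo "v1.0 contains:"; echo "$tagged"
echo "HEAD contains:"; echo "$latest"
```

The archive of the tag should contain only a.txt, even though the checked-out branch has moved on and now has b.txt as well.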

This feature is used when browsing the contents of a repository via cgit. It's possible to click on any link (commit or branch) and download a tgz of the repository as it was at that point. All of this is powered by git archive. In fact, you can create an archive from a remote repository, without needing an explicit clone – though it's worth noting that most http repositories don't support this.


(master) $ git archive --format tar --remote ssh://server.org/path/to/git HEAD | gzip -9 > /tmp/remotearchive.tgz
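You can try this out without a server, since --remote also accepts a local path. The sketch below assumes nothing beyond a local Git installation; the repository contents and the committer identity are placeholders. It creates a small repository in one temporary directory and archives it from another, unrelated directory without cloning.

```shell
set -e
# A stand-in for the remote repository; names are hypothetical.
origin=$(mktemp -d)
git init -q "$origin"
git -C "$origin" config user.email "you@example.com"
git -C "$origin" config user.name "Example"
echo "hello" > "$origin/README"
git -C "$origin" add README
git -C "$origin" commit -q -m "Initial commit"

# Work from an unrelated directory that is not a clone of $origin.
cd "$(mktemp -d)"
git archive --format tar --remote="$origin" HEAD | gzip -9 > remotearchive.tgz
remote_listing=$(tar tzf remotearchive.tgz)
echo "$remote_listing"
```

The tree to archive (HEAD here) still has to be named explicitly, and it is resolved in the remote repository rather than the local one.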

Finally, it's possible to extract only a subset of files rather than the whole repository. If you wanted to generate only the docs for a project, and they were all present in the docs/ folder, then you could create an archive just containing that with:


(master) $ git archive --format tar HEAD docs | gzip -9 > /tmp/docs.tgz

It's fairly common to use git describe in conjunction with git archive to create the name of the output file and, optionally, the directory prefix (--prefix) that is prepended to every path in the archive:


(master) $ NAME=project-$(git describe)
(master) $ git archive --format tar HEAD docs | gzip -9 > ${NAME}-docs.tgz
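Putting the pieces together, the following sketch uses the output of git describe both for the file name and for the --prefix directory inside the archive. The project name, the v1.0 tag and the docs/ and src/ layout are assumptions made for the example, not anything taken from a real project.

```shell
set -e
# Scratch repository with a docs/ and a src/ directory (hypothetical layout).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example"
mkdir docs src
echo "manual" > docs/guide.txt
echo "code" > src/main.c
git add docs src
git commit -q -m "Initial commit"
git tag -a v1.0 -m "Release 1.0"   # git describe looks for annotated tags by default

# Name the archive after the described revision and use the same name as prefix.
NAME=project-$(git describe)
git archive --format tar --prefix="${NAME}/" HEAD docs | gzip -9 > "${NAME}-docs.tgz"
prefixed=$(tar tzf "${NAME}-docs.tgz")
echo "$prefixed"
```

Note the trailing slash on the prefix: without it, the prefix is simply glued onto the front of each file name rather than forming a top-level directory.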

Come back next week for another instalment in the Git Tip of the Week series.