
AlBlue’s Blog

Macs, Modularity and More

Embedding JGit

2013 Eclipse Eclipsecon Jgit

I gave a lightning talk at EclipseCon Europe 2013 on “Embedding JGit”, covering the different levels of JGit integration. Here are the slides and a rough transcript of the talk; when the Eclipse Foundation YouTube video is up, I’ll link to it here as well.

Level Zero

Since JGit is an executable, you can simply fork out using `Runtime.getRuntime().exec` or use `ProcessBuilder` to execute a JGit command, e.g.:

```java
Runtime.getRuntime().exec("java -jar --git-dir /tmp/repo init");
```

The JGit executable is actually an executable shell script as well, so if you
are running on a Unix system then you can invoke it directly.

Of course, this is cheating somewhat since the JGit executable isn't really
embedded; but this can be useful for applications that are sensitive to
memory pressure or where the execution can be done on a cloud host.

The main advantage of this approach is that the embedder already knows how
to use it, since JGit provides a subset of the standard git commands.
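
The fork-and-exec approach can be sketched with `ProcessBuilder` using only the JDK. This is a minimal sketch rather than a definitive implementation: `ExternalCommand` and its `Result` type are hypothetical names introduced here, and the command line to run (including wherever the JGit executable lives on your system) is left to the caller.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of JGit): runs an external command and
// collects its exit code and combined stdout/stderr.
final class ExternalCommand {

    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static Result run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true);      // merge stderr into stdout
        Process process = builder.start();
        process.getOutputStream().close();      // the child gets no stdin
        // Read everything before waiting, so the child cannot block on a full pipe.
        String output = new String(
                process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        return new Result(exitCode, output);
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: ExternalCommand <command> [args...]");
            return;
        }
        Result result = run(Arrays.asList(args));
        System.out.print(result.output);
        System.out.println("exit code: " + result.exitCode);
    }
}
```

The exit code only distinguishes success from failure; anything more detailed has to be parsed out of `output`.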

Level One

If you need to embed JGit in an existing Java process, then it's possible
to use the program main class `org.eclipse.jgit.pgm.Main` and invoke the
`main` method. The arguments can then be passed in as an array of Strings.
This has the advantage that executing JGit doesn't require spinning up a
new JVM process, and as such, it can turn around multiple requests faster.

It's still necessary to parse the output from the command using stream
parsing to know anything other than 'success' or 'not success' (since the
return code from the `main` method will indicate that already).

```java
import org.eclipse.jgit.pgm.Main;

Main.main(new String[] { "--git-dir", "/tmp/repo/.git", "show", "HEAD" });
```

There are still some optimisations that get missed out if using this level; specifically, the JGit libraries have to parse the contents of the repository repeatedly as no information is shared between the runs.

Note that the repository passed here has to have the .git directory specified.

Level Two

The most popular way of interacting with JGit involves using the Git class to wrap a repository and to provide a set of porcelain commands. This is a set of commands that roughly mirror the high-level commands that are given at the command line; for example, .add() or .log().

```java
import java.io.File;
import org.eclipse.jgit.api.Git;

Git git = Git.open(new File("/tmp/repo/.git"));
git.clean();
git.lsRemote();
git.log();
```

The advantages of using the `Git` class are that you get to re-use the same
repository between invocations, so subsequent commands may be faster. You also
have IDE completion and compile time correctness for the arguments, as opposed
to the untested strings in the prior examples.

To invoke the command, the *builder* pattern is used; the result from
`.clean()` is actually a `CleanCommand`. So to run it, you need to
call the `.call()` method, after providing any necessary arguments.
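
A minimal sketch of this pattern, assuming JGit is on the classpath. The class name `CleanPreview` and the `/tmp/repo/.git` default are illustrative; `setDryRun` is just one of the optional arguments that a `CleanCommand` accepts before `.call()` is invoked.

```java
import java.io.File;
import java.util.Set;
import org.eclipse.jgit.api.Git;

// Illustrative class name; only Git and CleanCommand are JGit API.
final class CleanPreview {

    // Lists the untracked paths a clean would remove, without deleting anything.
    static Set<String> preview(File gitDir) throws Exception {
        Git git = Git.open(gitDir);
        try {
            return git.clean()        // returns a CleanCommand; nothing has run yet
                    .setDryRun(true)  // optional argument: report only
                    .call();          // executes the command
        } finally {
            git.getRepository().close();
        }
    }

    public static void main(String[] args) throws Exception {
        File gitDir = new File(args.length > 0 ? args[0] : "/tmp/repo/.git");
        for (String path : preview(gitDir)) {
            System.out.println("would remove " + path);
        }
    }
}
```

Dropping the `.call()` would compile without complaint but do nothing, which is the usual pitfall with this style of API.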



Although the builder allows an arbitrary number of arguments to be built up over repeated calls, care must be taken to ensure that any required arguments are set up appropriately.

Level Three

The Git API provides a high-level overview to commands in a portable fashion, allowing for the building of porcelain (high-level commands that operate on lower layers called the plumbing).

To go one level further down, an instance of `Repository` is used.

This is typically constructed using a `FileRepositoryBuilder`, which again uses a builder pattern to instantiate a repository. This repository can then be re-used across multiple commands, and can be served via JGit using something like the `org.eclipse.jgit.http.server.glue.MetaServlet` class.

The Repository doesn’t provide much information on its own; it provides a means to evaluate certain tree-ish expressions such as HEAD and master~2. However, if you just need to know what the current branch is or get a list of tags, the Repository is all you need.

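```java
import java.io.File;
import java.util.Map;
import org.eclipse.jgit.lib.Ref;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.storage.file.FileRepositoryBuilder;

Repository repository = FileRepositoryBuilder.create(new File("/tmp/repo/.git"));
Map<String, Ref> tags = repository.getTags();
Map<String, Ref> refs = repository.getAllRefs();
String currentBranch = repository.getBranch();
Ref HEAD = repository.getRef("HEAD");
```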

Level Four

Interacting with the `Repository` will only give read-only information, and
only allow getting objects that are already known. To find out information
from a path level or commit level, a couple of iterators must be used, known
as `RevWalk` (commit iterator) and `TreeWalk` (path/directory iterator).

To implement a `log` like command, you can get a `RevWalk` on the repository
and then iterate over commits. To express a start point, the walker needs to
know what commits are included (and also, what commits are excluded).

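```java
RevWalk rw = new RevWalk(repository);
ObjectId HEAD = repository.resolve("HEAD");
rw.markStart(rw.parseCommit(HEAD)); // start the walk from HEAD
Iterator<RevCommit> it = rw.iterator();
while (it.hasNext()) {
  RevCommit commit = it.next();
  System.out.println(commit.name()
    + " " + commit.getShortMessage());
}
rw.dispose();
```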

To get information about a specific path, the TreeWalk is used against a single commit:

```java
TreeWalk tw = new TreeWalk(repository);
ObjectId tree = repository.resolve("HEAD^{tree}");
tw.addTree(tree); // tree '0'
tw.setRecursive(true);
tw.setFilter(PathFilter.create("some/file"));
while (tw.next()) {
  ObjectId id = tw.getObjectId(0);
  repository.open(id).copyTo(System.out); // e.g. print the file's contents
}
tw.release();
```

Although this may look like a complex way of processing commits and directories,
this maps to the underlying Git representation in an efficient manner. It also
permits the ability to walk through multiple trees or ranges of commits in
a single pass; additional filters such as an `AuthorRevFilter` or
`CommitTimeRevFilter` can be used to restrict the ranges of commits, or
similarly for the paths with subclasses of `TreeFilter`.

Note that the walkers should be released/disposed at the end of the use to
ensure that they do not retain information (and thus memory) that may be no
longer of interest. Also note that the walkers are not thread-safe, so should
only be invoked within a single thread.
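
The start point, exclusion and filter ideas can be combined in a single walk. The following is a sketch, assuming JGit is on the classpath; `FilteredLog`, its method name and its parameters are hypothetical. Note that `Repository.resolve` returns `null` for a revision it cannot resolve, which this sketch does not guard against.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.revwalk.RevWalk;
import org.eclipse.jgit.revwalk.filter.AuthorRevFilter;

// Hypothetical helper: one line per commit reachable from `include` but not
// from `exclude`, restricted to commits whose author matches `authorPattern`.
final class FilteredLog {

    static List<String> log(Repository repository, String include, String exclude,
            String authorPattern) throws IOException {
        RevWalk rw = new RevWalk(repository);
        try {
            rw.markStart(rw.parseCommit(repository.resolve(include)));
            if (exclude != null) {
                // Commits reachable from here are left out of the walk.
                rw.markUninteresting(rw.parseCommit(repository.resolve(exclude)));
            }
            rw.setRevFilter(AuthorRevFilter.create(authorPattern));
            List<String> lines = new ArrayList<String>();
            for (RevCommit commit : rw) {
                lines.add(commit.name() + " " + commit.getShortMessage());
            }
            return lines;
        } finally {
            rw.dispose(); // drop the walker's cached state, as noted above
        }
    }
}
```

For example, `FilteredLog.log(repository, "HEAD", null, "alex")` would list every commit reachable from `HEAD` whose author line matches `alex`.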

Level Five

Finally, getting objects in and out of a Git repository requires the use of
an `ObjectInserter` and an `ObjectReader`.

Knowledge of these is outside the scope of this tutorial, but here's how to
do a "Hello World" with JGit:

```java
ObjectId hello = repository.newObjectInserter().insert(Constants.OBJ_BLOB,
  "hello world".getBytes("UTF-8"));
```

Note that objects inserted into a Git repository become eligible for garbage collection unless they are referred to via a commit and a tree that is reachable from a ref in the repository.
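
The `ObjectId` returned by the inserter is the content address that Git assigns to the blob. As an aside, here is a JDK-only sketch of how such an ID is derived, assuming the classic SHA-1 object format (a `blob <length>` header, a NUL byte, then the content). `BlobId` is a hypothetical helper introduced for illustration, not a JGit class.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: computes the hex ID Git gives a blob with this content,
// assuming SHA-1 object naming.
final class BlobId {

    static String blobId(byte[] content) throws NoSuchAlgorithmException {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        // Header: object type, a space, the content length in decimal, then NUL.
        sha1.update(("blob " + content.length + "\0").getBytes(StandardCharsets.US_ASCII));
        sha1.update(content);
        StringBuilder hex = new StringBuilder();
        for (byte b : sha1.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(blobId("hello world".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Because the ID depends only on the object type and content, inserting the same bytes twice yields the same `ObjectId`.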