I gave a lightning talk at EclipseCon Europe 2013 on “Embedding JGit” talking about the different levels of JGit integration. Here are the slides and a rough transcript of the talk; when the Eclipse Foundation YouTube video is up, I’ll link to it here as well.
Since JGit is an executable, you can simply fork out using ProcessBuilder to execute a JGit command. The JGit download (jgit.sh) is actually an executable shell script as well, so if you are running on a Unix system then you can invoke ./jgit.sh directly.
Of course, this is cheating somewhat since the JGit executable isn’t really embedded; but this can be useful for applications that are sensitive to memory pressure or where the execution can be done on a cloud host.
This approach has a number of advantages, specifically that the embedder already knows how to use it since JGit provides a (sub)set of the standard git commands.
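As a minimal sketch of the fork-out approach described above, the following builds a command line and runs it with ProcessBuilder. The jar location (jgit.sh in the working directory), the repository path /tmp/repo/.git, and the class and method names are placeholders chosen for illustration, not part of JGit; --git-dir is the JGit command-line option for pointing at a repository.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class JGitFork {
    /** Builds "java -jar <jgitJar> --git-dir <gitDir> <args...>" as an argument list. */
    static List<String> command(String jgitJar, String gitDir, String... args) {
        List<String> cmd = new ArrayList<>(
                Arrays.asList("java", "-jar", jgitJar, "--git-dir", gitDir));
        cmd.addAll(Arrays.asList(args));
        return cmd;
    }

    /** Runs the command, echoes its combined output, and returns the exit code. */
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd)
                .redirectErrorStream(true) // merge stderr into stdout
                .start();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical paths: adjust to where jgit.sh and the repository live.
        int exit = run(command("jgit.sh", "/tmp/repo/.git", "log"));
        System.out.println("jgit exited with " + exit);
    }
}
```

Anything beyond the exit code still has to be recovered by parsing the text that the child process prints.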
If you need to embed JGit in an existing Java process, then it’s possible to use the program main class org.eclipse.jgit.pgm.Main and invoke the main method. The arguments can then be passed in as an array of Strings. This has the advantage that executing JGit doesn’t require spinning up a new JVM process, and as such, it can turn around multiple requests faster. It’s still necessary to parse the output from the command using stream parsing to know anything other than ‘success’ or ‘not success’ (since the outcome of invoking the main method will indicate that already).
There are still some optimisations that get missed out if using this level; specifically, the JGit libraries have to parse the contents of the repository repeatedly as no information is shared between the runs.
Note that the repository passed here has to have the .git directory specified.
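A rough sketch of the in-process invocation described above might look like the following. It assumes the org.eclipse.jgit.pgm artifact and its dependencies are on the classpath; the helper method, the class name and the /tmp/repo/.git path are made up for illustration.

```java
import org.eclipse.jgit.pgm.Main;

class JGitMainExample {
    /** Prefixes a jgit command with --git-dir so it runs against the given repository. */
    static String[] arguments(String gitDir, String... command) {
        String[] argv = new String[command.length + 2];
        argv[0] = "--git-dir";
        argv[1] = gitDir; // must point at the .git directory itself
        System.arraycopy(command, 0, argv, 2, command.length);
        return argv;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical repository location; the command writes its output
        // to the standard output of the current process.
        Main.main(arguments("/tmp/repo/.git", "log"));
    }
}
```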
The most popular way of interacting with JGit involves using the Git class to wrap a repository and to provide a set of porcelain commands. This is a set of commands that roughly mirror the high-level commands that are given at the command line; for example, the clean command is available as a .clean() method.
The advantages of using the Git class are that you get to re-use the same repository between invocations, so subsequent commands may be faster. You also have IDE completion and compile time correctness for the arguments, as opposed to the untested strings in the prior examples.
To invoke the command the builder pattern is used; the result from .clean() is actually a CleanCommand. So to invoke it, you need to call the .call() method, after providing any necessary arguments.
Although the builder allows an arbitrary number of arguments to be built up over repeated calls, care must be taken to ensure that any required arguments are set up appropriately.
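To make the builder pattern concrete, here is a small sketch that asks the clean command what it would remove, without removing anything. It assumes JGit 4.0 or later (where Git can be used in try-with-resources; older releases needed an explicit close); the class name, method name and path are illustrative only.

```java
import java.io.File;
import java.util.Set;
import org.eclipse.jgit.api.Git;

class GitPorcelainExample {
    /** Returns the paths that "clean" would delete, without deleting them. */
    static Set<String> cleanDryRun(File workTree) throws Exception {
        try (Git git = Git.open(workTree)) {
            return git.clean()        // returns a CleanCommand builder
                    .setDryRun(true)  // configure it before invoking
                    .call();          // nothing happens until call()
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical working tree location.
        for (String path : cleanDryRun(new File("/tmp/repo"))) {
            System.out.println("would remove " + path);
        }
    }
}
```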
The Git API provides a high-level interface to commands in a portable fashion, allowing for the building of porcelain (high-level commands that operate on lower layers called the plumbing).
To go one level further down an instance of Repository is needed. This is typically constructed using a FileRepositoryBuilder which again uses a builder pattern to instantiate a repository. This repository can then be re-used across multiple commands, and can also be served via JGit itself. Repository doesn’t provide much information on its own; it provides a means to evaluate certain tree-ish expressions through its resolve method. However, if you just need to know what the current branch is or get a list of refs, then Repository is all you need.
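The following sketch shows one way the repository level could be used: building a Repository from a .git directory, then asking for the current branch and resolving HEAD. It assumes JGit 4.0 or later for try-with-resources on Repository; the class name, helper names and path are placeholders.

```java
import java.io.File;
import java.io.IOException;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.storage.file.FileRepositoryBuilder;

class RepositoryExample {
    /** Opens the repository whose .git directory is given. */
    static Repository open(File gitDir) throws IOException {
        return new FileRepositoryBuilder()
                .setGitDir(gitDir)
                .build();
    }

    /** Describes the current branch and the commit it points at. */
    static String describeHead(Repository repo) throws IOException {
        String branch = repo.getBranch();
        ObjectId head = repo.resolve("HEAD"); // null when there are no commits yet
        return branch + " -> " + (head == null ? "(no commits)" : head.name());
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical repository location.
        try (Repository repo = open(new File("/tmp/repo/.git"))) {
            System.out.println(describeHead(repo));
        }
    }
}
```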
Interacting with the Repository will only give read-only information, and only allow getting objects that are already known. To find out information from a path level or commit level, a couple of iterators must be used, known as RevWalk (commit iterator) and TreeWalk (path/directory iterator).
To implement a log-like command, you can get a RevWalk on the repository and then iterate over commits. To express a start point, the walker needs to know what commits are included (and also, what commits are excluded).
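A log-like walk along these lines could be sketched as follows. It assumes JGit 4.0 or later, where RevWalk can be closed with try-with-resources (older releases used an explicit release/dispose call); the class and method names are illustrative.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.revwalk.RevWalk;

class RevWalkExample {
    /** Collects the short messages of all commits reachable from the given revision. */
    static List<String> log(Repository repo, String start) throws IOException {
        List<String> messages = new ArrayList<>();
        ObjectId startId = repo.resolve(start);
        if (startId == null) {
            return messages; // nothing to walk, e.g. an empty repository
        }
        try (RevWalk walk = new RevWalk(repo)) {
            // Included commits are marked with markStart; excluded ones
            // would be marked with markUninteresting.
            walk.markStart(walk.parseCommit(startId));
            for (RevCommit commit : walk) {
                messages.add(commit.getShortMessage());
            }
        }
        return messages;
    }
}
```

Calling log(repo, "HEAD") walks back from the current commit; marking a second commit as uninteresting would turn this into a range, much like a two-ended log on the command line.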
To get information about a specific path, the TreeWalk is used against a commit’s tree.
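As an illustration of the path-level iterator, this sketch lists the file paths in a commit’s tree, optionally restricted to a path prefix. It assumes JGit 4.0 or later for try-with-resources on the walkers; the class name, method name and the optional prefix parameter are inventions for the example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.revwalk.RevWalk;
import org.eclipse.jgit.treewalk.TreeWalk;
import org.eclipse.jgit.treewalk.filter.PathFilter;

class TreeWalkExample {
    /** Lists file paths in the tree of the given revision; prefix may be null for all paths. */
    static List<String> paths(Repository repo, String revision, String prefix) throws IOException {
        List<String> result = new ArrayList<>();
        ObjectId id = repo.resolve(revision);
        if (id == null) {
            return result; // unknown revision or empty repository
        }
        try (RevWalk revWalk = new RevWalk(repo);
                TreeWalk treeWalk = new TreeWalk(repo)) {
            RevCommit commit = revWalk.parseCommit(id);
            treeWalk.addTree(commit.getTree());
            treeWalk.setRecursive(true); // descend into directories, report files only
            if (prefix != null) {
                treeWalk.setFilter(PathFilter.create(prefix));
            }
            while (treeWalk.next()) {
                result.add(treeWalk.getPathString());
            }
        }
        return result;
    }
}
```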
Although this may look like a complex way of processing commits and directories, this maps to the underlying Git representation in an efficient manner. It also permits the ability to walk through multiple trees or ranges of commits in a single pass; additional filters such as a RevFilter can be used to restrict the ranges of commits, or similarly for the paths with a TreeFilter.
Note that the walkers should be released/disposed at the end of the use to ensure that they do not retain information (and thus memory) that may be no longer of interest. Also note that the walkers are not thread-safe, so should only be invoked within a single thread.
Knowledge of these is outside the scope of this tutorial, but a “Hello World” with JGit takes only a few lines: insert the text into the repository as an object.
Note that objects inserted into a Git repository become eligible for garbage collection unless they are referred to via a commit and a tree that is reachable from a ref in the repository.
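A “Hello World” at this level could be sketched as writing a blob straight into the object database and reading it back. This is an illustration under assumptions rather than the code from the talk: it assumes JGit 4.0 or later (ObjectInserter in try-with-resources), and the class and method names are made up.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.eclipse.jgit.lib.Constants;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectInserter;
import org.eclipse.jgit.lib.Repository;

class HelloWorldBlob {
    /** Stores "Hello World" as a blob and returns its object id. */
    static ObjectId insertHello(Repository repo) throws IOException {
        try (ObjectInserter inserter = repo.newObjectInserter()) {
            ObjectId id = inserter.insert(
                    Constants.OBJ_BLOB,
                    "Hello World".getBytes(StandardCharsets.UTF_8));
            inserter.flush(); // make the object visible to readers
            return id;
        }
    }

    /** Reads a blob back as UTF-8 text. */
    static String read(Repository repo, ObjectId id) throws IOException {
        byte[] data = repo.open(id, Constants.OBJ_BLOB).getBytes();
        return new String(data, StandardCharsets.UTF_8);
    }
}
```

The blob created here is not referenced by any tree, commit or ref, so it is exactly the kind of object that garbage collection may later remove.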