I gave a lightning talk at EclipseCon Europe 2013 on “Embedding JGit”, covering the different levels of JGit integration. Here are the slides and a rough transcript of the talk; when the Eclipse Foundation YouTube video is up, I’ll link to it here as well.
Level Zero
----------
Since JGit is an executable, you can simply fork out using `Runtime.getRuntime().exec()`
or use `ProcessBuilder` to execute a JGit command, e.g.:
```java Level0.java
Runtime.getRuntime().exec("java -jar jgit.sh --git-dir /tmp/repo init");
```
The <a href="http://search.maven.org/remotecontent?filepath=org/eclipse/jgit/org.eclipse.jgit.pgm/3.1.0.201310021548-r/org.eclipse.jgit.pgm-3.1.0.201310021548-r.sh">`jgit.sh`</a>
is also an executable shell script, so if you are
running on a Unix system you can invoke `./jgit.sh` directly.
Of course, this is cheating somewhat since the JGit executable isn't really
embedded; but this can be useful for applications that are sensitive to
memory pressure or where the execution can be done on a cloud host.
This approach has a number of advantages, specifically that the embedder
already knows how to use it since JGit provides a (sub)set of the standard
git commands.
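
The fork-out approach can be sketched with `ProcessBuilder` so that both the exit code and the printed output are available to the caller. This is a minimal sketch, not part of JGit: the class name `ForkedCommand` and its `Result` holder are hypothetical names chosen for illustration, and the command line is whatever the caller supplies.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a JGit class): runs an external command,
// waits for it to finish, and returns its exit code and output.
final class ForkedCommand {

    /** Exit code plus combined stdout/stderr text of a finished process. */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static Result run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so a single stream has to be drained.
        builder.redirectErrorStream(true);
        Process process = builder.start();

        // Drain the output before waiting, otherwise a chatty child
        // process can block on a full pipe buffer.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream in = process.getInputStream()) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        int exitCode = process.waitFor();
        return new Result(exitCode, new String(buffer.toByteArray(), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: ForkedCommand <command> [args...]");
            return;
        }
        Result result = run(Arrays.asList(args));
        System.out.print(result.output);
        System.out.println("exit code: " + result.exitCode);
    }
}
```

Assuming `jgit.sh` is in the working directory, running it with the arguments `java -jar jgit.sh --git-dir /tmp/repo init` would execute the same command as the one-liner above, with the output captured rather than discarded.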
Level One
---------
If you need to embed JGit in an existing Java process, then it's possible
to use the program main class `org.eclipse.jgit.pgm.Main` and invoke the
`main` method. The arguments can then be passed in as an array of Strings.
This has the advantage that executing JGit doesn't require spinning up a
new JVM process, and as such, it can turn around multiple requests faster.
It's still necessary to parse the output from the command using stream
parsing to know anything other than 'success' or 'not success' (since the
return code from the `main` method will indicate that already).
```java Level1.java
import org.eclipse.jgit.pgm.Main;

Main.main(new String[] { "--git-dir", "/tmp/repo/.git", "show", "HEAD" });
```
There are still some optimisations that get missed out if using this level; specifically, the JGit libraries have to parse the contents of the repository repeatedly as no information is shared between the runs.
Note that the repository passed here has to have the .git directory specified.
Level Two
---------
The most popular way of interacting with JGit involves using the `Git`
class to wrap a repository and to provide a set of porcelain commands.
This is a set of commands that roughly mirror the high-level commands that
are given at the command line; for example, `.add()` or `.log()`.
```java Level2.java
import java.io.File;
import org.eclipse.jgit.api.Git;

Git git = Git.open(new File("/tmp/repo/.git"));
git.clean();
git.lsRemote();
git.log();
```
The advantages of using the `Git` class are that you get to re-use the same
repository between invocations, so subsequent commands may be faster. You also
have IDE completion and compile time correctness for the arguments, as opposed
to the untested strings in the prior examples.
To invoke the command the <em>builder</em> pattern is used; the result from
the `.clean()` is actually a `CleanCommand`. So to invoke it, you need to
invoke the `.call()` method, after providing any necessary arguments:
```java Level2.java
git.clean().setCleanDirectories(true).setIgnore(true).call();
git.lsRemote().setRemote("origin").setTags(true).setHeads(true).call();
```
Although the builder allows an arbitrary number of arguments to be built up over repeated calls, care must be taken to ensure that any required arguments are set up appropriately.
Level Three
-----------
The `Git` API provides a high-level overview to commands in a portable fashion,
allowing for the building of porcelain (high-level commands that
operate on lower layers called the plumbing).

To go one level further down, an instance of `Repository` is used.
This is typically constructed using a `FileRepositoryBuilder`,
which again uses a builder pattern to instantiate a repository. This repository
can then be re-used across multiple commands, and can be served via JGit using
something like the `org.eclipse.jgit.http.server.glue.MetaServlet` class.

The `Repository` doesn’t provide much information on its own; it provides
a means to evaluate certain tree-ish expressions such as `HEAD` and `master~2`.
However, if you just need to know what the current branch is or get a list of
tags, the `Repository` is all you need.
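```java Level3.java
import java.io.File;
import java.util.Map;
import org.eclipse.jgit.lib.Ref;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.storage.file.FileRepositoryBuilder;

Repository repository = FileRepositoryBuilder.create(new File("/tmp/repo/.git"));
Map<String, Ref> tags = repository.getTags();
Map<String, Ref> refs = repository.getAllRefs();
String currentBranch = repository.getBranch();
Ref HEAD = repository.getRef("HEAD");
repository.open(HEAD.getObjectId()).copyTo(System.out);
```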
Level Four
----------
Interacting with the `Repository` will only give read-only information, and
only allow getting objects that are already known. To find out information
from a path level or commit level, a couple of iterators must be used, known
as `RevWalk` (commit iterator) and `TreeWalk` (path/directory iterator).
To implement a `log` like command, you can get a `RevWalk` on the repository
and then iterate over commits. To express a start point, the walker needs to
know what commits are included (and also, what commits are excluded).
```java Level4.java
RevWalk rw = new RevWalk(repository);
// resolve() returns an ObjectId, not a Ref
ObjectId HEAD = repository.resolve("HEAD");
rw.markStart(rw.parseCommit(HEAD));
Iterator<RevCommit> it = rw.iterator();
while (it.hasNext()) {
  RevCommit commit = it.next();
  System.out.println(commit.abbreviate(6).name()
    + " " + commit.getShortMessage());
}
rw.dispose();
```
To get information about a specific path, the `TreeWalk` is used against a
single commit:
```java Level4.java
TreeWalk tw = new TreeWalk(repository);
ObjectId tree = repository.resolve("HEAD^{tree}");
tw.addTree(tree); // tree '0'
tw.setRecursive(true);
tw.setFilter(PathFilter.create("some/file"));
while (tw.next()) {
  ObjectId id = tw.getObjectId(0);
  repository.open(id).copyTo(System.out);
}
tw.release();
```
Although this may look like a complex way of processing commits and directories,
this maps to the underlying Git representation in an efficient manner. It also
permits the ability to walk through multiple trees or ranges of commits in
a single pass; additional filters such as an
<a href="http://download.eclipse.org/jgit/docs/latest/apidocs/org/eclipse/jgit/revwalk/filter/AuthorRevFilter.html">`AuthorRevFilter`</a> or
<a href="http://download.eclipse.org/jgit/docs/latest/apidocs/org/eclipse/jgit/revwalk/filter/CommitTimeRevFilter.html">`CommitTimeRevFilter`</a>
can be used to restrict the ranges of commits, or similarly for the paths with
subclasses of
<a href="http://download.eclipse.org/jgit/docs/latest/apidocs/org/eclipse/jgit/treewalk/filter/TreeFilter.html">TreeFilter</a>.
Note that the walkers should be released/disposed at the end of the use to
ensure that they do not retain information (and thus memory) that may be no
longer of interest. Also note that the walkers are not thread-safe, so should
only be invoked within a single thread.
Level Five
----------
Finally, to get objects in and out of a Git repository requires the use of
an <a href="http://download.eclipse.org/jgit/docs/latest/apidocs/org/eclipse/jgit/lib/ObjectInserter.html">`ObjectInserter`</a> and an
<a href="http://download.eclipse.org/jgit/docs/latest/apidocs/org/eclipse/jgit/lib/ObjectReader.html">`ObjectReader`</a>.
Knowledge of these is outside the scope of this tutorial, but here's how to
do a "Hello World" with JGit:
```java Level5.java
ObjectId hello = repository.newObjectInserter().insert(Constants.OBJ_BLOB,
  "hello world".getBytes("UTF-8"));
repository.newObjectReader().open(hello).copyTo(System.out);
```
Note that objects inserted into a Git repository become eligible for garbage collection unless they are referred to via a commit and a tree that is reachable from a ref in the repository.
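
To make the `ObjectId` returned by the inserter less opaque, the sketch below computes a blob identifier by hand using only the JDK. It assumes a repository using Git's SHA-1 object format, where a blob's id is the SHA-1 of the header `blob <size>` followed by a NUL byte and then the content. The class name `BlobId` is a hypothetical helper for illustration and is not part of JGit.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not a JGit class): computes the id Git assigns
// to a blob, assuming the SHA-1 object format.
final class BlobId {

    static String of(byte[] content) throws NoSuchAlgorithmException {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        // Header: object type, a space, the content length in decimal, then NUL.
        String header = "blob " + content.length + "\0";
        sha1.update(header.getBytes(StandardCharsets.US_ASCII));
        sha1.update(content);

        // Render the 20-byte digest as 40 lowercase hex characters.
        StringBuilder hex = new StringBuilder();
        for (byte b : sha1.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(of("hello world".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Because the id depends only on the type, length and content, inserting the same bytes twice yields the same object, which is why the inserted blob can be opened again using nothing but the returned id.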