AlBlue’s Blog

Macs, Modularity and More

Opening of the JDK

Java Harmony 2007

Sun have finally released the majority of the JDK under an open-source license, modulo some encumbered parts that they don't have the rights to release yet. It's an impressive turnaround, and it wouldn't surprise me if by the end of the year the encumbered parts have been resolved (either by relicensing or reimplementation). So we now have a full open-source VM to work from.

TheReg had a story entitled “What is Harmony for?” in which they postulated that an open-source JDK would mean the death of that project. In fact, nothing could be further from the truth. What will happen is that GNU Classpath and OpenJDK will share a whole whack of code, or possibly end up as a single stream. (I don't personally see GNU Classpath dying; I can see it being maintained as an active fork that's not under the control of Sun or whoever the JDK stewards end up being.)

So, you might wonder what Harmony brings to the table. Certainly, both are open-source; and if all you care about is having a Java runtime on your Linux box, either is good enough for you. Linux open-source fanatics will probably be more tempted by the GNU variety for a couple of reasons; firstly, most distributions are actually GNU/Linux (as Stallman is fond of pointing out to anyone who will listen), but secondly because the Sun codebase is the most compatible with the existing non-open-source versions (being as it was cut from the same tree).

However, scratch the terms of the license and you start to see the differences, and deeper investigation reveals what they mean. Harmony is licensed under the Apache License, and GNU Classpath and OpenJDK are licensed under the GPLv2(+classpath exception). The GPL is a viral license, meaning that anything that it touches has to be immediately licensed under the GPL; something that's kept the SCO lawyers busy in the past. You might think that's fine – after all, it's worked for Linux – but actually it takes away a certain kind of freedom; the freedom to use it how you want. It's ironic that the same people who argue that device drivers should be opened so that they can do what they want actually have a license that prohibits certain uses.

Any open-source license allows you to tinker with the code that's provided. Most of the time, they'll let you do what you want with it, provided that the changes are kept under the same license. Some explicitly force you to make the changes available; others say that you have to keep the code under that license but don't need to release it. In any of these cases, if you change the code, then you're expected to use the same terms and give back that code, which is reasonable enough.

But the GPL is different. All of this boils down to the poorly-defined concept of 'Linking', from the days when everything was in C. Roughly interpreted, if a program interacts with another program by means of programmatic calls, that can be considered 'Linked'. It's actually one of the grey areas of the GPL that hasn't been tested in court. If you have any code, under any license (open- or closed-source) then if you 'Link' GPL code, you've got to open up your codebase to make it available under the GPL. (Non-programmatic interfacing, such as parsing the output of a program or communicating with it via a network socket, is OK; otherwise no-one could use anything on a GNU/Linux system unless it were GPL. It will be interesting to consider whether web-services that form RPC are considered a form of linking; it's not actually linking the code but it has the same effect.)
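To make the distinction above concrete, here is a minimal Java sketch of the two interaction styles: a direct in-process call, and an arm's-length exchange that only reads another program's textual output. It illustrates the technical difference only; which side of the 'Linking' line either falls on is exactly the untested question. The class and method names are hypothetical, and running the JVM's own launcher with -version is just an assumed stand-in for "some other program".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of any JDK or Harmony API.
final class LinkingSketch {

    // Style 1: a programmatic, in-process call into library code.
    // This is the kind of interaction the text describes as 'Linked'.
    static String upperCaseInProcess(String input) {
        return input.toUpperCase(Locale.ROOT);
    }

    // Style 2: start a separate program and read only its textual output.
    // Nothing is shared with the other program except a stream of characters.
    static String firstLineOfOutput(List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();
        String firstLine;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            firstLine = reader.readLine();
            // Drain the remaining output so the child process can exit.
            while (reader.readLine() != null) {
                // discard
            }
        }
        process.waitFor();
        return firstLine;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(upperCaseInProcess("linked"));

        // Assumption: the running JVM's launcher lives at <java.home>/bin/java.
        String launcher = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        System.out.println(firstLineOfOutput(Arrays.asList(launcher, "-version")));
    }
}
```

In the first style the caller and the library share one address space and one set of types; in the second, the only contract is the text that comes out of the pipe.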

This causes a problem with libraries. If libraries such as an XML parser were made available under such a license, then any program that used them would have to be GPL as well. This would severely limit the applicability of any library, and so the LGPL was created. Originally standing for the Library GPL, it's now the Lesser GPL. The key distinction is that the LGPL gives you back a freedom to link that code in with other non-GPL codebases. LGPL can be used with either GPL or non-GPL programs, and everyone benefits from having a single library codebase to work with (and to send problem reports and fixes to).

(From GPLv2):

This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License.

Unfortunately, the GNU Classpath project dropped the ball. They wanted a strong GPL system, but that would have meant that any Java program that ran (aka Linked) on the VM would have to be released as GPL. In essence, the JVM is nothing more than a big library, and running Java code is considered Linking with that library. (Even if that doesn't turn out to be the case in a court of law, there's sufficient ambiguity in the concept of Linking that a competitor might take someone to court on the chance that it would be in order to shut a competitor down. I'll leave it up to you to insert your favourite competitor here.)

So GNU Classpath created the “Classpath exception”. Specifically, the whole is considered GPL, but as long as what you're talking to is a Classpath-excepted object, you don't have to be GPL. It's almost, but not quite, like making every individual Java API class under the LGPL whilst keeping the VM under GPL. The upshot is instead of considering it one library, you have to consider it as a VM and 1500 supporting libraries, each of which may either be GPL or GPL+Classpath. In fact, Sun don't give you any more information; the download page claims that it is GPL but that some of the content is GPL+Classpath, and it's up to you to manually inspect every one of those source files to know whether what you're talking to is GPL or not.

Worse than that; what happens in the case of dynamically loaded libraries that are invoked via System.loadLibrary()? What do they fall under? To be safe, you'd have to assume GPL; and (for example) this prohibits running any SWT code on OpenJDK. In addition, the concept of 'derived work' may be liberally taken to include subclassing existing objects (after all, that's what deriving a subclass is) which is pretty legally vague.
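For readers who haven't met the two mechanisms just mentioned, the following sketch shows what they look like in code: a class that subclasses a type from the class library, and a guarded call to System.loadLibrary(). It shows the mechanics only and says nothing about how any license treats them. The class names and the library name are hypothetical; the library is assumed not to exist, so the load is expected to fail with UnsatisfiedLinkError.

```java
import java.util.AbstractList;

// Hypothetical illustration class; the names here are not from any real project.
final class DerivedWorkSketch {

    // A subclass of a class-library type: this is 'deriving' in the
    // object-oriented sense the text refers to.
    static final class FixedList extends AbstractList<String> {
        private final String[] items;

        FixedList(String... items) {
            this.items = items.clone();
        }

        @Override
        public String get(int index) {
            return items[index];
        }

        @Override
        public int size() {
            return items.length;
        }
    }

    // Tries to load a native library by name at run time.
    // Returns true if it loaded, false if the JVM could not find it.
    static boolean tryLoadNative(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError missing) {
            return false;
        }
    }

    public static void main(String[] args) {
        FixedList list = new FixedList("a", "b");
        // contains() is inherited from the library's own implementation.
        System.out.println(list.contains("b"));

        // Assumed-missing library name, used only to exercise the call.
        System.out.println(tryLoadNative("hypothetical_native_lib_for_illustration"));
    }
}
```

Note that FixedList only supplies get() and size(); everything else it can do, such as contains() and iteration, runs code inherited from the library class, which is why the 'derived work' question is not purely academic.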

Now think outside the box. What if you were amending the JVM to run with other languages, like JRuby? Or there was a need to introduce a new bytecode, like invokedynamic into the underlying architecture? For sure, you'd have to go with the license of the VM itself, regardless of what other classes might be doing in the classpath libraries. And hey, bringing in a JRuby script suddenly becomes GPL-infected because now the JRuby runtime is also GPL-infected. It's going to kill off innovation on the dynamic front.

Interestingly enough, there are two VMs that don't suffer from this problem. The first is Harmony, which is under the more permissive Apache License. Not only that, but it's already designed as a modular architecture (which is something that Sun has only recently noticed needs to be fixed, but they're too busy inventing their own modular architecture for future bloatware) and you can elect to run with as many (or as few) components as you like. In order to be called Java, it will have to pass the TCK (which Sun have yet to offer under the terms of the JSR) so any such cut-down VMs aren't going to be called Java. But anyone can innovate with Harmony; whether it's bringing on a new language runtime (like Scala) or innovations in the memory/garbage collection algorithms (and some would say that the current JVM's collection algorithms are garbage). And all of these can be proprietary or released back as Apache license, and anyone making any open-source project can use them, not just those in the GPL fan-club. They can also be sold and distributed with commercial products; witness the bundling of eclipse-harmony and all of the pre-pressed CDs being given out at JavaNetBeansOne.

The other VM is Microsoft's CLR (or Mono, its open-source equivalent). This doesn't have such draconian restrictions on where it can be used; in fact, the whole Silverlight technology is venturing back into the browser world which gave Java its first big outing. My suspicion is that this will be an interesting area for research, if only because there are far more supported languages available for the CLR than there are for the JVM; and there are some interesting research issues about the common type system of the DLR, which I've written about previously.

Meanwhile, the final draft of GPLv3 has been released, and allegedly is Apache License compatible, although the community has yet to decide if that's the case. Whilst it is an interesting note, the same problems can arise with the GPL as before, and since the OpenJDK codebase isn't GPL (but GPL+Classpath) it remains to be seen if anyone thinks that's a good thing, or what the adoption rate is (though one would expect in the course of time for others to go down the GPLv3 road if for nothing else than keeping up with the latest version number). Unfortunately, none of this changes the viral nature of the GPL, nor the way in which programs are derived or linked.

In one way, the fact that OpenJDK is under the GPL is good. It means that finally Java will become more prevalent on Linux systems, where before it was very much ignored. (Whether it's too late remains to be seen.) It also means that Harmony will continue to run, seeing as the freedoms that Harmony provides aren't covered by the GPL. As usual, it does amount to a significant amount of continued work; but at least Harmony is showing the way in the form of a modularised Java engine. If Sun had released OpenJDK under an Apache License (or even EPL) then perhaps we wouldn't still be in an innovative position today; and that's something worth working for.