AlBlue’s Blog

Macs, Modularity and More

Using EMF for OSGi service creation

2010 Eclipse

I have never really understood the benefits of model-driven development, and although I have looked at EMF and friends (briefly) in the past, I've never really found it useful. Like other styles of programming, model-driven development is useful for solving certain types of problems; but you can get away without it quite easily. The same can also be said of dependency injection, modularisation, test-driven development and so forth; each of those provides advantages which aren't immediately apparent to the novice user.

So I figured I'd give EMF another spin to see if I could use EMF for OSGi service creation.

Models, diagrams, genmodels - oh my!

The first hurdle to cross is understanding the different kinds of files in an EMF project, and the reason why they're all needed.

  • ecore - stores the model representation itself
  • ecorediag - used if you want to create a graphical representation of the models (far easier than the point-and-click editor)
  • genmodel - configuration for determining how the ECore file is translated into Java source code

Why do you need so many individual files for representing and generating the models? Well, one argument is that they're each specifically suited to one type of job; the ECore could be used to drive other types of code generation (into C++, say) and the diagram makes it easier to see what's happening (but isn't strictly necessary). The GenModel contains configuration options that are used by the generation process to customise the output.

This won't be a full tutorial on EMF; for that, have a look at Lars Vogel's excellent tutorials on EMF and others.

Out of the box, EMF generates a separate interface and implementation class for each type in the model. That's good, but it uses a few oddities which may not be to everyone's tastes.

  • The interface extends EObject
  • The implementation class is called ...Impl
  • The implementation class is put in the impl package

None of these are show-stoppers, but they are different from the standard Eclipse mechanisms of using an I prefix for interfaces, and using an unadorned name for the class.
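To make the difference concrete, here is a minimal sketch of the two shapes side by side. Everything in it is an assumption for illustration: the Book type and its title attribute are hypothetical, the EObject declared here is only an empty stand-in for the real EMF interface so the sketch compiles on its own, and the two styles are wrapped in holder classes purely so they fit in one file.

```java
// Stand-in for org.eclipse.emf.ecore.EObject so the sketch compiles without EMF.
// The real interface is far larger; only its presence as a supertype matters here.
interface EObject {
}

// Roughly the shape of EMF's default output for a hypothetical "Book" model class.
final class DefaultStyle {
    // The generated interface extends EObject, so EMF is visible to every client.
    interface Book extends EObject {
        String getTitle();
        void setTitle(String value);
    }

    // The implementation gets an "Impl" suffix (and would live in an "impl" package).
    static class BookImpl implements Book {
        private String title;

        public String getTitle() {
            return title;
        }

        public void setTitle(String value) {
            title = value;
        }
    }
}

// The shape the usual Eclipse naming convention would suggest instead.
final class EclipseStyle {
    // "I" prefix on the interface, and no EMF supertype.
    interface IBook {
        String getTitle();
        void setTitle(String value);
    }

    // Unadorned class name (and typically an "internal" package).
    static class Book implements IBook {
        private String title;

        public String getTitle() {
            return title;
        }

        public void setTitle(String value) {
            title = value;
        }
    }
}
```

A client holding an EclipseStyle.IBook has no way of telling from the type alone that a modelling framework sits behind it, whereas every DefaultStyle.Book is also an EObject.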

Not like that, like this...

Fortunately, some of these can be addressed with customisations made to the genmodel file. For example, the names of the interface and class, as well as the package, can be adjusted by making the following modifications to the genmodel source:

<genmodel:GenModel interfaceNamePattern="I{0}" classNamePattern="{0}"
rootExtendsInterface=""...>
  <genPackages classPackageSuffix="internal" ...>

It's possible to change this in the drop-down properties list as well, by selecting the Interface Name Pattern and Class Name Pattern in the “Model” section, and the Implementation in the “Package names” section of the package(s) on the genmodel.

Having customised this, it's now necessary to remove traces of EMF from the public interface. By default, it will set up the interface with an EObject parent. In a number of cases, this isn't desirable, since you may not want to expose the fact that EMF is behind the implementation. Fortunately, you can change the Root Extends Interface in the “Model Class Defaults” of the properties. Changing it to empty (which gives you rootExtendsInterface="" in the genmodel) gives you plain interfaces for the objects in question.

The next problem is the factory. This is used for creation of the concrete nodes, much like the DocumentBuilderFactory works for XML documents in Java. By default, it extends EFactory, an EMF-specific interface, which again leaks implementation details out. This can be removed by setting the “Suppress EMF Metadata” option, or by setting suppressEMFMetaData="true" in the genmodel.

Hitting the brick wall

We're now at a state where the generated interfaces are almost completely EMF free. We have a factory and a type which are both specified with “pure” interfaces.

However, any abstract factory needs a way of acquiring the factory in the first place. Such factory factories exist (hello DocumentBuilderFactory), but there are many ways of acquiring one: a property set on the JVM, injection in Spring, or even an OSGi service lookup (coupled with a declarative services registration).

EMF, on the other hand, takes an in-your-face approach to providing the factory factory by virtue of a public static final field to an internal class. Although this doesn't use EMF visibly from the outside, the class is still in the internal package (which shouldn't be exported) and leaks the EMF inheritance via many of the methods on that type.

The documentation notes that it's possible to generate API with no EMF dependencies but the existence of this field directly contradicts that. Worse, this field is on the interface of the factory, not even a class. Now whilst it's possible to have multiple factories (or additional ones to the first), the 'default' one is used in many places and is not going to change (WONTFIX).
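The pattern being objected to looks roughly like the sketch below. This is not copied from EMF's generated output: the type names, the INSTANCE field name and the initialiser are all assumptions, and the "internal" classes are ordinary stand-ins so the example compiles without EMF. What it tries to show is only the coupling: a constant on the public interface that names a class from the internal package.

```java
// Hypothetical public API types, using the "I" prefix naming pattern.
interface IBook {
    String getTitle();
    void setTitle(String value);
}

interface IBookFactory {
    // The problem: a constant on the *interface* that refers to a class from
    // the internal package. Every client compiled against IBookFactory is now
    // tied to InternalBookFactory. (The field name is an assumption.)
    IBookFactory INSTANCE = new InternalBookFactory();

    IBook createBook();
}

// Stand-in for the generated factory in the non-exported "internal" package.
// In real generated code this class would also extend an EMF base class,
// which is how the EMF inheritance leaks out through the field above.
class InternalBookFactory implements IBookFactory {
    public IBook createBook() {
        return new InternalBook();
    }
}

// Stand-in for the generated implementation class.
class InternalBook implements IBook {
    private String title;

    public String getTitle() {
        return title;
    }

    public void setTitle(String value) {
        title = value;
    }
}
```

Because interface fields are implicitly public, static and final, there is no way for a different bundle to substitute its own factory behind INSTANCE; the choice is baked in when the interface is compiled.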

As if that wasn't enough, the generated bundle that EMF spits out has Bundle-SymbolicName: ..singleton:=true which prevents multiple instances being installed at the same time. There's a lot of software that does have that constraint – the Eclipse UI is mostly littered with singletons to prevent multiple options being available – but for generic OSGi services, that's not acceptable. On top of that, it leaks out a dependency on EMF even when it doesn't need to:

Bundle-SymbolicName: ..singleton:=true
Require-Bundle: org.eclipse.core.runtime,
 org.eclipse.emf.ecore;visibility:=reexport

Ultimately, if there were configurable ways of providing the factory – like registering an OSGi service, or providing a class method (instead of a static variable) which could do service lookups – then the approach might be more useful. But as it is, there are too many OSGi anti-patterns in the generated code to make using EMF for OSGi services practical for general use.
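One way to read the "class method instead of a static variable" idea is sketched below. It is not something EMF generates, and BookFactories is a hypothetical in-memory stand-in for a service registry rather than an OSGi API; in a real bundle the lookup would presumably be delegated to the OSGi service registry or wired up through declarative services. The point is only that a method can return whichever provider is registered at the time of the call, while a constant cannot.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical public API types with no EMF supertypes.
interface IBook {
    String getTitle();
    void setTitle(String value);
}

interface IBookFactory {
    IBook createBook();
}

// Hypothetical provider-side implementation; it would live in a non-exported package.
class SimpleBook implements IBook {
    private String title;

    public String getTitle() {
        return title;
    }

    public void setTitle(String value) {
        title = value;
    }
}

// Stand-in for a service registry. Providers register and unregister
// themselves; clients ask for the current factory each time they need one.
final class BookFactories {
    private static final AtomicReference<IBookFactory> CURRENT = new AtomicReference<>();

    private BookFactories() {
    }

    // Called by a provider when it becomes available.
    static void register(IBookFactory factory) {
        CURRENT.set(factory);
    }

    // Called by a provider when it goes away; a no-op if it is not the current one.
    static void unregister(IBookFactory factory) {
        CURRENT.compareAndSet(factory, null);
    }

    // A method rather than a constant: the answer can change between calls,
    // and may be empty if no provider is currently registered.
    static Optional<IBookFactory> lookup() {
        return Optional.ofNullable(CURRENT.get());
    }
}
```

A client would call BookFactories.lookup() at the point of use and handle the empty case, instead of caching a factory in a field for the lifetime of the JVM.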

But what about E4?

Doesn't E4 suffer from these problems? It uses EMF all over the place. Well, yes, it does. There are a few options you can take if you want to use EMF in this way:

  • Live with the EMF exports and leaks, and turn everything into models
  • Manually remove the leaks each time you regenerate the source
  • Don't use EMF for representing a separate interface/model dichotomy and write your own interfaces

As it happens, most of the Eclipse workbench is driven by singletons (there's only one workbench, for example – something that's caused problems for RAP style solutions). So the fact that there's now only one implementation available in the OSGi runtime isn't a significant downside for those bundles wishing to use it. In any case, most of the UI is the same.

This is of course one reason why the dreaded “Restart Eclipse now” message gets popped up whenever you install something into Eclipse – you just can't update singletons without taking them and everything they depend on down and back up again. (There are also some issues with native code dependencies, so hybrid bundles like SWT may not be updatable in an OSGi runtime in any case.)

But for developing well-behaved OSGi services, you really don't want to be caught by the same kind of self-imposed restrictions that led to the state of the world in Eclipse. Yes, it might work for them – but even the E4 world, with its new service-based (instead of singleton-based) APIs, still suffers from the same singleton aspect.

So, is MDD or EMF bad?

Well, neither really. Model driven design is a way of amplifying small models (text or otherwise) into vast quantities of generated code. I'll bet you use this all the time, in fact. Hibernate is an example – given a model (hibernate configuration file), it generates a set of files (DDL, SQL and the like) based on options you specify. For Oracle, it'll generate one set of files; for DB2, another. With other DB access tools, you can even use it to generate the DDL which you can save to files.

Model driven design is far more popular as a runtime aspect than a compile-time aspect, though. And whilst it's possible to use EMF as a run-time model, most of the uses and examples you'll find are at the compile-time level. A key distinction is that it's usually possible to introduce models at runtime without having to change the build process, whereas for models at compile time you either need generated code checked in or a modification to the build process to run the model generation.

And let's not forget that EMF is pretty powerful at generating EMF models. Many don't like the verbosity of XML that EMF uses to store that information; in fact, XML permeates the entire EMF infrastructure (which is why models need to have a namespace and namespace URI – the EMF model editor doesn't complain if they're not there, but it will cause problems for code generation if they're missing). There are also a lot of projects using EMF (though the old adage that 90% of them are automatically generated usually comes up), so that many people can't be wrong.

The real problem is that there's a difference between “Generate Java code” and “Generate EMF models”. Many get frustrated at EMF, not because EMF isn't powerful, but because by default it generates EMF models rather than the models you want it to generate. And that keeps putting people off using EMF.

It's possible to change EMF – after all, the templates are available in source form (see the templates/model directory in org.eclipse.emf.codegen.ecore_*.jar for more). And there's a project called Acceleo which allows arbitrary model-to-something translations. Sadly, there's very little documentation available on the Eclipse Wiki. You'd be hard pushed to think it wasn't a vapourware product – except that it's been around for a few years at their old website, which explains it in a bit more detail. Update: There is documentation in the help pages at http://help.eclipse.org.

(Incidentally, this does seem to be a trend for modelling projects at Eclipse; there's about as much documentation as there are model definitions. Pity there's no way of automatically generating documentation from models :-)

Summary

EMF is seductively simple in terms of being able to generate a model and then use that to generate a number of classes. But the generated output is really just another EMF model; don't try and pretend it's something else. Using EMF as an OSGi service is not a practical solution for general cases; Eclipse's workbench of existing singletons is a special case.

However, if you can find the right model generation – which after all, exists today in the form of New Project wizards, maven archetype generation and the like – then having a model-driven design becomes more practical. Additionally, a model-based approach allows you to regenerate after the initial generation (something you can't easily do for archetypes or new projects).