AlBlue’s Blog

Macs, Modularity and More

Why ServiceCaller is better (than ServiceTracker)

2020 java osgi eclipse

My previous post spurred a reasonable amount of discussion, and I promised to also talk about the new ServiceCaller, which simplifies a number of these issues. I also thought it was worth looking at the criticisms, because they made valid points.

The first observation is that it’s possible to use both DS and ServiceTracker to track ServiceReferences instead. In this mode, the services aren’t triggered by default; instead, they only get accessed when the tracked wrapper’s getService() method is called. This isn’t the default out of the box, because you have to write a ServiceTrackerCustomizer adapter that intercepts the addingService() call to wrap the ServiceReference for future use. In other words, if you change:

serviceTracker = new ServiceTracker<>(bundleContext, Runnable.class, null);
serviceTracker.open();

to the slightly more verbose:

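serviceTracker = new ServiceTracker<>(bundleContext, Runnable.class,
  new ServiceTrackerCustomizer<Runnable, Wrapped<Runnable>>() {
    public Wrapped<Runnable> addingService(ServiceReference<Runnable> ref) {
      // Hold on to the reference only; the service itself is not acquired yet
      return new Wrapped<>(ref, bundleContext);
    }
    public void modifiedService(ServiceReference<Runnable> ref, Wrapped<Runnable> wrapped) {
      // Nothing to update; the wrapper only holds the reference
    }
    public void removedService(ServiceReference<Runnable> ref, Wrapped<Runnable> wrapped) {
      // Nothing to release; getService() ungets the service itself
    }
  });
serviceTracker.open();

static class Wrapped<T> {
  private ServiceReference<T> ref;
  private BundleContext context;
  public Wrapped(ServiceReference<T> ref, BundleContext context) {
    this.ref = ref;
    this.context = context;
  }
  public T getService() {
    try {
      return context.getService(ref);
    } finally {
      context.ungetService(ref);
    }
  }
}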

Obviously, no practical code uses this approach because it’s too verbose, and if you’re in an environment where DS services aren’t widely used, the benefits of the deferred approach are outweighed by the quantity of additional code that needs to be written in order to implement this pattern.

(The code above is also slightly buggy; we’re getting the service, returning it, then ungetting it afterwards. We should really just be using it during that call instead of returning it in that case.)

Introducing ServiceCaller

This is where ServiceCaller comes in.

The approach of the ServiceCaller is to optimise out the over-eager dereferencing of the ServiceTracker approach, and apply a functional approach to calling the service when required. It also has a mechanism to do single-shot lookups and calling of services; helpful, for example, when logging an obscure error condition or other rarely used code path.

This allows us to elegantly call functional interfaces in a single line of code:

Class<?> callerClass = getClass();
ServiceCaller.callOnce(callerClass, Runnable.class, Runnable::run);

This call looks for Runnable service types, as visible from the caller class, and then invokes the supplied function on the service that was found. You can use a method reference (as in the above case) or you can supply a Consumer<T> lambda, which will be passed the service that is resolved from the lookup.

Importantly, this call doesn’t acquire the service until the callOnce call is made. So, if you have an expensive logging factory, you don’t have to initialise it until the first time it’s needed – and even better, if the error condition never occurs, you never need to look it up. This is in direct contrast to the ServiceTracker approach (which actually needs more characters to type) that accesses the services eagerly, and is an order of magnitude better than having to write a ServiceTrackerCustomizer for the purposes of working around a broken API.

However, note that such one-shot calls are not the most efficient way of doing this, especially if it is to be called frequently. So the ServiceCaller has another mode of operation; you can create a ServiceCaller instance, and hang onto it for further use. Like its single-shot counterpart, this will defer the resolution of the service until needed. Furthermore, once resolved, it will cache that instance so you can repeatedly re-use it, in the same way that you could do with the service returned from the ServiceTracker.

private ServiceCaller<Runnable> service;
public void start(BundleContext context) {
this.service = new ServiceCaller<>(getClass(), Runnable.class);
}
public void stop(BundleContext context) {
this.service.unget();
}
public void doSomething() {
service.call(Runnable::run);
}

This doesn’t involve significantly more effort than using the ServiceTracker that’s widely in use in Eclipse Activators at the moment, yet will defer the lookup of the service until it’s actually needed. It’s obviously better than writing many lines of ServiceTrackerCustomizer and performs better as a result, and is in most cases a type of drop-in replacement. However, unlike ServiceTracker (which returns you a service that you can then do something with afterwards), this call provides a functional consumer interface that allows you to pass in the action to take.

Wrapping up

We’ve looked at why ServiceTracker has problems with eager instantiation of services, and the complexity of code required to do it the right way. A scan of the Eclipse codebase suggests that outside of Equinox, there are very few uses of ServiceTrackerCustomizer and there are several hundred calls to ServiceTracker(xxx,yyy,null) – so there are a lot of improvements that can be made fairly easily.

This pattern can also be used to push down the acquisition of the service from a generic Plugin/Activator level call to where it needs to be used. Instead of standing this up in the BundleActivator, the ServiceCaller can be used anywhere in the bundle’s code. This is where the real benefit comes in; by packaging it up into a simple, functional consumer, we can use it to incrementally rid ourselves of the various BundleActivators that take up the majority of Eclipse’s start-up.

A final note on the ServiceCaller – it’s possible that when you run the callOnce method (or the call method if you’re holding on to it) a service instance won’t be available. If that’s the case, you get notified by a false return value from the method. If a service is found and is processed, you’ll get true returned. For some operations, a no-op is a fine behaviour if the service isn’t present – for example, if there’s no LogService then you’re probably going to drop the log event anyway – but it allows you to take the corrective action you need.

It does mean that if you want to capture return state from the method call then you’ll need to have an alternative approach. The easiest way is to have a final Object result[] = new Object[1]; before the call, and then the lambda can assign the return value to the array. That’s because local state captured by lambdas needs to be a final (or effectively final) reference, but a final reference to a mutable single element array allows us to poke a single value back. You could of course use a different element type for the array, depending on your requirements.
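As a minimal sketch of that array trick, the following uses a hypothetical callIfPresent method as a stand-in for ServiceCaller.call, so that it runs without an OSGi framework. Only the boolean return value and the Consumer argument are modelled; the class and method names are made up for illustration.

```java
import java.util.function.Consumer;

class ResultCapture {
    // Hypothetical stand-in with the same shape as ServiceCaller.call(Consumer):
    // runs the consumer if a service is available and reports whether it did.
    static <T> boolean callIfPresent(T service, Consumer<T> consumer) {
        if (service == null) {
            return false;
        }
        consumer.accept(service);
        return true;
    }

    static String describe(CharSequence service) {
        // The array reference is final; its single element is mutable
        final Object[] result = new Object[1];
        boolean called = callIfPresent(service, s -> result[0] = s.length());
        return called ? "length=" + result[0] : "no service";
    }

    public static void main(String[] args) {
        System.out.println(describe("hello"));
        System.out.println(describe(null));
    }
}
```

The boolean return value tells you whether result[0] was ever written, which matters if null is a legitimate result for your service call.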

So, we have seen that ServiceCaller is better than ServiceTracker, but can we do even better than that? We certainly can, and that’s the purpose of the next post.

Why ServiceTracker is Bad (for DS)

2020 java osgi eclipse

In a presentation I gave at EclipseCon Europe in 2016, I noted that there were problems when using ServiceTracker and on slide 37 of my presentation noted that:

  • ServiceTracker.open() is a blocking call
  • ServiceTracker.open() results in DS activating services

Unfortunately, not everyone agrees because it seems insane that ServiceTracker should do this.

Unfortunately, ServiceTracker is insane.

The advantage of Declarative Services (aka SCR, although no-one calls it that) is that you can register services declaratively, but more importantly, the DS runtime will present the existence of the service but defer instantiation of the component until it’s first requested.

The great thing about this is that you can have a service which does many class loads or time-consuming actions and defer its use until the service is actually needed. If your service isn’t required, then you don’t pay the cost for instantiating that service. I don’t think there’s any debate that this is a Good Thing and everyone, so far, is happy.

Problem

The problem, specifically when using ServiceTracker, is that you have to go through a multi-step process to use it:

  1. You create a ServiceTracker for your particular service class
  2. You call open() on it to start looking for services
  3. Time passes
  4. You acquire the service from the ServiceTracker to do something with it

There is a generally held mistaken belief that the DS component is not instantiated until you hit step 4 in the above. After all, if you’re calling the service from another component – or even looking up the ServiceReference yourself – that’s what would happen.

What actually happens is that the DS component is instantiated in step 2 above. That’s because the open() call – which is nicely thread-safe by the way, in the way that getService() isn’t – starts looking for services, and then caches the InitialTracked service, which causes DS to instantiate the component for you. Since most DS components have a default, no-arg constructor, this generally escapes people’s attention.

If your component’s constructor (or, more importantly, its field initialisers) causes many classes to be loaded or performs substantial work or calculation, then the synchronized ServiceTracker.open() call can take a non-trivial amount of time. And since this is typically in an Activator.start() method, it means that your nicely delay-until-it’s-needed component is now on the critical path of this bundle’s start-up, despite not actually needing the service right now.
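The effect can be illustrated with a plain-Java analogy. This is not OSGi code: LazyComponent is a hypothetical stand-in for a lazily activated DS component, and openEagerly only mimics the fact that opening a tracker dereferences everything it finds.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

class LazyVersusEager {
    // Hypothetical stand-in for a lazy component: created on the first get()
    static final class LazyComponent implements Supplier<Runnable> {
        private Runnable instance;

        @Override
        public synchronized Runnable get() {
            if (instance == null) {
                instance = () -> { }; // imagine expensive class loading here
            }
            return instance;
        }

        synchronized boolean isInstantiated() {
            return instance != null;
        }
    }

    // Mimics an eager tracker: every known reference is dereferenced up front,
    // whether or not the caller ever uses the result.
    static List<Runnable> openEagerly(List<LazyComponent> references) {
        List<Runnable> tracked = new ArrayList<>();
        for (LazyComponent reference : references) {
            tracked.add(reference.get());
        }
        return tracked;
    }

    public static void main(String[] args) {
        LazyComponent component = new LazyComponent();
        System.out.println("Before open: " + component.isInstantiated());
        openEagerly(Collections.singletonList(component));
        System.out.println("After open: " + component.isInstantiated());
    }
}
```

The component pays its start-up cost inside openEagerly, on the caller’s thread, even though nothing has asked it to do any work yet.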

This is one of the main problems in Eclipse’s start-up; many, many thousands of classes are loaded too eagerly. I’ve been working over the years to try and reduce the problem but it’s an uphill struggle and bad patterns (particularly the use of Activator) are endemic in a non-trivial subset of the Eclipse ecosystem. Of course, there are many fine and historical reasons why this is the case, not the least of which is that we didn’t start shipping DS in the Eclipse runtime until fairly recently.

Repo repro

Of course, when you point this out, not everyone is aware of this subtle behaviour. And while opinions may differ, code does not. I have put together a sample project which has two bundles:

  • Client, which has an Activator (yeah I know, I’m using it to make a point) that uses a ServiceTracker to look for Runnable instances
  • Runner, which has a DS component that provides a Runnable interface

When launched together, as soon as the ServiceTracker.open() method is called, you can see the console printing a "Component has been instantiated" message. This is despite the Client bundle never actually using the service that the ServiceTracker causes to be obtained.

If you run it with the system property -DdisableOpen=true, the ServiceTracker.open() statement is not called, and the component is not instantiated.

This is a non-trivial reason as to why Eclipse startup can be slow. There are many, many uses of ServiceTracker to reach out to other parts of the system, and regardless of whether these are lazy DS components or have been actively instantiated, the use of ServiceTracker.open() causes them to all be eagerly activated, even before they’re needed. We can migrate Eclipse’s services to DS (and in fact, I’m working on doing just that) but until we eliminate the ServiceTracker from various Activators, we won’t see the benefit.

The code in the github repository essentially boils down to:

public void start(BundleContext bundleContext) throws Exception {
serviceTracker = new ServiceTracker<>(bundleContext, Runnable.class, null);
if (!Boolean.getBoolean("disableOpen")) {
serviceTracker.open(); // This will cause a DS component to be instantiated even though we don't use it
}
}

Unfortunately, there’s no way to use ServiceTracker to listen to lazily activated services, and as an OSGi standard, the behaviour is baked in to it.

Fortunately, there’s a lighter-weight tracker you can use called ServiceCaller – but that’s a topic for another blog post.

Summary

Using ServiceTracker.open() will cause lazily instantiated DS components to be activated eagerly, before the service is used. Instead of using ServiceTracker, try moving your service out to a DS component, and then DS will do the right thing.

Bite-sized bytecode and class loaders

2020 java

Today I gave a talk at the London Java Community on bytecode and classloaders. The presentation is available at SpeakerDeck; the presentation was recorded and is on the London Java Community channel.

For the presentation, I wrote a JVM emulator that allows stepping through bytecode and seeing the state of the locals and stack as you go. It’s not a complete implementation (the deficiencies are listed on the README) but it’s something you could look through to get a feel of how the JVM works when interpreting code.

The JVMulator is available at https://github.com/alblue/jvmulator and you can build it with Maven or your favourite IDE. There’s a GUI which is set up to run as the main class, so once built, you can run it with java -jar or even mvn exec:java to launch it.

Bytecode

The JVM runs on bytecode; it’s a compact encoding of instructions where most instructions take up a single byte. There’s a good description of it on Wikipedia, and there’s also a useful table of bytecodes as well.

The majority of bytecodes take no operands, but deal with values being pushed to or pulled from the stack. There are also a number of local variable placeholders which are specific to the frame being executed; these typically hold things like the counter in the loop for iteration or other local variables. Methods can have zero or more locals and require zero or more stack depth; both figures are encoded in the method bytecode, so that when the JVM runs it can reserve the amount of required space on the stack for the method to execute.

Each argument passed in to the method takes up one local slot (or two for long and double values), though these placeholders can be re-used throughout a method’s execution if the argument is no longer required after first use. For instance methods specifically, there’s a hidden first argument which contains the this pointer, so if you have an instance method with 2 arguments, it’s always going to reserve at least 3 slots for local variables.
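To make the slot counting concrete, here is a small sketch that uses reflection to work out how many local slots a method’s arguments need. The LocalSlots class and its method names are made up for illustration, and it only counts argument slots, not any further locals the method body declares.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class LocalSlots {
    // Counts the local variable slots taken up by a method's arguments
    static int argumentSlots(Method method) {
        // Instance methods have the hidden 'this' argument in slot 0
        int slots = Modifier.isStatic(method.getModifiers()) ? 0 : 1;
        for (Class<?> type : method.getParameterTypes()) {
            // long and double occupy two slots; everything else occupies one
            slots += (type == long.class || type == double.class) ? 2 : 1;
        }
        return slots;
    }

    // Convenience lookup that turns the checked exception into an unchecked one
    static int argumentSlots(Class<?> owner, String name, Class<?>... parameterTypes) {
        try {
            return argumentSlots(owner.getMethod(name, parameterTypes));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Instance method with two arguments: this + String + int
        System.out.println(argumentSlots(String.class, "indexOf", String.class, int.class));
        // Static method with two long arguments: two slots each
        System.out.println(argumentSlots(Math.class, "max", long.class, long.class));
    }
}
```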

Some bytecodes take operands in the instruction stream, so not all bytes in the stream represent valid instructions. For example, when pushing a constant to the stack the bipush will push the next byte onto the stack, and sipush will push the next two bytes as a short onto the stack. Although many such opcodes take only one or two operand bytes, there is a special wide prefix which means that the next instruction takes operands of double the normal width. This is primarily used when dealing with local variables; the first 256 local variables can be accessed with a single byte index, but if you have more than 256 local variables (why‽) then you’d use the wide form of the iload bytecode for that.
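A toy disassembler shows why you can’t treat every byte as an instruction. This is only a sketch that understands four opcodes (bipush 0x10, sipush 0x11, iadd 0x60 and ireturn 0xac); the MiniDisassembler name is made up, and a real tool such as javap handles the full instruction set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MiniDisassembler {
    static List<String> disassemble(byte[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc++] & 0xff;
            switch (opcode) {
                case 0x10: // bipush: one signed operand byte follows
                    out.add("bipush " + code[pc++]);
                    break;
                case 0x11: { // sipush: two operand bytes, big-endian, signed
                    int value = (short) (((code[pc] & 0xff) << 8) | (code[pc + 1] & 0xff));
                    pc += 2;
                    out.add("sipush " + value);
                    break;
                }
                case 0x60:
                    out.add("iadd");
                    break;
                case 0xac:
                    out.add("ireturn");
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported opcode " + opcode);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The second byte is 0x60, which is iadd's opcode, but here it is
        // the operand of bipush and so decodes as the value 96
        byte[] code = {0x10, 0x60, 0x11, 0x01, 0x00, 0x60, (byte) 0xac};
        System.out.println(disassemble(code));
        System.out.println(Arrays.asList("bipush 96", "sipush 256", "iadd", "ireturn"));
    }
}
```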

Bytecode is stored in the Code attribute of a method, so all Java class files that have code associated with them (i.e. everything that’s not purely an interface) will have the string Code inside the file somewhere. Interfaces and abstract methods have no Code attribute, though a class will typically have a default constructor injected by the javac compiler.

Stack

The stack forms a key part of the Java bytecode. Operands are consumed from the stack, and results are pushed onto the stack. When a method returns a value, the element on the top of the stack is the return result. Simple math operations (e.g. iadd, fmul) consume two stack elements and then push the result back; some, like ineg, pull and push a single value.
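The push and pull behaviour can be sketched with a tiny interpreter for a handful of integer opcodes. This is an illustration only, not how the JVMulator or a real JVM is structured; MiniStackMachine is a made-up name and it understands just iconst_5 (0x08), bipush (0x10), iadd (0x60), ineg (0x74) and ireturn (0xac).

```java
import java.util.ArrayDeque;
import java.util.Deque;

class MiniStackMachine {
    static int run(byte[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            int opcode = code[pc++] & 0xff;
            switch (opcode) {
                case 0x08: // iconst_5: push the constant 5
                    stack.push(5);
                    break;
                case 0x10: // bipush: push the next byte as an int
                    stack.push((int) code[pc++]);
                    break;
                case 0x60: // iadd: pull two values, push their sum
                    stack.push(stack.pop() + stack.pop());
                    break;
                case 0x74: // ineg: pull one value, push its negation
                    stack.push(-stack.pop());
                    break;
                case 0xac: // ireturn: the top of the stack is the result
                    return stack.pop();
                default:
                    throw new IllegalArgumentException("Unsupported opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // bipush 7; iconst_5; iadd; ineg; ireturn
        byte[] code = {0x10, 7, 0x08, 0x60, 0x74, (byte) 0xac};
        System.out.println(run(code));
    }
}
```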

One quirk of the JVM is that long and double values occupy two slots on the stack; that is, there’s a missing stack element value which can’t be accessed each time you push or pull one of these values. This was an implementation workaround when JVMs were 32-bit; unnecessary for today, but kept for backwards compatibility and to prevent requiring re-compiling Java code.

There are some operators with a 2 suffix that deal with two slots at a time (like dup2), which exist as an optimisation to duplicate a long or double value; otherwise, dup is used.

Locals

Locals are accessed with various iload or aload operators to pull the values from the local variables onto the stack. You’ll typically see programs pulling with aload_0 which pulls the first local variable - for instance methods, this is the this parameter. The aload instructions deal with objects by reference (address load) as well as arrays; there’s a separate aaload for accessing an object in an object array (such as that you’d process with main(String args[])).

Since bytecodes operate on the stack, if a variable is to be used, it needs to be pulled there first. The only time this isn’t needed is for incrementing (or decrementing) an integer value – there’s a special instruction, iinc, which is used to do that – and that’s typically used for loops where pulling and stashing the loop counter each time would be unproductive.

Classfiles

A classfile is a tightly packed mechanism for representing Java classes (and interfaces, and some special containers like package-info and module-info). It contains several variable-length sections, so it can’t be randomly accessed directly when loading; it has to be parsed to be understood.

The constant pool is a key component of a class file. It contains a list of typed data values; UTF-8 strings, long values, double values etc. that are used in the methods’ code or as field initialisers. There are some instructions which encode specific values – for example, iconst_5 will push 5 onto the stack – but if you are out of luck with your value, you can encode it in the constant pool.

As well as UTF-8 strings and numeric values (for int/long/float/double – char/short/byte/boolean are figments of the JVM’s imagination) there are entries which define what it means to be a Class, what a FieldRef or MethodRef is, and a pairing called NameAndType which is essentially used to bind together a method name like equals with its descriptor type (Ljava/lang/Object;)Z – or, as programmers know it, boolean equals(Object). Java decomposes its methods this way, because if there are any other methods that have the same signature of boolean something(Object) then they can use the same type descriptor in the file and simply pair it with a different name.
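If you want to see what descriptor a given signature turns into, the JDK’s java.lang.invoke.MethodType can print it for you. The Descriptors wrapper class below is just a made-up convenience for the example.

```java
import java.lang.invoke.MethodType;

class Descriptors {
    // Builds the JVM descriptor string for a return type and parameter types
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // boolean equals(Object)
        System.out.println(descriptor(boolean.class, Object.class));
        // int something(String, long): J is the code for long
        System.out.println(descriptor(int.class, String.class, long.class));
    }
}
```

Note that the method name doesn’t appear anywhere in the descriptor, which is what lets several methods share one descriptor entry in the constant pool.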

All of the constant pool references are cross-checked by index number, which starts at 1 – the special slot 0 is only used to encode no parent for the java.lang.Object class as far as I can tell. It encodes a tree-like structure through the power of indices; the this class and super class are merely a short value pointing into the pool, so to understand what a class file is from its bytes you have to parse the full constant pool first of all.
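Here is a rough sketch of what that parsing looks like, reading just far enough into a class file to resolve the this class name and the super class index. The ConstantPoolPeek class is made up for this example; it assumes the class file can be found as a resource via the system class loader, and it skips over every constant pool entry except Utf8 and Class.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ConstantPoolPeek {
    final String thisClassName;
    final int superClassIndex;

    private ConstantPoolPeek(String thisClassName, int superClassIndex) {
        this.thisClassName = thisClassName;
        this.superClassIndex = superClassIndex;
    }

    static ConstantPoolPeek read(String className) {
        String resource = className.replace('.', '/') + ".class";
        try (InputStream raw = ClassLoader.getSystemResourceAsStream(resource)) {
            if (raw == null) {
                throw new IllegalArgumentException("No class file for " + className);
            }
            DataInputStream in = new DataInputStream(raw);
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("Not a class file");
            }
            in.readInt(); // minor and major version, not needed here
            int count = in.readUnsignedShort(); // entries are numbered 1..count-1
            String[] utf8 = new String[count];
            int[] nameIndex = new int[count];
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1: utf8[i] = in.readUTF(); break;                 // Utf8
                    case 7: nameIndex[i] = in.readUnsignedShort(); break;  // Class
                    case 8: case 16: case 19: case 20:
                        in.readFully(new byte[2]); break;
                    case 15:
                        in.readFully(new byte[3]); break;
                    case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                        in.readFully(new byte[4]); break;
                    case 5: case 6: // long and double take up two pool entries
                        in.readFully(new byte[8]); i++; break;
                    default:
                        throw new IllegalArgumentException("Unknown constant pool tag " + tag);
                }
            }
            in.readUnsignedShort(); // access flags
            int thisClass = in.readUnsignedShort();
            int superClass = in.readUnsignedShort();
            // this_class points at a Class entry, which points at a Utf8 entry
            return new ConstantPoolPeek(utf8[nameIndex[thisClass]], superClass);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ConstantPoolPeek object = read("java.lang.Object");
        System.out.println(object.thisClassName + " super index " + object.superClassIndex);
    }
}
```

The two-step lookup at the end (index to Class entry, Class entry to Utf8 entry) is the tree-like structure in miniature, and it only works because the whole pool was read first.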

I put together this infographic showing how the class file looks in my presentation referenced at the top of this post, which hopefully paints a picture.

Class file format infographic

Other tools for introspecting bytecode are available; I’d recommend starting off with javap and using the -c and -v options to give you a bunch of information on the class. If you want to see stepping through real bytecode on a real JVM then I recommend looking at Chris Newland’s JITWatch which shows you the bytecode as it executes and how it maps back onto the source files, using the LineNumberTable attributes encoded in the bytecode.

Compiling classes in memory

Bytecode can be read in from previously generated .class files, but you can also generate it on the fly. Many JVM languages have the ability to generate .class files, but if you want to stick with Java you can use the built-in javac compiler to generate code:

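import java.io.File;
import javax.tools.ToolProvider;

var javac = ToolProvider.getSystemJavaCompiler(); // null if no compiler is available
var fileManager = javac.getStandardFileManager(null, null, null);
var sources = fileManager.getJavaFileObjects(new File(...));
// call() runs the compilation and returns true if it succeeded
javac.getTask(null, fileManager, null, null, null, sources).call();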

You can create a file manager from the tool, but you can provide your own as well. I’ve written an InMemoryFileManager which allows you to compile Java source from a String and then obtain the appropriate .class bytes as a byte array, or even load it dynamically as a class with a classloader. The example fits on a slide if you’re interested.

Bytecode can also be created on the fly using tools like Mockito, or through generation agents like the higher level ByteBuddy or the lower level ASM. Many of these types of tools provide simple transformation operations on existing classes, like inserting instrumentation, and there are constraints about methods (including the ability to generate accurate object maps for the compiler) which can be challenging.

Summary

The bytecode format used by classfiles is remarkably compact yet extensible. Of the constant pool types, very few new entries have arrived and only one removal since the JVM was created; the majority of new features have been added through attributes, either on the class as a whole or on the individual methods.

Bytecode has remained very similar as well; much of the innovation has come from higher up the stack in the Java compiler. The only significant changes were the introduction of invokedynamic in Java 7 (which set the ground for Lambdas arriving later) and building on top of that the CONSTANT_Dynamic and CONSTANT_InvokeDynamic constant pool types.

There was a political decision to increment the class file version number upon each major release since Java 8, although the bytecode hasn’t changed that much. One argument for doing this is you know when you have a class file that requires Java 11 runtime features, even if the bytecode could run on a Java 8 VM. Since it’s also possible to get a Java compiler to output bytecode with a lower level, it doesn’t make much of a difference, and it also allows you to use javap to find out what version of Java is required to run a particular class.
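You can also read the version number yourself, since it sits right after the magic number at the start of the file. The sketch below assumes the class file is reachable as a resource from the system class loader; ClassFileVersion is a made-up name, and the subtraction relies on the fact that Java 8 class files have major version 52 and Java 11 class files have major version 55.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileVersion {
    // Reads the major version from the first eight bytes of a class file
    static int majorVersionOf(String className) {
        String resource = className.replace('.', '/') + ".class";
        try (InputStream raw = ClassLoader.getSystemResourceAsStream(resource)) {
            if (raw == null) {
                throw new IllegalArgumentException("No class file for " + className);
            }
            DataInputStream in = new DataInputStream(raw);
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("Not a class file");
            }
            in.readUnsignedShort();        // minor version
            return in.readUnsignedShort(); // major version
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Major version 52 is Java 8, 55 is Java 11, and so on
    static int javaReleaseFor(int majorVersion) {
        return majorVersion - 44;
    }

    public static void main(String[] args) {
        int major = majorVersionOf("java.lang.Object");
        System.out.println("java.lang.Object: major " + major + ", Java " + javaReleaseFor(major));
    }
}
```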

Getting started with understanding bytecode is easy; just run javap -c -v java.lang.Object or javap -c -v java.lang.String and see if you can understand what it tells you. Then try stepping through some compiled bytecode with JITWatch or the JVMulator. Finally, use the code snippets above or in the presentation to compile some Java code on the fly and then execute it. Once you’ve done that, you’ll have a much greater appreciation of what the JVM does for you every day.