In my previous post, I talked about my war on new Boolean()
and how this
was improving the code by reducing object overhead. Fortunately, all of these
changes have now been merged, and we’re a step closer to having no additional
Boolean instances in memory. Well, actually that’s not true; if you use
getField().get()
on a boolean
value, you get a new Boolean()
instance created instead of a pointer to Boolean.TRUE
or Boolean.FALSE
.
Oh well …
So the second part of my optimisation is talking about speeding up Eclipse’s launching. I’ve been looking through what happens when an Eclipse instance starts in order to determine where we can optimise the start-up process. I initially started this by running a new Eclipse instance, but this proved to be variable and in any case too big to measure or optimise appropriately. Instead, I switched focus to running a simple Eclipse 4 based application product, created by File → New → Project → Eclipse 4 → Eclipes 4 Application Product. This provides a simple window, but importantly depends on everything that is needed to bring up an Eclipse instance. So if we can optimise that, we can optimise Eclipse!
To determine where the time is spent, Eclipse has some debug options which
can be enabled with tracing. If you don’t know about Eclipse tracing, it’s
essentially a bunch of key=value
pairs that you can set (like -D) but specified
in a .options
file instead of on the command line. Tracing options can be
set when launching an Eclipse application by going to the launch configuration,
and going to the Tracing tab. Each bundle has possible debug options (specified
in a .options
file inside the bundle itself) which can be enabled; if the
org.eclipse.osgi
bundle is selected then this will switch on general debug
for this bundle (although there are other switches which can be enabled as
well).
Running Eclipse with this will show how long it takes to start up:
Debug options: file:/.../.metadata/.plugins/org.eclipse.pde.core/Empty.product/.options loaded Time to load bundles: 24 Starting application: 620 Application Started: 1757
OK, this gives a high level view of how long it takes to start up based on the time taken to display a window. 1.7-1.8s on my laptop is not abnormal for an empty window, but it’s not great. (Your environment will vary.)
Note that the first run is almost certainly invalid; when Equinox starts up, it generates a lot of cached information and stores that in a set of files, which it will re-read a second time. You can either measure the cold start by deleting the workspace each time, or you can measure a warm start by ignoring the first one and then running a second time. In fact, running a few times makes sense; the OS will cache the JAR files in memory and so these can vary between results depending on availability of operating system.
(An empty product doesn’t take up much memory; however, size the JVM so that
it doesn’t incur a GC during start-up. This will allow more repeatable
measurements to be made between runs. Either pick a big number, or use
-verbose:gc
if you want to verify that a GC run isn’t happening. Ideally,
use this to pick a JVM size you know isn’t going to GC and then remove the
-verbose:gc
flag so that it doesn’t impact the measurements.)
So, we’ve got a product and we’re running it to give us a number. Time to make the number go down. How do we do that?
First, we need to understand what’s happening at start-up. The messages before
printing out Time to load bundles
are messages from the Equinox framework
booting up and initialising the classes specified in the config.ini
file.
This runs the org.eclipse.equinox.simpleconfigurator
bundle, which reads
a file bundles.info
in a similarly named directory. This is the file that
it’s printing out the Time to load bundles
message, and there’s not much
that can be improved here.
Equinox then moves onto starting the framework by raising the start level. As
the levels increase, bundles are started corresponding to their individual
start level, which is in the bundles.info
. This is what brings up Declarative
Services and the Event Admin in level 2, for example. Once the start level has
been increased, and all of the bundles have been started or lazily started
by then, the application launcher then hands it over to the EclipseAppLauncher
which passes it into the application’s run()
method. There’s a listener
which is passed back by the application when it’s up (IApplicationContext
has
an applicationRunning()
method) which is also the same one used to hide the
splash screen.
OK, so each part of these in itself is pretty understandable, and pretty tight.
There will always be a part that is proportional to the number of bundles
involved (the bundles.info
has one line per installed bundle) but otherwise
this is pretty lean. So where does the time go?
In short, loading and defining classes from disk. When you run a Java class, the JVM has to be able to acquire the bytes and define the class itself. Similarly when calling a method, any referenced arguments and exceptions have to be loaded. Fortunately Java is pretty good at this, and is intelligent so that when you define a class you don’t immediately need to define all of the referenced classes; only when you use them for the first time.
However, there are things that run when classes are defined. Firstly, a
class may have static
blocks that have code associated with it – although
that only gets run once the class’ static (or instance) contents are accessed
for the first time. This is a little like the constructor but at the class
level instead of the instance level. The bottom line is if you have a block
that contains blocking or long processing, accessing the class’ fields will
incur a hit for the first time.
The other thing – unique to OSGi – is that a bundle’s Bundle-Activator
will
be run if a class is accessed from the bundle, and the bundle is set to start
lazily (i.e. when a bundle is accessed for the first time). The net effect is
that whilst simply referring to a class won’t necessarily run its static
blocks, simply accessing it from a remote bundle may cause that bundle to be
started, and hence load its activator.
Activators are bad.
Well, in general they aren’t, but they can often cause delays in bundle
start-up. For example, in Eclipse 4.5, the contenttype
bundle’s Activator
referred to Eclipse’s extension registry
which meant that because the extension registry’s interface was referenced the
extension registry bundle was started. This of course takes some time to load
(a few hundred ms on average) which meant the bundle activator for the
contenttype
bundle was blocked until the registry finished starting.
In theory, Eclipse can start bundles asynchronously – though at the moment,
it doesn’t – but when you have a blocking sequence of activators there’s
nothing much that Equinox can do. You can see how long an activator takes to
run if you enable the org.eclipse.osgi/debug/bundleLoad
tracing option:
Starting org.eclipse.core.contenttype_3.5.0.v20150421-2214 [33] Starting org.eclipse.equinox.registry_3.6.0.v20150318-1503 [43] End starting org.eclipse.equinox.registry_3.6.0.v20150318-1503 [43] 36 Starting org.eclipse.equinox.preferences_3.5.300.v20150408-1437 [5] End starting org.eclipse.equinox.preferences_3.5.300.v20150408-1437 [5] 25 End starting org.eclipse.core.contenttype_3.5.0.v20150421-2214 [33] 68
Here, contenttype
is starting, which (serially) is starting the registry
and preference
bundles, before itself finishing start. The number at the end
is in ms but it only represents the total time for the start()
method to run;
any other items (including defining the actual classes themselves) are
effectively hidden here. (We can run this against a 4.6 nightly to see the time
taken to define the activator which is new.)
So, removing the Activator
can speed up some time – albeit only a little –
by removing the need for the bundle to be lazily started. In general,
activators do one of three things:
- Set up data structures or load them from disk (preferences, registry)
- Acquire references to the
BundleContext
- Acquire references to other services (either through the
BundleContext
directly or through aServiceTracker
)
The first case there’s not much that can be done. There’s a possibility that the code could be moved to provide a cache-on-demand style lazy accessor instead of up front; but in the case of the Equinox registry or preference service they’d be needed on start-up anyway (and they’re already pretty well optimised).
The second one can trivially be replaced with a call to
FrameworkUtil.getBundle(the.class).getBundleContext()
where needed. This is
actually a pretty fast operation; the bundle is associated with the class’
ClassLoader, so the net effect is that it boils down to something like
the.class.getClassLoader().getBundle()
which is pretty fast.
The third one can be replaced by moving the components that require the services to a declarative services component. This has the advantage that DS can bring up components in an asynchronous manner and use dependency graph to bring them up in the right order. In fact, in the case of some of the bundles in the Eclipse platform, they were already being brought up by DS but the connection to the other services was outside of DS itself. By moving everything into DS, the pattern is simplified and the code can be removed.
Put all this together, and to date four activators have been removed from Eclipse 4.6:
- org.eclipse.core.contenttype
- org.eclipse.e4.core.commands
- org.eclipse.e4.core.di.extensions
- org.eclipse.jface
So, Eclipse 4.6 is faster at starting up, right? Well, no … it turns out that the empty project is now loading more classes than before. We’re not loading four activators (yay!) but now we’re loading a bunch more classes.
It turns out that it’s a
new bug
in that it’s new code, added to a static
block of
CSSRenderingUtils
that adds a single image to JFaceResources
internal
ImageRegistry
:
static { URL url = ... ImageDescriptor desc = ImageDescriptor.createFromURL(url); if (desc != null) { JFaceResources.getImageRegistry().put(DRAG_HANDLE, desc); } }
That’s all it takes to activate the JFace bundle (formerly; activating the
JFaceActivator
too) and to trigger the creation of the image
registry (which includes auto-populating it). Doesn’t sound like much, but
the image registry hits in the order of 40 separate images, using constants
referenced from any number of classes (which themselves are then loaded and
defined …)
(JFace should support a deferred image loader; but that’s a different issue)
You can measure the amount of code that gets loaded by enabling another trace
option; org.eclipse.ogsi/debug/loader
. This will spit out a
(large) log which shows which classes are being defined and how many bytes
they take. It’s best to log this to a file so you can look through it
afterwards; it’s too big for the Eclipse console view to hold all in one go.
You can grep for classes defined with 'grep defined log.txt'
– piping this
through wc -l
will show how many classes get loaded during a run of the
application. (This was the clue that JFace’s ImageRegistry was being hit,
since it showed up as a diff between that and the same project running in
Eclipse 4.5.) You can also figure out how much stuff is being loaded;
grepping for ‘bytes’ and then piping the second column will give you the number
of bytes that were being loaded. In fact, it’s possible to sum these all up:
grep bytes log.txt | awk -- 'BEGIN {sum=0} {sum += $2} END {print sum}'
It turns out this single change added around 20 classes to the startup of an otherwise empty Eclipse 4 application, and an extra 100k of classes (not to mention the number of bytes of images being defined).
Why do I mention all this? Well, this isn’t about this specific change as such (after all, I broke the build not once but twice yesterday, and possibly a third time as well) – but it’s more to highlight a few things:
- Activators are generally not required, and can be replaced
- The amount of time Eclipse spends loading is proportional to the number and size of classes being loaded
- It’s really easy to add static dependencies between classes which then trigger performance problems later on
Fortunately the size of an application’s loaded classes can be measured, as can the time. So we can show whether or not we’re moving in the right direction – and there are also Eclipse performance tests which look at other factors. But it’s really easy to move the wrong way on the performance profile, and only by identifying what’s on the (serial) critical path and removing unnecessary cruft are we going to speed up Eclipse’s launch process.
PS algorithm choices matter a lot more than micro-optimisations like these, and in general use the JIT will do a lot of the optimisations for you. However, start-up is a ‘special’ case; it only runs once (so most of it won’t be JITted) and even though Equinox’s classloader is re-entrant (and so can handle multiple threads) the declaration of classes is mainly single-threaded. Only bundles whose components are instantiated from declarative services are going to be able to benefit from multi-threaded startup. So; Activator bad, DS good and avoid expensive static blocks are the takeaways here.