The second day of QCon London opened with a keynote by
[@anjuan](https://twitter.com/anjuan) on the Underground Railroad
network that helped free slaves in the United States. He likened the various
players in the scheme to the structure of management and developers. The talk
was delivered in an entertaining enough fashion, but the analogy was rooted in
American history rather than UK or European history, and so felt a little
disconnected from the current reality, especially given the dual challenges to
the UK of Brexshit and the Coronavirus. Ultimately I think this keynote may
have worked better for an American audience.
The first real talk of the day was about Quarkus, a Java framework for
small, quick-starting applications. One of the differences with a JIT-compiled
language is that there’s a warm-up phase before the application reaches peak
performance; that’s not an issue for long-running applications, but it can be a
concern if you’re following continuous delivery and redeploying multiple times
a day. If you’re deploying every 10 minutes, with every commit, then you may
find that the Java application never reaches steady state before being
shut down for a new version of the service. Quarkus is pitched as having
supersonic (fast) boot. It is designed for GraalVM by default: it takes a
generated build file and uses a Maven/Gradle plugin to produce an optimised JAR
and, using the ahead-of-time compiler, an ELF executable built with GraalVM. It
also produces containers with a small on-disk footprint.
The hot reload of code makes for a fast turnaround time, and Quarkus is
opinionated about how it starts Java; for example, it opens a debug listener
on a port at the same time.
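
To make the warm-up point above a little more concrete, here is a minimal sketch (my own, not from the talk) that times the same workload in successive batches. The class and method names are hypothetical. On a JIT-enabled JVM the first batches are usually slower than the later ones, although the exact numbers depend entirely on the machine and JVM, so I have not included any expected timings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Quarkus or the talk.
class WarmUpDemo {
    // A small CPU-bound workload for the JIT compiler to optimise.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Runs the workload in batches and records the elapsed nanoseconds per batch.
    static List<Long> timeBatches(int batches, int callsPerBatch) {
        List<Long> nanos = new ArrayList<>();
        long sink = 0; // accumulate results so the calls are not dead code
        for (int b = 0; b < batches; b++) {
            long start = System.nanoTime();
            for (int c = 0; c < callsPerBatch; c++) {
                sink += sumOfSquares(1_000);
            }
            nanos.add(System.nanoTime() - start);
        }
        if (sink == 42) System.out.println(sink); // keeps 'sink' observable
        return nanos;
    }

    public static void main(String[] args) {
        List<Long> timings = timeBatches(10, 10_000);
        for (int i = 0; i < timings.size(); i++) {
            System.out.printf("batch %d: %d us%n", i, timings.get(i) / 1_000);
        }
    }
}
```

If the service is replaced before the later, faster batches are ever reached, the time spent warming up is never paid back, which is the concern with very frequent redeployments.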
Sergey Kuksenko @kuksenk0 has been a JVM
engineer since 2005, and has worked on performance for the last decade. He kicked
off with a demo of two Java Mandelbrot generators, both using a Complex class; in
the faster demo (12fps vs 5fps), the only difference was the
addition of the “inline” keyword. Valhalla provides a denser memory layout for
inline classes (aka value types), with the plan to have specialised generics. The
name ‘inline class’ rather than ‘value’ was chosen so that the Java Language
spec could be updated with fewer changes. Inline classes don’t have identity,
which means they can’t be compared with == and so avoid problems like
Integer.valueOf(42) == Integer.valueOf(42).
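
As a small illustration of the identity problem with today's boxed types, here is a sketch in plain Java (no Valhalla features; the class name is my own). One caveat on the example above: Integer.valueOf caches values from -128 to 127, so the 42 case actually compares equal by reference. The surprise shows up for larger values, where the two boxes are typically distinct objects on a default JVM configuration.

```java
// Hypothetical demo class showing reference vs value comparison of boxed ints.
class BoxedIdentityDemo {
    // Boxes the same int twice and compares the two boxes by reference.
    static boolean sameReference(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b; // identity comparison, not a comparison of the numbers
    }

    // Boxes the same int twice and compares the two boxes by value.
    static boolean sameValue(int value) {
        return Integer.valueOf(value).equals(Integer.valueOf(value));
    }

    public static void main(String[] args) {
        System.out.println(sameReference(42));      // true: 42 is inside the cached range
        System.out.println(sameReference(100_000)); // usually false: outside the cache
        System.out.println(sameValue(100_000));     // true: equals() compares the numbers
    }
}
```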
An inline class has no identity, is immutable, not nullable and not
synchronizable. The JVM will decide whether to allocate inline class instances
on the heap, on the stack, or inlined into the containing
class. The phrase “Code like a class, work like an int” summarises the goal of
Valhalla. Performance was reported as about 50% faster with, importantly, many
fewer L1 cache misses. Benchmarking of the current implementation is in
progress, and shows a regression of under 2% at the moment. For
arithmetic types such as Complex, using inline classes gave an order of
magnitude speed-up in some cases. The Optional class will be a value class in the
future, and will serve as a proof of concept for a migration path. Work is ongoing to
reduce the performance overheads so that it can be available in the future; for
the time being, the JDK14 release in the next couple of weeks will have a
version available for experimentation.
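
I don't have the code from the demo, so the following is only a sketch of the kind of Complex class and Mandelbrot loop it might have used, written as an ordinary immutable final class that compiles on current JDKs. All names here are my own assumptions. In the Valhalla prototype the idea would be to mark such a class as inline, so that the many short-lived Complex instances created in the loop need not be separate heap objects.

```java
// Hypothetical stand-in for the Complex class from the demo: immutable and final,
// which is the shape of class that Valhalla's inline classes are aimed at.
final class Complex {
    final double re;
    final double im;

    Complex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    Complex plus(Complex o) {
        return new Complex(re + o.re, im + o.im);
    }

    Complex times(Complex o) {
        return new Complex(re * o.re - im * o.im, re * o.im + im * o.re);
    }

    double magnitudeSquared() {
        return re * re + im * im;
    }
}

class Mandelbrot {
    // Counts iterations of z = z*z + c until |z| > 2, capped at maxIterations.
    static int escapeTime(Complex c, int maxIterations) {
        Complex z = new Complex(0, 0);
        for (int i = 0; i < maxIterations; i++) {
            if (z.magnitudeSquared() > 4.0) {
                return i;
            }
            z = z.times(z).plus(c); // allocates two new Complex objects per iteration
        }
        return maxIterations;
    }
}
```

Every iteration allocates fresh objects here, which is exactly the overhead a denser, identity-free layout is meant to remove.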
A wildcard session followed, due to the last-minute cancellation of a speaker.
Instead, I went to a talk on TornadoVM, which provides a way
of compiling and running Java code in parallel on a variety of FPGA
or GPU targets. By translating Java bytecode into Tornado bytecode, and then
having different translators which rewrite those kernels into GPU-specific
instruction sets, it’s possible to get a speed-up of many thousands of times on
numerical calculations. A demo showed depth information being captured from
Microsoft Kinect video and re-rendered into a three-dimensional
representation afterwards. Importantly, they have a Docker image which can
be used for testing, beehivelab/tornado-gpu:latest, which is covered by the
project’s README.
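
To give a feel for the kind of numerical kernel that benefits from this sort of offloading, here is a sketch in plain Java. It deliberately does not use the TornadoVM API (see the project's README for that); the class and method names are my own. The point is only the shape of the loop: every iteration is independent of the others, which is what makes it a candidate for running in parallel. The second method uses a parallel stream purely as a CPU-side stand-in for parallel execution.

```java
import java.util.stream.IntStream;

// Hypothetical kernels; plain Java only, no TornadoVM classes are used here.
class VectorKernels {
    // Sequential form: out[i] depends only on x[i] and y[i], never on other elements.
    static void saxpy(float alpha, float[] x, float[] y, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = alpha * x[i] + y[i];
        }
    }

    // The same computation spread across CPU threads with a parallel stream.
    static void saxpyParallel(float alpha, float[] x, float[] y, float[] out) {
        IntStream.range(0, out.length)
                 .parallel()
                 .forEach(i -> out[i] = alpha * x[i] + y[i]);
    }
}
```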
The next session I attended was by Alina Yurenko from Oracle on “Maximising
application performance with GraalVM”, which talked about using Graal as an
ahead-of-time compiler for generating native images. Not only does the
application start much faster, it also uses a lot less memory than the equivalent
application running under the JVM. Partially this is due to the fact that the
C2 compiler doesn’t need to compile the underlying JVM classes, and partially
due to the fact that the actual runtime of the application has a far lower
total memory footprint. Of course, creating an accurate execution profile requires
running the application under an expected (simulated) load, so that the
hot code can be correctly identified and compiled appropriately. Graal
uses information gained from that initial execution to prepare appropriate code
for the expected types; if these assumptions are incorrect, then the resulting
binary will not be optimised for the code paths that actually run. There was
also a sales pitch for using GraalVM to
host multiple languages, along with an evolution of the Nashorn JavaScript
engine. It’s unclear as to whether people will really want to use a JVM for
running multiple languages, but then again, those people never really saw
JavaScript as anything other than a toy language, so what do they know? :)
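
Going back to the point about profiles and expected types, here is a small sketch (my own, with hypothetical names) of a call site whose receiver types depend on the input. If the simulated load only ever passes Square instances, a profile would record a single receiver type at the s.area() call; a production load that mixes in Circle would not match it. How much that matters in practice depends on the compiler and its configuration, so treat this as an illustration of the idea rather than a benchmark.

```java
// Hypothetical types used to illustrate type profiles at a virtual call site.
interface Shape {
    double area();
}

final class Circle implements Shape {
    final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    public double area() {
        return Math.PI * radius * radius;
    }
}

final class Square implements Shape {
    final double side;

    Square(double side) {
        this.side = side;
    }

    public double area() {
        return side * side;
    }
}

class TypeProfileDemo {
    // The s.area() call is the site whose observed receiver types would be profiled.
    static double totalArea(Shape[] shapes) {
        double total = 0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }

    public static void main(String[] args) {
        // "Training" load: only squares reach the call site.
        Shape[] training = {new Square(2), new Square(3)};
        // "Production" load: a second receiver type appears.
        Shape[] production = {new Square(2), new Circle(1)};
        System.out.println(totalArea(training));
        System.out.println(totalArea(production));
    }
}
```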
The next session I attended was my own. I thought it would be a little rude
not to turn up :) Fortunately the talk went OK – after all, I completed
writing it with at least an hour to go – and as for timing, I finished 10s
early. My talk was on CPU microarchitecture for maximum performance, looking at
the nitty-gritty details of how CPUs execute. I didn’t go down to the electron
or transistor level, but rather talked about the general architectural
details of the processor. My slides have been uploaded to my SpeakerDeck profile
and, apart from a quibble about a bullet on page 25, most people
seem to have enjoyed it; after all, I did. The video was recorded and will
be available on InfoQ at some point in the future; I’ll update this post with
the link when I have a public one.
Day 2 ended with a get-together of the speakers in the usual location, and I
made several contacts with people who had been speaking at QCon; some whom I
knew, some whom I did not. One of the pleasures of QCon is meeting and talking
to people; I enjoy meeting up with the attendees during the breaks, but it is
also excellent to be able to talk to the movers and shakers of the conference.
Tomorrow I’m leading the Java track, which I’m looking forward to; stay tuned!