The second day of QCon London opened with a keynote by [@anjuan](https://twitter.com/anjuan) about the Underground Railroad network that helped free slaves in the United States. He likened the various players in the network to the structure of management and developers. The talk was delivered in an entertaining enough fashion, but its central analogy draws on American history rather than UK or European history, and so felt a little disconnected from the current reality, especially given the dual challenges to the UK of Brexshit and the Coronavirus. Ultimately I think this keynote may have worked better for an American audience.
The first real talk of the day covered Quarkus, a Java framework for small, quick-starting applications. One of the differences with a JIT-compiled language is that there’s a warm-up phase before the application reaches peak performance; that’s not an issue for long-running applications, but it can be a concern if you’re following continuous delivery and redeploying multiple times a day. If you’re deploying every 10 minutes, with every commit, then you may find that the Java application never reaches steady state before being shut down for a new version of the service. Quarkus advertises a supersonic (fast) boot. It is designed for GraalVM by default, and takes a generated build file and uses a Maven/Gradle plugin to produce an optimised JAR and, using GraalVM’s ahead-of-time compiler, a native ELF executable. It also produces containers with a small on-disk footprint. The hot reload of code makes for a fast turnaround time, and Quarkus is opinionated about how it starts Java, for example by opening a debug listener port at the same time.
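The warm-up effect is easy to observe for yourself. The following is a minimal sketch rather than anything shown in the talk: the class name `WarmupSketch` and the workload are my own hypothetical choices, and a proper measurement would use a harness such as JMH. On a typical JIT-enabled JVM the first few rounds tend to be slower than the later ones, but the exact numbers depend on the machine and JVM flags.

```java
public class WarmupSketch {
    // Hypothetical workload: sums the squares of 0..n-1.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Runs the workload `rounds` times and returns the elapsed nanoseconds of each round.
    static long[] timeRounds(int rounds, int n) {
        long[] elapsed = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            long result = sumOfSquares(n);
            elapsed[r] = System.nanoTime() - start;
            // Use the result so the JIT cannot discard the loop entirely.
            if (result < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        }
        return elapsed;
    }

    public static void main(String[] args) {
        long[] elapsed = timeRounds(10, 1_000_000);
        for (int r = 0; r < elapsed.length; r++) {
            System.out.printf("round %d: %d us%n", r, elapsed[r] / 1_000);
        }
    }
}
```

A service that is redeployed before the later, faster rounds are reached spends most of its life in the slow part of this curve, which is the case that ahead-of-time compilation is aimed at.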
Sergey Kuksenko (@kuksenk0) has been a JVM
engineer since 2005, and has been working on performance for the last decade.
He kicked off with a demo of two Java Mandelbrot generators, both using a
Complex class; in the faster demo (12fps vs 5fps), the only difference was the
addition of the “inline” keyword. Valhalla provides a denser memory layout for
inline classes (aka value types), with the plan to have specialised generics.
The name ‘inline class’ rather than ‘value’ was chosen to make it easier to
update the Java Language spec with fewer changes. Inline classes don’t have
identity, which means they can’t be compared by identity with == and so avoid
problems like
Integer.valueOf(42) == Integer.valueOf(42).
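To make the identity problem concrete, here is a small sketch in today’s Java (no Valhalla required); the class and method names are my own. `Integer.valueOf` is documented to cache values from -128 to 127, so 42 happens to compare equal by reference, while a larger value such as 1000 normally does not, although the JDK is allowed to cache more values.

```java
public class BoxedIdentityDemo {
    // Boxes the same value twice and compares the two boxes by reference.
    static boolean sameReference(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b; // identity comparison, not value comparison
    }

    // Boxes the same value twice and compares the two boxes by value.
    static boolean sameValue(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a.equals(b);
    }

    public static void main(String[] args) {
        // Inside the documented cache range: both calls return the same object.
        System.out.println("42 by reference:   " + sameReference(42));
        // Outside the cache range: depends on JVM configuration, usually false.
        System.out.println("1000 by reference: " + sameReference(1000));
        // Value comparison gives the answer most people expect.
        System.out.println("1000 by value:     " + sameValue(1000));
    }
}
```

The surprise is that the answer from == depends on where the object came from rather than on what it contains; a type without identity removes that distinction.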
An inline class has no identity, is immutable, not nullable and not
synchronizable. The JVM will decide whether to allocate the inline classes on
the heap, or whether they’ll be on the stack or inlined into the container
class. The phrase “Code like a class, work like an int” summarises the goal of
Valhalla. The performance is about 50% faster, but importantly, there are far
fewer L1 cache misses. The work on benchmarking the current implementation is
in progress, and seems to be under 2% regression at the moment. For arithmetic
types such as Complex, using inline classes gave an order of magnitude
speed-up in some cases. The Optional class will be a value class in the
future, and will be a proof of concept of a migration path. Work is ongoing to
reduce the performance overheads so that it can be available in the future; for
the time being, the JDK 14 release in the next couple of weeks will have a
version available for experimentation.
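I don’t have the source of the demo, so the following is only a sketch of what a Complex-based Mandelbrot kernel might look like in plain Java; the names `MandelbrotSketch`, `Complex` and `escapeTime` are my own. In the Valhalla prototype the class declaration would carry the “inline” keyword described above, which standard Java does not accept, so here it is an ordinary immutable class.

```java
public class MandelbrotSketch {
    // Immutable complex number. As a normal class, every instance has identity
    // and may be heap allocated; the Valhalla prototype would mark this inline.
    static final class Complex {
        final double re;
        final double im;

        Complex(double re, double im) {
            this.re = re;
            this.im = im;
        }

        Complex add(Complex other) {
            return new Complex(re + other.re, im + other.im);
        }

        Complex multiply(Complex other) {
            return new Complex(re * other.re - im * other.im,
                               re * other.im + im * other.re);
        }

        double magnitudeSquared() {
            return re * re + im * im;
        }
    }

    // Iterates z = z*z + c from z = 0 and returns the iteration at which |z|
    // first exceeds 2, or maxIterations if it never does.
    static int escapeTime(Complex c, int maxIterations) {
        Complex z = new Complex(0.0, 0.0);
        for (int i = 0; i < maxIterations; i++) {
            if (z.magnitudeSquared() > 4.0) {
                return i;
            }
            z = z.multiply(z).add(c); // allocates two temporaries per step
        }
        return maxIterations;
    }

    public static void main(String[] args) {
        System.out.println(escapeTime(new Complex(0.0, 0.0), 100)); // prints 100
        System.out.println(escapeTime(new Complex(2.0, 2.0), 100)); // prints 1
    }
}
```

Each step of the inner loop creates two short-lived Complex objects, which is exactly the kind of allocation and pointer chasing that a flattened, identity-free layout is meant to avoid.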
A wildcard session followed, due to the last-minute cancellation of a speaker.
I went to a talk on TornadoVM, which provides a way of compiling and running
Java code in parallel on a variety of different FPGA or GPU solutions. By
translating Java bytecode into Tornado bytecode, and then having different
translators which rewrite those kernels into GPU-specific instruction sets,
it’s possible to get a many-thousand-times speed-up on numerical calculations.
A demo showed depth information being captured from Microsoft Kinect video and
re-rendered into a three-dimensional representation afterwards. Importantly,
they have a Docker image which can be used for testing in
beehivelab/tornado-gpu:latest which is covered by the
The next session I attended was my own. I thought it would be a little rude not to turn up :) Fortunately the talk went OK (after all, I completed writing it with at least an hour to go) and, as for timing, I finished 10s early. My talk was on CPU microarchitecture for maximum performance, looking at the nitty gritty details of how CPUs execute. I didn’t go down to the electron or transistor level, but rather talked about the general architectural details of the processor. My slides have been uploaded to my SpeakerDeck profile and, apart from a quibble about a bullet on page 25, most people seem to have enjoyed it; after all, I did. The video was recorded and will be available on InfoQ at some point in the future; I’ll update this post when I have a public link.
Day 2 ended with a get-together of the speakers in the usual location, and I made several contacts with people who had been speaking at QCon; some of whom I knew, some of whom I did not. One of the pleasures of QCon is meeting and talking to people; I enjoy meeting up with the attendees during the breaks, but it is also excellent to be able to talk to the movers and shakers of the conference.
Tomorrow I’m leading the Java track, which I’m looking forward to; stay tuned!