Last week, Doug Schaefer wished on Twitter that other Eclipse projects were getting the same kind of contribution love as Platform UI. Lars Vogel attributed that to the effort in cleaning up the codebase and the focus on new contributions and contributors.
I thought I’d spend some time helping out CDT in assisting with this effort, and over the past week or so have been sending a few patches that way. Fortunately Sergey Prigogin has been an excellent reviewer, turning around my patches in a matter of hours in some cases, and that in turn has meant that I’ve been able to make further and faster progress than on some of the other projects I’ve tried contributing improvements to.
Most recently I’ve been looking into optimising some of the StringBuffer code and thought I’d go into a little bit of detail about the performance aspects of these changes.
The TL;DR of this post is:
- `StringBuilder` is better than `StringBuffer`
- `StringBuilder.append(a).append(b)` is better than `StringBuilder.append(a+b)`
- `StringBuilder.append(a).append(b)` is better than `StringBuilder.append(a); StringBuilder.append(b);`
- `StringBuilder.append()` and `+` are only equivalent provided that they are not nested and you don’t need to pre-size the builder
- Pre-sizing the `StringBuilder` is like pre-sizing an `ArrayList`; if you know the approximate size you can reduce the garbage by specifying a capacity up-front
Most of this may be common knowledge but I hope that I can back this up with data using JMH.
Introduction to JMH
The Java Microbenchmark Harness or JMH is the tool to use for performance testing microbenchmarks. In the same way that JUnit is the de facto standard for testing, JMH is the de facto standard for performance measurement. There’s a great thread that goes into the details behind some of JMH’s evolution and the choices that were made; and the fact that since then it seems to have edged out other performance testing benchmark tools like Caliper seems to be a good indicator of its future existence.
JMH projects can be bootstrapped from `mvn` and then compiled/post annotated
with the launcher to generate a `benchmarks.jar` file, which contains the
code under test as well as a copy of the JMH code in an uber JAR. It also
helpfully sets up a command line interface that you can use to test your
code, and is the simplest way to generate a project.
You can create a stub JMH project using the steps on the JMH homepage:
```sh Generating a JMH project with mvn
$ mvn archetype:generate \
  -DinteractiveMode=false \
  -DarchetypeGroupId=org.openjdk.jmh \
  -DarchetypeArtifactId=jmh-java-benchmark-archetype \
  -DgroupId=org.sample \
  -DartifactId=test \
  -Dversion=1.0
```
From the command line, the sample project can be run by executing:
```sh Compiling and Running the JMH benchmark
$ mvn clean package
$ java -jar target/benchmarks.jar
```
There are a lot of flags that can be passed on the command line; passing `-h`
will show the full list.
Using JMH in Eclipse
If you’re trying to run JMH in Eclipse, you will need to ensure that annotation
processing is enabled. That’s because JMH uses annotations not only to
mark the benchmarks, but also uses an annotation processing tool to transform
the benchmarked code into executable units. If you don’t have annotation
processing enabled and try to run it, you’ll see a cryptic message like
`Unable to read /META-INF/BenchmarkList`.
If you’ve created a Maven project (and presumably, therefore, have m2e
installed) the easiest way is to install JBoss’ `m2e-apt` connector, which
allows you to configure the project for JDT’s support for APT. This can be
installed from Eclipse → Preferences → Discovery and choosing the
`m2e-apt` connector. After restart this can be used to enable the JDT support
automatically by going to Window → Preferences → Maven →
Annotation Processing and then choosing the “Automatically configure JDT APT”
option.
If you’re not using Maven then you can add the `jmh-generator-annprocess` JAR
(along with its dependencies) to the project’s Java Compiler → Annotation
Processing → Factory Path, and ensure that the annotation processing is
switched on.
Tests can then be run by creating a launch configuration to run the main class
`org.openjdk.jmh.Main` or by using the JMH APIs.
StringBuilder vs StringBuffer benchmark
So having got the basis for benchmarking set up, it’s time to look at the
performance of the `StringBuilder` vs the `StringBuffer`. It’s a good idea
to see what the performance of the empty buffers is like before we start
adding content to them:
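```java StringBenchmark.java
import org.openjdk.jmh.annotations.Benchmark;

public class StringBenchmark {
    // Baseline: create an empty StringBuffer and convert it to a String
    @Benchmark
    public String testEmptyBuffer() {
        StringBuffer buffer = new StringBuffer();
        return buffer.toString();
    }

    // Baseline: the same with an empty StringBuilder
    @Benchmark
    public String testEmptyBuilder() {
        StringBuilder builder = new StringBuilder();
        return builder.toString();
    }

    // Baseline: return a constant with no allocation at all
    @Benchmark
    public String testEmptyLiteral() {
        return "";
    }
}
```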
Two things are worth calling out: the first is that the resulting expression
you're using always has to be returned to the caller, otherwise the JIT will
optimise the code away. The second is that it's worth testing the empty case
first of all so that it sets a baseline for measurement.
We can run it from the command line by doing:
```sh
$ mvn clean package
$ java -jar target/benchmarks.jar Empty \
   -wi 5 -tu ns -f 1 -bm avgt
...
Benchmark                          Mode  Cnt  Score    Error  Units
StringBenchmark.testEmptyBuffer    avgt   20  8.306 +- 0.497  ns/op
StringBenchmark.testEmptyBuilder   avgt   20  8.253 +- 0.416  ns/op
StringBenchmark.testEmptyLiteral   avgt   20  3.510 +- 0.139  ns/op
```
The flags used here are `-wi` (warmup iterations), `-tu` (time unit;
nanoseconds), `-f` (number of forked JVMs) and `-bm` (benchmark mode; in this
case, average time).
Somewhat unsurprisingly the values are relatively similar, with the return literal being the fastest.
What if we’re concatenating two strings? We can write a method to test that as well:
```java StringBenchmark.java
@Benchmark
public String testHelloWorldBuilder() {
    StringBuilder builder = new StringBuilder();
    builder.append("Hello");
    builder.append("World");
    return builder.toString();
}

@Benchmark
public String testHelloWorldBuffer() {
    StringBuffer buffer = new StringBuffer();
    buffer.append("Hello");
    buffer.append("World");
    return buffer.toString();
}
```
When run, it looks like:
```sh
$ mvn clean package
$ java -jar target/benchmarks.jar Hello \
   -wi 5 -tu ns -f 1 -bm avgt
...
Benchmark                               Mode  Cnt   Score    Error  Units
StringBenchmark.testHelloWorldBuffer    avgt   20  25.747 +- 1.188  ns/op
StringBenchmark.testHelloWorldBuilder   avgt   20  25.411 +- 1.015  ns/op
```
Not much difference there, although the `Buffer` is marginally slower than
the `Builder` is. That shouldn’t be too surprising; they are both
subclasses of `AbstractStringBuilder` anyway, which has all the logic.
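The shared superclass can be checked with a couple of lines of reflection. This is only a small sketch; the class and method names (`SharedSuperclass`, `superclassOf`) are hypothetical and not part of the benchmark project.

```java
// Sketch: look up the direct superclass of StringBuilder and StringBuffer.
// SharedSuperclass and superclassOf are illustrative names only.
class SharedSuperclass {
    /** Returns the fully qualified name of the direct superclass of the given class. */
    static String superclassOf(Class<?> type) {
        return type.getSuperclass().getName();
    }

    public static void main(String[] args) {
        // Both lines print the same (package-private) class name
        System.out.println(superclassOf(StringBuilder.class));
        System.out.println(superclassOf(StringBuffer.class));
    }
}
```

Both calls report `java.lang.AbstractStringBuilder`, which is where the array handling lives.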
Job done?
Are we all done yet? Well, no, because there are other things at play.
Firstly, JMH is a benchmarking tool to find the highest possible value of
performance under load. What happens in Java is that by default HotSpot
uses a tiered compilation model; it starts off interpreted, then once
a method has been executed a number of times it gets compiled. In fact,
there are different levels of compilation that kick in after different
numbers of calls. You can see these if you look at the various `*Threshold*`
flags generated by `-XX:+PrintFlagsFinal` from an OpenJDK installation.
When a method has been called thousands of times, it will be compiled using the Tier 3 (client) or Tier 4 (server) compiler. This generally involves optimisations such as in-lining methods, dead code elimination and the like. This gives the best possible code performance for the application.
But what if the method is called infrequently, or puts memory pressure on the
garbage collector instead? It won’t be JIT compiled and so will take longer. We
can see the effect of running in interpreted mode by running the generated
benchmark code with `-jvmArgs -Xint` to force the forked JVM used to run the
benchmarks to only use the interpreter:
```sh Running benchmarks in interpreted mode
$ mvn clean package
$ java -jar target/benchmarks.jar Empty Hello \
   -wi 5 -tu ns -f 1 -bm avgt -jvmArgs -Xint
...
Benchmark                               Mode  Cnt     Score     Error  Units
StringBenchmark.testEmptyBuffer         avgt   20  1102.609 +- 66.596  ns/op
StringBenchmark.testEmptyBuilder        avgt   20   769.682 +- 27.962  ns/op
StringBenchmark.testEmptyLiteral        avgt   20   184.061 +- 13.587  ns/op
StringBenchmark.testHelloWorldBuffer    avgt   20  2299.749 +- 70.087  ns/op
StringBenchmark.testHelloWorldBuilder   avgt   20  2381.348 +- 38.726  ns/op
```
A better option is to use the JMH specific annotation
`@CompilerControl(Mode.EXCLUDE)` which prevents benchmarking methods from being
JIT compiled, while allowing the other Java classes to be JIT compiled as
usual. This is akin to having other classes call the `StringBuffer` (so that it is
sufficiently well exercised) while emulating code that isn't called all that
frequently. It can be added at the class level or at the method level.
```sh
$ grep -B2 class StringBenchmark.java
@State(Scope.Benchmark)
@CompilerControl(Mode.EXCLUDE)
public class StringBenchmark {
$ mvn clean package
$ java -jar target/benchmarks.jar Empty Hello \
   -wi 5 -tu ns -f 1 -bm avgt
...
Benchmark                               Mode  Cnt    Score    Error  Units
StringBenchmark.testEmptyBuffer         avgt   20  144.745 +- 4.561  ns/op
StringBenchmark.testEmptyBuilder        avgt   20  122.477 +- 3.273  ns/op
StringBenchmark.testEmptyLiteral        avgt   20   91.139 +- 1.685  ns/op
StringBenchmark.testHelloWorldBuffer    avgt   20  236.223 +- 7.679  ns/op
StringBenchmark.testHelloWorldBuilder   avgt   20  222.462 +- 5.733  ns/op
```
Either way, calling the code before JIT compilation has kicked in magnifies the difference between the two classes to around 10%. So for methods that are called fewer than 1000 times – such as during start-up or when invoked from a user interface – the difference will exist.
Different calling patterns
What about different calling patterns? One example I came across was using an
implicit `String` concatenation inside a `StringBuilder` or `StringBuffer`.
This might be the case when generating a buffer to represent an e-mail, for
example.
To test this, and to prevent Strings being concatenated by the `javac`
compiler, we need to use non-final instance variables. However, to do that with
the benchmark requires that the class be annotated with
`@State(Scope.Benchmark)`. (As with `public static void main(String args[])`
it’s best to just learn that this is necessary when you’re getting started, and
then understand what it means later.)
```java StringBenchmark.java
@State(Scope.Benchmark)
public class StringBenchmark {
    // Non-final fields, so javac cannot fold them into constants
    private String from = "Alex";
    private String to = "Readers";
    private String subject = "Benchmarking with JMH";
    ...
    @Benchmark
    public String testEmailBuilderSimple() {
        StringBuilder builder = new StringBuilder();
        builder.append("From");
        builder.append(from);
        builder.append("To");
        builder.append(to);
        builder.append("Subject");
        builder.append(subject);
        return builder.toString();
    }

    @Benchmark
    public String testEmailBufferSimple() {
        StringBuffer buffer = new StringBuffer();
        buffer.append("From");
        buffer.append(from);
        buffer.append("To");
        buffer.append(to);
        buffer.append("Subject");
        buffer.append(subject);
        return buffer.toString();
    }
}
```
You can selectively run the benchmarks by putting one or more regular
expressions on the command line:
```sh
$ mvn clean package
$ java -jar target/benchmarks.jar Simple \
   -wi 5 -tu ns -f 1 -bm avgt
...
Benchmark                                Mode  Cnt   Score    Error  Units
StringBenchmark.testEmailBufferSimple    avgt   20  88.149 +- 1.014  ns/op
StringBenchmark.testEmailBuilderSimple   avgt   20  88.277 +- 1.201  ns/op
```
These obviously take a lot longer to run. But what about other forms of the
code? What if a developer has used `+` to concatenate the fields together in
the append calls?
```java StringBenchmark.java
@Benchmark
public String testEmailBuilderConcat() {
    StringBuilder builder = new StringBuilder();
    // Each argument is an implicit concatenation nested inside append()
    builder.append("From" + from);
    builder.append("To" + to);
    builder.append("Subject" + subject);
    return builder.toString();
}

@Benchmark
public String testEmailBufferConcat() {
    StringBuffer buffer = new StringBuffer();
    buffer.append("From" + from);
    buffer.append("To" + to);
    buffer.append("Subject" + subject);
    return buffer.toString();
}
```
Running this again shows why this is a bad idea:
```sh
$ mvn clean package
$ java -jar target/benchmarks.jar Simple Concat \
   -wi 5 -tu ns -f 1 -bm avgt
...
Benchmark                                Mode  Cnt    Score    Error  Units
StringBenchmark.testEmailBufferConcat    avgt   20  105.424 +- 3.704  ns/op
StringBenchmark.testEmailBufferSimple    avgt   20   91.427 +- 2.971  ns/op
StringBenchmark.testEmailBuilderConcat   avgt   20  100.295 +- 1.985  ns/op
StringBenchmark.testEmailBuilderSimple   avgt   20   90.884 +- 1.663  ns/op
```
Even though these calls do the same thing, the cost of having an embedded
implicit `String` concatenation is enough to add a 10% penalty on the time
taken for the methods to return.
This shouldn’t be too surprising; the cost of doing the in-line concatenation
means that it’s generating a new `StringBuilder`, appending the two `String`
expressions, converting it to a new `String` with `toString()` and finally
inserting that resulting `String` into the outer
`StringBuilder`/`StringBuffer`.
This should probably be a warning in the future.
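To make the hidden work visible, here is a sketch of roughly what the nested concatenation expands to. It assumes a pre-Java 9 `javac`, where `+` is compiled into a temporary `StringBuilder`; later compilers use `invokedynamic` instead, so the generated bytecode differs there. The class and method names are hypothetical.

```java
// Sketch: the implicit form and its approximate hand-expanded equivalent.
// Assumes the pre-Java 9 javac translation of '+' into a StringBuilder.
class NestedConcat {
    // Non-final field so that javac cannot fold the concatenation at compile time
    private String from = "Alex";

    /** What the developer wrote: implicit concatenation inside append(). */
    String implicit() {
        StringBuilder builder = new StringBuilder();
        builder.append("From" + from);
        return builder.toString();
    }

    /** Roughly what the compiler generates for the method above. */
    String expanded() {
        StringBuilder builder = new StringBuilder();
        // Temporary builder and temporary String, both garbage straight away
        String temp = new StringBuilder().append("From").append(from).toString();
        builder.append(temp);
        return builder.toString();
    }

    public static void main(String[] args) {
        NestedConcat example = new NestedConcat();
        System.out.println(example.implicit().equals(example.expanded()));
    }
}
```

Both methods produce the same text; the second simply spells out the extra builder and the extra `String` that the first one creates behind the scenes.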
Chaining methods
Finally, what about chaining the methods instead of referring to a local variable? That can’t make any difference; after all, this is equivalent to the one before, right?
```java StringBenchmark.java
@Benchmark
public String testEmailBuilderChain() {
    return new StringBuilder()
        .append("From")
        .append(from)
        .append("To")
        .append(to)
        .append("Subject")
        .append(subject)
        .toString();
}

@Benchmark
public String testEmailBufferChain() {
    return new StringBuffer()
        .append("From")
        .append(from)
        .append("To")
        .append(to)
        .append("Subject")
        .append(subject)
        .toString();
}
```
What's interesting is that you do see a significant difference:
```sh
$ java -jar target/benchmarks.jar Simple Concat Chain \
   -wi 5 -tu ns -f 1 -bm avgt
...
Benchmark                                Mode  Cnt    Score    Error  Units
StringBenchmark.testEmailBufferChain     avgt   20   38.950 +- 1.120  ns/op
StringBenchmark.testEmailBufferConcat    avgt   20  103.151 +- 4.197  ns/op
StringBenchmark.testEmailBufferSimple    avgt   20   89.685 +- 2.041  ns/op
StringBenchmark.testEmailBuilderChain    avgt   20   38.113 +- 1.012  ns/op
StringBenchmark.testEmailBuilderConcat   avgt   20  102.193 +- 2.829  ns/op
StringBenchmark.testEmailBuilderSimple   avgt   20   89.117 +- 2.658  ns/op
```
In this case, chaining the calls together has more than halved the time taken by the method call after JIT. One possible reason this may occur is that the length of the method’s bytecode has been significantly reduced:
```sh
$ javap -c StringBenchmark.class | egrep "public|areturn"
  public java.lang.String testEmailBuilder();
      60: areturn
  public java.lang.String testEmailBuffer();
      60: areturn
  public java.lang.String testEmailBuilderConcat();
      84: areturn
  public java.lang.String testEmailBufferConcat();
      84: areturn
  public java.lang.String testEmailBuilderChain();
      46: areturn
  public java.lang.String testEmailBufferChain();
      46: areturn
```
Simply chaining the `.append()` methods together has resulted in a smaller
method, and thus a faster call site when compiled to native code. The other
advantage (though not demonstrated here) is that the size of the bytecode
affects the caller’s ability to in-line the method; smaller than 35 bytes
(`-XX:MaxInlineSize`) means the method can be trivially in-lined, and if it’s
smaller than 325 bytes then it can be in-lined if it’s called enough times
(`-XX:FreqInlineSize`).
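The limits in effect on a particular JVM can be read at run-time. The sketch below assumes a HotSpot-based JVM that exposes `com.sun.management.HotSpotDiagnosticMXBean`; other JVMs may not provide it, and the values vary by platform and version. The class and method names are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Sketch: query the in-lining limits of the running JVM.
// Assumes a HotSpot JVM; InlineLimits and vmOption are illustrative names.
class InlineLimits {
    /** Returns the current value of the named HotSpot VM option as a String. */
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("MaxInlineSize  = " + vmOption("MaxInlineSize"));
        System.out.println("FreqInlineSize = " + vmOption("FreqInlineSize"));
    }
}
```

Comparing these numbers with the `areturn` offsets reported by `javap` gives a rough idea of which methods are candidates for in-lining.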
Finally, what about ordinary `String` concatenation? Well, as long as you don’t
mix and match it, then you’re fine – it works out as being identical to the
`testEmailBuilderChain` methods.
```java StringBenchmark.java
// Named testEmailLiteral to match the benchmark name reported in the results
@Benchmark
public String testEmailLiteral() {
    return "From" + from + "To" + to + "Subject" + subject;
}
```
Running it shows:
```sh
$ java -jar target/benchmarks.jar EmailLiteral \
   -wi 5 -tu ns -f 1 -bm avgt
...
Benchmark                          Mode  Cnt   Score    Error  Units
StringBenchmark.testEmailLiteral   avgt   20  38.033 +- 0.588  ns/op
```
And for comparative purposes, running the lot with
`@CompilerControl(Mode.EXCLUDE)` (simulating an infrequently used method)
gives:
```sh
$ java -jar target/benchmarks.jar Email \
   -wi 5 -tu ns -f 1 -bm avgt
...
Benchmark                                Mode  Cnt    Score     Error  Units
StringBenchmark.testEmailBufferChain     avgt   20  416.745 +-  9.087  ns/op
StringBenchmark.testEmailBufferConcat    avgt   20  764.726 +-  9.535  ns/op
StringBenchmark.testEmailBufferSimple    avgt   20  462.361 +- 15.091  ns/op
StringBenchmark.testEmailBuilderChain    avgt   20  384.936 +-  9.173  ns/op
StringBenchmark.testEmailBuilderConcat   avgt   20  752.375 +- 19.544  ns/op
StringBenchmark.testEmailBuilderSimple   avgt   20  414.372 +-  6.940  ns/op
StringBenchmark.testEmailLiteral         avgt   20  417.772 +-  9.515  ns/op
```
What a lot of rubbish
The other aspect that affects the performance is how much garbage is created
during the program’s execution. Allocating new data in Java is very, very fast
these days, regardless of whether it’s interpreted or JIT compiled code. This
is especially true of the new `-XX:+UseG1GC` collector, which is available in
Java 8 and will become the default in Java 9. (Hopefully it will also become a
part of the standard Eclipse packages in the future.) That being said, there
are certainly cycles that get wasted, both by the CPU and by the GC, when using
concatenation.
The `StringBuffer` and `StringBuilder` are implemented like an `ArrayList`
(except dealing with an array of characters instead of an array of `Object`
instances). When you add new content, if there’s capacity, then the content is
added at the end; if not, a new array is created with double-plus-two size, the
content of the backing store is copied to the new array, and then the old array
is thrown away. As a result this step can take between O(1) and O(n)
depending on whether the current capacity is exceeded.
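The growth pattern can be observed directly through `StringBuilder.capacity()`. The following is a small sketch with hypothetical names (`CapacityGrowth`, `capacities`); it appends one character at a time and records every capacity it sees.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: watch the backing array grow as characters are appended.
// CapacityGrowth and capacities are illustrative names only.
class CapacityGrowth {
    /** Appends one char at a time and records each distinct capacity observed. */
    static List<Integer> capacities(int chars) {
        StringBuilder builder = new StringBuilder(); // default capacity
        List<Integer> seen = new ArrayList<>();
        seen.add(builder.capacity());
        for (int i = 0; i < chars; i++) {
            builder.append('x');
            int capacity = builder.capacity();
            // Only record the capacity when it has changed
            if (capacity != seen.get(seen.size() - 1)) {
                seen.add(capacity);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(capacities(45)); // prints [16, 34, 70]
    }
}
```

Each step is the previous capacity doubled plus two, and each step leaves the previous array behind as garbage.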
By default both classes start with a capacity of 16 characters (and thus the
implicit `String` concatenation also uses that number); but both classes have a
constructor that takes an explicit initial capacity.
JMH also comes with a garbage profiler that can provide (in my experience, fairly accurate) estimates of how much garbage is collected per operation. It does this by hooking into some of the serviceability APIs in the OpenJDK runtime (so other JVMs may find this doesn’t work) and then provides a normalised estimate for how much garbage is attributable per operation. Since garbage is a JVM wide construct, any other threads executing in the background will cause the numbers to be inaccurate.
By modifying the creation of the `StringBuffer` with a JMH parameter, it’s
possible to provide different values at run-time for experimentation:
```java StringBenchmark.java
public class StringBenchmark {
    // Initial capacity, overridable with -p size=... on the command line
    @Param({"16"})
    private int size;
    ...
    public void testEmail... {
        StringBuilder builder = new StringBuilder(size);
    }
}
```
It's possible to specify multiple parameters; JMH will then iterate over each
and give the results separately. Using `@Param({"16","48"})` would run first
with `16` and then `48` afterwards.
```sh
$ java -jar target/benchmarks.jar EmailBu \
   -wi 5 -tu ns -f 1 -bm avgt -prof gc
...
Benchmark                                                   (size)  Mode  Cnt    Score    Error  Units
StringBenchmark.testEmailBufferChain                            16  avgt   20   37.593 +- 0.595  ns/op
StringBenchmark.testEmailBufferChain: gc.alloc.rate.norm        16  avgt   20  136.000 +- 0.001   B/op
StringBenchmark.testEmailBufferConcat                           16  avgt   20  155.290 +- 2.206  ns/op
StringBenchmark.testEmailBufferConcat: gc.alloc.rate.norm       16  avgt   20  576.000 +- 0.001   B/op
StringBenchmark.testEmailBufferSimple                           16  avgt   20  136.341 +- 3.960  ns/op
StringBenchmark.testEmailBufferSimple: gc.alloc.rate.norm       16  avgt   20  432.000 +- 0.001   B/op
StringBenchmark.testEmailBuilderChain                           16  avgt   20   37.630 +- 0.847  ns/op
StringBenchmark.testEmailBuilderChain: gc.alloc.rate.norm       16  avgt   20  136.000 +- 0.001   B/op
StringBenchmark.testEmailBuilderConcat                          16  avgt   20  153.879 +- 2.699  ns/op
StringBenchmark.testEmailBuilderConcat: gc.alloc.rate.norm      16  avgt   20  576.000 +- 0.001   B/op
StringBenchmark.testEmailBuilderSimple                          16  avgt   20  136.587 +- 3.146  ns/op
StringBenchmark.testEmailBuilderSimple: gc.alloc.rate.norm      16  avgt   20  432.000 +- 0.001   B/op
```
Running this shows that the normalised allocation rate for the various methods
(`gc.alloc.rate.norm`) varies between 136 bytes and 576 for both classes. This
shouldn’t be a surprise; the implementation of the storage structure is the
same between both classes. It’s more noteworthy to observe that there is a
variation between using the chained implementation and the simple allocation
(136 vs 432).
The 136 bytes is the smallest value we can expect to see; the resulting
`String` in our test method works out at 45 characters, or 90 bytes.
Considering a `String` instance has a 24 byte header and a character array has
a 16 byte header, 90 + 24 + 16 = 130. However, the character array is aligned
on an 8 byte boundary, so its 90 bytes of data are rounded up to 96 bytes,
giving 96 + 24 + 16 = 136. In other words, the code for the `*Chain` methods
has been JIT optimised to produce a single `String` with the exact data in
place.
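The arithmetic can be written down as a short sketch. The 24 byte and 16 byte figures are the ones quoted above and are assumptions about the object layout of the JVM used for these runs, not universal constants; the class and method names are hypothetical.

```java
// Sketch: estimate the footprint of a String backed by a char[].
// The header sizes are the figures quoted in the text, not universal constants.
class StringFootprint {
    static final int STRING_OBJECT_BYTES = 24; // String instance
    static final int ARRAY_HEADER_BYTES = 16;  // char[] header including length

    /** Rounds a size up to the next multiple of 8 bytes. */
    static long align8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    /** Size of a char[] holding the given number of 2 byte chars. */
    static long charArrayBytes(int chars) {
        return align8(ARRAY_HEADER_BYTES + 2L * chars);
    }

    /** Size of a String plus its exactly sized backing array. */
    static long stringBytes(int chars) {
        return STRING_OBJECT_BYTES + charArrayBytes(chars);
    }

    public static void main(String[] args) {
        System.out.println(stringBytes(45)); // prints 136
    }
}
```

For the 45 character result this gives 112 bytes of array plus 24 bytes of `String`, matching the 136 B/op reported by the profiler.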
The `*Simple` methods have additional data generated by the increasing size of
the internal character backing array. 136 of the bytes are the returned
`String` value, so that can be taken out of the equation, leaving 296 bytes to
account for. A `StringBuilder` instance itself would take up 24 bytes, but the
remainder actually turns out to be entirely character arrays; a
`StringBuilder` starts off with a size of 16 chars, then grows to 34 chars
and then 70 chars, following a 2n+2 growth. Since each `char[]` has an overhead
of 16 bytes (12 for the header, 4 for the length) and chars are stored as
16 bit entities, this results in 48, 88 and 160 bytes after alignment. Perhaps
unsurprisingly the growth (and subsequently discarded `char[]` arrays) adds up
to 296 bytes, which suggests that the JIT has avoided allocating the
`StringBuilder` instance itself. So the growth in both of the `*Simple`
methods is equivalent here.
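As a rough check of that accounting, the sketch below adds up the backing arrays created along the way plus the final `String`. It assumes the sizes quoted in the text (16 byte array header, 24 byte `String`, 8 byte alignment), the 2n+2 growth rule, and that the builder object itself is not allocated; all names are hypothetical.

```java
// Sketch: estimate the bytes allocated by a builder that grows from an
// initial capacity until it can hold finalLength chars.
// Header sizes and the 2n+2 rule are taken from the text; names are illustrative.
class GrowthGarbage {
    /** Rounds a size up to the next multiple of 8 bytes. */
    static long align8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    /** Size of a char[] with a 16 byte header and 2 bytes per char. */
    static long charArrayBytes(int chars) {
        return align8(16 + 2L * chars);
    }

    /** Total bytes: every backing array created, plus the resulting String. */
    static long allocatedBytes(int initialCapacity, int finalLength) {
        int capacity = initialCapacity;
        long total = charArrayBytes(capacity);
        while (capacity < finalLength) {
            capacity = 2 * capacity + 2;       // double-plus-two growth
            total += charArrayBytes(capacity); // the old array becomes garbage
        }
        // The result: a 24 byte String plus its exactly sized char[]
        return total + 24 + charArrayBytes(finalLength);
    }

    public static void main(String[] args) {
        System.out.println(allocatedBytes(16, 45)); // prints 432
    }
}
```

The same function can be reused with other starting capacities to estimate the effect of pre-sizing.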
The larger values in the `*Concat` methods show additional garbage growth
caused by the temporary internal `StringBuilder` elements.
To test a different starting size of the buffer, passing the `-p size=48` JMH
argument will allow us to test the effect of initialising the buffers with 48
characters:
```sh
$ java -jar target/benchmarks.jar EmailBu \
   -wi 5 -tu ns -f 1 -bm avgt -prof gc -p size=48
...
Benchmark                                                   (size)  Mode  Cnt    Score    Error  Units
StringBenchmark.testEmailBufferChain                            48  avgt   20   38.961 +- 1.732  ns/op
StringBenchmark.testEmailBufferChain: gc.alloc.rate.norm        48  avgt   20  136.000 +- 0.001   B/op
StringBenchmark.testEmailBufferConcat                           48  avgt   20  106.726 +- 4.118  ns/op
StringBenchmark.testEmailBufferConcat: gc.alloc.rate.norm       48  avgt   20  392.000 +- 0.001   B/op
StringBenchmark.testEmailBufferSimple                           48  avgt   20   93.455 +- 2.702  ns/op
StringBenchmark.testEmailBufferSimple: gc.alloc.rate.norm       48  avgt   20  248.000 +- 0.001   B/op
StringBenchmark.testEmailBuilderChain                           48  avgt   20   39.056 +- 1.723  ns/op
StringBenchmark.testEmailBuilderChain: gc.alloc.rate.norm       48  avgt   20  136.000 +- 0.001   B/op
StringBenchmark.testEmailBuilderConcat                          48  avgt   20  103.264 +- 2.404  ns/op
StringBenchmark.testEmailBuilderConcat: gc.alloc.rate.norm      48  avgt   20  392.000 +- 0.001   B/op
StringBenchmark.testEmailBuilderSimple                          48  avgt   20   88.175 +- 2.442  ns/op
StringBenchmark.testEmailBuilderSimple: gc.alloc.rate.norm      48  avgt   20  248.000 +- 0.001   B/op
```
By tweaking the initialised `StringBuffer`/`StringBuilder` instances to 48
characters, we can reduce the amount of garbage generated as part of the
concatenation process. The Java implicit `String` concatenation is outside our
control, and is a result of the underlying character array resizing itself.
Here, the `*Simple` methods have dropped from 432 to 248 bytes, which
represents the 136 byte `String` result and a copy of the 112 byte array
(corresponding to a 41-48 character array with the 16 byte header).
Presumably in this case the JIT has managed to avoid the creation of the
`StringBuilder` instance in the `*Simple` methods, but the array copy has
leaked through. However other than these two values, there is no additional
garbage created.
Conclusion
Running benchmarks is a good way of finding out what the cost of a particular operation is, and JMH makes it easy to generate such benchmarks. Ensuring that the benchmarks are correct is a little harder, as is accounting for the effects of other processes. Of course, different machines will give different results to these, and you’re encouraged to replicate this on your own setup.
Although the fully JIT compiled methods for both `StringBuffer` and
`StringBuilder` are very similar, there is an underlying trend for the
`StringBuilder` to be at least as fast as its older cousin `StringBuffer`. In
any case, implicit `String` concatenation (with `+`) creates a `StringBuilder`
under the covers and it’s likely therefore that the `StringBuilder` methods
will become hot and be compiled before those of `StringBuffer`.
The most efficient way of concatenating strings is to have a single expression
which uses either implicit `String` concatenation (`+ + + +`) or a series of
chained calls (e.g. `.append().append().append()`) without any intermediate
reference to a local variable. If you’ve got a lot of constants then using `+`
will also have the advantage of using constant folding of the `String`
literals ahead of time.
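Constant folding can be seen with a reference comparison, since a folded constant refers to the same interned `String` as the equivalent literal. This is a small sketch with hypothetical names; comparing strings with `==` is done here purely to expose the identity and is not something to copy into real code.

```java
// Sketch: literals joined with '+' are folded by javac into one constant,
// whereas a concatenation involving a non-final field happens at run-time.
// ConstantFolding and its methods are illustrative names only.
class ConstantFolding {
    // Non-final field, so not a compile-time constant
    private String subject = "JMH";

    /** Both operands are literals, so the result is a single interned constant. */
    static boolean literalsAreFolded() {
        return ("Hello" + "World") == "HelloWorld";
    }

    /** One operand is a field, so a new String is built on every call. */
    boolean fieldIsFolded() {
        return ("Benchmarking with " + subject) == "Benchmarking with JMH";
    }

    public static void main(String[] args) {
        System.out.println(literalsAreFolded());                   // prints true
        System.out.println(new ConstantFolding().fieldIsFolded()); // prints false
    }
}
```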
Mixing `+` and `.append()` is a bad idea though, because there will be extra
pressure on the memory as the `String` instances are created and then
immediately thrown away.
Finally, although using `+ + + +` is easy, it doesn’t let you pre-size the
`StringBuilder` array, which starts off with 16 characters by default. If the
`StringBuilder` is used to create large `Strings` then avoiding multiple
resizes is a relatively simple optimisation technique as far as reducing
garbage is concerned. In addition, the cost of the array copy operation grows
as the size of the data set increases.
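A small sketch of the effect of pre-sizing, using the same e-mail fields as the benchmarks. The class and method names are hypothetical; the final capacity shows whether the backing array had to be replaced along the way.

```java
// Sketch: compare the final capacity of a default builder with a pre-sized one.
// PresizedBuilder and finalCapacity are illustrative names only.
class PresizedBuilder {
    /** Appends the 45 characters of the e-mail example and returns the capacity. */
    static int finalCapacity(StringBuilder builder) {
        return builder
            .append("From").append("Alex")
            .append("To").append("Readers")
            .append("Subject").append("Benchmarking with JMH")
            .capacity();
    }

    public static void main(String[] args) {
        // Default capacity of 16: the array is replaced twice (16, 34, 70)
        System.out.println(finalCapacity(new StringBuilder()));   // prints 70
        // Pre-sized to 48: the original array is large enough throughout
        System.out.println(finalCapacity(new StringBuilder(48))); // prints 48
    }
}
```

With the capacity supplied up-front the builder keeps its first array, so the only remaining copy is the one made by `toString()`.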
Update 2020
I have uploaded this code to https://github.com/alblue/com.bandlem.jmh.microopts/ along with an updated version of the results, also committed to the repository.
One of the significant changes in the results was that the JVM has now learnt how to do indification of string concatenation, which has improved both the speed and also the garbage collection profile of the operations. However, the overall relative behaviour of the differences still holds.