Multiple JVMs, different numbers of threads and BenchmarkMode
Dmitry Vyazelenko
vyazelenko at yahoo.com
Fri Aug 1 10:29:58 UTC 2014
Hi all,
I have several questions about JMH usage. In particular I’m interested in:
1) Handling the need to run benchmarks on multiple JVMs
2) Using different numbers of threads for the same benchmarks
3) Choosing a BenchmarkMode/iteration configuration for dynamic workloads
Here is what I've been doing:
1) To run on multiple JVMs I'm using the -jvm parameter. Something like this:
java -jar target/benchmarks.jar -jvm <path>/jdk1.7.0_65/bin/java -rf CSV -rff jdk1.7.0_65_results.csv
java -jar target/benchmarks.jar -jvm <path>/jdk1.8.0_11/bin/java -rf CSV -rff jdk1.8.0_11_results.csv
I've been wondering whether there is a better way to do that. In version 0.9.3 the ability to specify a JVM was added to the @Fork annotation, but it only allows a single value to be specified (same as the -jvm parameter). Maybe it should support multiple JVMs? That way one could start the benchmark once and get results from all JVMs. Right now it requires writing bash scripts to run benchmarks on multiple JVMs.
I understand that this change would require extending the output table to also include the VM with which each benchmark was executed. Does anyone else think something like that would be a good idea?
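In the meantime, a small main() using the Runner API can replace the bash scripts. A rough sketch, assuming OptionsBuilder exposes the same knobs as the -jvm/-rf/-rff flags (class name and paths below are placeholders):

    import org.openjdk.jmh.results.format.ResultFormatType;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    public class MultiJvmMain {
        public static void main(String[] args) throws Exception {
            // JVM path -> result file, mirroring the two commands above
            String[][] jvms = {
                { "<path>/jdk1.7.0_65/bin/java", "jdk1.7.0_65_results.csv" },
                { "<path>/jdk1.8.0_11/bin/java", "jdk1.8.0_11_results.csv" },
            };
            for (String[] jvm : jvms) {
                Options opts = new OptionsBuilder()
                        .jvm(jvm[0])                        // same as the -jvm flag
                        .resultFormat(ResultFormatType.CSV) // same as -rf CSV
                        .result(jvm[1])                     // same as -rff
                        .build();
                new Runner(opts).run();
            }
        }
    }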
2) Lately I found myself in a situation where I want to execute the same benchmark with different numbers of threads. Up to this point I know of three ways to do that:
a) Invoke the benchmark with the -t parameter and the desired number of threads. This works fine for a single benchmark class but not when I run a benchmark suite, because the parameter applies to all benchmarks.
b) Use the @Threads annotation on a class. The problem is that it only allows a single value, which means one either has to change the annotation and re-build the benchmark, or sub-class the benchmark class and redefine the @Threads annotation with a different value.
c) Use the @Threads annotation on a method. This is probably the cleanest solution, as I'm able to declare the same method N times with different numbers of threads. The drawback is that one has to repeat the benchmark method body, or extract it into a helper method and invoke it from the real benchmark methods annotated with @Benchmark, as in the sketch below.
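For illustration, option (c) ends up looking something like this (names are made up, and the helper body stands in for the real benchmark code):

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.Threads;

    public class ThreadsPerMethod {
        // the real benchmark body lives in one helper...
        private void doWork() {
            // ... actual work under test goes here (elided) ...
        }

        // ...and each @Benchmark variant differs only in its @Threads value
        @Benchmark
        @Threads(1)
        public void work_1thread() { doWork(); }

        @Benchmark
        @Threads(4)
        public void work_4threads() { doWork(); }

        @Benchmark
        @Threads(8)
        public void work_8threads() { doWork(); }
    }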
I think it would be really great if JMH allowed specifying multiple values in the @Threads annotation and executed the benchmark once per thread count. Of course, the reported results should then include the number of threads used to produce each result.
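Until something like that exists, the closest workaround I know of is to drive the sweep from Java via the Runner API instead of bash, e.g. (the benchmark name is hypothetical):

    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    public class ThreadSweepMain {
        public static void main(String[] args) throws Exception {
            for (int threads : new int[] { 1, 2, 4, 8 }) {
                Options opts = new OptionsBuilder()
                        .include(".*MapBenchmark.*") // hypothetical benchmark class
                        .threads(threads)            // same as the -t flag
                        .build();
                new Runner(opts).run();
            }
        }
    }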
3) This last point is more of a question: which BenchmarkMode and/or iteration configuration should be used when the work inside a benchmark method is not constant but depends on a parameter of the current run?
For example, I have a benchmark that measures Map.put() performance for different "batch sizes", where batch size is a parameter with values 10, 100, 1000…10000000. The benchmark method loops over an array of keys, allocated according to the batch size, and invokes Map.put(), e.g.:
@Benchmark
public void put(ThreadData data) {
    for (K key : data.keys) { // data.keys holds "batch size" keys
        map.put(key, data.value);
    }
}
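For context, the state behind it is roughly the following, simplified to concrete key/value types (the real code is generic, and the map under test lives in the enclosing benchmark class):

    import org.openjdk.jmh.annotations.Param;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.Setup;
    import org.openjdk.jmh.annotations.State;

    @State(Scope.Thread)
    public class ThreadData {
        // the "batch size" parameter driving the amount of work per invocation
        @Param({ "10", "100", "1000", "10000000" }) // plus intermediate sizes
        int batchSize;

        Integer[] keys;           // simplified to a concrete key type
        String value = "payload"; // simplified to a concrete value type

        @Setup
        public void allocateKeys() {
            keys = new Integer[batchSize];
            for (int i = 0; i < batchSize; i++) {
                keys[i] = i;
            }
        }
    }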
Now, because of the dynamic nature of the work done by the benchmark method, what would be the best BenchmarkMode to use, and how best to configure the @Warmup/@Measurement iterations?
What I tried so far was AverageTime mode with the default (i.e. time-based) configuration for warmup and measurement iterations. However, I see the following issues: for small batches the benchmark method is invoked many times, but as the batch size increases it may come down to a single invocation per iteration. Also, the target method (i.e. Map.put()) is invoked more times than expected: it should be called only "batch size" times, but due to the time-based nature of iterations it will be called much more often.
Therefore I've been looking for ways to mitigate these problems. To ensure a fixed amount of work per benchmark method I'm considering either SingleShotTime mode or the "batchSize" feature for configuring iterations (http://hg.openjdk.java.net/code-tools/jmh/file/6354acecccb7/jmh-samples/src/main/java/org/openjdk/jmh/samples/JMHSample_26_BatchSize.java).
- While the former allows keeping the code as-is, I'll have to deal with timer overhead on small batch sizes and also use a bigger number of warmup and measurement iterations. Also, SingleShotTime won't aggregate any results for me, which will complicate post-processing of the results.
- With the latter approach I'll have to completely re-think my code: eliminate the loops from the measurement methods and rely on a fixed number of invocations defined by the "batchSize" parameter of the @Warmup/@Measurement annotations (or the command-line flag), as in the JMHSample_26_BatchSize example above and sketched below. It would also mean that I can't use my own "batch size" parameter anymore, so I won't be able to run my benchmarks with a set of values; instead I'll have to invoke them multiple times, each time with a different batchSize. And that would again require writing a script to drive the benchmarks.
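To make the comparison concrete, here is roughly what the restructured benchmark would look like, modeled on the JMHSample_26_BatchSize sample (note the batch size is now hard-coded in the annotations instead of being a @Param):

    import java.util.HashMap;
    import java.util.Map;

    import org.openjdk.jmh.annotations.*;

    @State(Scope.Thread)
    @BenchmarkMode(Mode.SingleShotTime)
    @Warmup(iterations = 5, batchSize = 10000)      // batch size is fixed here...
    @Measurement(iterations = 5, batchSize = 10000) // ...rather than a @Param
    public class MapPutBatch {
        private Map<Integer, String> map;
        private int index;

        @Setup(Level.Iteration)
        public void setup() {
            map = new HashMap<>();
            index = 0;
        }

        @Benchmark
        public void put() {
            // one put per invocation; JMH calls this batchSize times per iteration
            map.put(index++, "payload");
        }
    }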
Maybe it would be a good idea to allow specifying an array of batchSize values via the command line or annotations. At least that would eliminate the scripting part, though it would require adding this information to the results table (e.g. batchSize 10 -> value1, batchSize 100 -> value2, etc.).
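Failing that, the scripting could at least move into Java as well; a sketch, assuming the builder exposes the batch-size options corresponding to the -bs/-wbs flags (the benchmark name is the hypothetical one from the sketch above):

    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    public class BatchSizeSweepMain {
        public static void main(String[] args) throws Exception {
            for (int batchSize : new int[] { 10, 100, 1000 }) {
                Options opts = new OptionsBuilder()
                        .include(".*MapPutBatch.*")      // hypothetical benchmark
                        .warmupBatchSize(batchSize)      // same as the -wbs flag
                        .measurementBatchSize(batchSize) // same as the -bs flag
                        .build();
                new Runner(opts).run();
            }
        }
    }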
Thanks for the awesome tool, and thanks in advance for your help!
Best regards,
Dmitry Vyazelenko