Multiple JVMs, different numbers of threads and BenchmarkMode

Mon Aug 4 20:48:09 UTC 2014

Hi Aleksey,

Thanks for the reply. I’ll do my best with formatting in the future.
For everything else, see my comments inline.

On Aug 3, 2014, at 18:20 , Aleksey Shipilev <aleksey.shipilev at oracle.com> wrote:

> Hi Dmitry,
> 
> First, uncalled-for formatting advice: a) use text-wrapping tools in
> your messenger, b) use an empty line to demarcate paragraphs. Otherwise,
> your messages look like these wide walls of text in mailing clients:
> http://mail.openjdk.java.net/pipermail/jmh-dev/2014-August/001195.html
> 
> ...with no hope for auto-reformat.
> 
> 
> On 08/01/2014 02:29 PM, Dmitry Vyazelenko wrote:
>> I have several questions about JMH usage. In particular I’m interested in:
>> 1) How do you handle need to run benchmarks on multiple JVMs
>> 2) Using different number of threads for the same benchmarks
>> 3) Choosing BenchmarkMode/iteration config for dynamic workloads
> 
> The rule of thumb is: annotations cover the generic cases; if current
> annotations are too constraining, fall back to Java API. You should be
> able to code non-trivial things there without much trouble. I tend to
> think all three questions are answered by Java API.
> 
> A few suggestions otherwise:
> 
>> Here is what I’ve been doing:
>> 1) To run on multiple JVMs I’m using the -jvm parameter. Something like this:
>>  java -jar target/benchmarks.jar -jvm <path>/jdk1.7.0_65/bin/java -rf CSV -rff jdk1.7.0_65_results.csv
>>  java -jar target/benchmarks.jar -jvm <path>/jdk1.8.0_11/bin/java -rf CSV -rff jdk1.8.0_11_results.csv
> 
>> I’ve been wondering is there any better way to do that? 
> 
> Most of us invoke benchmarks.jar with the target VM itself, for two
> reasons: a) it makes sure we are indeed running the requested VM, since
> otherwise we would need to cross-check that JMH is actually running the
> given one; b) it ensures compatibility between the host and forked VMs.
> Granted, both are guaranteed in sane environments, but this is a
> defense-in-depth concern.
I had been doing the same until recently, when I discovered the -jvm parameter.
So it seems that invoking benchmarks.jar with the target VM is the better way
anyway, compared to specifying the -jvm parameter.
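
For the record, here is a minimal sketch of how I would script this in Java.
It launches benchmarks.jar with each target VM's own java binary and reuses
the -rf/-rff options from my commands above. The class name MultiJvmDriver
and the way the result file name is derived are my own assumptions for
illustration; the JDK home directories are passed as arguments.

```java
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver, not part of JMH: runs benchmarks.jar once per JDK.
class MultiJvmDriver {

    // Builds the command line that starts benchmarks.jar with the JVM under test.
    static List<String> commandFor(String jdkHome, String benchmarksJar) {
        // Last path element of the JDK home, e.g. "jdk1.8.0_11".
        String jdkName = Paths.get(jdkHome).getFileName().toString();
        List<String> cmd = new ArrayList<>();
        cmd.add(Paths.get(jdkHome, "bin", "java").toString());
        cmd.add("-jar");
        cmd.add(benchmarksJar);
        cmd.add("-rf");
        cmd.add("CSV");
        cmd.add("-rff");
        cmd.add(jdkName + "_results.csv");
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        // Each argument is assumed to be a JDK home directory.
        for (String jdkHome : args) {
            List<String> cmd = commandFor(jdkHome, "target/benchmarks.jar");
            System.out.println("Running: " + String.join(" ", cmd));
            int exit = new ProcessBuilder(cmd).inheritIO().start().waitFor();
            if (exit != 0) {
                System.err.println("Run failed for " + jdkHome + ", exit code " + exit);
            }
        }
    }
}
```

Each run writes its own CSV file, so the results per VM stay separate and
still have to be merged by hand.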
> 
>> I mean in version 0.9.3 ability to specify JVM was added to the @Fork
>> annotation. However it only allows single value to be specified (same
>> as -jvm parameter). I was thinking maybe it should support multiple
>> JVMs?
> 
> ...
> 
>> I understand that this change would require changing output table to
>> also include VM with which benchmarks were executed. Does anyone else
>> think something like that would be a good idea?
> 
> I think that we would need to rethink how benchmark parameters with
> named annotations interact with @Param. It would be profitable to
> adopt/convert named params to @Param-s, which will give us what you
> suggest for free and in a clean manner.
It would be great if named annotations and @Param could be unified. ;-)
> 
>> I think it would be really great if JMH would allow specifying 
>> multiple values for the @Threads annotation. And will execute
>> benchmarks with the number of threads defined there. Of course
>> reporting of the results should then include number of threads that
>> were used to produce result.
> 
> Ditto, see above. Requires the connection with @Param-s.
> 
> 
>> 3) This last point is more of a question on which BenchmarkMode
>> and/or iteration configuration should be used when work inside
>> benchmark method is not constant but instead depends on the parameter
>> for current run.
> 
> 
> 
>> Now because of the dynamic nature of the work done by a benchmark method,
>> what would be the best BenchmarkMode to use, and how best to
>> configure @Warmup/@Measurement iterations?
> 
> AverageTime or Throughput, obviously. I'm not sure how much of a problem
> that is, actually. We should not care how many times @Benchmark was
> called in throughput modes, we should only care it was called *enough*
> times.
Well, my use case here is a different one. I actually want to measure a fixed
amount of work, i.e. a fixed number of invocations. For example, I want to
benchmark how long it takes to add 1000 elements to my data structure. If I rely
on time-based modes like AverageTime, my @Benchmark method will be called an
unknown and unpredictable number of times. If I put a loop inside this method
that adds 1000 elements to the collection, it will not yield the expected
results (i.e. my collections will contain millions of entries and not 1000).

Therefore I think that I should use the @Warmup/@Measurement annotations with
batchSize defined and the SingleShotTime mode, i.e. a technique similar to the
one shown in the JMHSample_26_BatchSize.java example. And of course I’ll have to
use the Java API to invoke my benchmarks with batchSize set dynamically.
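
To make the difference concrete, here is a plain-Java sketch with no JMH
involved. It only mimics the two invocation patterns as I understand them:
a time-based iteration that keeps calling the method until a deadline, and
a single-shot iteration that starts from fresh state and makes exactly
batchSize calls. All names (BatchVsTimed, addOne, timedIteration,
batchIteration) are made up for illustration, and the timing is naive, so
the numbers it prints are not real measurements.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; JMH does the real work differently.
class BatchVsTimed {

    // Stand-in for a @Benchmark method that adds a single element.
    static void addOne(List<Integer> list) {
        list.add(list.size());
    }

    // Time-based pattern: call the method until the deadline passes.
    // Returns the final collection size, which depends on machine speed.
    static int timedIteration(long millis) {
        List<Integer> list = new ArrayList<>();
        long deadline = System.nanoTime() + millis * 1_000_000L;
        do {
            addOne(list);
        } while (System.nanoTime() < deadline);
        return list.size();
    }

    // Batch pattern: fresh state, exactly batchSize calls, one timing for
    // the whole batch. Returns {final size, elapsed nanoseconds}.
    static long[] batchIteration(int batchSize) {
        List<Integer> list = new ArrayList<>();
        long start = System.nanoTime();
        for (int i = 0; i < batchSize; i++) {
            addOne(list);
        }
        long elapsed = System.nanoTime() - start;
        return new long[] { list.size(), elapsed };
    }

    public static void main(String[] args) {
        System.out.println("timed iteration, final size: " + timedIteration(10));
        long[] batch = batchIteration(1000);
        System.out.println("batch iteration, final size: " + batch[0]
                + ", elapsed ns: " + batch[1]);
    }
}
```

The batch variant always ends with exactly 1000 elements, which is the
property I am after; the timed variant ends with however many calls fit
into the time window.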
> 
> Sure, for large benchmarks warmup and measurement need adjusting, but
> that's in users' hands. We could theoretically go for something
> SPECjvm2008 does, and enforce a minimum number of @Benchmark calls to
> constitute an iteration. SPECjvm2008 experience tells us this can be
> frustrating to users who need predictable run time, and to those who have
> large workloads but don't know about that yet -- wasting hours waiting
> for the benchmark to complete.
> 
> 
>> It would also mean that I can’t use my own “batch size” parameter
>> anymore. That would also mean that I won’t be able to run my
>> benchmarks with set of values and instead will have to invoke them
>> multiple times each time with different batchSize. And that would
>> require writing script to drive benchmarks again.
> 
> See, you need advanced behavior with scripting. Java API provides you
> with the opportunity to write "scripts" in Java. It should not be the
> burden of JMH runner to encompass the user-defined benchmarking logic
> for users. Instead, we enable users to tell JMH what to do in
> coarse-grained (via annotations) or in fine-grained (via
> API/command-line) fashion.
> 
>> Maybe it would be a good idea to allow specifying array of batchSize
>> values via command line or annotations. At least this will eliminate
>> scripting part but will require adding this information to the
>> results table (e.g. batchSize 10 value1 batchSize 100 value2 etc.).
> 
> See above. Needs a proper connection with @Param-s.
> 
> Thanks,
> -Aleksey.
> 

Regards,
Dmitry