Need some help for a book intro to JMH

Fri Aug 12 20:58:04 UTC 2016

Hello Bruce,

first of all, I am not sure that using the book's Gradle build to set up
JMH is easier than using the blessed way with Maven archetypes.

But referring to the benchmark, I think the problem is that you test
parallel code (which uses the common fork/join pool) with parallel
benchmarks, and therefore the parallelisation cannot speed things up.
(If there is any problem at all: it would actually be good educational
material to show how different benchmarks can be if they are modelling
reality :)
Running with threads=1 (-t 1), it should speed up on all CPU configs (at
least in this case, for the independent lambda setter).
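
To make the point above concrete, here is a minimal stdlib-only sketch
(not JMH, and not a real measurement) of what "parallel benchmarks over
parallel code" amounts to: several caller threads, standing in for
benchmark threads, each call Arrays.parallelSetAll() on their own array,
and all of them share the one common fork/join pool. The class name
SharedPoolSketch, the method fillFromCallers and the array size are made
up for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedPoolSketch {
  // Hypothetical helper: 'callers' threads each fill their own array
  // with Arrays.parallelSetAll(). Every call is served by the same
  // common fork/join pool, so the callers compete for its workers.
  static long[][] fillFromCallers(int callers, int size) {
    long[][] arrays = new long[callers][size];
    ExecutorService exec = Executors.newFixedThreadPool(callers);
    for (long[] a : arrays) {
      exec.execute(() -> Arrays.parallelSetAll(a, n -> n));
    }
    exec.shutdown();
    try {
      if (!exec.awaitTermination(1, TimeUnit.MINUTES)) {
        throw new IllegalStateException("callers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return arrays;
  }

  public static void main(String[] args) {
    int cpus = Runtime.getRuntime().availableProcessors();
    // One caller (like -t 1) versus one caller per logical CPU.
    for (int callers : new int[] {1, cpus}) {
      long start = System.nanoTime();
      fillFromCallers(callers, 2_000_000);
      long ms = (System.nanoTime() - start) / 1_000_000;
      // Rough wall-clock time only; use JMH for actual numbers.
      System.out.println(callers + " caller(s): " + ms + " ms");
    }
  }
}
```

With one caller the pool works on a single array; with one caller per
CPU the same workers are split across all arrays, so the parallel
version has little room left to beat the sequential one.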

JMH by default uses max threads (the CPUs available to the system, which
is all CMT logical CPUs in the activated cpuset).
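
The CPU count mentioned above, and the size of the common pool that
parallelSetAll() draws on, can be inspected with two standard calls. A
small sketch; the class name PoolInfo and its two helper methods are
hypothetical:

```java
import java.util.concurrent.ForkJoinPool;

public class PoolInfo {
  // Logical CPUs visible to this JVM (respects the active cpuset on
  // platforms where the JVM honours it).
  static int logicalCpus() {
    return Runtime.getRuntime().availableProcessors();
  }

  // Target parallelism of the shared common fork/join pool.
  static int commonPoolParallelism() {
    return ForkJoinPool.getCommonPoolParallelism();
  }

  public static void main(String[] args) {
    System.out.println("logical CPUs:            " + logicalCpus());
    System.out.println("common pool parallelism: " + commonPoolParallelism());
  }
}
```

On a default JVM configuration the common pool targets one worker fewer
than the number of logical CPUs (but at least one), since the calling
thread helps too. The value can be overridden with the system property
java.util.concurrent.ForkJoinPool.common.parallelism.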

Regards
Bernd

On Fri, 12 Aug 2016 11:33:22 -0600, Bruce Eckel
<bruceteckel at gmail.com> wrote:

> I'm using JMH in parts of an upcoming Java book. I've got some other
> parts working but have gotten stuck on the introduction because JMH
> is producing rather unexpected results.
> 
> I'm using Arrays.setAll() vs Arrays.parallelSetAll() here because I've
> already shown the perils of trying to write your own benchmarking
> system using that example and it has demonstrated it well (I can
> include the whole section if you want but I wanted to start by
> focusing on the JMH part).
> 
> To reproduce it:
> git clone https://github.com/BruceEckel/OnJava8-Examples
> cd OnJava8-Examples
> gradlew :verifying:jmh
> 
> So my questions are:
> 1. Is Arrays.setAll() vs Arrays.parallelSetAll() just too tricky and
> subtle as an example, or are there JMH annotations I can add to fix
> this?
> 2. Is there a better introductory example I should be using?
> (Ideally such an example would also show problems when using simple
> timing).
> 
> Here is the introductory section (so far), including test results:
> 
> ### Introducing JMH
> 
> At this writing, the only microbenchmark system for Java that produces
> decent results is the *Java Microbenchmark Harness*
> [JMH](http://openjdk.java.net/projects/code-tools/jmh/). Configuring
> and using JMH by hand is tricky, but fortunately the book's
> `build.gradle` automates everything so you don't need to struggle
> with it.
> 
> It's possible to write JMH code so it can be run from the command
> line, but the recommended approach is to let the JMH system run the
> tests for you. JMH attempts to make the writing of benchmarks as easy
> as possible. For example, we can rewrite `BadMicroBenchmark.java` to
> use JMH:
> 
> ```java
> // verifying/jmhtests/ParallelSetAll.java
> package verifying.jmhtests;
> import java.util.*;
> import org.openjdk.jmh.annotations.*;
> 
> @State(Scope.Thread)
> public class ParallelSetAll {
>   private long[] la;
>   @Setup
>   public void setup() {
>     la = new long[20_000_000];
>   }
>   @Benchmark
>   public void setAll() {
>     Arrays.setAll(la, n -> n);
>   }
>   @Benchmark
>   public void parallelSetAll() {
>     Arrays.parallelSetAll(la, n -> n);
>   }
> }
> ```
> 
> To run the benchmark, you use an explicit command (executed from the
> root directory of the example code), so the benchmarking doesn't
> happen as part of the normal `gradlew run` command:
> 
> ```
> gradlew :verifying:jmh
> ```
> 
> You'll see that it takes a *long* time, around 15 minutes depending
> on your machine. Each test has a default of 20 warmup iterations, and
> 20 test iterations. The output indicates where the results are
> summarized in a `results.txt` file. This is from my 2-core laptop
> running Windows 10:
> 
> ```
> Benchmark                       Mode  Cnt   Score   Error  Units
> ParallelSetAll.parallelSetAll  thrpt  200  33.496 ± 0.125  ops/s
> ParallelSetAll.setAll          thrpt  200  32.936 ± 1.254  ops/s
> ```
> 
> On my 4-core laptop running Windows 10, the results are different:
> 
> ```
> Benchmark                       Mode  Cnt   Score   Error  Units
> ParallelSetAll.parallelSetAll  thrpt  200  19.621 ± 0.254  ops/s
> ParallelSetAll.setAll          thrpt  200  30.875 ± 0.038  ops/s
> ```
> 
> Here are the results from the same laptop, running Linux Mint:
> 
> ```
> Benchmark                       Mode  Cnt   Score   Error  Units
> ParallelSetAll.parallelSetAll  thrpt  200  31.335 ± 0.023  ops/s
> ParallelSetAll.setAll          thrpt  200  34.842 ± 0.066  ops/s
> ```
> 
> On the Appveyor *Continuous Integration* (CI) server, which provides
> two dedicated cores, the results are:
> 
> ```
> Benchmark                       Mode  Cnt   Score   Error  Units
> ParallelSetAll.parallelSetAll  thrpt  200  98.987 ± 0.644  ops/s
> ParallelSetAll.setAll          thrpt  200  53.148 ± 0.165  ops/s
> ```
> 
> On Travis-ci, which provides two "bursted" cores, the results are:
> 
> ```
> Benchmark                       Mode  Cnt   Score   Error  Units
> ParallelSetAll.parallelSetAll  thrpt  200  63.895 ± 6.768  ops/s
> ParallelSetAll.setAll          thrpt  200  54.631 ± 0.298  ops/s
> ```
> 
> 
> 
> -- Bruce Eckel
> www.MindviewInc.com <http://www.mindviewinc.com/>
> Blog: BruceEckel.github.io
> www.WinterTechForum.com
> www.AtomicScala.com
> www.Reinventing-Business.com
> http://www.TrustOrganizations.com <http://www.ScalaSummit.com>