Need some help for a book intro to JMH

Fri Aug 12 21:13:47 UTC 2016

Hi Bruce,

It is "by design"...

I've logged a bug, but I doubt that it will ever be fixed: 
https://bugs.openjdk.java.net/browse/JDK-8161676

Whether you have a single core or dual core (meaning 
Runtime.getRuntime().availableProcessors() returns 1 or 2), the 
parallelism in the common FJP (ForkJoinPool) is reported as 1 (number of 
hardware threads - 1, with a minimum of 1).  So the parallel algorithms 
in Arrays all just throw up their hands and do it sequentially.  Note 
that it looks at the number of hardware threads, rather than cores.
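
If you want to check this on a given machine, something along these lines should do. It is only a sketch: the class name CommonPoolCheck and the helper expectedDefaultParallelism are made up for illustration, and it assumes that nobody has overridden the default with the java.util.concurrent.ForkJoinPool.common.parallelism system property.

```java
import java.util.concurrent.ForkJoinPool;

public class CommonPoolCheck {
  // Hypothetical helper: the default described above, i.e. the number of
  // hardware threads minus one, but never less than 1.
  public static int expectedDefaultParallelism(int hardwareThreads) {
    return Math.max(1, hardwareThreads - 1);
  }

  public static void main(String[] args) {
    // Number of hardware threads the JVM sees (not physical cores).
    int hw = Runtime.getRuntime().availableProcessors();
    // What the common pool actually reports on this JVM.
    int reported = ForkJoinPool.getCommonPoolParallelism();
    System.out.println("availableProcessors  = " + hw);
    System.out.println("expected parallelism = " + expectedDefaultParallelism(hw));
    System.out.println("common pool reports  = " + reported);
  }
}
```

If the two parallelism values differ, the system property has most likely been set, or the JVM in question picks its default differently.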

Regards

Heinz
-- 
Dr Heinz M. Kabutz (PhD CompSci)
Author of "The Java(tm) Specialists' Newsletter"
Sun/Oracle Java Champion since 2005
JavaOne Rock Star Speaker 2012
http://www.javaspecialists.eu
Tel: +30 69 75 595 262
Skype: kabutz

Bruce Eckel wrote:
> I'm using JMH in parts of an upcoming Java book. I've got some other parts
> working but have gotten stuck on the introduction because JMH is producing
> rather unexpected results.
>
> I'm using Arrays.setAll() vs Arrays.parallelSetAll() here because I've
> already shown the perils of trying to write your own benchmarking system
> using that example and it has demonstrated it well (I can include the whole
> section if you want but I wanted to start by focusing on the JMH part).
>
> To reproduce it:
> git clone https://github.com/BruceEckel/OnJava8-Examples
> cd OnJava8-Examples
> gradlew :verifying:jmh
>
> So my questions are:
> 1. Is Arrays.setAll() vs Arrays.parallelSetAll() just too tricky and subtle
> as an example, or are there JMH annotations I can add to fix this?
> 2. Is there a better introductory example I should be using? (Ideally such
> an example would also show problems when using simple timing).
>
> Here is the introductory section (so far), including test results:
>
> ### Introducing JMH
>
> At this writing, the only microbenchmark system for Java that produces
> decent results is the *Java Microbenchmark Harness*
> [JMH](http://openjdk.java.net/projects/code-tools/jmh/). Configuring and
> using JMH by hand is tricky, but fortunately the book's `build.gradle`
> automates everything so you don't need to struggle with it.
>
> It's possible to write JMH code so it can be run from the command line, but
> the recommended approach is to let the JMH system run the tests for you. JMH
> attempts to make the writing of benchmarks as easy as possible. For
> example, we can rewrite `BadMicroBenchmark.java` to use JMH:
>
> ```java
> // verifying/jmhtests/ParallelSetAll.java
> package verifying.jmhtests;
> import java.util.*;
> import org.openjdk.jmh.annotations.*;
>
> @State(Scope.Thread)
> public class ParallelSetAll {
>   private long[] la;
>   @Setup
>   public void setup() {
>     la = new long[20_000_000];
>   }
>   @Benchmark
>   public void setAll() {
>     Arrays.setAll(la, n -> n);
>   }
>   @Benchmark
>   public void parallelSetAll() {
>     Arrays.parallelSetAll(la, n -> n);
>   }
> }
> ```
>
> To run the benchmark, you use an explicit command (executed from the root
> directory of the example code), so the benchmarking doesn't happen as part
> of the normal `gradlew run` command:
>
> ```
> gradlew :verifying:jmh
> ```
>
> You'll see that it takes a *long* time, around 15 minutes depending on your
> machine. Each test has a default of 20 warmup iterations and 20 test
> iterations. The output tells you where to find the `results.txt` file that
> summarizes the results. This is from my 2-core laptop running Windows 10:
>
> ```
> Benchmark                       Mode  Cnt   Score   Error  Units
> ParallelSetAll.parallelSetAll  thrpt  200  33.496 ± 0.125  ops/s
> ParallelSetAll.setAll          thrpt  200  32.936 ± 1.254  ops/s
> ```
>
> On my 4-core laptop running Windows 10, the results are different:
>
> ```
> Benchmark                       Mode  Cnt   Score   Error  Units
> ParallelSetAll.parallelSetAll  thrpt  200  19.621 ± 0.254  ops/s
> ParallelSetAll.setAll          thrpt  200  30.875 ± 0.038  ops/s
> ```
>
> Here are the results from the same laptop, running Linux Mint:
>
> ```
> Benchmark                       Mode  Cnt   Score   Error  Units
> ParallelSetAll.parallelSetAll  thrpt  200  31.335 ± 0.023  ops/s
> ParallelSetAll.setAll          thrpt  200  34.842 ± 0.066  ops/s
> ```
>
> On the Appveyor *Continuous Integration* (CI) server, which provides two
> dedicated cores, the results are:
>
> ```
> Benchmark                       Mode  Cnt   Score   Error  Units
> ParallelSetAll.parallelSetAll  thrpt  200  98.987 ± 0.644  ops/s
> ParallelSetAll.setAll          thrpt  200  53.148 ± 0.165  ops/s
> ```
>
> On Travis CI, which provides two "bursted" cores, the results are:
>
> ```
> Benchmark                       Mode  Cnt   Score   Error  Units
> ParallelSetAll.parallelSetAll  thrpt  200  63.895 ± 6.768  ops/s
> ParallelSetAll.setAll          thrpt  200  54.631 ± 0.298  ops/s
> ```
>
>
>
> -- Bruce Eckel
> www.MindviewInc.com <http://www.mindviewinc.com/>
> Blog: BruceEckel.github.io
> www.WinterTechForum.com
> www.AtomicScala.com
> www.Reinventing-Business.com
> http://www.TrustOrganizations.com <http://www.ScalaSummit.com>
>
>