Usage of Blackhole in a loop distorts benchmark results

Wed Jan 17 10:40:12 UTC 2018

See this post: http://psy-lob-saw.blogspot.com/2014/08/the-volatile-read-suprise.html

And the current BH code: http://hg.openjdk.java.net/code-tools/jmh/file/a0c4f5e23278/jmh-core/src/main/java/org/openjdk/jmh/infra/Blackhole.java#l306

In summary, Blackhole carries the semantics of calling into a non-inlined method plus a memory barrier. This is arguably heavy-handed, but consider the formidable foe (the compiler) we are trying to fool with it. We are trying to prevent dead-code elimination (DCE) of values that are never sunk. It works, yay!

The benchmarks you compare with and without Blackhole are therefore very different in meaning, and so different results are not surprising.
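To make the difference concrete, here is a minimal sketch of what a per-element sink costs compared with returning one accumulated value. It is only loosely modelled on the idea in the linked Blackhole source. The class SinkSketch, its fields, and the methods sinkEach and sumAll are hypothetical names made up for illustration, and the volatile field is an assumption standing in for the barrier described above.

```java
import java.lang.ref.WeakReference;
import java.util.List;

// Hypothetical, simplified sink; NOT the real org.openjdk.jmh.infra.Blackhole.
final class SinkSketch {
    // Assumption: a volatile field read on every call stands in for the barrier.
    private volatile int mask = 1;
    private int state = 1;                  // cheap pseudo-random state
    private WeakReference<Object> escaped;  // rare escape keeps values "used"

    void consume(Object value) {
        int m = this.mask;                  // volatile read, paid on every call
        int next = (this.state = this.state * 1664525 + 1013904223);
        if ((next & m) == 0) {
            // Taken ever more rarely, because the mask widens each time.
            this.escaped = new WeakReference<>(value);
            this.mask = (m << 1) + 1;       // volatile write
        }
    }

    // Shape of the first benchmark pair: one sink call per element.
    static void sinkEach(List<Integer> items, SinkSketch sink) {
        for (Integer item : items) {
            sink.consume(item);
        }
    }

    // Shape of the second pair: accumulate locally, hand back one value.
    static int sumAll(List<Integer> items) {
        int sum = 0;
        for (Integer item : items) {
            sum += item;
        }
        return sum;
    }
}
```

In sinkEach every element pays for the volatile read, and the JIT has to keep each call in order. In sumAll the loop body is plain arithmetic that the compiler is free to optimise, and only the final value has to survive. The two loops therefore measure different things even though they walk the same data.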

    On Wednesday, January 17, 2018 9:47 AM, Сергей Цыпанов <sergei.tsypanov at yandex.ru> wrote:

 Say I have this benchmark:

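import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

import java.util.Collection;
import java.util.Iterator;
import java.util.concurrent.TimeUnit;
import java.util.stream.IntStream;

import static java.util.stream.Collectors.toList;

@BenchmarkMode(Mode.AverageTime)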
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(jvmArgsAppend = {"-XX:+UseParallelGC", "-Xms2g", "-Xmx2g"})
public class IteratorFromStreamBenchmark {

    @Benchmark
    public void iteratorFromStream(Data data, Blackhole bh) {
        Iterator<Integer> iterator = data.items.stream()
                .iterator();

        while (iterator.hasNext())
            bh.consume(iterator.next());
    }

    @Benchmark
    public void forEach(Data data, Blackhole bh) {
        data.items.stream().forEach(bh::consume);
    }

    @State(Scope.Thread)
    public static class Data {
        private Collection<Integer> items;

        private int size = 1000;

        @Setup
        public void init() {
            items = IntStream.range(0, size).boxed().collect(toList());
        }
    }
}

which on Java 9 (JDK 9, VM 9+181) yields this output:

Benchmark           Mode  Cnt     Score     Error  Units
forEach             avgt  100  6130,066 ± 308,597  ns/op
iteratorFromStream  avgt  100  4835,355 ±  57,886  ns/op

Here 'iteratorFromStream' appears to be faster than 'forEach'.

Then I change the behaviour to accumulate the result of iteration over elements and return it:

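import org.openjdk.jmh.annotations.*;

import java.util.Iterator;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.stream.IntStream;

import static java.util.stream.Collectors.toList;

@BenchmarkMode(Mode.AverageTime)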
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(jvmArgsAppend = {"-XX:+UseParallelGC", "-Xms2g", "-Xmx2g"})
public class IteratorFromStreamBenchmark {

    @Benchmark
    public int iteratorFromStream(Data data) {
        int sum = 0;
        Iterator<Integer> iterator = data.list.stream()
                .iterator();

        while (iterator.hasNext())
            sum += iterator.next();

        return sum;
    }

    @Benchmark
    public int forEach(Data data) {
        int[] sum = {0};
        data.list.stream().forEach(integer -> sum[0] = sum[0] + integer);
        return sum[0];
    }

    @State(Scope.Thread)
    public static class Data {
        private List<Integer> list;

        private int size = 100;

        @Setup
        public void init() {
            list = IntStream.range(0, size).boxed().collect(toList());
        }
    }
}

Which yields:

Benchmark           Mode  Cnt    Score    Error  Units
forEach             avgt  100  133,118 ± 1,580  ns/op
iteratorFromStream  avgt  100  228,061 ± 5,491  ns/op

The question is not only the huge difference in absolute values, but also the fact that 'forEach' now appears to be faster than 'iteratorFromStream'. The error is also lower for the value-returning benchmark.

Could anyone explain whether this is the expected behaviour of Blackhole?

Best regards,
Sergei Tsypanov