Request for tracking down C1 optimizations: handwritten cartesian product similar to flatmap/map performance!

Thu May 29 17:55:01 UTC 2014

Hello all,

I would like to ask about the C1 compilation (VM options:
-Xms769m -Xmx769m -XX:-TieredCompilation) of a Cartesian product
computation written with the new Stream API.
I have two versions of this computation, one handwritten and one using
flatMap/map. Remarkably, the two have similar performance, so I would
like to trace back the JIT compilation decisions (apart from inlining)
and, more specifically, find out whether escape analysis has any effect.

...
>
> valuesHi = IntStream.range(0, 10000).mapToLong(i -> i).toArray();
> valuesLo = IntStream.range(0, 1000).mapToLong(i -> i).toArray();
>
> @GenerateMicroBenchmark // -> 4.984 ms / op on avg
> public long cartSeq() {
>    long cart
>          = LongStream.of(valuesHi)
>             .flatMap(d -> LongStream.of(valuesLo).map(dP -> dP * d))
>             .sum();
>    return cart;
> }
>
> @GenerateMicroBenchmark // -> 4.258 ms / op on avg
> public long cartBaseline() {
>     long cart = 0;
>     for (int d = 0 ; d < valuesHi.length ; d++) {
>         for (int dp = 0 ; dp < valuesLo.length ; dp++) {
>             cart += valuesHi[d] * valuesLo[dp];
>         }
>     }
>     return cart;
> }
>
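
For anyone who wants to reproduce this outside JMH, here is a minimal standalone sketch that only checks that the two versions compute the same sum. The class name CartesianCheck and the main method are made up for illustration and are not part of the benchmark; it measures nothing.

```java
import java.util.stream.IntStream;
import java.util.stream.LongStream;

// Hypothetical standalone class, not part of the original benchmark.
class CartesianCheck {
    static final long[] valuesHi = IntStream.range(0, 10000).mapToLong(i -> i).toArray();
    static final long[] valuesLo = IntStream.range(0, 1000).mapToLong(i -> i).toArray();

    // Stream version: one inner stream per element of valuesHi.
    static long cartSeq() {
        return LongStream.of(valuesHi)
                .flatMap(d -> LongStream.of(valuesLo).map(dP -> dP * d))
                .sum();
    }

    // Handwritten version: two nested loops over the same arrays.
    static long cartBaseline() {
        long cart = 0;
        for (int d = 0; d < valuesHi.length; d++) {
            for (int dp = 0; dp < valuesLo.length; dp++) {
                cart += valuesHi[d] * valuesLo[dp];
            }
        }
        return cart;
    }

    public static void main(String[] args) {
        long seq = cartSeq();
        long baseline = cartBaseline();
        if (seq != baseline) {
            throw new AssertionError("results differ: " + seq + " vs " + baseline);
        }
        System.out.println("both versions agree: " + seq);
    }
}
```

Since both arrays hold 0..n-1, the sum also has the closed form (0 + ... + 9999) * (0 + ... + 999), which gives a quick sanity check.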

I've run the code above with -XX:-DoEscapeAnalysis and got the same
execution times, but I would like to confirm what actually happens.
Regarding inlining, the PrintInlining output alone suggests that
cartSeq inlines all the nested forEachRemaining operations
(those of of(), flatMap, and map), but is that the only optimization?
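
To make the escape analysis question concrete: the function passed to flatMap runs once per element of valuesHi, so each cartSeq call builds one inner stream per outer element, and each of those presumably allocates a few short-lived objects (pipeline stages, a spliterator, the capturing lambda). The sketch below only counts how many inner streams get created. The class name InnerStreamCount and the LongAdder counter are assumptions added for illustration, and the counter itself changes what the JIT sees, so this is not a benchmark.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.IntStream;
import java.util.stream.LongStream;

// Hypothetical helper class, not part of the original benchmark.
class InnerStreamCount {
    static final long[] valuesHi = IntStream.range(0, 10000).mapToLong(i -> i).toArray();
    static final long[] valuesLo = IntStream.range(0, 1000).mapToLong(i -> i).toArray();

    // Returns how many times the flatMap function ran, i.e. how many
    // inner LongStream pipelines one cartSeq-style call creates.
    static long countInnerStreams() {
        LongAdder created = new LongAdder();
        LongStream.of(valuesHi)
                .flatMap(d -> {
                    created.increment(); // one increment per inner stream
                    return LongStream.of(valuesLo).map(dP -> dP * d);
                })
                .sum();
        return created.sum();
    }

    public static void main(String[] args) {
        System.out.println("inner streams per call: " + countInnerStreams());
    }
}
```

With the array sizes above this is one inner stream per element of valuesHi, each traversing all of valuesLo. Those per-element objects are the candidates escape analysis could remove, if it applies here at all.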

Thank you in advance!

Cheers,
Aggelos Biboudis