Request for tracking down C1 optimizations: handwritten cartesian product similar to flatmap/map performance!

Vladimir Kozlov vladimir.kozlov at oracle.com
Fri May 30 19:07:06 UTC 2014


Hi Aggelos,

Based on your log file, I see that you are using the JDK 8 FCS build. We
switched off boxing elimination in JDK 8 because of
https://bugs.openjdk.java.net/browse/JDK-8032566. The bug is now fixed in
8u20 and the optimization is enabled again. You can still try the
-XX:+EliminateAutoBox flag on your build. It may help, but that is not
guaranteed :).
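
For illustration only, here is a minimal sketch of the kind of code that
boxing elimination targets. It is not taken from the benchmark in this
thread; the class name BoxingSketch and both methods are hypothetical. The
idea is to run it once with and once without -XX:+EliminateAutoBox and
compare timings or GC activity. Whether the boxes are actually removed
depends on the JVM build and on what the JIT decides to compile.

```java
public class BoxingSketch {
    // Boxed accumulation: every iteration unboxes acc, adds, and re-boxes
    // through Long.valueOf, which can allocate unless the JIT removes the box.
    static long sumBoxed(int n) {
        Long acc = 0L;
        for (int i = 0; i < n; i++) {
            acc = acc + i;
        }
        return acc;
    }

    // Primitive baseline: same arithmetic, no boxing at the source level.
    static long sumPrimitive(int n) {
        long acc = 0L;
        for (int i = 0; i < n; i++) {
            acc += i;
        }
        return acc;
    }

    public static void main(String[] args) {
        int n = 10_000_000; // hypothetical size, large enough to trigger JIT compilation
        long t0 = System.nanoTime();
        long boxed = sumBoxed(n);
        long t1 = System.nanoTime();
        long primitive = sumPrimitive(n);
        long t2 = System.nanoTime();
        System.out.println("boxed     = " + boxed + " in " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("primitive = " + primitive + " in " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

A single timed run like this is only a rough indication; a proper harness
with warmup would be needed for reliable numbers.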

Regards,
Vladimir

On 5/30/14 11:29 AM, Aggelos Biboudis wrote:
> Experimentally, we found that heap allocations do indeed occur. After
> increasing the size of the outer stream (vHigh) and decreasing vLow (keeping
> the total number of iterations at 10000000), we found that the number of
> garbage collections increases significantly and the delay is noticeable.
>
> Kind regards,
> Aggelos Biboudis.
>
>
>
> On Fri, May 30, 2014 at 4:36 PM, Paul Sandoz <paul.sandoz at oracle.com
> <mailto:paul.sandoz at oracle.com>> wrote:
>
>
>     On May 30, 2014, at 2:48 PM, Aggelos Biboudis <biboudis at gmail.com
>     <mailto:biboudis at gmail.com>> wrote:
>
>      > The quick thing to do was to produce the log, including the
>     generated code:
>      >
>      > http://cgi.di.uoa.gr/~biboudis/hotspot_pid5379.log
>      >
>      > This is what I know:
>      >       • The execution happens inside copyInto
>      >       • We have three forEachRemaining calls (of, flatMap, map)
>     that are delegated via the accept methods of the wrapped sink.
>
>     Just two, there is no loop for the map operation.
>
>     In the log, search for the compiled method at address
>     0x00007f1b4909e8d0, then within that method search for the instruction
>     imul, then look at how the code loops around that instruction.
>
>     I can just about fake an understanding of x86 machine code :-) AFAICT
>     the capturing instance of LongUnaryOperator passed to the map
>     operation is unboxed.
>
>     I have not tried it yet, but JITWatch might help to analyse the logs:
>
>     https://github.com/AdoptOpenJDK/jitwatch
>
>     Paul.
>
>
>      >       • the second lambda which is captured is obtained via a
>     method internalMemberName
>      > If scalar replacement happens, I would expect to see the captured
>     lambda spilled on the stack and accessed accordingly when the
>     execution reaches a request to the reference of the mapper lambda
>     inside flatMap. From what I understand, this should have happened in
>     the accept of flatMap where the inner lambda is linked and more
>     specifically "inside" the mapper.apply of LongPipeline:286, right
>     (in terms of runtime execution)? And even more specifically when
>     accepting the captured lambda (lambda$8 to my understanding).
>      >
>      > On a side note, I'll produce a debug build of the vm to examine
>     this better. ;-) Thx for the direction.
>
>
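
For readers following the thread, here is a minimal sketch of the two
variants being compared, together with a way to observe GC activity from
inside the program. The exact benchmark code is not shown in these messages,
so the pipeline shape below (of, flatMap, map, then sum) is an assumption
based on the operations named above. The class name CartesianSketch, the
method names, and the array sizes are hypothetical; only vHigh and vLow come
from the thread.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.stream.LongStream;

public class CartesianSketch {
    // Stream version: the inner lambda captures d, so a capturing lambda and
    // an inner stream are created for every element of vHigh unless the JIT
    // manages to eliminate those allocations.
    static long cartStream(long[] vHigh, long[] vLow) {
        return LongStream.of(vHigh)
                .flatMap(d -> LongStream.of(vLow).map(dP -> dP * d))
                .sum();
    }

    // Handwritten version: plain nested loops over the same arrays.
    static long cartLoops(long[] vHigh, long[] vLow) {
        long sum = 0L;
        for (long d : vHigh) {
            for (long dP : vLow) {
                sum += dP * d;
            }
        }
        return sum;
    }

    // Total collection count over all collectors; a collector that does not
    // report a count returns -1, which is skipped here.
    static long gcCount() {
        long total = 0L;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical split: outer * inner = 10_000_000 iterations in total.
        long[] vHigh = LongStream.range(0, 1_000_000).toArray();
        long[] vLow = LongStream.range(0, 10).toArray();

        long before = gcCount();
        long streamSum = cartStream(vHigh, vLow);
        long after = gcCount();

        System.out.println("stream sum = " + streamSum + ", collections = " + (after - before));
        System.out.println("loop sum   = " + cartLoops(vHigh, vLow));
    }
}
```

Changing the split between vHigh and vLow while keeping their product fixed
gives a rough way to repeat the experiment described at the top of the
thread.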


More information about the hotspot-compiler-dev mailing list