Request for tracking down C1 optimizations: handwritten cartesian product similar to flatmap/map performance!
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri May 30 19:07:06 UTC 2014
Hi Aggelos,
Based on your log file I see that you use jdk8 fcs. We switched off
boxing elimination in jdk8 because of
https://bugs.openjdk.java.net/browse/JDK-8032566. The bug is fixed in
8u20 now and the optimization is enabled again. But you can try
-XX:+EliminateAutoBox flag anyway. Maybe it can help you, but it is not
guaranteed :).
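For readers of the archive, here is a minimal sketch of the kind of comparison discussed in this thread, assuming the benchmark computes a sum of products over the cartesian product of two long arrays. The class name CartesianSketch, the method names, and the array sizes are hypothetical and are not taken from the original benchmark; only vHigh, vLow, and the of/flatMap/map pipeline shape come from the quoted messages.

```java
import java.util.stream.LongStream;

public class CartesianSketch {
    // Stream version: of -> flatMap -> map, with the inner lambda capturing x.
    static long streamSum(long[] vHigh, long[] vLow) {
        return LongStream.of(vHigh)
                .flatMap(x -> LongStream.of(vLow).map(y -> x * y))
                .sum();
    }

    // Handwritten version: two nested loops, no lambdas or sinks involved.
    static long loopSum(long[] vHigh, long[] vLow) {
        long sum = 0;
        for (long x : vHigh) {
            for (long y : vLow) {
                sum += x * y;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical sizes; the thread varies these while keeping the product fixed.
        long[] vHigh = LongStream.range(0, 1_000).toArray();
        long[] vLow = LongStream.range(0, 10).toArray();
        System.out.println("stream: " + streamSum(vHigh, vLow));
        System.out.println("loop:   " + loopSum(vHigh, vLow));
    }
}
```

Assuming the class is compiled under that name, it could be run once as "java CartesianSketch" and once as "java -XX:+EliminateAutoBox CartesianSketch" to see whether the flag changes anything on a given build. This is not a proper benchmark harness (there is no warm-up or timing), so it only illustrates the shape of the two variants.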
Regards,
Vladimir
On 5/30/14 11:29 AM, Aggelos Biboudis wrote:
> Experimentally we found that heap allocations do indeed occur. After
> increasing the size of the outer stream (vHigh) and decreasing that of
> vLow (keeping the total number of iterations at 10000000), we discovered
> that the number of GC collections increases significantly and the delay
> is noticeable.
>
> Kind regards,
> Aggelos Biboudis.
>
>
>
> On Fri, May 30, 2014 at 4:36 PM, Paul Sandoz <paul.sandoz at oracle.com>
> wrote:
>
>
> On May 30, 2014, at 2:48 PM, Aggelos Biboudis <biboudis at gmail.com>
> wrote:
>
> > The quick thing to do was to produce the log including the
> > generated code:
> >
> > http://cgi.di.uoa.gr/~biboudis/hotspot_pid5379.log
> >
> > This is what I know:
> > • The execution happens inside copyInto
> > • We have three forEachRemaining calls (of, flatMap, map)
> >   that are delegated via the accept methods of the wrapped sink.
>
> Just two, there is no loop for the map operation.
>
> In the log search for the compiled method at address
> 0x00007f1b4909e8d0, then within that search for the instruction
> imul, then look how the code loops surrounding that instruction.
>
> I can just about fake an understanding of x86 machine code :-) AFAICT
> the capturing instance of LongUnaryOperator passed to the map
> operation is unboxed.
>
> I have not tried it yet, but JITWatch might help to analyse the logs:
>
> https://github.com/AdoptOpenJDK/jitwatch
>
> Paul.
>
>
> > • the second lambda which is captured is obtained via a
> >   method internalMemberName
> > If scalar replacement happens I should expect to see the captured
> > lambda spilled on the stack and accessed accordingly when the
> > execution reaches a request to the reference of the mapper lambda
> > inside flatMap. From what I understand, this should have happened in
> > the accept of flatMap where the inner lambda is linked and more
> > specifically "inside" the mapper.apply of LongPipeline:286, right
> > (in terms of runtime execution)? And even more specifically when
> > accepting the captured lambda (lambda$8 to my understanding).
> >
> > On a side note, I'll produce a debug build of the vm to examine
> > this better. ;-) Thx for the direction.
>
>