Request for tracking down C1 optimizations: handwritten cartesian product similar to flatmap/map performance!

Aggelos Biboudis biboudis at gmail.com
Fri May 30 18:29:17 UTC 2014


Experimentally, we found that heap allocations do indeed occur. After
increasing the size of the outer stream (of vHigh) and decreasing that of
vLow (keeping the total number of iterations at 10000000), we found that
the number of GC collections increases significantly and the delay is
noticeable.
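
For anyone who wants to reproduce the experiment described above, here is a
minimal sketch, not the original benchmark. The class name, the array
sizes, the fill values and the use of GarbageCollectorMXBean to count
collections are all assumptions made for illustration; only the names vHigh
and vLow, the flatMap/map shape and the total of 10000000 iterations come
from this thread.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.stream.LongStream;

// Hypothetical harness: compares GC activity for two shapes of the same
// cartesian product, both with 10,000,000 inner iterations in total.
final class CartesianGcSketch {

    // Sum of collection counts over all collectors; a count of -1
    // (undefined for that collector) is skipped.
    static long gcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Stream version: the inner lambda captures d, and one inner stream
    // is created per element of vHigh.
    static long cartesianStream(long[] vHigh, long[] vLow) {
        return LongStream.of(vHigh)
                .flatMap(d -> LongStream.of(vLow).map(dp -> dp * d))
                .sum();
    }

    // Handwritten baseline computing the same sum with nested loops.
    static long cartesianLoop(long[] vHigh, long[] vLow) {
        long sum = 0;
        for (long d : vHigh) {
            for (long dp : vLow) {
                sum += dp * d;
            }
        }
        return sum;
    }

    // Small values keep the sum far away from long overflow.
    static long[] fill(int n) {
        return LongStream.range(0, n).map(i -> i % 1000).toArray();
    }

    public static void main(String[] args) {
        // Assumed shapes: {size of vHigh, size of vLow}.
        int[][] shapes = { {1_000, 10_000}, {1_000_000, 10} };
        for (int[] shape : shapes) {
            long[] vHigh = fill(shape[0]);
            long[] vLow = fill(shape[1]);
            long before = gcCount();
            long result = cartesianStream(vHigh, vLow);
            long after = gcCount();
            System.out.printf("vHigh=%d vLow=%d sum=%d collections=%d%n",
                    shape[0], shape[1], result, after - before);
        }
    }
}
```

The collection counts depend on heap size, collector and warm-up, so a
single run like this is only indicative. Running with -verbose:gc gives a
second view of the same effect.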

Kind regards,
Aggelos Biboudis.



On Fri, May 30, 2014 at 4:36 PM, Paul Sandoz <paul.sandoz at oracle.com> wrote:

>
> On May 30, 2014, at 2:48 PM, Aggelos Biboudis <biboudis at gmail.com> wrote:
>
> > The quick thing to do was to produce the log, including the generated code:
> >
> > http://cgi.di.uoa.gr/~biboudis/hotspot_pid5379.log
> >
> > These are what I know:
> >       • The execution happens inside copyInto.
> >       • We have three forEachRemaining calls (of, flatMap, map) that are
> >         delegated via the accept methods of the wrapped sink.
>
> Just two, there is no loop for the map operation.
>
> In the log, search for the compiled method at address 0x00007f1b4909e8d0,
> then within that search for the instruction imul, then look at how the code
> loops around that instruction.
>
> I can just about fake an understanding of x86 machine code :-) AFAICT the
> capturing instance of LongUnaryOperator passed to the map operation is
> unboxed.
>
> I have not tried it yet, but JITWatch might help to analyse the logs:
>
>   https://github.com/AdoptOpenJDK/jitwatch
>
> Paul.
>
>
> >       • The second lambda, which is captured, is obtained via a method
> >         internalMemberName.
> > If scalar replacement happens, I should expect to see the captured lambda
> > spilled on the stack and accessed accordingly when the execution reaches a
> > request to the reference of the mapper lambda inside flatMap. From what I
> > understand, this should have happened in the accept of flatMap where the
> > inner lambda is linked, and more specifically "inside" the mapper.apply of
> > LongPipeline:286, right (in terms of runtime execution)? And even more
> > specifically when accepting the captured lambda (lambda$8 to my
> > understanding).
> >
> > On a side note, I'll produce a debug build of the VM to examine this
> > better. ;-) Thx for the direction.
>
>