Request for tracking down C1 optimizations: handwritten cartesian product similar to flatmap/map performance!

Thu May 29 18:39:36 UTC 2014

Of course, C2, my bad.

Regarding cache misses, I still need to check whether loop interchange
happens and, more generally, what x86 code is emitted, but thanks for
pointing that out.

Aggelos

On Thu, May 29, 2014 at 9:03 PM, Andrew Haley <aph at redhat.com> wrote:

> Hi,
>
> On 05/29/2014 06:55 PM, Aggelos Biboudis wrote:
>
> > I would like to ask you something regarding C1 compilation (VM options:
> > -Xms769m -Xmx769m -XX:-TieredCompilation)
>
> That's C2 compilation.
>
> > of a Cartesian product stream
> > operation with the new stream API.
> > I have two versions of this computation, one handwritten and one with
> > flatmap/map. It is remarkable that these two have similar performance so
> > I would like to trace-back the JIT compilation decisions (apart from
> > inlining), and more specifically if escape analysis has any effect.
>
> Are you quite sure your numbers aren't dominated by cache misses?  Your
> data is about 40 Megabytes and it's being accessed sequentially.
>
> > I've tested the code above with -XX:-DoEscapeAnalysis and I've got the
> > same execution times, however I would like to confirm what happens.
> > Regarding inlining, only by noticing the result of PrintInlining we
> > conclude that cartSeq inlines all the nested forEachRemaining operations
> > (of of, flatmap, map), but is that the only optimization?
>
> Not if this really is C2, no.  There are many optimization passes,
> and several will be effective for this code.
>
> Andrew.
>
>
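For readers without the original attachment, the two variants under
discussion might look roughly like the sketch below. This is an assumption,
not the code from the thread: only the name cartSeq appears in the message;
the class name, cartLoop, the element type, the array sizes, and the
sum-of-products reduction are hypothetical stand-ins. A single run like this
is not a benchmark; a harness such as JMH would be needed for timings.

```java
import java.util.stream.LongStream;

// Hypothetical reconstruction of the two variants compared in the thread.
// Flags mentioned in the thread that can be passed when running it:
//   -XX:-TieredCompilation   (C2 only, as Andrew points out)
//   -XX:-DoEscapeAnalysis    (disable escape analysis)
//   -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining  (log inlining decisions)
final class CartesianSum {

    // Handwritten baseline: plain nested loops over both arrays.
    static long cartLoop(long[] hi, long[] lo) {
        long sum = 0;
        for (int i = 0; i < hi.length; i++) {
            for (int j = 0; j < lo.length; j++) {
                sum += hi[i] * lo[j];
            }
        }
        return sum;
    }

    // Stream version: flatMap over the outer array, map over the inner one.
    static long cartSeq(long[] hi, long[] lo) {
        return LongStream.of(hi)
                .flatMap(x -> LongStream.of(lo).map(y -> x * y))
                .sum();
    }

    public static void main(String[] args) {
        // Sizes are arbitrary and much smaller than the data in the thread.
        long[] hi = LongStream.range(0, 1_000).toArray();
        long[] lo = LongStream.range(0, 100).toArray();

        long a = cartLoop(hi, lo);
        long b = cartSeq(hi, lo);
        System.out.println("results agree: " + (a == b));
    }
}
```

Both methods compute the same reduction, so any difference seen under the
flags above comes from how the JIT treats the stream pipeline rather than
from the arithmetic itself.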