Intrinsics for Math.min and max

Vitaly Davidovich vitalyd at gmail.com
Wed Apr 2 13:03:25 UTC 2014


Andrew,

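Do you mean that substantial unrolling occurs with certain probabilities? I
think that's where cmov really hurts (for predictable branches): a cmov has
to wait for both of its inputs, so the CPU is slowed down by dependency
chains, whereas a correctly predicted branch lets it speculate ahead.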

As I mentioned in my other reply, it's hard to build a software model of a
branch prediction unit, as these are not simple probability counters but
look at branch patterns.  I think the best HotSpot can do here is to record
only the highly likely/unlikely code paths and leave the gray area alone
(i.e. prefer jumps).
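
To illustrate why a taken/not-taken ratio alone says little about
predictability, here is a minimal sketch. It is a textbook-style toy, not a
model of any real CPU, and every name in it (PredictorSketch, twoBitMisses,
historyMisses, alternating) is made up for this example. It compares a
single 2-bit saturating counter with a small table of such counters indexed
by recent branch history. On a strictly alternating branch (taken 50% of
the time), the single counter mispredicts about half of the outcomes, while
the history-indexed table learns the pattern after a brief warm-up.

```java
import java.util.Arrays;
import java.util.Random;

// Toy branch predictors; hypothetical names, not any real hardware design.
class PredictorSketch {

    // Strictly alternating outcomes: taken, not taken, taken, ...
    static boolean[] alternating(int n) {
        boolean[] outcomes = new boolean[n];
        for (int i = 0; i < n; i++) {
            outcomes[i] = (i % 2 == 0);
        }
        return outcomes;
    }

    // One 2-bit saturating counter (states 0..3); predicts taken when >= 2.
    static int twoBitMisses(boolean[] outcomes) {
        int counter = 2; // start at "weakly taken"
        int misses = 0;
        for (boolean taken : outcomes) {
            boolean predicted = counter >= 2;
            if (predicted != taken) misses++;
            counter = taken ? Math.min(3, counter + 1) : Math.max(0, counter - 1);
        }
        return misses;
    }

    // A table of 2-bit counters indexed by the last historyBits outcomes.
    static int historyMisses(boolean[] outcomes, int historyBits) {
        int[] counters = new int[1 << historyBits];
        Arrays.fill(counters, 2);
        int mask = counters.length - 1;
        int history = 0;
        int misses = 0;
        for (boolean taken : outcomes) {
            boolean predicted = counters[history] >= 2;
            if (predicted != taken) misses++;
            counters[history] = taken
                    ? Math.min(3, counters[history] + 1)
                    : Math.max(0, counters[history] - 1);
            history = ((history << 1) | (taken ? 1 : 0)) & mask;
        }
        return misses;
    }

    public static void main(String[] args) {
        boolean[] alt = alternating(1000);
        System.out.println("alternating, 2-bit counter: " + twoBitMisses(alt));
        System.out.println("alternating, 2-bit history: " + historyMisses(alt, 2));

        // Same 50% ratio, but with no pattern to learn.
        Random random = new Random(42);
        boolean[] rnd = new boolean[1000];
        for (int i = 0; i < rnd.length; i++) {
            rnd[i] = random.nextBoolean();
        }
        System.out.println("random, 2-bit counter: " + twoBitMisses(rnd));
        System.out.println("random, 2-bit history: " + historyMisses(rnd, 2));
    }
}
```

Both inputs have the same 50/50 ratio, so a profile that records only
counts cannot tell them apart, even though one is trivially predictable and
the other is not.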

On Apr 2, 2014 5:59 AM, "Andrew Haley" <aph at redhat.com> wrote:

> On 04/02/2014 12:31 AM, Vitaly Davidovich wrote:
> > Thanks for putting the jmh code inline.
> >
> > Yes, I tend to agree with not forcing cmov in the intrinsic given modern
> > hardware (unless, of course, profiling via interpreter shows the branch
> > highly unpredictable).  Perhaps JIT should see if the min/max is executed
> > in a loop body, and if so, consider it predictable (and generate jumps);
> > if outside a loop, it probably doesn't matter for perf all that much
> > whether it's cmov or jump.
>
> When probabilities are equal (i.e. max selects its left and right args
> equally often, code appended) HotSpot generates the same code for the
> intrinsic and for the hand-written ("own") version:
>
>   0x00007f2659249fc1: mov    $0x80000000,%eax   ;*aload_2
>                                                 ; - org.openjdk.jmh.samples.JmhMaxBenchmark::intrinsic@20 (line 71)
>
>   ;; B6: #      B6 B7 <- B5 B6  Loop: B6-B6 inner pre of N141 Freq: 1.99806
>
>   0x00007f2659249fc6: mov    0x10(%rdx,%r10,4),%ecx  ;*iaload
>                                                 ; - org.openjdk.jmh.samples.JmhMaxBenchmark::intrinsic@23 (line 71)
>
>   0x00007f2659249fcb: cmp    %ecx,%eax
>   0x00007f2659249fcd: cmovl  %ecx,%eax          ;*invokestatic max
>                                                 ; - org.openjdk.jmh.samples.JmhMaxBenchmark::intrinsic@29 (line 71)
>
>   0x00007f2659249fd0: inc    %r10d              ;*iinc
>                                                 ; - org.openjdk.jmh.samples.JmhMaxBenchmark::intrinsic@33 (line 71)
>
>   0x00007f2659249fd3: cmp    $0x1,%r10d
>   0x00007f2659249fd7: jl     0x00007f2659249fc6  ;*if_icmpge
>                                                 ; - org.openjdk.jmh.samples.JmhMaxBenchmark::intrinsic@17 (line 71)
>
> and
>
>   0x00007fcf3124a283: mov    $0x80000000,%eax   ;*aload_2
>                                                 ; - org.openjdk.jmh.samples.JmhMaxBenchmark::own@20 (line 77)
>
>   ;; B6: #      B6 B7 <- B5 B6  Loop: B6-B6 inner pre of N157 Freq: 1.99805
>
>   0x00007fcf3124a288: mov    0x10(%rdx,%r10,4),%ecx  ;*iaload
>                                                 ; - org.openjdk.jmh.samples.JmhMaxBenchmark::own@23 (line 77)
>
>   0x00007fcf3124a28d: cmp    %ecx,%eax
>   0x00007fcf3124a28f: cmovl  %ecx,%eax          ;*ireturn
>                                                 ; - org.openjdk.jmh.samples.JmhMaxBenchmark::max@10 (line 82)
>                                                 ; - org.openjdk.jmh.samples.JmhMaxBenchmark::own@30 (line 77)
>
>   0x00007fcf3124a292: inc    %r10d              ;*iinc
>                                                 ; - org.openjdk.jmh.samples.JmhMaxBenchmark::own@34 (line 77)
>
>   0x00007fcf3124a295: cmp    $0x1,%r10d
>   0x00007fcf3124a299: jl     0x00007fcf3124a288  ;*if_icmpge
>                                                 ; - org.openjdk.jmh.samples.JmhMaxBenchmark::own@17 (line 77)
>
> Unsurprisingly, the measured time is the same for own() and intrinsic().
>
> I am concerned that the inner part of the loop is too small.  As much
> time is spent in the loop machinery as in the actual calculation.  I
> have noticed that substantial inlining occurs with some probabilities,
> and this might significantly change the measurements.
>
> With the setup code below it's easy to fiddle with probabilities and
> see what happens.
>
> Andrew.
>
>
>     @Setup public void setUp() {
>         final Random random = new Random();
>         for (int i=0; i<table.length; ++i)  {
>             // table[i] = random.nextInt();
>             if (random.nextDouble() > 0.5) {
>                 table[i] = i;
>             } else {
>                 table[i] = -i;
>             }
>         }
>     }
>


More information about the hotspot-compiler-dev mailing list