jvm handling of long indexed for loops

Remi Forax forax at univ-mlv.fr
Thu Jun 19 00:35:08 UTC 2014


On 06/19/2014 02:15 AM, Vitaly Davidovich wrote:
>
> Yes, hotspot has a bunch of loop optimizations that only kick in when 
> the counter is an int.  My hunch is that, given your code, the 
> difference is that the int loop is unrolled and the long one isn't.  
> Checking the generated assembly should deny/confirm that, and 
> generally shed light on the compiled differences.
>
> As an aside, System.nanoTime is more appropriate for timing execution; 
> prefer that over System.currentTimeMillis.
>
> Sent from my phone
>

yes, I can see that with PrintAssembly,
also each method test* is compiled twice, first as OSR then as a normal 
method.

I've discover something worst, this benchmark is weird on my haswell laptop
(I see the same kind of result as Alexander on my sandy bridge laptop),

Here is the execution on my haswell:
Iteration 0: x = 2305843005992468481  time(int loop) = 722
Iteration 1: x = 2305843005992468481  time(int loop) = 722
Iteration 2: x = 2305843005992468481  time(int loop) = 712
Iteration 3: x = 2305843005992468481  time(int loop) = 710
Iteration 4: x = 2305843005992468481  time(int loop) = 711
Iteration 5: x = 2305843005992468481  time(int loop) = 740
Iteration 6: x = 2305843005992468481  time(int loop) = 711
Iteration 7: x = 2305843005992468481  time(int loop) = 713
Iteration 8: x = 2305843005992468481  time(int loop) = 712
Iteration 9: x = 2305843005992468481  time(int loop) = 716
Iteration 0: x = 2305843005992468481 time(long loop) = 658
Iteration 1: x = 2305843005992468481 time(long loop) = 656
Iteration 2: x = 2305843005992468481 time(long loop) = 666
Iteration 3: x = 2305843005992468481 time(long loop) = 663
Iteration 4: x = 2305843005992468481 time(long loop) = 667
Iteration 5: x = 2305843005992468481 time(long loop) = 662
Iteration 6: x = 2305843005992468481 time(long loop) = 662
Iteration 7: x = 2305843005992468481 time(long loop) = 662
Iteration 8: x = 2305843005992468481 time(long loop) = 681
Iteration 9: x = 2305843005992468481 time(long loop) = 665

The long version is faster despite the fact that the generated code is a 
simple loop with no unrolling,
I've copy/pasted the code generated for testLong (non-OSR) below.
As you can see, the code is very straighforward,
there is even no removing of the safepoint check in the body of the loop 
(at 0x00007fbc7d06c967).

   # {method} {0x00007fbc7cc005c8} 'testLong' '(J)J' in 'Test'
   # parm0:    rsi:rsi   = long
   #           [sp+0x20]  (sp of caller)
   0x00007fbc7d06c940: sub    $0x18,%rsp
   0x00007fbc7d06c947: mov    %rbp,0x10(%rsp)    ;*synchronization entry
                                                 ; - Test::testLong at -1 
(line 13)

   0x00007fbc7d06c94c: xor    %eax,%eax
   0x00007fbc7d06c94e: test   %rsi,%rsi
   0x00007fbc7d06c951: jle    0x00007fbc7d06c972  ;*iflt
                                                 ; - Test::testLong at 23 
(line 15)

   0x00007fbc7d06c953: mov    $0x1,%r10d
   0x00007fbc7d06c959: jmp    0x00007fbc7d06c967
   0x00007fbc7d06c95b: nopl   0x0(%rax,%rax,1)
   0x00007fbc7d06c960: add    %r10,%rax          ;*ladd
                                                 ; - Test::testLong at 11 
(line 16)

   0x00007fbc7d06c963: add    $0x1,%r10          ; OopMap{off=39}
                                                 ;*lcmp
                                                 ; - Test::testLong at 22 
(line 15)

   0x00007fbc7d06c967: test   %eax,0xb6e9693(%rip)        # 
0x00007fbc88756000
                                                 ;*lcmp
                                                 ; - Test::testLong at 22 
(line 15)
                                                 ;   {poll}
   0x00007fbc7d06c96d: cmp    %rsi,%r10
   0x00007fbc7d06c970: jl     0x00007fbc7d06c960  ;*lload_2
                                                 ; - Test::testLong at 26 
(line 19)

   0x00007fbc7d06c972: add    $0x10,%rsp
   0x00007fbc7d06c976: pop    %rbp
   0x00007fbc7d06c977: test   %eax,0xb6e9683(%rip)        # 
0x00007fbc88756000
                                                 ;   {poll_return}
   0x00007fbc7d06c97d: retq
   0x00007fbc7d06c97e: hlt
   0x00007fbc7d06c97f: hlt

it seems that the Haswell prefers plain old loop ??

cheers,
Rémi

> On Jun 18, 2014 7:52 PM, "Alexander Weld" <alexander.weld at oracle.com 
> <mailto:alexander.weld at oracle.com>> wrote:
>
>     Hi,
>
>     I have some questions on how the JVM handles long values,
>     especially as
>     index in for loops.
>
>     I have the following test class, which tracks the running time of two
>     versions of a simple for loop, one using an int-value as index and one
>     using a long-value as index:
>
>     public class Test {
>
>       public static long testInt(int maxIter) {
>         long x = 0;
>
>         for (int i = 0; i < maxIter; i++) {
>           x += i;
>         }
>
>         return x;
>       }
>       public static long testLong(long maxIter) {
>         long x = 0;
>
>         for (long l = 0; l < maxIter; l++) {
>           x += l;
>         }
>
>         return x;
>       }
>
>       public static void main(String[] args) {
>
>         final int MAX_ITER_INT = Integer.MAX_VALUE;
>         final long MAX_ITER_LONG = (long) MAX_ITER_INT;
>
>         long t1, t2;
>
>         for (int i = 0; i < 10; i++) {
>           long x = 0;
>
>           t1 = System.currentTimeMillis();
>           x = testInt(MAX_ITER_INT);
>           t2 = System.currentTimeMillis();
>
>           System.out.println("Iteration " + i + ": x = " + x + "  time(int
>     loop) = " + (t2 - t1));
>         }
>
>         for (int i = 0; i < 10; i++) {
>           long x = 0;
>
>           t1 = System.currentTimeMillis();
>           x = testLong(MAX_ITER_LONG);
>           t2 = System.currentTimeMillis();
>
>           System.out.println("Iteration " + i + ": x = " + x + " time(long
>     loop) = " + (t2 - t1));
>         }
>
>       }
>
>     }
>
>     When I run this on our cluster (*) I get the following results:
>
>     $ java Test
>     Iteration 0: x = 2305843005992468481  time(int loop) = 945
>     Iteration 1: x = 2305843005992468481  time(int loop) = 937
>     Iteration 2: x = 2305843005992468481  time(int loop) = 937
>     Iteration 3: x = 2305843005992468481  time(int loop) = 938
>     Iteration 4: x = 2305843005992468481  time(int loop) = 938
>     Iteration 5: x = 2305843005992468481  time(int loop) = 937
>     Iteration 6: x = 2305843005992468481  time(int loop) = 937
>     Iteration 7: x = 2305843005992468481  time(int loop) = 938
>     Iteration 8: x = 2305843005992468481  time(int loop) = 938
>     Iteration 9: x = 2305843005992468481  time(int loop) = 937
>     Iteration 0: x = 2305843005992468481 time(long loop) = 1510
>     Iteration 1: x = 2305843005992468481 time(long loop) = 1509
>     Iteration 2: x = 2305843005992468481 time(long loop) = 1799
>     Iteration 3: x = 2305843005992468481 time(long loop) = 1799
>     Iteration 4: x = 2305843005992468481 time(long loop) = 1799
>     Iteration 5: x = 2305843005992468481 time(long loop) = 1799
>     Iteration 6: x = 2305843005992468481 time(long loop) = 1799
>     Iteration 7: x = 2305843005992468481 time(long loop) = 1799
>     Iteration 8: x = 2305843005992468481 time(long loop) = 1799
>     Iteration 9: x = 2305843005992468481 time(long loop) = 1799
>
>     My questions are:
>
>       (1) Why is the long-version slower compared to the int-version?
>     Is it
>     the int to long conversion? Or are there some optimizations, that are
>     only applied to int loops?
>
>       (2) What is the reason for the long-version getting slower after
>     "Iteration 1"?
>
>     Thanks,
>     Alex
>
>     (*) System information:
>
>     Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
>
>     $ lsb_release -d
>     Description:    Oracle Linux Server release 6.5
>
>     $ java -version
>     java version "1.7.0_55"
>     Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
>     Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)
>
>     (Let me know if you need more information)
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20140619/7127ec18/attachment-0001.html>


More information about the hotspot-compiler-dev mailing list