A simple optimization proposal

Paul Sandoz paul.sandoz at oracle.com
Mon Feb 17 02:58:47 PST 2014


Hi Kris,

Thanks for the explanation (a very educational email thread; I am not very familiar with this area).

I was wondering about the bulk iteration methods (forEach*); for example, see ArrayDeque.DeqSpliterator.forEachRemaining:

        public void forEachRemaining(Consumer<? super E> consumer) {
            if (consumer == null)
                throw new NullPointerException();
            Object[] a = deq.elements;
            int m = a.length - 1, f = getFence(), i = index;
            index = f;
            while (i != f) {
                @SuppressWarnings("unchecked") E e = (E)a[i];
                i = (i + 1) & m;
                if (e == null)
                    throw new ConcurrentModificationException();
                consumer.accept(e);
            }
        }

I thought that if I reshaped it as follows, masking the initial index and guarding on a.length > 0 so that i always lies within [0, a.length - 1], the range check might get eliminated:

        public void forEachRemaining(Consumer<? super E> consumer) {
            if (consumer == null)
                throw new NullPointerException();
            Object[] a = deq.elements;
            if (a.length > 0) {
                int m = a.length - 1, f = getFence(), i = (index & m);
                index = f;
                while (i != f) {
                    @SuppressWarnings("unchecked") E e = (E)a[i];
                    i = (i + 1) & m;
                    if (e == null)
                        throw new ConcurrentModificationException();
                    consumer.accept(e);
                }
            }
        }

but AFAICT the range check is not eliminated, either with the above or with, say:

    private static void foo5(Object[] a, int x, int f, Consumer c) throws Exception {
        if (c == null) throw new NullPointerException();

        if (a.length > 0) {
            int m = a.length - 1, i = (x & m);
            while (i != f) {
                Object o = a[i];
                i = (i + 1) & m;
                c.accept(o);
            }
        }
    }
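The reasoning behind the reshaping can be checked in isolation. The following is only a sketch, not part of the JDK or of my test class: the class and method names (MaskBoundsSketch, maskedIndexInRange, countViolations) are made up for illustration. It shows that once a.length > 0 holds, the mask m = a.length - 1 is non-negative, so (x & m) always lands in [0, m] whatever x is, which is the fact the compiler would need to prove in order to drop the check.

```java
class MaskBoundsSketch {
    // True when (x & (length - 1)) is a legal index into an array of the given length.
    static boolean maskedIndexInRange(int length, int x) {
        int m = length - 1;      // non-negative whenever length > 0
        int i = x & m;           // AND with a non-negative mask gives 0 <= i <= m
        return i >= 0 && i < length;
    }

    // Counts violations over small lengths (not only powers of two)
    // and a handful of edge-case values for x.
    static int countViolations() {
        int[] samples = {Integer.MIN_VALUE, -1, 0, 1, 7, 1 << 20, Integer.MAX_VALUE};
        int violations = 0;
        for (int length = 1; length <= 1024; length++) {
            for (int x : samples) {
                if (!maskedIndexInRange(length, x)) {
                    violations++;
                }
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println("violations: " + countViolations());
    }
}
```

Note that the in-range property holds for any positive length, while the wrap-around step i = (i + 1) & m only visits every slot when the length is a power of two.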

Here is the compilation output (on my Mac) of the loop for foo5. I am not very familiar with the x86 output produced by HotSpot, but I can sort of reverse engineer some of what is going on:

  0x000000010309b1b0: mov    0x8(%rsp),%r8
  0x000000010309b1b5: mov    0x10(%rsp),%r9
  0x000000010309b1ba: mov    (%rsp),%ecx
  0x000000010309b1bd: mov    0x18(%rsp),%r10d
  0x000000010309b1c2: mov    0x20(%rsp),%r11    ;*aload_0
                                                ; - Test::foo5 at 35 (line 20)

  0x000000010309b1c7: cmp    %ecx,%ebp      <--- @@@ Range check?
  0x000000010309b1c9: jae    0x000000010309b204  ;*aaload
                                                ; - Test::foo5 at 38 (line 20)

  0x000000010309b1cb: mov    %r11,0x20(%rsp)
  0x000000010309b1d0: mov    %ecx,(%rsp)
  0x000000010309b1d3: mov    %r8,0x8(%rsp)
  0x000000010309b1d8: mov    0x10(%r9,%rbp,4),%r8d
  0x000000010309b1dd: mov    %r9,0x10(%rsp)
  0x000000010309b1e2: inc    %ebp
  0x000000010309b1e4: mov    %r10d,0x18(%rsp)
  0x000000010309b1e9: and    %r10d,%ebp         ;*iand
                                                ; - Test::foo5 at 47 (line 21)

  0x000000010309b1ec: mov    %r8,%rdx
  0x000000010309b1ef: shl    $0x3,%rdx          ;*aaload
                                                ; - Test::foo5 at 38 (line 20)

  0x000000010309b1f3: mov    %r11,%rsi
  0x000000010309b1f6: nop    
  0x000000010309b1f7: callq  0x0000000103060d60  ; OopMap{[8]=Oop [16]=Oop [32]=Oop off=156}
                                                ;*invokeinterface accept
                                                ; - Test::foo5 at 53 (line 22)
                                                ;   {optimized virtual_call}
  0x000000010309b1fc: cmp    0x4(%rsp),%ebp
  0x000000010309b200: jne    0x000000010309b1b0  ;*if_icmpeq
                                                ; - Test::foo5 at 32 (line 19)

(I used the following options on a release build: -XX:CompileCommand="compileonly Test foo*" -XX:-TieredCompilation -XX:+UnlockDiagnosticVMOptions -XX:CICompilerCount=1 -XX:CompileThreshold=6 -XX:+PrintCompilation -XX:+PrintAssembly)
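For anyone wanting to reproduce this, a driver along the following lines should be enough to get foo5 compiled. This is a hedged sketch rather than the exact Test class used above: the class name RangeCheckHarness, the run method, the array size, and the call count are all assumptions, and Consumer is given a type argument here. The class name in the compileonly pattern would have to be changed to match, and -XX:+PrintAssembly additionally needs the hsdis disassembler plugin to be available to the JVM.

```java
import java.util.function.Consumer;

class RangeCheckHarness {
    // Same shape as foo5 in the message, with Consumer given a type argument.
    private static void foo5(Object[] a, int x, int f, Consumer<Object> c) {
        if (c == null) throw new NullPointerException();

        if (a.length > 0) {
            int m = a.length - 1, i = (x & m);
            while (i != f) {
                Object o = a[i];
                i = (i + 1) & m;
                c.accept(o);
            }
        }
    }

    // Calls foo5 repeatedly so the JIT sees it as hot; returns how many
    // elements the consumer was handed in total.
    static long run(int calls) {
        Object[] a = new Object[16];          // power-of-two length, as in ArrayDeque
        for (int k = 0; k < a.length; k++) {
            a[k] = k;                          // boxed Integer, stored as Object
        }
        long[] visited = {0};
        Consumer<Object> counter = o -> visited[0]++;
        for (int n = 0; n < calls; n++) {
            foo5(a, 0, 8, counter);            // visits indices 0..7 on each call
        }
        return visited[0];
    }

    public static void main(String[] args) {
        System.out.println(run(100_000));
    }
}
```

The fence value 8 is chosen inside [0, a.length - 1] so that the loop terminates; a fence outside that range would never be reached by the masked index.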

Paul.


More information about the hotspot-compiler-dev mailing list