LRB midpath code quality

Roman Kennke rkennke at redhat.com
Mon Mar 4 21:29:12 UTC 2019


Also, it seems weird that the null-check comes after the in-cset-check 
rather than before it. It's probably a left-over from null-check-cloning 
that should actually disappear too?

Roman

> Hi there,
>
> I have been looking into generated code quality for LRB.
>
> Run the gc-bench test that writes a single int:
>   https://icedtea.classpath.org/hg/gc-bench/
>
> $ ~/trunks/shenandoah-jdk/build/linux-x86_64-server-release/images/jdk/bin/java -jar
> target/benchmarks.jar -jvmArgs "-XX:+UnlockExperimentalVMOptions -Xmx20g -XX:+UseShenandoahGC"
> writes.Plain.test_int -prof perfasm:printMargin=30 2>&1 | tee lrb.perfasm
>
> There are things to improve in the default mode, but the problem is also visible with -XX:-UseCompressedOops:
>
>                  [Verified Entry Point]
>    7.34%           0x00007f60e3a167b0: mov    %eax,-0x14000(%rsp)
>    5.73%           0x00007f60e3a167b7: push   %rbp
>    6.09%           0x00007f60e3a167b8: sub    $0x10,%rsp
>    5.31%           0x00007f60e3a167bc: mov    0x10(%rsi),%r10
> .......................... LRB fastpath check ..........................
>    0.85%           0x00007f60e3a167c0: testb  $0x1,0x20(%r15)
>    7.14%  ╭        0x00007f60e3a167c5: jne    0x00007f60e3a167db
> .........│......... LRB fastpath ends, store to %r10 follows ...........
>    0.38%  │   ↗    0x00007f60e3a167c7: movl   $0x2a,0x20(%r10)
>   12.63%  │   │    0x00007f60e3a167cf: add    $0x10,%rsp
>    0.40%  │   │    0x00007f60e3a167d3: pop    %rbp
>    5.56%  │   │    0x00007f60e3a167d4: test   %eax,0x177b9826(%rip)
>    0.29%  │   │    0x00007f60e3a167da: retq
> ---------│---│----------- LRB midpath starts --------------------------
> .........│...│............ checking in-cset ...........................
>           ↘   │    0x00007f60e3a167db: mov    %r10,%r11
>               │    0x00007f60e3a167de: shr    $0x17,%r11
>               │    0x00007f60e3a167e2: movabs $0x7f60f309c048,%r8
>               │    0x00007f60e3a167ec: cmpb   $0x0,(%r8,%r11,1)
>            ╭  │    0x00007f60e3a167f1: je     0x00007f60e3a16806
> ..........│..│............ checking null ..............................
>            │  │    0x00007f60e3a167f3: test   %r10,%r10
>            │╭ │    0x00007f60e3a167f6: je     0x00007f60e3a16820
> ..........││.│............ checking is-forwarded ......................
>            ││ │    0x00007f60e3a167f8: mov    -0x8(%r10),%r11
>            ││ │    0x00007f60e3a167fc: cmp    %r10,%r11
>            ││╭│    0x00007f60e3a167ff: je     0x00007f60e3a1680b
> ..........││││............ return mess ................................
>            ││││↗↗  0x00007f60e3a16801: mov    %r11,%r10
>            │││╰││  0x00007f60e3a16804: jmp    0x00007f60e3a167c7
>            ↘││ ││  0x00007f60e3a16806: mov    %r10,%r11
>             ││ ╰│  0x00007f60e3a16809: jmp    0x00007f60e3a16801
> ...........││..│.......... slowpath call ..............................
>             │↘  │  0x00007f60e3a1680b: mov    %r11,%rdi
>             │   │  0x00007f60e3a1680e: movabs $0x7f60f9afad70,%r10
>             │   │  0x00007f60e3a16818: callq  *%r10
>             │   │  0x00007f60e3a1681b: mov    %rax,%r11
>             │   ╰  0x00007f60e3a1681e: jmp    0x00007f60e3a16801
>
>
> I would have expected the branches to return straight to 0x00007f60e3a167c7 instead of jumping through
> the "return mess", since %r10 is left untouched.
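
To make the control flow of the listing above easier to follow, here is a rough Python model of the midpath as it reads in the disassembly. It is only a sketch and not HotSpot code: the names `REGION_SHIFT`, `cset_table`, `forwardee_of` and `slow_path` are made up for illustration, and the heap is modeled with plain dictionaries. The shift value comes from the `shr $0x17` instruction. The target of the null-check branch lies outside the listing, so returning null unchanged is an assumption.

```python
# Hypothetical model of the LRB midpath from the listing; not HotSpot code.
REGION_SHIFT = 0x17  # from `shr $0x17,%r11`: address -> in-cset table index


def lrb_midpath(obj, cset_table, forwardee_of, slow_path):
    """Return the reference the store should use, mirroring the listing's checks.

    obj          -- object address as an int (0 models null)
    cset_table   -- dict: region index -> True if the region is in the collection set
    forwardee_of -- dict: object address -> forwarding pointer (the word at obj - 8)
    slow_path    -- callable invoked for in-cset objects that are not yet forwarded
    """
    # checking in-cset: cmpb $0x0,(%r8,%r11,1); je -> back towards the store
    if not cset_table.get(obj >> REGION_SHIFT, False):
        return obj
    # checking null: test %r10,%r10; je -> target outside the listing
    # (assumption: null is simply passed through)
    if obj == 0:
        return obj
    # checking is-forwarded: mov -0x8(%r10),%r11; cmp %r10,%r11
    fwd = forwardee_of[obj]
    if fwd != obj:
        return fwd
    # slowpath call: the forwardee equals the object itself, so call the runtime
    return slow_path(obj)
```

In this model the not-in-cset path returns `obj` unchanged, which matches the observation above: nothing needs to be copied back on that path before the store.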
>
> -XX:+UseCompressedOops is messier:
>
>                 [Verified Entry Point]
>    3.26%          0x00007f39ac476150: mov    %eax,-0x14000(%rsp)
>    6.60%          0x00007f39ac476157: push   %rbp
>    1.94%          0x00007f39ac476158: sub    $0x10,%rsp
>    1.70%          0x00007f39ac47615c: mov    0xc(%rsi),%r11d
> .......................... LRB fastpath check ..........................
>    5.84%          0x00007f39ac476160: testb  $0x1,0x20(%r15)
>    2.07%  ╭       0x00007f39ac476165: jne    0x00007f39ac47617c
> .........│......... LRB fastpath ends, store to %r11 follows ...........
>    1.36%  │   ↗   0x00007f39ac476167: movl   $0x2a,0xc(%r12,%r11,8)
>   13.28%  │   │   0x00007f39ac476170: add    $0x10,%rsp
>    3.36%  │   │   0x00007f39ac476174: pop    %rbp
>    1.90%  │   │   0x00007f39ac476175: test   %eax,0x19e85e85(%rip)
>    0.98%  │   │   0x00007f39ac47617b: retq
> ---------│---│----------- LRB midpath starts --------------------------
> .........│...│............ checking in-cset ...........................
>           ↘   │   0x00007f39ac47617c: mov    %r11,%r9
>               │   0x00007f39ac47617f: shl    $0x3,%r9
>               │   0x00007f39ac476183: mov    %r9,%r10
>               │   0x00007f39ac476186: shr    $0x17,%r10
>               │   0x00007f39ac47618a: movabs $0x7f39bc0871e0,%r8
>               │   0x00007f39ac476194: cmpb   $0x0,(%r8,%r10,1)
>            ╭  │   0x00007f39ac476199: je     0x00007f39ac4761ae
> ..........│..│............ checking null ..............................
>            │  │   0x00007f39ac47619b: test   %r11d,%r11d
>            │╭ │   0x00007f39ac47619e: je     0x00007f39ac4761cc
> ..........││.│............ checking is-forwarded ......................
>            ││ │   0x00007f39ac4761a0: mov    -0x8(%r12,%r11,8),%r9
>            ││ │   0x00007f39ac4761a5: lea    (%r12,%r11,8),%r10
>            ││ │   0x00007f39ac4761a9: cmp    %r10,%r9
>            ││╭│   0x00007f39ac4761ac: je     0x00007f39ac4761b7
> ..........││││............ return mess ................................
>            ↘│││↗  0x00007f39ac4761ae: mov    %r9,%r11
>             ││││  0x00007f39ac4761b1: shr    $0x3,%r11
>             ││╰│  0x00007f39ac4761b5: jmp    0x00007f39ac476167
> ...........││.│.......... slowpath call ...............................
>             │↘ │  0x00007f39ac4761b7: mov    %r9,%rdi
>             │  │  0x00007f39ac4761ba: movabs $0x7f39c4c26d70,%r10
>             │  │  0x00007f39ac4761c4: callq  *%r10
>             │  │  0x00007f39ac4761c7: mov    %rax,%r9
>             │  ╰  0x00007f39ac4761ca: jmp    0x00007f39ac4761ae
>
> Same thing here, and the "return mess" also packs the reference back before returning. That seems
> useless, as %r11 still carries the packed reference on the non-in-cset path. Also, the unpacked
> reference is already available in %r9 when "checking is-forwarded" starts, having been unpacked
> earlier during "checking in-cset".
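
As a small illustration of why the repacking looks redundant, the sketch below models the pack/unpack arithmetic seen in the compressed-oops listing. It is not HotSpot code: the function names are hypothetical, the shift of 3 is read off the `shl $0x3` / `shr $0x3` instructions, and the heap base (what %r12 would hold) is assumed to be zero, since the in-cset check in the listing shifts without adding a base.

```python
# Hypothetical model of compressed-reference packing; not HotSpot code.
OOP_SHIFT = 3  # from `shl $0x3,%r9` and `shr $0x3,%r11` in the listing


def unpack(narrow, base=0):
    """Decode a packed reference, as in `lea (%r12,%r11,8),%r10` (base models %r12)."""
    return base + (narrow << OOP_SHIFT)


def pack(addr, base=0):
    """Encode an address back, as in `mov %r9,%r11; shr $0x3,%r11` (base assumed 0)."""
    return (addr - base) >> OOP_SHIFT


def return_mess_not_in_cset(narrow):
    """What the listing does on the not-in-cset path: unpack, then pack again."""
    return pack(unpack(narrow))
```

Under these assumptions `return_mess_not_in_cset(n)` gives back `n` for any narrow reference, so on that path the `mov`/`shr` pair only recomputes a value %r11 already holds.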
>
> Maybe LRB expansion in C2 needs touchups to handle these, to optimize code size and performance when
> GC is active.
>
> -Aleksey
>

