Perf: excess store in allocation fast path?

Aleksey Shipilev shade at redhat.com
Tue Dec 6 18:39:01 UTC 2016


Hi, (Roland?)

I think we have an excess store in the allocation fast path: compare Shenandoah [1]
and Parallel [2]. It is not the fwdptr store, but seems to be excess zeroing.
In that test, allocating a simple Object yields this:

  mov    %r11,(%rax)            ; mark word
  prefetchnta 0xc0(%r10)
  movl   $0xf80001dd,0x8(%rax)  ; class word
  mov    %rax,-0x8(%rax)        ; fwdptr
  mov    %r12d,0xc(%rax)        ; zeroing last 4 bytes
  mov    %r12,0x10(%rax)        ; <--- hey, what?

I think this happens because the allocation fast path bumps the instance size to
"cover" for the upcoming object's fwdptr, and accidentally zeroes that slot as
well? Do we need this? I can imagine an invariant that everything up to the top
pointer should be zeroed; is this such a case?
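
To spell out the arithmetic behind that guess, here is a small Java sketch. The
sizes are assumptions read off the listing above (8-byte mark word, 4-byte
compressed class word, 16-byte Object, 8-byte fwdptr), and the class and method
names are made up for illustration; this is not the code the JIT emits.

```java
// Hypothetical model of the zeroing range in the allocation fast path.
// All constants are assumptions inferred from the disassembly, not VM sources.
class FastPathSketch {
    static final int FWDPTR_BYTES = 8;  // forwarding pointer, stored at -0x8(%rax)
    static final int MARK_BYTES   = 8;  // mark word at 0x0(%rax)
    static final int KLASS_BYTES  = 4;  // compressed class word at 0x8(%rax)
    static final int OBJECT_BYTES = 16; // assumed instance size of java.lang.Object

    // Offset (relative to the object start) where body zeroing begins:
    // right after the header words that are initialized explicitly.
    static int zeroStart() {
        return MARK_BYTES + KLASS_BYTES;
    }

    // Offset where zeroing ends if the instance size is bumped by extraBytes.
    static int zeroEnd(int extraBytes) {
        return OBJECT_BYTES + extraBytes;
    }

    public static void main(String[] args) {
        System.out.printf("size not bumped: zero [0x%x, 0x%x)%n",
                zeroStart(), zeroEnd(0));
        System.out.printf("size bumped:     zero [0x%x, 0x%x)%n",
                zeroStart(), zeroEnd(FWDPTR_BYTES));
    }
}
```

Under these assumptions, bumping the size by the fwdptr width grows the zeroed
range from [0xc, 0x10) to [0xc, 0x18), which is the 4-byte store at 0xc plus the
8-byte store at 0x10 seen in the listing.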

The original test is in our suite [3]; you can run it like this if you want to
poke around:

$ java -jar target/benchmarks.jar alloc.plain.Objects --jvmArgs \
    "-XX:+UseShenandoahGC -Xmx8g -Xms8g" -f 1 -wi 5 -i 5 -t 1 -prof perfasm

Thanks,
-Aleksey

[1]
http://cr.openjdk.java.net/~shade/shenandoah/alloc-excess-store/alloc-shenandoah.txt
[2]
http://cr.openjdk.java.net/~shade/shenandoah/alloc-excess-store/alloc-parallel.txt
[3] http://icedtea.classpath.org/people/shade/gc-bench
