Perf: excess store in allocation fast path?
Aleksey Shipilev
shade at redhat.com
Tue Dec 6 18:39:01 UTC 2016
Hi, (Roland?)
I think we have an excess store in the allocation fast path: compare Shenandoah [1]
and Parallel [2]. This is not the fwdptr store; it seems to be excess
zeroing. In that test, allocating a simple Object yields this:
  mov    %r11,(%rax)              ; mark word
  prefetchnta 0xc0(%r10)
  movl   $0xf80001dd,0x8(%rax)    ; class word
  mov    %rax,-0x8(%rax)          ; fwdptr
  mov    %r12d,0xc(%rax)          ; zeroing last 4 bytes
  mov    %r12,0x10(%rax)          ; <--- hey, what?
I think this happens because the allocation fast path bumps the instance size to
"cover" the upcoming object's fwdptr, and accidentally zeroes it as well. Do
we need this? I can imagine an invariant that everything up to the top pointer
should be zeroed; is this such a case?
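To make the offsets concrete, here is a small sketch of the hypothesis, not of
the actual fast-path code. It assumes the sizes visible in the disassembly
(8-byte mark word, 4-byte class word, 8-byte fwdptr) and a 16-byte plain Object;
the class and method names are made up for illustration.

```java
public class AllocZeroingSketch {
    // Sizes read off the disassembly; all names here are hypothetical.
    static final int MARK_WORD_BYTES = 8;   // mov %r11,(%rax)
    static final int CLASS_WORD_BYTES = 4;  // movl $0xf80001dd,0x8(%rax)
    static final int FWDPTR_BYTES = 8;      // mov %rax,-0x8(%rax)

    /** Offset where the header ends and body zeroing would begin. */
    static int zeroStart() {
        return MARK_WORD_BYTES + CLASS_WORD_BYTES;
    }

    /** Exclusive end of the zeroed range if the size is bumped by bumpBytes. */
    static int zeroEnd(int instanceBytes, int bumpBytes) {
        return instanceBytes + bumpBytes;
    }

    /** Bytes zeroed beyond the object's own extent. */
    static int excessBytes(int instanceBytes, int bumpBytes) {
        return zeroEnd(instanceBytes, bumpBytes) - instanceBytes;
    }

    public static void main(String[] args) {
        int instanceBytes = 16; // assumed: 12-byte header padded to 16
        for (int bump : new int[] {0, FWDPTR_BYTES}) {
            System.out.printf(
                "bump=%d: zeroes [0x%x, 0x%x), %d byte(s) past the object%n",
                bump, zeroStart(), zeroEnd(instanceBytes, bump),
                excessBytes(instanceBytes, bump));
        }
    }
}
```

Under these assumptions, a bump of one fwdptr extends the zeroed range from
[0xc, 0x10) to [0xc, 0x18), which would match the 4-byte store at 0xc(%rax)
followed by the 8-byte store at 0x10(%rax).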
The original test is in our suite [3], runnable like this, if you want to poke
around it:
$ java -jar target/benchmarks.jar alloc.plain.Objects --jvmArgs
"-XX:+UseShenandoahGC -Xmx8g -Xms8g" -f 1 -wi 5 -i 5 -t 1 -prof perfasm
Thanks,
-Aleksey
[1] http://cr.openjdk.java.net/~shade/shenandoah/alloc-excess-store/alloc-shenandoah.txt
[2] http://cr.openjdk.java.net/~shade/shenandoah/alloc-excess-store/alloc-parallel.txt
[3] http://icedtea.classpath.org/people/shade/gc-bench