Perf: excess store in allocation fast path?
Roman Kennke
rkennke at redhat.com
Tue Dec 6 19:25:14 UTC 2016
On Tuesday, 06.12.2016 at 19:39 +0100, Aleksey Shipilev wrote:
> Hi, (Roland?)
>
> I think we have an excess store in the allocation fast path; compare
> Shenandoah [1] and Parallel [2]. This is not the fwdptr store, but
> seems to be excess zeroing. In that test, allocating a simple Object
> yields this:
>
> mov %r11,(%rax) ; mark word
> prefetchnta 0xc0(%r10)
> movl $0xf80001dd,0x8(%rax) ; class word
> mov %rax,-0x8(%rax) ; fwdptr
> mov %r12d,0xc(%rax) ; zeroing last 4 bytes
> mov %r12,0x10(%rax) ; <--- hey, what?
>
> I think this happens because the allocation fast path bumps the
> instance size to "cover" the upcoming object's fwdptr, and
> accidentally zeroes it as well. Do we need this? I can imagine an
> invariant that everything up to the top pointer should be zeroed; is
> this such a case?
It looks like initialization of the first field in the object. Maybe
we're failing the C2 optimization that eliminates initial zeroing of
fields? Maybe our barrier or allocation code somehow gets in the way,
so that C2 can't see the initialization and therefore cannot optimize
the zeroing away?
Roman
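
For reference, the offsets in the quoted listing can be checked against
the "bumped instance size" hypothesis with a little arithmetic. The
sketch below is only a model of that arithmetic, not HotSpot code: the
class FastPathLayout and its methods are hypothetical, and the 16-byte
size of a plain Object (8-byte mark word, 4-byte class word, 4 bytes of
padding) is an assumption read off the listing. It shows that the
numbers are consistent with the hypothesis; it does not confirm it.

```java
// Hypothetical model of the object layout seen in the quoted listing.
// Not HotSpot code; all names here are invented for illustration.
final class FastPathLayout {
    // Offsets taken from the listing: mark word at 0x0, class word at 0x8.
    static final int MARK_WORD_OFFSET = 0x0;
    static final int CLASS_WORD_OFFSET = 0x8;
    // First byte after the 12-byte header (8-byte mark + 4-byte class word).
    static final int HEADER_END = 0xc;
    // The fwdptr is one 8-byte word, stored at -0x8 relative to the object.
    static final int FWDPTR_SIZE = 8;

    // Instance size after being "bumped" to cover the next object's fwdptr slot.
    static int bumpedSize(int instanceSize) {
        return instanceSize + FWDPTR_SIZE;
    }

    // Number of bytes past the header that would need zeroing.
    static int zeroedBytes(int instanceSize, boolean bumped) {
        int end = bumped ? bumpedSize(instanceSize) : instanceSize;
        return end - HEADER_END;
    }

    public static void main(String[] args) {
        int objectSize = 16; // assumed size of a plain Object, see lead-in

        // Without the bump: only the 4-byte store at 0xc would be needed.
        System.out.printf("plain:  zero %d bytes, end at 0x%x%n",
                zeroedBytes(objectSize, false), objectSize);

        // With the bump: the 4-byte store at 0xc plus the 8-byte store at 0x10.
        System.out.printf("bumped: zero %d bytes, end at 0x%x%n",
                zeroedBytes(objectSize, true), bumpedSize(objectSize));
    }
}
```

In this model the unbumped case needs only the 4-byte store at 0xc,
while the bumped case needs 12 bytes, which matches the 4-byte store at
0xc followed by the 8-byte store at 0x10 in the listing.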