Perf: SATB and WB coalescing
Aleksey Shipilev
shade at redhat.com
Thu Jan 11 10:51:24 UTC 2018
On 01/10/2018 09:29 PM, Aleksey Shipilev wrote:
> Okay, so the dirty patch for the idea:
> http://cr.openjdk.java.net/~shade/shenandoah/single-flag/webrev.00/
>
> perfasm for the offending test:
> http://cr.openjdk.java.net/~shade/shenandoah/single-flag/single-flag.perfasm
>
> *) Can we instruct the compiler to trust the value of 0x3d8(%r15) until the next safepoint poll? I
> think that would eliminate excessive L1 accesses for that TLS field at the expense of wasting a register
> -- which might be the lesser evil;
Hey, this one works with a dirty hack like this:
http://cr.openjdk.java.net/~shade/shenandoah/perf-wb-satb/common-single-flag.patch
It now commons the GC state loads (and keeps the value in a register):
http://cr.openjdk.java.net/~shade/shenandoah/perf-wb-satb/WB-SATB-commonTLS.perfasm
...and this eliminates around 8 L1 reads per op, which recovers about 50% of the overhead:
Benchmark                                 Mode  Cnt   Score   Error  Units

# -WB -SATB
BarriersMultiple.test                     avgt   15   2.760 ± 0.081  ns/op
BarriersMultiple.test:L1-dcache-loads     avgt    3  13.121 ± 0.444   #/op
BarriersMultiple.test:L1-dcache-stores    avgt    3   8.089 ± 0.141   #/op
BarriersMultiple.test:branches            avgt    3   4.039 ± 0.220   #/op
BarriersMultiple.test:cycles              avgt    3  10.429 ± 2.041   #/op
BarriersMultiple.test:instructions        avgt    3  30.306 ± 2.414   #/op

# +WB +SATB
BarriersMultiple.test                     avgt   15   4.897 ± 0.003  ns/op
BarriersMultiple.test:L1-dcache-loads     avgt    3  28.195 ± 0.838   #/op
BarriersMultiple.test:L1-dcache-stores    avgt    3   8.102 ± 0.274   #/op
BarriersMultiple.test:branches            avgt    3  13.074 ± 0.344   #/op
BarriersMultiple.test:cycles              avgt    3  18.492 ± 2.365   #/op
BarriersMultiple.test:instructions        avgt    3  56.423 ± 1.681   #/op

# +WB +SATB +TLS commoning
BarriersMultiple.test                     avgt   15   3.884 ± 0.003  ns/op
BarriersMultiple.test:L1-dcache-loads     avgt    3  20.221 ± 0.602   #/op  // -8!
BarriersMultiple.test:L1-dcache-stores    avgt    3   8.093 ± 0.264   #/op
BarriersMultiple.test:branches            avgt    3  13.133 ± 0.395   #/op
BarriersMultiple.test:cycles              avgt    3  14.668 ± 0.771   #/op  // -4!
BarriersMultiple.test:instructions        avgt    3  58.636 ± 2.368   #/op
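
For anyone who wants to double-check the "50%" figure, here is a minimal sketch of the arithmetic, using only the scores from the table above. The class and method names are made up for illustration; nothing here is part of the patch or the benchmark.

```java
// Sketch only: OverheadRecovery and recoveredFraction are hypothetical names.
// All inputs are copied from the benchmark table in this message.
final class OverheadRecovery {

    // Fraction of the barrier overhead that TLS commoning wins back:
    // (withBarriers - commoned) / (withBarriers - baseline).
    static double recoveredFraction(double baseline, double withBarriers, double commoned) {
        double overhead = withBarriers - baseline;   // cost added by +WB +SATB
        double recovered = withBarriers - commoned;  // cost removed by commoning
        return recovered / overhead;
    }

    public static void main(String[] args) {
        // Arguments: (-WB -SATB, +WB +SATB, +WB +SATB +TLS commoning)
        System.out.printf("time (ns/op):    %.1f%%%n",
                100 * recoveredFraction(2.760, 4.897, 3.884));
        System.out.printf("L1-dcache-loads: %.1f%%%n",
                100 * recoveredFraction(13.121, 28.195, 20.221));
        System.out.printf("cycles:          %.1f%%%n",
                100 * recoveredFraction(10.429, 18.492, 14.668));
    }
}
```

This works out to roughly 47% of the time overhead, 53% of the extra L1 loads, and 47% of the extra cycles, ignoring the error bars.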
Thanks,
-Aleksey