RFR: Match barrier fastpath checks better
Aleksey Shipilev
shade at redhat.com
Tue Jan 9 15:28:52 UTC 2018
http://cr.openjdk.java.net/~shade/shenandoah/match-barrier-checks/webrev.01/
(Roland made the draft revision of this patch last year)
The current barrier fastpath checks the flags like this:
0x0: movzbl 0x3d8(%r15),%r10d ; check evac-in-progress
+0x8: test %r10d,%r10d
+0xB: jne SLOW-PATH
+0x11: ...
This wastes a register (%r10d here), which is bad when barriers are back-to-back and register pressure is
high. The fix trivially folds the load and the test against a byte-sized immediate into a single cmpb
against memory, so the resulting code uses no register and is shorter:
0x0: cmpb $0x0,0x3d8(%r15)
+0x8: jne SLOW-PATH
+0xE: ...
This follows the similar .ad patterns that fold particular cmp shapes, and the fix would be upstreamed
separately. We would like to have this in the Shenandoah repos first for more thorough testing. The
"unsigned" shape covers the Shenandoah WB checks, and the "signed" shape covers the SATB checks.
(Amusingly, this affects only C2; C1 already generates cmpb for cases like these.) We actually need only
the tests against zero, but nothing prevents us from matching the entire range of byte immediates.
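For illustration, the "unsigned" shape is roughly the following in x86_64.ad form (a minimal sketch:
operand, cost and pipe-class names are assumed to follow the existing compare-vs-memory rules, see the
webrev for the exact patterns; the "signed" shape is the same with LoadB instead of LoadUB):

// Sketch only: fold (CmpI (LoadUB mem) imm8) into a single cmpb against
// memory, so no temporary register is needed for the loaded flag byte.
instruct compUB_mem_imm(rFlagsReg cr, memory mem, immI8 imm)
%{
  match(Set cr (CmpI (LoadUB mem) imm));

  ins_cost(125);
  format %{ "cmpb    $mem, $imm\t# unsigned byte compare vs. memory" %}
  ins_encode %{
    __ cmpb($mem$$Address, $imm$$constant);
  %}
  ins_pipe(ialu_cr_reg_mem);
%}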
Regular benchmarks are affected very little, with some tiny improvements, because the barriers there are
already well optimized. But in cases where the barriers are not optimized (or not optimizable), the
improvement is substantial. For example, in the recent SPSCQueue benchmarks [1], the score improved by
around 50%.
Testing: hotspot_gc_shenandoah {fastdebug|release}, specjvm
Thanks,
-Aleksey
[1] http://cr.openjdk.java.net/~shade/shenandoah/jctools-QueueThroughputBackoffNone.txt