RFR: Match barrier fastpath checks better
Roman Kennke
rkennke at redhat.com
Tue Jan 9 15:57:05 UTC 2018
On 09.01.2018 at 16:28, Aleksey Shipilev wrote:
> http://cr.openjdk.java.net/~shade/shenandoah/match-barrier-checks/webrev.01/
> (Roland made the draft revision of this patch last year)
>
> The current barrier fastpath checks the flags like this:
>
> 0x0: movzbl 0x3d8(%r15),%r10d ; check evac-in-progress
> +0x8: test %r10d,%r10d
> +0xB: jne SLOW-PATH
> +0x11: ...
>
> This wastes the register %r10, which is bad when barriers are back-to-back and register pressure is
> high. The fix folds the load and the test into a single cmpb of the memory operand against a
> byte-sized immediate, so the resulting code is register-less and shorter (0xE bytes instead of 0x11):
>
> 0x0: cmpb $0x0,0x3d8(%r15)
> +0x8: jne SLOW-PATH
> +0xE: ...
>
> This follows similar .ad patterns that fold particular cmp shapes, and the fix would be upstreamed
> separately. We would like to have this in Shenandoah repos for more thorough testing. The "unsigned"
> shape covers Shenandoah WB checks, and the "signed" shape covers SATB checks. (Amusingly, this
> affects C2, but not C1, which already generates cmpb for cases like these.) We actually need only
> tests against zero, but nothing prevents us from checking against the entire range of byte values.
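The fold described above can be sanity-checked outside the JIT. The following is a minimal sketch, not HotSpot code: the class and method names are hypothetical, and it assumes the "unsigned" shape pairs with a zero-extending byte load and the "signed" shape with a sign-extending one. It only covers equality tests (je/jne), and shows that comparing the widened byte against an int constant gives the same answer as comparing the raw byte against a byte-sized immediate, as long as the constant fits the matching byte range.

```java
// Illustrative only: names are hypothetical and do not come from the patch.
public class ByteCompareFold {
    // Unfolded "unsigned" shape: zero-extend the byte (like movzbl), compare as int.
    public static boolean unsignedWide(byte flag, int imm) {
        return (flag & 0xFF) != imm;
    }

    // Unfolded "signed" shape: sign-extend the byte (like movsbl), compare as int.
    public static boolean signedWide(byte flag, int imm) {
        return flag != imm; // flag is promoted to int with sign extension
    }

    // Folded shape: compare the byte directly against a byte-sized immediate (like cmpb).
    public static boolean folded(byte flag, int imm) {
        return flag != (byte) imm;
    }

    // Exhaustively checks that the fold preserves the "not equal" outcome
    // for every byte value and every immediate in the matching byte range.
    public static boolean foldIsExact() {
        for (int f = -128; f <= 127; f++) {
            byte flag = (byte) f;
            for (int imm = 0; imm <= 255; imm++) {       // unsigned byte range
                if (unsignedWide(flag, imm) != folded(flag, imm)) return false;
            }
            for (int imm = -128; imm <= 127; imm++) {    // signed byte range
                if (signedWide(flag, imm) != folded(flag, imm)) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(foldIsExact());
    }
}
```

A constant outside the byte range breaks the equivalence: with a zero flag, the wide compare against 256 reports "not equal", while the truncated immediate would report "equal". That is why a fold of this kind has to be limited to byte-sized immediates.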
>
> Regular benchmarks are affected very little, with some tiny improvements, because barriers there
> are already well-optimized. But in cases where barriers are not optimized(-able), the improvement is
> substantial. For example, in recent SPSCQueue benchmarks [1], the score improved by around 50%.
>
> Testing: hotspot_gc_shenandoah {fastdebug|release}, specjvm
>
> Thanks,
> -Aleksey
>
> [1] http://cr.openjdk.java.net/~shade/shenandoah/jctools-QueueThroughputBackoffNone.txt
>
Looks good to me. Will test it later with traversal heuristics.