RFR: Match barrier fastpath checks better
Roman Kennke
rkennke at redhat.com
Wed Jan 10 11:45:37 UTC 2018
On 09.01.2018 at 16:28, Aleksey Shipilev wrote:
> http://cr.openjdk.java.net/~shade/shenandoah/match-barrier-checks/webrev.01/
> (Roland made the draft revision of this patch last year)
>
> Current barrier fastpath checks the flags like this:
>
> 0x0: movzbl 0x3d8(%r15),%r10d ; check evac-in-progress
> +0x8: test %r10d,%r10d
> +0xB: jne SLOW-PATH
> +0x11: ...
>
> This ties up a scratch register (%r10d above), which hurts when barriers are back-to-back and
> register pressure is high. The fix folds the compare of an in-memory byte against a byte-sized
> immediate into a single cmpb, so the resulting code needs no register and is shorter:
>
> 0x0: cmpb $0x0,0x3d8(%r15)
> +0x8: jne SLOW-PATH
> +0xE: ...
>
> This follows similar .ad patterns that fold particular cmp shapes, and the fix would be upstreamed
> separately. We would like to have this in Shenandoah repos for more thorough testing. "Unsigned"
> shape covers Shenandoah WB checks, and "signed" covers SATB checks. (Amusingly, this affects C2, but
> not C1, which generates cmpb for cases like these.) We actually need only tests against zero, but
> nothing prevents us from matching the entire range of byte immediates.
>
> Regular benchmarks are affected very little, with some tiny improvements -- because barriers there
> are already well-optimized. But in cases where barriers are not optimized(-able), the improvement is
> substantial. For example, in recent SPSCQueue benchmarks [1], the score improved by about 50%.
>
> Testing: hotspot_gc_shenandoah {fastdebug|release}, specjvm
>
> Thanks,
> -Aleksey
>
> [1] http://cr.openjdk.java.net/~shade/shenandoah/jctools-QueueThroughputBackoffNone.txt
>
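For readers less familiar with the barrier shape discussed above, here is a rough C sketch of the fast-path check. It is only an illustration: the struct, field, and function names are hypothetical stand-ins, not HotSpot code, and whether a given compiler emits a single cmpb against memory for it is not guaranteed.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-thread data that %r15 points at;
 * the real flag lives at offset 0x3d8 in the listings above. */
struct thread_data {
    uint8_t evac_in_progress;   /* one-byte flag, nonzero while evacuating */
};

/* Counts slow-path entries so the sketch can be exercised. */
static int slow_path_calls = 0;

/* Hypothetical slow path; a real barrier would resolve or copy the object. */
static void *slow_path(void *obj) {
    slow_path_calls++;
    return obj;
}

/* Fast path: compare the flag byte in memory against zero and branch.
 * Written this way, nothing requires the flag to be loaded into a
 * register first, which is the shape the patch teaches C2 to match. */
static inline void *barrier(struct thread_data *t, void *obj) {
    if (t->evac_in_progress != 0) {
        return slow_path(obj);
    }
    return obj;
}
```

With the flag clear, barrier() returns the object without touching the slow path; once the flag is set, every call goes through slow_path().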
I tested it with traversal GC. It works and doesn't crash. It doesn't
seem faster. But traversal GC is handicapped anyway until we get some
proper optimizations.
Roman