Load Barrier Assembly

Per Liden per.liden at oracle.com
Thu Apr 4 07:57:53 UTC 2019

On 4/4/19 9:40 AM, Aleksey Shipilev wrote:
> On 4/4/19 9:04 AM, Simone Bordet wrote:
>> test %rsi, 0x20(%r15)
>> jne slow_path
>> This is slightly different from what was reported in Per's presentations
>> where it was:
>> test %rsi, (0x16)%r15
>> jnz slow_path
>> I'm not an assembly expert, is the second version a typo?
> Second version has a typo, it should be 0x16(%r15).

Ah, missed that typo.

>> But the question I have is: what's loaded in r15, and why the bad mask
>> is 32 bytes after that address?
> %r15 is the pointer to thread-local storage:
>   http://hg.openjdk.java.net/jdk/jdk/file/5c7418757bad/src/hotspot/cpu/x86/x86_64.ad#l12887
> Bad mask is at offset 0x20 there:
>   http://hg.openjdk.java.net/jdk/jdk/file/5c7418757bad/src/hotspot/share/runtime/thread.hpp#l147
>   http://hg.openjdk.java.net/jdk/jdk/file/5c7418757bad/src/hotspot/share/gc/z/zThreadLocalData.hpp#l35
>> Can the bad mask be stored in a register (at the cost of losing one register)?
> Well, in Shenandoah, we store thread-local gc state in a similar way and check it on the barrier
> fastpath. We did experiment with putting it into a register and the short answer is: losing one of the
> registers is a significant drawback when register pressure is high, think heavily unrolled and
> pipelined loop. Additionally, you'd need to handle the restoration of the register value when the
> flag/mask finally changes (happens during safepoint/handshake poll). It is doable, but tedious. In
> Shenandoah, there is ShenandoahCommonGCStateLoads that just caches the value between the safepoint
> polls.
> The bigger idea to eliminate barrier costs is to use nmethod entry barriers to hot-patch the code
> (e.g. nop them out), eliminating barrier overhead altogether. I don't think it was actually tried for
> any of G1, Shenandoah, or ZGC.

Yep, not yet tried in ZGC.

For phases where barriers are needed, and can't be completely eliminated,
we could have the test instruction take the bad mask as an immediate 
instead of loading it, and let the nmethod barrier patch that immediate 
bad mask as needed.
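
To make the two variants concrete, here is a rough C++ model of the
check, not HotSpot code. Everything in it is an assumption made for
illustration: the struct ThreadLocalData (standing in for whatever %r15
points to), the padding that places the mask at offset 0x20, the
PatchableCheck type (standing in for a test instruction with an
immediate operand), and the two mask bit values.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mask bits, chosen only for illustration.
constexpr uint64_t kBitA = 1ull << 42;
constexpr uint64_t kBitB = 1ull << 43;

// Hypothetical stand-in for the per-thread data that %r15 points to.
struct ThreadLocalData {
    uint8_t  padding[0x20];      // assumption: puts the mask at offset 0x20
    uint64_t address_bad_mask;   // current bad mask, updated by the GC
};
static_assert(offsetof(ThreadLocalData, address_bad_mask) == 0x20,
              "mask is expected at offset 0x20 in this model");

// Variant 1: load the mask from thread-local storage on every check,
// roughly what `test %rsi, 0x20(%r15); jne slow_path` does.
inline bool needs_slow_path_tls(uint64_t ref, const ThreadLocalData* tld) {
    return (ref & tld->address_bad_mask) != 0;
}

// Variant 2: the mask lives in the "instruction" itself. Calling patch()
// models an nmethod barrier rewriting the immediate when the mask changes.
struct PatchableCheck {
    uint64_t immediate_bad_mask;

    bool needs_slow_path(uint64_t ref) const {
        return (ref & immediate_bad_mask) != 0;
    }

    void patch(uint64_t new_mask) {
        immediate_bad_mask = new_mask;
    }
};
```

In this model the second variant saves the memory load on the fast path,
but every compiled method that contains such a check has to be patched
before it runs with a new mask, which is the job the nmethod entry
barrier would take on.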
