Load Barrier Assembly

Aleksey Shipilev shade at redhat.com
Thu Apr 4 07:40:13 UTC 2019

On 4/4/19 9:04 AM, Simone Bordet wrote:
> test %rsi, 0x20(%r15)
> jne slow_path
> This is slightly different from what reported in Per's presentations
> where it was:
> test %rsi, (0x16)%r15
> jnz slow_path
> I'm not an assembly expert; is the second version a typo?

The second version has a typo: it should be 0x16(%r15).

> But the question I have is: what's loaded in r15, and why is the bad
> mask 32 bytes after that address?

%r15 is the pointer to thread-local storage:

Bad mask is at offset 0x20 there:
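
To make the two instructions concrete, here is a minimal Python model of that fast-path check. It
is only a sketch: ToyThreadLocal, load_barrier and slow_path are hypothetical names, the mask
values used with it are made up, and the fix-up in slow_path is a placeholder for the real GC work.

```python
# Toy model of the load barrier fast path: test the loaded reference
# against a per-thread "bad mask" and branch to the slow path on a hit.
# All names and bit values here are illustrative assumptions.

BAD_MASK_OFFSET = 0x20  # offset of the bad mask from the thread pointer (%r15)


class ToyThreadLocal:
    """Stands in for the thread-local storage block that %r15 points to."""

    def __init__(self, bad_mask):
        # Model the block as a mapping from byte offset to a 64-bit word.
        self.words = {BAD_MASK_OFFSET: bad_mask}


def slow_path(ref, thread):
    # Placeholder fix-up (assumption): just clear the bad bits.
    # The real slow path does the actual GC work, which is not modeled here.
    return ref & ~thread.words[BAD_MASK_OFFSET]


def load_barrier(ref, thread):
    # test %rsi, 0x20(%r15): AND the reference with the bad mask, set flags.
    if ref & thread.words[BAD_MASK_OFFSET]:
        # jne slow_path: taken when any bad bit is set in the reference.
        return slow_path(ref, thread)
    return ref  # fast path: no bad bits set, use the reference as is
```

The point of the model is that the fast path costs one memory operand and one branch: the mask is
read through the thread pointer on every check rather than being held in a register of its own.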

> Can the bad mask be stored in a register (at the cost of losing one register)?

Well, in Shenandoah, we store the thread-local GC state in a similar way and check it on the barrier
fast path. We did experiment with putting it into a register, and the short answer is: losing one of
the registers is a significant drawback when register pressure is high, think of a heavily unrolled
and pipelined loop. Additionally, you'd need to handle the restoration of the register value when the
flag/mask finally changes (which happens during a safepoint/handshake poll). It is doable, but tedious. In
Shenandoah, there is ShenandoahCommonGCStateLoads that just caches the value between the safepoint
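
As a rough illustration of the general idea of caching a thread-local value between safepoint polls,
here is a toy Python model. It is a sketch under assumptions, not HotSpot code: ToyThread and both
functions are hypothetical names, and the model assumes the state can only change at a poll.

```python
# Toy model: reload thread-local GC state on every barrier check versus
# caching it in a local and refreshing only after a safepoint poll.
# Class and function names are hypothetical, not HotSpot code.


class ToyThread:
    def __init__(self):
        self.gc_state = 0    # 0 means "no GC work pending" in this model
        self.pending = None  # state the GC wants installed at the next poll
        self.loads = 0       # number of memory loads of gc_state

    def load_gc_state(self):
        self.loads += 1
        return self.gc_state

    def safepoint_poll(self):
        # Assumption of the model: the state can change only here.
        if self.pending is not None:
            self.gc_state, self.pending = self.pending, None


def barrier_checks_uncached(thread, iterations, poll_every):
    slow_hits = 0
    for i in range(iterations):
        if i % poll_every == 0:
            thread.safepoint_poll()
        if thread.load_gc_state() != 0:  # load on every check
            slow_hits += 1
    return slow_hits


def barrier_checks_cached(thread, iterations, poll_every):
    slow_hits = 0
    state = thread.load_gc_state()  # initial load
    for i in range(iterations):
        if i % poll_every == 0:
            thread.safepoint_poll()
            state = thread.load_gc_state()  # refresh after the poll
        if state != 0:  # reuse the cached value
            slow_hits += 1
    return slow_hits
```

Both loops take the slow path the same number of times, but the cached variant touches memory only
once per poll. In compiled code the cached value is an ordinary temporary, so it can be spilled
under register pressure instead of pinning a register for the whole method.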

The bigger idea for cutting barrier costs is to use nmethod entry barriers to hot-patch the code
(e.g. nop the barriers out), eliminating barrier overhead altogether. I don't think it has actually
been tried for any of G1, Shenandoah, or ZGC.
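
A toy Python model of that hot-patching idea, for illustration only. Everything in it is an
assumption: ToyGC, ToyNMethod and the epoch scheme are hypothetical, and no collector is claimed to
work this way.

```python
# Toy model of hot-patching barriers at method entry. Hypothetical names;
# this is not how any HotSpot collector is implemented.

NOP = "nop"
BARRIER = "barrier"


class ToyGC:
    def __init__(self):
        self.epoch = 0                # bumped on every GC phase change
        self.barriers_needed = False  # whether the current phase needs barriers

    def change_phase(self, barriers_needed):
        self.barriers_needed = barriers_needed
        self.epoch += 1


class ToyNMethod:
    def __init__(self, barrier_slots):
        self.code = [BARRIER] * barrier_slots  # patchable barrier sites
        self.patched_epoch = -1                # epoch the code was patched for

    def enter(self, gc):
        # Entry barrier: runs on method entry and repatches stale code.
        if self.patched_epoch != gc.epoch:
            op = BARRIER if gc.barriers_needed else NOP
            self.code = [op] * len(self.code)
            self.patched_epoch = gc.epoch
        # Return how many barrier sites would execute during this call.
        return self.code.count(BARRIER)
```

The model only patches on entry, so it says nothing about code that is already running when the
phase changes; that case would need separate handling.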

