[master] RFR: Implement non-racy fast-locking [v2]

Fri Jul 29 04:06:11 UTC 2022

On Wed, 27 Jul 2022 09:48:03 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

>> src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp line 80:
>> 
>>> 78: 
>>> 79:   movptr(disp_hdr, Address(obj, hdr_offset));
>>> 80:   andb(disp_hdr, ~0x3); // Clear lowest two bits. 8-bit AND preserves upper bits.
>> 
>> I see you added a new andb instruction so that you can clear two low order bits while preserving the others. It's worth noting that the immediates are sign extended. So I don't think you need to do that. For example you could do and with -4 of any signed immediate size to clear the low order 2 bits only.
>
> Right. The downside is that the instruction encoding is larger (32 vs 16 bits, I believe). I don't think it matters much, though. I'll do what you suggest.

Note that on x86, an instruction that reads a register last written by a narrower instruction can incur a partial-register stall. In this case, a later 32-bit (or wider) read of `disp_hdr` could stall because the last write was only 8 bits wide. So `andb` would be less efficient than `andptr`, which is what you have now switched to.

As a side note, x86-64 avoids this issue for 32-bit writes: a 32-bit write to a register is actually a full 64-bit write of the zero-extended value.
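
To make the arithmetic concrete, here is a rough C++ model of the register semantics discussed in this thread. It is only a sketch: the helper names (`and_imm8_sign_extended`, `and_low_byte`, `write_32bit`, `results_agree_for_all_low_bytes`) are made up for illustration and are not HotSpot or assembler APIs, and the model says nothing about timing or stalls. It only shows that a full-width AND with the sign-extended immediate -4 yields the same value as an 8-bit AND of the low byte with 0xFC, and that a 32-bit write zero-extends.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 64-bit AND with an 8-bit immediate.
// The immediate is sign-extended, so -4 becomes 0xFFFFFFFFFFFFFFFC.
constexpr uint64_t and_imm8_sign_extended(uint64_t reg, int8_t imm) {
    return reg & static_cast<uint64_t>(static_cast<int64_t>(imm));
}

// Hypothetical model of an 8-bit AND on the low byte of a register.
// The upper 56 bits are left untouched.
constexpr uint64_t and_low_byte(uint64_t reg, uint8_t imm) {
    return (reg & ~uint64_t{0xFF}) | (reg & imm & 0xFF);
}

// Hypothetical model of a 32-bit register write on x86-64.
// The old contents are discarded and the new value is zero-extended.
constexpr uint64_t write_32bit(uint64_t /*old_value*/, uint32_t value) {
    return static_cast<uint64_t>(value);
}

// Both forms clear only the two lowest bits.
static_assert(and_imm8_sign_extended(0xDEADBEEFCAFEF00FULL, -4) ==
              0xDEADBEEFCAFEF00CULL, "wide AND with -4 clears bits 0 and 1");
static_assert(and_low_byte(0xDEADBEEFCAFEF00FULL, 0xFC) ==
              0xDEADBEEFCAFEF00CULL, "8-bit AND with 0xFC clears bits 0 and 1");
static_assert(write_32bit(0xFFFFFFFFFFFFFFFFULL, 0x12345678u) ==
              0x0000000012345678ULL, "32-bit write zero-extends");

// Checks that the two AND forms agree for every possible low byte,
// with the upper bits of the register taken from `upper`.
bool results_agree_for_all_low_bytes(uint64_t upper) {
    for (uint64_t low = 0; low < 256; ++low) {
        const uint64_t reg = (upper & ~uint64_t{0xFF}) | low;
        if (and_imm8_sign_extended(reg, -4) != and_low_byte(reg, 0xFC)) {
            return false;
        }
    }
    return true;
}

// Runtime sanity check of the same claims.
void check_example() {
    assert(results_agree_for_all_low_bytes(0));
    assert(results_agree_for_all_low_bytes(0xFFFFFFFFFFFFFF00ULL));
}
```

Since the resulting values are identical, the choice between the two forms comes down to encoding size and the partial-register behaviour mentioned above.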

Thanks.

-------------

PR: https://git.openjdk.org/lilliput/pull/51