[External] : Re: Why is mark read as MO_RELAXED in read_stable_mark()?

Mon Jul 12 21:18:50 UTC 2021

On 7/12/21 4:26 PM, Roman Kennke wrote:
>>> Hi Hotspot runtime devs,
>>>
>>> I am messing a little bit with read_stable_mark() in 
>>> synchronizer.cpp in Lilliput project (because I want to load the 
>>> Klass* from the header).
>>>
>>> I notice that, in read_stable_mark(), we are loading the header with 
>>> MO_RELAXED memory ordering, even though the INFLATING header is 
>>> stored either via CAS or (re-)stored with MO_RELEASE. Wouldn't it be 
>>> more consequential to load the header with MO_ACQUIRE instead?
>> For the monitor case I think we already have an address dependency in 
>> the reader so there is no need for the acquire. So we first read the 
>> monitor address and then use that address to access the _header 
>> field(FastHashCode() case) or _owner 
>> field(current_thread_holds_lock() and get_lock_owner() cases). The 
>> writer in turn orders the stores to _owner/_header and the publishing 
>> of the monitor address by using a release store in inflate() (I think 
>> this is the one you are referring to).
>>
>
> Thanks Patricio,
>
> I am not quite sure that I follow you.
>
> On the reader side, given an object, when a thread calls into 
> FastHashCode() or any locking that causes inflation, it needs to read 
> the object header, and that is always done with a plain read. In 
> particular, read_stable_mark() does that, while waiting for the header 
> to become != INFLATING. On the writer side, we use CAS (using 
> MO_SEQ_CST) to temporarily turn on INFLATING(with release_store() to 
> write back the original header, and store with MO_RELEASE to write 
> back the original header.
>
> It probably doesn't matter all that much.
>
> I am wondering about this because I am currently moving the Klass* 
> into the object header, which means that for klass() implementation, 
> we need to load the header similar to how we do it for hashcode(). In 
> particular it involves read_stable_mark(). Now, there is a variant of 
> klass-accessor that implies acquire-semantics: 
> oopDesc::klass_or_null_acquire(). I am not quite sure why exactly we 
> need acquire semantics there, but in order to implement it, I believe 
> we would have to make a variant of read_stable_mark() that uses 
> acquire semantics to load the header, instead of plain reads.
>
> But I'm thinking it may be cleaner overall to always use acquire 
> semantics in read_stable_mark() (mirroring the release-store and 
> seq-cst-CAS), which would make the acquire-accessors for klass() 
> superfluous.
>
> What do you think?
It definitely doesn't break correctness if you add it and probably 
doesn't change anything in terms of performance. Now, looking back in 
history, klass_or_null_acquire() was added in 8166583 to protect against 
reading a partially allocated object. The description of the issue 
actually argues to explicitly have klass_or_null_acquire() instead of 
making klass_or_null() quietly have acquire semantics. If that is the 
case then I think you would still need to have a version with a plain 
load without the extra fence. You could probably have two versions of 
read_stable_mark() by refactoring everything except the load of the mark 
and check into a separate shared function. Also by removing the uneeded 
extra code before the 'for' loop will make them even shorter. Another 
option, although more conservative in terms of memory ordering on 
certain cpus, is to keep read_stable_mark() as is and have 
klass_or_null_acquire() use a standalone OrderAccess::acquire() fence 
before returning. I think any solution would work, but maybe wait for 
somebody else to chime in to hear other thoughts.

Thanks,
Patricio