[External] : Re: Why is mark read as MO_RELAXED in read_stable_mark()?
David Holmes
david.holmes at oracle.com
Mon Jul 19 00:35:37 UTC 2021
Hi Roman,
Sorry for the delay in getting to this ...
On 13/07/2021 7:18 am, patricio.chilano.mateo at oracle.com wrote:
>
> On 7/12/21 4:26 PM, Roman Kennke wrote:
>>>> Hi Hotspot runtime devs,
>>>>
>>>> I am messing a little bit with read_stable_mark() in
>>>> synchronizer.cpp in Lilliput project (because I want to load the
>>>> Klass* from the header).
>>>>
>>>> I notice that, in read_stable_mark(), we are loading the header with
>>>> MO_RELAXED memory ordering, even though the INFLATING header is
>>>> stored either via CAS or (re-)stored with MO_RELEASE. Wouldn't it be
>>>> more consequential to load the header with MO_ACQUIRE instead?
>>> For the monitor case I think we already have an address dependency in
>>> the reader so there is no need for the acquire. So we first read the
>>> monitor address and then use that address to access the _header
>>> field(FastHashCode() case) or _owner
>>> field(current_thread_holds_lock() and get_lock_owner() cases). The
>>> writer in turn orders the stores to _owner/_header and the publishing
>>> of the monitor address by using a release store in inflate() (I think
>>> this is the one you are referring to).
>>>
>>
>> Thanks Patricio,
>>
>> I am not quite sure that I follow you.
>>
>> On the reader side, given an object, when a thread calls into
>> FastHashCode() or any locking that causes inflation, it needs to read
>> the object header, and that is always done with a plain read. In
>> particular, read_stable_mark() does that, while waiting for the header
>> to become != INFLATING. On the writer side, we use CAS (using
>> MO_SEQ_CST) to temporarily turn on INFLATING(with release_store() to
>> write back the original header, and store with MO_RELEASE to write
>> back the original header.
>>
>> It probably doesn't matter all that much.
>>
>> I am wondering about this because I am currently moving the Klass*
>> into the object header, which means that for klass() implementation,
>> we need to load the header similar to how we do it for hashcode(). In
>> particular it involves read_stable_mark(). Now, there is a variant of
>> klass-accessor that implies acquire-semantics:
>> oopDesc::klass_or_null_acquire(). I am not quite sure why exactly we
>> need acquire semantics there, but in order to implement it, I believe
>> we would have to make a variant of read_stable_mark() that uses
>> acquire semantics to load the header, instead of plain reads.
>>
>> But I'm thinking it may be cleaner overall to always use acquire
>> semantics in read_stable_mark() (mirroring the release-store and
>> seq-cst-CAS), which would make the acquire-accessors for klass()
>> superfluous.
>>
>> What do you think?
> It definitely doesn't break correctness if you add it and probably
> doesn't change anything in terms of performance. Now, looking back in
> history, klass_or_null_acquire() was added in 8166583 to protect against
> reading a partially allocated object. The description of the issue
> actually argues to explicitly have klass_or_null_acquire() instead of
> making klass_or_null() quietly have acquire semantics. If that is the
> case then I think you would still need to have a version with a plain
> load without the extra fence. You could probably have two versions of
> read_stable_mark() by refactoring everything except the load of the mark
> and check into a separate shared function. Also by removing the uneeded
> extra code before the 'for' loop will make them even shorter. Another
> option, although more conservative in terms of memory ordering on
> certain cpus, is to keep read_stable_mark() as is and have
> klass_or_null_acquire() use a standalone OrderAccess::acquire() fence
> before returning. I think any solution would work, but maybe wait for
> somebody else to chime in to hear other thoughts.
Looking at the uses of read_stable_mark() I would say that FastHashCode
and get_lock_owner() should have acquire semantics to match the implied
release semantics used when there is inflation and those methods have to
read fields of the monitor. I'm not convinced we can rely on any address
dependency for semantics in shared code - and it seems too subtle a
point even if we can.
The other uses don't need acquire semantics, so I'd argue for adding a
memory order parameter to read_stable_mark so that it can be controlled
explicitly. That memory ordering parameter would then need to be passed
through to the call to obj->mark() - either as a memory order parameter
or by calling mark() or a new mark_acquire() (adding mark_acquire()
seems more in keeping with current oopDesc API design style).
Cheers,
David
> Thanks,
> Patricio
More information about the hotspot-runtime-dev
mailing list