[External] : Re: Why is mark read as MO_RELAXED in read_stable_mark()?

David Holmes david.holmes at oracle.com
Mon Jul 19 11:29:35 UTC 2021


On 19/07/2021 8:05 pm, Roman Kennke wrote:
> Hi David,
> 
>>>>>> I am messing a little bit with read_stable_mark() in 
>>>>>> synchronizer.cpp in Lilliput project (because I want to load the 
>>>>>> Klass* from the header).
>>>>>>
>>>>>> I notice that, in read_stable_mark(), we are loading the header 
>>>>>> with MO_RELAXED memory ordering, even though the INFLATING header 
>>>>>> is stored either via CAS or (re-)stored with MO_RELEASE. Wouldn't 
>>>>>> it be more consequential to load the header with MO_ACQUIRE instead?
>>>>> For the monitor case I think we already have an address dependency 
>>>>> in the reader so there is no need for the acquire. So we first read 
>>>>> the monitor address and then use that address to access the _header 
>>>>> field(FastHashCode() case) or _owner 
>>>>> field(current_thread_holds_lock() and get_lock_owner() cases). The 
>>>>> writer in turn orders the stores to _owner/_header and the 
>>>>> publishing of the monitor address by using a release store in 
>>>>> inflate() (I think this is the one you are referring to).
>>>>>
>>>>
>>>> Thanks Patricio,
>>>>
>>>> I am not quite sure that I follow you.
>>>>
>>>> On the reader side, given an object, when a thread calls into 
>>>> FastHashCode() or any locking that causes inflation, it needs to 
>>>> read the object header, and that is always done with a plain read. 
>>>> In particular, read_stable_mark() does that, while waiting for the 
>>>> header to become != INFLATING. On the writer side, we use CAS (using 
>>>> MO_SEQ_CST) to temporarily turn on INFLATING(with release_store() to 
>>>> write back the original header, and store with MO_RELEASE to write 
>>>> back the original header.
>>>>
>>>> It probably doesn't matter all that much.
>>>>
>>>> I am wondering about this because I am currently moving the Klass* 
>>>> into the object header, which means that for klass() implementation, 
>>>> we need to load the header similar to how we do it for hashcode(). 
>>>> In particular it involves read_stable_mark(). Now, there is a 
>>>> variant of klass-accessor that implies acquire-semantics: 
>>>> oopDesc::klass_or_null_acquire(). I am not quite sure why exactly we 
>>>> need acquire semantics there, but in order to implement it, I 
>>>> believe we would have to make a variant of read_stable_mark() that 
>>>> uses acquire semantics to load the header, instead of plain reads.
>>>>
>>>> But I'm thinking it may be cleaner overall to always use acquire 
>>>> semantics in read_stable_mark() (mirroring the release-store and 
>>>> seq-cst-CAS), which would make the acquire-accessors for klass() 
>>>> superfluous.
>>>>
>>>> What do you think?
>>> It definitely doesn't break correctness if you add it and probably 
>>> doesn't change anything in terms of performance. Now, looking back in 
>>> history, klass_or_null_acquire() was added in 8166583 to protect 
>>> against reading a partially allocated object. The description of the 
>>> issue actually argues to explicitly have klass_or_null_acquire() 
>>> instead of making klass_or_null() quietly have acquire semantics. If 
>>> that is the case then I think you would still need to have a version 
>>> with a plain load without the extra fence. You could probably have 
>>> two versions of read_stable_mark() by refactoring everything except 
>>> the load of the mark and check into a separate shared function. Also 
>>> by removing the uneeded extra code before the 'for' loop will make 
>>> them even shorter. Another option, although more conservative in 
>>> terms of memory ordering on certain cpus, is to keep 
>>> read_stable_mark() as is and have klass_or_null_acquire() use a 
>>> standalone OrderAccess::acquire() fence before returning. I think any 
>>> solution would work, but maybe wait for somebody else to chime in to 
>>> hear other thoughts.
>>
>> Looking at the uses of read_stable_mark() I would say that 
>> FastHashCode and get_lock_owner() should have acquire semantics to 
>> match the implied release semantics used when there is inflation and 
>> those methods have to read fields of the monitor. I'm not convinced we 
>> can rely on any address dependency for semantics in shared code - and 
>> it seems too subtle a point even if we can.
> 
> Ok, I agree.
> 
>> The other uses don't need acquire semantics
> 
> Why do you think so?
> 
> I see a use in current_thread_holds_lock(), and how is that different 
> from get_lock_owner() ? It also reads a field of the monitor after 
> loading the header word.

Because that thread is only interested in correctly seeing values 
written by itself. It can't get any kind of false positive. If it is the 
owner it must see its own writes; if it is not the owner it can't see 
itself incorrectly listed as the owner.

> And then I see another use in ::inflate(), which seems to me the primary 
> use where acquire is needed. It certainly reads from the monitor after 
> loading the header. I'd actually go a little further here, and also give 
> acquire semantics to the first plain header-loads in that method. 
> (In-fact, why not restructure the retry-loop to do read_stable_mark() 
> once at the loop-entry?)

Yes you're right I didn't reason correctly about inflate. Just taking 
the first inflated case we read the displaced markword, but without 
acquire semantics on the original markWord, we're not guaranteed to see 
the displaced header (except by the previously mentioned address 
dependency).

Cheers,
David


> 
> , so I'd argue for adding a
>> memory order parameter to read_stable_mark so that it can be 
>> controlled explicitly. That memory ordering parameter would then need 
>> to be passed through to the call to obj->mark() - either as a memory 
>> order parameter or by calling mark() or a new mark_acquire() (adding 
>> mark_acquire() seems more in keeping with current oopDesc API design 
>> style).
> 
> In my local prototype, I have added a template parameter to 
> read_stable_mark and pass that to (a new) oopDesc::mark() that also 
> takes the MO template parameter to load the header word.
> 
> I filed:
> https://bugs.openjdk.java.net/browse/JDK-8270894
> 
> Thanks,
> Roman
> 


More information about the hotspot-runtime-dev mailing list