RFR(L) 8153224 Monitor deflation prolong safepoints (CR7/v2.07/10-for-jdk14)
Daniel D. Daugherty
daniel.daugherty at oracle.com
Mon Nov 4 21:20:00 UTC 2019
Hi David,
This set of comments is not addressed in the CR8/v2.08/11-for-jdk14
code review request that I just sent out.
I'll have to go through these comments and address them as part of the
CR8 resolution cycle.
Dan
On 11/4/19 1:28 AM, David Holmes wrote:
> Hi Dan,
>
> A few follow ups to your responses, with trimming ...
>
> On 30/10/2019 6:20 am, Daniel D. Daugherty wrote:
>> On 10/24/19 7:00 AM, David Holmes wrote:
>>> 122 // Set _owner field to new_value; current value must match
>>> old_value.
>>> 123 inline void ObjectMonitor::set_owner_from(void* new_value,
>>> void* old_value) {
>>> 124 void* prev = Atomic::cmpxchg(new_value, &_owner, old_value);
>>> 125 ADIM_guarantee(prev == old_value, "unexpected prev owner="
>>> INTPTR_FORMAT
>>>
>>> The use of cmpxchg seems a little strange here if you are asserting
>>> that when this is called _owner must equal old_value. That means you
>>> don't expect any race and if there is no race with another thread
>>> writing to _owner then you don't need the cmpxchg. A normal:
>>>
>>> if (_owner == old_value) {
>>> Atomic::store(&_owner, new_value);
>>> log(...);
>>> } else {
>>> guarantee(false, " unexpected old owner ...");
>>> }
>>
>> The two parameter version of set_owner_from() is only called from three
>> places and we'll cover two of them here:
>>
>> src/hotspot/share/runtime/objectMonitor.cpp:
>>
>> 1041 if (AsyncDeflateIdleMonitors) {
>> 1042 set_owner_from(NULL, Self);
>> 1043 } else {
>> 1044 OrderAccess::release_store(&_owner, (void*)NULL); // drop
>> the lock
>> 1045 OrderAccess::storeload(); // See if
>> we need to wake a successor
>> 1046 }
>>
>> and:
>>
>> 1221 if (AsyncDeflateIdleMonitors) {
>> 1222 set_owner_from(NULL, Self);
>> 1223 } else {
>> 1224 OrderAccess::release_store(&_owner, (void*)NULL);
>> 1225 OrderAccess::fence(); // ST
>> _owner vs LD in unpark()
>> 1226 }
>>
>> So I've replaced the existing {release_store(), storeload()} combo
>> for one
>> call site and the existing {release_store(), fence()} combo for the
>> other
>> call site with a cmpxchg(). I chose cmpxchg() for these reasons:
>>
>> 1) I wanted the same memory sync behavior at both call sites.
>> 2) I wanted similar/same memory sync behavior as the original
>> code at those call sites.
>
> Why? The memory sync requirements for non-async deflation may be
> completely different to those required for async-delfation (given all
> the other bits if the protocol).
>
>> 3) I wanted the return value from cmpxchg() for my state machine
>> sanity check.
>
> I'm somewhat dubious about using cmpxchg just for the side-effect of
> getting the existing value.
>
>> I don't think that using 'Atomic::store(&_owner, new_value)' is the
>> right choice for these two call sites.
>
> If you don't actually need the cmpxchg to handle concurrent updates to
> the _owner field, then a plain store (not an Atomic::store - that was
> an error on my part) does not seem unreasonable; or if there are still
> memory sync issues here, perhaps a release_store.
>
> If you use cmpxchg then anyone reading the code will assume there is a
> concurrent update that you are guarding against.
>
>> The last two parameter set_owner_from() is talked about in the
>> next reply.
>>
>>
>>> Similarly for the old_value1/old_valuie2 version.
>>
>> The three parameter version of set_owner_from() is only called from one
>> place and the last two parameter version is called from the same place:
>>
>> src/hotspot/share/runtime/synchronizer.cpp:
>>
>> 1903 if (AsyncDeflateIdleMonitors) {
>> 1904 m->set_owner_from(mark.locker(), NULL, DEFLATER_MARKER);
>> 1905 } else {
>> 1906 m->set_owner_from(mark.locker(), NULL);
>> 1907 }
>>
>> The original code was:
>>
>> 1399 m->set_owner(mark.locker());
>>
>> The original set_owner() code was defined like this:
>>
>> 87 inline void ObjectMonitor::set_owner(void* owner) {
>> 88 _owner = owner;
>> 89 }
>>
>> So the original code didn't do any memory sync'ing at all and I've
>> changed that to a cmpxchg() on both code paths. That appears to be
>> overkill for that callsite...
>
> Again I'm not sure any memory sync requirements from the non-async
> case should necessarily transfer over to the async case. Even if you
> end up requiring similar memory sync the reasoning would be quite
> different I would expect.
>
>>
>> We're in ObjectSynchronizer::inflate(), in the "CASE: stack-locked"
>> section of the code. We've gotten our ObjectMonitor from om_alloc()
>> and are initializing a number of fields in the ObjectMonitor. The
>> ObjectMonitor is not published until we do:
>>
>> 1916 object->release_set_mark(markWord::encode(m));
>>
>> So we don't need the memory sync'ing features of the cmpxchg() for
>> either of the set_owner_from() calls and all that leaves is the
>> state machine sanity check.
>>
>> I really like the state machine sanity check on the owner field but
>> that's just because it came in handy when chasing the recent races.
>> It would be easy to change the three parameter version of
>> set_owner_from() to not do memory sync'ing, but still do the state
>> machine sanity check.
>>
>> Update: Changing the three parameter version of set_owner_from()
>> may impact the changes to owner_is_DEFLATER_MARKER() discussed
>> above. Sigh...
>> Update 2: Probably no impact because the three parameter version of
>> set_owner_from() is only used before the ObjectMonitor is published
>> and owner_is_DEFLATER_MARKER() is used after the ObjectMonitor has
>> appeared on an in-use list.
>>
>> However, the two parameter version of set_owner_from() needs its
>> memory sync'ing behavior for it's objectMonitor.cpp call sites so
>> this call site would need something different.
>>
>> I'm not sure which solution I'm going to pick yet, but I definitely
>> have to change something here since we don't need cmpxchg() at this
>> call site. More thought is required.
>
> I will look to see where this ended up.
>
>>> src/hotspot/share/runtime/objectMonitor.cpp
>>>
>>>
>>> 267 if (AsyncDeflateIdleMonitors &&
>>> 268 try_set_owner_from(Self, DEFLATER_MARKER) ==
>>> DEFLATER_MARKER) {
>>
>> For more context, we are in:
>>
>> 241 void ObjectMonitor::enter(TRAPS) {
>>
>>
>>> I don't see why you need to call try_set_owner_from again here as
>>> "cur" will already be DEFLATER_MARKER from the previous try_set_owner.
>>
>> I assume the previous try_set_owner() call you mean is this one:
>>
>> 248 void* cur = try_set_owner_from(Self, NULL);
>>
>> This first try_set_owner() is for the most common case of no owner.
>>
>> The second try_set_owner() call is for a different condition than the
>> first:
>>
>> 268 try_set_owner_from(Self, DEFLATER_MARKER) ==
>> DEFLATER_MARKER) {
>>
>> L248 is trying to change the _owner field from NULL -> 'Self'.
>> L268 is trying to change the _owner field from DEFLATER_MARKER to
>> 'Self'.
>>
>> If the try_set_owner() call on L248 fails, 'cur' can be several possible
>> values:
>>
>> - the calling thread (recursive enter is handled on L254-7)
>> - other owning thread value (BasicLock* or Thread*)
>> - DEFLATER_MARKER
>
> I'll give a caution okay to that explanation (the deficiency being in
> my understanding, not your explaining :) ).
>
>>> Further, I don't see how installing self as the _owner here is valid
>>> and means you acquired the monitor, as the fact it was
>>> DEFLATER_MARKER means it is still being deflated by another thread
>>> doesn't it ???
>>
>> I guess the comment after L268 didn't work for you:
>>
>> 269 // The deflation protocol finished the first part (setting
>> owner),
>> 270 // but it failed the second part (making ref_count
>> negative) and
>> 271 // bailed. Or the ObjectMonitor was async deflated and reused.
>>
>> It means that the deflater thread was racing with this enter and
>> managed to set the owner field to DEFLATER_MARKER as the first step
>> in the deflation protocol. Our entering thread actually won the race
>> when it managed to set the ref_count to a positive value as part of
>> the ObjectMonitorHandle stuff done in the inflate() call that preceded
>> the enter() call. However, the deflater thread hasn't realized that it
>> lost the race yet and hasn't restored the owner field back to NULL.
>
> You're right the comment didn't work for me as it required me to be
> holding too much of the protocol in my head. Makes more sense now.
>
> Thanks,
> David
> -----
More information about the hotspot-runtime-dev
mailing list