RFR(S): 8236035: refactor ObjectMonitor::set_owner() and _owner field setting

Tue Jan 28 23:16:01 UTC 2020

On 29/01/2020 3:11 am, Daniel D. Daugherty wrote:
> Hi David and Kim,
> 
> David, thanks for chiming in on this comment.
> 
> More below...
> 
> 
> On 1/27/20 10:06 PM, David Holmes wrote:
>> Hi Kim,
>>
>> Picking up on one area of comments ...
>>
>> On 28/01/2020 7:44 am, Kim Barrett wrote:
>>>> On Jan 27, 2020, at 3:06 PM, Daniel D. Daugherty 
>>>> <daniel.daugherty at oracle.com> wrote:
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/runtime/objectMonitor.inline.hpp
>>>    88 // Clear _owner field; current value must match old_value.
>>>    89 // If needs_fence is true, we issue a fence() after the release_store().
>>>    90 // Otherwise, a storeload() is good enough. See the callers for more info.
>>>    91 inline void ObjectMonitor::release_clear_owner_with_barrier(void* old_value,
>>>    92                                                              bool needs_fence) {
>>>
>>> Here's what I was going to say about the original version:
>>>
>>> I would prefer the description of needs_fence to be on the API
>>> declaration in the header.  On the other hand, I think that argument
>>> is unnecessary specialization that has no actual effect (other than to
>>> perhaps make the code larger and slower on some platforms). storeload
>>> is equivalent to fence on every platform in HotSpot.  (Either
>>> storeload is implemented as a call to fence, or they have the same
>>> definitions in terms of some platform-specific thing.)  I think it
>>> should just use release_store_fence (and perhaps adjust the function's
>>> name).
>>
>> The code should use the operation that is semantically required for 
>> correctness, irrespective of what the underlying implementation is 
>> equivalent to.
>>
>>> The second version has reverted to having the callers specify the
>>> barrier explicitly.  It looks to me like the distinction in the two
>>> call sites is just different authors at different times.
>>
>> Actually this is all Dave Dice code from 2005.
> 
> David, thanks for digging this up. I know that I dug it up once before
> when you commented on this code in the 8153224 review thread so you
> saved me the spelunking trip...

Ha! Wish I'd saved myself the trip - I'd forgotten all the details from 
previous discussions :(

> 
>>
>>>  For example,
>>> the fence at objectMonitor.cpp:1097 is commented as being for store of
>>> _owner vs load in unpark, so seems like it could be a storeload if one
>>> wanted to go that way.
>>
>> Must admit I find that comment hard to understand.
>>
>> Let's examine the two code fragments. Here's current code for first 
>> chunk:
>>
>>  919     Atomic::release_store(&_owner, (void*)NULL);   // drop the lock
>>  920     OrderAccess::storeload();                      // See if we need to wake a successor
>>  921     if ((intptr_t(_EntryList)|intptr_t(_cxq)) == 0 || _succ != NULL) {
>>  922       return;
>>  923     }
>>
>> Back in 2005 we had some additional commentary that explained this:
>>
>>  // Observe the Dekker/Lamport duality:
>> 2801       // A thread in ::exit() executes:
>> 2802       //   ST Owner=null; MEMBAR; LD EntryList|cxq.
>> 2803       // A thread in the contended ::enter() path executes the complementary:
>> 2804       //   ST EntryList|cxq = nonnull; MEMBAR; LD Owner.
>>
>> The MEMBAR referenced at L2802 is the storeload we see above in 
>> current code. The enter code is covered by this comment (in part) in 
>> current code:
>>
>>  527   // Note the Dekker/Lamport duality: ST cxq; MEMBAR; LD Owner.
>>  528   // In this case the ST-MEMBAR is accomplished with CAS().
>>
>> So the storeload is all that is required here.
> 
> When David and I discussed this in the 8153224 thread, we concluded that
> the storeload was all that we needed and we chose not to change it to a
> fence(). That discussion motivated the more fleshed out comment here
> in the proposed change:
> 
>   915     // Uses a storeload to separate release_store(owner) from the
>   916     // successor check. The try_set_owner() below uses cmpxchg() so
>   917     // we get the fence down there.
>   918     release_clear_owner(Self);
>   919     OrderAccess::storeload();
>   920
>   921     if ((intptr_t(_EntryList)|intptr_t(_cxq)) == 0 || _succ != NULL) {
>   922       return;
>   923     }

After re-reading various versions of the code and comments from 
2005-2007 I can't help but feel we've lost some of the big picture here 
when it comes to explaining the memory barriers:
- the release_store of _owner is necessary to ensure anyone seeing 
_owner as NULL is guaranteed to see the previous "meta-data" updates
- the storeload() is needed as part of the Dekker-duality used when 
accessing _owner and _succ
- the fence() is needed for ... well that depends on the exact location 
of the fence.
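
A minimal sketch of the Dekker-style pairing from the second bullet, written against std::atomic rather than HotSpot's Atomic/OrderAccess. ToyMonitor, toy_exit() and toy_enter() are hypothetical names used only for illustration, and the explicit fence on the enter side is an assumption standing in for the CAS that the quoted comment says supplies the ST-MEMBAR:

```cpp
#include <atomic>

// Hypothetical stand-in for the two ObjectMonitor fields involved;
// this is an illustration, not HotSpot code.
struct ToyMonitor {
  std::atomic<void*> owner{nullptr};
  std::atomic<void*> cxq{nullptr};
};

// Exit side: ST owner=null; MEMBAR; LD cxq.
// Returns true if a waiter is visible and needs a wakeup.
inline bool toy_exit(ToyMonitor& m) {
  m.owner.store(nullptr, std::memory_order_release);    // drop the lock
  std::atomic_thread_fence(std::memory_order_seq_cst);  // plays the storeload role
  return m.cxq.load(std::memory_order_relaxed) != nullptr;
}

// Enter side: ST cxq=nonnull; MEMBAR; LD owner.
// Returns true if the lock looks free, so the caller should retry
// rather than park.
inline bool toy_enter(ToyMonitor& m, void* self) {
  m.cxq.store(self, std::memory_order_relaxed);         // announce the waiter
  std::atomic_thread_fence(std::memory_order_seq_cst);  // assumed stand-in for the CAS
  return m.owner.load(std::memory_order_relaxed) == nullptr;
}
```

With a barrier on both sides, at least one of the two threads must observe the other's store: either toy_exit() sees the waiter, or toy_enter() sees the lock free. Without either barrier both loads can miss, which is the lost wakeup the duality is there to prevent.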

Cheers,
David
-----

> 
>> In ExitEpilog we currently have:
>>
>> 1094   // Drop the lock
>> 1095   Atomic::release_store(&_owner, (void*)NULL);
>> 1096   OrderAccess::fence();                               // ST _owner vs LD in unpark()
>> 1097
>> 1098   DTRACE_MONITOR_PROBE(contended__exit, this, object(), Self);
>> 1099   Trigger->unpark();
>>
>> Back in 2005 this was:
>>
>> 2670    _owner = NULL ;
>> 2671    OrderAccess::fence() ;
>> 2672
>> <unrelated comments elided>
>> 2682    if (SafepointSynchronize::do_call_back()) {
>> 2683       TEVENT (unpark before SAFEPOINT) ;
>> 2684    }
>> 2685
>> <unrelated comments elided>
>> 2699    Trigger->unpark() ;
>>
>> No explanation for the fence() - though to me this is needed to ensure 
>> visibility of the store to _owner, as well as ensuring ordering with 
>> the unpark code. The comment was added (again by Dice) in Feb 2007. So 
>> the question is:
>> - is the fence() stronger than what we need, or is the comment 
>> incomplete?
>> I tend to favor the latter, given the fence has always been there.
>>
>> So I remain in favor of isolating the trailing memory-barrier from the 
>> release_store of _owner.
> 
> When David and I discussed this in the 8153224 thread, we concluded that
> this location needed the full fence() so that motivated the only slightly
> fleshed out comment here:
> 
> 1095   // Uses a fence to separate release_store(owner) from the LD in unpark().
> 1096   release_clear_owner(Self);
> 1097   OrderAccess::fence();
> 
> 
> In summary, I'm planning to leave these two pieces of code as they are
> in this version of the fix.
> 
> Kim, please confirm that you are okay with this or not.
> 
> David, thanks again for saving me the spelunking work.
> 
> Dan
> 
> 
>>
>> Cheers,
>> David
>> -----
>>
>>
>>> So I still suggest just using release_store_fence, possibly with a
>>> function name adjustment.
>>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/runtime/objectMonitor.inline.hpp
>>>   102   void* prev = _owner;
>>>   105   _owner = new_value;
>>>   114   void* prev = _owner;
>>>   119   _owner = self;
>>>
>>> Consider using Atomic::load and Atomic::store for these, making the
>>> intent of a relaxed atomic operation explicit.  Though that might be
>>> seen as inconsistent with various other places where _owner is being
>>> directly read or written.
>>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/runtime/objectMonitor.inline.hpp
>>>    90   void* prev = _owner;
>>>
>>> Consider making this DEBUG_ONLY, and changing the later log_trace to
>>> use old_value instead of prev.  This would allow the load of _owner
>>> to be eliminated in a release build; that isn't permitted as written,
>>> because it's volatile.
>>>
>>> (The old_value seems to be (nearly?) always either a value we already
>>> have for other reasons, or NULL, so the only additional cost we're
>>> paying for it is register pressure to keep it around until the possible
>>> tracing use.)
>>>
>>> Similarly in the other nearby functions.
>>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/runtime/objectMonitor.cpp
>>> In ObjectMonitor::exit.
>>>   864   if (THREAD != _owner) {
>>>   865     void* cur = _owner;
>>>
>>> The value of _owner is being captured in a variable to avoid multiple
>>> reads in the code below.  I don't see any reason not to exchange these
>>> two lines and change the compare to use cur instead of _owner.
>>> (_owner being volatile prevents the compiler from automatically
>>> coalescing the loads.)
>>>
>>> Similarly in complete_exit, around line 1124.
>>> Similarly in check_owner, around line 1175.
>>>
>>> ------------------------------------------------------------------------------ 
>>>
>>>
>>>
>