RFR(S): 8236035: refactor ObjectMonitor::set_owner() and _owner field setting

Wed Jan 29 00:28:03 UTC 2020

Hi Dan,

My comments re comments was more of an aside - something to look at 
separately.

I will review your latest changes asap - slightly swamped at the moment.

Thanks,
David

On 29/01/2020 10:00 am, Daniel D. Daugherty wrote:
> On 1/28/20 6:16 PM, David Holmes wrote:
>> On 29/01/2020 3:11 am, Daniel D. Daugherty wrote:
>>> Hi David and Kim,
>>>
>>> David, thanks for chiming in on this comment.
>>>
>>> More below...
>>>
>>>
>>> On 1/27/20 10:06 PM, David Holmes wrote:
>>>> Hi Kim,
>>>>
>>>> Picking up on one area of comments ...
>>>>
>>>> On 28/01/2020 7:44 am, Kim Barrett wrote:
>>>>>> On Jan 27, 2020, at 3:06 PM, Daniel D. Daugherty 
>>>>>> <daniel.daugherty at oracle.com> wrote:
>>>>> ------------------------------------------------------------------------------ 
>>>>>
>>>>> src/hotspot/share/runtime/objectMonitor.inline.hpp
>>>>>    88 // Clear _owner field; current value must match old_value.
>>>>>    89 // If needs_fence is true, we issue a fence() after the 
>>>>> release_store().
>>>>>    90 // Otherwise, a storeload() is good enough. See the callers 
>>>>> for more info.
>>>>>    91 inline void 
>>>>> ObjectMonitor::release_clear_owner_with_barrier(void* old_value,
>>>>> 92 bool needs_fence) {
>>>>>
>>>>> Here's what I was going to say about the original version:
>>>>>
>>>>> I would prefer the description of needs_fence to be on the API
>>>>> declaration in the header.  On the other hand, I think that argument
>>>>> is unnecessary specialization that has no actual effect (other than to
>>>>> perhaps make the code larger and slower on some platforms). storeload
>>>>> is equivalent to fence on every platform in HotSpot. (Either
>>>>> storeload is implemented as a call to fence, or they have the same
>>>>> definitions in terms of some platform-specific thing.)  I think it
>>>>> should just use release_store_fence (and perhaps adjust the function's
>>>>> name).
>>>>
>>>> The code should use the operation that is semantically required for 
>>>> correctness, irrespective of what the underlying implementation is 
>>>> equivalent to.
>>>>
>>>>> The second version has reverted to having the callers specify the
>>>>> barrier explicitly.  It looks to me like the distinction in the two
>>>>> call sites is just different authors at different times.
>>>>
>>>> Actually this is all Dave Dice code from 2005.
>>>
>>> David, thanks for digging this up. I know that I dug it up once before
>>> when you commented on this code in the 8153224 review thread so you
>>> saved me the spelunking trip...
>>
>> Ha! Wish I'd saved myself the trip - I'd forgotten all the details 
>> from previous discussions :(
>>
>>>
>>>>
>>>>>  For example,
>>>>> the fence at objectMonitor.cpp:1097 is commented as being for store of
>>>>> _owner vs load in unpark, so seems like it could be a storeload if one
>>>>> wanted to go that way.
>>>>
>>>> Must admit I find that comment hard to understand.
>>>>
>>>> Let's examine the two code fragments. Here's current code for first 
>>>> chunk:
>>>>
>>>>  919     Atomic::release_store(&_owner, (void*)NULL);   // drop the 
>>>> lock
>>>>  920     OrderAccess::storeload();                      // See if we 
>>>> need to wake a successor
>>>>  921     if ((intptr_t(_EntryList)|intptr_t(_cxq)) == 0 || _succ != 
>>>> NULL) {
>>>>  922       return;
>>>>  923     }
>>>>
>>>> Back in 2005 we had some additional commentary that explained this:
>>>>
>>>>  // Observe the Dekker/Lamport duality:
>>>> 2801       // A thread in ::exit() executes:
>>>> 2802       //   ST Owner=null; MEMBAR; LD EntryList|cxq.
>>>> 2803       // A thread in the contended ::enter() path executes the 
>>>> complementary:
>>>> 2804       //   ST EntryList|cxq = nonnull; MEMBAR; LD Owner.
>>>>
>>>> The MEMBAR referenced at L2802 is the storeload we see above in 
>>>> current code. The enter code is covered by this comment (in part) in 
>>>> current code:
>>>>
>>>>  527   // Note the Dekker/Lamport duality: ST cxq; MEMBAR; LD Owner.
>>>>  528   // In this case the ST-MEMBAR is accomplished with CAS().
>>>>
>>>> So the storeload is all that is required here.
>>>
>>> When David and I discussed this in the 8153224 thread, we concluded that
>>> the storeload was all that we needed and we chose not to change it to a
>>> fence(). That discussion motivated the more fleshed out comment here
>>> in the proposed change:
>>>
>>>   915     // Uses a storeload to separate release_store(owner) from the
>>>   916     // successor check. The try_set_owner() below uses 
>>> cmpxchg() so
>>>   917     // we get the fence down there.
>>>   918     release_clear_owner(Self);
>>>   919     OrderAccess::storeload();
>>>   920
>>>   921     if ((intptr_t(_EntryList)|intptr_t(_cxq)) == 0 || _succ != 
>>> NULL) {
>>>   922       return;
>>>   923     }
>>
>> After re-reading various versions of the code and comments from 
>> 2005-2007 I can't help but feel we've lost some of the big picture 
>> here when it comes to explaining the memory barriers:
>> - the release_store of _owner is necessary to ensure anyone seeing 
>> _owner as NULL is guaranteed to see the previous "meta-data" updates
>> - the storeload() is needed as part of the Dekker-duality used when 
>> accessing _owner and _succ
>> - the fence() is needed for ... well that depends on the exact 
>> location of the fence.
> 
> Ummm... it's not clear here whether you want me to change the comments
> or not. I'm kind of hoping that you're happy with the latest version
> (CR2) that I sent out for review this afternoon (my TZ)...
> 
> Dan
> 
> 
>>
>> Cheers,
>> David
>> -----
>>
>>>
>>>> In ExitEpilog we currently have:
>>>>
>>>> 1094   // Drop the lock
>>>> 1095   Atomic::release_store(&_owner, (void*)NULL);
>>>> 1096   OrderAccess::fence();                               // ST 
>>>> _owner vs LD in unpark()
>>>> 1097
>>>> 1098   DTRACE_MONITOR_PROBE(contended__exit, this, object(), Self);
>>>> 1099   Trigger->unpark();
>>>>
>>>> Back in 2005 this was:
>>>>
>>>> 2670    _owner = NULL ;
>>>> 2671    OrderAccess::fence() ;
>>>> 2672
>>>> <unrelated comments elided>
>>>> 2682    if (SafepointSynchronize::do_call_back()) {
>>>> 2683       TEVENT (unpark before SAFEPOINT) ;
>>>> 2684    }
>>>> 2685
>>>> <unrelated comments elided>
>>>> 2699    Trigger->unpark() ;
>>>>
>>>> No explanation for the fence() - though to me this is needed to 
>>>> ensure visibility of the store to _owner, as well as ensuring 
>>>> ordering with the unpark code. The comment was added (again by Dice) 
>>>> in Feb 2007. So the question is:
>>>> - is the fence() stronger than what we need, or is the comment 
>>>> incomplete?
>>>> I tend to favor the latter, given the fence has always been there.
>>>>
>>>> So I remain in favor of isolating the trailing memory-barrier from 
>>>> the release_store of _owner.
>>>
>>> When David and I discussed this in the 8153224 thread, we concluded that
>>> this location needed the full fence() so that motivated the only 
>>> slightly
>>> fleshed out comment here:
>>>
>>> 1095   // Uses a fence to separate release_store(owner) from the LD 
>>> in unpark().
>>> 1096   release_clear_owner(Self);
>>> 1097   OrderAccess::fence();
>>>
>>>
>>> In summary, I'm planning to leave these two pieces of code as they are
>>> in this version of the fix.
>>>
>>> Kim, please confirm that you are okay with this or not.
>>>
>>> David, thanks again for saving me the spelunking work.
>>>
>>> Dan
>>>
>>>
>>>>
>>>> Cheers,
>>>> David
>>>> -----
>>>>
>>>>
>>>>> So I still suggest just using release_store_fence, possibly with a
>>>>> function name adjustment.
>>>>>
>>>>> ------------------------------------------------------------------------------ 
>>>>>
>>>>> src/hotspot/share/runtime/objectMonitor.inline.hpp
>>>>>   102   void* prev = _owner;
>>>>>   105   _owner = new_value;
>>>>>   114   void* prev = _owner;
>>>>>   119   _owner = self;
>>>>>
>>>>> Consider using Atomic::load and Atomic::store for these, making the
>>>>> intent of a relaxed atomic operation explicit.  Though that might be
>>>>> seen as inconsistent with various other places where _owner is being
>>>>> directly read or written.
>>>>>
>>>>> ------------------------------------------------------------------------------ 
>>>>>
>>>>> src/hotspot/share/runtime/objectMonitor.inline.hpp
>>>>>    90   void* prev = _owner;
>>>>>
>>>>> Consider making this DEBUG_ONLY, and changing the reference to prev in
>>>>> the later log_trace use old_value instead of prev.  This would allow
>>>>> the load of _owner to be eliminated in a release build; that isn't
>>>>> permitted as written, because it's volatile.
>>>>>
>>>>> (The old_value seems to be (nearly?) always either a value we already
>>>>> have for other reasons, or NULL, so the only additional cost we're
>>>>> paying for it is register pressure to keep it around until the 
>>>>> possible
>>>>> tracing use.)
>>>>>
>>>>> Similarly in the other nearby functions.
>>>>>
>>>>> ------------------------------------------------------------------------------ 
>>>>>
>>>>> src/hotspot/share/runtime/objectMonitor.cpp
>>>>> In ObjectMonitor::exit.
>>>>>   864   if (THREAD != _owner) {
>>>>>   865     void* cur = _owner;
>>>>>
>>>>> The value of _owner is being captured in a variable to avoid multiple
>>>>> reads in the code below.  I don't see any reason not to exchange these
>>>>> two lines and change the compare to use cur instead of _owner.
>>>>> (_owner being volatile prevents the compiler from automatically
>>>>> coalescing the loads.)
>>>>>
>>>>> Similarly in complete_exit, around lin 1124.
>>>>> Similarly in check_owner, around line 1175.
>>>>>
>>>>> ------------------------------------------------------------------------------ 
>>>>>
>>>>>
>>>>>
>>>
>