RFR(S): 8236035: refactor ObjectMonitor::set_owner() and _owner field setting

Thu Dec 19 23:02:51 UTC 2019

Hi Dan,

Couple of follow ups :)

On 20/12/2019 3:15 am, Daniel D. Daugherty wrote:
> Hi David,
> 
> Thanks for the quick review! Sorry for my delay in responding. I've been
> working on the Async Monitor Deflation wiki updates...
> 
> 
> On 12/18/19 12:35 AM, David Holmes wrote:
>> Hi Dan,
>>
>> On 18/12/2019 7:35 am, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I'm extracting another standalone fix from the Async Monitor Deflation
>>> project (JDK-8153224) and sending it out for review (and testing)
>>> separately.
>>>
>>>      JDK-8236035 refactor ObjectMonitor::set_owner() and _owner field 
>>> setting
>>>      https://bugs.openjdk.java.net/browse/JDK-8236035
>>>
>>> Here's the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8236035-webrev/0-for-jdk15/
>>>
>>> Folks that have reviewed JDK-8153224 will recognize these changes as
>>> a subset of the _owner field changes from the Async Monitor Deflation
>>> project. It's a subset because the Async Monitor Deflation project
>>> needs one additional simply_set_owner_from() variant.
>>>
>>> In this set of changes:
>>>
>>> - We replace the one use of set_owner() and the direct _owner field
>>>    updates with four distinct functions:
>>>
>>>   235   // Clear _owner field; current value must match old_value.
>>>   236   void      release_clear_owner_with_barrier(void* old_value, 
>>> bool needs_fence);
>>>   237   // Simply set _owner field to new_value; current value must 
>>> match old_value.
>>>   238   void      simply_set_owner_from(void* old_value, void* 
>>> new_value);
>>>   239   // Simply set _owner field to self; current value must match 
>>> basic_lock_p.
>>>   240   void      simply_set_owner_from_BasicLock(void* basic_lock_p, 
>>> Thread* self);
>>>   241   // Try to set _owner field to new_value if the current value 
>>> matches
>>>   242   // old_value. Otherwise, does not change the _owner field.
>>>   243   void*     try_set_owner_from(void* old_value, void* new_value);
>>>
>>> - Each function has an assert() to verify the pre-condition and has new
>>>    log_trace(monitorinflation, owner) calls for logging ownership 
>>> changes.
>>> - The gory details about necessary memory syncing or why memory syncing
>>>    is not needed is now in one place rather than at each call site.
>>>
>>> These changes are being tested in a Mach5 Tier[1-3] run that's still
>>> running (JDK14 ATR has priority). I'm also running these changes thru
>>> a set of 10 SPECjbb2015 runs.
>>>
>>> Thanks, in advance, for comments, questions or suggestions.
>>
>> This is obviously functionally fine so the only issues are 
>> stylistic/naming ones. Apologies if what I'm about to say contradicts 
>> anything I've said previously. :)
>>
>> 1. I think the simply_ prefix is unnecessary.
> 
> I'll remove the "simply_" prefix.
> 
> 
>> 2. I think release_clear_owner_with_barrier is too complex/awkward. It 
>> would be better IMO to just have release_clear_owner() and leave the 
>> correct trailing barrier in the caller.
> 
> I made that change in the 8153224 review based on this comment from you
> (on 2019.11.06):
> 
>> > I think it needs to be something like this:
>> >
>> >    124 // Clear _owner field; current value must match old_value.
>> >    125 inline void ObjectMonitor::clear_owner_from(void* old_value) {
>> >    126   void* prev = _owner;
>> >    127   ADIM_guarantee(prev == old_value, "unexpected prev owner=" 
>> INTPTR_FORMAT
>> >    128                  ", expected=" INTPTR_FORMAT, p2i(prev), 
>> p2i(old_value));
>> >    129   OrderAccess::release_store(&_owner, (void*)NULL);
>> >    130   OrderAccess::fence();
>> >    131   log_trace(monitorinflation, owner)("clear_owner_from(): 
>> mid=" INTPTR_FORMAT
>> >    132                                      ", prev=" INTPTR_FORMAT, 
>> p2i(this), p2i(prev));
>> >    133 }
>>
>> When factored out like this you are forced to use the heaviest memory 
>> barrier needed by a given callsite and then apply it to all - ie using 
>> the fence() always when it might not be needed.
>>
>> I'm really not a fan of this kind of factoring out in complex 
>> lock-free code as it hides the details of the memory sync operations. 
>> You could make this more obvious in the naming e.g.
>>
>> inline void ObjectMonitor::release_clear_owner_and_fence(void* 
>> expected_owner)
>>
>> or if the fence is not always needed you could parameterise the final 
>> memory barrier as well. 
> 
> But as you said in this set of review comments:
> 
>  > Apologies if what I'm about to say contradicts anything I've said 
> previously. :)
> 
> For now I'm planning to go your original 2019.11.06 suggestion and
> keep the parameterized memory barrier. Let's see what other reviewers
> have to say.

Now that I see the barrier type parameterised I don't like it :)

> 
>> 3. try_ operations are typically boolean operations. In this code one 
>> of the try_set_owner_from calls appears to need the returned Thread*, 
>> but on closer inspection this seems not to be the case:
>>
>> 1684       ox = (Thread*)try_set_owner_from(NULL, Self);
>> 1685       if (ox == NULL) {
>> ... // implied else
>> 1707       // The CAS failed ... we can take any of the following 
>> actions:
>> ...
>> 1712       prv = ox;
>> 1713       goto Abort;
>>
>> So we set prv from ox but then goto Abort which never references prv, 
>> so AFAICS we don't need to set it, so all we need from 
>> try_set_owner_from is true or false.
> 
> In the above location I changed this original code:
> 
>      old L1683:       ox = (Thread*)Atomic::cmpxchg(&_owner, 
> (void*)NULL, Self);
> 
> to:
> 
>      new L1684:       ox = (Thread*)try_set_owner_from(NULL, Self);
> 
> which perfectly illustrates that I was trying for try_set_owner_from()
> to be a direct replacement for this kind of Atomic::cmpxchg() call. And
> like cmpxchg(), try_set_owner_from() returns current value in the memory
> location at the time of the compare. If try_set_owner_from() returns the
> compare value (NULL in this case), then you know that the owner field was
> changed from NULL -> Self. If try_set_owner_from() returns anything other
> than the compare value, then you know that the owner field was not changed.
> 
> You are quite right that 'ox' on that specific code path is not needed
> beyond the NULL check because that code path will 'goto Abort' where
> neither 'ox' or 'prv' are needed. The assignment 'prv = ox' is definitely
> not needed (but that's a baseline bug).
> 
> However, you missed the fact that the non-boolean return from
> try_set_owner_from() is needed in ObjectMonitor::enter():

Yes I missed that - sorry.

>   L248:   void* cur = try_set_owner_from(NULL, Self);
>   L249:   if (cur == NULL) {
>   L250:     assert(_recursions == 0, "invariant");
>   L251:     return;
>   L252:   }
>   L253:
>   L254:   if (cur == Self) {
>   L255:     // TODO-FIXME: check for integer overflow!  BUGID 6557169.
>   L256:     _recursions++;
>   L257:     return;
>   L258:   }
>   L259:
>   L260:   if (Self->is_lock_owned((address)cur)) {
>   L261:     assert(_recursions == 0, "internal state error");
>   L262:     _recursions = 1;
>   L263:     simply_set_owner_from_BasicLock(cur, Self);  // Convert from 
> BasicLock*      to Thread*.
>   L264:     return;
>   L265:   }
> 
> That return value is checked for different possible values on L249, L254,
> L260 and passed as a param on L263. A boolean won't do the trick in this
> code. That's the only location in the baseline where we need more than
> a boolean would give us. However, in the Async Monitor Deflation project
> there are more places that use the return from try_set_owner_from() in
> more than one way so a boolean won't do the trick.
> 
> What I was trying for with the name try_set_owner_from() was to capture
> that we are _attempting_ to set the owner field unlike set_owner_from()
> where we _always_ set the owner field.
> 
> I suppose we could look at different names? Would cmp_set_owner_from()
> or cmpxchg_set_owner_from() be better?

The try_ is not terrible or wrong, it's just more common for it to be a 
boolean function. No big deal. I can't think of a better name.

> 
>> 4. In my own Java objectMonitor work I call 
>> simply_set_owner_from_BasicLock "transition_from_basic_lock". It makes 
>> it clearer (to me at least) exactly what purpose this serves and 
>> avoids the need for comments to that effect.
> 
> As mentioned above, I'm going to remove the "simply_" prefix so we have
> 
>      set_owner_from_BasicLock  <=> transition_from_basic_lock
> 
> I could go with either name. I did use "BasicLock" instead of "basic_lock"
> because the type in the system is BasicLock so I wanted the function to
> show up in a grep.
> 
> Let's see what other reviewers think...
> 
> 
>> One remaining concern with this refactoring and the associated logging 
>> is the potential impact it may have on performance due to inlining 
>> changes and the need to check if logging is enabled. '
> 
> I've been doing Aurora SPECjbb2015 runs (tuned G1, 10 iterations) between
> my baseline (jdk-14+26) and the various fixes. For this fix (8236035)
> relative to my first baseline run:
> 
>      criticalJOPS   +1.73%  (Non-significant)
>      maxJOPS        +0.26%  (Non-significant)
> 
> I reran the baseline a second time using the same bits and for this fix
> (8236035) relative to the second run:
> 
>      criticalJOPS   +8.98%  (Non-significant)
>      maxJOPS       +10.00%  (Non-significant)
> 
> When I compare the two baseline runs:
> 
>      criticalJOPS   -6.65%  (Non-significant)
>      maxJOPS        -8.86%  (Non-significant)
> 
> So even with 10 iterations, these SPECjbb2015 runs are all over the
> place. And Aurora is agreeing with that by flagging the results as
> "Non-significant" because the 'p' values are too large.
> 
> I'm doing another set of Aurora SPECjbb2015 runs and I'm bumping the
> iterations from 10 to 15. In the recent past, Per told me I might
> have to do 20 runs of SPECjbb2015 to shake out the variations.
> We'll see what turns up with this latest set of runs...

Yes those results seem somewhat all over the place. Maybe see if Claes' 
startup benchmark runs are impacted at all.

> 
>> The above comments carry no more or less weight than anyone else's so 
>> I'll leave it to you and other reviewers as to whether anything 
>> actually changes. :)
> 
> Thanks! I appreciate the flexibility.

Christmas spirit :D

I'll be back to my intransigent self in the New Year ;-)

Cheers,
David

> Dan
> 
> 
>>
>> Thanks,
>> David
>> -----
>>
>>> Dan
>