RFR(L) 8153224 Monitor deflation prolong safepoints (CR7/v2.07/10-for-jdk14)
Daniel D. Daugherty
daniel.daugherty at oracle.com
Mon Nov 4 21:25:20 UTC 2019
Hi David and Erik,
Thanks for chiming in here Erik...
This set of comments is not addressed in the CR8/v2.08/11-for-jdk14
code review request that I just sent out.
I've read this response twice and I'm not quite sure what to do with it
relative to David's CR comment. I'll repeat those here:
> 199 // The decrement only needs to be MO_ACQ_REL since the reference
> 200 // counter is volatile.
> 201 Atomic::dec(&_ref_count);
>
> volatile is irrelevant with regards to memory ordering as it is a
compiler
> annotation. And you haven't specified any memory order value so the
default
> is conservative ie. implied full fence. (I see the same incorrect comment
> is in threadSMR.cpp!)
Should I delete this comment? Or should it be changed? If changed, then
what text do you recommend here?
> 208 // The increment needs to be MO_SEQ_CST so that the reference
> 209 // counter update is seen as soon as possible in a race with the
> 210 // async deflation protocol.
> 211 Atomic::inc(&_ref_count);
>
> Ditto you haven't specified any ordering - and inc() and dec() will
have the same default.
Should I delete this comment? Or should it be changed? If changed, then
what text do you recommend here?
Dan
On 11/4/19 8:09 AM, erik.osterlund at oracle.com wrote:
> Hi,
>
> TL/DR: David is right; the commentary is weird and does not capture
> what the real constraints are.
>
> As the comment implied before "8222034: Thread-SMR functions should be
> updated to remove work around", the PPC port used to have incorrect
> memory ordering, and the code guarded against that. inc/dec used to be
> memory_order_relaxed and add/sub used to be memory_order_acq_rel on
> PPC, despite the shared contract promising memory_order_conservative.
>
> The implication for the nested counter in the Thread SMR project was
> that I wanted to use the inc/dec API but knew it was not gonna work as
> expected on PPC because we really needed *at least*
> memory_order_acq_rel when decrementing (and memory_order_conservative
> when incrementing, which was simulated in a CAS loop... yuck), but
> would find ourselves getting memory_order_relaxed. Rather than
> treating it as a bug in the PPC atomics implementation, and having the
> code be broken while we waited for a fix, I changed the use to sub
> when decrementing (which gave me the required memory_order_acq_rel
> ordering I needed), and the horrible CAS loop when incrementing, as a
> workaround, and alerted Martin Doerr that this would needed to be
> sorted out in the PPC code. Since then, the PPC code did indeed get
> cleaned up so that inc/dec stopped being relaxed-only and worked as
> advertised.
>
> After that, the "8222034: Thread-SMR functions should be updated to
> remove work around" change removed the workaround that was no longer
> required from the code, and put back the desired inc/dec calls (which
> now used an overly conservative memory_order_conservative ordering,
> which is suboptimal, in particular for decrements, but importantly not
> incorrect). Since the nested case would almost never run and is
> possibly the coldest code path in the VM, I did not care to comment in
> that review thread about optimizing it by explicitly passing in a
> weaker ordering. However, I should have commented on the comment that
> was changed, which does indeed look a bit confused. David is right
> that the stuff about volatile has nothing to do with why this is
> correct. The correctness required memory_order_acq_rel for decrements,
> but the implementation provided more, which is fine.
>
> The actual reason why I wanted memory_order_conservative for
> correctness when incrementing and memory_order_acq_rel when
> decrementing, was to prevent accesses inside of the critical section
> (in particular - reading Thread*s from the acquired ThreadsList), from
> floating outside of the reference increment and decrement that marks
> reading the list as safe to access without the underlying list blowing
> up. In practice, it might have been possible to relax it a bit by
> relying on side effects of other unrelated parts of the protocol to
> have spurious fencing... but I did not want to get the protocol
> tangled in that way because it would be difficult to reason about.
>
> Hope this explanation clears up that confusion.
>
> Thanks,
> /Erik
>
> On 11/2/19 2:15 PM, Daniel D. Daugherty wrote:
>> Erik,
>>
>> David H. made a comment during this review cycle that should interest
>> you.
>>
>> The longer version of this comment came up in early reviews of the Async
>> Monitor Deflation code because I copied the code and the longer comment
>> from threadSMR.cpp. I updated the comment based on your input and review
>> and changed the comment and code in threadSMR.cpp and in the Async
>> Monitor
>> Deflation project code.
>>
>> The change in threadSMR.cpp was done with this changeset:
>>
>> $ hg log -v -r 54517
>> changeset: 54517:c201ca660afd
>> user: dcubed
>> date: Thu Apr 11 14:14:30 2019 -0400
>> files: src/hotspot/share/runtime/threadSMR.cpp
>> description:
>> 8222034: Thread-SMR functions should be updated to remove work around
>> Reviewed-by: mdoerr, eosterlund
>>
>> Here's one of the two diffs to job your memory:
>>
>> void ThreadsList::dec_nested_handle_cnt() {
>> - // The decrement needs to be MO_ACQ_REL. At the moment, the
>> Atomic::dec
>> - // backend on PPC does not yet conform to these requirements.
>> Therefore
>> - // the decrement is simulated with an Atomic::sub(1, &addr).
>> - // Without this MO_ACQ_REL Atomic::dec simulation, the nested SMR
>> mechanism
>> - // is not generally safe to use.
>> - Atomic::sub(1, &_nested_handle_cnt);
>> + // The decrement only needs to be MO_ACQ_REL since the reference
>> + // counter is volatile (and the hazard ptr is already NULL).
>> + Atomic::dec(&_nested_handle_cnt);
>> }
>>
>> Below is David's comment about the code comment...
>>
>> Dan
>>
>>
>> Trimming down to just that issue...
>>
>> On 10/29/19 4:20 PM, Daniel D. Daugherty wrote:
>>> On 10/24/19 7:00 AM, David Holmes wrote:
>> >
>> > src/hotspot/share/runtime/objectMonitor.inline.hpp
>>>
>>>> 199 // The decrement only needs to be MO_ACQ_REL since the reference
>>>> 200 // counter is volatile.
>>>> 201 Atomic::dec(&_ref_count);
>>>>
>>>> volatile is irrelevant with regards to memory ordering as it is a
>>>> compiler annotation. And you haven't specified any memory order
>>>> value so the default is conservative ie. implied full fence. (I see
>>>> the same incorrect comment is in threadSMR.cpp!)
>>>
>>> I got that wording from threadSMR.cpp and Erik O. confirmed my use
>>> of that
>>> wording previously. I'll chase it down with Erik and get back to you.
>>>
>>>
>>>> 208 // The increment needs to be MO_SEQ_CST so that the reference
>>>> 209 // counter update is seen as soon as possible in a race with
>>>> the
>>>> 210 // async deflation protocol.
>>>> 211 Atomic::inc(&_ref_count);
>>>>
>>>> Ditto you haven't specified any ordering - and inc() and dec() will
>>>> have the same default.
>>>
>>> And again, I'll have to chase this down with Erik O. and get back to
>>> you.
>>
>
More information about the hotspot-runtime-dev
mailing list