RFR (S) cleanup misc issues prior to Contended Locking reorder and cache line bucket (8047104)

Daniel D. Daugherty daniel.daugherty at oracle.com
Wed Jul 2 22:15:04 UTC 2014


Thanks for the quick re-review!

Dan


On 7/2/14 3:46 PM, serguei.spitsyn at oracle.com wrote:
> This looks good to me.
>
> Thanks,
> Serguei
>
> On 7/2/14 1:04 PM, Daniel D. Daugherty wrote:
>> On 7/2/14 10:15 AM, Daniel D. Daugherty wrote:
>>> Closing the loop on the remaining open issues from the first
>>> round of code review...
>>>
>>> Hopefully, I'll have an updated webrev ready for the second round of
>>> code review later today.
>>
>> OK, here's the webrev URL for the second round:
>>
>> http://cr.openjdk.java.net/~dcubed/8047104-webrev/1-jdk9-hs-rt/
>>
>> The files changed during this round:
>>
>>     src/share/vm/runtime/objectMonitor.cpp
>>     src/share/vm/runtime/objectMonitor.hpp
>>     src/share/vm/runtime/synchronizer.cpp
>>     src/share/vm/runtime/synchronizer.hpp
>>     src/share/vm/runtime/thread.cpp
>>
>>
>> Resolutions are embedded below so I could make sure I made
>> all the changes...
>>
>> Dan
>>
>>
>>>
>>>
>>> On 6/20/14 9:23 AM, Daniel D. Daugherty wrote:
>>>>
>>>>> A few nits and queries and one more significant issue, which 
>>>>> happens to be first:
>>>>>
>>>>> src/share/vm/runtime/atomic.hpp
>>>>>
>>>>> + // add* provide full bidirectional fence, i.e. storeload/read-modify-write/storeload
>>>>>
>>>>> That's not a "full bi-directional fence". A fence is: 
>>>>> loadLoad|loadStore|storeLoad|storeStore - as per orderAccess.hpp. 
>>>>> The cmpxchg wording:
>>>>>
>>>>> "Guarantees a two-way memory barrier across the cmpxchg. I.e., 
>>>>> it's really a 'fence_cmpxchg_acquire'."
>>>>>
>>>>> was accurate in terms of required behaviour for cmpxchg. Whether 
>>>>> the fence and acquire needed to be explicit depends on the platform.
>>>>>
>>>>> It is also not clear to me that other than cmpxchg* those atomic 
>>>>> ops need "full bi-directional fences", or whether the current 
>>>>> implementations actually provide them.
>>>>
>>>> This item will take some research to resolve so I'll have to reply
>>>> again on this one later.
>>>
>>> We've been trying to reach closure on the new wording for atomic.hpp
>>> in off-thread e-mails for almost two weeks. At this point, I'm dropping
>>> the atomic.hpp changes from this bug fix (8047104). Changes to the
>>> wording in atomic.hpp may happen in a future Contended Locking bucket,
>>> but for now it's not happening.
>>
>> Reverted src/share/vm/runtime/atomic.hpp.
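>>
>> For anyone following the barrier discussion, here is a rough analogy in
>> standard C++ (illustrative only; this is not the HotSpot Atomic/OrderAccess
>> API and is not part of the webrev): the "two-way memory barrier across the
>> cmpxchg" that the old wording described corresponds roughly to a
>> sequentially consistent read-modify-write, which no load or store may be
>> reordered across in either direction.
>>
>>     #include <atomic>
>>
>>     // Roughly models the 'fence_cmpxchg_acquire' behavior: full ordering
>>     // on both sides of the compare-and-exchange.
>>     bool full_barrier_cmpxchg(std::atomic<intptr_t>& word,
>>                               intptr_t expected, intptr_t desired) {
>>       return word.compare_exchange_strong(expected, desired,
>>                                           std::memory_order_seq_cst);
>>     }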
>>
>>
>>>
>>>>
>>>>> src/share/vm/runtime/synchronizer.cpp
>>>>>
>>>>> 579 // Handle for oop obj in case of STW safepoint
>>>>>
>>>>> STW is redundant - all safepoints are STW.
>>>>
>>>> Yes, currently, but folks keep talking about adding per-thread
>>>> safepoints. I see that 'STW <this-and-that>' seems to be a common
>>>> comment in the monitor subsystem. I'll chase down what Dave D is
>>>> really trying to communicate here and see what I can do to clarify
>>>> the comments.
>>>
>>> After off-thread discussions with both Dave D and David D, we're
>>> going to leave the uses of 'STW' in the monitor code alone.
>>>
>>>
>>>>
>>>>> 896 // Too slow for general assert or debug
>>>>> 897 #if 0
>>>>>
>>>>> I thought the trend these days was to delete unused code?
>>>>
>>>> Yes, I've been observing that trend and I find it "interesting".
>>>> This particular code is useful when debugging experimental
>>>> optimizations in the monitor subsystem, but there might be a
>>>> better way to keep the code around. I'll investigate.
>>>
>>> Will add a Knob_VerifyInUse flag in the same way as the Knob_Verbose
>>> flag, remove the '#if 0 ... #endif' so that
>>> ObjectSynchronizer::verifyInUse() is compiled, and uncomment the calls
>>> to verifyInUse(), making them conditional on Knob_VerifyInUse.
>>
>> Done.
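>>
>> For reference, the gating follows the existing Knob_Verbose pattern; a
>> minimal sketch of the idea (names and exact call sites per the webrev,
>> not reproduced here):
>>
>>     static int Knob_VerifyInUse = 0;  // off by default, like Knob_Verbose
>>
>>     // the previously '#if 0'-ed verification calls become, e.g.:
>>     if (Knob_VerifyInUse) {
>>       ObjectSynchronizer::verifyInUse(Self);  // Self = current thread
>>     }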
>>
>>
>>>
>>>>
>>>>> src/share/vm/runtime/thread.cpp
>>>>>
>>>>> 4402 // That is, all the LDs and STs within the critical section must
>>>>> 4403 // be globally visible before the ST that releases the lock becomes visible.
>>>>>
>>>>> Nit: LDs don't become visible.
>>>>
>>>> I wonder if Dave D was trying to say that the "LDs" need to see
>>>> the changes made by the "ST". I'll get clarification.
>>>
>>> Dave D has proposed the following rewrite:
>>>
>>> > Loads and stores in the critical section — which appear in program
>>> > order before the store that releases the lock — must also appear
>>> > before the store that releases the lock in memory visibility order.
>>>
>>> David H and I are good with the above.
>>
>> Done.
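>>
>> To make the required ordering concrete, a small sketch in standard C++
>> (an illustrative analogy, not the code in thread.cpp): the store that
>> releases the lock must be ordered after the critical section's loads and
>> stores, so a thread that subsequently acquires the lock observes them.
>>
>>     #include <atomic>
>>
>>     std::atomic<int> lock_word{1};   // 1 = held, 0 = free
>>     int shared_data = 0;
>>
>>     void owner_exits() {
>>       shared_data = 42;                               // store in critical section
>>       lock_word.store(0, std::memory_order_release);  // the ST that releases the lock
>>     }
>>
>>     void next_owner_enters() {
>>       while (lock_word.exchange(1, std::memory_order_acquire) != 0) { }
>>       // shared_data == 42 is guaranteed to be visible here
>>     }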
>>
>>
>>>
>>>>
>>>>> 4404 // Conceptually we need a #storeload|#storestore "release" MEMBAR before
>>>>> 4405 // the ST of 0 into the lock-word which releases the lock, so fence
>>>>> 4406 // more than covers this on all platforms.
>>>>>
>>>>> I think that should be loadStore|storeStore. In which case the 
>>>>> fence() is completely overkill as we only need a 
>>>>> release_store(addr, 0). ??
>>>>
>>>> This item will also take some research to resolve so I'll have to
>>>> reply again on this one later.
>>>
>>> Two resolutions here. First, fix this typo:
>>>
>>> 4404 // Conceptually we need a #storeload|#storestore "release" MEMBAR before
>>>
>>> by changing it to:
>>>
>>> 4404 // Conceptually we need a #loadstore|#storestore "release" MEMBAR before
>>
>> Done.
>>
>>
>>> Second, consider changing the following code:
>>>
>>> 4397 OrderAccess::fence(); // guarantee at least release consistency.
>>> 4407 *adr = 0;
>>>
>>> to this code in the "adaptive spin" bucket:
>>>
>>> 4407 OrderAccess::release_store(adr, 0); // guarantee at least release consistency.
>>>
>>> where the change can undergo more extensive testing than in this
>>> cleanup bucket.
>>>
>>> Dan 
>


