RFR (S) cleanup misc issues prior to Contended Locking reorder and cache line bucket (8047104)
serguei.spitsyn at oracle.com
Wed Jul 2 21:46:42 UTC 2014
This looks good to me.
Thanks,
Serguei
On 7/2/14 1:04 PM, Daniel D. Daugherty wrote:
> On 7/2/14 10:15 AM, Daniel D. Daugherty wrote:
>> Closing the loop on the remaining open issues from the first
>> round of code review...
>>
>> Hopefully, I'll have an updated webrev ready for the second round of
>> code review later today.
>
> OK, here's the webrev URL for the second round:
>
> http://cr.openjdk.java.net/~dcubed/8047104-webrev/1-jdk9-hs-rt/
>
> The files changed during this round:
>
> src/share/vm/runtime/objectMonitor.cpp
> src/share/vm/runtime/objectMonitor.hpp
> src/share/vm/runtime/synchronizer.cpp
> src/share/vm/runtime/synchronizer.hpp
> src/share/vm/runtime/thread.cpp
>
>
> Resolutions are embedded below so I could make sure I made
> all the changes...
>
> Dan
>
>
>>
>>
>> On 6/20/14 9:23 AM, Daniel D. Daugherty wrote:
>>>
>>>> A few nits and queries and one more significant issue, which
>>>> happens to be first:
>>>>
>>>> src/share/vm/runtime/atomic.hpp
>>>>
>>>> + // add* provide full bidirectional fence, i.e. storeload/read-modify-write/storeload
>>>>
>>>> That's not a "full bi-directional fence". A fence is:
>>>> loadLoad|loadStore|storeLoad|storeStore - as per orderAccess.hpp.
>>>> The cmpxchg wording:
>>>>
>>>> "Guarantees a two-way memory barrier across the cmpxchg. I.e., it's
>>>> really a 'fence_cmpxchg_acquire'."
>>>>
>>>> was accurate in terms of required behaviour for cmpxchg. Whether
>>>> the fence and acquire needed to be explicit depends on the platform.
>>>>
>>>> It is also not clear to me that atomic ops other than cmpxchg*
>>>> need "full bi-directional fences", or whether the current
>>>> implementations actually provide them.
>>>
>>> This item will take some research to resolve so I'll have to reply
>>> again on this one later.
>>
>> We've been trying to reach closure on the new wording for atomic.hpp
>> in off-thread e-mails for almost two weeks. At this point, I'm dropping
>> the atomic.hpp changes from this bug fix (8047104). Changes to the
>> wording in atomic.hpp may happen in a future Contended Locking bucket,
>> but for now it's not happening.
>
> Reverted src/share/vm/runtime/atomic.hpp.
>
>
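(An aside for archive readers: a minimal sketch, using the atomic.hpp
and orderAccess.hpp primitives, of what the original
"fence_cmpxchg_acquire" wording promises. The function name is mine,
not HotSpot's; whether the fence and acquire must be explicit depends
on the platform's cmpxchg implementation.)

    #include "runtime/atomic.hpp"
    #include "runtime/orderAccess.hpp"

    // Conceptual only: a two-way memory barrier across the cmpxchg.
    jint fence_cmpxchg_acquire_sketch(jint exchange_value,
                                      volatile jint* dest,
                                      jint compare_value) {
      OrderAccess::fence();    // full barrier before the read-modify-write
      jint prior = Atomic::cmpxchg(exchange_value, dest, compare_value);
      OrderAccess::acquire();  // acquire barrier after it
      return prior;
    }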
>>
>>>
>>>> src/share/vm/runtime/synchronizer.cpp
>>>>
>>>> 579 // Handle for oop obj in case of STW safepoint
>>>>
>>>> STW is redundant - all safepoints are STW.
>>>
>>> Yes, currently, but folks keep talking about adding per-thread
>>> safepoints. I see that 'STW <this-and-that>' seems to be a common
>>> comment in the monitor subsystem. I'll chase down what Dave D is
>>> really trying to communicate here and see what I can do to clarify
>>> the comments.
>>
>> After off-thread discussions with both Dave D and David D, we're
>> going to leave the uses of 'STW' in the monitor code alone.
>>
>>
>>>
>>>> 896 // Too slow for general assert or debug
>>>> 897 #if 0
>>>>
>>>> I thought the trend these days was to delete unused code?
>>>
>>> Yes, I've been observing that trend and I find it "interesting".
>>> This particular code is useful when debugging experimental
>>> optimizations in the monitor subsystem, but there might be a
>>> better way to keep the code around. I'll investigate.
>>
>> Will add a Knob_VerifyInUse flag in the same way as the existing
>> Knob_Verbose flag, remove the '#if 0 ... #endif' so that
>> ObjectSynchronizer::verifyInUse() compiles, and uncomment the calls
>> to verifyInUse(), making them conditional on Knob_VerifyInUse.
>
> Done.
>
>
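(A rough sketch of the pattern just described, assuming it mirrors the
existing Knob_Verbose convention; the knob name comes from Dan's note
above, but the call site below is illustrative, not the webrev itself.)

    // Illustrative only; the real knob sits with the other Knob_* statics.
    static int Knob_VerifyInUse = 0;  // off by default: verifyInUse() is
                                      // too slow for general assert/debug

    void example_call_site(Thread* self) {
      if (Knob_VerifyInUse) {
        ObjectSynchronizer::verifyInUse(self);  // no longer dead '#if 0' code
      }
    }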
>>
>>>
>>>> src/share/vm/runtime/thread.cpp
>>>>
>>>> 4402 // That is, all the LDs and STs within the critical section must
>>>> 4403 // be globally visible before the ST that releases the lock becomes visible.
>>>>
>>>> Nit: LDs don't become visible.
>>>
>>> I wonder if Dave D was trying to say that the "LDs" need to see
>>> the changes made by the "ST". I'll get clarification.
>>
>> Dave D has proposed the following rewrite:
>>
>> > Loads and stores in the critical section — which appear in program
>> > order before the store that releases the lock — must also appear
>> > before the store that releases the lock in memory visibility order.
>>
>> David H and I are good with the above.
>
> Done.
>
>
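(A toy illustration, not webrev code, of the agreed wording: if another
thread sees the lock-word become 0, it must also see every store the
holder made inside the critical section.)

    #include "runtime/orderAccess.hpp"

    jint data = 0;
    volatile jint lock_word = 1;  // toy lock-word: 1 == held, 0 == free

    void holder_exits() {
      data = 42;                                  // store in program order
                                                  // before the release
      OrderAccess::release_store(&lock_word, 0);  // data's store is visible
                                                  // no later than this store
    }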
>>
>>>
>>>> 4404 // Conceptually we need a #storeload|#storestore "release" MEMBAR before
>>>> 4405 // the ST of 0 into the lock-word which releases the lock, so fence
>>>> 4406 // more than covers this on all platforms.
>>>>
>>>> I think that should be loadStore|storeStore. In which case the
>>>> fence() is completely overkill as we only need a
>>>> release_store(addr, 0). ??
>>>
>>> This item will also take some research to resolve so I'll have to reply
>>> again on this one later.
>>
>> Two resolutions here. First, fix this typo:
>>
>> 4404 // Conceptually we need a #storeload|#storestore "release" MEMBAR before
>>
>> by changing it to:
>>
>> 4404 // Conceptually we need a #loadstore|#storestore "release" MEMBAR before
>
> Done.
>
>
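(Spelled out with explicit barriers, the corrected comment amounts to
the conceptual sketch below; real code folds both barriers into a
single release operation rather than issuing them separately.)

    void conceptual_release(volatile intptr_t* adr) {
      OrderAccess::loadstore();   // prior LDs ordered before the releasing ST
      OrderAccess::storestore();  // prior STs ordered before the releasing ST
      *adr = 0;                   // the ST of 0 that releases the lock
    }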
>> Second, consider changing the following code:
>>
>> 4397 OrderAccess::fence(); // guarantee at least release consistency.
>> 4407 *adr = 0;
>>
>> to this code in the "adaptive spin" bucket:
>>
>> 4407 OrderAccess::release_store(adr, 0); // guarantee at least release consistency.
>>
>> where the change can undergo more extensive testing than in this cleanup
>> bucket.
>>
>> Dan
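(For completeness, a hedged sketch of the two forms being compared;
'adr' is the intptr_t lock-word address, and release_store_ptr is the
pointer-sized overload in the orderAccess.hpp of this era.)

    // Current code: a full two-way fence, then a plain store.
    void mux_release_current(volatile intptr_t* adr) {
      OrderAccess::fence();  // loadload|loadstore|storeload|storestore
      *adr = 0;              // the ST of 0 releases the lock
    }

    // Proposed for the "adaptive spin" bucket: fold the needed
    // loadstore|storestore ordering into the releasing store itself.
    void mux_release_proposed(volatile intptr_t* adr) {
      OrderAccess::release_store_ptr(adr, 0);
    }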