RFR(S): 8200374: Add ThreadsSMRSupport::verify_hazard_pointer_scanned() to verify threads_do()

David Holmes david.holmes at oracle.com
Tue Apr 3 23:38:17 UTC 2018


Looks fine.

Thanks,
David

On 4/04/2018 7:52 AM, Daniel D. Daugherty wrote:
> The actual relative diff:
> 
> $ hg diff
> diff -r b1cf49ff8e1c src/hotspot/share/trace/tracingExport.cpp
> --- a/src/hotspot/share/trace/tracingExport.cpp    Tue Apr 03 16:57:57 
> 2018 -0400
> +++ b/src/hotspot/share/trace/tracingExport.cpp    Tue Apr 03 17:50:26 
> 2018 -0400
> @@ -36,5 +36,6 @@
>   void TracingExport::set_sampler_thread_with_lock(Thread* thread) {
>     // Grab Threads_lock to avoid conflicts with Thread-SMR scans.
>     MutexLocker ml(Threads_lock);
> -  _sampler_thread = thread;
> +  // To match with the lock-free load_acquire():
> +  OrderAccess::release_store(&_sampler_thread, thread);
>   }
> 
> 
> The annoying thing is that my plan was to switch from
> OrderAccess::release_store_fence() to OrderAccess::release_store()
> for this round of testing AND I talked myself into dropping the
> release_store() entirely...
> 
> Good thing Mach5 is rocking today!
> 
> Dan
> 
> 
> On 4/3/18 5:45 PM, Daniel D. Daugherty wrote:
>> On 4/3/18 5:39 PM, David Holmes wrote:
>>> On 4/04/2018 7:25 AM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> This fix has been revised due to feedback from David H. and Erik O.
>>>>
>>>> Here's the incremental webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/2_for_jdk_hs_open.inc/ 
>>>>
>>>>
>>>> And here's the full webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/2_for_jdk_hs_open.full/ 
>>>>
>>>>
>>>> Summary of the changes relative to 'cr1':
>>>>
>>>> - Rename TracingExport::get_sampler_thread() to 
>>>> sampler_thread_acquire()
>>>>    and TracingExport::set_sampler_thread() to 
>>>> set_sampler_thread_with_lock().
>>>> - set_sampler_thread_with_lock() uses Threads_lock to avoid 
>>>> conflicts with
>>>>    Thread-SMR scans.
>>>
>>> Sorry Dan but this is now mixing lock-free and locked in a "broken" way.
>>
>> Yup... Little voice in the back of my head was telling me
>> that... I didn't listen... I'll switch it to:
>>
>> OrderAccess::release_store(&_sampler_thread, thread);
>>
>> and retest... :-)
>>
>> Dan
>>
>>
>>
>>> A load_acquire is supposed to always be there to match a 
>>> release_store - but we don't have a release_store because we hold the 
>>> lock when setting the field. The lock is being used to serialize with 
>>> the ThreadSMR scans. If we need an acquire it is because we want to 
>>> ensure that we see stores prior to the setting of the sampler_thread 
>>> when we read the sampler_thread. If that is the case then we still 
>>> need the release_store because we may read the new sampler_thread 
>>> value as soon as it is set, but before the unlock occurs - otherwise 
>>> prior stores need not be visible. If it is not the case that we need 
>>> to see prior stores then we don't need the acquire.
>>>
>>> David
>>> -----
>>>
>>>> The Mach5 builds-tier1, jdk-tier[1-3], and hs-tier[1-3] run has 
>>>> finished
>>>> and there are no test failures related to these changes.
>>>>
>>>> My Solaris-X64 server is finishing the third round of Thread-SMR stress
>>>> testing on the 'cr1' version. I'll start a round of stress testing for
>>>> this version ('cr2') as soon as I can.
>>>>
>>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>>
>>>> Dan
>>>>
>>>> On 4/2/18 11:11 AM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> This fix has been revised due to additional testing and due to
>>>>> feedback from David H.
>>>>>
>>>>> Here's the incremental webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/1_for_jdk_hs_open.inc/ 
>>>>>
>>>>>
>>>>> And here's the full webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/1_for_jdk_hs_open.full/ 
>>>>>
>>>>>
>>>>> My usual Thread-SMR stress testing on my Solaris-X64 is still running;
>>>>> so far there has been only one unrelated intermittent test failure. 
>>>>> The
>>>>> Mach5 builds-tier1, jdk-tier[1-3], and hs-tier[1-3] run is still in
>>>>> process...
>>>>>
>>>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 3/28/18 9:52 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I have a (mostly) small enhancement for Thread-SMR:
>>>>>>
>>>>>> JDK-8200374 Add ThreadsSMRSupport::verify_hazard_pointer_scanned() 
>>>>>> to verify threads_do()
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8200374
>>>>>>
>>>>>> Here's the webrev URL:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/0_for_jdk_hs_open/
>>>>>>
>>>>>> This is Erik O's improvement on the assertion added by the following
>>>>>> bug fix (with some minor tweaking done by me):
>>>>>>
>>>>>> JDK-8199813 SIGSEGV in ThreadsList::includes()
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8199813
>>>>>>
>>>>>> Summary of the changes:
>>>>>>
>>>>>> - Replace the assertion that I added in JDK-8199813 with a closure
>>>>>>   based function that verifies the threads_do() contract depended
>>>>>>   on by Thread-SMR.
>>>>>> - Add ThreadsSMRSupport::verify_hazard_pointer_scanned() to verify
>>>>>>   that the calling thread's hazard pointer is scanned by 
>>>>>> threads_do().
>>>>>>   The new function is called from both ThreadsSMRSupport::acquire_*()
>>>>>>   functions.
>>>>>> - Refactor the non-JavaThread part of Threads::threads_do() into
>>>>>>   Threads::non_java_threads_do() so that the non-JavaThread part
>>>>>>   can also be called by other threads_do() functions. Yes, the
>>>>>>   Threads::threads_do() contract is still to scan every thread in
>>>>>>   the system.
>>>>>> - Add hooks for a "tracing sampler thread" to be optionally scanned
>>>>>>   by Threads::non_java_threads_do().
>>>>>>
>>>>>> This fix has gone thru a couple of Mach5 builds-tier1, jdk-tier[1-3],
>>>>>> and hs-tier[1-3] runs. I've also started my usual 24+ hour Thread-SMR
>>>>>> stress testing run on my Solaris-X64 server.
>>>>>>
>>>>>> Just to be extra sure, I backed out the fix from JDK-8199813 to
>>>>>> JavaThread::verify_not_published() (which started this round of
>>>>>> Thread-SMR fixes) and we catch the issue in the new function
>>>>>> ThreadsSMRSupport::verify_hazard_pointer_scanned().
>>>>>>
>>>>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>
> 


More information about the hotspot-runtime-dev mailing list