RFR(S): 8200374: Add ThreadsSMRSupport::verify_hazard_pointer_scanned() to verify threads_do()

David Holmes david.holmes at oracle.com
Sat Mar 31 23:40:50 UTC 2018


On 31/03/2018 10:51 PM, Daniel D. Daugherty wrote:
> Hi David,
> 
> Thanks for the review!
> 
> On 3/31/18 3:08 AM, David Holmes wrote:
>> Hi Dan,
>>
>> This all seems fine to me.
> 
> Thanks.
> 
> 
>> Just to be nitpicky the set/get_sampler_thread methods could be 
>> declared inline in the .hpp file. ;-)
> 
> Then I would have to put them in an inline.hpp file right? :-)

That was a joke right?

> In any case, I tweaked those functions yesterday to fix an intermittent
> failure in a JFR test:
> 
> $ hg diff -w src/hotspot/share/trace/tracingExport.cpp 
> src/hotspot/share/trace/tracingExport.hpp
> diff -r df946ef9863e src/hotspot/share/trace/tracingExport.cpp
> --- a/src/hotspot/share/trace/tracingExport.cpp    Wed Mar 28 20:16:43 
> 2018 -0400
> +++ b/src/hotspot/share/trace/tracingExport.cpp    Sat Mar 31 08:48:01 
> 2018 -0400
> @@ -23,14 +23,16 @@
>    */
> 
>   #include "precompiled.hpp"
> +#include "runtime/orderAccess.inline.hpp"
>   #include "trace/tracingExport.hpp"
> 
> -Thread * TracingExport::_sampler_thread = (Thread *)NULL;
> +// Allow lock-free access to _sampler_thread.
> +Thread* volatile TracingExport::_sampler_thread = (Thread*)NULL;

Nit: you should never need to cast NULL when assigning a pointer variable.

>   Thread * TracingExport::get_sampler_thread() {
> -  return _sampler_thread;
> +  return (Thread*)OrderAccess::load_acquire(&_sampler_thread);
>   }
> 
>   void TracingExport::set_sampler_thread(Thread * thread) {
> -  _sampler_thread = thread;
> +  OrderAccess::release_store_fence(&_sampler_thread, thread);
>   }

Okay _now_ they would have to into a .inline.hpp file :)

> diff -r df946ef9863e src/hotspot/share/trace/tracingExport.hpp
> --- a/src/hotspot/share/trace/tracingExport.hpp    Wed Mar 28 20:16:43 
> 2018 -0400
> +++ b/src/hotspot/share/trace/tracingExport.hpp    Sat Mar 31 08:48:01 
> 2018 -0400
> @@ -33,7 +33,7 @@
>     static void set_sampler_thread(Thread * thread);
> 
>    private:
> -  static Thread * _sampler_thread;
> +  static Thread* volatile _sampler_thread;
>   };
> 
>   #endif // SHARE_VM_TRACE_TRACINGEXPORT_HPP
> 
> I will be sending out an updated webrev on Monday with all these tweaks
> after running through Therad-SMR stress on my Solaris-X64 server...

Ok.

Thanks,
David

> Dan
> 
>>
>> Thanks,
>> David
>>
>> On 30/03/2018 2:26 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> During my testing where I restored the bug in 
>>> JavaThread::verify_not_published()
>>> to verify that the new function works, I made a few minor improvements:
>>>
>>> $ hg diff
>>> diff -r df946ef9863e src/hotspot/share/runtime/thread.cpp
>>> --- a/src/hotspot/share/runtime/thread.cpp    Wed Mar 28 20:16:43 
>>> 2018 -0400
>>> +++ b/src/hotspot/share/runtime/thread.cpp    Thu Mar 29 12:18:46 
>>> 2018 -0400
>>> @@ -3438,7 +3438,9 @@
>>>     // Someday we could have a table or list of all non-JavaThreads.
>>>     // For now, just manually iterate through them.
>>>     tc->do_thread(VMThread::vm_thread());
>>> -  Universe::heap()->gc_threads_do(tc);
>>> +  if (Universe::heap() != NULL) {
>>> +    Universe::heap()->gc_threads_do(tc);
>>> +  }
>>>     WatcherThread *wt = WatcherThread::watcher_thread();
>>>     // Strictly speaking, the following NULL check isn't sufficient 
>>> to make sure
>>>     // the data for WatcherThread is still valid upon being examined. 
>>> However,
>>> diff -r df946ef9863e src/hotspot/share/runtime/threadSMR.cpp
>>> --- a/src/hotspot/share/runtime/threadSMR.cpp    Wed Mar 28 20:16:43 
>>> 2018 -0400
>>> +++ b/src/hotspot/share/runtime/threadSMR.cpp    Thu Mar 29 12:18:46 
>>> 2018 -0400
>>> @@ -580,6 +580,8 @@
>>>   // the Thread-SMR protocol.
>>>   void ThreadsSMRSupport::verify_hazard_pointer_scanned(Thread *self, 
>>> ThreadsList *threads) {
>>>   #ifdef ASSERT
>>> +  assert(threads != NULL, "threads must not be NULL");
>>> +
>>>     // The closure will attempt to verify that the calling thread can
>>>     // be found by threads_do() on the specified ThreadsList. If it
>>>     // is successful, then the specified ThreadsList was acquired as
>>>
>>>
>>> The Universe::heap() NULL check in Threads::non_java_threads_do() is
>>> needed because the particular code path to verify_not_published() is
>>> very early in VM bringup... without it we always fail on a NULL ptr
>>> deref (which isn't interesting)...
>>>
>>> The new assert() in verify_hazard_pointer_scanned() is just the usual
>>> verify your input parameters...
>>>
>>> Dan
>>>
>>> On 3/29/18 9:09 AM, Daniel D. Daugherty wrote:
>>>> Erik,
>>>>
>>>> Thanks for the quick review!
>>>>
>>>>
>>>> On 3/29/18 3:00 AM, Erik Österlund wrote:
>>>>> Hi Dan,
>>>>>
>>>>> This looks great.
>>>>
>>>> Good to hear especially since you wrote most of it... :-)
>>>>
>>>>
>>>>> I will sleep much better at night with this 1:1 verification. :)
>>>>
>>>> It does tighten things up nicely!
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> /Erik
>>>>>
>>>>>
>>>>> On 2018-03-29 03:52, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I have a (mostly) small enhancement for Thread-SMR:
>>>>>>
>>>>>> JDK-8200374 Add ThreadsSMRSupport::verify_hazard_pointer_scanned() 
>>>>>> to verify threads_do()
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8200374
>>>>>>
>>>>>> Here's the webrev URL:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/0_for_jdk_hs_open/
>>>>>>
>>>>>> This is Erik O's improvement on the assertion added by the following
>>>>>> bug fix (with some minor tweaking done by me):
>>>>>>
>>>>>> JDK-8199813 SIGSEGV in ThreadsList::includes()
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8199813
>>>>>>
>>>>>> Summary of the changes:
>>>>>>
>>>>>> - Replace the assertion that I added in JDK-8199813 with a closure
>>>>>>   based function that verifies the threads_do() contract depended
>>>>>>   on by Thread-SMR.
>>>>>> - Add ThreadsSMRSupport::verify_hazard_pointer_scanned() to verify
>>>>>>   that the calling thread's hazard pointer is scanned by 
>>>>>> threads_do().
>>>>>>   The new function is called from both ThreadsSMRSupport::acquire_*()
>>>>>>   functions.
>>>>>> - Refactor the non-JavaThread part of Threads::threads_do() into
>>>>>>   Threads::non_java_threads_do() so that the non-JavaThread part
>>>>>>   can also be called by other threads_do() functions. Yes, the
>>>>>>   Threads::threads_do() contract is still to scan every thread in
>>>>>>   the system.
>>>>>> - Add hooks for a "tracing sampler thread" to be optionally scanned
>>>>>>   by Threads::non_java_threads_do().
>>>>>>
>>>>>> This fix has gone thru a couple of Mach5 builds-tier1, jdk-tier[1-3],
>>>>>> and hs-tier[1-3] runs. I've also started my usual 24+ hour Thread-SMR
>>>>>> stress testing run on my Solaris-X64 server.
>>>>>>
>>>>>> Just to be extra sure, I backed out the fix from JDK-8199813 to
>>>>>> JavaThread::verify_not_published() (which started this round of
>>>>>> Thread-SMR fixes) and we catch the issue in the new function
>>>>>> ThreadsSMRSupport::verify_hazard_pointer_scanned().
>>>>>>
>>>>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>
>>>>
>>>>
>>>
> 


More information about the hotspot-runtime-dev mailing list