RFR(S): 8200374: Add ThreadsSMRSupport::verify_hazard_pointer_scanned() to verify threads_do()

Daniel D. Daugherty daniel.daugherty at oracle.com
Sun Apr 1 00:42:17 UTC 2018


Hi David!

Happy Easter!


On 3/31/18 7:40 PM, David Holmes wrote:
> On 31/03/2018 10:51 PM, Daniel D. Daugherty wrote:
>> Hi David,
>>
>> Thanks for the review!
>>
>> On 3/31/18 3:08 AM, David Holmes wrote:
>>> Hi Dan,
>>>
>>> This all seems fine to me.
>>
>> Thanks.
>>
>>
>>> Just to be nitpicky the set/get_sampler_thread methods could be 
>>> declared inline in the .hpp file. ;-)
>>
>> Then I would have to put them in an inline.hpp file right? :-)
>
> That was a joke right?

Not really... sort of... it's all so confusing... :-)
See below...


>
>> In any case, I tweaked those functions yesterday to fix an intermittent
>> failure in a JFR test:
>>
>> $ hg diff -w src/hotspot/share/trace/tracingExport.cpp 
>> src/hotspot/share/trace/tracingExport.hpp
>> diff -r df946ef9863e src/hotspot/share/trace/tracingExport.cpp
>> --- a/src/hotspot/share/trace/tracingExport.cpp    Wed Mar 28 
>> 20:16:43 2018 -0400
>> +++ b/src/hotspot/share/trace/tracingExport.cpp    Sat Mar 31 
>> 08:48:01 2018 -0400
>> @@ -23,14 +23,16 @@
>>    */
>>
>>   #include "precompiled.hpp"
>> +#include "runtime/orderAccess.inline.hpp"
>>   #include "trace/tracingExport.hpp"
>>
>> -Thread * TracingExport::_sampler_thread = (Thread *)NULL;
>> +// Allow lock-free access to _sampler_thread.
>> +Thread* volatile TracingExport::_sampler_thread = (Thread*)NULL;
>
> Nit: you should never need to cast NULL when assigning a pointer 
> variable.

I thought I saw something go by that said some compiler was
complaining about NULL assignments... maybe that was something
like this:

   char junk_byte = NULL;

and the the compiler wanted:

   char junk_byte = '\0';

In any case, I'll make sure I fix those before I send out the next webrev...

>
>>   Thread * TracingExport::get_sampler_thread() {
>> -  return _sampler_thread;
>> +  return (Thread*)OrderAccess::load_acquire(&_sampler_thread);
>>   }
>>
>>   void TracingExport::set_sampler_thread(Thread * thread) {
>> -  _sampler_thread = thread;
>> +  OrderAccess::release_store_fence(&_sampler_thread, thread);
>>   }
>
> Okay _now_ they would have to into a .inline.hpp file :)

So what's the rule again? Right now they're in a .cpp and they
call inline functions from OrderAccess so we're good. If I make
them inline, then I have to move them out of the .cpp file and
into an .inline.hpp... but only because I'm including a .inline.hpp
file myself...

I thought we were trying to move inline functions out of .hpp files
and into .inline.hpp files... So if I made my new functions inline,
then I would have to move them even before I switched them to call
OrderAccess...

I guess I could still be confused... :-)


>
>> diff -r df946ef9863e src/hotspot/share/trace/tracingExport.hpp
>> --- a/src/hotspot/share/trace/tracingExport.hpp    Wed Mar 28 
>> 20:16:43 2018 -0400
>> +++ b/src/hotspot/share/trace/tracingExport.hpp    Sat Mar 31 
>> 08:48:01 2018 -0400
>> @@ -33,7 +33,7 @@
>>     static void set_sampler_thread(Thread * thread);
>>
>>    private:
>> -  static Thread * _sampler_thread;
>> +  static Thread* volatile _sampler_thread;
>>   };
>>
>>   #endif // SHARE_VM_TRACE_TRACINGEXPORT_HPP
>>
>> I will be sending out an updated webrev on Monday with all these tweaks
>> after running through Therad-SMR stress on my Solaris-X64 server...
>
> Ok.

So far no issues other a known intermittent... I'm hoping for no more JFR
test failures... :-)

Dan


>
> Thanks,
> David
>
>> Dan
>>
>>>
>>> Thanks,
>>> David
>>>
>>> On 30/03/2018 2:26 AM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> During my testing where I restored the bug in 
>>>> JavaThread::verify_not_published()
>>>> to verify that the new function works, I made a few minor 
>>>> improvements:
>>>>
>>>> $ hg diff
>>>> diff -r df946ef9863e src/hotspot/share/runtime/thread.cpp
>>>> --- a/src/hotspot/share/runtime/thread.cpp    Wed Mar 28 20:16:43 
>>>> 2018 -0400
>>>> +++ b/src/hotspot/share/runtime/thread.cpp    Thu Mar 29 12:18:46 
>>>> 2018 -0400
>>>> @@ -3438,7 +3438,9 @@
>>>>     // Someday we could have a table or list of all non-JavaThreads.
>>>>     // For now, just manually iterate through them.
>>>>     tc->do_thread(VMThread::vm_thread());
>>>> -  Universe::heap()->gc_threads_do(tc);
>>>> +  if (Universe::heap() != NULL) {
>>>> +    Universe::heap()->gc_threads_do(tc);
>>>> +  }
>>>>     WatcherThread *wt = WatcherThread::watcher_thread();
>>>>     // Strictly speaking, the following NULL check isn't sufficient 
>>>> to make sure
>>>>     // the data for WatcherThread is still valid upon being 
>>>> examined. However,
>>>> diff -r df946ef9863e src/hotspot/share/runtime/threadSMR.cpp
>>>> --- a/src/hotspot/share/runtime/threadSMR.cpp    Wed Mar 28 
>>>> 20:16:43 2018 -0400
>>>> +++ b/src/hotspot/share/runtime/threadSMR.cpp    Thu Mar 29 
>>>> 12:18:46 2018 -0400
>>>> @@ -580,6 +580,8 @@
>>>>   // the Thread-SMR protocol.
>>>>   void ThreadsSMRSupport::verify_hazard_pointer_scanned(Thread 
>>>> *self, ThreadsList *threads) {
>>>>   #ifdef ASSERT
>>>> +  assert(threads != NULL, "threads must not be NULL");
>>>> +
>>>>     // The closure will attempt to verify that the calling thread can
>>>>     // be found by threads_do() on the specified ThreadsList. If it
>>>>     // is successful, then the specified ThreadsList was acquired as
>>>>
>>>>
>>>> The Universe::heap() NULL check in Threads::non_java_threads_do() is
>>>> needed because the particular code path to verify_not_published() is
>>>> very early in VM bringup... without it we always fail on a NULL ptr
>>>> deref (which isn't interesting)...
>>>>
>>>> The new assert() in verify_hazard_pointer_scanned() is just the usual
>>>> verify your input parameters...
>>>>
>>>> Dan
>>>>
>>>> On 3/29/18 9:09 AM, Daniel D. Daugherty wrote:
>>>>> Erik,
>>>>>
>>>>> Thanks for the quick review!
>>>>>
>>>>>
>>>>> On 3/29/18 3:00 AM, Erik Österlund wrote:
>>>>>> Hi Dan,
>>>>>>
>>>>>> This looks great.
>>>>>
>>>>> Good to hear especially since you wrote most of it... :-)
>>>>>
>>>>>
>>>>>> I will sleep much better at night with this 1:1 verification. :)
>>>>>
>>>>> It does tighten things up nicely!
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> /Erik
>>>>>>
>>>>>>
>>>>>> On 2018-03-29 03:52, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I have a (mostly) small enhancement for Thread-SMR:
>>>>>>>
>>>>>>> JDK-8200374 Add 
>>>>>>> ThreadsSMRSupport::verify_hazard_pointer_scanned() to verify 
>>>>>>> threads_do()
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8200374
>>>>>>>
>>>>>>> Here's the webrev URL:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/0_for_jdk_hs_open/ 
>>>>>>>
>>>>>>>
>>>>>>> This is Erik O's improvement on the assertion added by the 
>>>>>>> following
>>>>>>> bug fix (with some minor tweaking done by me):
>>>>>>>
>>>>>>> JDK-8199813 SIGSEGV in ThreadsList::includes()
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8199813
>>>>>>>
>>>>>>> Summary of the changes:
>>>>>>>
>>>>>>> - Replace the assertion that I added in JDK-8199813 with a closure
>>>>>>>   based function that verifies the threads_do() contract depended
>>>>>>>   on by Thread-SMR.
>>>>>>> - Add ThreadsSMRSupport::verify_hazard_pointer_scanned() to verify
>>>>>>>   that the calling thread's hazard pointer is scanned by 
>>>>>>> threads_do().
>>>>>>>   The new function is called from both 
>>>>>>> ThreadsSMRSupport::acquire_*()
>>>>>>>   functions.
>>>>>>> - Refactor the non-JavaThread part of Threads::threads_do() into
>>>>>>>   Threads::non_java_threads_do() so that the non-JavaThread part
>>>>>>>   can also be called by other threads_do() functions. Yes, the
>>>>>>>   Threads::threads_do() contract is still to scan every thread in
>>>>>>>   the system.
>>>>>>> - Add hooks for a "tracing sampler thread" to be optionally scanned
>>>>>>>   by Threads::non_java_threads_do().
>>>>>>>
>>>>>>> This fix has gone thru a couple of Mach5 builds-tier1, 
>>>>>>> jdk-tier[1-3],
>>>>>>> and hs-tier[1-3] runs. I've also started my usual 24+ hour 
>>>>>>> Thread-SMR
>>>>>>> stress testing run on my Solaris-X64 server.
>>>>>>>
>>>>>>> Just to be extra sure, I backed out the fix from JDK-8199813 to
>>>>>>> JavaThread::verify_not_published() (which started this round of
>>>>>>> Thread-SMR fixes) and we catch the issue in the new function
>>>>>>> ThreadsSMRSupport::verify_hazard_pointer_scanned().
>>>>>>>
>>>>>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>



More information about the hotspot-runtime-dev mailing list