RFR(S): 8200374: Add ThreadsSMRSupport::verify_hazard_pointer_scanned() to verify threads_do()

David Holmes david.holmes at oracle.com
Tue Apr 3 21:39:31 UTC 2018


On 4/04/2018 7:25 AM, Daniel D. Daugherty wrote:
> Greetings,
> 
> This fix has been revised due to feedback from David H. and Erik O.
> 
> Here's the incremental webrev:
> 
> http://cr.openjdk.java.net/~dcubed/8200374-webrev/2_for_jdk_hs_open.inc/
> 
> And here's the full webrev:
> 
> http://cr.openjdk.java.net/~dcubed/8200374-webrev/2_for_jdk_hs_open.full/
> 
> Summary of the changes relative to 'cr1':
> 
> - Rename TracingExport::get_sampler_thread() to sampler_thread_acquire()
>    and TracingExport::set_sampler_thread() to 
> set_sampler_thread_with_lock().
> - set_sampler_thread_with_lock() uses Threads_lock to avoid conflicts with
>    Thread-SMR scans.

Sorry Dan but this is now mixing lock-free and locked in a "broken" way. 
A load_acquire is supposed to always be there to match a release_store - 
but we don't have a release_store because we hold the lock when setting 
the field. The lock is being used to serialize with the ThreadSMR scans. 
If we need an acquire it is because we want to ensure that we see stores 
prior to the setting of the sampler_thread when we read the 
sampler_thread. If that is the case then we still need the release_store 
because we may read the new sampler_thread value as soon as it is set, 
but before the unlock occurs - otherwise prior stores need not be 
visible. If it is not the case that we need to see prior stores then we 
don't need the acquire.

David
-----

> The Mach5 builds-tier1, jdk-tier[1-3], and hs-tier[1-3] run has finished
> and there are no test failures related to these changes.
> 
> My Solaris-X64 server is finishing the third round of Thread-SMR stress
> testing on the 'cr1' version. I'll start a round of stress testing for
> this version ('cr2') as soon as I can.
> 
> Thanks, in advance, for any comments, suggestions, or feedback.
> 
> Dan
> 
> On 4/2/18 11:11 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> This fix has been revised due to additional testing and due to
>> feedback from David H.
>>
>> Here's the incremental webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/1_for_jdk_hs_open.inc/
>>
>> And here's the full webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/1_for_jdk_hs_open.full/
>>
>> My usual Thread-SMR stress testing on my Solaris-X64 is still running;
>> so far there has been only one unrelated intermittent test failure. The
>> Mach5 builds-tier1, jdk-tier[1-3], and hs-tier[1-3] run is still in
>> process...
>>
>> Thanks, in advance, for any comments, suggestions, or feedback.
>>
>> Dan
>>
>>
>> On 3/28/18 9:52 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have a (mostly) small enhancement for Thread-SMR:
>>>
>>> JDK-8200374 Add ThreadsSMRSupport::verify_hazard_pointer_scanned() to 
>>> verify threads_do()
>>> https://bugs.openjdk.java.net/browse/JDK-8200374
>>>
>>> Here's the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/0_for_jdk_hs_open/
>>>
>>> This is Erik O's improvement on the assertion added by the following
>>> bug fix (with some minor tweaking done by me):
>>>
>>> JDK-8199813 SIGSEGV in ThreadsList::includes()
>>> https://bugs.openjdk.java.net/browse/JDK-8199813
>>>
>>> Summary of the changes:
>>>
>>> - Replace the assertion that I added in JDK-8199813 with a closure
>>>   based function that verifies the threads_do() contract depended
>>>   on by Thread-SMR.
>>> - Add ThreadsSMRSupport::verify_hazard_pointer_scanned() to verify
>>>   that the calling thread's hazard pointer is scanned by threads_do().
>>>   The new function is called from both ThreadsSMRSupport::acquire_*()
>>>   functions.
>>> - Refactor the non-JavaThread part of Threads::threads_do() into
>>>   Threads::non_java_threads_do() so that the non-JavaThread part
>>>   can also be called by other threads_do() functions. Yes, the
>>>   Threads::threads_do() contract is still to scan every thread in
>>>   the system.
>>> - Add hooks for a "tracing sampler thread" to be optionally scanned
>>>   by Threads::non_java_threads_do().
>>>
>>> This fix has gone thru a couple of Mach5 builds-tier1, jdk-tier[1-3],
>>> and hs-tier[1-3] runs. I've also started my usual 24+ hour Thread-SMR
>>> stress testing run on my Solaris-X64 server.
>>>
>>> Just to be extra sure, I backed out the fix from JDK-8199813 to
>>> JavaThread::verify_not_published() (which started this round of
>>> Thread-SMR fixes) and we catch the issue in the new function
>>> ThreadsSMRSupport::verify_hazard_pointer_scanned().
>>>
>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>
>>> Dan
>>>
>>>
>>
>>
> 


More information about the hotspot-runtime-dev mailing list