RFR(S): 8200374: Add ThreadsSMRSupport::verify_hazard_pointer_scanned() to verify threads_do()

Daniel D. Daugherty daniel.daugherty at oracle.com
Tue Apr 3 21:45:47 UTC 2018


On 4/3/18 5:39 PM, David Holmes wrote:
> On 4/04/2018 7:25 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> This fix has been revised due to feedback from David H. and Erik O.
>>
>> Here's the incremental webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/2_for_jdk_hs_open.inc/
>>
>> And here's the full webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/2_for_jdk_hs_open.full/ 
>>
>>
>> Summary of the changes relative to 'cr1':
>>
>> - Rename TracingExport::get_sampler_thread() to sampler_thread_acquire()
>>    and TracingExport::set_sampler_thread() to 
>> set_sampler_thread_with_lock().
>> - set_sampler_thread_with_lock() uses Threads_lock to avoid conflicts 
>> with
>>    Thread-SMR scans.
>
> Sorry Dan but this is now mixing lock-free and locked in a "broken" way.

Yup... Little voice in the back of my head was telling me
that... I didn't listen... I'll switch it to:

OrderAccess::release_store(&_sampler_thread, thread);

and retest... :-)

Dan



> A load_acquire is supposed to always be there to match a release_store 
> - but we don't have a release_store because we hold the lock when 
> setting the field. The lock is being used to serialize with the 
> ThreadSMR scans. If we need an acquire it is because we want to ensure 
> that we see stores prior to the setting of the sampler_thread when we 
> read the sampler_thread. If that is the case then we still need the 
> release_store because we may read the new sampler_thread value as soon 
> as it is set, but before the unlock occurs - otherwise prior stores 
> need not be visible. If it is not the case that we need to see prior 
> stores then we don't need the acquire.
>
> David
> -----
>
>> The Mach5 builds-tier1, jdk-tier[1-3], and hs-tier[1-3] run has finished
>> and there are no test failures related to these changes.
>>
>> My Solaris-X64 server is finishing the third round of Thread-SMR stress
>> testing on the 'cr1' version. I'll start a round of stress testing for
>> this version ('cr2') as soon as I can.
>>
>> Thanks, in advance, for any comments, suggestions, or feedback.
>>
>> Dan
>>
>> On 4/2/18 11:11 AM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> This fix has been revised due to additional testing and due to
>>> feedback from David H.
>>>
>>> Here's the incremental webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/1_for_jdk_hs_open.inc/ 
>>>
>>>
>>> And here's the full webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/1_for_jdk_hs_open.full/ 
>>>
>>>
>>> My usual Thread-SMR stress testing on my Solaris-X64 is still running;
>>> so far there has been only one unrelated intermittent test failure. The
>>> Mach5 builds-tier1, jdk-tier[1-3], and hs-tier[1-3] run is still in
>>> process...
>>>
>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>
>>> Dan
>>>
>>>
>>> On 3/28/18 9:52 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have a (mostly) small enhancement for Thread-SMR:
>>>>
>>>> JDK-8200374 Add ThreadsSMRSupport::verify_hazard_pointer_scanned() 
>>>> to verify threads_do()
>>>> https://bugs.openjdk.java.net/browse/JDK-8200374
>>>>
>>>> Here's the webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/0_for_jdk_hs_open/
>>>>
>>>> This is Erik O's improvement on the assertion added by the following
>>>> bug fix (with some minor tweaking done by me):
>>>>
>>>> JDK-8199813 SIGSEGV in ThreadsList::includes()
>>>> https://bugs.openjdk.java.net/browse/JDK-8199813
>>>>
>>>> Summary of the changes:
>>>>
>>>> - Replace the assertion that I added in JDK-8199813 with a closure
>>>>   based function that verifies the threads_do() contract depended
>>>>   on by Thread-SMR.
>>>> - Add ThreadsSMRSupport::verify_hazard_pointer_scanned() to verify
>>>>   that the calling thread's hazard pointer is scanned by threads_do().
>>>>   The new function is called from both ThreadsSMRSupport::acquire_*()
>>>>   functions.
>>>> - Refactor the non-JavaThread part of Threads::threads_do() into
>>>>   Threads::non_java_threads_do() so that the non-JavaThread part
>>>>   can also be called by other threads_do() functions. Yes, the
>>>>   Threads::threads_do() contract is still to scan every thread in
>>>>   the system.
>>>> - Add hooks for a "tracing sampler thread" to be optionally scanned
>>>>   by Threads::non_java_threads_do().
>>>>
>>>> This fix has gone thru a couple of Mach5 builds-tier1, jdk-tier[1-3],
>>>> and hs-tier[1-3] runs. I've also started my usual 24+ hour Thread-SMR
>>>> stress testing run on my Solaris-X64 server.
>>>>
>>>> Just to be extra sure, I backed out the fix from JDK-8199813 to
>>>> JavaThread::verify_not_published() (which started this round of
>>>> Thread-SMR fixes) and we catch the issue in the new function
>>>> ThreadsSMRSupport::verify_hazard_pointer_scanned().
>>>>
>>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>>
>>>> Dan
>>>>
>>>>
>>>
>>>
>>



More information about the hotspot-runtime-dev mailing list