RFR(XL): 8167108 - SMR and JavaThread Lifecycle

Daniel D. Daugherty daniel.daugherty at oracle.com
Thu Nov 23 02:16:35 UTC 2017


Greetings,

I've made minor tweaks to the Thread-SMR project based on the remaining
code review comments over the last couple of days. I've also rebased the
project to jdk/hs bits as of mid-afternoon (my TZ) on 2017.11.22. I'm
running baseline Mach5 Tier[1-5] testing and prototype Mach5 Tier[1-5]
testing...

Here's a delta webrev for the latest round of tweaks:

http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-11-delta


Here's the proposed commit message:

$ cat SMR_prototype_10/open/commit.txt
8167108: inconsistent handling of SR_lock can lead to crashes
Summary: Add Thread Safe Memory Reclamation (Thread-SMR) mechanism.
Reviewed-by: coleenp, dcubed, dholmes, eosterlund, gthornbr, kbarrett, 
rehn, sspitsyn, stefank
Contributed-by: daniel.daugherty at oracle.com, erik.osterlund at oracle.com, 
robbin.ehn at oracle.com

I've gone through 880+ emails in my archive for this project and added
anyone that made a code review comment. Reviewers are not in my usual
by-comment-posting order; it was way too hard to figure that out... So
reviewers and contributors are simply in alpha-sort order...

Here's the collection of follow-up bugs that I filed based on sifting
through the email archive:

8191784 JavaThread::collect_counters() should stop using Threads_lock
https://bugs.openjdk.java.net/browse/JDK-8191784

8191785 revoke_bias() needs a new assertion
https://bugs.openjdk.java.net/browse/JDK-8191785

8191786 Thread-SMR hash table size should be dynamic
https://bugs.openjdk.java.net/browse/JDK-8191786

8191787 move private inline functions from thread.inline.hpp -> thread.cpp
https://bugs.openjdk.java.net/browse/JDK-8191787

8191789 migrate more Thread-SMR stuff from thread.[ch]pp -> threadSMR.[ch]pp
https://bugs.openjdk.java.net/browse/JDK-8191789

8191798 redo nested ThreadsListHandle to drop Threads_lock
https://bugs.openjdk.java.net/browse/JDK-8191789

I have undoubtedly missed at least one future idea that someone had
and for that my apologies...

Thanks again to everyone who contributed to the Thread-SMR project!!

Special thanks goes to Karen Kinnear, Kim Barrett and David Holmes for
their rigorous review of the design that we documented on the wiki.
Thanks for keeping us honest!

I also have to thank my co-contributor Erik Österlund for starting the
ball rolling on this crazy project with his presentation on Safe Memory
Reclamation, for the initial prototype and for all your patches. Erik,
hopefully we got far enough in your crazy vision... :-)

Thanks to my co-contributor Robbin Ehn for proposing that we do early
code reviews in the form of patches. Don't like something? Fix it with
a patch and a new test if appropriate!! A pretty cool way to work that
was new to me.

Finally, many thanks to Jerry Thornbrugh for all the early code reviews,
whiteboard chats and phone calls. Thanks for asking the right questions
and pointing out some places where our logic was faulty. Also thanks for
keeping me sane when I was literally on my own in Monument, CO.

Dan


On 11/21/17 7:12 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> *** We are wrapping up code review on this project so it is time ***
> *** for the various code reviewers to chime in one last time. ***
>
> In this latest round, we had three different reviewers chime in so we're
> doing delta webrevs for each of those resolutions:
>
> David H's resolutions:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.09,10.10.0-delta/ 
>
>
>   mq comment for dholmes_CR:
>     - Fix indents, trailing spaces and various typos.
>     - Add descriptions for the '_cnt', '_max' and '_times" fields,
>       add impl notes to document the type choices.
>
>   src/hotspot/share/runtime/globals.hpp change is white-space only so it
>   is only visible via the file's patch link.
>
>
> Robin W's resolutions:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10.0,10.10.1-delta/ 
>
>
>   mq comment for robinw_CR:
>     - Fix some inefficient code, update some comments, fix some indents,
>       and add some 'const' specifiers.
>
>   src/hotspot/share/runtime/vm_operations.hpp change is white-space 
> only so
>   it is only visible via the file's patch link.
>
>
> Coleen's resolutions:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10.1,10.10.2-delta/ 
>
>
>   mq comment for coleenp_CR:
>     - Change ThreadsList::_threads from 'mtGC' -> 'mtThread',
>     - add header comment to threadSMR.hpp file,
>     - cleanup JavaThreadIteratorWithHandle ctr,
>     - make ErrorHandling more efficient.
>
>
> Some folks prefer to see all of the delta webrevs together so here is
> a delta webrev relative to jdk10-09-full:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.09,10.10.2-delta/ 
>
>
>   src/hotspot/share/runtime/globals.hpp and
>   src/hotspot/share/runtime/vm_operations.hpp changes are white-space 
> only
>   so they are only visible via the file's patch link.
>
>
> And, of course, some folks prefer to see everything all at once so here
> is the latest full webrev:
>
> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-10.10-full/
>
>
> Dan has submitted the bits for the usual Mach5 tier[1-5] testing. The
> prelim Mach5 tier1 (JPRT equivalent) on these bits passed just fine...
>
> The project is currently baselined on Jesper's 2017-11-17 PIT snapshot
> so there will be some catching up to do before integration...
>
> We welcome comments, suggestions and feedback.
>
> Dan, Erik and Robbin
>
>
> On 11/18/17 8:49 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> Testing of the last round of changes revealed a hang in one of the new
>> TLH tests. Robbin has fixed the hang, updated the existing TLH test, and
>> added another TLH test for good measure.
>>
>> Here is the updated full webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/
>>
>> Here is the updated delta webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-delta/
>>
>> Dan ran the bits thru the usual Mach5 tier[1-5] testing and there are
>> no unexpected failures.
>>
>> We welcome comments, suggestions and feedback.
>>
>> Dan, Erik and Robbin
>>
>>
>> On 11/15/17 3:32 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> Robbin rebased the project last night/this morning to merge with Thread
>>> Local Handshakes (TLH) and also picked up additional changesets up 
>>> thru:
>>>
>>>> Changeset: fa736014cf28
>>>> Author:    cjplummer
>>>> Date:      2017-11-14 18:08 -0800
>>>> URL:http://hg.openjdk.java.net/jdk/hs/rev/fa736014cf28
>>>>
>>>> 8191049: Add alternate version of pns() that is callable from 
>>>> within hotspot source
>>>> Summary: added pns2() to debug.cpp
>>>> Reviewed-by: stuefe, gthornbr
>>>
>>> This is the first time we've rebased the project to bits that are this
>>> fresh (< 12 hours old at rebase time). We've done this because we think
>>> we're done with this project and are in the final review-change-retest
>>> cycle(s)... In other words, we're not planning on making any more major
>>> changes for this project... :-)
>>>
>>> *** Time for code reviewers to chime in on this thread! ***
>>>
>>> Here is the updated full webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-full/
>>>
>>> Here's is the delta webrev from the 2017.11.10 rebase:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-delta/
>>>
>>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing
>>> (and the new baseline also)...
>>>
>>> We're expecting this round to be a little noisier than usual because
>>> we did not rebase on a PIT snapshot so the new baseline has not been
>>> through Jesper's usual care-and-feeding of integration_blockers, etc.
>>>
>>> We welcome comments, suggestions and feedback.
>>>
>>> Dan, Erik and Robbin
>>>
>>>
>>> On 11/14/17 4:48 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I rebased the project to the 2017.11.10 jdk/hs PIT snapshot.
>>>> (Yes, we're playing chase-the-repo...)
>>>>
>>>> Here is the updated full webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-07-full/
>>>>
>>>> Unlike the previous rebase, there were no changes required to the
>>>> open code to get the rebased bits to build so there is no delta
>>>> webrev.
>>>>
>>>> These bits have been run through JPRT and I've submitted the usual
>>>> Mach5 tier[1-5] test run...
>>>>
>>>> We welcome comments, suggestions and feedback.
>>>>
>>>> Dan, Erik and Robbin
>>>>
>>>>
>>>> On 11/13/17 12:30 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.
>>>>>
>>>>> Here are the updated webrevs:
>>>>>
>>>>> Here's the mq comment for the change:
>>>>>
>>>>>   Rebase to 2017.10.25 PIT snapshot.
>>>>>
>>>>> Here is the full webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/
>>>>>
>>>>> And here is the delta webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/
>>>>>
>>>>> I ran the above bits throught Mach5 tier[1-5] testing over the 
>>>>> holiday
>>>>> weekend. Didn't see any issues in a quick look. Going to take a 
>>>>> closer
>>>>> look today.
>>>>>
>>>>> We welcome comments, suggestions and feedback.
>>>>>
>>>>> Dan, Erik and Robbin
>>>>>
>>>>>
>>>>> On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> Resolving one of the code review comments (from both Stefan K and 
>>>>>> Coleen)
>>>>>> on jdk10-04-full required quite a few changes so it is being done 
>>>>>> as a
>>>>>> standalone patch and corresponding webrevs:
>>>>>>
>>>>>> Here's the mq comment for the change:
>>>>>>
>>>>>>   stefank, coleenp CR - refactor most JavaThreadIterator usage to 
>>>>>> use
>>>>>>     JavaThreadIteratorWithHandle.
>>>>>>
>>>>>> Here is the full webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>>>>>>
>>>>>> And here is the delta webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>>>>>>
>>>>>> We welcome comments, suggestions and feedback.
>>>>>>
>>>>>> Dan, Erik and Robbin
>>>>>>
>>>>>>
>>>>>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> We have a (eXtra Large) fix for the following bug:
>>>>>>>
>>>>>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>>>>>
>>>>>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>>>>>> Hazard Pointers to manage JavaThread lifecycle.
>>>>>>>
>>>>>>> Here's a PDF for the internal wiki that we've been using to 
>>>>>>> describe
>>>>>>> and track the work on this project:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>>>>>>
>>>>>>>
>>>>>>> Dan has noticed that the indenting is wrong in some of the code 
>>>>>>> quotes
>>>>>>> in the PDF that are not present in the internal wiki. We don't 
>>>>>>> have a
>>>>>>> solution for that problem yet.
>>>>>>>
>>>>>>> Here's the webrev for current JDK10 version of this fix:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>>>>>
>>>>>>> This fix has been run through many rounds of JPRT and Mach5 
>>>>>>> tier[2-5]
>>>>>>> testing, additional stress testing on Dan's Solaris X64 server, and
>>>>>>> additional testing on Erik and Robbin's machines.
>>>>>>>
>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>
>>>>>>> Daniel Daugherty
>>>>>>> Erik Osterlund
>>>>>>> Robbin Ehn
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>




More information about the hotspot-gc-dev mailing list