RFR(XL): 8167108 - SMR and JavaThread Lifecycle

Daniel D. Daugherty daniel.daugherty at oracle.com
Tue Nov 21 17:36:31 UTC 2017


Thanks for keeping all the OpenJDK aliases!


On 11/21/17 12:24 PM, coleen.phillimore at oracle.com wrote:
>
>
> On 11/21/17 11:28 AM, Daniel D. Daugherty wrote:
>> Hi Coleen!
>>
>> Thanks for making time to review the Thread-SMR stuff again!!
>>
>> I have added back the other three OpenJDK aliases... This review is
>> being done on _four_ different OpenJDK aliases.
>>
>> As always, replies are embedded below...
>>
>>
>> On 11/20/17 3:12 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/src/hotspot/os/linux/os_linux.cpp.frames.html 
>>>
>>>
>>> I read David's comments about the threads list iterator, and I was 
>>> going to say that it can be cleaned up later, as the bulk of the 
>>> change is the SMR part but this looks truly bizarre. It looks like 
>>> it shouldn't compile because 'jt' isn't in scope.
>>>
>>> Why isn't this the syntax:
>>>
>>> JavaThreadIteratorWithHandle jtiwh;
>>> for (JavaThread* jt = jtiwh.first(); jt != NULL; jt = jtiwh.next()) {
>>> }
>>>
>>> Which would do the obvious thing without anyone having to squint at 
>>> the code.
>>
>> See my reply to David's review for the more detailed answer.
>>
>> For the above syntax, we would need braces to limit the scope of the
>> 'jtiwh' variable. With Stefan's propsal, you get limited scope on
>> 'jtiwh' for "free".
>
> Yes, thank you, I see that motivation now.
>>
>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/src/hotspot/share/runtime/threadSMR.hpp.html 
>>>
>>>
>>> As a hater of acronmys, can this file use "Safe Memory Reclaimation"
>>
>> I'm not sure what you mean by this. Do you mean rename the files?
>>
>>   threadSMR.hpp -> threadSafeMemoryReclaimation.hpp
>>   threadSMR.cpp -> threadSafeMemoryReclaimation.cpp
>>
>
> No.
>
>>
>>> and briefly describe the concept in the beginning of the header file,
>>> so one knows why it's called threadSMR.hpp?
>>
>> And then this part of the sentence kind of indicates that you would be
>> okay with the threadSMR.{c,h}pp names if a comment was added to the
>> header file.
>>
>> Please clarify.
>
> Yes.  If you added a comment to the beginning of the threadSMR.hpp 
> file that said what SMR stood for and what it was, that would be 
> extremely helpful for future viewers of this file.

Yup. I just finished with the comment...


>
>>
>>
>>> It doesn't need to be long and include why Threads list needs this 
>>> Sometimes we tell new people that the hotspot documentation is in 
>>> the header files.
>>
>> Yup. When I migrated stuff from thread.hpp and thread.cpp to 
>> threadSMR.hpp
>> and threadSMR.cpp, I should have written a header comment...
>>
>> I did update a comment in thread.cpp based on Robin W's code review:
>
> Yes, I think a comment belongs in the SMR file also, or in preference.

Wasn't trying to say that I thought the update I did for Robin W was
sufficient...


>>
>>> > src/hotspot/share/runtime/thread.cpp
>>> >
>>> > 3432 // operations over all threads.  It is protected by its own 
>>> Mutex
>>> > 3433 // lock, which is also used in other contexts to protect thread
>>> >
>>> > Should this comment perhaps be revised to mention SMR?
>>>
>>> It definitely needs some updating... Here's a stab at it:
>>>
>>> // The Threads class links together all active threads, and provides
>>> // operations over all threads. It is protected by the Threads_lock,
>>> // which is also used in other global contexts like safepointing.
>>> // ThreadsListHandles are used to safely perform operations on one
>>> // or more threads without the risk of the thread exiting during the
>>> // operation.
>>> //
>>> // Note: The Threads_lock is currently more widely used than we
>>> // would like. We are actively migrating Threads_lock uses to other
>>> // mechanisms in order to reduce Threads_lock contention. 
>>
>> I'll take a look at adding a header comment to threadSMR.hpp.
>>
>>
>>> 186   JavaThreadIteratorWithHandle() : _tlh(), _index(0) {}
>>>
>>> This _tlh() call should not be necessary.  The compiler should 
>>> generate this for you in the constructor.
>>
>> Deleted.
>>
>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/src/hotspot/share/runtime/threadSMR.cpp.html 
>>>
>>>
>>>   32 ThreadsList::ThreadsList(int entries) : _length(entries), 
>>> _threads(NEW_C_HEAP_ARRAY(JavaThread*, entries + 1, mtGC)), 
>>> _next_list(NULL) {
>>>
>>> Seems like it should be mtThread rather than mtGC.
>>
>> Fixed. Definitely an artifact of Erik's original prototype when he
>> extracted Thread-SMR from his GC work... Thanks for catching it.
>>
>>
>>> Should
>>>
>>>   62   if (EnableThreadSMRStatistics) {
>>>
>>> really be UL, ie: if (log_is_enabled(Info, thread, smr, statistics)) ?
>>
>> EnableThreadSMRStatistics is used in more places than UL code.
>> We use it in Thread::print_*() stuff to control output of
>> Thread-SMR statistics info in thread dumps and hs_err_pid file
>> generation.
>
> That sort of thing is also controlled by logging in general.

Don't think I want to do that since logging may be be "up" at the
time that I need Thread::print_*() or hs_err_pid generation...

Something about chickens and eggs... :-)


>>
>> Currently thread dump and hs_err_pid file output is not generated
>> using UL (and probably can't be?) so we need an option to control
>> the Thread-SMR statistics stuff in all places.
>>
>
> You can use log options in preference to this new diagnostic option in 
> these cases too.   This option looks a lot like logging to me and 
> would be nice to not have to later convert it.

See above reply...


>
>>
>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/test/hotspot/jtreg/runtime/ErrorHandling/NestedThreadsListHandleInErrorHandlingTest.java.html 
>>>
>>>
>>> Can you use for these tests instead (there were a couple):
>>>
>>> *@requires (vm.debug == true)*
>>
>> The test I cloned had this in it:
>>
>>     if (!Platform.isDebugBuild()) {
>>         // -XX:ErrorHandlerTest=N option requires debug bits.
>>         return;
>>     }
>>
>> and you would like me to switch to the newer mechanism?
>>
>> I have updated the following tests:
>>
>>   test/hotspot/jtreg/runtime/ErrorHandling/ErrorHandler.java
>> test/hotspot/jtreg/runtime/ErrorHandling/NestedThreadsListHandleInErrorHandlingTest.java 
>>
>> test/hotspot/jtreg/runtime/ErrorHandling/ThreadsListHandleInErrorHandlingTest.java 
>>
>>
>>
> Yes, the newer mechanism makes jtreg not bother to run the test, which 
> makes it faster!

More quickly not run the test... Yup... got it...


>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/src/hotspot/share/runtime/thread.cpp.udiff.html 
>>>
>>>
>>> +// thread, has been added the Threads list, the system is not at a
>>>
>>> Has "not" been added to the Threads list?  missing "not"?
>>
>> Nope. If the JavaThread has been added to the Threads list
>> and it is not protected, then it is dangling. In other words,
>> a published JavaThread (on the Threads list) has to be protected
>> by either the system being at a safepoint or the JavaThread has
>> to be on some threads's ThreadsList.
>>
>>
>>>
>>> + return (unsigned int)(((uint32_t)(uintptr_t)s1) * 2654435761u);
>>>
>>> Can you add a comment about where this number came from?
>>
>> I'll have to get that from Erik...
>>
>>
>>> I can't find the caller of threads_do_smr.
>>
>> I'm guessing that's used by the GC code that depends on Thread-SMR...
>>
>
> They  should add this API when the add the use of it then.  I don't 
> see it in any sources.

We'll see what Erik wants to do...


>
>>
>>> If these functions xchg_smr_thread_list, get_smr_java_thread_list, 
>>> inc_smr_deleted_thread_count are only used by thread.cpp, I think 
>>> they should go in that file and not in the .inline.hpp file to be 
>>> included and possibly called by other files.  I think they're 
>>> private to class Threads.
>>
>> I have a vague memory that some of the compilers don't do inlining when
>> an "inline" function is in a .cpp. I believe we want these functions
>> to be inlined for performance reasons. Erik should probably chime in
>> here.
>
> I don't see why this should be.  Maybe some (older) compilers require 
> it to be found before the call though, but that can still be 
> accomplished in the .cpp file.

Again, we'll see what Erik wants to do...


>>
>>
>>> I don't have any in-depth comments. This looks really good from my 
>>> day of reading it.
>>
>> Thanks for taking the time to review it again!
>>
>
> Thanks,
> Coleen

And thanks again...

Dan


>
>> Dan
>>
>>
>>>
>>> Thanks,
>>> Coleen
>>>
>>> On 11/18/17 8:49 PM, Daniel D. Daugherty wrote:
>>>
>>>> Greetings,
>>>>
>>>> Testing of the last round of changes revealed a hang in one of the new
>>>> TLH tests. Robbin has fixed the hang, updated the existing TLH 
>>>> test, and
>>>> added another TLH test for good measure.
>>>>
>>>> Here is the updated full webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/
>>>>
>>>> Here is the updated delta webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-delta/
>>>>
>>>> Dan ran the bits thru the usual Mach5 tier[1-5] testing and there are
>>>> no unexpected failures.
>>>>
>>>> We welcome comments, suggestions and feedback.
>>>>
>>>> Dan, Erik and Robbin
>>>>
>>>>
>>>> On 11/15/17 3:32 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> Robbin rebased the project last night/this morning to merge with 
>>>>> Thread
>>>>> Local Handshakes (TLH) and also picked up additional changesets up 
>>>>> thru:
>>>>>
>>>>>> Changeset: fa736014cf28
>>>>>> Author:    cjplummer
>>>>>> Date:      2017-11-14 18:08 -0800
>>>>>> URL:http://hg.openjdk.java.net/jdk/hs/rev/fa736014cf28
>>>>>>
>>>>>> 8191049: Add alternate version of pns() that is callable from 
>>>>>> within hotspot source
>>>>>> Summary: added pns2() to debug.cpp
>>>>>> Reviewed-by: stuefe, gthornbr
>>>>>
>>>>> This is the first time we've rebased the project to bits that are 
>>>>> this
>>>>> fresh (< 12 hours old at rebase time). We've done this because we 
>>>>> think
>>>>> we're done with this project and are in the final 
>>>>> review-change-retest
>>>>> cycle(s)... In other words, we're not planning on making any more 
>>>>> major
>>>>> changes for this project... :-)
>>>>>
>>>>> *** Time for code reviewers to chime in on this thread! ***
>>>>>
>>>>> Here is the updated full webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-full/
>>>>>
>>>>> Here's is the delta webrev from the 2017.11.10 rebase:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-delta/
>>>>>
>>>>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing
>>>>> (and the new baseline also)...
>>>>>
>>>>> We're expecting this round to be a little noisier than usual because
>>>>> we did not rebase on a PIT snapshot so the new baseline has not been
>>>>> through Jesper's usual care-and-feeding of integration_blockers, etc.
>>>>>
>>>>> We welcome comments, suggestions and feedback.
>>>>>
>>>>> Dan, Erik and Robbin
>>>>>
>>>>>
>>>>> On 11/14/17 4:48 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I rebased the project to the 2017.11.10 jdk/hs PIT snapshot.
>>>>>> (Yes, we're playing chase-the-repo...)
>>>>>>
>>>>>> Here is the updated full webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-07-full/
>>>>>>
>>>>>> Unlike the previous rebase, there were no changes required to the
>>>>>> open code to get the rebased bits to build so there is no delta
>>>>>> webrev.
>>>>>>
>>>>>> These bits have been run through JPRT and I've submitted the usual
>>>>>> Mach5 tier[1-5] test run...
>>>>>>
>>>>>> We welcome comments, suggestions and feedback.
>>>>>>
>>>>>> Dan, Erik and Robbin
>>>>>>
>>>>>>
>>>>>> On 11/13/17 12:30 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.
>>>>>>>
>>>>>>> Here are the updated webrevs:
>>>>>>>
>>>>>>> Here's the mq comment for the change:
>>>>>>>
>>>>>>>   Rebase to 2017.10.25 PIT snapshot.
>>>>>>>
>>>>>>> Here is the full webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/
>>>>>>>
>>>>>>> And here is the delta webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/
>>>>>>>
>>>>>>> I ran the above bits throught Mach5 tier[1-5] testing over the 
>>>>>>> holiday
>>>>>>> weekend. Didn't see any issues in a quick look. Going to take a 
>>>>>>> closer
>>>>>>> look today.
>>>>>>>
>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>
>>>>>>> Dan, Erik and Robbin
>>>>>>>
>>>>>>>
>>>>>>> On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> Resolving one of the code review comments (from both Stefan K 
>>>>>>>> and Coleen)
>>>>>>>> on jdk10-04-full required quite a few changes so it is being 
>>>>>>>> done as a
>>>>>>>> standalone patch and corresponding webrevs:
>>>>>>>>
>>>>>>>> Here's the mq comment for the change:
>>>>>>>>
>>>>>>>>   stefank, coleenp CR - refactor most JavaThreadIterator usage 
>>>>>>>> to use
>>>>>>>>     JavaThreadIteratorWithHandle.
>>>>>>>>
>>>>>>>> Here is the full webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>>>>>>>>
>>>>>>>> And here is the delta webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>>>>>>>>
>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>
>>>>>>>> Dan, Erik and Robbin
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> We have a (eXtra Large) fix for the following bug:
>>>>>>>>>
>>>>>>>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>>>>>>>
>>>>>>>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>>>>>>>> Hazard Pointers to manage JavaThread lifecycle.
>>>>>>>>>
>>>>>>>>> Here's a PDF for the internal wiki that we've been using to 
>>>>>>>>> describe
>>>>>>>>> and track the work on this project:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Dan has noticed that the indenting is wrong in some of the 
>>>>>>>>> code quotes
>>>>>>>>> in the PDF that are not present in the internal wiki. We don't 
>>>>>>>>> have a
>>>>>>>>> solution for that problem yet.
>>>>>>>>>
>>>>>>>>> Here's the webrev for current JDK10 version of this fix:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>>>>>>>
>>>>>>>>> This fix has been run through many rounds of JPRT and Mach5 
>>>>>>>>> tier[2-5]
>>>>>>>>> testing, additional stress testing on Dan's Solaris X64 
>>>>>>>>> server, and
>>>>>>>>> additional testing on Erik and Robbin's machines.
>>>>>>>>>
>>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>>
>>>>>>>>> Daniel Daugherty
>>>>>>>>> Erik Osterlund
>>>>>>>>> Robbin Ehn
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>




More information about the hotspot-gc-dev mailing list