RFR(XL): 8167108 - SMR and JavaThread Lifecycle

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Tue Nov 21 17:24:11 UTC 2017



On 11/21/17 11:28 AM, Daniel D. Daugherty wrote:
> Hi Coleen!
>
> Thanks for making time to review the Thread-SMR stuff again!!
>
> I have added back the other three OpenJDK aliases... This review is
> being done on _four_ different OpenJDK aliases.
>
> As always, replies are embedded below...
>
>
> On 11/20/17 3:12 PM, coleen.phillimore at oracle.com wrote:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/src/hotspot/os/linux/os_linux.cpp.frames.html 
>>
>>
>> I read David's comments about the threads list iterator, and I was 
>> going to say that it can be cleaned up later, as the bulk of the 
>> change is the SMR part but this looks truly bizarre.   It looks like 
>> it shouldn't compile because 'jt' isn't in scope.
>>
>> Why isn't this the syntax:
>>
>> JavaThreadIteratorWithHandle jtiwh;
>> for (JavaThread* jt = jtiwh.first(); jt != NULL; jt = jtiwh.next()) {
>> }
>>
>> Which would do the obvious thing without anyone having to squint at 
>> the code.
>
> See my reply to David's review for the more detailed answer.
>
> For the above syntax, we would need braces to limit the scope of the
> 'jtiwh' variable. With Stefan's propsal, you get limited scope on
> 'jtiwh' for "free".

Yes, thank you, I see that motivation now.
>
>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/src/hotspot/share/runtime/threadSMR.hpp.html 
>>
>>
>> As a hater of acronmys, can this file use "Safe Memory Reclaimation"
>
> I'm not sure what you mean by this. Do you mean rename the files?
>
>   threadSMR.hpp -> threadSafeMemoryReclaimation.hpp
>   threadSMR.cpp -> threadSafeMemoryReclaimation.cpp
>

No.

>
>> and briefly describe the concept in the beginning of the header file,
>> so one knows why it's called threadSMR.hpp?
>
> And then this part of the sentence kind of indicates that you would be
> okay with the threadSMR.{c,h}pp names if a comment was added to the
> header file.
>
> Please clarify.

Yes.  If you added a comment to the beginning of the threadSMR.hpp file 
that said what SMR stood for and what it was, that would be extremely 
helpful for future viewers of this file.

>
>
>> It doesn't need to be long and include why Threads list needs this 
>> Sometimes we tell new people that the hotspot documentation is in the 
>> header files.
>
> Yup. When I migrated stuff from thread.hpp and thread.cpp to 
> threadSMR.hpp
> and threadSMR.cpp, I should have written a header comment...
>
> I did update a comment in thread.cpp based on Robin W's code review:

Yes, I think a comment belongs in the SMR file also, or in preference.
>
>> > src/hotspot/share/runtime/thread.cpp
>> >
>> > 3432 // operations over all threads.  It is protected by its own Mutex
>> > 3433 // lock, which is also used in other contexts to protect thread
>> >
>> > Should this comment perhaps be revised to mention SMR?
>>
>> It definitely needs some updating... Here's a stab at it:
>>
>> // The Threads class links together all active threads, and provides
>> // operations over all threads. It is protected by the Threads_lock,
>> // which is also used in other global contexts like safepointing.
>> // ThreadsListHandles are used to safely perform operations on one
>> // or more threads without the risk of the thread exiting during the
>> // operation.
>> //
>> // Note: The Threads_lock is currently more widely used than we
>> // would like. We are actively migrating Threads_lock uses to other
>> // mechanisms in order to reduce Threads_lock contention. 
>
> I'll take a look at adding a header comment to threadSMR.hpp.
>
>
>> 186   JavaThreadIteratorWithHandle() : _tlh(), _index(0) {}
>>
>> This _tlh() call should not be necessary.  The compiler should 
>> generate this for you in the constructor.
>
> Deleted.
>
>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/src/hotspot/share/runtime/threadSMR.cpp.html 
>>
>>
>>   32 ThreadsList::ThreadsList(int entries) : _length(entries), 
>> _threads(NEW_C_HEAP_ARRAY(JavaThread*, entries + 1, mtGC)), 
>> _next_list(NULL) {
>>
>> Seems like it should be mtThread rather than mtGC.
>
> Fixed. Definitely an artifact of Erik's original prototype when he
> extracted Thread-SMR from his GC work... Thanks for catching it.
>
>
>> Should
>>
>>   62   if (EnableThreadSMRStatistics) {
>>
>> really be UL, ie: if (log_is_enabled(Info, thread, smr, statistics)) ?
>
> EnableThreadSMRStatistics is used in more places than UL code.
> We use it in Thread::print_*() stuff to control output of
> Thread-SMR statistics info in thread dumps and hs_err_pid file
> generation.

That sort of thing is also controlled by logging in general.
>
> Currently thread dump and hs_err_pid file output is not generated
> using UL (and probably can't be?) so we need an option to control
> the Thread-SMR statistics stuff in all places.
>

You can use log options in preference to this new diagnostic option in 
these cases too.   This option looks a lot like logging to me and would 
be nice to not have to later convert it.

>
>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/test/hotspot/jtreg/runtime/ErrorHandling/NestedThreadsListHandleInErrorHandlingTest.java.html 
>>
>>
>> Can you use for these tests instead (there were a couple):
>>
>> *@requires (vm.debug == true)*
>
> The test I cloned had this in it:
>
>     if (!Platform.isDebugBuild()) {
>         // -XX:ErrorHandlerTest=N option requires debug bits.
>         return;
>     }
>
> and you would like me to switch to the newer mechanism?
>
> I have updated the following tests:
>
>   test/hotspot/jtreg/runtime/ErrorHandling/ErrorHandler.java
> test/hotspot/jtreg/runtime/ErrorHandling/NestedThreadsListHandleInErrorHandlingTest.java 
>
> test/hotspot/jtreg/runtime/ErrorHandling/ThreadsListHandleInErrorHandlingTest.java 
>
>
>
Yes, the newer mechanism makes jtreg not bother to run the test, which 
makes it faster!

>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/src/hotspot/share/runtime/thread.cpp.udiff.html 
>>
>>
>> +// thread, has been added the Threads list, the system is not at a
>>
>> Has "not" been added to the Threads list?  missing "not"?
>
> Nope. If the JavaThread has been added to the Threads list
> and it is not protected, then it is dangling. In other words,
> a published JavaThread (on the Threads list) has to be protected
> by either the system being at a safepoint or the JavaThread has
> to be on some threads's ThreadsList.
>
>
>>
>> + return (unsigned int)(((uint32_t)(uintptr_t)s1) * 2654435761u);
>>
>> Can you add a comment about where this number came from?
>
> I'll have to get that from Erik...
>
>
>> I can't find the caller of threads_do_smr.
>
> I'm guessing that's used by the GC code that depends on Thread-SMR...
>

They  should add this API when the add the use of it then.  I don't see 
it in any sources.

>
>> If these functions xchg_smr_thread_list, get_smr_java_thread_list, 
>> inc_smr_deleted_thread_count are only used by thread.cpp, I think 
>> they should go in that file and not in the .inline.hpp file to be 
>> included and possibly called by other files.  I think they're private 
>> to class Threads.
>
> I have a vague memory that some of the compilers don't do inlining when
> an "inline" function is in a .cpp. I believe we want these functions
> to be inlined for performance reasons. Erik should probably chime in
> here.

I don't see why this should be.  Maybe some (older) compilers require it 
to be found before the call though, but that can still be accomplished 
in the .cpp file.
>
>
>> I don't have any in-depth comments.  This looks really good from my 
>> day of reading it.
>
> Thanks for taking the time to review it again!
>

Thanks,
Coleen

> Dan
>
>
>>
>> Thanks,
>> Coleen
>>
>> On 11/18/17 8:49 PM, Daniel D. Daugherty wrote:
>>
>>> Greetings,
>>>
>>> Testing of the last round of changes revealed a hang in one of the new
>>> TLH tests. Robbin has fixed the hang, updated the existing TLH test, 
>>> and
>>> added another TLH test for good measure.
>>>
>>> Here is the updated full webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/
>>>
>>> Here is the updated delta webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-delta/
>>>
>>> Dan ran the bits thru the usual Mach5 tier[1-5] testing and there are
>>> no unexpected failures.
>>>
>>> We welcome comments, suggestions and feedback.
>>>
>>> Dan, Erik and Robbin
>>>
>>>
>>> On 11/15/17 3:32 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> Robbin rebased the project last night/this morning to merge with 
>>>> Thread
>>>> Local Handshakes (TLH) and also picked up additional changesets up 
>>>> thru:
>>>>
>>>>> Changeset: fa736014cf28
>>>>> Author:    cjplummer
>>>>> Date:      2017-11-14 18:08 -0800
>>>>> URL:http://hg.openjdk.java.net/jdk/hs/rev/fa736014cf28
>>>>>
>>>>> 8191049: Add alternate version of pns() that is callable from 
>>>>> within hotspot source
>>>>> Summary: added pns2() to debug.cpp
>>>>> Reviewed-by: stuefe, gthornbr
>>>>
>>>> This is the first time we've rebased the project to bits that are this
>>>> fresh (< 12 hours old at rebase time). We've done this because we 
>>>> think
>>>> we're done with this project and are in the final review-change-retest
>>>> cycle(s)... In other words, we're not planning on making any more 
>>>> major
>>>> changes for this project... :-)
>>>>
>>>> *** Time for code reviewers to chime in on this thread! ***
>>>>
>>>> Here is the updated full webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-full/
>>>>
>>>> Here's is the delta webrev from the 2017.11.10 rebase:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-delta/
>>>>
>>>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing
>>>> (and the new baseline also)...
>>>>
>>>> We're expecting this round to be a little noisier than usual because
>>>> we did not rebase on a PIT snapshot so the new baseline has not been
>>>> through Jesper's usual care-and-feeding of integration_blockers, etc.
>>>>
>>>> We welcome comments, suggestions and feedback.
>>>>
>>>> Dan, Erik and Robbin
>>>>
>>>>
>>>> On 11/14/17 4:48 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I rebased the project to the 2017.11.10 jdk/hs PIT snapshot.
>>>>> (Yes, we're playing chase-the-repo...)
>>>>>
>>>>> Here is the updated full webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-07-full/
>>>>>
>>>>> Unlike the previous rebase, there were no changes required to the
>>>>> open code to get the rebased bits to build so there is no delta
>>>>> webrev.
>>>>>
>>>>> These bits have been run through JPRT and I've submitted the usual
>>>>> Mach5 tier[1-5] test run...
>>>>>
>>>>> We welcome comments, suggestions and feedback.
>>>>>
>>>>> Dan, Erik and Robbin
>>>>>
>>>>>
>>>>> On 11/13/17 12:30 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.
>>>>>>
>>>>>> Here are the updated webrevs:
>>>>>>
>>>>>> Here's the mq comment for the change:
>>>>>>
>>>>>>   Rebase to 2017.10.25 PIT snapshot.
>>>>>>
>>>>>> Here is the full webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/
>>>>>>
>>>>>> And here is the delta webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/
>>>>>>
>>>>>> I ran the above bits throught Mach5 tier[1-5] testing over the 
>>>>>> holiday
>>>>>> weekend. Didn't see any issues in a quick look. Going to take a 
>>>>>> closer
>>>>>> look today.
>>>>>>
>>>>>> We welcome comments, suggestions and feedback.
>>>>>>
>>>>>> Dan, Erik and Robbin
>>>>>>
>>>>>>
>>>>>> On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> Resolving one of the code review comments (from both Stefan K 
>>>>>>> and Coleen)
>>>>>>> on jdk10-04-full required quite a few changes so it is being 
>>>>>>> done as a
>>>>>>> standalone patch and corresponding webrevs:
>>>>>>>
>>>>>>> Here's the mq comment for the change:
>>>>>>>
>>>>>>>   stefank, coleenp CR - refactor most JavaThreadIterator usage 
>>>>>>> to use
>>>>>>>     JavaThreadIteratorWithHandle.
>>>>>>>
>>>>>>> Here is the full webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>>>>>>>
>>>>>>> And here is the delta webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>>>>>>>
>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>
>>>>>>> Dan, Erik and Robbin
>>>>>>>
>>>>>>>
>>>>>>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> We have a (eXtra Large) fix for the following bug:
>>>>>>>>
>>>>>>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>>>>>>
>>>>>>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>>>>>>> Hazard Pointers to manage JavaThread lifecycle.
>>>>>>>>
>>>>>>>> Here's a PDF for the internal wiki that we've been using to 
>>>>>>>> describe
>>>>>>>> and track the work on this project:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf 
>>>>>>>>
>>>>>>>>
>>>>>>>> Dan has noticed that the indenting is wrong in some of the code 
>>>>>>>> quotes
>>>>>>>> in the PDF that are not present in the internal wiki. We don't 
>>>>>>>> have a
>>>>>>>> solution for that problem yet.
>>>>>>>>
>>>>>>>> Here's the webrev for current JDK10 version of this fix:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>>>>>>
>>>>>>>> This fix has been run through many rounds of JPRT and Mach5 
>>>>>>>> tier[2-5]
>>>>>>>> testing, additional stress testing on Dan's Solaris X64 server, 
>>>>>>>> and
>>>>>>>> additional testing on Erik and Robbin's machines.
>>>>>>>>
>>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>>
>>>>>>>> Daniel Daugherty
>>>>>>>> Erik Osterlund
>>>>>>>> Robbin Ehn
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list