Fwd: RFR: 8233375: JFR emergency dump do not recover thread state

David Holmes david.holmes at oracle.com
Tue Nov 5 06:19:47 UTC 2019


On 5/11/2019 4:08 pm, Yasumasa Suenaga wrote:
> Hi David,
> 
> Sorry, I was confused :)
> This is new webrev. Could you check again?
> 
>    http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.04/

Okay structurally I'm fine with that.

What I don't know, and will leave to Markus to determine, is whether the 
rest of the code in on_vm_shutdown can actually execute okay if there is 
no current thread.

Thanks,
David

> 
> Yasumasa
> 
> 
> On 2019/11/05 14:56, David Holmes wrote:
>> On 5/11/2019 3:48 pm, Yasumasa Suenaga wrote:
>>> On 2019/11/05 14:34, David Holmes wrote:
>>>> On 5/11/2019 3:13 pm, Yasumasa Suenaga wrote:
>>>>> Hi David,
>>>>>
>>>>>> It would be cleaner/simpler if prepare_for_emergency_dump takes 
>>>>>> the thread argument. As it is just a static function that doesn't 
>>>>>> impact anything else.
>>>>>
>>>>> prepare_for_emergency_dump() returns false if some critical locks 
>>>>> could not unlock.
>>>>> So what should we return if NULL is passed as the argument? true?
>>>>
>>>> But you're not calling prepare_for_emergency_dump when the thread is 
>>>> NULL:
>>>>
>>>>   454   Thread* thread = Thread::current_or_null_safe();
>>>>   455
>>>>   456   // Ensure a JavaThread is _thread_in_vm when we make this call
>>>>   457   JavaThreadInVM jtivm(thread);
>>>>   458   if ((thread != NULL) && !prepare_for_emergency_dump()) {
>>>>   459     return;
>>>>   460   }
>>>
>>> Oh, sorry, I have a mistake!
>>> I want to change as below:
>>>
>>> ```
>>> +  Thread* thread = Thread::current_or_null_safe();
>>> +
>>> +  // Ensure a JavaThread is _thread_in_vm when we make this call
>>> +  JavaThreadInVM jtivm(thread);
>>> +  if (thread != NULL) {
>>> +    if (!prepare_for_emergency_dump()) {
>>> +      return;
>>> +    }
>>> +  }
>>> ```
>>
>> but that is the same logic ??
>>
>>>> All I'm saying is that you pass "thread" as a parameter so you can 
>>>> then delete the existing call to Thread::current() that is inside 
>>>> prepare_for_emergency_dump.
>>>
>>> Is it prefer pass "thread" to prepare_for_emergency_dump() instead of 
>>> above?
>>
>> ??? The two are not related. If you've already obtained the current 
>> thread you can pass it to prepare_for_emergency_dump and avoid the 
>> need to call Thread:current() (in whatever form) again. How you handle 
>> a NULL current thread is independent of that.
>>
>>> If so, I will push new changeset to submit repo, and will send new 
>>> review request.
>>
>> I'd send the review request first and get agreement before wasting 
>> time on the submit repo.
>>
>> Thanks,
>> David
>> -----
>>
>>>
>>> Yasumasa
>>>
>>>
>>>> David
>>>> -----
>>>>
>>>>> It might break semantics of this function, so I did not add 
>>>>> argument to prepare_for_emergency_dump() in this webrev.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2019/11/05 13:56, David Holmes wrote:
>>>>>> On 5/11/2019 1:56 pm, Yasumasa Suenaga wrote:
>>>>>>> On 2019/11/05 9:17, David Holmes wrote:
>>>>>>>> On 5/11/2019 9:56 am, Yasumasa Suenaga wrote:
>>>>>>>>> On 2019/11/04 22:43, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi Markus,
>>>>>>>>>>
>>>>>>>>>> I thought similar change, and it is running on submit repo:
>>>>>>>>>>
>>>>>>>>>>    http://hg.openjdk.java.net/jdk/submit/rev/76703c4ec1ea
>>>>>>>>>>
>>>>>>>>>> If it passes all tests, I will send review request again.
>>>>>>>>>
>>>>>>>>> This change passed all tests on submit repo 
>>>>>>>>> (mach5-one-ysuenaga-JDK-8233375-2-20191104-1317-6405064).
>>>>>>>>> Could you review again?
>>>>>>>>>
>>>>>>>>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.02/
>>>>>>>>>
>>>>>>>>> In Markus's change, emergency dump will not perform when 
>>>>>>>>> Thread::current_or_null_safe() returns NULL.
>>>>>>>>> However it will occur when coredump signal (e.g. SIGSEGV) 
>>>>>>>>> throws to PID by `kill` command - main thread of the process 
>>>>>>>>> will be already detached (out of JVM).
>>>>>>>>> Also the crash might happen in native thread - created by 
>>>>>>>>> pthread_create (on Linux) from JNI code.
>>>>>>>>>
>>>>>>>>> Thus we should continue to perform emergency dump even if 
>>>>>>>>> Thread::current_or_null_safe() returns NULL.
>>>>>>>>
>>>>>>>> I didn't quite follow all that, but if there is no current 
>>>>>>>> thread then prepare_for_emergency_dump() is either going to 
>>>>>>>> assert here:
>>>>>>>>
>>>>>>>>   348   Thread* const thread = Thread::current();
>>>>>>>>
>>>>>>>> or crash here:
>>>>>>>>
>>>>>>>>   349   if (thread->is_Watcher_thread()) {
>>>>>>>
>>>>>>> Thanks David!
>>>>>>> I fixed it in new webrev:
>>>>>>>
>>>>>>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.03/
>>>>>>
>>>>>> It would be cleaner/simpler if prepare_for_emergency_dump takes 
>>>>>> the thread argument. As it is just a static function that doesn't 
>>>>>> impact anything else.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> It works fine on submit repo 
>>>>>>> (mach5-one-ysuenaga-JDK-8233375-2-20191105-0117-6422777).
>>>>>>>
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2019/11/04 22:23, Markus Gronlund wrote:
>>>>>>>>>>> Hi Yasumasa and David,
>>>>>>>>>>>
>>>>>>>>>>> Apologies about the suggestion to use ThreadInVMFromUnknown. 
>>>>>>>>>>> I realized later that, as you have pointed out, it would 
>>>>>>>>>>> perform a real thread transition. Sorry.
>>>>>>>>>>>
>>>>>>>>>>> Taking some input from ThreadInVMFromUnknown and the 
>>>>>>>>>>> situations I have seen at this location, I think the only 
>>>>>>>>>>> case we need to be concerned about here is when a JavaThread 
>>>>>>>>>>> is _thread_in_native. _thread_in_java transition to 
>>>>>>>>>>> _thread_in_vm via stubs in SharedRuntime (i believe) as part 
>>>>>>>>>>> of coming out of the exception handler(s). Unfortunately I 
>>>>>>>>>>> cannot give a proper argument now to give the premises where 
>>>>>>>>>>> this invariant is enforced, so let's work with the original 
>>>>>>>>>>> thread state as you suggested Yasumasa.
>>>>>>>>>>>
>>>>>>>>>>> If we can avoid passing the thread all the way through, I 
>>>>>>>>>>> think that is preferable (this is not performance critical 
>>>>>>>>>>> code). David also alluded to the fact that you always 
>>>>>>>>>>> manipulate the current thread anyway. Although very unlikely, 
>>>>>>>>>>> we could have run into an issue with thread local storage, so 
>>>>>>>>>>> it makes sense to test this up front. If we cannot read the 
>>>>>>>>>>> thread local, the operations we intend to perform will fail, 
>>>>>>>>>>> so we might just bail out already.
>>>>>>>>>>>
>>>>>>>>>>> I took the liberty to tighten up the transition class a 
>>>>>>>>>>> little bit; you only need to restore the thread state if 
>>>>>>>>>>> there was an actual change.
>>>>>>>>>>>
>>>>>>>>>>> Perhaps we can do it like this?
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~mgronlun/8233375/webrev01/
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your patience investigating this
>>>>>>>>>>>
>>>>>>>>>>> Markus
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: David Holmes
>>>>>>>>>>> Sent: den 4 november 2019 05:24
>>>>>>>>>>> To: Yasumasa Suenaga <suenaga at oss.nttdata.com>; Markus 
>>>>>>>>>>> Gronlund <markus.gronlund at oracle.com>; 
>>>>>>>>>>> hotspot-jfr-dev at openjdk.java.net; 
>>>>>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>>>>>> Cc: yasuenag at gmail.com
>>>>>>>>>>> Subject: Re: Fwd: RFR: 8233375: JFR emergency dump do not 
>>>>>>>>>>> recover thread state
>>>>>>>>>>>
>>>>>>>>>>> So looking at Yasumasa's proposed fix ...
>>>>>>>>>>>
>>>>>>>>>>> I don't think it is worth the disruption to pass the "thread" 
>>>>>>>>>>> all the way through these API's. It is simpler/cleaner to 
>>>>>>>>>>> just call
>>>>>>>>>>> Thread::current_or_null_safe() when you need the current thread.
>>>>>>>>>>>
>>>>>>>>>>> 357   assert(thread->is_Java_thread() &&
>>>>>>>>>>> (((JavaThread*)thread)->thread_state() == _thread_in_vm), 
>>>>>>>>>>> "invariant");
>>>>>>>>>>>
>>>>>>>>>>> This assertion is incorrect. As this can be called via
>>>>>>>>>>> VMError::report_or_die() there is AFAICS absolutely no 
>>>>>>>>>>> guarantee that we need be in a JavaThread at all.
>>>>>>>>>>>
>>>>>>>>>>>    428 class ThreadInVMForJFR : public StackObj {
>>>>>>>>>>>
>>>>>>>>>>> Can I suggest JavaThreadInVM to make it clear this only 
>>>>>>>>>>> affects JavaThreads. And as it is local we don't need the 
>>>>>>>>>>> "forJFR" part.
>>>>>>>>>>>
>>>>>>>>>>> Based on Markus's proposed change, and with a view to 
>>>>>>>>>>> constrain the scope even further can I suggest the following:
>>>>>>>>>>>
>>>>>>>>>>> if (!guard_reentrancy()) {
>>>>>>>>>>>     return;
>>>>>>>>>>> } else {
>>>>>>>>>>>     // Ensure a JavaThread is _thread_in_vm when we make this 
>>>>>>>>>>> call
>>>>>>>>>>>     JavaThreadInVM jtivm(Thread::current_or_null_safe());
>>>>>>>>>>>     if (!prepare_for_emergency_dump()) {
>>>>>>>>>>>       return;
>>>>>>>>>>>     }
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> David
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 4/11/2019 12:24 pm, David Holmes wrote:
>>>>>>>>>>>> Correction ...
>>>>>>>>>>>>
>>>>>>>>>>>> On 4/11/2019 12:11 pm, David Holmes wrote:
>>>>>>>>>>>>> On 4/11/2019 11:19 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>> On 2019/11/04 7:38, David Holmes wrote:
>>>>>>>>>>>>>>> On 4/11/2019 2:22 am, Markus Gronlund wrote:
>>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think you can simplify it to something like this:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~mgronlun/8233375/webrev/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That is more like I had envisaged for this. Reusing existing
>>>>>>>>>>>>>>> thread-state transition code is preferable to adding more 
>>>>>>>>>>>>>>> custom
>>>>>>>>>>>>>>> code that directly manipulates thread-state.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I do not agree with this change.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> VMError::report_and_die() has "Thread* thread" in its 
>>>>>>>>>>>>>> arguments. So
>>>>>>>>>>>>>> Thread::current() might be different with it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Not sure what you mean. You only ever manipulate the thread 
>>>>>>>>>>>>> state of
>>>>>>>>>>>>> the current thread.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> In addition, ThreadInVMfromUnknown uses 
>>>>>>>>>>>>>> transition_from_native() to
>>>>>>>>>>>>>> change the thread state.
>>>>>>>>>>>>>> It checks (and manipulates?) something which relates to 
>>>>>>>>>>>>>> safepoint.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes it does - which would be a problem if a safepoint (or 
>>>>>>>>>>>>> handshake)
>>>>>>>>>>>>> were pending. But the path through before_exit already has 
>>>>>>>>>>>>> safepoint
>>>>>>>>>>>>> checks when you acquire the BeforeExit_lock.
>>>>>>>>>>>>
>>>>>>>>>>>> But that isn't relevant. The issue is we don't want a 
>>>>>>>>>>>> safepoint check
>>>>>>>>>>>> on the report_and_die() path. So a custom transition helper 
>>>>>>>>>>>> is needed
>>>>>>>>>>>> to avoid that.
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>> The main problem with the suggestion is it seems we may not be
>>>>>>>>>>>>> running in a JavaThread:
>>>>>>>>>>>>>
>>>>>>>>>>>>>    349   Thread* const thread = Thread::current();
>>>>>>>>>>>>>    350   if (thread->is_Watcher_thread()) {
>>>>>>>>>>>>>
>>>>>>>>>>>>> so we can't use the existing thread-state helpers, unless 
>>>>>>>>>>>>> we narrow
>>>>>>>>>>>>> the scope (as you do) to after the check for the 
>>>>>>>>>>>>> WatcherThread.
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>> -----
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thus I added ThreadInVMForJFR to new my webrev.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Your change still seems overly complicated.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> Markus
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>>>>>>>>>>>>>> Sent: den 2 november 2019 16:57
>>>>>>>>>>>>>>>> To: hotspot-jfr-dev at openjdk.java.net;
>>>>>>>>>>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>>>>>>>>>>> Cc: David Holmes <david.holmes at oracle.com>; 
>>>>>>>>>>>>>>>> yasuenag at gmail.com;
>>>>>>>>>>>>>>>> Markus Gronlund <markus.gronlund at oracle.com>
>>>>>>>>>>>>>>>> Subject: Re: Fwd: RFR: 8233375: JFR emergency dump do 
>>>>>>>>>>>>>>>> not recover
>>>>>>>>>>>>>>>> thread state
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Markus commented in JBS this change should be kept local 
>>>>>>>>>>>>>>>> to JFR.
>>>>>>>>>>>>>>>> So I updated webrev. Could you review it?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.01/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This change passed all tests on submit repo
>>>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8233375-1-20191102-1354-6373703).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 2019/11/01 18:41, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>>>> Forward to hotspot-runtime-dev.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As David commented in JBS, it may need to be fixed in 
>>>>>>>>>>>>>>>>> JFR code.
>>>>>>>>>>>>>>>>> But I'm not unclear why thread state is not recover.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'd like to hear about this from JFR folks.
>>>>>>>>>>>>>>>>> If it is just a bug in JFR, I will create a patch which 
>>>>>>>>>>>>>>>>> recover
>>>>>>>>>>>>>>>>> it in JFR code.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -------- Forwarded Message --------
>>>>>>>>>>>>>>>>> Subject: RFR: 8233375: JFR emergency dump do not 
>>>>>>>>>>>>>>>>> recover thread
>>>>>>>>>>>>>>>>> state
>>>>>>>>>>>>>>>>> Date: Fri, 1 Nov 2019 17:08:42 +0900
>>>>>>>>>>>>>>>>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>>>>>>>>>>>>>>> To: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>>>>>>>> CC: yasuenag at gmail.com <yasuenag at gmail.com>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>      JBS: https://bugs.openjdk.java.net/browse/JDK-8233375
>>>>>>>>>>>>>>>>>      webrev:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.00/ 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If JFR is running when JVM crashes, JFR will dump data to
>>>>>>>>>>>>>>>>> hs_err_pid<PID>.jfr .
>>>>>>>>>>>>>>>>> It would perform in prepare_for_emergency_dump().
>>>>>>>>>>>>>>>>> However this function transits thread state to 
>>>>>>>>>>>>>>>>> "_thread_in_vm".
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This change has been tested on submit repo as
>>>>>>>>>>>>>>>>> mach5-one-ysuenaga-JDK-8233375-20191101-0651-6334762.
>>>>>>>>>>>>>>>>> It failed at 
>>>>>>>>>>>>>>>>> compiler/types/correctness/CorrectnessTest.java
>>>>>>>>>>>>>>>>> However this test is for JIT compiler, and related 
>>>>>>>>>>>>>>>>> issue has been
>>>>>>>>>>>>>>>>> reported as JDK-8225620.
>>>>>>>>>>>>>>>>> So I think this patch can go through.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yasumasa


More information about the hotspot-jfr-dev mailing list