Fwd: RFR: 8233375: JFR emergency dump do not recover thread state

David Holmes david.holmes at oracle.com
Tue Nov 5 05:56:14 UTC 2019


On 5/11/2019 3:48 pm, Yasumasa Suenaga wrote:
> On 2019/11/05 14:34, David Holmes wrote:
>> On 5/11/2019 3:13 pm, Yasumasa Suenaga wrote:
>>> Hi David,
>>>
>>>> It would be cleaner/simpler if prepare_for_emergency_dump takes the 
>>>> thread argument. As it is just a static function that doesn't impact 
>>>> anything else.
>>>
>>> prepare_for_emergency_dump() returns false if some critical locks 
>>> could not unlock.
>>> So what should we return if NULL is passed as the argument? true?
>>
>> But you're not calling prepare_for_emergency_dump when the thread is 
>> NULL:
>>
>>   454   Thread* thread = Thread::current_or_null_safe();
>>   455
>>   456   // Ensure a JavaThread is _thread_in_vm when we make this call
>>   457   JavaThreadInVM jtivm(thread);
>>   458   if ((thread != NULL) && !prepare_for_emergency_dump()) {
>>   459     return;
>>   460   }
> 
> Oh, sorry, I have a mistake!
> I want to change as below:
> 
> ```
> +  Thread* thread = Thread::current_or_null_safe();
> +
> +  // Ensure a JavaThread is _thread_in_vm when we make this call
> +  JavaThreadInVM jtivm(thread);
> +  if (thread != NULL) {
> +    if (!prepare_for_emergency_dump()) {
> +      return;
> +    }
> +  }
> ```

but that is the same logic ??

>> All I'm saying is that you pass "thread" as a parameter so you can 
>> then delete the existing call to Thread::current() that is inside 
>> prepare_for_emergency_dump.
> 
> Is it prefer pass "thread" to prepare_for_emergency_dump() instead of 
> above?

??? The two are not related. If you've already obtained the current 
thread you can pass it to prepare_for_emergency_dump and avoid the need 
to call Thread:current() (in whatever form) again. How you handle a NULL 
current thread is independent of that.

> If so, I will push new changeset to submit repo, and will send new 
> review request.

I'd send the review request first and get agreement before wasting time 
on the submit repo.

Thanks,
David
-----

> 
> Yasumasa
> 
> 
>> David
>> -----
>>
>>> It might break semantics of this function, so I did not add argument 
>>> to prepare_for_emergency_dump() in this webrev.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2019/11/05 13:56, David Holmes wrote:
>>>> On 5/11/2019 1:56 pm, Yasumasa Suenaga wrote:
>>>>> On 2019/11/05 9:17, David Holmes wrote:
>>>>>> On 5/11/2019 9:56 am, Yasumasa Suenaga wrote:
>>>>>>> On 2019/11/04 22:43, Yasumasa Suenaga wrote:
>>>>>>>> Hi Markus,
>>>>>>>>
>>>>>>>> I thought similar change, and it is running on submit repo:
>>>>>>>>
>>>>>>>>    http://hg.openjdk.java.net/jdk/submit/rev/76703c4ec1ea
>>>>>>>>
>>>>>>>> If it passes all tests, I will send review request again.
>>>>>>>
>>>>>>> This change passed all tests on submit repo 
>>>>>>> (mach5-one-ysuenaga-JDK-8233375-2-20191104-1317-6405064).
>>>>>>> Could you review again?
>>>>>>>
>>>>>>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.02/
>>>>>>>
>>>>>>> In Markus's change, emergency dump will not perform when 
>>>>>>> Thread::current_or_null_safe() returns NULL.
>>>>>>> However it will occur when coredump signal (e.g. SIGSEGV) throws 
>>>>>>> to PID by `kill` command - main thread of the process will be 
>>>>>>> already detached (out of JVM).
>>>>>>> Also the crash might happen in native thread - created by 
>>>>>>> pthread_create (on Linux) from JNI code.
>>>>>>>
>>>>>>> Thus we should continue to perform emergency dump even if 
>>>>>>> Thread::current_or_null_safe() returns NULL.
>>>>>>
>>>>>> I didn't quite follow all that, but if there is no current thread 
>>>>>> then prepare_for_emergency_dump() is either going to assert here:
>>>>>>
>>>>>>   348   Thread* const thread = Thread::current();
>>>>>>
>>>>>> or crash here:
>>>>>>
>>>>>>   349   if (thread->is_Watcher_thread()) {
>>>>>
>>>>> Thanks David!
>>>>> I fixed it in new webrev:
>>>>>
>>>>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.03/
>>>>
>>>> It would be cleaner/simpler if prepare_for_emergency_dump takes the 
>>>> thread argument. As it is just a static function that doesn't impact 
>>>> anything else.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> It works fine on submit repo 
>>>>> (mach5-one-ysuenaga-JDK-8233375-2-20191105-0117-6422777).
>>>>>
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/11/04 22:23, Markus Gronlund wrote:
>>>>>>>>> Hi Yasumasa and David,
>>>>>>>>>
>>>>>>>>> Apologies about the suggestion to use ThreadInVMFromUnknown. I 
>>>>>>>>> realized later that, as you have pointed out, it would perform 
>>>>>>>>> a real thread transition. Sorry.
>>>>>>>>>
>>>>>>>>> Taking some input from ThreadInVMFromUnknown and the situations 
>>>>>>>>> I have seen at this location, I think the only case we need to 
>>>>>>>>> be concerned about here is when a JavaThread is 
>>>>>>>>> _thread_in_native. _thread_in_java transition to _thread_in_vm 
>>>>>>>>> via stubs in SharedRuntime (i believe) as part of coming out of 
>>>>>>>>> the exception handler(s). Unfortunately I cannot give a proper 
>>>>>>>>> argument now to give the premises where this invariant is 
>>>>>>>>> enforced, so let's work with the original thread state as you 
>>>>>>>>> suggested Yasumasa.
>>>>>>>>>
>>>>>>>>> If we can avoid passing the thread all the way through, I think 
>>>>>>>>> that is preferable (this is not performance critical code). 
>>>>>>>>> David also alluded to the fact that you always manipulate the 
>>>>>>>>> current thread anyway. Although very unlikely, we could have 
>>>>>>>>> run into an issue with thread local storage, so it makes sense 
>>>>>>>>> to test this up front. If we cannot read the thread local, the 
>>>>>>>>> operations we intend to perform will fail, so we might just 
>>>>>>>>> bail out already.
>>>>>>>>>
>>>>>>>>> I took the liberty to tighten up the transition class a little 
>>>>>>>>> bit; you only need to restore the thread state if there was an 
>>>>>>>>> actual change.
>>>>>>>>>
>>>>>>>>> Perhaps we can do it like this?
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~mgronlun/8233375/webrev01/
>>>>>>>>>
>>>>>>>>> Thanks for your patience investigating this
>>>>>>>>>
>>>>>>>>> Markus
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: David Holmes
>>>>>>>>> Sent: den 4 november 2019 05:24
>>>>>>>>> To: Yasumasa Suenaga <suenaga at oss.nttdata.com>; Markus Gronlund 
>>>>>>>>> <markus.gronlund at oracle.com>; hotspot-jfr-dev at openjdk.java.net; 
>>>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>>>> Cc: yasuenag at gmail.com
>>>>>>>>> Subject: Re: Fwd: RFR: 8233375: JFR emergency dump do not 
>>>>>>>>> recover thread state
>>>>>>>>>
>>>>>>>>> So looking at Yasumasa's proposed fix ...
>>>>>>>>>
>>>>>>>>> I don't think it is worth the disruption to pass the "thread" 
>>>>>>>>> all the way through these API's. It is simpler/cleaner to just 
>>>>>>>>> call
>>>>>>>>> Thread::current_or_null_safe() when you need the current thread.
>>>>>>>>>
>>>>>>>>> 357   assert(thread->is_Java_thread() &&
>>>>>>>>> (((JavaThread*)thread)->thread_state() == _thread_in_vm), 
>>>>>>>>> "invariant");
>>>>>>>>>
>>>>>>>>> This assertion is incorrect. As this can be called via
>>>>>>>>> VMError::report_or_die() there is AFAICS absolutely no 
>>>>>>>>> guarantee that we need be in a JavaThread at all.
>>>>>>>>>
>>>>>>>>>    428 class ThreadInVMForJFR : public StackObj {
>>>>>>>>>
>>>>>>>>> Can I suggest JavaThreadInVM to make it clear this only affects 
>>>>>>>>> JavaThreads. And as it is local we don't need the "forJFR" part.
>>>>>>>>>
>>>>>>>>> Based on Markus's proposed change, and with a view to constrain 
>>>>>>>>> the scope even further can I suggest the following:
>>>>>>>>>
>>>>>>>>> if (!guard_reentrancy()) {
>>>>>>>>>     return;
>>>>>>>>> } else {
>>>>>>>>>     // Ensure a JavaThread is _thread_in_vm when we make this call
>>>>>>>>>     JavaThreadInVM jtivm(Thread::current_or_null_safe());
>>>>>>>>>     if (!prepare_for_emergency_dump()) {
>>>>>>>>>       return;
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 4/11/2019 12:24 pm, David Holmes wrote:
>>>>>>>>>> Correction ...
>>>>>>>>>>
>>>>>>>>>> On 4/11/2019 12:11 pm, David Holmes wrote:
>>>>>>>>>>> On 4/11/2019 11:19 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>> On 2019/11/04 7:38, David Holmes wrote:
>>>>>>>>>>>>> On 4/11/2019 2:22 am, Markus Gronlund wrote:
>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think you can simplify it to something like this:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~mgronlun/8233375/webrev/
>>>>>>>>>>>>>
>>>>>>>>>>>>> That is more like I had envisaged for this. Reusing existing
>>>>>>>>>>>>> thread-state transition code is preferable to adding more 
>>>>>>>>>>>>> custom
>>>>>>>>>>>>> code that directly manipulates thread-state.
>>>>>>>>>>>>
>>>>>>>>>>>> I do not agree with this change.
>>>>>>>>>>>>
>>>>>>>>>>>> VMError::report_and_die() has "Thread* thread" in its 
>>>>>>>>>>>> arguments. So
>>>>>>>>>>>> Thread::current() might be different with it.
>>>>>>>>>>>
>>>>>>>>>>> Not sure what you mean. You only ever manipulate the thread 
>>>>>>>>>>> state of
>>>>>>>>>>> the current thread.
>>>>>>>>>>>
>>>>>>>>>>>> In addition, ThreadInVMfromUnknown uses 
>>>>>>>>>>>> transition_from_native() to
>>>>>>>>>>>> change the thread state.
>>>>>>>>>>>> It checks (and manipulates?) something which relates to 
>>>>>>>>>>>> safepoint.
>>>>>>>>>>>
>>>>>>>>>>> Yes it does - which would be a problem if a safepoint (or 
>>>>>>>>>>> handshake)
>>>>>>>>>>> were pending. But the path through before_exit already has 
>>>>>>>>>>> safepoint
>>>>>>>>>>> checks when you acquire the BeforeExit_lock.
>>>>>>>>>>
>>>>>>>>>> But that isn't relevant. The issue is we don't want a 
>>>>>>>>>> safepoint check
>>>>>>>>>> on the report_and_die() path. So a custom transition helper is 
>>>>>>>>>> needed
>>>>>>>>>> to avoid that.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> The main problem with the suggestion is it seems we may not be
>>>>>>>>>>> running in a JavaThread:
>>>>>>>>>>>
>>>>>>>>>>>    349   Thread* const thread = Thread::current();
>>>>>>>>>>>    350   if (thread->is_Watcher_thread()) {
>>>>>>>>>>>
>>>>>>>>>>> so we can't use the existing thread-state helpers, unless we 
>>>>>>>>>>> narrow
>>>>>>>>>>> the scope (as you do) to after the check for the WatcherThread.
>>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>>> Thus I added ThreadInVMForJFR to new my webrev.
>>>>>>>>>>>
>>>>>>>>>>> Your change still seems overly complicated.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> Markus
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>>>>>>>>>>>> Sent: den 2 november 2019 16:57
>>>>>>>>>>>>>> To: hotspot-jfr-dev at openjdk.java.net;
>>>>>>>>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>>>>>>>>> Cc: David Holmes <david.holmes at oracle.com>; 
>>>>>>>>>>>>>> yasuenag at gmail.com;
>>>>>>>>>>>>>> Markus Gronlund <markus.gronlund at oracle.com>
>>>>>>>>>>>>>> Subject: Re: Fwd: RFR: 8233375: JFR emergency dump do not 
>>>>>>>>>>>>>> recover
>>>>>>>>>>>>>> thread state
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Markus commented in JBS this change should be kept local 
>>>>>>>>>>>>>> to JFR.
>>>>>>>>>>>>>> So I updated webrev. Could you review it?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.01/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This change passed all tests on submit repo
>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8233375-1-20191102-1354-6373703).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2019/11/01 18:41, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> Forward to hotspot-runtime-dev.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As David commented in JBS, it may need to be fixed in JFR 
>>>>>>>>>>>>>>> code.
>>>>>>>>>>>>>>> But I'm not unclear why thread state is not recover.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'd like to hear about this from JFR folks.
>>>>>>>>>>>>>>> If it is just a bug in JFR, I will create a patch which 
>>>>>>>>>>>>>>> recover
>>>>>>>>>>>>>>> it in JFR code.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -------- Forwarded Message --------
>>>>>>>>>>>>>>> Subject: RFR: 8233375: JFR emergency dump do not recover 
>>>>>>>>>>>>>>> thread
>>>>>>>>>>>>>>> state
>>>>>>>>>>>>>>> Date: Fri, 1 Nov 2019 17:08:42 +0900
>>>>>>>>>>>>>>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>>>>>>>>>>>>> To: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>>>>>> CC: yasuenag at gmail.com <yasuenag at gmail.com>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>      JBS: https://bugs.openjdk.java.net/browse/JDK-8233375
>>>>>>>>>>>>>>>      webrev:
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.00/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If JFR is running when JVM crashes, JFR will dump data to
>>>>>>>>>>>>>>> hs_err_pid<PID>.jfr .
>>>>>>>>>>>>>>> It would perform in prepare_for_emergency_dump().
>>>>>>>>>>>>>>> However this function transits thread state to 
>>>>>>>>>>>>>>> "_thread_in_vm".
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This change has been tested on submit repo as
>>>>>>>>>>>>>>> mach5-one-ysuenaga-JDK-8233375-20191101-0651-6334762.
>>>>>>>>>>>>>>> It failed at compiler/types/correctness/CorrectnessTest.java
>>>>>>>>>>>>>>> However this test is for JIT compiler, and related issue 
>>>>>>>>>>>>>>> has been
>>>>>>>>>>>>>>> reported as JDK-8225620.
>>>>>>>>>>>>>>> So I think this patch can go through.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa


More information about the hotspot-runtime-dev mailing list