Fwd: RFR: 8233375: JFR emergency dump do not recover thread state

David Holmes david.holmes at oracle.com
Tue Nov 5 00:17:38 UTC 2019


On 5/11/2019 9:56 am, Yasumasa Suenaga wrote:
> On 2019/11/04 22:43, Yasumasa Suenaga wrote:
>> Hi Markus,
>>
>> I thought similar change, and it is running on submit repo:
>>
>>    http://hg.openjdk.java.net/jdk/submit/rev/76703c4ec1ea
>>
>> If it passes all tests, I will send review request again.
> 
> This change passed all tests on submit repo 
> (mach5-one-ysuenaga-JDK-8233375-2-20191104-1317-6405064).
> Could you review again?
> 
>    http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.02/
> 
> In Markus's change, emergency dump will not perform when 
> Thread::current_or_null_safe() returns NULL.
> However it will occur when coredump signal (e.g. SIGSEGV) throws to PID 
> by `kill` command - main thread of the process will be already detached 
> (out of JVM).
> Also the crash might happen in native thread - created by pthread_create 
> (on Linux) from JNI code.
> 
> Thus we should continue to perform emergency dump even if 
> Thread::current_or_null_safe() returns NULL.

I didn't quite follow all that, but if there is no current thread then 
prepare_for_emergency_dump() is either going to assert here:

  348   Thread* const thread = Thread::current();

or crash here:

  349   if (thread->is_Watcher_thread()) {

David
-----

> 
> Thanks,
> 
> Yasumasa
> 
> 
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2019/11/04 22:23, Markus Gronlund wrote:
>>> Hi Yasumasa and David,
>>>
>>> Apologies about the suggestion to use ThreadInVMFromUnknown. I 
>>> realized later that, as you have pointed out, it would perform a real 
>>> thread transition. Sorry.
>>>
>>> Taking some input from ThreadInVMFromUnknown and the situations I 
>>> have seen at this location, I think the only case we need to be 
>>> concerned about here is when a JavaThread is _thread_in_native. 
>>> _thread_in_java transition to _thread_in_vm via stubs in 
>>> SharedRuntime (i believe) as part of coming out of the exception 
>>> handler(s). Unfortunately I cannot give a proper argument now to give 
>>> the premises where this invariant is enforced, so let's work with the 
>>> original thread state as you suggested Yasumasa.
>>>
>>> If we can avoid passing the thread all the way through, I think that 
>>> is preferable (this is not performance critical code). David also 
>>> alluded to the fact that you always manipulate the current thread 
>>> anyway. Although very unlikely, we could have run into an issue with 
>>> thread local storage, so it makes sense to test this up front. If we 
>>> cannot read the thread local, the operations we intend to perform 
>>> will fail, so we might just bail out already.
>>>
>>> I took the liberty to tighten up the transition class a little bit; 
>>> you only need to restore the thread state if there was an actual change.
>>>
>>> Perhaps we can do it like this?
>>>
>>> http://cr.openjdk.java.net/~mgronlun/8233375/webrev01/
>>>
>>> Thanks for your patience investigating this
>>>
>>> Markus
>>>
>>> -----Original Message-----
>>> From: David Holmes
>>> Sent: den 4 november 2019 05:24
>>> To: Yasumasa Suenaga <suenaga at oss.nttdata.com>; Markus Gronlund 
>>> <markus.gronlund at oracle.com>; hotspot-jfr-dev at openjdk.java.net; 
>>> hotspot-runtime-dev at openjdk.java.net
>>> Cc: yasuenag at gmail.com
>>> Subject: Re: Fwd: RFR: 8233375: JFR emergency dump do not recover 
>>> thread state
>>>
>>> So looking at Yasumasa's proposed fix ...
>>>
>>> I don't think it is worth the disruption to pass the "thread" all the 
>>> way through these API's. It is simpler/cleaner to just call
>>> Thread::current_or_null_safe() when you need the current thread.
>>>
>>> 357   assert(thread->is_Java_thread() &&
>>> (((JavaThread*)thread)->thread_state() == _thread_in_vm), "invariant");
>>>
>>> This assertion is incorrect. As this can be called via
>>> VMError::report_or_die() there is AFAICS absolutely no guarantee that 
>>> we need be in a JavaThread at all.
>>>
>>>    428 class ThreadInVMForJFR : public StackObj {
>>>
>>> Can I suggest JavaThreadInVM to make it clear this only affects 
>>> JavaThreads. And as it is local we don't need the "forJFR" part.
>>>
>>> Based on Markus's proposed change, and with a view to constrain the 
>>> scope even further can I suggest the following:
>>>
>>> if (!guard_reentrancy()) {
>>>     return;
>>> } else {
>>>     // Ensure a JavaThread is _thread_in_vm when we make this call
>>>     JavaThreadInVM jtivm(Thread::current_or_null_safe());
>>>     if (!prepare_for_emergency_dump()) {
>>>       return;
>>>     }
>>> }
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>
>>>
>>> On 4/11/2019 12:24 pm, David Holmes wrote:
>>>> Correction ...
>>>>
>>>> On 4/11/2019 12:11 pm, David Holmes wrote:
>>>>> On 4/11/2019 11:19 am, Yasumasa Suenaga wrote:
>>>>>> On 2019/11/04 7:38, David Holmes wrote:
>>>>>>> On 4/11/2019 2:22 am, Markus Gronlund wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> I think you can simplify it to something like this:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~mgronlun/8233375/webrev/
>>>>>>>
>>>>>>> That is more like I had envisaged for this. Reusing existing
>>>>>>> thread-state transition code is preferable to adding more custom
>>>>>>> code that directly manipulates thread-state.
>>>>>>
>>>>>> I do not agree with this change.
>>>>>>
>>>>>> VMError::report_and_die() has "Thread* thread" in its arguments. So
>>>>>> Thread::current() might be different with it.
>>>>>
>>>>> Not sure what you mean. You only ever manipulate the thread state of
>>>>> the current thread.
>>>>>
>>>>>> In addition, ThreadInVMfromUnknown uses transition_from_native() to
>>>>>> change the thread state.
>>>>>> It checks (and manipulates?) something which relates to safepoint.
>>>>>
>>>>> Yes it does - which would be a problem if a safepoint (or handshake)
>>>>> were pending. But the path through before_exit already has safepoint
>>>>> checks when you acquire the BeforeExit_lock.
>>>>
>>>> But that isn't relevant. The issue is we don't want a safepoint check
>>>> on the report_and_die() path. So a custom transition helper is needed
>>>> to avoid that.
>>>>
>>>> David
>>>>
>>>>> The main problem with the suggestion is it seems we may not be
>>>>> running in a JavaThread:
>>>>>
>>>>>    349   Thread* const thread = Thread::current();
>>>>>    350   if (thread->is_Watcher_thread()) {
>>>>>
>>>>> so we can't use the existing thread-state helpers, unless we narrow
>>>>> the scope (as you do) to after the check for the WatcherThread.
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> Thus I added ThreadInVMForJFR to new my webrev.
>>>>>
>>>>> Your change still seems overly complicated.
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Markus
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>>>>>> Sent: den 2 november 2019 16:57
>>>>>>>> To: hotspot-jfr-dev at openjdk.java.net;
>>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>>> Cc: David Holmes <david.holmes at oracle.com>; yasuenag at gmail.com;
>>>>>>>> Markus Gronlund <markus.gronlund at oracle.com>
>>>>>>>> Subject: Re: Fwd: RFR: 8233375: JFR emergency dump do not recover
>>>>>>>> thread state
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Markus commented in JBS this change should be kept local to JFR.
>>>>>>>> So I updated webrev. Could you review it?
>>>>>>>>
>>>>>>>>      http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.01/
>>>>>>>>
>>>>>>>> This change passed all tests on submit repo
>>>>>>>> (mach5-one-ysuenaga-JDK-8233375-1-20191102-1354-6373703).
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/11/01 18:41, Yasumasa Suenaga wrote:
>>>>>>>>> Forward to hotspot-runtime-dev.
>>>>>>>>>
>>>>>>>>> As David commented in JBS, it may need to be fixed in JFR code.
>>>>>>>>> But I'm not unclear why thread state is not recover.
>>>>>>>>>
>>>>>>>>> I'd like to hear about this from JFR folks.
>>>>>>>>> If it is just a bug in JFR, I will create a patch which recover
>>>>>>>>> it in JFR code.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -------- Forwarded Message --------
>>>>>>>>> Subject: RFR: 8233375: JFR emergency dump do not recover thread
>>>>>>>>> state
>>>>>>>>> Date: Fri, 1 Nov 2019 17:08:42 +0900
>>>>>>>>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>>>>>>> To: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>> CC: yasuenag at gmail.com <yasuenag at gmail.com>
>>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Please review this change:
>>>>>>>>>
>>>>>>>>>      JBS: https://bugs.openjdk.java.net/browse/JDK-8233375
>>>>>>>>>      webrev:
>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.00/
>>>>>>>>>
>>>>>>>>> If JFR is running when JVM crashes, JFR will dump data to
>>>>>>>>> hs_err_pid<PID>.jfr .
>>>>>>>>> It would perform in prepare_for_emergency_dump().
>>>>>>>>> However this function transits thread state to "_thread_in_vm".
>>>>>>>>>
>>>>>>>>> This change has been tested on submit repo as
>>>>>>>>> mach5-one-ysuenaga-JDK-8233375-20191101-0651-6334762.
>>>>>>>>> It failed at compiler/types/correctness/CorrectnessTest.java
>>>>>>>>> However this test is for JIT compiler, and related issue has been
>>>>>>>>> reported as JDK-8225620.
>>>>>>>>> So I think this patch can go through.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa


More information about the hotspot-jfr-dev mailing list