Fwd: RFR: 8233375: JFR emergency dump do not recover thread state
David Holmes
david.holmes at oracle.com
Tue Nov 5 04:56:42 UTC 2019
On 5/11/2019 1:56 pm, Yasumasa Suenaga wrote:
> On 2019/11/05 9:17, David Holmes wrote:
>> On 5/11/2019 9:56 am, Yasumasa Suenaga wrote:
>>> On 2019/11/04 22:43, Yasumasa Suenaga wrote:
>>>> Hi Markus,
>>>>
>>>> I thought similar change, and it is running on submit repo:
>>>>
>>>> http://hg.openjdk.java.net/jdk/submit/rev/76703c4ec1ea
>>>>
>>>> If it passes all tests, I will send review request again.
>>>
>>> This change passed all tests on submit repo
>>> (mach5-one-ysuenaga-JDK-8233375-2-20191104-1317-6405064).
>>> Could you review again?
>>>
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.02/
>>>
>>> In Markus's change, emergency dump will not perform when
>>> Thread::current_or_null_safe() returns NULL.
>>> However it will occur when coredump signal (e.g. SIGSEGV) throws to
>>> PID by `kill` command - main thread of the process will be already
>>> detached (out of JVM).
>>> Also the crash might happen in native thread - created by
>>> pthread_create (on Linux) from JNI code.
>>>
>>> Thus we should continue to perform emergency dump even if
>>> Thread::current_or_null_safe() returns NULL.
>>
>> I didn't quite follow all that, but if there is no current thread then
>> prepare_for_emergency_dump() is either going to assert here:
>>
>> 348 Thread* const thread = Thread::current();
>>
>> or crash here:
>>
>> 349 if (thread->is_Watcher_thread()) {
>
> Thanks David!
> I fixed it in new webrev:
>
> http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.03/
It would be cleaner/simpler if prepare_for_emergency_dump takes the
thread argument. As it is just a static function that doesn't impact
anything else.
Thanks,
David
> It works fine on submit repo
> (mach5-one-ysuenaga-JDK-8233375-2-20191105-0117-6422777).
>
>
> Yasumasa
>
>
>> David
>> -----
>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2019/11/04 22:23, Markus Gronlund wrote:
>>>>> Hi Yasumasa and David,
>>>>>
>>>>> Apologies about the suggestion to use ThreadInVMFromUnknown. I
>>>>> realized later that, as you have pointed out, it would perform a
>>>>> real thread transition. Sorry.
>>>>>
>>>>> Taking some input from ThreadInVMFromUnknown and the situations I
>>>>> have seen at this location, I think the only case we need to be
>>>>> concerned about here is when a JavaThread is _thread_in_native.
>>>>> _thread_in_java transition to _thread_in_vm via stubs in
>>>>> SharedRuntime (i believe) as part of coming out of the exception
>>>>> handler(s). Unfortunately I cannot give a proper argument now to
>>>>> give the premises where this invariant is enforced, so let's work
>>>>> with the original thread state as you suggested Yasumasa.
>>>>>
>>>>> If we can avoid passing the thread all the way through, I think
>>>>> that is preferable (this is not performance critical code). David
>>>>> also alluded to the fact that you always manipulate the current
>>>>> thread anyway. Although very unlikely, we could have run into an
>>>>> issue with thread local storage, so it makes sense to test this up
>>>>> front. If we cannot read the thread local, the operations we intend
>>>>> to perform will fail, so we might just bail out already.
>>>>>
>>>>> I took the liberty to tighten up the transition class a little bit;
>>>>> you only need to restore the thread state if there was an actual
>>>>> change.
>>>>>
>>>>> Perhaps we can do it like this?
>>>>>
>>>>> http://cr.openjdk.java.net/~mgronlun/8233375/webrev01/
>>>>>
>>>>> Thanks for your patience investigating this
>>>>>
>>>>> Markus
>>>>>
>>>>> -----Original Message-----
>>>>> From: David Holmes
>>>>> Sent: den 4 november 2019 05:24
>>>>> To: Yasumasa Suenaga <suenaga at oss.nttdata.com>; Markus Gronlund
>>>>> <markus.gronlund at oracle.com>; hotspot-jfr-dev at openjdk.java.net;
>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>> Cc: yasuenag at gmail.com
>>>>> Subject: Re: Fwd: RFR: 8233375: JFR emergency dump do not recover
>>>>> thread state
>>>>>
>>>>> So looking at Yasumasa's proposed fix ...
>>>>>
>>>>> I don't think it is worth the disruption to pass the "thread" all
>>>>> the way through these API's. It is simpler/cleaner to just call
>>>>> Thread::current_or_null_safe() when you need the current thread.
>>>>>
>>>>> 357 assert(thread->is_Java_thread() &&
>>>>> (((JavaThread*)thread)->thread_state() == _thread_in_vm),
>>>>> "invariant");
>>>>>
>>>>> This assertion is incorrect. As this can be called via
>>>>> VMError::report_or_die() there is AFAICS absolutely no guarantee
>>>>> that we need be in a JavaThread at all.
>>>>>
>>>>> 428 class ThreadInVMForJFR : public StackObj {
>>>>>
>>>>> Can I suggest JavaThreadInVM to make it clear this only affects
>>>>> JavaThreads. And as it is local we don't need the "forJFR" part.
>>>>>
>>>>> Based on Markus's proposed change, and with a view to constrain the
>>>>> scope even further can I suggest the following:
>>>>>
>>>>> if (!guard_reentrancy()) {
>>>>> return;
>>>>> } else {
>>>>> // Ensure a JavaThread is _thread_in_vm when we make this call
>>>>> JavaThreadInVM jtivm(Thread::current_or_null_safe());
>>>>> if (!prepare_for_emergency_dump()) {
>>>>> return;
>>>>> }
>>>>> }
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>
>>>>>
>>>>> On 4/11/2019 12:24 pm, David Holmes wrote:
>>>>>> Correction ...
>>>>>>
>>>>>> On 4/11/2019 12:11 pm, David Holmes wrote:
>>>>>>> On 4/11/2019 11:19 am, Yasumasa Suenaga wrote:
>>>>>>>> On 2019/11/04 7:38, David Holmes wrote:
>>>>>>>>> On 4/11/2019 2:22 am, Markus Gronlund wrote:
>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>
>>>>>>>>>> I think you can simplify it to something like this:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~mgronlun/8233375/webrev/
>>>>>>>>>
>>>>>>>>> That is more like I had envisaged for this. Reusing existing
>>>>>>>>> thread-state transition code is preferable to adding more custom
>>>>>>>>> code that directly manipulates thread-state.
>>>>>>>>
>>>>>>>> I do not agree with this change.
>>>>>>>>
>>>>>>>> VMError::report_and_die() has "Thread* thread" in its arguments. So
>>>>>>>> Thread::current() might be different with it.
>>>>>>>
>>>>>>> Not sure what you mean. You only ever manipulate the thread state of
>>>>>>> the current thread.
>>>>>>>
>>>>>>>> In addition, ThreadInVMfromUnknown uses transition_from_native() to
>>>>>>>> change the thread state.
>>>>>>>> It checks (and manipulates?) something which relates to safepoint.
>>>>>>>
>>>>>>> Yes it does - which would be a problem if a safepoint (or handshake)
>>>>>>> were pending. But the path through before_exit already has safepoint
>>>>>>> checks when you acquire the BeforeExit_lock.
>>>>>>
>>>>>> But that isn't relevant. The issue is we don't want a safepoint check
>>>>>> on the report_and_die() path. So a custom transition helper is needed
>>>>>> to avoid that.
>>>>>>
>>>>>> David
>>>>>>
>>>>>>> The main problem with the suggestion is it seems we may not be
>>>>>>> running in a JavaThread:
>>>>>>>
>>>>>>> 349 Thread* const thread = Thread::current();
>>>>>>> 350 if (thread->is_Watcher_thread()) {
>>>>>>>
>>>>>>> so we can't use the existing thread-state helpers, unless we narrow
>>>>>>> the scope (as you do) to after the check for the WatcherThread.
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> Thus I added ThreadInVMForJFR to new my webrev.
>>>>>>>
>>>>>>> Your change still seems overly complicated.
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Markus
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>>>>>>>> Sent: den 2 november 2019 16:57
>>>>>>>>>> To: hotspot-jfr-dev at openjdk.java.net;
>>>>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>>>>> Cc: David Holmes <david.holmes at oracle.com>; yasuenag at gmail.com;
>>>>>>>>>> Markus Gronlund <markus.gronlund at oracle.com>
>>>>>>>>>> Subject: Re: Fwd: RFR: 8233375: JFR emergency dump do not recover
>>>>>>>>>> thread state
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Markus commented in JBS this change should be kept local to JFR.
>>>>>>>>>> So I updated webrev. Could you review it?
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.01/
>>>>>>>>>>
>>>>>>>>>> This change passed all tests on submit repo
>>>>>>>>>> (mach5-one-ysuenaga-JDK-8233375-1-20191102-1354-6373703).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2019/11/01 18:41, Yasumasa Suenaga wrote:
>>>>>>>>>>> Forward to hotspot-runtime-dev.
>>>>>>>>>>>
>>>>>>>>>>> As David commented in JBS, it may need to be fixed in JFR code.
>>>>>>>>>>> But I'm not unclear why thread state is not recover.
>>>>>>>>>>>
>>>>>>>>>>> I'd like to hear about this from JFR folks.
>>>>>>>>>>> If it is just a bug in JFR, I will create a patch which recover
>>>>>>>>>>> it in JFR code.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -------- Forwarded Message --------
>>>>>>>>>>> Subject: RFR: 8233375: JFR emergency dump do not recover thread
>>>>>>>>>>> state
>>>>>>>>>>> Date: Fri, 1 Nov 2019 17:08:42 +0900
>>>>>>>>>>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>>>>>>>>> To: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>> CC: yasuenag at gmail.com <yasuenag at gmail.com>
>>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> Please review this change:
>>>>>>>>>>>
>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8233375
>>>>>>>>>>> webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.00/
>>>>>>>>>>>
>>>>>>>>>>> If JFR is running when JVM crashes, JFR will dump data to
>>>>>>>>>>> hs_err_pid<PID>.jfr .
>>>>>>>>>>> It would perform in prepare_for_emergency_dump().
>>>>>>>>>>> However this function transits thread state to "_thread_in_vm".
>>>>>>>>>>>
>>>>>>>>>>> This change has been tested on submit repo as
>>>>>>>>>>> mach5-one-ysuenaga-JDK-8233375-20191101-0651-6334762.
>>>>>>>>>>> It failed at compiler/types/correctness/CorrectnessTest.java
>>>>>>>>>>> However this test is for JIT compiler, and related issue has
>>>>>>>>>>> been
>>>>>>>>>>> reported as JDK-8225620.
>>>>>>>>>>> So I think this patch can go through.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
More information about the hotspot-runtime-dev
mailing list