Fwd: RFR: 8233375: JFR emergency dump do not recover thread state
David Holmes
david.holmes at oracle.com
Tue Nov 5 05:56:14 UTC 2019
On 5/11/2019 3:48 pm, Yasumasa Suenaga wrote:
> On 2019/11/05 14:34, David Holmes wrote:
>> On 5/11/2019 3:13 pm, Yasumasa Suenaga wrote:
>>> Hi David,
>>>
>>>> It would be cleaner/simpler if prepare_for_emergency_dump takes the
>>>> thread argument. As it is just a static function that doesn't impact
>>>> anything else.
>>>
>>> prepare_for_emergency_dump() returns false if some critical locks
>>> could not unlock.
>>> So what should we return if NULL is passed as the argument? true?
>>
>> But you're not calling prepare_for_emergency_dump when the thread is
>> NULL:
>>
>> 454 Thread* thread = Thread::current_or_null_safe();
>> 455
>> 456 // Ensure a JavaThread is _thread_in_vm when we make this call
>> 457 JavaThreadInVM jtivm(thread);
>> 458 if ((thread != NULL) && !prepare_for_emergency_dump()) {
>> 459 return;
>> 460 }
>
> Oh, sorry, I have a mistake!
> I want to change as below:
>
> ```
> + Thread* thread = Thread::current_or_null_safe();
> +
> + // Ensure a JavaThread is _thread_in_vm when we make this call
> + JavaThreadInVM jtivm(thread);
> + if (thread != NULL) {
> + if (!prepare_for_emergency_dump()) {
> + return;
> + }
> + }
> ```
but that is the same logic ??
>> All I'm saying is that you pass "thread" as a parameter so you can
>> then delete the existing call to Thread::current() that is inside
>> prepare_for_emergency_dump.
>
> Is it prefer pass "thread" to prepare_for_emergency_dump() instead of
> above?
??? The two are not related. If you've already obtained the current
thread you can pass it to prepare_for_emergency_dump and avoid the need
to call Thread:current() (in whatever form) again. How you handle a NULL
current thread is independent of that.
> If so, I will push new changeset to submit repo, and will send new
> review request.
I'd send the review request first and get agreement before wasting time
on the submit repo.
Thanks,
David
-----
>
> Yasumasa
>
>
>> David
>> -----
>>
>>> It might break semantics of this function, so I did not add argument
>>> to prepare_for_emergency_dump() in this webrev.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2019/11/05 13:56, David Holmes wrote:
>>>> On 5/11/2019 1:56 pm, Yasumasa Suenaga wrote:
>>>>> On 2019/11/05 9:17, David Holmes wrote:
>>>>>> On 5/11/2019 9:56 am, Yasumasa Suenaga wrote:
>>>>>>> On 2019/11/04 22:43, Yasumasa Suenaga wrote:
>>>>>>>> Hi Markus,
>>>>>>>>
>>>>>>>> I thought similar change, and it is running on submit repo:
>>>>>>>>
>>>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/76703c4ec1ea
>>>>>>>>
>>>>>>>> If it passes all tests, I will send review request again.
>>>>>>>
>>>>>>> This change passed all tests on submit repo
>>>>>>> (mach5-one-ysuenaga-JDK-8233375-2-20191104-1317-6405064).
>>>>>>> Could you review again?
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.02/
>>>>>>>
>>>>>>> In Markus's change, emergency dump will not perform when
>>>>>>> Thread::current_or_null_safe() returns NULL.
>>>>>>> However it will occur when coredump signal (e.g. SIGSEGV) throws
>>>>>>> to PID by `kill` command - main thread of the process will be
>>>>>>> already detached (out of JVM).
>>>>>>> Also the crash might happen in native thread - created by
>>>>>>> pthread_create (on Linux) from JNI code.
>>>>>>>
>>>>>>> Thus we should continue to perform emergency dump even if
>>>>>>> Thread::current_or_null_safe() returns NULL.
>>>>>>
>>>>>> I didn't quite follow all that, but if there is no current thread
>>>>>> then prepare_for_emergency_dump() is either going to assert here:
>>>>>>
>>>>>> 348 Thread* const thread = Thread::current();
>>>>>>
>>>>>> or crash here:
>>>>>>
>>>>>> 349 if (thread->is_Watcher_thread()) {
>>>>>
>>>>> Thanks David!
>>>>> I fixed it in new webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.03/
>>>>
>>>> It would be cleaner/simpler if prepare_for_emergency_dump takes the
>>>> thread argument. As it is just a static function that doesn't impact
>>>> anything else.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> It works fine on submit repo
>>>>> (mach5-one-ysuenaga-JDK-8233375-2-20191105-0117-6422777).
>>>>>
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/11/04 22:23, Markus Gronlund wrote:
>>>>>>>>> Hi Yasumasa and David,
>>>>>>>>>
>>>>>>>>> Apologies about the suggestion to use ThreadInVMFromUnknown. I
>>>>>>>>> realized later that, as you have pointed out, it would perform
>>>>>>>>> a real thread transition. Sorry.
>>>>>>>>>
>>>>>>>>> Taking some input from ThreadInVMFromUnknown and the situations
>>>>>>>>> I have seen at this location, I think the only case we need to
>>>>>>>>> be concerned about here is when a JavaThread is
>>>>>>>>> _thread_in_native. _thread_in_java transition to _thread_in_vm
>>>>>>>>> via stubs in SharedRuntime (i believe) as part of coming out of
>>>>>>>>> the exception handler(s). Unfortunately I cannot give a proper
>>>>>>>>> argument now to give the premises where this invariant is
>>>>>>>>> enforced, so let's work with the original thread state as you
>>>>>>>>> suggested Yasumasa.
>>>>>>>>>
>>>>>>>>> If we can avoid passing the thread all the way through, I think
>>>>>>>>> that is preferable (this is not performance critical code).
>>>>>>>>> David also alluded to the fact that you always manipulate the
>>>>>>>>> current thread anyway. Although very unlikely, we could have
>>>>>>>>> run into an issue with thread local storage, so it makes sense
>>>>>>>>> to test this up front. If we cannot read the thread local, the
>>>>>>>>> operations we intend to perform will fail, so we might just
>>>>>>>>> bail out already.
>>>>>>>>>
>>>>>>>>> I took the liberty to tighten up the transition class a little
>>>>>>>>> bit; you only need to restore the thread state if there was an
>>>>>>>>> actual change.
>>>>>>>>>
>>>>>>>>> Perhaps we can do it like this?
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~mgronlun/8233375/webrev01/
>>>>>>>>>
>>>>>>>>> Thanks for your patience investigating this
>>>>>>>>>
>>>>>>>>> Markus
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: David Holmes
>>>>>>>>> Sent: den 4 november 2019 05:24
>>>>>>>>> To: Yasumasa Suenaga <suenaga at oss.nttdata.com>; Markus Gronlund
>>>>>>>>> <markus.gronlund at oracle.com>; hotspot-jfr-dev at openjdk.java.net;
>>>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>>>> Cc: yasuenag at gmail.com
>>>>>>>>> Subject: Re: Fwd: RFR: 8233375: JFR emergency dump do not
>>>>>>>>> recover thread state
>>>>>>>>>
>>>>>>>>> So looking at Yasumasa's proposed fix ...
>>>>>>>>>
>>>>>>>>> I don't think it is worth the disruption to pass the "thread"
>>>>>>>>> all the way through these API's. It is simpler/cleaner to just
>>>>>>>>> call
>>>>>>>>> Thread::current_or_null_safe() when you need the current thread.
>>>>>>>>>
>>>>>>>>> 357 assert(thread->is_Java_thread() &&
>>>>>>>>> (((JavaThread*)thread)->thread_state() == _thread_in_vm),
>>>>>>>>> "invariant");
>>>>>>>>>
>>>>>>>>> This assertion is incorrect. As this can be called via
>>>>>>>>> VMError::report_or_die() there is AFAICS absolutely no
>>>>>>>>> guarantee that we need be in a JavaThread at all.
>>>>>>>>>
>>>>>>>>> 428 class ThreadInVMForJFR : public StackObj {
>>>>>>>>>
>>>>>>>>> Can I suggest JavaThreadInVM to make it clear this only affects
>>>>>>>>> JavaThreads. And as it is local we don't need the "forJFR" part.
>>>>>>>>>
>>>>>>>>> Based on Markus's proposed change, and with a view to constrain
>>>>>>>>> the scope even further can I suggest the following:
>>>>>>>>>
>>>>>>>>> if (!guard_reentrancy()) {
>>>>>>>>> return;
>>>>>>>>> } else {
>>>>>>>>> // Ensure a JavaThread is _thread_in_vm when we make this call
>>>>>>>>> JavaThreadInVM jtivm(Thread::current_or_null_safe());
>>>>>>>>> if (!prepare_for_emergency_dump()) {
>>>>>>>>> return;
>>>>>>>>> }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 4/11/2019 12:24 pm, David Holmes wrote:
>>>>>>>>>> Correction ...
>>>>>>>>>>
>>>>>>>>>> On 4/11/2019 12:11 pm, David Holmes wrote:
>>>>>>>>>>> On 4/11/2019 11:19 am, Yasumasa Suenaga wrote:
>>>>>>>>>>>> On 2019/11/04 7:38, David Holmes wrote:
>>>>>>>>>>>>> On 4/11/2019 2:22 am, Markus Gronlund wrote:
>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think you can simplify it to something like this:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~mgronlun/8233375/webrev/
>>>>>>>>>>>>>
>>>>>>>>>>>>> That is more like I had envisaged for this. Reusing existing
>>>>>>>>>>>>> thread-state transition code is preferable to adding more
>>>>>>>>>>>>> custom
>>>>>>>>>>>>> code that directly manipulates thread-state.
>>>>>>>>>>>>
>>>>>>>>>>>> I do not agree with this change.
>>>>>>>>>>>>
>>>>>>>>>>>> VMError::report_and_die() has "Thread* thread" in its
>>>>>>>>>>>> arguments. So
>>>>>>>>>>>> Thread::current() might be different with it.
>>>>>>>>>>>
>>>>>>>>>>> Not sure what you mean. You only ever manipulate the thread
>>>>>>>>>>> state of
>>>>>>>>>>> the current thread.
>>>>>>>>>>>
>>>>>>>>>>>> In addition, ThreadInVMfromUnknown uses
>>>>>>>>>>>> transition_from_native() to
>>>>>>>>>>>> change the thread state.
>>>>>>>>>>>> It checks (and manipulates?) something which relates to
>>>>>>>>>>>> safepoint.
>>>>>>>>>>>
>>>>>>>>>>> Yes it does - which would be a problem if a safepoint (or
>>>>>>>>>>> handshake)
>>>>>>>>>>> were pending. But the path through before_exit already has
>>>>>>>>>>> safepoint
>>>>>>>>>>> checks when you acquire the BeforeExit_lock.
>>>>>>>>>>
>>>>>>>>>> But that isn't relevant. The issue is we don't want a
>>>>>>>>>> safepoint check
>>>>>>>>>> on the report_and_die() path. So a custom transition helper is
>>>>>>>>>> needed
>>>>>>>>>> to avoid that.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> The main problem with the suggestion is it seems we may not be
>>>>>>>>>>> running in a JavaThread:
>>>>>>>>>>>
>>>>>>>>>>> 349 Thread* const thread = Thread::current();
>>>>>>>>>>> 350 if (thread->is_Watcher_thread()) {
>>>>>>>>>>>
>>>>>>>>>>> so we can't use the existing thread-state helpers, unless we
>>>>>>>>>>> narrow
>>>>>>>>>>> the scope (as you do) to after the check for the WatcherThread.
>>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>>> Thus I added ThreadInVMForJFR to new my webrev.
>>>>>>>>>>>
>>>>>>>>>>> Your change still seems overly complicated.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> Markus
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>>>>>>>>>>>> Sent: den 2 november 2019 16:57
>>>>>>>>>>>>>> To: hotspot-jfr-dev at openjdk.java.net;
>>>>>>>>>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>>>>>>>>>> Cc: David Holmes <david.holmes at oracle.com>;
>>>>>>>>>>>>>> yasuenag at gmail.com;
>>>>>>>>>>>>>> Markus Gronlund <markus.gronlund at oracle.com>
>>>>>>>>>>>>>> Subject: Re: Fwd: RFR: 8233375: JFR emergency dump do not
>>>>>>>>>>>>>> recover
>>>>>>>>>>>>>> thread state
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Markus commented in JBS this change should be kept local
>>>>>>>>>>>>>> to JFR.
>>>>>>>>>>>>>> So I updated webrev. Could you review it?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.01/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This change passed all tests on submit repo
>>>>>>>>>>>>>> (mach5-one-ysuenaga-JDK-8233375-1-20191102-1354-6373703).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2019/11/01 18:41, Yasumasa Suenaga wrote:
>>>>>>>>>>>>>>> Forward to hotspot-runtime-dev.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As David commented in JBS, it may need to be fixed in JFR
>>>>>>>>>>>>>>> code.
>>>>>>>>>>>>>>> But I'm not unclear why thread state is not recover.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'd like to hear about this from JFR folks.
>>>>>>>>>>>>>>> If it is just a bug in JFR, I will create a patch which
>>>>>>>>>>>>>>> recover
>>>>>>>>>>>>>>> it in JFR code.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -------- Forwarded Message --------
>>>>>>>>>>>>>>> Subject: RFR: 8233375: JFR emergency dump do not recover
>>>>>>>>>>>>>>> thread
>>>>>>>>>>>>>>> state
>>>>>>>>>>>>>>> Date: Fri, 1 Nov 2019 17:08:42 +0900
>>>>>>>>>>>>>>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>>>>>>>>>>>>>>> To: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>>>>>> CC: yasuenag at gmail.com <yasuenag at gmail.com>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please review this change:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8233375
>>>>>>>>>>>>>>> webrev:
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8233375/webrev.00/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If JFR is running when JVM crashes, JFR will dump data to
>>>>>>>>>>>>>>> hs_err_pid<PID>.jfr .
>>>>>>>>>>>>>>> It would perform in prepare_for_emergency_dump().
>>>>>>>>>>>>>>> However this function transits thread state to
>>>>>>>>>>>>>>> "_thread_in_vm".
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This change has been tested on submit repo as
>>>>>>>>>>>>>>> mach5-one-ysuenaga-JDK-8233375-20191101-0651-6334762.
>>>>>>>>>>>>>>> It failed at compiler/types/correctness/CorrectnessTest.java
>>>>>>>>>>>>>>> However this test is for JIT compiler, and related issue
>>>>>>>>>>>>>>> has been
>>>>>>>>>>>>>>> reported as JDK-8225620.
>>>>>>>>>>>>>>> So I think this patch can go through.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
More information about the hotspot-runtime-dev
mailing list