[jdk19] RFR: 8289091: move oop safety check from SharedRuntime::get_java_tid() to JavaThread::threadObj() [v2]
Daniel D. Daugherty
dcubed at openjdk.org
Wed Jun 29 18:43:38 UTC 2022
On Wed, 29 Jun 2022 02:40:41 GMT, David Holmes <dholmes at openjdk.org> wrote:
>> @dholmes-ora - Thanks for the re-review. I'm going to update the fix again and
>> switch back to using `Thread::current()` instead of `Thread::current_or_null()`.
>>
>> The reason that this particular handling of a secondary failure is important is
>> that a secondary failure in printing the name of the failing thread will prevent
>> entries for following threads from being printed in the thread list in the hs_err_pid
>> file.
>>
>> Here's an example to show placement:
>>
>> Java Threads: ( => current thread )
>> 0x00007fd64d808a00 JavaThread "Unknown thread" [_thread_blocked, id=7427, stack(0x0000700003c18000,0x0000700003d18000)]
>> 0x00007fd64e80d400 JavaThread "Unknown thread" [_thread_blocked, id=43011, stack(0x0000700004433000,0x0000700004533000)]
>> 0x00007fd64e80e200 JavaThread "Unknown thread" [_thread_blocked, id=42755, stack(0x0000700004536000,0x0000700004636000)]
>> 0x00007fd64e80e800 JavaThread "Unknown thread" [_thread_blocked, id=42243, stack(0x0000700004639000,0x0000700004739000)]
>> 0x00007fd64e80ee00 JavaThread "Unknown thread" [_thread_blocked, id=22787, stack(0x000070000473c000,0x000070000483c000)]
>> 0x00007fd64e80f400 JavaThread "Unknown thread" [_thread_blocked, id=41731, stack(0x000070000483f000,0x000070000493f000)]
>> 0x00007fd64e81e800 JavaThread "Unknown thread" [_thread_blocked, id=41219, stack(0x0000700004942000,0x0000700004a42000)]
>> 0x00007fd64e81ee00 JavaThread "Unknown thread" [_thread_blocked, id=23299, stack(0x0000700004a45000,0x0000700004b45000)]
>> 0x00007fd650819000 JavaThread "Unknown thread" [_thread_blocked, id=40451, stack(0x0000700004b48000,0x0000700004c48000)]
>> 0x00007fd64e851400 JavaThread "Unknown thread" [_thread_blocked, id=23811, stack(0x0000700004c4b000,0x0000700004d4b000)]
>> 0x00007fd64d84e200 JavaThread "Unknown thread" [_thread_blocked, id=24067, stack(0x0000700004e51000,0x0000700004f51000)]
>> 0x00007fd64e81c600 JavaThread "Unknown thread" [_thread_blocked, id=39171, stack(0x0000700004f54000,0x0000700005054000)] _threads_hazard_ptr=0x00007fd64e0041b0
>> 0x00007fd64d80d000 JavaThread "Unknown thread" [_thread_blocked, id=38915, stack(0x0000700005057000,0x0000700005157000)]
>> =>0x00007fd650812000 JavaThread "<no-name - current JavaThread has exited>" [_thread_in_vm, id=24835, stack(0x000070000515a000,0x000070000525a000)]
>>
>>
>> The entry prefixed with "=>" is the crashing thread that is past the GC barrier
>> detach point. In this example, it happens to be the last thread in the `Java Threads:
>> ( => current thread )` section of the hs_err_pid file. However, if the failing thread
>> happened to be earlier in the list and we didn't prevent the secondary error, then
>> we would be missing entries from the section.
>
>> The reason that this particular handling of a secondary failure is important is
> that a secondary failure in printing the name of the failing thread will prevent
> entries for following threads from being printed in the thread list in the hs_err_pid
> file.
>
> Yes I realise that a secondary crash loses some information, but my contention is that:
>
> a) the likelihood of crashing after detaching the GC barrier is very, very small; and
> b) in such a crash it is only the crashing thread that is really of interest, not the other threads in the system
>
> So to me trying to make secondary crash handling more robust in the current case is not worth the cost of the extra checks. If the checks were only in the crash reporting path then that would be okay, but not when they impact normal code execution. Sorry.
@dholmes-ora - I'm very glad that I resisted putting the check in `threadObj()`
when I made the changes for:
[JDK-8288139](https://bugs.openjdk.org/browse/JDK-8288139) JavaThread touches oop after GC barrier is detached
The review for that PR took too long and clearly this discussion would have made
it take longer. JDK-8288139 is fixed and Zhengyu can make progress on his issue.
I don't see a good way to make forward progress in this PR. It doesn't look like you
and I will reach a solution that's acceptable to both of us.
The only good news is that my many rounds of testing on this fix have convinced me
threadObj() is not being abused in the current code base except for the very narrow
case where we crash (for whatever reason) after the GC barrier is detached. Should
the GC team chose to put in an appropriate trap in their code that catches a thread
accessing an oop after the GC barrier is detached, then threadObj() will be automatically
sanity checked by that code.
Thanks for your time in doing several rounds of review.
-------------
PR: https://git.openjdk.org/jdk19/pull/69
More information about the hotspot-runtime-dev
mailing list