RFR: 8282475: SafeFetch should not rely on existence of Thread::current [v6]

David Holmes david.holmes at oracle.com
Thu Mar 17 23:29:15 UTC 2022


On 12/03/2022 2:37 am, Anton Kozlov wrote:
> On Thu, 10 Mar 2022 18:04:50 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
> 
>> Blocking SIGSEGV and SIGBUS (or other synchronous error signals like SIGFPE) and then triggering said signal is UB. What happens is OS-dependent: I have seen processes vanish, hang, or dump core. That makes sense, since what is the kernel supposed to do? It cannot deliver the signal, and deferring it would require returning to the faulting instruction, which would just re-fault.
>> For some more details see e.g. https://bugs.openjdk.java.net/browse/JDK-8252533
> 
> This UB looks reasonable. My point is that a native thread would run fine with SIGSEGV blocked. But then the JVM decides it can do SafeFetch, and things get nasty.
> 
>>> Is there a crash that is fixed by the change? I just spotted it is an enhancement, not a bug. Just trying to understand the problem.
>>
>> Yes, this issue is a breakout from https://bugs.openjdk.java.net/browse/JDK-8282306, where we'd like to use SafeFetch to make stack walking in AsyncGetCallTrace more robust. AGCT is called from the signal handler, and it may run in any number of situations (e.g. in foreign threads, or threads which are in the process of getting dismantled, etc).
> 
> I mean, some way to verify the issue is fixed, e.g. a test that does not fail anymore.
> 
> I see that AsyncGetCallTrace assumes a JavaThread very early on, or am I looking in the wrong place? https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/forte.cpp#L569

It is up to the agent setting things up for AGCT to only actually call 
it for JavaThreads.

David
-----

>> Another situation is error handling itself. When writing an hs-err file, we use SafeFetch to carefully tiptoe around the possibly corrupt VM state. If the original crash happened in a foreign thread, we still want some of these reports to work (e.g. dumping register content or printing stacks). So SafeFetch should be as robust as possible.
> 
> OK, thanks. I think we also handle recursive segfaults, to recover after interpreting corrupted VM state. Otherwise, implementing the printing functions with SafeFetch alone would be too tedious and hard. But I see it's used in printing register content, at least.
> 
> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/7727
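
For reference, the situation discussed above (a native thread running with SIGSEGV blocked) can be detected from inside the thread before attempting anything that relies on a fault being delivered. The following is only a minimal sketch using plain POSIX calls; is_signal_blocked is a hypothetical helper name, not a HotSpot function, and this is not what the PR implements.

```cpp
#include <signal.h>
#include <pthread.h>

// Hypothetical helper (not part of HotSpot): reports whether the calling
// thread currently has signal `sig` blocked.
bool is_signal_blocked(int sig) {
    sigset_t current;
    sigemptyset(&current);
    // With a null "new set" argument, pthread_sigmask changes nothing and
    // only stores the calling thread's current mask in `current`.
    if (pthread_sigmask(SIG_BLOCK, nullptr, &current) != 0) {
        return false;  // mask could not be queried; treated as "not blocked"
    }
    return sigismember(&current, sig) == 1;
}
```

A caller could check is_signal_blocked(SIGSEGV) and is_signal_blocked(SIGBUS) and skip the risky read when either returns true, since faulting with the signal blocked is exactly the OS-dependent case described in the quoted text.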

