RFR: 8313796: AsyncGetCallTrace crash on unreadable interpreter method pointer [v4]

Tue Aug 8 13:47:42 UTC 2023

On Tue, 8 Aug 2023 12:44:35 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> @tstuefe @fisk I hadn't appreciated that the cause was probably concurrent method unloading. We don't have a core dump, just the backtrace from the crash and the disassembly from objdump, so all I knew was that the pointer was null, not why. This is not the sort of thing that reproduces readily. I don't have as much context about the adjacent JVM mechanisms as others in this thread and am just trying to fix a crash based on the evidence I have.
>> 
>> This pointer being null seems to be a symptom rather than a cause, and there doesn't appear to be anything we can do about concurrent method unloading interfering with AsyncGetCallTrace, so I wonder how worthwhile attempting to fix this is. On the one hand, it will still sometimes crash in another way. On the other hand, the window in which unloading a method can cause a crash would shrink significantly: it is currently the duration of the unwind preceding the current frame, and with the check it would be reduced to the subsequent usages of the pointer. Let me know what you think about proceeding and I'll submit a fix with the null check, which would have been sufficient to avoid the observed segfault. Thanks.
>
> @richardstartin I definitely think a patch in the proposed form - first check for NULL, then check again with SafeFetch - makes a lot of sense. Not perfect, but it will reduce the chance of crashes happening. And it is very simple and backportable.

Hardening the API is always a good idea, especially if it doesn't have a performance impact. We generally don't know the state in which ASGCT is called. I have added comparable checks in many places before.
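
For readers unfamiliar with the pattern, here is a minimal sketch of the two-stage check discussed above: a cheap null check first, then a guarded read. It is not the actual patch and does not use HotSpot's SafeFetch. `try_read_word`, `FakeMethod`, `method_pointer_looks_usable` and `map_unreadable_page` are hypothetical names made up for illustration. The guarded read is approximated with a pipe `write()`, assuming a Linux-like system where the kernel reports `EFAULT` for an unreadable source buffer instead of raising SIGSEGV.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical stand-in for a SafeFetch-style read (NOT the HotSpot code):
// ask the kernel to copy one word from `addr` into a pipe. On an unreadable
// page, write() is assumed to fail with EFAULT rather than fault the process.
static bool try_read_word(const void* addr, intptr_t* out) {
    assert(out != nullptr);
    int fds[2];
    if (pipe(fds) != 0) return false;  // probe unavailable: treat as unreadable
    const ssize_t n = static_cast<ssize_t>(sizeof(intptr_t));
    bool ok = write(fds[1], addr, sizeof(intptr_t)) == n &&
              read(fds[0], out, sizeof(intptr_t)) == n;
    close(fds[0]);
    close(fds[1]);
    return ok;
}

// Hypothetical model of the object an interpreter frame's method slot points to.
struct FakeMethod { intptr_t header; };

// Two-stage validation of a method pointer taken from a frame.
static bool method_pointer_looks_usable(const FakeMethod* m) {
    if (m == nullptr) return false;   // stage 1: the case seen in the crash
    intptr_t word;
    return try_read_word(m, &word);   // stage 2: pointer into unreadable memory
}

// Demo helper: one mapped but unreadable page, or nullptr if mmap fails.
static void* map_unreadable_page() {
    void* p = mmap(nullptr, static_cast<size_t>(sysconf(_SC_PAGESIZE)),
                   PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}
```

Under these assumptions, `method_pointer_looks_usable` accepts a pointer to a live `FakeMethod` and rejects both `nullptr` and a pointer into the page returned by `map_unreadable_page()`. As with the real patch, this only narrows the window: the memory can still go away between the check and a later use of the pointer.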

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15178#discussion_r1287146823