RFR: 8244920: Access violation in frames::interpreter_frame_method

Markus Gronlund markus.gronlund at oracle.com
Mon Jun 8 19:56:59 UTC 2020


Hi again Erik

I'm sorry, but my suggestion to fix this locally in JFR was premature. I remember now why we don't check for "thread->in_deopt_handler()" for the synchronous case (no longer): 
the reason is the Deoptimization event that is located inside the depot handler. We are able to get a good stacktrace at this location, because the "UncommonTrapBlob" is parsable.

It is the stub invoking Deoptimization::unpack_frames() that, as you say, sets the uninitialized frame as LJF as a means of passing parameters (I guess this is how thread->in_deopt_handler() came to be in the first place).

I think we could find a way, although not part of this change, to keep the stack parsable up until the new frame is fully unpacked / complete. For example, we could  have the "UncommonTrapBlob" stay as LJF until something is ready to be published. But this is of course a larger effort.

Your suggestion looks good.

Thanks
Markus




-----Original Message-----
From: Markus Gronlund 
Sent: den 8 juni 2020 17:50
To: Erik Osterlund <erik.osterlund at oracle.com>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>
Subject: RE: RFR: 8244920: Access violation in frames::interpreter_frame_method

Thanks for the explanation,

I did not see that the last java frame set when entering uncommon_trap(), and becoming parsable in uncommon_trap_inner(), is not the same last java frame set when entering unpack_frames(), having been reset in between.

I can also see that there is a missing check to early exit on "thread->in_deopt_handler()" as part of the JFR stack trace code (it is present for the asynchronous, but not for the synchronous case).

We could perform this change local to JFR by adding this check [1] or we can go with your suggestion. If we do the change in Deoptimization::unpack_frames(), then I think the comment should be made more general, since it affects asynchronous paths as well, i.e. all attempts to use the last java frame (having forget to use thread->in_deopt_handler()).

Thanks Erik for digging into this one.
Markus

[1] http://cr.openjdk.java.net/~mgronlun/8244920/webrev00/ 



-----Original Message-----
From: Erik Österlund
Sent: den 8 juni 2020 16:15
To: Markus Gronlund <markus.gronlund at oracle.com>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>
Cc: Erik Gahlin <erik.gahlin at oracle.com>
Subject: Re: RFR: 8244920: Access violation in frames::interpreter_frame_method

Hi Markus,

Thanks for having a look at this.

On 2020-06-08 16:01, Markus Gronlund wrote:
> Hi Erik,
>
> Here is my understanding of this situation:
>
> The thread's last java frame (_anchor._last_Java_sp) is already set when the thread enters Deoptimization::uncommon_trap(), pointing to the "UncommonTrapBlob", but the stack is still parsable / walkable.
>
> In fact, a lot of the comments in the code assert the stack to be walkable during most of the unpacking process, for example in AbstractInterpreter::layout_activation():
> ...
>    "// It is also guaranteed to be walkable even though it is in a skeletal state"
>
> At least it asserts this before it starts to perform the modifications to the frame.

Right. The frames are "walkable", but not "parsable". The concrete problem is that the "skeletal state" it describes, does not contain a method pointer; that is to be subsequently unpacked to the frame. But the JFR code assumes that if the stack is walkable, then it is also parsable, which makes sense.

> Since the thread is in thread state "thread_in_Java", I would say this issue indicate a more general problem, affecting other profilers as well, for example users of AsynchGetCallTrace.
>
> Your suggestion, although it will work, could have an impact on profilers in that they will not get as many samples as they have got previously (because we've disabled the stack walking during unpacking).
>
> Perhaps it would be possible to solve this by doing something like:
>
> What if we hoist the read of the mirror to before the frame starts to mutate, for example in AbstractInterpreter_x86.cpp:
>
> #ifdef ASSERT
>    assert(caller->sp() == interpreter_frame->sender_sp(), "Frame not 
> properly walkable"); #endif
>
> 81 oop mirror = method->method_holder()->java_mirror()  <<--- TO HERE
>
> Instead of doing late at a time where the frame has already started to mutate ( currently the mirror is read at line 117). Could this maybe be a solution that keeps the stack walkable?
>
> Do we know the exact failure, i.e. why the stack is not parsable at line 117?

Yeah, the method of an interpreter frame is not set as the frames are not yet initialized to have method pointers, so when we ask it for its holder, we blow up. Therefore, moving reading that mirror up won't really help AFAICT.

This is really conceptually a leaf function; we are not capable of performing GC (expecting stack to be parsable).
It just happens to be that the frame is passed through last_Java_frame instead of as an argument to the function.
But I think it should be treated like a leaf function as much as possible, because that is what it really is. So like any other leaf function, it should not be parsed, because it was not designed for that to work.

Thanks,
/Erik

> Thanks
> Markus
>
>
> -----Original Message-----
> From: Erik Österlund
> Sent: den 8 juni 2020 09:15
> To: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>
> Cc: Markus Gronlund <markus.gronlund at oracle.com>; Erik Gahlin 
> <erik.gahlin at oracle.com>
> Subject: RFR: 8244920: Access violation in 
> frames::interpreter_frame_method
>
> Hi,
>
> When we unpack interpreter frames due to deoptimization, we find ourselves in a situation where the stack space has been allocated, and the last_Java_frame has been set, but the contents of the frames has not yet been populated. Any JFR event firing during this time will think the stack can be parsed, as the last_Java_frame has been set, and then fail doing so, when running into uninitialized stack frames. After ZGC added an event that sometimes fires in load barriers, we suddenly found ourselves in this awkward spot.
>
> I propose to clear the last_Java_frame after the top frame has been acquired, and is no longer needed during the unpacking, so that such events will see that we are in fact in a leaf call and should not attempt to sample the stack.
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8244920
>
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8244920/webrev.00/
>
> Thanks,
> /Erik



More information about the hotspot-runtime-dev mailing list