RFR: 8244920: Access violation in frames::interpreter_frame_method
Erik Österlund
erik.osterlund at oracle.com
Tue Jun 9 14:51:28 UTC 2020
Hi Coleen,
Thanks for the review.
/Erik
On 2020-06-08 17:01, coleen.phillimore at oracle.com wrote:
>
>
> On 6/8/20 10:15 AM, Erik Österlund wrote:
>> Hi Markus,
>>
>> Thanks for having a look at this.
>>
>> On 2020-06-08 16:01, Markus Gronlund wrote:
>>> Hi Erik,
>>>
>>> Here is my understanding of this situation:
>>>
>>> The thread's last java frame (_anchor._last_Java_sp) is already set
>>> when the thread enters Deoptimization::uncommon_trap(), pointing to
>>> the "UncommonTrapBlob", but the stack is still parsable / walkable.
>>>
>>> In fact, a lot of the comments in the code assert the stack to be
>>> walkable during most of the unpacking process, for example in
>>> AbstractInterpreter::layout_activation():
>>> ...
>>> "// It is also guaranteed to be walkable even though it is in a
>>> skeletal state"
>>>
>>> At least it asserts this before it starts to perform the
>>> modifications to the frame.
>>
>> Right. The frames are "walkable", but not "parsable". The concrete
>> problem is that the "skeletal state" it describes, does not contain a
>> method pointer; that is to be subsequently unpacked to the frame. But
>> the JFR code assumes that if the stack is walkable, then it is also
>> parsable, which makes sense.
>>
>>> Since the thread is in thread state "thread_in_Java", I would say
>>> this issue indicate a more general problem, affecting other
>>> profilers as well, for example users of AsynchGetCallTrace.
>>>
>>> Your suggestion, although it will work, could have an impact on
>>> profilers in that they will not get as many samples as they have got
>>> previously (because we've disabled the stack walking during unpacking).
>
> This doesn't seem like a big window so wouldn't affect that many
> samples. Deoptimization should be pretty unusual and to sample while
> unpacking frames doesn't give you very good information anyway (except
> for it crashing). It's better to skip this than try to work around
> something that would need a lot of guard code to not crash.
>
> thanks,
> Coleen
>>>
>>> Perhaps it would be possible to solve this by doing something like:
>>>
>>> What if we hoist the read of the mirror to before the frame starts
>>> to mutate, for example in AbstractInterpreter_x86.cpp:
>>>
>>> #ifdef ASSERT
>>> assert(caller->sp() == interpreter_frame->sender_sp(), "Frame not
>>> properly walkable");
>>> #endif
>>>
>>> 81 oop mirror = method->method_holder()->java_mirror() <<--- TO HERE
>>>
>>> Instead of doing late at a time where the frame has already started
>>> to mutate ( currently the mirror is read at line 117). Could this
>>> maybe be a solution that keeps the stack walkable?
>>>
>>> Do we know the exact failure, i.e. why the stack is not parsable at
>>> line 117?
>>
>> Yeah, the method of an interpreter frame is not set as the frames are
>> not yet initialized to have method pointers,
>> so when we ask it for its holder, we blow up. Therefore, moving
>> reading that mirror up won't really help AFAICT.
>>
>> This is really conceptually a leaf function; we are not capable of
>> performing GC (expecting stack to be parsable).
>> It just happens to be that the frame is passed through
>> last_Java_frame instead of as an argument to the function.
>> But I think it should be treated like a leaf function as much as
>> possible, because that is what it really is. So like any
>> other leaf function, it should not be parsed, because it was not
>> designed for that to work.
>>
>> Thanks,
>> /Erik
>>
>>> Thanks
>>> Markus
>>>
>>>
>>> -----Original Message-----
>>> From: Erik Österlund
>>> Sent: den 8 juni 2020 09:15
>>> To: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>
>>> Cc: Markus Gronlund <markus.gronlund at oracle.com>; Erik Gahlin
>>> <erik.gahlin at oracle.com>
>>> Subject: RFR: 8244920: Access violation in
>>> frames::interpreter_frame_method
>>>
>>> Hi,
>>>
>>> When we unpack interpreter frames due to deoptimization, we find
>>> ourselves in a situation where the stack space has been allocated,
>>> and the last_Java_frame has been set, but the contents of the frames
>>> has not yet been populated. Any JFR event firing during this time
>>> will think the stack can be parsed, as the last_Java_frame has been
>>> set, and then fail doing so, when running into uninitialized stack
>>> frames. After ZGC added an event that sometimes fires in load
>>> barriers, we suddenly found ourselves in this awkward spot.
>>>
>>> I propose to clear the last_Java_frame after the top frame has been
>>> acquired, and is no longer needed during the unpacking, so that such
>>> events will see that we are in fact in a leaf call and should not
>>> attempt to sample the stack.
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8244920
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~eosterlund/8244920/webrev.00/
>>>
>>> Thanks,
>>> /Erik
>>
>
More information about the hotspot-runtime-dev
mailing list