RFR: 8244920: Access violation in frames::interpreter_frame_method

Erik Österlund erik.osterlund at oracle.com
Tue Jun 9 14:51:28 UTC 2020


Hi Coleen,

Thanks for the review.

/Erik

On 2020-06-08 17:01, coleen.phillimore at oracle.com wrote:
>
>
> On 6/8/20 10:15 AM, Erik Österlund wrote:
>> Hi Markus,
>>
>> Thanks for having a look at this.
>>
>> On 2020-06-08 16:01, Markus Gronlund wrote:
>>> Hi Erik,
>>>
>>> Here is my understanding of this situation:
>>>
>>> The thread's last java frame (_anchor._last_Java_sp) is already set 
>>> when the thread enters Deoptimization::uncommon_trap(), pointing to 
>>> the "UncommonTrapBlob", but the stack is still parsable / walkable.
>>>
>>> In fact, a lot of the comments in the code assert the stack to be 
>>> walkable during most of the unpacking process, for example in 
>>> AbstractInterpreter::layout_activation():
>>> ...
>>>    "// It is also guaranteed to be walkable even though it is in a 
>>> skeletal state"
>>>
>>> At least it asserts this before it starts to perform the 
>>> modifications to the frame.
>>
>> Right. The frames are "walkable", but not "parsable". The concrete 
>> problem is that the "skeletal state" it describes, does not contain a 
>> method pointer; that is to be subsequently unpacked to the frame. But 
>> the JFR code assumes that if the stack is walkable, then it is also 
>> parsable, which makes sense.
>>
>>> Since the thread is in thread state "thread_in_Java", I would say 
>>> this issue indicate a more general problem, affecting other 
>>> profilers as well, for example users of AsynchGetCallTrace.
>>>
>>> Your suggestion, although it will work, could have an impact on 
>>> profilers in that they will not get as many samples as they have got 
>>> previously (because we've disabled the stack walking during unpacking).
>
> This doesn't seem like a big window so wouldn't affect that many 
> samples.  Deoptimization should be pretty unusual and to sample while 
> unpacking frames doesn't give you very good information anyway (except 
> for it crashing).  It's better to skip this than try to work around 
> something that would need a lot of guard code to not crash.
>
> thanks,
> Coleen
>>>
>>> Perhaps it would be possible to solve this by doing something like:
>>>
>>> What if we hoist the read of the mirror to before the frame starts 
>>> to mutate, for example in AbstractInterpreter_x86.cpp:
>>>
>>> #ifdef ASSERT
>>>    assert(caller->sp() == interpreter_frame->sender_sp(), "Frame not 
>>> properly walkable");
>>> #endif
>>>
>>> 81 oop mirror = method->method_holder()->java_mirror() <<--- TO HERE
>>>
>>> Instead of doing late at a time where the frame has already started 
>>> to mutate ( currently the mirror is read at line 117). Could this 
>>> maybe be a solution that keeps the stack walkable?
>>>
>>> Do we know the exact failure, i.e. why the stack is not parsable at 
>>> line 117?
>>
>> Yeah, the method of an interpreter frame is not set as the frames are 
>> not yet initialized to have method pointers,
>> so when we ask it for its holder, we blow up. Therefore, moving 
>> reading  that mirror up won't really help AFAICT.
>>
>> This is really conceptually a leaf function; we are not capable of 
>> performing GC (expecting stack to be parsable).
>> It just happens to be that the frame is passed through 
>> last_Java_frame instead of as an argument to the function.
>> But I think it should be treated like a leaf function as much as 
>> possible, because that is what it really is. So like any
>> other leaf function, it should not be parsed, because it was not 
>> designed for that to work.
>>
>> Thanks,
>> /Erik
>>
>>> Thanks
>>> Markus
>>>
>>>
>>> -----Original Message-----
>>> From: Erik Österlund
>>> Sent: den 8 juni 2020 09:15
>>> To: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>
>>> Cc: Markus Gronlund <markus.gronlund at oracle.com>; Erik Gahlin 
>>> <erik.gahlin at oracle.com>
>>> Subject: RFR: 8244920: Access violation in 
>>> frames::interpreter_frame_method
>>>
>>> Hi,
>>>
>>> When we unpack interpreter frames due to deoptimization, we find 
>>> ourselves in a situation where the stack space has been allocated, 
>>> and the last_Java_frame has been set, but the contents of the frames 
>>> has not yet been populated. Any JFR event firing during this time 
>>> will think the stack can be parsed, as the last_Java_frame has been 
>>> set, and then fail doing so, when running into uninitialized stack 
>>> frames. After ZGC added an event that sometimes fires in load 
>>> barriers, we suddenly found ourselves in this awkward spot.
>>>
>>> I propose to clear the last_Java_frame after the top frame has been 
>>> acquired, and is no longer needed during the unpacking, so that such 
>>> events will see that we are in fact in a leaf call and should not 
>>> attempt to sample the stack.
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8244920
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~eosterlund/8244920/webrev.00/
>>>
>>> Thanks,
>>> /Erik
>>
>



More information about the hotspot-runtime-dev mailing list