Fwd: RFR(S) 8231634: SA stack walking fails with "illegal bci"
Alex Menkov
alexey.menkov at oracle.com
Tue Apr 28 01:23:16 UTC 2020
Hi Chris,
great analysis.
LGTM++
--alex
On 04/27/2020 14:52, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
> The fix looks reasonable.
>
> Thanks,
> Serguei
>
>
> On 4/27/20 12:16, Chris Plummer wrote:
>> Ping! Please help review if you can. There's also an RFR out for
>> 8243500, which is related.
>>
>> thanks,
>>
>> Chris
>>
>> -------- Forwarded Message --------
>> Subject: RFR(S) 8231634: SA stack walking fails with "illegal bci"
>> Date: Thu, 23 Apr 2020 14:35:49 -0700
>> From: Chris Plummer <chris.plummer at oracle.com>
>> To: serviceability-dev <serviceability-dev at openjdk.java.net>
>>
>>
>>
>> Hello,
>>
>> Please help review the following:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8231634
>> http://cr.openjdk.java.net/~cjplummer/8231634/webrev.00/index.html
>>
>> The fix is fairly simple: Don't use R13 as the BCP if it does not
>> point into the methods actual bytecodes. Instead rely on the BCP value
>> flushed to the frame. Details below are mostly there to help you
>> understand the big picture of what is going on.
>>
>> First a bit a background on what the live_bcp field is in the
>> X86Frame. BCP is "bytecode pointer". It points to the current bytecode
>> being executed. I'm mostly just talking about interpreter frames here,
>> but I think for compiled frames it points to the current native
>> instruction being executed. For all but the current frame, frame->bcp
>> reliably contains the BCP. For the current frame you normally need to
>> look in a register for an up to date BCP. On x86_64 that's R13. When
>> trying to find and create the current frame for a thread, eventually
>> you end up in LinuxAMD64JavaThreadPDAccess.getCurrentFrameGuess(),
>> which does:
>>
>> // pass the value of R13 which contains the bcp for the top
>> level frame
>> Address bcp = context.getRegisterAsAddress(AMD64ThreadContext.R13);
>> return new X86Frame(guesser.getSP(), guesser.getFP(),
>> guesser.getPC(), null, bcp <-- live_bcp field);
>>
>> So in this case the X86Frame is constructed with R13 stored in the
>> live_bcp field. For any frame other than the current frame, the
>> live_bcp field will be NULL.
>>
>> And one other bit of clarification before moving on to the bug. You'll
>> see BCX in code. For example, INTERPRETER_FRAME_BCX_OFFSET. BCX ==
>> BCP. I'm not sure why BCX was ever chosen. I was tempted to rename it
>> to BCP, but it's been cloned to all the other SA CPU ports, so I think
>> it's best to just leave it.
>>
>> Now on to the bug. Sometimes the interpreter uses R13 as a scratch
>> register, so it does not always contain the BCP. This is why sometimes
>> tests will see an "illegal bci" assert from SA. It's because r13 is
>> currently not pointing to the BCP of the current frame, and thus can't
>> be converted to a BCI.
>>
>> What this fix does take advantage of the fact that when r13 is used as
>> a scratch register, it is first flushed to frame->bcp. So with this
>> fix R13 is only used as the BCP if it points within the Method's
>> bytecodes. If it does not, then SA will use the BCP flushed to
>> frame->bcp. The reason we don't always used frame->bcp is because it
>> is not kept up to date as the interpreter moves between bytecodes, but
>> we know it is up to date if R13 is currently being used as a scratch
>> register. So in other words, an invalid R13 implies that frame->bcp is
>> up to date.
>>
>> And if you find yourself thinking "X86Frame.java supports is for both
>> 32-bit and 64-bit x86, but R13 does not exists on 32-bit.", your
>> right. A couple of years ago the concept of live_bcp was introduced to
>> fix JDK-8214226 [1]. It only added support for fixing 64-bit and
>> ignored 32-bit. So you'll never see live_bcp getting set for 32-bit,
>> and thus JDK-8214226 is still broken on 32-bit, but it also means
>> 32-bit is not impacted by the bug I'm fixing here. So the fix for
>> JDK-8214226 is why you see R13 references in the X86Frame.java, but
>> the code will still work for any platform that sets up live_bcd
>> properly, even if it's not actually R13.
>>
>> ...and one more thing. 8214226 [1] actually introduced this "illegal
>> bci" bug. However 8214226 [1] was only ever applied to LinuxAMD64.
>> Never for BSD or Windows. This means two things: (1) the "illegal bci"
>> bug never turned up on these platforms, and (2) 8214226 is still
>> broken on these platform, meaning the bci returned by
>> getInterpreterFrameBCI() is likely often stale, meaning stack traces
>> will not show the proper line number for the current frame. It seems
>> we don't have a test to catch this. I've filed 8243500 [2] to cover
>> fixing 8214226 on BSD and Windows.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8214226
>> [2] https://bugs.openjdk.java.net/browse/JDK-8243500
>>
>> thanks,
>>
>> Chris
>>
>
More information about the serviceability-dev
mailing list