RIP values like 0xffffffff94bf7f80 due to patched NMethod

Vladimir Kozlov vladimir.kozlov at oracle.com
Wed Nov 14 03:09:38 UTC 2018


Hi Alexander

First, I would suggest filing a bug report with the hs_err files attached:

https://bugreport.java.com/bugreport/

The code in question is the code which patches the nmethod's entry point (links to JDK 12 code) [1][2], as you pointed out.

There was a bug fixed in JDK 10 where a newly generated nmethod could be deoptimized before it was even published [3]. Note,
that race should already be fixed in JDK 8 by [4].

But you see the issue even with JDK11 where these bugs are fixed.

And I don't see where the 0x90 'nop' instruction can come from, unless some flags in the following code are used [5] and
fat_nop() is called, which produces 5 nop instructions.

But that would mean the code is still being generated when it is deoptimized, which is odd. Another possibility is that this
space is incorrectly considered free and is used to generate new code while the old code there has not been deoptimized yet.
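
To make the overlap idea concrete, here is a minimal sketch of one possible interleaving. It is purely an assumption, not something confirmed by the dumps: the five bytes at the entry point hold 0x90 padding, and only the first four bytes of the intended jmp (taken from item 4 of the report) end up in memory. The names overlay, NOP_PAD and INTENDED_JMP are hypothetical and do not exist in HotSpot.

```python
def overlay(old: bytes, new: bytes, offsets) -> bytes:
    """Copy new[i] over old[i] for each offset i; other bytes keep their old value."""
    buf = bytearray(old)
    for i in offsets:
        buf[i] = new[i]
    return bytes(buf)


# Assumption: the 5-byte slot is filled with 0x90 'nop' bytes.
NOP_PAD = bytes([0x90] * 5)
# Intended jmp encoding, from item 4 of the report.
INTENDED_JMP = bytes.fromhex("e9bb5494ff")

# Hypothetical interleaving: bytes 0..3 of the jmp land, byte 4 stays 0x90.
partially_patched = overlay(NOP_PAD, INTENDED_JMP, range(4))
print(partially_patched.hex())
```

Under these assumptions the result matches the corrupted encoding from item 3 of the report (e9bb549490). That only shows the symptom is consistent with a nop writer and the jmp patch touching the same five bytes; it does not show which of the two wrote last.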

Regards,
Vladimir

[1] http://hg.openjdk.java.net/jdk/jdk/file/bbbcd90f0adb/src/hotspot/share/code/nmethod.cpp#l1193
[2] http://hg.openjdk.java.net/jdk/jdk/file/bbbcd90f0adb/src/hotspot/cpu/x86/nativeInst_x86.cpp#l541

[3] https://bugs.openjdk.java.net/browse/JDK-8043070
[4] https://bugs.openjdk.java.net/browse/JDK-8023037

[5] http://hg.openjdk.java.net/jdk/jdk/file/d4f3e37d1fda/src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp#l347


On 11/13/18 3:24 PM, Alexander Miloslavskiy wrote:
> 
> Hi,
> 
> I'm currently investigating a strange JVM problem and decided to ask here in the hope that someone has already seen such
> a problem, or is able to provide clues.
> 
> One of our customers reports that our Java application crashes often for him.
> I did quite a bit of debugging and got some facts:
> 1) Customer used Windows JRE x64 versions 8.0_144-b01, 10.0.1+10, 11.0.1+13-LTS. All of them crash with the same symptoms.
> 2) It crashes because java's NMethod executes a wild jmp, resulting in crazy RIP values such as 0xffffffff94bf7f80.
> 3) Example of corrupted jmp:
>     00000000`042b2ac0 e9bb549490      jmp     ffffffff`94bf7f80
> 4) I managed to understand that correct jmp should be:
>     00000000`042b2ac0 e9bb5494ff      jmp     00000000`03bf7f80
> 5) Correct jmp address points to 'RuntimeStub: wrong_method_stub'
> 6) Just one byte of jmp instruction is corrupted with 0x90. It's always 0x90, and always the same byte is corrupted.
> 7) The crash occurs soon after jvm compiles a new NMethod for the same Method.
> 8) The new NMethod is compiled with new optimization settings, usually (but not always) it's 'CompLevel_full_profile' 
> --> 'CompLevel_full_optimization'.
> 9) The crash always occurs in the old NMethod.
> 10) The reason why 'jmp' is there is because old NMethod was transitioned into 'non_entrant' state by 
> 'nmethod::make_not_entrant()'
> 11) Customer says he doesn't have any java-specific tools installed such as profilers, etc.
> 12) Customer provided around 20 crash logs and around 10 crash core dumps. This is just a portion of his crashes. All of 
> them exhibit the same problem.
> 13) Customer used Windows RAM test and it showed no errors. On the other hand, the error is too specific to be a hardware
> problem, I think: the last byte of jmp gets corrupted with 0x90 when a new NMethod is compiled...
> 14) I have verified customer's jvm.dll and it's not corrupted.
> 15) I have verified (using core dump) the value of 'SharedRuntime::get_handle_wrong_method_stub()' and it's not corrupted.
> 16) In every core dump, only a single NMethod is corrupted.
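
The arithmetic behind items 2 to 6 of the report can be checked with a short sketch. It assumes the standard x86-64 encoding of a near relative jump (opcode 0xE9 followed by a little-endian signed 32-bit displacement, relative to the address of the next instruction). The helper name jmp_rel32_target is hypothetical; the addresses and bytes are the ones quoted in items 3 and 4.

```python
import struct


def jmp_rel32_target(insn_addr: int, insn_bytes: bytes) -> int:
    """Return the 64-bit target of an x86-64 'jmp rel32' (opcode 0xE9)."""
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE9:
        raise ValueError("expected a 5-byte E9 rel32 jump")
    # The displacement is a little-endian signed 32-bit integer.
    (rel32,) = struct.unpack("<i", insn_bytes[1:])
    # The target is relative to the end of the 5-byte instruction; wrap to 64 bits.
    return (insn_addr + 5 + rel32) & 0xFFFFFFFFFFFFFFFF


addr = 0x042B2AC0
corrupted = bytes.fromhex("e9bb549490")
correct = bytes.fromhex("e9bb5494ff")

print(hex(jmp_rel32_target(addr, corrupted)))  # 0xffffffff94bf7f80
print(hex(jmp_rel32_target(addr, correct)))    # 0x3bf7f80

# Offsets (within the instruction) where the two encodings differ.
diff = [i for i, (a, b) in enumerate(zip(corrupted, correct)) if a != b]
print(diff)  # [4]
```

The only differing byte is at offset 4, the most significant byte of the displacement. Replacing 0xff with 0x90 there keeps the displacement negative but makes it far larger in magnitude, so the target wraps below zero and shows up as the 0xffffffff... RIP value.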


More information about the hotspot-compiler-dev mailing list