8u152 sigsegv in CounterDecay::do_method during VMThread safepoint processing
Vitaly Davidovich
vitalyd at gmail.com
Fri Feb 2 02:25:58 UTC 2018
On Thu, Feb 1, 2018 at 7:56 PM Vladimir Kozlov <vladimir.kozlov at oracle.com>
wrote:
> I filed https://bugs.openjdk.java.net/browse/JDK-8196624
Thanks!
>
> > Do you think that 9 fix could be backported to 8 in case it’s the same
> > issue?
>
> It would greatly help if you could build jdk8u with the 8042727 fix and
> verify it. That would be good justification for backporting. Otherwise
> someone in the sustaining group will have to investigate this problem.
I’ll see what I can do. We don’t have any experience building hotspot from
source but I’ll see how much enthusiasm I can garner :).
>
>
> Could you build a fastdebug version and run with it?
> I assume you don't have a test which we can use to verify, right? It would
> help if you do have one.
Yeah, no repro unfortunately. In fact, it’s not easily reproducible in the
real system either. That makes sense since this seems like a race or
timing bug where quite a few things have to come together to trigger it
(presumably). Gotta love these types of bugs ...
>
>
> Thanks,
> Vladimir
>
> On 2/1/18 4:00 PM, Vitaly Davidovich wrote:
> > On Thu, Feb 1, 2018 at 5:59 PM Vladimir Kozlov <vladimir.kozlov at oracle.com>
> > wrote:
> >
> >> Hi Vitaly,
> >>
> >> I would suggest filing a bug. I looked through our bug DB and did not
> >> find anything similar except 8156721, which you pointed out.
> >
> > Hi Vladimir,
> >
> > Thanks for following up on this. Would you like me to file a bug or did
> > you mean someone on the hotspot team?
> >
> >>
> >>
> >> Based on the disassembly, the problem happened on the first instruction:
> >>
> >> static void do_method(Method* m) {
> >> MethodCounters* mcs = m->method_counters();
> >>
> >> 0x82000000: mov 0x18(%rdi),%rcx
> >> 0x82000004: push %rbp
> >> 0x82000005: mov %rsp,%rbp
> >> 0x82000008: test %rcx,%rcx
> >> 0x8200000b: je 0x82000035
> >>
> >> RDI=0x0000001a00190005
> >>
> >> Which means Method* m pointer is corrupted/incorrect (but not 0).
> >
> > Indeed. Doesn’t even look like a pointer at all with that 5 in there.
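
The register dump can be cross-checked against the disassembly quoted above. The sketch below uses only values copied from the hs_err log in this thread: it recomputes the address touched by `mov 0x18(%rdi),%rcx`, compares it with the reported si_addr, and decodes the x86 page-fault error code from ERR. The helper names are made up for illustration, and the 8-byte alignment check rests on the assumption that HotSpot metadata such as a Method is word-aligned on a 64-bit VM.

```python
# Values copied from the hs_err log in this thread.
RDI = 0x0000001A00190005      # first argument register, i.e. the Method* m
SI_ADDR = 0x0000001A0019001D  # faulting address reported in siginfo
DISPLACEMENT = 0x18           # from "mov 0x18(%rdi),%rcx"
ERR = 0x4                     # x86 page-fault error code from the register dump


def effective_address(base: int, displacement: int) -> int:
    """Address touched by a base+displacement load, wrapped to 64 bits."""
    return (base + displacement) & 0xFFFFFFFFFFFFFFFF


def is_aligned(pointer: int, alignment: int = 8) -> bool:
    """True if the pointer is a multiple of the alignment (assumed: 8 bytes)."""
    return pointer % alignment == 0


def decode_page_fault(err: int) -> dict:
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(err & 0x1),  # 0 means the page was not mapped
        "write": bool(err & 0x2),         # 0 means the access was a read
        "user_mode": bool(err & 0x4),     # 1 means the fault came from user mode
    }


if __name__ == "__main__":
    print(hex(effective_address(RDI, DISPLACEMENT)))
    print("matches si_addr:", effective_address(RDI, DISPLACEMENT) == SI_ADDR)
    print("RDI 8-byte aligned:", is_aligned(RDI))
    print(decode_page_fault(ERR))
```

RDI plus 0x18 equals si_addr, so the fault is the very first load through m, and ERR decodes to a user-mode read of an unmapped page, which agrees with SEGV_MAPERR. RDI itself is not 8-byte aligned, which fits the observation that it does not look like a pointer.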
> >
> >>
> >>
> >> CounterDecay::do_method() is called from InstanceKlass::methods_do(), which
> >> has a fix in JDK 9 to process only loaded classes:
> >>
> >> https://bugs.openjdk.java.net/browse/JDK-8042727
> >>
> >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/2c2aa6de8f60#l4.6
> >>
> >> That is the only related change I found. Maybe it is a different
> >> problem. CCing the runtime group.
> >
> > Thanks for the JDK9 pointer. We’re slowly trying to get to 9 but it’s
> > taking a long time and a lot of effort for a variety of reasons.
> >
> > Do you think that 9 fix could be backported to 8 in case it’s the same
> > issue?
> >
> >>
> >>
> >> Regards,
> >> Vladimir
> >>
> >> On 1/22/18 7:36 AM, Vitaly Davidovich wrote:
> >>> Hi all,
> >>>
> >>> Are there any known issues with this method crashing the JVM? Here's a
> >>> (slightly redacted) snippet from the hs_err log:
> >>>
> >>> #
> >>> # A fatal error has been detected by the Java Runtime Environment:
> >>> #
> >>> # SIGSEGV (0xb) at pc=0x00002b14765b7210, pid=140880, tid=0x00002b149a643700
> >>> #
> >>> # JRE version: Java(TM) SE Runtime Environment (8.0_152-b16) (build 1.8.0_152-b16)
> >>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.152-b16 mixed mode linux-amd64 compressed oops)
> >>> # Problematic frame:
> >>> # V [libjvm.so+0x49c210] CounterDecay::do_method(Method*)+0x0
> >>> #
> >>> # Core dump written. Default location: <path> or core.140880
> >>> #
> >>> # If you would like to submit a bug report, please visit:
> >>> # http://bugreport.java.com/bugreport/crash.jsp
> >>> #
> >>>
> >>> --------------- T H R E A D ---------------
> >>>
> >>> Current thread (0x00002b147cb12800): VMThread [stack: 0x00002b149a543000,0x00002b149a644000] [id=140909]
> >>>
> >>> siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000001a0019001d
> >>>
> >>> Registers:
> >>> RAX=0x0000000000000000, RBX=0x0000000000000001, RCX=0x00002b156839ca18, RDX=0x00002b156799fc68
> >>> RSP=0x00002b149a6429b8, RBP=0x00002b149a6429e0, RSI=0x00002b14765b7210, RDI=0x0000001a00190005
> >>> R8 =0x0000000000000010, R9 =0x0000000000000001, R10=0x0000000000000000, R11=0x0000000000000001
> >>> R12=0x0000000000000007, R13=0x00000007c03c8428, R14=0x00002b14765b7210, R15=0x0000000000000000
> >>> RIP=0x00002b14765b7210, EFLAGS=0x0000000000010202, CSGSFS=0x0000000000000033, ERR=0x0000000000000004
> >>> TRAPNO=0x000000000000000e
> >>>
> >>> Top of Stack: (sp=0x00002b149a6429b8)
> >>> 0x00002b149a6429b8: 00002b147675e83d 0000000000000060
> >>> 0x00002b149a6429c8: 00002b149a642a10 0000000000000000
> >>> 0x00002b149a6429d8: 0000000000000000 00002b149a642a00
> >>> 0x00002b149a6429e8: 00002b14765b3cd1 40590bbbbbbbbbbc
> >>> 0x00002b149a6429f8: 00002b14770db352 00002b149a642a50
> >>> 0x00002b149a642a08: 00002b1476adfbd6 00000000018f0100
> >>> 0x00002b149a642a18: 0000000000000000 00002b149a642a40
> >>> 0x00002b149a642a28: 00002b1476ba2100 431bde82d7b634db
> >>> 0x00002b149a642a38: 00002b14f9ce9800 431bde82d7b634db
> >>> 0x00002b149a642a48: 00002b14f9ce9800 00002b149a642b00
> >>> 0x00002b149a642a58: 00002b1476ae08a6 00002b14770e6140
> >>> 0x00002b149a642a68: 00002b149a642aa0 00002b147709ef83
> >>> 0x00002b149a642a78: 0000000000ae12f0 000000307cb12800
> >>> 0x00002b149a642a88: 0000000000000000 00000040000003e8
> >>> 0x00002b149a642a98: 0000001a0000001a 00002b14ce44b580
> >>> 0x00002b149a642aa8: 00002b1478ccda09 00002b1478cbb5d0
> >>> 0x00002b149a642ab8: 00002b1400000000 00002b14ce44b5d0
> >>> 0x00002b149a642ac8: 00002b14ce44b580 00002b14770db3d8
> >>> 0x00002b149a642ad8: 0000000000000000 0000000000000000
> >>> 0x00002b149a642ae8: 00002b14770db3d8 00002b147cb12800
> >>> 0x00002b149a642af8: 00002b14770e5950 00002b149a642ca0
> >>> 0x00002b149a642b08: 00002b1476bf22ef 00002b149a642b20
> >>> 0x00002b149a642b18: 00002b149a642c30 00002b149a642b28
> >>> 0x00002b149a642b28: 6e69747563657845 65706f204d562067
> >>> 0x00002b149a642b38: 203a6e6f69746172 6c6f43636e493147
> >>> 0x00002b149a642b48: 506e6f697463656c 6e6f640065737561
> >>> 0x00002b149a642b58: 6e6f64206e6f0065 0000000000000065
> >>> 0x00002b149a642b68: 0000001577100ce0 0000000000000000
> >>> 0x00002b149a642b78: 00002b14770ae164 00002b1476116e40
> >>> 0x00002b149a642b88: 0000000000000148 00002b147cb12800
> >>> 0x00002b149a642b98: 0000000000000002 00002b149a642c40
> >>> 0x00002b149a642ba8: 00002b1475e08a40 00002b149a543000
> >>>
> >>> Instructions: (pc=0x00002b14765b7210)
> >>> 0x00002b14765b71f0: 55 31 c0 48 89 e5 c9 c3 90 90 90 90 90 90 90 90
> >>> 0x00002b14765b7200: 55 b8 04 00 00 00 48 89 e5 c9 c3 90 90 90 90 90
> >>> 0x00002b14765b7210: 48 8b 4f 18 55 48 89 e5 48 85 c9 74 28 8b 51 08
> >>> 0x00002b14765b7220: 89 d0 c1 e8 03 89 c6 d1 fe 85 c0 7e 09 85 f6 b8
> >>>
> >>> Register to memory mapping:
> >>>
> >>> RAX=0x0000000000000000 is an unknown value
> >>> RBX=0x0000000000000001 is an unknown value
> >>> RCX=0x00002b156839ca18 is an unknown value
> >>> RDX=0x00002b156799fc68 is pointing into metadata
> >>> RSP=0x00002b149a6429b8 is an unknown value
> >>> RBP=0x00002b149a6429e0 is an unknown value
> >>> RSI=0x00002b14765b7210: <offset 0x49c210> in <path>/jre/lib/amd64/server/libjvm.so at 0x00002b147611b000
> >>> RDI=0x0000001a00190005 is an unknown value
> >>> R8 =0x0000000000000010 is an unknown value
> >>> R9 =0x0000000000000001 is an unknown value
> >>> R10=0x0000000000000000 is an unknown value
> >>> R11=0x0000000000000001 is an unknown value
> >>> R12=0x0000000000000007 is an unknown value
> >>> R13=0x00000007c03c8428 is pointing into metadata
> >>> R14=0x00002b14765b7210: <offset 0x49c210> in <path>/jre/lib/amd64/server/libjvm.so at 0x00002b147611b000
> >>> R15=0x0000000000000000 is an unknown value
> >>>
> >>> Stack: [0x00002b149a543000,0x00002b149a644000], sp=0x00002b149a6429b8, free space=1022k
> >>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> >>> V [libjvm.so+0x49c210] CounterDecay::do_method(Method*)+0x0
> >>> V [libjvm.so+0x498cd1] NonTieredCompPolicy::do_safepoint_work()+0x91
> >>> V [libjvm.so+0x9c4bd6] SafepointSynchronize::do_cleanup_tasks()+0x76
> >>> V [libjvm.so+0x9c58a6] SafepointSynchronize::begin()+0x406
> >>> V [libjvm.so+0xad72ef] VMThread::loop()+0x1bf
> >>> V [libjvm.so+0xad7770] VMThread::run()+0x70
> >>> V [libjvm.so+0x92d8d8] java_start(Thread*)+0x108
> >>>
> >>> VM_Operation (0x00002b15140011a0): G1IncCollectionPause, mode: safepoint, requested by thread
> >>> 0x00002b14f9bb8000
> >>>
> >>>
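
Two further consistency checks on the log above can be done with plain arithmetic; the following is only a sketch using numbers copied from the hs_err output, with helper names invented for illustration. The first check subtracts the libjvm.so load address from an absolute address to recover the module offset printed in the native frames. The second decodes the six 64-bit stack words starting at 0x00002b149a642b28 as little-endian ASCII, since they look like text rather than pointers.

```python
import struct

# Values copied from the hs_err log in this thread.
LIBJVM_BASE = 0x00002B147611B000       # load address of libjvm.so
PC = 0x00002B14765B7210                # faulting pc
RETURN_ADDRESS = 0x00002B14765B3CD1    # word stored at stack slot 0x00002b149a6429e8

# Stack words from 0x00002b149a642b28 through 0x00002b149a642b50.
STACK_WORDS = [
    0x6E69747563657845, 0x65706F204D562067,
    0x203A6E6F69746172, 0x6C6F43636E493147,
    0x506E6F697463656C, 0x6E6F640065737561,
]


def module_offset(address: int, base: int) -> int:
    """Offset of an absolute address inside a module loaded at base."""
    return address - base


def decode_words(words: list) -> str:
    """Decode little-endian 64-bit words as ASCII, stopping at the first NUL."""
    raw = b"".join(struct.pack("<Q", word) for word in words)
    return raw.split(b"\x00", 1)[0].decode("ascii")


if __name__ == "__main__":
    print(hex(module_offset(PC, LIBJVM_BASE)))
    print(hex(module_offset(RETURN_ADDRESS, LIBJVM_BASE)))
    print(decode_words(STACK_WORDS))
```

The offsets come out as 0x49c210 and 0x498cd1, matching the CounterDecay::do_method and NonTieredCompPolicy::do_safepoint_work frames, and the stack words decode to "Executing VM operation: G1IncCollectionPause", which agrees with the VM_Operation line.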
> >>> This is on a Debian Wheezy Linux machine running Xeon Broadwell cores.
> >>> The reason I mention this is that a quick Google search did turn up
> >>> https://bugs.openjdk.java.net/browse/JDK-8156721, but that JBS issue is for
> >>> a different platform (with an overclocked CPU, apparently) and it's
> >>> marked Incomplete.
> >>>
> >>> This crash was observed on about 17 separate JVMs (different hosts) at
> >>> about the same time, all running the same application code after about
> >>> 3 weeks of uptime.
> >>>
> >>> I can provide more details if you'd like but wanted to see if this is a
> >>> known (but rarely witnessed) bug.
> >>>
> >>> Thanks
> >>>
> >>
>
--
Sent from my phone