Segmentation Fault occurs when ClassLoader and Metaspace is released in JDK 8

Yasumasa Suenaga suenaga at oss.nttdata.com
Mon Oct 21 13:29:22 UTC 2019


Hi Osamu,

What JVM options did you pass?

I guess you are using CMS, because this problem seems to occur only with CMS [1] [2].
If so, not using CMS might be a workaround.

I'm not sure of the root cause of this issue, but ClassLoaderDataGraph::_unloading seems to be broken
(for example, by a double free (delete) of a CLD).


Thanks,

Yasumasa


[1] http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/classfile/classLoaderData.hpp#l100
[2] http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/file/eed8e846c982/src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp#l6384


On 2019/10/21 17:50, Osamu Sakamoto wrote:
> Hi all,
> 
> I am investigating a segmentation fault (SEGV) during GC, and I cannot determine the cause.
> Could you help me solve the problem?
> 
> Our system uses OpenJDK 1.8.0.181, and it crashed with a SEGV while purging a ClassLoader at a safepoint.
> I cannot reproduce this problem on demand, but it has happened 4 times in a few months.
> 
> The following is the summary of my investigation.
> 
> =============================================================================
> 
> First I checked the hs_err log, which shows that a SEGV occurred.
> The VM_Operation was GenCollectForAllocation, at a safepoint.
> 
> -----------------------------------------------------------------------------
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007f6080c97f88, pid=23931, tid=0x00007f607c3ed700
> #
> # JRE version: OpenJDK Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
> # Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # V  [libjvm.so+0x84bf88]
> #
> # Core dump written. Default location: /opt/tomcate0/core or core.23931
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> #
> 
> ---------------  T H R E A D  ---------------
> 
> Current thread (0x00007f6078c00000):  VMThread [stack: 0x00007f607c2ed000,0x00007f607c3ee000] [id=23939]
> 
> siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000018
> 
> Registers:
> RAX=0x0000000000000010, RBX=0x00007f5ff800ad30, RCX=0x0000000000000010, RDX=0x0000000000000000
> RSP=0x00007f607c3ecb50, RBP=0x00007f607c3ecb80, RSI=0x0000000000000002, RDI=0x0000000001cfe570
> R8 =0x00007f5ff80ae320, R9 =0x00007f5ff8052480, R10=0x0000000000000000, R11=0x0000000000000400
> R12=0x0000000001cfe570, R13=0x00007f6081419470, R14=0x0000000000000002, R15=0x00007f6081418640
> RIP=0x00007f6080c97f88, EFLAGS=0x0000000000010202, CSGSFS=0x0000000000000033, ERR=0x0000000000000004
>    TRAPNO=0x000000000000000e
> 
> Top of Stack: (sp=0x00007f607c3ecb50)
> 0x00007f607c3ecb50:   00007f607c3ecba0 00007f5ff800ad30
> 0x00007f607c3ecb60:   00007f5ff800ad00 0000000000000000
> 0x00007f607c3ecb70:   0000000000000000 0000000000000001
> 0x00007f607c3ecb80:   00007f607c3ecba0 00007f6080c995fa
> 0x00007f607c3ecb90:   00007f5ff800ad00 00007f5ff800ac20
> 0x00007f607c3ecba0:   00007f607c3ecbc0 00007f60808bff5e
> 0x00007f607c3ecbb0:   00007f5ff800ac20 00007f5ff8052870
> 0x00007f607c3ecbc0:   00007f607c3ecbe0 00007f60808c0f0f
> 0x00007f607c3ecbd0:   00007f607c3ecbf0 00007f608140f308
> 0x00007f607c3ecbe0:   00007f607c3ecc30 00007f6080daa0b7
> 0x00007f607c3ecbf0:   00007f6069000100 0000000000000000
> 0x00007f607c3ecc00:   00007f607c3ecc20 00007f6080ed0800
> 0x00007f607c3ecc10:   00000000000000f9 88e95c3ba257ab00
> 0x00007f607c3ecc20:   431bde82d7b634db 00007f607800aa00
> 0x00007f607c3ecc30:   00007f607c3eccc0 00007f6080daa9d5
> 0x00007f607c3ecc40:   0000000000000000 00007f607803bf20
> 0x00007f607c3ecc50:   00007f607803be20 00000000000003e8
> 0x00007f607c3ecc60:   0000000000000001 00007f6078c00000
> 0x00007f607c3ecc70:   00007f607c3eccc0 0000000000000000
> 0x00007f607c3ecc80:   00000004000000f9 00007f60813e2b99
> 0x00007f607c3ecc90:   00007f607803bfa0 00007f6078c00000
> 0x00007f607c3ecca0:   0000000000000000 0000000000000000
> 0x00007f607c3eccb0:   00007f6081418bd0 00007f607803bf20
> 0x00007f607c3eccc0:   00007f607c3ece60 00007f6080f2048a
> 0x00007f607c3eccd0:   00007f607c3ecd20 00007f607c3ecce0
> 0x00007f607c3ecce0:   00007f6078c00000 00007f6078c00980
> 0x00007f607c3eccf0:   00007f6078c009c0 00007f6078c009d0
> 0x00007f607c3ecd00:   00007f6078c00aa8 00000000000000d8
> 0x00007f607c3ecd10:   00007f6078c00be0 0000000000000000
> 0x00007f607c3ecd20:   00007f607c3ecd28 6e69747563657845
> 0x00007f607c3ecd30:   65706f204d562067 203a6e6f69746172
> 0x00007f607c3ecd40:   656c6c6f436e6547 6c6c41726f467463
> 
> Instructions: (pc=0x00007f6080c97f88)
> 0x00007f6080c97f68:   b6 12 80 fa 00 74 01 f0 48 0f c1 01 31 c9 31 f6
> 0x00007f6080c97f78:   48 8b 44 0b 10 31 d2 48 85 c0 74 11 0f 1f 40 00
> 0x00007f6080c97f88:   48 8b 40 08 48 83 c2 01 48 85 c0 75 f3 48 83 c1
> 0x00007f6080c97f98:   08 48 01 d6 48 83 f9 20 75 d6 8b 7b 08 48 8b 05
> 
> Register to memory mapping:
> 
> RAX=0x0000000000000010 is an unknown value
> RBX=0x00007f5ff800ad30 is an unknown value
> RCX=0x0000000000000010 is an unknown value
> RDX=0x0000000000000000 is an unknown value
> RSP=0x00007f607c3ecb50 is an unknown value
> RBP=0x00007f607c3ecb80 is an unknown value
> RSI=0x0000000000000002 is an unknown value
> RDI=0x0000000001cfe570 is an unknown value
> R8 =0x00007f5ff80ae320 is an unknown value
> R9 =0x00007f5ff8052480 is an unknown value
> R10=0x0000000000000000 is an unknown value
> R11=0x0000000000000400 is an unknown value
> R12=0x0000000001cfe570 is an unknown value
> R13=0x00007f6081419470: <offset 0xfcd470> in /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so at 0x00007f608044c000
> R14=0x0000000000000002 is an unknown value
> R15=0x00007f6081418640: <offset 0xfcc640> in /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/jre/lib/amd64/server/libjvm.so at 0x00007f608044c000
> 
> 
> Stack: [0x00007f607c2ed000,0x00007f607c3ee000], sp=0x00007f607c3ecb50, free space=1022k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x84bf88]
> V  [libjvm.so+0x84d5fa]
> V  [libjvm.so+0x473f5e]
> V  [libjvm.so+0x474f0f]
> V  [libjvm.so+0x95e0b7]
> V  [libjvm.so+0x95e9d5]
> V  [libjvm.so+0xad448a]
> V  [libjvm.so+0xad48f1]
> V  [libjvm.so+0x8beb82]
> 
> VM_Operation (0x00007f5fd69e6120): GenCollectForAllocation, mode: safepoint, requested by thread 0x00007f6079013800
> 
> ...
> -----------------------------------------------------------------------------
> 
> 
> 
> Next, I used GDB to check the backtrace of the crashing thread from the core dump.
> The backtrace is shown below.
> The SEGV occurred while a ClassLoader was being purged and its Metaspace was being destroyed.
> Frames #6 and #7 show that the signal (SEGV) handler was invoked while SpaceManager::~SpaceManager() was executing.
> 
> -----------------------------------------------------------------------------
> (gdb) bt
> #0  0x00007f608146f1f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x00007f60814708e8 in __GI_abort () at abort.c:90
> #2  0x00007f6080d0bc39 in os::abort (dump_core=<optimized out>) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1519
> #3  0x00007f6080f1b816 in VMError::report_and_die (this=this@entry=0x7f607c3ebd10) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:1060
> #4  0x00007f6080d15927 in JVM_handle_linux_signal (sig=11, info=0x7f607c3ebfb0, ucVoid=0x7f607c3ebe80, abort_if_unrecognized=<optimized out>)
>      at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
> #5  0x00007f6080d09038 in signalHandler (sig=11, info=0x7f607c3ebfb0, uc=0x7f607c3ebe80) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:4446
> #6  <signal handler called>
> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, __in_chrg=<optimized out>) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
> #8  0x00007f6080c995fa in Metaspace::~Metaspace (this=0x7f5ff800ad00, __in_chrg=<optimized out>)
>      at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2971
> #9  0x00007f60808bff5e in ClassLoaderData::~ClassLoaderData (this=0x7f5ff800ac20, __in_chrg=<optimized out>)
>      at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:383
> #10 0x00007f60808c0f0f in ClassLoaderDataGraph::purge () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.cpp:818
> #11 0x00007f6080daa0b7 in ClassLoaderDataGraph::purge_if_needed () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/classfile/classLoaderData.hpp:104
> #12 SafepointSynchronize::do_cleanup_tasks () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:551
> #13 0x00007f6080daa9d5 in SafepointSynchronize::begin () at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/safepoint.cpp:402
> #14 0x00007f6080f2048a in VMThread::loop (this=this@entry=0x7f6078c00000) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:501
> #15 0x00007f6080f208f1 in VMThread::run (this=0x7f6078c00000) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/runtime/vmThread.cpp:276
> #16 0x00007f6080d0ab82 in java_start (thread=0x7f6078c00000) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:796
> #17 0x00007f6081e2de25 in start_thread (arg=0x7f607c3ed700) at pthread_create.c:308
> #18 0x00007f608153234d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
> -----------------------------------------------------------------------------
> 
> 
> In frame #7, line 2028 (chunk = chunk->next()) is the crash point.
> The variable "chunk" is defined at line 2025 (Metachunk* chunk = chunks_in_use(i);).
> "chunks_in_use(i)" is defined at line 648 (Metachunk* chunks_in_use(ChunkIndex index) const { return _chunks_in_use[index]; }).
> So I checked the values of "_chunks_in_use" and found that "_chunks_in_use[2]" holds the invalid address "0x10".
> Therefore, I think the SEGV occurred because "chunk = chunk->next()" dereferenced the invalid address "0x10".
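
A minimal sketch of the arithmetic behind this conclusion, assuming an LP64 build. ToyChunk and next_field_address are hypothetical names, not the real Metachunk definition; the layout only assumes that the next pointer follows one 8-byte field, as the faulting instruction (mov 0x8(%rax),%rax) suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for Metachunk (NOT the HotSpot definition):
// one size field followed by the list links, so that `next` sits at
// offset 8 on LP64, matching the `mov 0x8(%rax),%rax` load.
struct ToyChunk {
    std::size_t word_size;
    ToyChunk*   next;
    ToyChunk*   prev;
};

// Address that reading `chunk->next` touches for a given raw value
// of the `chunk` pointer.
inline std::uintptr_t next_field_address(std::uintptr_t chunk_value) {
    return chunk_value + offsetof(ToyChunk, next);
}
```

With chunk equal to 0x10, next_field_address(0x10) gives 0x18 under these assumptions, which matches si_addr: 0x0000000000000018 in the hs_err log.
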
> 
> -----------------------------------------------------------------------------
> (gdb) f 7
> #7  SpaceManager::~SpaceManager (this=0x7f5ff800ad30, __in_chrg=<optimized out>) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.181-3.b13.el7_5.x86_64/openjdk/hotspot/src/share/vm/memory/metaspace.cpp:2028
> 2028        chunk = chunk->next();
> (gdb) list
> 2023    size_t SpaceManager::sum_count_in_chunks_in_use(ChunkIndex i) {
> 2024      size_t count = 0;
> 2025      Metachunk* chunk = chunks_in_use(i);
> 2026      while (chunk != NULL) {
> 2027        count++;
> 2028        chunk = chunk->next();
> 2029      }
> 2030      return count;
> 2031    }
> 2032
> (gdb) list SpaceManager::chunks_in_use
> 647      // Accessors
> 648      Metachunk* chunks_in_use(ChunkIndex index) const { return _chunks_in_use[index]; }
> ...
> (gdb) p _chunks_in_use
> $11 = {0x7f5fcd41c400, 0x7f5fcd41a000, 0x10, 0x0}
> -----------------------------------------------------------------------------
> 
> 
> 
> The following is the disassembly of "SpaceManager::~SpaceManager()".
> %rax holds 0x10 at "0x00007f6080c97f88 <+200>", but I don't understand how this "0x10" got into %rax.
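
On how 0x10 ends up in %rax: the load at <+184>, mov 0x10(%rbx,%rcx,1),%rax, reads it from memory just before the faulting loop. Below is a small sketch of that address computation using the register values from the dump. The helper names are hypothetical, and reading the 0x10 displacement as the offset of _chunks_in_use inside SpaceManager is an assumption drawn from the disassembly, not checked against the class layout.

```cpp
#include <cassert>
#include <cstdint>

// Effective address of an x86-64 operand written as disp(base,index,1).
constexpr std::uint64_t effective_address(std::uint64_t base,
                                          std::uint64_t index,
                                          std::uint64_t disp) {
    return base + index + disp;
}

// Array slot selected by a byte offset into an array of 8-byte pointers.
constexpr std::uint64_t slot_index(std::uint64_t byte_offset) {
    return byte_offset / sizeof(std::uint64_t);
}

// Values taken from the register dump: %rbx (this) and %rcx.
constexpr std::uint64_t kRbx = 0x7f5ff800ad30;
constexpr std::uint64_t kRcx = 0x10;
```

With these values, effective_address(kRbx, kRcx, 0x10) is 0x7f5ff800ad50 and slot_index(kRcx) is 2, which agrees with _chunks_in_use[2] = 0x10 in the GDB output. So %rax only carries a value that was already stored in the object; what wrote 0x10 there is still open.
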
> 
> -----------------------------------------------------------------------------
> (gdb) disas
> Dump of assembler code for function SpaceManager::~SpaceManager():
>     0x00007f6080c97ec0 <+0>:    push   %rbp
>     0x00007f6080c97ec1 <+1>:    mov    %rsp,%rbp
>     0x00007f6080c97ec4 <+4>:    push   %r15
>     0x00007f6080c97ec6 <+6>:    push   %r14
>     0x00007f6080c97ec8 <+8>:    push   %r13
>     0x00007f6080c97eca <+10>:    push   %r12
>     0x00007f6080c97ecc <+12>:    push   %rbx
>     0x00007f6080c97ecd <+13>:    mov    %rdi,%rbx
>     0x00007f6080c97ed0 <+16>:    sub    $0x8,%rsp
>     0x00007f6080c97ed4 <+20>:    mov 0x780785(%rip),%r12        # 0x7f6081418660 <_ZN12SpaceManager12_expand_lockE>
>     0x00007f6080c97edb <+27>:    test   %r12,%r12
>     0x00007f6080c97ede <+30>:    je     0x7f6080c97ee8 <SpaceManager::~SpaceManager()+40>
>     0x00007f6080c97ee0 <+32>:    mov    %r12,%rdi
>     0x00007f6080c97ee3 <+35>:    callq  0x7f6080cce2f0 <Monitor::lock_without_safepoint_check()>
>     0x00007f6080c97ee8 <+40>:    movslq 0x8(%rbx),%rcx
>     0x00007f6080c97eec <+44>:    lea 0x78075d(%rip),%rdx        # 0x7f6081418650 <_ZN12MetaspaceAux15_capacity_wordsE>
>     0x00007f6080c97ef3 <+51>:    lea 0x781576(%rip),%r13        # 0x7f6081419470 <_ZN2os16_processor_countE>
>     0x00007f6080c97efa <+58>:    lea 0x78073f(%rip),%r15        # 0x7f6081418640 <_ZN12MetaspaceAux11_used_wordsE>
>     0x00007f6080c97f01 <+65>:    mov    (%rdx,%rcx,8),%rax
>     0x00007f6080c97f05 <+69>:    sub    0x40(%rbx),%rax
>     0x00007f6080c97f09 <+73>:    mov    %rax,(%rdx,%rcx,8)
>     0x00007f6080c97f0d <+77>:    mov    0x38(%rbx),%rax
>     0x00007f6080c97f11 <+81>:    movslq 0x8(%rbx),%rdx
>     0x00007f6080c97f15 <+85>:    neg    %rax
>     0x00007f6080c97f18 <+88>:    cmpl   $0x1,0x0(%r13)
>     0x00007f6080c97f1d <+93>:    lea    (%r15,%rdx,8),%rcx
>     0x00007f6080c97f21 <+97>:    mov    $0x1,%edx
>     0x00007f6080c97f26 <+102>:    jne    0x7f6080c97f32 <SpaceManager::~SpaceManager()+114>
>     0x00007f6080c97f28 <+104>:    lea 0x74acb4(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>     0x00007f6080c97f2f <+111>:    movzbl (%rdx),%edx
>     0x00007f6080c97f32 <+114>:    cmp    $0x0,%dl
>     0x00007f6080c97f35 <+117>:    je     0x7f6080c97f38 <SpaceManager::~SpaceManager()+120>
>     0x00007f6080c97f37 <+119>:    lock xadd %rax,(%rcx)
>     0x00007f6080c97f3c <+124>:    mov    0x48(%rbx),%r14
>     0x00007f6080c97f40 <+128>:    callq  0x7f6080c951a0 <Metachunk::overhead()>
>     0x00007f6080c97f45 <+133>:    movslq 0x8(%rbx),%rdx
>     0x00007f6080c97f49 <+137>:    imul   %r14,%rax
>     0x00007f6080c97f4d <+141>:    lea    (%r15,%rdx,8),%rcx
>     0x00007f6080c97f51 <+145>:    mov    $0x1,%edx
>     0x00007f6080c97f56 <+150>:    neg    %rax
>     0x00007f6080c97f59 <+153>:    cmpl   $0x1,0x0(%r13)
>     0x00007f6080c97f5e <+158>:    jne    0x7f6080c97f6a <SpaceManager::~SpaceManager()+170>
>     0x00007f6080c97f60 <+160>:    lea 0x74ac7c(%rip),%rdx        # 0x7f60813e2be3 <AssumeMP>
>     0x00007f6080c97f67 <+167>:    movzbl (%rdx),%edx
>     0x00007f6080c97f6a <+170>:    cmp    $0x0,%dl
>     0x00007f6080c97f6d <+173>:    je     0x7f6080c97f70 <SpaceManager::~SpaceManager()+176>
>     0x00007f6080c97f6f <+175>:    lock xadd %rax,(%rcx)
>     0x00007f6080c97f74 <+180>:    xor    %ecx,%ecx
>     0x00007f6080c97f76 <+182>:    xor    %esi,%esi
>     0x00007f6080c97f78 <+184>:    mov 0x10(%rbx,%rcx,1),%rax
>     0x00007f6080c97f7d <+189>:    xor    %edx,%edx
>     0x00007f6080c97f7f <+191>:    test   %rax,%rax
>     0x00007f6080c97f82 <+194>:    je     0x7f6080c97f95 <SpaceManager::~SpaceManager()+213>
>     0x00007f6080c97f84 <+196>:    nopl   0x0(%rax)
> => 0x00007f6080c97f88 <+200>:    mov    0x8(%rax),%rax
>     0x00007f6080c97f8c <+204>:    add    $0x1,%rdx
>     0x00007f6080c97f90 <+208>:    test   %rax,%rax
> ...
> (gdb) info registers
> rax            0x10    16
> rbx            0x7f5ff800ad30    140050159414576
> rcx            0x10    16
> rdx            0x0    0
> rsi            0x2    2
> rdi            0x1cfe570    30401904
> rbp            0x7f607c3ecb80    0x7f607c3ecb80
> rsp            0x7f607c3ecb50    0x7f607c3ecb50
> r8             0x7f5ff80ae320    140050160083744
> r9             0x7f5ff8052480    140050159707264
> r10            0x0    0
> r11            0x400    1024
> r12            0x1cfe570    30401904
> r13            0x7f6081419470    140052462146672
> r14            0x2    2
> r15            0x7f6081418640    140052462143040
> rip            0x7f6080c97f88    0x7f6080c97f88 <SpaceManager::~SpaceManager()+200>
> eflags         0x206    [ PF IF ]
> cs             0x33    51
> ss             0x2b    43
> ds             0x0    0
> es             0x0    0
> fs             0x0    0
> gs             0x0    0
> k0             <unavailable>
> k1             <unavailable>
> k2             <unavailable>
> k3             <unavailable>
> k4             <unavailable>
> k5             <unavailable>
> k6             <unavailable>
> k7             <unavailable>
> -----------------------------------------------------------------------------
> 
> =============================================================================
> 
> 
> 
> Does anyone know about this case?
> 
> Thanks, Osamu
> 
> 



More information about the hotspot-gc-dev mailing list