Re: JVM crashes constantly when High GC happens

Fri Apr 5 20:33:24 UTC 2019

Thanks Kim! Yes, I agree that it is probably caused by bad objects, because every crash reported the same invalid address, 0x000000008. The issue only happens while a tool called smartctl is running. It is hard to believe that one process can corrupt another process's heap. The only observation is that I/O usage is high while smartctl runs. Linux may use a lot of memory for I/O caching, which puts the JVM process under memory pressure. But I still don't understand how that could corrupt another process's heap.

I am very interested in finding patterns and instrumenting the JVM code. Can you explain a little, or point me to a wiki page, about "special patterns like 0xdeadbeef"?

Also, can you point me to the code path I could instrument, i.e. the one that scans objects and pushes them onto the queue?

Xinli

________________________________
From: Kim Barrett <kim.barrett at oracle.com>
Sent: April 5, 2019 12:15
To: shang xinli
Cc: hotspot-gc-dev at openjdk.java.net
Subject: Re: JVM crashes constantly when High GC happens

> On Apr 4, 2019, at 10:16 PM, shang xinli <shangxinli at hotmail.com> wrote:
>
> Hi all,
>
> We hit crashes pretty constantly when the GC is high when using CMS GC. We switched to G1GC but it still crashes at the same places. It also crashes with the newest version of JDK. Anybody has a clue how to investigate why?

What's happening here is a segfault in oopDesc::size(), apparently
obtaining the klass() of the object. This suggests a corrupted object
or a NULL that was expected to be a real object. Since you report it
happens with different collectors (both CMS and G1), that suggests it
may not be a GC bug, but rather something else is corrupting the heap.

Try to examine the data involved with a debugger (either live or from
a core dump); there might be clues there, especially if there are any
special patterns like 0xdeadbeef and the like involved.

It looks like the bad object was obtained from the mark queue, which
suggests it's either in a GC root or some other object refers to it.
If there's any pattern to the bad value, and you can't find where it's
coming from any other way, you might try instrumenting the path that
scans objects and pushes what they reference onto the mark queue to
look for bad values.
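
As a rough illustration of the "look for bad values" idea above, here is a minimal, self-contained C++ sketch of the kind of check one might call just before a value is pushed onto a mark queue. It is not HotSpot code: `classify_candidate`, `Suspicion`, the 4096-byte low-address cutoff, the 8-byte alignment rule, and the pattern list (apart from 0xdeadbeef, which is mentioned above) are all assumptions made for the example. On a 64-bit JVM a fault at 0x8 would be consistent with reading the klass field of a NULL object, so NULL and very small values are flagged first.

```cpp
#include <cstdint>

// Hypothetical helper, NOT part of HotSpot: classifies a 64-bit value that is
// about to be treated as an object address.
enum class Suspicion { None, Null, LowAddress, PoisonPattern, Misaligned };

// Illustrative fill patterns only; 0xdeadbeef comes from the thread, the
// other entry is an assumption added for the example.
constexpr std::uint32_t kPoisonPatterns[] = {0xdeadbeefu, 0xbaadf00du};

inline bool has_poison_half(std::uint64_t value) {
    const std::uint32_t lo = static_cast<std::uint32_t>(value);
    const std::uint32_t hi = static_cast<std::uint32_t>(value >> 32);
    for (std::uint32_t pattern : kPoisonPatterns) {
        if (lo == pattern || hi == pattern) return true;
    }
    return false;
}

inline Suspicion classify_candidate(std::uint64_t value) {
    if (value == 0) return Suspicion::Null;
    // Assumption: nothing valid lives in the first 4 KiB of the address space.
    if (value < 4096) return Suspicion::LowAddress;
    // Check patterns before alignment, since 0xdeadbeef is itself misaligned.
    if (has_poison_half(value)) return Suspicion::PoisonPattern;
    // Assumption: object addresses are 8-byte aligned.
    if (value % 8 != 0) return Suspicion::Misaligned;
    return Suspicion::None;
}
```

In an instrumented build, a call such as `classify_candidate(reinterpret_cast<std::uint64_t>(p))` at the push site could log or abort whenever the result is not `Suspicion::None`, which would catch the bad value where it enters the queue rather than later in oopDesc::size().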
