My bad. I hit "reply" instead of "reply all" on that older thread, so my follow-ups never reached the list. I'm including the original mail below. Anyway, the problem wasn't fixed here, but we can no longer reproduce it (on either 6u23 or 6u25, 64-bit Server VM), so we're just letting it slide. One possibility is that we're switching more and more to CMS, and the problem occurred in ParallelScavenge.<div>
<br></div><div>The original mail:</div><div><br></div><div><br><br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">Y. Srinivas Ramakrishna</b> <span dir="ltr"><<a href="mailto:y.s.ramakrishna@oracle.com">y.s.ramakrishna@oracle.com</a>></span><br>
Date: Mon, Apr 18, 2011 at 11:31 PM<br>Subject: Re: Crash log when do GC...<br>To: Krystal Mok <<a href="mailto:rednaxelafx@gmail.com">rednaxelafx@gmail.com</a>><br><br><br>i wonder if it's an issue with array copy stubs which leave random<br>
junk in some locations of the array, or if there's a race that causes<br>some locations to transiently have bad data. Seems unlikely, but the<br>involvement of object arrays raises some suspicions. I'll see if any<br>
array copying bugs have surfaced or been fixed recently, although none<br>comes readily to mind...<br><br>PS: if these are production runs, you won't be able to use heap verification,<br>but if you have a test load that reproduces the problem,<br>
heap verification might give us some clues (although given the nature of<br>the problem, I am not hopeful). If you have a support contract,<br>I'd suggest filing an official ticket and sending in a couple of core<br>files, if you have any sitting around. That may be the only way to<br>
make progress on this kind of issue.<br><br>-- ramki<div><div></div><div class="h5"><br><br>On 4/18/2011 8:16 AM, Krystal Mok wrote:<br><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
Hi,<br><br>I wasn't able to make a minimal repro of this problem, because it seems to<br>happen pretty randomly, running fine for 9 to 15 hours before suddenly<br>crashing with a segfault.<br>It's already running JDK6u23, and there don't seem to be many changes<br>
to HotSpot that got into JDK6u24, so I doubt there would be any progress<br>upgrading to that version. Might try JDK6u25b03 and see if there's any luck.<br><br>Attached to this email is another crash log for the same issue. The program<br>
had a lot of threads, and crashed with this stack trace:<br><br>
Stack: [0x0000000000000000,0x0000000000000000],  sp=0x0000000041f8a810,  free space=1080874k<br>
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)<br>
V  [libjvm.so+0x3e62c3]<void ParScanClosure::do_oop_work<unsigned int>(unsigned int*, bool, bool)+0x63><br>
V  [libjvm.so+0x60bc83]<objArrayKlass::oop_oop_iterate_nv(oopDesc*, ParScanWithoutBarrierClosure*)+0xf3><br>
V  [libjvm.so+0x6318d4]<ParScanThreadState::trim_queues(int)+0x124><br>
V  [libjvm.so+0x3e61c5]<void ParScanClosure::do_oop_work<oopDesc*>(oopDesc**, bool, bool)+0x105><br>
V  [libjvm.so+0x632260]<ParRootScanWithoutBarrierClosure::do_oop(oopDesc**)+0x10><br>
V  [libjvm.so+0x3702b1]<InterpreterFrameClosure::offset_do(int)+0x31><br>
V  [libjvm.so+0x619776]<InterpreterOopMap::iterate_oop(OffsetClosure*)+0x86><br>
V  [libjvm.so+0x36efd8]<frame::oops_interpreted_do(OopClosure*, RegisterMap const*, bool)+0x188><br>
V  [libjvm.so+0x36fd71]<frame::oops_do_internal(OopClosure*, CodeBlobClosure*, RegisterMap*, bool)+0xb1><br>
V  [libjvm.so+0x728fb3]<JavaThread::oops_do(OopClosure*, CodeBlobClosure*)+0x1d3><br>
V  [libjvm.so+0x72bc9e]<Threads::possibly_parallel_oops_do(OopClosure*, CodeBlobClosure*)+0xbe><br>
V  [libjvm.so+0x69572e]<SharedHeap::process_strong_roots(bool, bool, SharedHeap::ScanningOption, OopClosure*, CodeBlobClosure*, OopsInGenClosure*)+0x8e><br>
V  [libjvm.so+0x39d75d]<GenCollectedHeap::gen_process_strong_roots(int, bool, bool, bool, SharedHeap::ScanningOption, OopsInGenClosure*, bool, OopsInGenClosure*)+0x7d><br>
V  [libjvm.so+0x6325f6]<ParNewGenTask::work(int)+0xd6><br>
V  [libjvm.so+0x78018d]<GangWorker::loop()+0xaa><br>
V  [libjvm.so+0x7800a4]<GangWorker::run()+0x24><br>
V  [libjvm.so+0x623e1f]<java_start(Thread*)+0x13f><br><br>
JavaThread 0x00002aaab7692800 (nid = 8559) was being processed<br>
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)<br>
j  java.lang.reflect.Array.set(Ljava/lang/Object;ILjava/lang/Object;)V+0<br>
J  com.taobao.top.core.DefaultBlackBoxEngine.callHsf(Ljava/lang/String;Ljava/lang/String;Ljava/lang/Long;Lcom/taobao/hsf/app/spring/util/SuperHSFSpringConsumerBeanTop;[Ljava/lang/String;[Ljava/lang/Object;Lcom/taobao/top/core/framework/TopPipeResult;)Ljava/lang/Object;<br>
J  com.taobao.top.core.DefaultApiExecutor.execute(Lcom/taobao/top/core/framework/TopPipeInput;Lcom/taobao/top/core/framework/TopPipeResult;)V<br>
J  com.taobao.top.core.framework.TopPipeTask.run()V<br>
J  java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object;<br>
J  java.util.concurrent.FutureTask.run()V<br>
J  java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Ljava/lang/Runnable;)V<br>
J  java.util.concurrent.ThreadPoolExecutor$Worker.run()V<br>
j  java.lang.Thread.run()V+11<br>
v  ~StubRoutines::call_stub<br><br>
What's weird about it is that this program would repeatedly crash in the<br>
same function in ParNew GC, and that the JavaThread it's working on was in<br>an invocation to java.lang.reflect.Array.set(). In this case it's trying to<br>dereference off a bad pointer decompressed from a narrowOop, but it's hard<br>
to trace just where things went wrong at the beginning.<br><br>We'll see if it's affordable to turn on heap verification to trace it down.<br><br>Sincerely,<br>Kris Mok<br><br>On Mon, Apr 18, 2011 at 10:58 PM, Y. Srinivas Ramakrishna<<br>
<a href="mailto:y.s.ramakrishna@oracle.com" target="_blank">y.s.ramakrishna@oracle.com</a>>  wrote:<br><br><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
Hi, i have heard a couple of other reports of this sort recently.<br>But i don't think we have found or fixed any issue recently that<br>might address this. You might want to try a more recent<br>JVM/JDK to confirm if the crash still occurs (which i think<br>
it probably will, going by other such reports). Do you have<br>a test case? If so, please file a bug through support or send<br>us your test case off-line. You can also enable heap verification<br>at some considerable GC performance cost and see if that gets us<br>
closer to the root cause. (From looking at the stack retrace it appears<br>as though GC finds a bad reference from an object array while copying<br>live objects from the young generation during a scavenge.)<br><br>-- ramki<br>
<br><br><br>On 4/18/2011 6:48 AM, BlueDavy Lin wrote:<br><br><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
Hi!<br><br>      Recently two of our apps have often crashed during GC. The crash log<br>is attached. Can someone give me some advice? Thanks.<br><br>      PS: I tried setting -XX:-UseCompressedOops, but it still crashes, and<br>the log is the same.<br>
<br><br></blockquote><br></blockquote><br></blockquote><br></div></div></div><br><div class="gmail_quote">On Thu, Sep 8, 2011 at 12:06 AM, Ramki Ramakrishna <span dir="ltr"><<a href="mailto:y.s.ramakrishna@oracle.com">y.s.ramakrishna@oracle.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
  <div bgcolor="#ffffff" text="#000000">
    I didn't see any follow-up on the issue reported at:-<div class="im"><br>
    <br>
     <a href="http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-April/002537.html" target="_blank">http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-April/002537.html</a><br>
    <br></div>
    so I do not know if that issue ever got satisfactorily resolved. I
    don't think<br>
    there are any open bugs in our database for that issue. If there's a
    test-case we<br>
    can take a look.<br>
    <br>
    thanks.<br><font color="#888888">
    -- ramki</font><div><div></div><div class="h5"><br>
    <br>
    On 9/7/2011 4:36 AM, Krystal Mok wrote:
    <blockquote type="cite">CC'ing hotspot-gc-dev for the first stack trace<br>
      <br>
      <div class="gmail_quote">---------- Forwarded message ----------<br>
        From: <b class="gmail_sendername">Krystal Mok</b> <span dir="ltr"><<a href="mailto:rednaxelafx@gmail.com" target="_blank">rednaxelafx@gmail.com</a>></span><br>
        Date: Wed, Sep 7, 2011 at 7:35 PM<br>
        Subject: Re: JVM crash HS machine<br>
        To: yogesh <<a href="mailto:ydhaked@amdocs.com" target="_blank">ydhaked@amdocs.com</a>><br>
        <br>
        <br>
        Hi,
        <div><br>
        </div>
        <div>I don't think the two stack traces shown here are of the
          same issue. The first one (the one in quotes) seems to be the
          same as the one mentioned before: <a href="http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-April/002537.html" target="_blank">http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-April/002537.html</a>,
          but there is no solution yet (to my knowledge).</div>
        <div><br>
        </div>
        <div>The second stack trace is missing some very important
          stuff. It's important to know the caller of the operator new,
          which means a deeper stack trace log would help; without that
          it's quite hard to infer any context out of the stack trace.
          It'd also be helpful to know what signal it was.</div>
        <div><br>
        </div>
        <div>Regards,</div>
        <div>Kris Mok
          <div>
            <div><br>
              <br>
              <div class="gmail_quote">On Wed, Sep 7, 2011 at 7:06 PM,
                yogesh <span dir="ltr"><<a href="mailto:ydhaked@amdocs.com" target="_blank">ydhaked@amdocs.com</a>></span>
                wrote:<br>
                <blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex">
                  Igor Shprukh <a href="mailto:igor.shprukh@..." target="_blank"><igor.shprukh@...></a> writes:<br>
                  <br>
                  ><br>
                  > I have attached the hs log file.<br>
                  > The JVM continuously crashes every two hours.<br>
                  > Thank You!<br>
                  > -----Original Message-----<br>
                  > From: Dmitry Samersoff [mailto:<a href="mailto:Dmitry.Samersoff" target="_blank">Dmitry.Samersoff</a>
                  <at> <a href="http://oracle.com" target="_blank">oracle.com</a>]<br>
                  > Sent: Sunday, April 17, 2011 4:53 PM<br>
                  > To: Igor Shprukh<br>
                  > Cc: hotspot-runtime-dev <at> <a href="http://openjdk.java.net" target="_blank">openjdk.java.net</a><br>
                  > Subject: Re: JVM crash HS machine<br>
                  ><br>
                  > Igor,<br>
                  ><br>
                  > Please, send across full hs_err_*.log<br>
                  ><br>
                  > -Dmitry<br>
                  ><br>
                  > On 2011-04-17 17:23, Igor Shprukh wrote:<br>
                  > > *Hi all, I have the following error after
                  running the JVM for about<br>
                  > > 5 hrs.*<br>
                  > ><br>
                  > > *This is a Linux – AMD 64-bit machine with 16
                  processors.*<br>
                  > ><br>
                  > > *The crash is at the GC, do you have any
                  ideas on the cause ?*<br>
                  > ><br>
                  > > **<br>
                  > ><br>
                  > > *Thank You !*<br>
                  > ><br>
                  > > Program terminated with signal 6, Aborted.<br>
                  > ><br>
                  > > #0 0x00000035b2430265 in raise () from
                  /lib64/libc.so.6<br>
                  > ><br>
                  > > (gdb) bt<br>
                  > ><br>
                  > > #0 0x00000035b2430265 in raise () from
                  /lib64/libc.so.6<br>
                  > ><br>
                  > > #1 0x00000035b2431d10 in abort () from
                  /lib64/libc.so.6<br>
                  > ><br>
                  > > #2 0x00002aed9f0a8fd7 in os::abort(bool) ()<br>
                  > ><br>
                  > > from
                  /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                  > ><br>
                  > > #3 0x00002aed9f1fc05d in
                  VMError::report_and_die() ()<br>
                  > ><br>
                  > > from
                  /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                  > ><br>
                  > > #4 0x00002aed9f0af655 in
                  JVM_handle_linux_signal ()<br>
                  > ><br>
                  > > from
                  /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                  > ><br>
                  > > #5 0x00002aed9f0abbae in signalHandler(int,
                  siginfo*, void*) ()<br>
                  > ><br>
                  > > from
                  /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                  > ><br>
                  > > #6 <signal handler called><br>
                  > ><br>
                  > > #7 0x00002aed9ee64703 in void
                  ParScanClosure::do_oop_work<unsigned<br>
                  > > int>(unsigned int*, bool, bool) () from<br>
                  > >
                  /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                  > ><br>
                  > > #8 0x00002aed9f095d43 in
                  objArrayKlass::oop_oop_iterate_nv(oopDesc*,<br>
                  > > ParScanWithoutBarrierClosure*) () from<br>
                  > >
                  /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                  > ><br>
                  > > #9 0x00002aed9f0bc0e4 in
                  ParScanThreadState::trim_queues(int) ()<br>
                  > ><br>
                  > > from
                  /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                  > ><br>
                  > > #10 0x00002aed9f0bcbde in
                  ParEvacuateFollowersClosure::do_void() ()<br>
                  > ><br>
                  > > from
                  /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                  > ><br>
                  > > #11 0x00002aed9f0bce36 in
                  ParNewGenTask::work(int) ()<br>
                  > ><br>
                  > > from
                  /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                  > ><br>
                  > > #12 0x00002aed9f21245d in GangWorker::loop()
                  ()<br>
                  > ><br>
                  > > from
                  /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                  > ><br>
                  > > #13 0x00002aed9f212374 in GangWorker::run()
                  ()<br>
                  > ><br>
                  > > from
                  /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                  > ><br>
                  > > #14 0x00002aed9f0ae14f in
                  java_start(Thread*) ()<br>
                  > ><br>
                  > > from
                  /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                  > ><br>
                  > > #15 0x00000035b2c0673d in start_thread ()
                  from /lib64/libpthread.so.0<br>
                  > ><br>
                  > > #16 0x00000035b24d3d1d in clone () from
                  /lib64/libc.so.6<br>
                  > ><br>
                  > > (gdb)<br>
                  > ><br>
                  ><br>
                  ><br>
                  <br>
                  <br>
                  <br>
                  <br>
                  I have the same problem with Linux and jdk1.6.0_24.<br>
                  <br>
                  If anybody has a solution, please let me know.<br>
                  Below is part of the gdb stack trace:<br>
                  <br>
                  Thread 1 (Thread 1996):<br>
                  #0  0xffffe410 in __kernel_vsyscall ()<br>
                  No symbol table info available.<br>
                  #1  0x00b0ddf0 in raise () from /lib/libc.so.6<br>
                  No symbol table info available.<br>
                  #2  0x00b0f701 in abort () from /lib/libc.so.6<br>
                  No symbol table info available.<br>
                  #3  0xf78d823f in os::abort(bool) ()<br>
                  from
                  /usr/java/jdk1.6.0_24/jre/lib/i386/server/libjvm.so<br>
                  No symbol table info available.<br>
                  #4  0xf7a1f431 in VMError::report_and_die() ()<br>
                  from
                  /usr/java/jdk1.6.0_24/jre/lib/i386/server/libjvm.so<br>
                  No symbol table info available.<br>
                  #5  0xf78df1dc in JVM_handle_linux_signal ()<br>
                  from
                  /usr/java/jdk1.6.0_24/jre/lib/i386/server/libjvm.so<br>
                  No symbol table info available.<br>
                  #6  0xf78db124 in signalHandler(int, siginfo*, void*)
                  ()<br>
                  from
                  /usr/java/jdk1.6.0_24/jre/lib/i386/server/libjvm.so<br>
                  No symbol table info available.<br>
                  #7  <signal handler called><br>
                  No symbol table info available.<br>
                  #8  0x00b4ef5f in _int_malloc () from /lib/libc.so.6<br>
                  No symbol table info available.<br>
                  #9  0x00b50fb7 in malloc () from /lib/libc.so.6<br>
                  No symbol table info available.<br>
                  #10 0x4c242af7 in operator new(unsigned int) () from
                  /usr/lib/libstdc++.so.6<br>
                  <br>
                  Thanks<br>
                  <font color="#888888">/Y<br>
                    <br>
                    <br>
                    <br>
                  </font></blockquote>
              </div>
              <br>
            </div>
          </div>
        </div>
      </div>
      <br>
    </blockquote>
  </div></div></div>

</blockquote></div><br></div>