Oops, my memory fell short there, too...<div>I thought I recalled it was PS (ParallelScavenge), without actually rereading the original mail. Sorry, it's ParNew+CMS.<div><br></div><div>-- Kris<br><br><div class="gmail_quote">On Thu, Sep 8, 2011 at 1:00 AM, Ramki Ramakrishna <span dir="ltr"><<a href="mailto:y.s.ramakrishna@oracle.com">y.s.ramakrishna@oracle.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
  <div bgcolor="#ffffff" text="#000000">
    Kris, Thanks for the reminder. As you can tell my memory is short
    (and fading :-)<br>
    <br>
    Anyway, the crashes below (in your emails and in Yogesh's) all
    seem to be with<br>
    ParNew (which is the young gen scavenger that typically goes with
    CMS when you<br>
    run on an MP platform), not with ParallelScavenge.<br>
    <br>
    In case it resurfaces with more recent JVMs, we should follow up to
    see if something can be done ...<br>
    If it's with older JVMs, please follow up with the appropriate
    support org.<br>
    <br>
    thanks!<br><font color="#888888">
    -- ramki</font><div><div></div><div class="h5"><br>
    <br>
    On 9/7/2011 9:15 AM, Krystal Mok wrote:
    <blockquote type="cite">My bad. I hit "reply" instead of "reply all" on that
      older thread so my follow-ups didn't show up on the list. I'm
      including the original mail below. Anyway, it wasn't fixed here,
      but we can't reproduce it any more (on both 6u23 and 6u25,
      64-bit Server VM), so we're just letting it slip through. One
      possibility is that we're switching more and more to CMS, and the
      problem occurred in ParallelScavenge.
      <div>
        <br>
      </div>
      <div>The original mail:</div>
      <div><br>
      </div>
      <div>
        
        <br>
        <br>
        <div class="gmail_quote">---------- Forwarded message ----------<br>
          From: <b class="gmail_sendername">Y. Srinivas Ramakrishna</b> <span dir="ltr"><<a href="mailto:y.s.ramakrishna@oracle.com" target="_blank">y.s.ramakrishna@oracle.com</a>></span><br>
          Date: Mon, Apr 18, 2011 at 11:31 PM<br>
          Subject: Re: Crash log when do GC...<br>
          To: Krystal Mok <<a href="mailto:rednaxelafx@gmail.com" target="_blank">rednaxelafx@gmail.com</a>><br>
          <br>
          <br>
          I wonder if it's an issue with array copy stubs which leave
          random<br>
          junk in some locations of the array, or if there's a race that
          causes<br>
          some locations to transiently have bad data. Seems unlikely,
          but the<br>
          involvement of object arrays raises some suspicions. I'll see
          if any<br>
          array copying bugs have surfaced or been fixed recently
          although none<br>
          comes readily to mind...<br>
          <br>
          PS: if it's production runs, you won't be able to use heap
          verification,<br>
          but if you have a test load that reproduces the problem,<br>
          heap verification might give us some clues (although given the
          nature of<br>
          the problem, I am not hopeful). If you have a support
          contract,<br>
          I'd suggest filing an official ticket and sending in a couple
          of core<br>
          files, if you have any sitting around. That may be the only
          way to<br>
          make progress on this kind of issue.<br>
          <br>
          -- ramki
          <div>
            <div><br>
              <br>
              On 4/18/2011 8:16 AM, Krystal Mok wrote:<br>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex">
                Hi,<br>
                <br>
                I wasn't able to make a minimal repro for this problem,
                because it seems to<br>
                happen pretty randomly, running fine for 9 to 15 hours
                before suddenly<br>
                crashing with a segfault.<br>
                It's already running JDK6u23, and there doesn't seem to
                be a lot of changes<br>
                to HotSpot that got into JDK6u24, so I doubt if there
                would be any progress<br>
                upgrading to this version. Might try JDK6u25b03 and see
                if there's any luck.<br>
                <br>
                Attached to this email is another crash log for the
                same issue. The program<br>
                had a lot of threads, and crashed with this stack trace:<br>
                <br>
                Stack: [0x0000000000000000,0x0000000000000000], sp=0x0000000041f8a810, free space=1080874k<br>
                Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)<br>
                V  [libjvm.so+0x3e62c3]<void ParScanClosure::do_oop_work<unsigned int>(unsigned int*, bool, bool)+0x63><br>
                V  [libjvm.so+0x60bc83]<objArrayKlass::oop_oop_iterate_nv(oopDesc*, ParScanWithoutBarrierClosure*)+0xf3><br>
                V  [libjvm.so+0x6318d4]<ParScanThreadState::trim_queues(int)+0x124><br>
                V  [libjvm.so+0x3e61c5]<void ParScanClosure::do_oop_work<oopDesc*>(oopDesc**, bool, bool)+0x105><br>
                V  [libjvm.so+0x632260]<ParRootScanWithoutBarrierClosure::do_oop(oopDesc**)+0x10><br>
                V  [libjvm.so+0x3702b1]<InterpreterFrameClosure::offset_do(int)+0x31><br>
                V  [libjvm.so+0x619776]<InterpreterOopMap::iterate_oop(OffsetClosure*)+0x86><br>
                V  [libjvm.so+0x36efd8]<frame::oops_interpreted_do(OopClosure*, RegisterMap const*, bool)+0x188><br>
                V  [libjvm.so+0x36fd71]<frame::oops_do_internal(OopClosure*, CodeBlobClosure*, RegisterMap*, bool)+0xb1><br>
                V  [libjvm.so+0x728fb3]<JavaThread::oops_do(OopClosure*, CodeBlobClosure*)+0x1d3><br>
                V  [libjvm.so+0x72bc9e]<Threads::possibly_parallel_oops_do(OopClosure*, CodeBlobClosure*)+0xbe><br>
                V  [libjvm.so+0x69572e]<SharedHeap::process_strong_roots(bool, bool, SharedHeap::ScanningOption, OopClosure*, CodeBlobClosure*, OopsInGenClosure*)+0x8e><br>
                V  [libjvm.so+0x39d75d]<GenCollectedHeap::gen_process_strong_roots(int, bool, bool, bool, SharedHeap::ScanningOption, OopsInGenClosure*, bool, OopsInGenClosure*)+0x7d><br>
                V  [libjvm.so+0x6325f6]<ParNewGenTask::work(int)+0xd6><br>
                V  [libjvm.so+0x78018d]<GangWorker::loop()+0xaa><br>
                V  [libjvm.so+0x7800a4]<GangWorker::run()+0x24><br>
                V  [libjvm.so+0x623e1f]<java_start(Thread*)+0x13f><br>
                <br>
                JavaThread 0x00002aaab7692800 (nid = 8559) was being processed<br>
                Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)<br>
                j  java.lang.reflect.Array.set(Ljava/lang/Object;ILjava/lang/Object;)V+0<br>
                J  com.taobao.top.core.DefaultBlackBoxEngine.callHsf(Ljava/lang/String;Ljava/lang/String;Ljava/lang/Long;Lcom/taobao/hsf/app/spring/util/SuperHSFSpringConsumerBeanTop;[Ljava/lang/String;[Ljava/lang/Object;Lcom/taobao/top/core/framework/TopPipeResult;)Ljava/lang/Object;<br>
                J  com.taobao.top.core.DefaultApiExecutor.execute(Lcom/taobao/top/core/framework/TopPipeInput;Lcom/taobao/top/core/framework/TopPipeResult;)V<br>
                J  com.taobao.top.core.framework.TopPipeTask.run()V<br>
                J  java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object;<br>
                J  java.util.concurrent.FutureTask.run()V<br>
                J  java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Ljava/lang/Runnable;)V<br>
                J  java.util.concurrent.ThreadPoolExecutor$Worker.run()V<br>
                j  java.lang.Thread.run()V+11<br>
                v  ~StubRoutines::call_stub<br>
                <br>
                What's weird about it is that this program would
                repeatedly crash in the<br>
                same function in ParNew GC, and that the JavaThread it was
                working on was in<br>
                an invocation of java.lang.reflect.Array.set(). In this
                case it's trying to<br>
                dereference a bad pointer decompressed from a
                narrowOop, but it's hard<br>
                to trace where things first went wrong.<br>
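                <br>
                To make the narrowOop point concrete, here is a minimal sketch of the usual base-plus-shift decoding, showing why a junk 32-bit value in an object array slot turns into a wild 64-bit address once the GC decodes it. The class name, method name, heap base and sample values are made up for illustration; this is not HotSpot's code, and the real base and shift depend on heap size and placement.<br>
<pre>
```java
public class NarrowOopSketch {
    // Decode a 32-bit compressed reference into a 64-bit address:
    // address = heapBase + (narrowOop << shift). A zero narrowOop means null.
    static long decode(long heapBase, int narrowOop, int shift) {
        if (narrowOop == 0) {
            return 0L; // null stays null
        }
        // Treat the 32-bit value as unsigned before shifting.
        return heapBase + ((narrowOop & 0xFFFFFFFFL) << shift);
    }

    public static void main(String[] args) {
        long heapBase = 0x0000000700000000L; // hypothetical heap base
        int shift = 3;                        // assumes 8-byte object alignment
        int good = 0x00001000;                // plausible compressed reference
        int junk = 0xDEADBEEF;                // stand-in for a corrupted array slot
        System.out.printf("good -> 0x%x%n", decode(heapBase, good, shift));
        System.out.printf("junk -> 0x%x%n", decode(heapBase, junk, shift));
    }
}
```
</pre>
                Any non-zero bit pattern decodes to some address, so the collector cannot tell a corrupted slot from a valid one until it dereferences the result.<br>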
                <br>
                We'll see if it's affordable to turn on heap
                verification to trace it down.<br>
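                <br>
                For anyone attempting a repro in a test environment, a stress loop along the following lines is one possible starting point. It is only a sketch and is not known to trigger the crash: the class name, thread count, iteration count and array length are all made up for illustration. It hammers java.lang.reflect.Array.set() on fresh object arrays from several threads, allocating short-lived garbage so that young collections run frequently.<br>
<pre>
```java
import java.lang.reflect.Array;

public class ArraySetStress {
    // Each worker repeatedly fills a fresh Object[] via reflection and then
    // drops it, so the young generation fills up and is scavenged often.
    // Returns the total number of reflective stores performed.
    static long run(int threads, final int iterations, final int arrayLength)
            throws InterruptedException {
        final long[] stores = new long[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < iterations; i++) {
                        Object[] arr = new Object[arrayLength];
                        for (int j = 0; j < arrayLength; j++) {
                            Array.set(arr, j, new byte[64]);
                            stores[id]++; // each worker writes only its own slot
                        }
                    }
                }
            });
            workers[t].start();
        }
        long total = 0;
        for (int t = 0; t < threads; t++) {
            workers[t].join(); // join makes the worker's counter visible here
            total += stores[t];
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(8, 20000, 64) + " reflective stores completed");
    }
}
```
</pre>
                To exercise the same collector as the crashing process, it would presumably be run with the production GC settings (for ParNew+CMS on these JDK 6 builds, -XX:+UseConcMarkSweepGC -XX:+UseParNewGC) and a deliberately small young generation.<br>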
                <br>
                Sincerely,<br>
                Kris Mok<br>
                <br>
                On Mon, Apr 18, 2011 at 10:58 PM, Y. Srinivas
                Ramakrishna<<br>
                <a href="mailto:y.s.ramakrishna@oracle.com" target="_blank">y.s.ramakrishna@oracle.com</a>>
                 wrote:<br>
                <br>
                <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex">
                  Hi, I have heard a couple of other reports of this
                  sort recently.<br>
                  But I don't think we have found or fixed any issue
                  recently that<br>
                  might address this. You might want to try a more
                  recent<br>
                  JVM/JDK to confirm if the crash still occurs (which I
                  think<br>
                  it probably will, going by other such reports). Do you
                  have<br>
                  a test case? If so, please file a bug through support
                  or send<br>
                  us your test case off-line. You can also enable heap
                  verification<br>
                  at some considerable GC performance cost and see if
                  that gets us<br>
                  closer to the root cause. (From looking at the stack
                  trace it appears<br>
                  as though GC finds a bad reference from an object
                  array while copying<br>
                  live objects from the young generation during a
                  scavenge.)<br>
                  <br>
                  -- ramki<br>
                  <br>
                  <br>
                  <br>
                  On 4/18/2011 6:48 AM, BlueDavy Lin wrote:<br>
                  <br>
                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex">
                    hi!<br>
                    <br>
                          Recently two of our apps have been crashing often
                    during GC; the crash log is<br>
                    attached. Can someone give me some advice? Thanks.<br>
                    <br>
                          PS: I tried setting -XX:-UseCompressedOops, but
                    it still crashes, and<br>
                    the log is the same.<br>
                    <br>
                    <br>
                  </blockquote>
                  <br>
                </blockquote>
                <br>
              </blockquote>
              <br>
            </div>
          </div>
        </div>
        <br>
        <div class="gmail_quote">On Thu, Sep 8, 2011 at 12:06 AM, Ramki
          Ramakrishna <span dir="ltr"><<a href="mailto:y.s.ramakrishna@oracle.com" target="_blank">y.s.ramakrishna@oracle.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex">
            <div bgcolor="#ffffff" text="#000000"> I didn't see any
              follow-up on the issue reported at:-
              <div><br>
                <br>
                 <a href="http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-April/002537.html" target="_blank">http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-April/002537.html</a><br>
                <br>
              </div>
              so I do not know if that issue ever got satisfactorily
              resolved. I don't think<br>
              there are any open bugs in our database for that issue. If
              there's a test-case we<br>
              can take a look.<br>
              <br>
              thanks.<br>
              <font color="#888888"> -- ramki</font>
              <div>
                <div><br>
                  <br>
                  On 9/7/2011 4:36 AM, Krystal Mok wrote:
                  <blockquote type="cite">CC'ing hotspot-gc-dev for the
                    first stack trace<br>
                    <br>
                    <div class="gmail_quote">---------- Forwarded
                      message ----------<br>
                      From: <b class="gmail_sendername">Krystal Mok</b>
                      <span dir="ltr"><<a href="mailto:rednaxelafx@gmail.com" target="_blank">rednaxelafx@gmail.com</a>></span><br>
                      Date: Wed, Sep 7, 2011 at 7:35 PM<br>
                      Subject: Re: JVM crash HS machine<br>
                      To: yogesh <<a href="mailto:ydhaked@amdocs.com" target="_blank">ydhaked@amdocs.com</a>><br>
                      <br>
                      <br>
                      Hi,
                      <div><br>
                      </div>
                      <div>I don't think the two stack traces shown here
                        are of the same issue. The first one (the one in
                        quotes) seems to be the same as the one mentioned
                        before: <a href="http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-April/002537.html" target="_blank">http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-April/002537.html</a> ,
                        but there is no solution yet (to my knowledge).</div>
                      <div><br>
                      </div>
                      <div>The second stack trace is missing some very
                        important stuff. It's important to know the
                        caller of the operator new, which means a deeper
                        stack trace log would help; without that it's
                        quite hard to infer any context out of the stack
                        trace. It'd also be helpful to know what signal
                        it was.</div>
                      <div><br>
                      </div>
                      <div>Regards,</div>
                      <div>Kris Mok
                        <div>
                          <div><br>
                            <br>
                            <div class="gmail_quote">On Wed, Sep 7, 2011
                              at 7:06 PM, yogesh <span dir="ltr"><<a href="mailto:ydhaked@amdocs.com" target="_blank">ydhaked@amdocs.com</a>></span>
                              wrote:<br>
                              <blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex"> Igor Shprukh
                                <a href="mailto:igor.shprukh@..." target="_blank"><igor.shprukh@...></a>
                                writes:<br>
                                <br>
                                ><br>
                                > I have attached the hs log file.<br>
                                > The JVM continuously crashes every
                                two hours.<br>
                                > Thank You!<br>
                                > -----Original Message-----<br>
                                > From: Dmitry Samersoff [mailto:<a href="mailto:Dmitry.Samersoff" target="_blank">Dmitry.Samersoff</a>
                                <at> <a href="http://oracle.com" target="_blank">oracle.com</a>]<br>
                                > Sent: Sunday, April 17, 2011 4:53
                                PM<br>
                                > To: Igor Shprukh<br>
                                > Cc: hotspot-runtime-dev <at>
                                <a href="http://openjdk.java.net" target="_blank">openjdk.java.net</a><br>
                                > Subject: Re: JVM crash HS machine<br>
                                ><br>
                                > Igor,<br>
                                ><br>
                                > Please, send across full
                                hs_err_*.log<br>
                                ><br>
                                > -Dmitry<br>
                                ><br>
                                > On 2011-04-17 17:23, Igor Shprukh
                                wrote:<br>
                                > > *Hi all, I have the following
                                error after running the JVM for
                                about<br>
                                > > 5 hrs.*<br>
                                > ><br>
                                > > *This is a Linux amd64
                                machine with 16 processors.*<br>
                                > ><br>
                                > > *The crash is at the GC, do
                                you have any ideas on the cause ?*<br>
                                > ><br>
                                > > **<br>
                                > ><br>
                                > > *Thank You !*<br>
                                > ><br>
                                > > Program terminated with signal
                                6, Aborted.<br>
                                > ><br>
                                > > #0 0x00000035b2430265 in raise
                                () from /lib64/libc.so.6<br>
                                > ><br>
                                > > (gdb) bt<br>
                                > ><br>
                                > > #0 0x00000035b2430265 in raise
                                () from /lib64/libc.so.6<br>
                                > ><br>
                                > > #1 0x00000035b2431d10 in abort
                                () from /lib64/libc.so.6<br>
                                > ><br>
                                > > #2 0x00002aed9f0a8fd7 in
                                os::abort(bool) ()<br>
                                > ><br>
                                > > from
                                /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                                > ><br>
                                > > #3 0x00002aed9f1fc05d in
                                VMError::report_and_die() ()<br>
                                > ><br>
                                > > from
                                /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                                > ><br>
                                > > #4 0x00002aed9f0af655 in
                                JVM_handle_linux_signal ()<br>
                                > ><br>
                                > > from
                                /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                                > ><br>
                                > > #5 0x00002aed9f0abbae in
                                signalHandler(int, siginfo*, void*) ()<br>
                                > ><br>
                                > > from
                                /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                                > ><br>
                                > > #6 <signal handler
                                called><br>
                                > ><br>
                                > > #7 0x00002aed9ee64703 in void
                                ParScanClosure::do_oop_work<unsigned<br>
                                > > int>(unsigned int*, bool,
                                bool) () from<br>
                                > >
                                /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                                > ><br>
                                > > #8 0x00002aed9f095d43 in
                                objArrayKlass::oop_oop_iterate_nv(oopDesc*,<br>
                                > > ParScanWithoutBarrierClosure*)
                                () from<br>
                                > >
                                /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                                > ><br>
                                > > #9 0x00002aed9f0bc0e4 in
                                ParScanThreadState::trim_queues(int) ()<br>
                                > ><br>
                                > > from
                                /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                                > ><br>
                                > > #10 0x00002aed9f0bcbde in
                                ParEvacuateFollowersClosure::do_void()
                                ()<br>
                                > ><br>
                                > > from
                                /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                                > ><br>
                                > > #11 0x00002aed9f0bce36 in
                                ParNewGenTask::work(int) ()<br>
                                > ><br>
                                > > from
                                /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                                > ><br>
                                > > #12 0x00002aed9f21245d in
                                GangWorker::loop() ()<br>
                                > ><br>
                                > > from
                                /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                                > ><br>
                                > > #13 0x00002aed9f212374 in
                                GangWorker::run() ()<br>
                                > ><br>
                                > > from
                                /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                                > ><br>
                                > > #14 0x00002aed9f0ae14f in
                                java_start(Thread*) ()<br>
                                > ><br>
                                > > from
                                /usr/java/jdk1.6.0_24/jre/lib/amd64/server/libjvm.so<br>
                                > ><br>
                                > > #15 0x00000035b2c0673d in
                                start_thread () from
                                /lib64/libpthread.so.0<br>
                                > ><br>
                                > > #16 0x00000035b24d3d1d in
                                clone () from /lib64/libc.so.6<br>
                                > ><br>
                                > > (gdb)<br>
                                > ><br>
                                ><br>
                                ><br>
                                <br>
                                <br>
                                <br>
                                <br>
                                I have the same problem with Linux and
                                jdk1.6.0_24.<br>
                                <br>
                                If anybody has a solution, please let
                                me know.<br>
                                Below is part of the gdb stack trace:<br>
                                <br>
                                Thread 1 (Thread 1996):<br>
                                #0  0xffffe410 in __kernel_vsyscall ()<br>
                                No symbol table info available.<br>
                                #1  0x00b0ddf0 in raise () from
                                /lib/libc.so.6<br>
                                No symbol table info available.<br>
                                #2  0x00b0f701 in abort () from
                                /lib/libc.so.6<br>
                                No symbol table info available.<br>
                                #3  0xf78d823f in os::abort(bool) ()<br>
                                from
                                /usr/java/jdk1.6.0_24/jre/lib/i386/server/libjvm.so<br>
                                No symbol table info available.<br>
                                #4  0xf7a1f431 in
                                VMError::report_and_die() ()<br>
                                from
                                /usr/java/jdk1.6.0_24/jre/lib/i386/server/libjvm.so<br>
                                No symbol table info available.<br>
                                #5  0xf78df1dc in
                                JVM_handle_linux_signal ()<br>
                                from
                                /usr/java/jdk1.6.0_24/jre/lib/i386/server/libjvm.so<br>
                                No symbol table info available.<br>
                                #6  0xf78db124 in signalHandler(int,
                                siginfo*, void*) ()<br>
                                from
                                /usr/java/jdk1.6.0_24/jre/lib/i386/server/libjvm.so<br>
                                No symbol table info available.<br>
                                #7  <signal handler called><br>
                                No symbol table info available.<br>
                                #8  0x00b4ef5f in _int_malloc () from
                                /lib/libc.so.6<br>
                                No symbol table info available.<br>
                                #9  0x00b50fb7 in malloc () from
                                /lib/libc.so.6<br>
                                No symbol table info available.<br>
                                #10 0x4c242af7 in operator new(unsigned
                                int) () from /usr/lib/libstdc++.so.6<br>
                                <br>
                                Thanks<br>
                                <font color="#888888">/Y<br>
                                  <br>
                                  <br>
                                  <br>
                                </font></blockquote>
                            </div>
                            <br>
                          </div>
                        </div>
                      </div>
                    </div>
                    <br>
                  </blockquote>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
  </div></div></div>

</blockquote></div><br></div></div>