Disastrous bug when running jinfo and jmap
tobe
tobeg3oogle at gmail.com
Tue Sep 2 13:49:16 UTC 2014
I also found this article on ptrace, threads and Linux signal handling,
which may be relevant:
http://ebergen.net/wordpress/2008/06/25/ptrace-on-threads-and-linux-signal-handling-issues/
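
To tell whether a suspended JVM is still held by a debugger or has simply been left stopped, the State and TracerPid fields in /proc/<pid>/status can be inspected. Below is a minimal Python sketch of that check. The helper names (parse_proc_status, read_proc_status, resume_if_stopped) are made up for illustration. Sending SIGCONT is the usual way to resume a stopped process, but whether it recovers the JVMs in this thread is an assumption, not something confirmed here.

```python
import os
import signal


def parse_proc_status(text):
    """Return (state, tracer_pid) parsed from the text of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    state = fields.get("State", "?")             # e.g. "T (stopped)"
    tracer_pid = int(fields.get("TracerPid", "0"))  # 0 means no tracer attached
    return state, tracer_pid


def read_proc_status(pid):
    """Read and parse /proc/<pid>/status for a live process (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_proc_status(f.read())


def resume_if_stopped(pid):
    """Send SIGCONT if the process is stopped and no tracer is attached.

    Assumption: the process was left stopped after the debugger detached.
    Returns True if a signal was sent, False otherwise.
    """
    state, tracer_pid = read_proc_status(pid)
    if state.startswith("T") and tracer_pid == 0:
        os.kill(pid, signal.SIGCONT)
        return True
    return False
```

For the process quoted below, that would be resume_if_stopped(36663), run as the same user as the JVM. A non-zero TracerPid would suggest the tool is still attached, in which case the sketch deliberately does nothing.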
On Tue, Sep 2, 2014 at 9:37 PM, tobe <tobeg3oogle at gmail.com> wrote:
> Now I suspect ptrace. Our kernel version is 2.6.32-279; maybe it doesn't
> resume the threads correctly. Could this be related to
> http://kernel.opensuse.org/cgit/kernel/commit/?h=openSUSE-13.1&id=d1f26676dad578a65c94782f0c2bd00b7aa68f1b
> ?
>
>
> On Tue, Sep 2, 2014 at 8:03 PM, tobe <tobeg3oogle at gmail.com> wrote:
>
>> As @mikael said, running jstack -F shows the same behaviour, while plain
>> jstack doesn't. But our processes have been suspended for several days,
>> which is quite abnormal. I think something is preventing the processes
>> from resuming. Is it related to our running environment or to jdk1.6?
>>
>>
>> On Tue, Sep 2, 2014 at 6:05 PM, tobe <tobeg3oogle at gmail.com> wrote:
>>
>>> Hi @martijn. Do you mean you can run jmap and jinfo on a Java process
>>> that has been running for over 25 days? Have you checked the status of
>>> that process? Our 1.6 JVMs were suspended but did not exit.
>>>
>>> If this is an issue in 1.6, can anyone help track it down and patch
>>> it?
>>>
>>>
>>> On Tue, Sep 2, 2014 at 5:38 PM, tobe <tobeg3oogle at gmail.com> wrote:
>>>
>>>> Thanks, @mikael, for replying. But I can see the complete message
>>>> "Server compiler detected" and expect the JVM to continue afterwards.
>>>> It's weird that this doesn't happen when running jinfo on new processes.
>>>>
>>>>
>>>>
>>>> On Tue, Sep 2, 2014 at 5:28 PM, Staffan Larsen <
>>>> staffan.larsen at oracle.com> wrote:
>>>>
>>>>>
>>>>> On 2 sep 2014, at 11:15, Mikael Gerdin <mikael.gerdin at oracle.com>
>>>>> wrote:
>>>>>
>>>>> > Hi,
>>>>> >
>>>>> > This is the expected behavior for jmap and jinfo. If you call jstack
>>>>> > with the "-F" flag you will see the same behavior.
>>>>> >
>>>>> > The reason for this is that jmap, jinfo and jstack -F all attach to
>>>>> > your target JVM as a debugger and read the memory from the process.
>>>>> > That needs to be done when the target process is in a frozen state.
>>>>>
>>>>> But when jinfo/jmap/jstack is done with the process it should continue
>>>>> execution.
>>>>>
>>>>> Is this reproducible with JDK 8?
>>>>>
>>>>> /Staffan
>>>>>
>>>>>
>>>>> >
>>>>> > /Mikael
>>>>> >
>>>>> > On 2014-09-02 11:08, tobe wrote:
>>>>> >> When I run jinfo or jmap against any Java process, it "suspends"
>>>>> >> the Java process. This is 100% reproducible for long-running
>>>>> >> processes.
>>>>> >>
>>>>> >> Here're the detailed steps:
>>>>> >>
>>>>> >> 1. Pick a Java process that has been running for over 25 days (it's
>>>>> >> weird, but this doesn't reproduce with new processes).
>>>>> >> 2. Run ps to check the state of the process; it should be "Sl", which
>>>>> >> is expected.
>>>>> >> 3. Run jinfo or jmap against this process (BTW, jstack doesn't have
>>>>> >> this issue).
>>>>> >> 4. Run ps to check the state of the process again. This time it has
>>>>> >> changed to "Tl", which means STOPPED, and the process doesn't respond
>>>>> >> to any requests.
>>>>> >>
>>>>> >> Here's the output of our process:
>>>>> >>
>>>>> >> [work at hadoop ~]$ ps aux |grep "qktst" |grep "RegionServer"
>>>>> >> work 36663 0.1 1.7 24157828 1150820 ? Sl Aug06 72:54
>>>>> >> /opt/soft/jdk/bin/java -cp
>>>>> >>
>>>>> /home/work/app/hbase/qktst-qk/regionserver/:/home/work/app/hbase/qktst-qk/regionserver/package//:/home/work/app/hbase/qktst-qk/regionserver/package//lib/*:/home/work/app/hbase/qktst-qk/regionserver/package//*
>>>>> >>
>>>>> -Djava.library.path=:/home/work/app/hbase/qktst-qk/regionserver/package/lib/native/:/home/work/app/hbase/qktst-qk/regionserver/package/lib/native/Linux-amd64-64
>>>>> >>
>>>>> -Xbootclasspath/p:/home/work/app/hbase/qktst-qk/regionserver/package/lib/hadoop-security-2.0.0-mdh1.1.0.jar
>>>>> >> -Xmx10240m -Xms10240m -Xmn1024m -XX:MaxDirectMemorySize=1024m
>>>>> >> -XX:MaxPermSize=512m
>>>>> >>
>>>>> -Xloggc:/home/work/app/hbase/qktst-qk/regionserver/stdout/regionserver_gc_20140806-211157.log
>>>>> >> -Xss256k -XX:PermSize=64m -XX:+HeapDumpOnOutOfMemoryError
>>>>> >> -XX:HeapDumpPath=/home/work/app/hbase/qktst-qk/regionserver/log
>>>>> >> -XX:+PrintGCApplicationStoppedTime -XX:+UseConcMarkSweepGC
>>>>> -verbose:gc
>>>>> >> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:SurvivorRatio=6
>>>>> >> -XX:+UseCMSCompactAtFullCollection
>>>>> -XX:CMSInitiatingOccupancyFraction=75
>>>>> >> -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSParallelRemarkEnabled
>>>>> >> -XX:+UseNUMA -XX:+CMSClassUnloadingEnabled
>>>>> >> -XX:CMSMaxAbortablePrecleanTime=10000 -XX:TargetSurvivorRatio=80
>>>>> >> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=100
>>>>> -XX:GCLogFileSize=128m
>>>>> >> -XX:CMSWaitDuration=2000 -XX:+CMSScavengeBeforeRemark
>>>>> >> -XX:+PrintPromotionFailure -XX:ConcGCThreads=16
>>>>> -XX:ParallelGCThreads=16
>>>>> >> -XX:PretenureSizeThreshold=2097088 -XX:+CMSConcurrentMTEnabled
>>>>> >> -XX:+ExplicitGCInvokesConcurrent -XX:+SafepointTimeout
>>>>> >> -XX:MonitorBound=16384 -XX:-UseBiasedLocking
>>>>> -XX:MaxTenuringThreshold=3
>>>>> >> -Dproc_regionserver
>>>>> >>
>>>>> -Djava.security.auth.login.config=/home/work/app/hbase/qktst-qk/regionserver/jaas.conf
>>>>> >> -Djava.net.preferIPv4Stack=true
>>>>> >> -Dhbase.log.dir=/home/work/app/hbase/qktst-qk/regionserver/log
>>>>> >> -Dhbase.pid=36663 -Dhbase.cluster=qktst-qk -Dhbase.log.level=debug
>>>>> >> -Dhbase.policy.file=hbase-policy.xml
>>>>> >> -Dhbase.home.dir=/home/work/app/hbase/qktst-qk/regionserver/package
>>>>> >>
>>>>> -Djava.security.krb5.conf=/home/work/app/hbase/qktst-qk/regionserver/krb5.conf
>>>>> >> -Dhbase.id.str=work
>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer start
>>>>> >> [work at hadoop ~]$ jinfo 36663 > tobe.jinfo
>>>>> >> Attaching to process ID 36663, please wait...
>>>>> >> Debugger attached successfully.
>>>>> >> Server compiler detected.
>>>>> >> JVM version is 20.12-b01
>>>>> >> [work at hadoop ~]$ ps aux |grep "qktst" |grep "RegionServer"
>>>>> >> work 36663 0.1 1.7 24157828 1151008 ? Tl Aug06 72:54
>>>>> >> /opt/soft/jdk/bin/java -cp
>>>>> >>
>>>>> /home/work/app/hbase/qktst-qk/regionserver/:/home/work/app/hbase/qktst-qk/regionserver/package//:/home/work/app/hbase/qktst-qk/regionserver/package//lib/*:/home/work/app/hbase/qktst-qk/regionserver/package//*
>>>>> >>
>>>>> -Djava.library.path=:/home/work/app/hbase/qktst-qk/regionserver/package/lib/native/:/home/work/app/hbase/qktst-qk/regionserver/package/lib/native/Linux-amd64-64
>>>>> >>
>>>>> -Xbootclasspath/p:/home/work/app/hbase/qktst-qk/regionserver/package/lib/hadoop-security-2.0.0-mdh1.1.0.jar
>>>>> >> -Xmx10240m -Xms10240m -Xmn1024m -XX:MaxDirectMemorySize=1024m
>>>>> >> -XX:MaxPermSize=512m
>>>>> >>
>>>>> -Xloggc:/home/work/app/hbase/qktst-qk/regionserver/stdout/regionserver_gc_20140806-211157.log
>>>>> >> -Xss256k -XX:PermSize=64m -XX:+HeapDumpOnOutOfMemoryError
>>>>> >> -XX:HeapDumpPath=/home/work/app/hbase/qktst-qk/regionserver/log
>>>>> >> -XX:+PrintGCApplicationStoppedTime -XX:+UseConcMarkSweepGC
>>>>> -verbose:gc
>>>>> >> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:SurvivorRatio=6
>>>>> >> -XX:+UseCMSCompactAtFullCollection
>>>>> -XX:CMSInitiatingOccupancyFraction=75
>>>>> >> -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSParallelRemarkEnabled
>>>>> >> -XX:+UseNUMA -XX:+CMSClassUnloadingEnabled
>>>>> >> -XX:CMSMaxAbortablePrecleanTime=10000 -XX:TargetSurvivorRatio=80
>>>>> >> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=100
>>>>> -XX:GCLogFileSize=128m
>>>>> >> -XX:CMSWaitDuration=2000 -XX:+CMSScavengeBeforeRemark
>>>>> >> -XX:+PrintPromotionFailure -XX:ConcGCThreads=16
>>>>> -XX:ParallelGCThreads=16
>>>>> >> -XX:PretenureSizeThreshold=2097088 -XX:+CMSConcurrentMTEnabled
>>>>> >> -XX:+ExplicitGCInvokesConcurrent -XX:+SafepointTimeout
>>>>> >> -XX:MonitorBound=16384 -XX:-UseBiasedLocking
>>>>> -XX:MaxTenuringThreshold=3
>>>>> >> -Dproc_regionserver
>>>>> >>
>>>>> -Djava.security.auth.login.config=/home/work/app/hbase/qktst-qk/regionserver/jaas.conf
>>>>> >> -Djava.net.preferIPv4Stack=true
>>>>> >> -Dhbase.log.dir=/home/work/app/hbase/qktst-qk/regionserver/log
>>>>> >> -Dhbase.pid=36663 -Dhbase.cluster=qktst-qk -Dhbase.log.level=debug
>>>>> >> -Dhbase.policy.file=hbase-policy.xml
>>>>> >> -Dhbase.home.dir=/home/work/app/hbase/qktst-qk/regionserver/package
>>>>> >>
>>>>> -Djava.security.krb5.conf=/home/work/app/hbase/qktst-qk/regionserver/krb5.conf
>>>>> >> -Dhbase.id.str=work
>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer start
>>>>> >>
>>>>> >>
>>>>> >> I hope some JVM experts here could help.
>>>>> >>
>>>>> >> $ java -version
>>>>> >> java version "1.6.0_37"
>>>>> >> Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
>>>>> >> Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)
>>>>> >>
>>>>>
>>>>>
>>>>
>>>
>>
>
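
For reading the ps STAT values quoted in the report ("Sl" before jinfo, "Tl" after), here is a small Python sketch that decodes them. The tables follow the process state codes described in the procps ps man page. The function name decode_stat is made up for illustration. Note that older kernels report a tracing stop with the same letter "T" as a job-control stop, so "Tl" alone may not say which kind of stop it is.

```python
# First character of STAT: the process state.
STATE = {
    "D": "uninterruptible sleep",
    "R": "running or runnable",
    "S": "interruptible sleep",
    "T": "stopped (job control signal; older kernels also show tracing stops as T)",
    "t": "stopped by debugger during tracing",
    "Z": "zombie",
}

# Remaining characters of STAT: additional flags.
FLAGS = {
    "<": "high priority",
    "N": "low priority",
    "L": "has pages locked in memory",
    "s": "session leader",
    "l": "multi-threaded",
    "+": "in foreground process group",
}


def decode_stat(stat):
    """Split a ps STAT value such as "Sl" into (state, [flags])."""
    state = STATE.get(stat[:1], "unknown")
    flags = [FLAGS.get(c, "unknown flag %r" % c) for c in stat[1:]]
    return state, flags
```

With this, decode_stat("Sl") gives an interruptible sleep with the multi-threaded flag, and decode_stat("Tl") gives a stopped, multi-threaded process, which matches the report.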
More information about the serviceability-dev
mailing list