Java application getting paused when pfiles or pstack command run for it

Amit Mishra amit.mishra at redknee.com
Thu Sep 8 11:43:13 UTC 2016


Thanks Poonam, do you have exact set of commands to be executed on impacted application code, I was thinking of core dump but it will cause complete application breakdown while in current scenario this pause is happening only for few seconds regularly after every hour on specific day of week in our case it is Tuesday and not daily.

Also application become responsive on its own after few seconds so I am suspecting it could be due to those proc or similar commands for application PID.

Thanks Charlie for your response, first thing we checked here was GC logs which is all clean , RMI DGC frequency is 24 hour and even that is disabled as we have ExplicitGCInvokeCMS flag in place.


Moreover these application issues is not happening daily but once in a week on particular day during few hours in morning and evening.


argv[32]: -XX:+ExplicitGCInvokesConcurrent
argv[38]: -Dsun.rmi.dgc.server.gcInterval=86400000
argv[39]: -Dsun.rmi.dgc.client.gcInterval=86400000


I will collect top,netstat -an,ps -eaf next Tuesday and in case if there will be any new findings then I will share with you guys.

Thank you very much once again for your kind support.

Regards,
Amit Mishra


From: Poonam Bajaj Parhar [mailto:poonam.bajaj at oracle.com]
Sent: Wednesday, September 7, 2016 22:28
To: Amit Mishra <amit.mishra at redknee.com>
Cc: hotspot-gc-use at openjdk.java.net
Subject: Re: Java application getting paused when pfiles or pstack command run for it

Hello Amit,

If you suspect that the JVM is hung, then you could try jstack -F to collect the stack traces. And if that too does not work then try collecting core file, and use jstack and/or native debuggers to collect the stack traces including native frames to see what is going on.

Thanks,
Poonam

On 9/7/2016 5:59 AM, Charlie Hunt wrote:
Your description sounds like it may be distributed GC running at its default, every 60 minutes.  A look at GC logs would help confirm that's the case.

GC logs are usually the best place to start looking when looking for sources of application pauses.

If it turns out to be distributed GC, you can change the frequency at which it runs by adding the following system properties:
-Dsun.rmi.dgc.client.gcInterval=<milliseconds>
-Dsun.rmi.dgc.server.gcInterval=<milliseconds>

As an extreme example, you could effectively tell distributed GC to not run by setting:
-Dsun.rmi.dgc.client.gcInterval=Long.MAX_VALUE
-Dsun.rmi.dgc.server.gcInterval=Long.MAX_VALUE

Or, you could add the following command line option:
-XX:+DisableExplicitGC

hths,

charlie

----- Original Message -----
From: amit.mishra at redknee.com<mailto:amit.mishra at redknee.com>
To: charlie.hunt at oracle.com<mailto:charlie.hunt at oracle.com>
Cc: hotspot-gc-use at openjdk.java.net<mailto:hotspot-gc-use at openjdk.java.net>
Sent: Wednesday, September 7, 2016 7:08:23 AM GMT -06:00 US/Canada Central
Subject: RE: Java application getting paused when pfiles or pstack command run for it


Thank you very much Charles, but initially Customer reported a Production issue where java Application process hangs periodically for a few seconds at fixed point of time of every hour.(say 17:41,18:41,19:41).

We tried to investigate it by taking TD's to see where application threads are blocked but TD didn't came up due to which we conclude that it is JVM freeze.
Further to analyze that we were supposed to analyze process threads using pstack but as pstack itself is causing application pause so in your opinion what are the best commands/tools to analyze JVM freeze when Thread dumps stopped coming.

Regards,
Amit

From: charlie hunt [mailto:charlie.hunt at oracle.com]
Sent: Wednesday, September 7, 2016 17:29
To: Amit Mishra <amit.mishra at redknee.com><mailto:amit.mishra at redknee.com>
Cc: hotspot-gc-use at openjdk.java.net<mailto:hotspot-gc-use at openjdk.java.net>
Subject: Re: Java application getting paused when pfiles or pstack command run for it

Hi Amit,

pfiles, pstack and pldd stops the process while they do their work.

The following is directly from the p-tools (pfiles,etc.) man page, and from Oracle docs on p-tools:
These proc tools (p-tools) stop their target processes while inspecting them and reporting the results: files, pled, mmap, and stack. A process can do nothing while it is stopped.

As for why it may take up to 15 seconds?  It may be that there are a large number file descriptors in use, etc. or a (large) portion of that time is spent getting all threads in the JVM and app to come to stopped state, or a combination of both.

hths,

charlie


On Sep 7, 2016, at 5:33 AM, Amit Mishra <amit.mishra at redknee.com<mailto:amit.mishra at redknee.com>> wrote:

Hello Charlie/team,

I need your expert help on one of Production issue whereas when pfiles <JavaappPID> command whenever runs then it cause java application freeze until command get completed.(it is causing long pauses of up-to 15 seconds)

During this whole time situations appears as JVM freeze as TD's for Application PID stopped generating, GC logs are getting paused and application stopped catering traffic.

What could be the cause of it, Application Java version is 1.6u45.(Application PID is 18387)

3 samples from Top command when pfiles ,pldd and pstack command cause Application freeze(as TD's was also not coming we treated this as JVM freeze).


========================================================
1st Sample:
load averages:  1.11,  1.00,  0.98;                    up 68+20:03:01  21:41:59
94 processes: 91 sleeping, 1 stopped, 2 on cpu
CPU states: 85.6% idle,  0.4% user, 14.1% kernel,  0.0% iowait,  0.0% swap
Memory: 64G phys mem, 30G free mem, 16G total swap, 16G free swap

   PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
  5251 root       1   0   17 2872K 1396K cpu/2    0:00  6.82% pfiles 18387
18387 rkadm    999   0    0   29G   23G stop   878.5H  5.73% /usr/Java_1.6_45/bin/amd64/java -Djava.util.logging.manager=com.redknee.framewo

======================================================================
2nd Sample:
PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
10145 root       1   0   17 2620K 1676K cpu/2    0:00  3.90% pldd 18387
  5352 root       4   6   17   31M   27M sleep    0:01  3.33% pkgserv -N pkgchk
18387 rkadm    999  59    0   29G   23G stop   878.5H  0.82% /usr/Java_1.6_45/bin/amd64/java -Djava.util.logging.manager=com.redknee.framewo

=================================================================
3rd Sample
load averages:  1.21,  1.16,  1.15;                    up 68+18:02:37  19:41:35
88 processes: 85 sleeping, 1 stopped, 2 on cpu
CPU states: 85.9% idle,  0.2% user, 13.8% kernel,  0.0% iowait,  0.0% swap
Memory: 64G phys mem, 30G free mem, 16G total swap, 16G free swap

   PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
  1250 rkadm      1  30    0   28M   11M cpu/1    0:09 12.49% pstack 18387
19859 rkadm    392  59    0 3322M 2342M sleep   69.6H  0.46% /usr/Java_1.6_45/bin/java -classpath /usr/Java_1.6_45/lib/tools.jar:/opt/redkne
  1257 rkadm      1  59    0 3128K 1756K cpu/3    0:00  0.05% top -c -d 1 -s 1 100
  1024 root      12   6   17  154M  109M sleep  118:09  0.01% /opt/IBM/SCM/client/../_jvm/bin/java -Xint -Xmx128m -Djlog.logCmdPort=1953 -Dsu
15036 root      32  59    0  138M   74M sleep   85:05  0.01% /usr/java/bin/java -Dviper.fifo.path=/var/run/smc898/boot.fifo -Xmx128m -Dsun.s
  1075 noaccess  19  59    0   97M   91M sleep   69:05  0.01% /usr/java/bin/java -server -Xmx128m -XX:+UseParallelGC -XX:ParallelGCThreads=4
 18285 rkadm     23  59    0  110M   53M sleep   42:39  0.01% orbd -ORBInitialPort 20000 -port 20100
18289 rkadm     21  59    0  105M   45M sleep   42:02  0.01% orbd -ORBInitialPort 21000 -port 21100 /opt/redknee/log/rkctl.log
   185 root       1  59    0 2472K 1108K sleep   46:38  0.01% /usr/lib/inet/in.mpathd -a
  1254 root      43  59    0   47M   22M sleep   12:32  0.00% /opt/IBM/ITM/sol606/ul/bin/kulagent
  1467 root      44  59    0   61M   32M sleep  157:12  0.00% /opt/IBM/ITM/sol606/ux/bin/kuxagent
   665 root       1  59    0 9084K 2596K sleep    1:41  0.00% /usr/lib/sendmail -bd -q15m
     7 root      14  59    0   12M   10M sleep    1:58  0.00% /lib/svc/bin/svc.startd
   502 root       1  59    0 1444K  748K sleep    0:18  0.00% /usr/lib/utmpd
   676 root       1 100  -20 2652K 1424K sleep    5:32  0.00% /usr/lib/inet/xntpd
   574 root       4  59    0 6412K 3068K sleep    4:19  0.00% /usr/lib/inet/inetd start
   643 root       1  59    0 2468K 1304K sleep    0:00  0.00% /usr/lib/snmp/snmpdx -y -c /etc/snmp/conf
  4933 root       2  59    0  590M  578M sleep  499:41  0.00% /usr/bin/dsmc schedule
  5231 root       7  59    0   44M   17M sleep    2:59  0.00% /opt/IBM/ITM/sol606/ux/bin/kcawd
18387 rkadm    999  59    0   29G   23G stop   877.2H  0.00% /usr/Java_1.6_45/bin/amd64/java -Djava.util.logging.manager=com.redknee.framewo


Thanks,
Amit Mishra





_______________________________________________

hotspot-gc-use mailing list

hotspot-gc-use at openjdk.java.net<mailto:hotspot-gc-use at openjdk.java.net>

http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20160908/f334a0d0/attachment.html>


More information about the hotspot-gc-use mailing list