Troubleshoot memory leak without taking heap dump of Production application
Poonam Bajaj Parhar
poonam.bajaj at oracle.com
Thu Nov 10 18:15:57 UTC 2016
Hello Amit,
The fact that the Full GCs are not able to reclaim space indicates that
some strong root is holding on to the growing set of objects in the
Java heap.
Issue time: heap usage around 28G

 num     #instances         #bytes  class name
----------------------------------------------
   1:     118037170     6600874168  [C
   2:     103071116     5771982496  java.util.HashMap$Entry
   3:     101560457     5687385592  com.redknee.product.s5600.ipc.xgen.AcctSessionInfo
   4:     118042761     4721710440  java.lang.String
   5:       9942863     3020272632  [Ljava.lang.Object;
   6:       7537560     2737186632  [Ljava.util.HashMap$Entry;
   7:       1453865      639700600  com.redknee.product.s5600.ipc.xgen.PdpContextID
   8:       7537148      542674656  java.util.HashMap

I would focus my attention on
'com.redknee.product.s5600.ipc.xgen.AcctSessionInfo' instances and try
to determine what is holding them and preventing them from getting
collected by the Full GCs.
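
To make the comparison between the non-issue and issue-time histograms
less error-prone, the two snapshots can be diffed by class name. The
following is only a minimal sketch: `HistoDiff`, `parseBytes` and
`growth` are hypothetical names, not part of any JDK tool, and the
parser assumes each histogram row sits on a single line in the
`num: #instances #bytes class name` layout shown above.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: ranks classes by byte growth between two histograms.
class HistoDiff {
    // Matches rows such as "   1:  118037170  6600874168  [C"
    private static final Pattern ROW =
            Pattern.compile("^\\s*\\d+:\\s+(\\d+)\\s+(\\d+)\\s+(\\S+).*$");

    /** Parses histogram text into class name -> bytes; non-row lines are skipped. */
    static Map<String, Long> parseBytes(String histogram) {
        Map<String, Long> bytesByClass = new HashMap<>();
        for (String line : histogram.split("\\R")) {
            Matcher m = ROW.matcher(line);
            if (m.matches()) {
                bytesByClass.put(m.group(3), Long.parseLong(m.group(2)));
            }
        }
        return bytesByClass;
    }

    /** Byte growth per class present in 'after', largest growth first. */
    static List<Map.Entry<String, Long>> growth(Map<String, Long> before,
                                                Map<String, Long> after) {
        List<Map.Entry<String, Long>> deltas = new ArrayList<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            // A class missing from 'before' is treated as starting at 0 bytes.
            long delta = e.getValue() - before.getOrDefault(e.getKey(), 0L);
            deltas.add(new AbstractMap.SimpleEntry<String, Long>(e.getKey(), delta));
        }
        deltas.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return deltas;
    }

    public static void main(String[] args) {
        // A few rows copied from the two histograms in this thread.
        String nonIssue =
                "   4:  17247420  903565648  [C\n"
              + "   6:  17208464  688338560  java.lang.String\n"
              + "   7:  7843562  439239472  java.util.HashMap$Entry\n";
        String issue =
                "   1:  118037170  6600874168  [C\n"
              + "   2:  103071116  5771982496  java.util.HashMap$Entry\n"
              + "   3:  101560457  5687385592  com.redknee.product.s5600.ipc.xgen.AcctSessionInfo\n"
              + "   4:  118042761  4721710440  java.lang.String\n";
        for (Map.Entry<String, Long> e : growth(parseBytes(nonIssue), parseBytes(issue))) {
            System.out.println(e.getValue() + " bytes  " + e.getKey());
        }
    }
}
```

Classes that appear only in the earlier snapshot are ignored by this
sketch, since the interest here is in what grew.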
A heap dump is the best way to figure that out, if you can collect one
from your production system when the issue starts occurring. If that is
not possible, could you run a JVMTI agent to collect the reference path
information for these objects? A long time back, I wrote a JVMTI agent
that, given a class name, prints the reference path information for the
instances of that class:
https://blogs.oracle.com/poonam/entry/jvmti_agent_to_print_reference
And if you have access to the code where instances of AcctSessionInfo
are created and stored in a HashMap, I would suggest looking at the
source code around that too, to see whether anything is obviously wrong
with how these instances are stored.
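
As an illustration of the kind of storage problem worth looking for, the
sketch below shows a long-lived HashMap that gains an entry when a
session starts and loses it only if a matching stop event arrives. All
names here (`SessionRegistry`, `SessionInfo`, `onSessionStart`,
`onSessionStop`) are hypothetical stand-ins; the actual AcctSessionInfo
code is not shown in this thread and may look quite different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry illustrating a map that grows when removals are missed.
class SessionRegistry {
    /** Hypothetical stand-in for a per-session object such as AcctSessionInfo. */
    static final class SessionInfo {
        final String sessionId;
        SessionInfo(String sessionId) { this.sessionId = sessionId; }
    }

    // While the registry itself is reachable, every value in this map is
    // strongly reachable too, so no Full GC can reclaim it.
    private final Map<String, SessionInfo> sessions = new HashMap<>();

    void onSessionStart(String id) {
        sessions.put(id, new SessionInfo(id));
    }

    void onSessionStop(String id) {
        sessions.remove(id);
    }

    /** Cheap to log periodically; steady growth here points at missed removals. */
    int liveSessions() {
        return sessions.size();
    }

    public static void main(String[] args) {
        SessionRegistry registry = new SessionRegistry();
        for (int i = 0; i < 1000; i++) {
            registry.onSessionStart("s" + i);
            if (i % 2 == 0) {
                registry.onSessionStop("s" + i); // odd ids never receive a stop event
            }
        }
        System.out.println("entries still held: " + registry.liveSessions());
    }
}
```

If the real code has a comparable map, logging its size at intervals is
a low-overhead way to check whether it tracks the growth seen in the
histograms.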
Thanks,
Poonam
On 11/9/2016 11:38 PM, Amit Mishra wrote:
>
> Hello Charlie/Poonam/team,
>
> I need your help/suggestions on how to troubleshoot a memory leak
> without taking a heap dump.
>
> We are seeing random promotion failures followed by continuous
> concurrent mode failures/Full GC events that impact our standalone
> application for a long time, until it is restarted.
>
> Application GC remains stable for more than a week with a smooth
> sawtooth pattern, and then suddenly, within 1 hour or so, something
> happens that results in severe GC failure and ultimately application
> failure.
>
> We have checked the traffic pattern, the application logs and the logs
> of dependent applications, but there is no indication of why heap usage
> suddenly kept increasing from one point in time, which results in CMS
> failures. (The traffic pattern is fairly stable and there are no
> scheduled or cron jobs at the time of the issue.)
>
> We cannot take a heap dump, as this is a standalone application with a
> large heap (32G).
>
> We have collected histograms at issue time and at non-issue time and
> found that the instances of 2-3 classes suddenly increased from
> 200-300 MB to 5G+, but we are not sure how to dig into the code to
> find out what causes the instances of those classes to surge.
>
> Please guide me on how to troubleshoot this issue with a lightweight
> tool that would pinpoint exactly the methods or calls that can lead to
> this memory leak, as we can’t take a heap dump, which is a very
> high-impact operation.
>
> One more question: why is Full GC unable to clean the generations even
> after multiple attempts, producing a continuous loop of GC failures
> that is resolved only by an application restart? Does it indicate that
> no new objects were being created and that it was only the GC
> algorithm that started failing and increased heap usage?
>
> Many thanks in advance for your kind support and guidance.
>
> This is the GC graph, and the GC file is attached.
>
> [inline image: GC graph]
>
> Histogram snapshots:
>
> java.util.HashMap$Entry was only about 400 MB before the issue and
> 5.5G during it; the same is true for the AcctSessionInfo and
> java.lang.String instances.
>
> Non-issue time:
>
>  num     #instances         #bytes  class name
> ----------------------------------------------
>    1:      13613915     2219936904  [Ljava.lang.Object;
>    2:      10065566     1569906056  [Ljava.util.HashMap$Entry;
>    3:       2671564     1175488160  com.redknee.product.s5600.ipc.xgen.PdpContextID
>    4:      17247420      903565648  [C
>    5:      10055084      723966048  java.util.HashMap
>    6:      17208464      688338560  java.lang.String
>    7:       7843562      439239472  java.util.HashMap$Entry
>    8:      10065566      402622640  java.util.HashMap$FrontCache
>
> Issue time: heap usage around 28G
>
>  num     #instances         #bytes  class name
> ----------------------------------------------
>    1:     118037170     6600874168  [C
>    2:     103071116     5771982496  java.util.HashMap$Entry
>    3:     101560457     5687385592  com.redknee.product.s5600.ipc.xgen.AcctSessionInfo
>    4:     118042761     4721710440  java.lang.String
>    5:       9942863     3020272632  [Ljava.lang.Object;
>    6:       7537560     2737186632  [Ljava.util.HashMap$Entry;
>    7:       1453865      639700600  com.redknee.product.s5600.ipc.xgen.PdpContextID
>    8:       7537148      542674656  java.util.HashMap
>
> Thanks,
>
> Amit Mishra
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/jpeg
Size: 34533 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20161110/6b6879a4/attachment-0001.jpe>