Troubleshoot memory leak without taking heap dump of Production application
Amit Mishra
amit.mishra at redknee.com
Thu Nov 10 07:38:55 UTC 2016
Hello Charlie/Poonam/team,
I need your help and suggestions on how to troubleshoot a memory leak without taking a heap dump.
We are facing random promotion failures followed by continuous concurrent mode failures and Full GC events, which impact our standalone application for a long time, until it is restarted.
Application GC remains stable for more than a week with a smooth sawtooth pattern, and then something happens within an hour or so that results in severe GC failures and ultimately application failure.
We have checked the traffic pattern, the application logs and the logs of dependent applications, but there is no indication of why heap usage suddenly starts increasing at one point in time, which results in the CMS failures. (The traffic pattern is fairly stable and there are no scheduled or cron jobs at the time of the issue.)
We cannot take a heap dump as this is a standalone application with a large heap (32G).
We have collected histograms during the issue and at a non-issue time and found that the instances of 2-3 classes suddenly increased from 200-300 MB to 5G+, but we are not sure how to dig into the code to find out what causes the instances of those classes to surge.
Please guide me on how to troubleshoot this issue with a lightweight tool that would pinpoint the methods or calls that lead to this memory leak, as we can't take a heap dump, which is a very heavy, high-impact tool.
One more question: why is Full GC not able to clean the generations even after multiple attempts, creating a continuous loop of GC failures that is resolved only by an application restart? Does this indicate that no new objects were being created and that it was only the GC algorithm that started failing and increased heap usage?
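On the last question, one number that may help separate the two possibilities is how much of each heap pool is still in use right after a collection. If that figure keeps climbing from one collection to the next, it would suggest the collector is running but finding the objects still reachable. The sketch below reads it in-process through the standard java.lang.management API. The class name PostGcOccupancy and the idea of logging this value periodically are illustrative assumptions, not something the application is known to do.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: reports heap pool occupancy as measured after the last collection. */
final class PostGcOccupancy {

    /** Returns heap pool name -> bytes still in use after that pool's most recent collection. */
    static Map<String, Long> usedAfterLastGc() {
        Map<String, Long> result = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            // getCollectionUsage() returns null when the pool does not support it.
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc != null) {
                result.put(pool.getName(), afterGc.getUsed());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Pool names depend on the collector in use, so they are printed as reported.
        for (Map.Entry<String, Long> e : usedAfterLastGc().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue() + " bytes used after last GC");
        }
    }
}
```

Logged at a fixed interval, this costs far less than a heap dump and gives a timestamp for when the retained set starts to grow, which can then be lined up against the application logs.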
Many thanks in advance for your kind support and guidance.
This is the GC graph, and the GC log file is attached.
[Image: GC graph (image002.jpg)]
Histogram snapshots:
java.util.HashMap$Entry was only about 400 MB before the issue and about 5.5G during the issue; the same is true for AcctSessionInfo and java.lang.String class instances.
Non-issue time:
 num     #instances         #bytes  class name
----------------------------------------------
   1:      13613915     2219936904  [Ljava.lang.Object;
   2:      10065566     1569906056  [Ljava.util.HashMap$Entry;
   3:       2671564     1175488160  com.redknee.product.s5600.ipc.xgen.PdpContextID
   4:      17247420      903565648  [C
   5:      10055084      723966048  java.util.HashMap
   6:      17208464      688338560  java.lang.String
   7:       7843562      439239472  java.util.HashMap$Entry
   8:      10065566      402622640  java.util.HashMap$FrontCache
Issue time (heap usage around 28G):
 num     #instances         #bytes  class name
----------------------------------------------
   1:     118037170     6600874168  [C
   2:     103071116     5771982496  java.util.HashMap$Entry
   3:     101560457     5687385592  com.redknee.product.s5600.ipc.xgen.AcctSessionInfo
   4:     118042761     4721710440  java.lang.String
   5:       9942863     3020272632  [Ljava.lang.Object;
   6:       7537560     2737186632  [Ljava.util.HashMap$Entry;
   7:       1453865      639700600  com.redknee.product.s5600.ipc.xgen.PdpContextID
   8:       7537148      542674656  java.util.HashMap
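The comparison between the two snapshots can be automated so that it is easy to repeat on every new pair of histograms. The following is a minimal sketch, assuming the input is text in the same "rank: instances bytes class name" layout as above. The class HistogramDiff and its methods are hypothetical names introduced for illustration. A class missing from the earlier snapshot is counted as 0 bytes there, which overstates its growth when only the top rows were captured, as is the case for AcctSessionInfo here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: ranks classes by byte growth between two class histograms. */
final class HistogramDiff {

    /** Parses rows of the form "<rank>: <instances> <bytes> <class name>" into class name -> bytes. */
    static Map<String, Long> parse(String histogram) {
        Map<String, Long> bytesByClass = new HashMap<>();
        for (String line : histogram.split("\\R")) {
            String[] cols = line.trim().split("\\s+");
            // Skip the header, the separator line and anything else that is not a data row.
            if (cols.length < 4 || !cols[0].matches("\\d+:")) {
                continue;
            }
            bytesByClass.put(cols[3], Long.parseLong(cols[2]));
        }
        return bytesByClass;
    }

    /** Returns class name -> (after - before) in bytes, largest growth first. */
    static List<Map.Entry<String, Long>> growth(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> delta = new HashMap<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            Long previous = before.get(e.getKey());
            // Assumption: a class absent from the earlier snapshot is treated as 0 bytes.
            delta.put(e.getKey(), e.getValue() - (previous == null ? 0L : previous));
        }
        List<Map.Entry<String, Long>> ranked = new ArrayList<>(delta.entrySet());
        ranked.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return ranked;
    }

    public static void main(String[] args) {
        // A few rows copied from the two snapshots above.
        String before =
              "4: 17247420 903565648 [C\n"
            + "6: 17208464 688338560 java.lang.String\n"
            + "7: 7843562 439239472 java.util.HashMap$Entry\n";
        String after =
              "1: 118037170 6600874168 [C\n"
            + "2: 103071116 5771982496 java.util.HashMap$Entry\n"
            + "3: 101560457 5687385592 com.redknee.product.s5600.ipc.xgen.AcctSessionInfo\n"
            + "4: 118042761 4721710440 java.lang.String\n";
        for (Map.Entry<String, Long> e : growth(parse(before), parse(after))) {
            System.out.println(e.getValue() + " bytes  " + e.getKey());
        }
    }
}
```

Taking histograms at a regular interval and diffing consecutive pairs narrows down when the surge begins, but it does not by itself show which code path holds the references.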
Thanks,
Amit Mishra
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.jpg
Type: image/jpeg
Size: 34533 bytes
Desc: image002.jpg
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20161110/ff77aa86/image002-0001.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gcstats.log.rar
Type: application/octet-stream
Size: 559660 bytes
Desc: gcstats.log.rar
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20161110/ff77aa86/gcstats.log-0001.rar>