Hi,

When a workload produces a uniformly swiss-cheesy heap, i.e. one where all parts of the heap have roughly the same amount of garbage, the GC faces a situation where there are no free lunches: it has to work hard (compact a lot) to reclaim memory. Therefore, the GC will tolerate a certain amount of fragmentation/waste, in the hope that more objects will die soon, making compaction less expensive (at the expense of using more memory for a while).

How many CPU cycles to spend on compaction vs. how much memory you can spare is of course a trade-off. You can use -XX:ZFragmentationLimit to control this. It currently defaults to 25% and your workload seems to stabilize at 21%. If you want more aggressive compaction/reclamation, set -XX:ZFragmentationLimit to something below 21. This may or may not be a good trade-off in your case. The alternative is to give the GC a larger heap to work with.

cheers,
Per

On 11/4/19 7:56 PM, Sundara Mohan M wrote:
Hi,

I ran into an issue where ZGC is unable to reclaim memory for a few hours/days. It just keeps printing "Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space", and Allocation Stalls keep happening on that thread.

Here are the metrics, which show that for some reason, even though there is garbage, ZGC is unable to reclaim it:
....
[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Live: - 6366M (78%) 6366M (78%) 6366M (78%) - -
[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Garbage: - 1735M (21%) 1735M (21%) 1731M (21%) - -
[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Reclaimed: - - 0M (0%) 4M (0%)
...
[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Live: - 6367M (78%) 6367M (78%) 6367M (78%) - -
[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Garbage: - 1730M (21%) 1730M (21%) 1724M (21%) - -
[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Reclaimed: - - 0M (0%) 6M (0%)
It has been in this state for ~8 hours and it is still happening. The log reports about 1.7G (21%) of garbage, but ZGC is not able to reclaim it; each cycle reclaims only 4-6M.

Any idea what might be the issue here?
TIA Sundar
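
For anyone trying the -XX:ZFragmentationLimit suggestion from the reply, a launch command might look roughly like the sketch below. The value 15, the heap size placeholder, and app.jar are illustrative assumptions, not values taken from this thread; the only guidance given is "something below 21". On JDK releases where ZGC is still experimental, -XX:+UnlockExperimentalVMOptions is also needed.

```shell
# Sketch only: 15 is an arbitrary example value below the observed 21% garbage level,
# <heap-size> and app.jar are placeholders for your own settings and application.
java -XX:+UnlockExperimentalVMOptions \
     -XX:+UseZGC \
     -XX:ZFragmentationLimit=15 \
     -Xmx<heap-size> \
     -jar app.jar
```

A lower limit should make ZGC compact more eagerly at the cost of more CPU time, so it is worth comparing the "Garbage" and "Reclaimed" lines in the GC log before and after the change.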