ZGC Unable to reclaim memory for long time

Wed Nov 6 20:07:31 UTC 2019

Hi Per,
   Thanks. I will try changing the ZFragmentationLimit value to see if that helps.

Regards
Sundar
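
For reference, here is a rough sketch of the comparison discussed in the quoted messages below: the amount of garbage reported in the log against the fragmentation limit. It is a simplified, heap-wide model and not ZGC's actual selection logic. The class and method names are made up for illustration, and it assumes the percentages in the log are relative to the 8G heap capacity.

```java
// Simplified model only: compares heap-wide garbage to a fragmentation limit.
// This is NOT ZGC's real relocation logic; all names here are hypothetical.
class FragmentationCheck {

    // Garbage as a percentage of heap capacity (assumed basis for the
    // percentages printed in the gc,heap log lines).
    static double garbagePercent(long garbageMb, long heapCapacityMb) {
        return 100.0 * garbageMb / heapCapacityMb;
    }

    // True when garbage exceeds the tolerated fragmentation, i.e. when this
    // simplified model would expect the GC to start compacting.
    static boolean exceedsLimit(long garbageMb, long heapCapacityMb, double limitPercent) {
        return garbagePercent(garbageMb, heapCapacityMb) > limitPercent;
    }

    public static void main(String[] args) {
        long heapCapacityMb = 8 * 1024; // 8G heap, as stated in the thread
        long garbageMb = 1735;          // "Garbage" value reported for GC(112126)

        System.out.printf("garbage: %.1f%% of heap%n",
                garbagePercent(garbageMb, heapCapacityMb));
        System.out.println("exceeds 25% (default):     "
                + exceedsLimit(garbageMb, heapCapacityMb, 25.0));
        System.out.println("exceeds 15% (example only): "
                + exceedsLimit(garbageMb, heapCapacityMb, 15.0));
    }
}
```

Under this model the workload sits below the 25% default, which is consistent with almost nothing being reclaimed. A lower setting such as -XX:ZFragmentationLimit=15 (the value 15 is only an example, anything below 21 per Per's suggestion) would put it above the limit.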

On Wed, Nov 6, 2019 at 2:38 AM Per Liden <per.liden at oracle.com> wrote:

> Hi,
>
> On 11/5/19 2:27 AM, Sundara Mohan M wrote:
> > Hi Per,
> > This explains why memory was not being reclaimed. My heap was 8G, and 6G
> > of it was strongly reachable when I took a heap dump. Agreed, increasing
> > the heap size will help in this case.
> >
> > I am still trying to understand ZGC better:
> > 1. Shouldn't the GC become more aggressive and put more effort into
> > reclaiming memory without additional settings?
> > 2. Is there a reason why it shouldn't give more CPU to the GC threads to
> > reclaim garbage (say, after X GC cycles that could not reclaim memory)? In
> > this case it would be better to reclaim the existing garbage than to hit
> > an Allocation Stall and fail with a heap OutOfMemoryError.
>
> The tricky part is knowing/detecting when to be more aggressive, since
> it tends to become an exercise in trying to predict the future. Reacting
> when something bad happens (e.g. allocation stall) tends to be too late.
>
> However, before thinking too much about heuristics, we might just want
> to reconsider the ZFragmentationLimit default value, as it is perhaps a
> bit too generous today. Most apps I've looked at tend to stabilize
> somewhere between 2-10% fragmentation/waste (i.e. way below 25%), so
> lowering the default might not hurt most apps, but help some apps.
>
> cheers,
> Per
>
> >
> >
> > Thanks
> > Sundar
> >
> > On Mon, Nov 4, 2019 at 12:40 PM Per Liden <per.liden at oracle.com> wrote:
> >
> >     Hi,
> >
> >     When a workload produces a uniformly swiss-cheesy heap, i.e. where
> >     all parts of the heap have roughly the same amount of garbage, then
> >     the GC will face a situation where there are no free lunches and it
> >     will have to work hard (compact a lot) to reclaim memory. Therefore,
> >     the GC will tolerate a certain amount of fragmentation/waste, in the
> >     hope that more objects will die soon, making compaction less
> >     expensive (at the expense of using more memory for a while). How many
> >     CPU cycles to spend on compaction vs. how much memory you can spare
> >     is of course a trade-off.
> >
> >     You can use -XX:ZFragmentationLimit to control this. It currently
> >     defaults to 25% and your workload seems to stabilize at 21%. If you
> >     want more aggressive compaction/reclamation, then set the
> >     -XX:ZFragmentationLimit to something below 21. This may or may not
> >     be a good trade-off in your case. The alternative is to give the GC
> >     a larger heap to work with.
> >
> >     cheers,
> >     Per
> >
> >     On 11/4/19 7:56 PM, Sundara Mohan M wrote:
> >      > Hi,
> >      >     I ran into an issue where ZGC is unable to reclaim memory for
> >      > a few hours/days. It just keeps printing "Exception in thread "RMI TCP
> >      > Connection(idle)" java.lang.OutOfMemoryError: Java heap space", with
> >      > Allocation Stalls happening on that thread.
> >      >
> >      >
> >      > Here are the metrics, which show that even though there is garbage,
> >      > the GC is unable to reclaim it:
> >      >
> >      > ....
> >      > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap     ] GC(112126)      Live:         -              6366M (78%)        6366M (78%)        6366M (78%)             -                  -
> >      > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap     ] GC(112126)   Garbage:         -              1735M (21%)        1735M (21%)        1731M (21%)             -                  -
> >      > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap     ] GC(112126) Reclaimed:         -                  -                 0M (0%)            4M (0%)
> >      > ...
> >      >
> >      > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap     ] GC(135520)      Live:         -              6367M (78%)        6367M (78%)        6367M (78%)             -                  -
> >      > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap     ] GC(135520)   Garbage:         -              1730M (21%)        1730M (21%)        1724M (21%)             -                  -
> >      > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap     ] GC(135520) Reclaimed:         -                  -                 0M (0%)            6M (0%)
> >      >
> >      > It has been in this state for ~8 hours and it is still happening.
> >      > The log reports 21% garbage (about 1.7G), but the GC is not able to
> >      > reclaim it; each cycle reclaims only 4-6M.
> >      >
> >      > Any idea what might be the issue here?
> >      >
> >      >
> >      > TIA
> >      > Sundar
> >      >
> >
>