ZGC Unable to reclaim memory for long time
Per Liden
per.liden at oracle.com
Wed Nov 6 10:44:38 UTC 2019
On 11/5/19 4:48 PM, Peter Booth wrote:
> Reading this and similar threads I am struck by the fact that ZGC users are experiencing things that users of Azul’s Zing JVM also go through. I remember the amazement at seeing a JVM run without substantive GC pauses and thinking that it was a free lunch. But the price came in two parts - ensuring an adequate heap, and rewiring brains that are accustomed to seeing CPU and memory as independent resources. The second turns out to be much harder.
>
> From experience, I think a lot of pain can be avoided by clearly communicating that an adequate heap is a prerequisite for a healthy JVM. Most Java developers have absorbed the notion that large heaps are bad/risky, and unlearning that takes time.
The documentation on the ZGC wiki [1] tries to be clear about this, but
I'm sure it could be improved.
[1] https://wiki.openjdk.java.net/display/zgc/Main
cheers,
Per
>
>> On Nov 4, 2019, at 8:28 PM, Sundara Mohan M <m.sundar85 at gmail.com> wrote:
>>
>> Hi Per,
>> This explains why it didn't reclaim memory. My heap was 8G and 6G of it
>> was strongly reachable (when I took a heap dump). Agreed, increasing the
>> heap size will help in this case.
>>
>> I'm still trying to understand ZGC better:
>> 1. Shouldn't the GC be more aggressive and put more effort into
>> reclaiming memory without additional settings?
>> 2. Is there a reason why it shouldn't give more CPU to the GC threads to
>> reclaim garbage (say, after X GC cycles that could not reclaim memory)? In
>> this case it would be better to reclaim the existing garbage than to hit
>> an Allocation Stall and fail with an out-of-memory error.
>>
>>
>> Thanks
>> Sundar
>>
>>> On Mon, Nov 4, 2019 at 12:40 PM Per Liden <per.liden at oracle.com> wrote:
>>>
>>> Hi,
>>>
>>> When a workload produces a uniformly swiss-cheesy heap, i.e. where all
>>> parts of the heap have roughly the same amount of garbage, then the GC
>>> will face a situation where there are no free lunches and it will have
>>> to work hard (compact a lot) to reclaim memory. Therefore, the GC will
>>> tolerate a certain amount of fragmentation/waste, in the hope that more
>>> objects will die soon, making compaction less expensive (at the expense
>>> of using more memory for a while). How many CPU cycles to spend on
>>> compaction vs. how much memory you can spare is of course a trade-off.
>>>
>>> You can use -XX:ZFragmentationLimit to control this. It currently
>>> defaults to 25% and your workload seems to stabilize at 21%. If you want
>>> more aggressive compaction/reclamation, then set the
>>> -XX:ZFragmentationLimit to something below 21. This may or may not be a
>>> good trade-off in your case. The alternative is to give the GC a larger
>>> heap to work with.
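
As a rough illustration of the trade-off described above, the following Java sketch compares a garbage percentage against a limit, using the Live and Garbage figures from the quoted log. It is a deliberately simplified model and not ZGC's actual relocation-set selection; the class and method names are hypothetical.

```java
final class FragmentationSketch {
    /** Garbage as a percentage of live + garbage (toy model, not ZGC internals). */
    static double garbagePercent(long garbageMb, long liveMb) {
        return 100.0 * garbageMb / (garbageMb + liveMb);
    }

    /** True when the garbage share exceeds the limit, i.e. compaction looks worthwhile in this model. */
    static boolean exceedsLimit(long garbageMb, long liveMb, double limitPercent) {
        return garbagePercent(garbageMb, liveMb) > limitPercent;
    }

    public static void main(String[] args) {
        // Figures taken from the gc,heap lines in the original report.
        long liveMb = 6366;
        long garbageMb = 1735;
        System.out.printf("garbage: %.1f%%%n", garbagePercent(garbageMb, liveMb));
        System.out.println("limit 25: " + exceedsLimit(garbageMb, liveMb, 25.0));
        System.out.println("limit 20: " + exceedsLimit(garbageMb, liveMb, 20.0));
    }
}
```

In this model, the roughly 21% of garbage stays under a limit of 25 and is tolerated, while a limit of 20 would mark the same heap as worth compacting.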
>>>
>>> cheers,
>>> Per
>>>
>>>> On 11/4/19 7:56 PM, Sundara Mohan M wrote:
>>>> Hi,
>>>> I ran into an issue where ZGC has been unable to reclaim memory for
>>>> hours or days. It just keeps printing "Exception in thread "RMI TCP
>>>> Connection(idle)" java.lang.OutOfMemoryError: Java heap space", with
>>>> Allocation Stalls happening on that thread.
>>>>
>>>>
>>>> Here are the metrics, which show that even though there is garbage,
>>>> the GC is unable to reclaim it:
>>>>
>>>> ....
>>>> [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Live: - 6366M (78%) 6366M (78%) 6366M (78%) - -
>>>> [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Garbage: - 1735M (21%) 1735M (21%) 1731M (21%) - -
>>>> [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Reclaimed: - - 0M (0%) 4M (0%)
>>>> ...
>>>>
>>>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Live: - 6367M (78%) 6367M (78%) 6367M (78%) - -
>>>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Garbage: - 1730M (21%) 1730M (21%) 1724M (21%) - -
>>>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Reclaimed: - - 0M (0%) 6M (0%)
>>>>
>>>> It has been in this state for ~8 hours and it is still happening. The
>>>> log reports about 1.7G (21%) of garbage, but each cycle reclaims only
>>>> 4-6M of it.
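
To track these numbers over many cycles, the per-column megabyte values can be extracted from a gc,heap line with a small parser. This is a minimal sketch, assuming lines shaped like the ones in this report; the class name, method name, and regular expression are hypothetical and not part of any JDK tooling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GcHeapLineSketch {
    // Matches entries such as "1735M (21%)"; "-" placeholders are simply skipped.
    private static final Pattern ENTRY = Pattern.compile("(\\d+)M\\s*\\((\\d+)%\\)");

    /** Returns the megabyte values that follow "label:" on the line, in column order. */
    static List<Integer> megabytesAfter(String line, String label) {
        List<Integer> values = new ArrayList<>();
        int start = line.indexOf(label + ":");
        if (start < 0) {
            return values; // label not present on this line
        }
        Matcher m = ENTRY.matcher(line.substring(start));
        while (m.find()) {
            values.add(Integer.parseInt(m.group(1)));
        }
        return values;
    }

    public static void main(String[] args) {
        String garbage = "GC(112126) Garbage: - 1735M (21%) 1735M (21%) 1731M (21%) - -";
        String reclaimed = "GC(112126) Reclaimed: - - 0M (0%) 4M (0%)";
        System.out.println("Garbage (MB): " + megabytesAfter(garbage, "Garbage"));
        System.out.println("Reclaimed (MB): " + megabytesAfter(reclaimed, "Reclaimed"));
    }
}
```

Comparing the last Reclaimed value with the Garbage values across cycles makes it easy to spot a heap that is stuck below the fragmentation limit.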
>>>>
>>>> Any idea what might be the issue here?
>>>>
>>>>
>>>> TIA
>>>> Sundar
>>>>
>>>
>