From m.sundar85 at gmail.com Mon Nov 4 18:56:05 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Mon, 4 Nov 2019 10:56:05 -0800 Subject: ZGC Unable to reclaim memory for long time Message-ID: Hi, I ran into an issue where ZGC is unable to reclaim memory for a few hours/days. It just keeps printing "Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and an Allocation Stall happens on that thread. Here are the metrics, which show that even though there is Garbage it is unable to Reclaim it:
....
[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Live: - 6366M (78%) 6366M (78%) 6366M (78%) - -
[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Garbage: - 1735M (21%) 1735M (21%) 1731M (21%) - -
[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Reclaimed: - - 0M (0%) 4M (0%)
...
[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Live: - 6367M (78%) 6367M (78%) 6367M (78%) - -
[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Garbage: - 1730M (21%) 1730M (21%) 1724M (21%) - -
[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Reclaimed: - - 0M (0%) 6M (0%)
It was in this state for ~8 hours and it is still happening. It says it has ~1.7G (21%) of Garbage but it is not able to Reclaim it; every time it reclaims only 4-6M. Any idea what might be the issue here. TIA Sundar From per.liden at oracle.com Mon Nov 4 20:40:49 2019 From: per.liden at oracle.com (Per Liden) Date: Mon, 4 Nov 2019 21:40:49 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: Message-ID: Hi, When a workload produces a uniformly swiss-cheesy heap, i.e. where all parts of the heap have roughly the same amount of garbage, then the GC will face a situation where there are no free lunches and it will have to work hard (compact a lot) to reclaim memory.
Therefore, the GC will tolerate a certain amount of fragmentation/waste, in the hope that more object will die soon, making compaction less expensive (at the expense of using more memory for a while). How many CPU cycles to spend on compaction vs. how much memory you can spare is of course a trade-off. You can use -XX:ZFragmentationLimit to control this. It currently defaults to 25% and your workload seems to stabilize at 21%. If you want more aggressive compaction/reclamation, then set the -XX:ZFragmentationLimit to something below 21. This may or may not be a good trade-off in your case. The alternative is to give the GC a larger heap to work with. cheers, Per On 11/4/19 7:56 PM, Sundara Mohan M wrote: > Hi, > I ran into this issue where ZGC is unable to reclaim memory for few > hours/days. It just keep printing "Exception in thread "RMI TCP > Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and > Allocation Stall happening on that thread. > > > Here is the metrics which shows for some reason even though there is > Garbage but it is unable to Reclaim > > .... > [2019-11-04T*08:39:53.986+0000*][1765465.981s][info][gc,heap ] > GC(112126) Live: - 6366M (78%) 6366M (78%) > 6366M (78%) > - - > *[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > GC(112126) Garbage: - 1735M (21%) 1735M (21%) > 1731M (21%)* > - - > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) > Reclaimed: - - 0M (0%) > 4M (0%) > ... > > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) > Live: - 6367M (78%) 6367M (78%) > 6367M (78%) > - - > *[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > GC(135520) Garbage: - 1730M (21%) 1730M (21%) > 1724M (21%)* > - - > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) > Reclaimed: - - 0M (0%) > 6M (0%) > > Here it was in this state for ~8hours and it is still happening. 
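Per's explanation can be made concrete with a sketch. The model below is a simplification for illustration only (hypothetical page count, page size, and numbers; not ZGC's actual code): memory is reclaimed by evacuating pages, and a page is selected only if its garbage fraction exceeds the fragmentation limit, so a heap in which every page is ~21% garbage yields no candidates under the default 25% limit.

```java
// Illustration only: a simplified model of how a region-based collector
// like ZGC chooses relocation (evacuation) candidates. The page count,
// page size, and numbers below are hypothetical; the real logic lives
// inside the JVM and is tuned via -XX:ZFragmentationLimit.
public class FragmentationSketch {

    // A page is worth evacuating only if its garbage fraction exceeds
    // the fragmentation limit (ZGC's default limit is 25%).
    static boolean isCandidate(double garbageFraction, double fragmentationLimit) {
        return garbageFraction > fragmentationLimit;
    }

    // Garbage sitting in candidate pages is roughly what a relocation
    // phase can reclaim; garbage in non-candidate pages is tolerated.
    static long reclaimableMB(double[] garbagePerPage, int pageSizeMB, double limit) {
        double total = 0;
        for (double g : garbagePerPage) {
            if (isCandidate(g, limit)) {
                total += g * pageSizeMB;
            }
        }
        return Math.round(total);
    }

    public static void main(String[] args) {
        // A uniformly "swiss-cheesy" 8G heap: 4096 pages of 2M, each ~21% garbage.
        double[] uniform = new double[4096];
        java.util.Arrays.fill(uniform, 0.21);

        // Default 25% limit: no page qualifies, so nothing is reclaimed
        // even though ~21% of the whole heap is garbage.
        System.out.println(reclaimableMB(uniform, 2, 0.25)); // prints 0

        // A limit below 21% (e.g. -XX:ZFragmentationLimit=15) makes every
        // page a candidate, trading CPU (compaction) for reclaimed memory.
        System.out.println(reclaimableMB(uniform, 2, 0.15)); // roughly 1720
    }
}
```

This is only a mental model: real ZGC derives per-page liveness from marking and weighs relocation cost, but it shows why a heap that is everywhere ~21% garbage reclaims almost nothing under a 25% limit.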
It says > has a Garbage of 21G but it is not able to Reclaim it everytime it reclaims > only 4-6M. > > Any idea what might be the issue here. > > > TIA > Sundar > From m.sundar85 at gmail.com Tue Nov 5 01:27:46 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Mon, 4 Nov 2019 17:27:46 -0800 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: Message-ID: HI Per, This explains why it didn't work to reclaim memory, also my heap memory was 8G and 6G was strongly reachable (when i took heap dump). Agreed increasing heap memory will help in this case. Still trying to understand better on ZGC, 1. So shouldn't GC try to be more aggressive and try to put more effort to reclaim without additional settings? 2. Is there a reason why it shouldn't give more CPU to GC threads and reclaim garbage (say after X run of GC it could not reclaim memory)? In this case it would be good to reclaim existing garbage instead of doing Allocation Stall and failing with heap out of memory. Thanks Sundar On Mon, Nov 4, 2019 at 12:40 PM Per Liden wrote: > Hi, > > When a workload produces a uniformly swiss-cheesy heap, i.e. where all > parts of the heap have roughly the same amount of garbage, then the GC > will face a situation where there are no free lunches and it will have > to work hard (compact a lot) to reclaim memory. Therefore, the GC will > tolerate a certain amount of fragmentation/waste, in the hope that more > object will die soon, making compaction less expensive (at the expense > of using more memory for a while). How many CPU cycles to spend on > compaction vs. how much memory you can spare is of course a trade-off. > > You can use -XX:ZFragmentationLimit to control this. It currently > defaults to 25% and your workload seems to stabilize at 21%. If you want > more aggressive compaction/reclamation, then set the > -XX:ZFragmentationLimit to something below 21. This may or may not be a > good trade-off in your case. 
The alternative is to give the GC a larger > heap to work with. > > cheers, > Per > > On 11/4/19 7:56 PM, Sundara Mohan M wrote: > > Hi, > > I ran into this issue where ZGC is unable to reclaim memory for few > > hours/days. It just keep printing "Exception in thread "RMI TCP > > Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and > > Allocation Stall happening on that thread. > > > > > > Here is the metrics which shows for some reason even though there is > > Garbage but it is unable to Reclaim > > > > .... > > [2019-11-04T*08:39:53.986+0000*][1765465.981s][info][gc,heap ] > > GC(112126) Live: - 6366M (78%) 6366M > (78%) > > 6366M (78%) > > - - > > *[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > > GC(112126) Garbage: - 1735M (21%) 1735M > (21%) > > 1731M (21%)* > > - - > > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > GC(112126) > > Reclaimed: - - 0M (0%) > > 4M (0%) > > ... > > > > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > GC(135520) > > Live: - 6367M (78%) 6367M (78%) > > 6367M (78%) > > - - > > *[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > > GC(135520) Garbage: - 1730M (21%) 1730M > (21%) > > 1724M (21%)* > > - - > > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > GC(135520) > > Reclaimed: - - 0M (0%) > > 6M (0%) > > > > Here it was in this state for ~8hours and it is still happening. It says > > has a Garbage of 21G but it is not able to Reclaim it everytime it > reclaims > > only 4-6M. > > > > Any idea what might be the issue here. > > > > > > TIA > > Sundar > > > From peter_booth at me.com Tue Nov 5 15:48:43 2019 From: peter_booth at me.com (Peter Booth) Date: Tue, 5 Nov 2019 10:48:43 -0500 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: Message-ID: Reading this and similar threads I am struck by the fact that ZGC users are experiencing things that users of Azul?s Zing JVM also go through. 
I remember the amazement at seeing a JVM run without substantive GC pauses and thinking that it was a free lunch. But the price was two parts - ensuring adequate heap, and rewiring brains that are accustomed to seeing cpu and memory as independent resources. The second turns out to be much harder. From experience, I think a lot of pain can be avoided by clearly communicating that an adequate heap is a prerequisite for a healthy JVM. Most java developers have absorbed the notion that large heaps are bad/risky and unlearning takes time. Sent from my iPhone > On Nov 4, 2019, at 8:28 PM, Sundara Mohan M wrote: > > ?HI Per, > This explains why it didn't work to reclaim memory, also my heap memory was > 8G and 6G was strongly reachable (when i took heap dump). Agreed increasing > heap memory will help in this case. > > Still trying to understand better on ZGC, > 1. So shouldn't GC try to be more aggressive and try to put more effort to > reclaim without additional settings? > 2. Is there a reason why it shouldn't give more CPU to GC threads and > reclaim garbage (say after X run of GC it could not reclaim memory)? In > this case it would be good to reclaim existing garbage instead of doing > Allocation Stall and failing with heap out of memory. > > > Thanks > Sundar > >> On Mon, Nov 4, 2019 at 12:40 PM Per Liden wrote: >> >> Hi, >> >> When a workload produces a uniformly swiss-cheesy heap, i.e. where all >> parts of the heap have roughly the same amount of garbage, then the GC >> will face a situation where there are no free lunches and it will have >> to work hard (compact a lot) to reclaim memory. Therefore, the GC will >> tolerate a certain amount of fragmentation/waste, in the hope that more >> object will die soon, making compaction less expensive (at the expense >> of using more memory for a while). How many CPU cycles to spend on >> compaction vs. how much memory you can spare is of course a trade-off. >> >> You can use -XX:ZFragmentationLimit to control this. 
It currently >> defaults to 25% and your workload seems to stabilize at 21%. If you want >> more aggressive compaction/reclamation, then set the >> -XX:ZFragmentationLimit to something below 21. This may or may not be a >> good trade-off in your case. The alternative is to give the GC a larger >> heap to work with. >> >> cheers, >> Per >> >>> On 11/4/19 7:56 PM, Sundara Mohan M wrote: >>> Hi, >>> I ran into this issue where ZGC is unable to reclaim memory for few >>> hours/days. It just keep printing "Exception in thread "RMI TCP >>> Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and >>> Allocation Stall happening on that thread. >>> >>> >>> Here is the metrics which shows for some reason even though there is >>> Garbage but it is unable to Reclaim >>> >>> .... >>> [2019-11-04T*08:39:53.986+0000*][1765465.981s][info][gc,heap ] >>> GC(112126) Live: - 6366M (78%) 6366M >> (78%) >>> 6366M (78%) >>> - - >>> *[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] >>> GC(112126) Garbage: - 1735M (21%) 1735M >> (21%) >>> 1731M (21%)* >>> - - >>> [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] >> GC(112126) >>> Reclaimed: - - 0M (0%) >>> 4M (0%) >>> ... >>> >>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] >> GC(135520) >>> Live: - 6367M (78%) 6367M (78%) >>> 6367M (78%) >>> - - >>> *[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] >>> GC(135520) Garbage: - 1730M (21%) 1730M >> (21%) >>> 1724M (21%)* >>> - - >>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] >> GC(135520) >>> Reclaimed: - - 0M (0%) >>> 6M (0%) >>> >>> Here it was in this state for ~8hours and it is still happening. It says >>> has a Garbage of 21G but it is not able to Reclaim it everytime it >> reclaims >>> only 4-6M. >>> >>> Any idea what might be the issue here. 
>>> >>> >>> TIA >>> Sundar >>> >> From per.liden at oracle.com Wed Nov 6 10:37:32 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 6 Nov 2019 11:37:32 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: Message-ID: Hi, On 11/5/19 2:27 AM, Sundara Mohan M wrote: > HI Per, > This explains why it didn't work to reclaim memory, also my heap memory > was 8G and 6G was strongly reachable (when i took heap dump). Agreed > increasing heap memory will help in this case. > > Still trying to understand better on ZGC, > 1. So shouldn't GC try to be more aggressive and try to put more effort > to reclaim without additional settings? > 2. Is there a reason why it shouldn't give more CPU to GC threads and > reclaim?garbage (say after X run of GC it could not reclaim memory)? In > this case it would be good to reclaim existing garbage instead of doing > Allocation Stall and failing with heap out of memory. The tricky part is knowing/detecting when to be more aggressive, since it tends to become an exercise in trying to predict the future. Reacting when something bad happens (e.g. allocation stall) tends to be too late. However, before thinking too much about heuristics, we might just want to reconsider the ZFragmentationLimit default value, as it is perhaps a bit too generous today. Most apps I've looked at tend to stabilize somewhere between 2-10% fragmentation/waste (i.e. way below 25%), so lowering the default might not hurt most apps, but help some apps. cheers, Per > > > Thanks > Sundar > > On Mon, Nov 4, 2019 at 12:40 PM Per Liden > wrote: > > Hi, > > When a workload produces a uniformly swiss-cheesy heap, i.e. where all > parts of the heap have roughly the same amount of garbage, then the GC > will face a situation where there are no free lunches and it will have > to work hard (compact a lot) to reclaim memory. 
Therefore, the GC will > tolerate a certain amount of fragmentation/waste, in the hope that more > object will die soon, making compaction less expensive (at the expense > of using more memory for a while). How many CPU cycles to spend on > compaction vs. how much memory you can spare is of course a trade-off. > > You can use -XX:ZFragmentationLimit to control this. It currently > defaults to 25% and your workload seems to stabilize at 21%. If you want > more aggressive compaction/reclamation, then set the > -XX:ZFragmentationLimit to something below 21. This may or may not be a > good trade-off in your case. The alternative is to give the GC a larger > heap to work with. > > cheers, > Per > > On 11/4/19 7:56 PM, Sundara Mohan M wrote: > > Hi, > > I ran into this issue where ZGC is unable to reclaim memory for few > > hours/days. It just keep printing "Exception in thread "RMI TCP > > Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and > > Allocation Stall happening on that thread. > > > > Here is the metrics which shows for some reason even though there is > > Garbage but it is unable to Reclaim > > > > ....
> > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Live: - 6366M (78%) 6366M (78%) 6366M (78%) - -
> > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Garbage: - 1735M (21%) 1735M (21%) 1731M (21%) - -
> > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Reclaimed: - - 0M (0%) 4M (0%)
> > ...
> > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Live: - 6367M (78%) 6367M (78%) 6367M (78%) - -
> > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Garbage: - 1730M (21%) 1730M (21%) 1724M (21%) - -
> > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Reclaimed: - - 0M (0%) 6M (0%)
> > > > Here it was in this state for ~8hours and it is still happening. It says > > has a Garbage of 21G but it is not able to Reclaim it everytime it reclaims > > only 4-6M. > > > > Any idea what might be the issue here. > > > > TIA > > Sundar From per.liden at oracle.com Wed Nov 6 10:44:38 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 6 Nov 2019 11:44:38 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: Message-ID: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> On 11/5/19 4:48 PM, Peter Booth wrote: > Reading this and similar threads I am struck by the fact that ZGC users are experiencing things that users of Azul's Zing JVM also go through. I remember the amazement at seeing a JVM run without substantive GC pauses and thinking that it was a free lunch. But the price was two parts - ensuring adequate heap, and rewiring brains that are accustomed to seeing cpu and memory as independent resources. The second turns out to be much harder. > > From experience, I think a lot of pain can be avoided by clearly communicating that an adequate heap is a prerequisite for a healthy JVM. Most java developers have absorbed the notion that large heaps are bad/risky and unlearning takes time. The documentation on the ZGC wiki [1] tries to be clear about this, but I'm sure it could be improved. [1] https://wiki.openjdk.java.net/display/zgc/Main cheers, Per > > Sent from my iPhone > >> On Nov 4, 2019, at 8:28 PM, Sundara Mohan M wrote: >> >> HI Per, >> This explains why it didn't work to reclaim memory, also my heap memory was >> 8G and 6G was strongly reachable (when i took heap dump).
Agreed increasing >> heap memory will help in this case. >> >> Still trying to understand better on ZGC, >> 1. So shouldn't GC try to be more aggressive and try to put more effort to >> reclaim without additional settings? >> 2. Is there a reason why it shouldn't give more CPU to GC threads and >> reclaim garbage (say after X run of GC it could not reclaim memory)? In >> this case it would be good to reclaim existing garbage instead of doing >> Allocation Stall and failing with heap out of memory. >> >> >> Thanks >> Sundar >> >>> On Mon, Nov 4, 2019 at 12:40 PM Per Liden wrote: >>> >>> Hi, >>> >>> When a workload produces a uniformly swiss-cheesy heap, i.e. where all >>> parts of the heap have roughly the same amount of garbage, then the GC >>> will face a situation where there are no free lunches and it will have >>> to work hard (compact a lot) to reclaim memory. Therefore, the GC will >>> tolerate a certain amount of fragmentation/waste, in the hope that more >>> object will die soon, making compaction less expensive (at the expense >>> of using more memory for a while). How many CPU cycles to spend on >>> compaction vs. how much memory you can spare is of course a trade-off. >>> >>> You can use -XX:ZFragmentationLimit to control this. It currently >>> defaults to 25% and your workload seems to stabilize at 21%. If you want >>> more aggressive compaction/reclamation, then set the >>> -XX:ZFragmentationLimit to something below 21. This may or may not be a >>> good trade-off in your case. The alternative is to give the GC a larger >>> heap to work with. >>> >>> cheers, >>> Per >>> >>>> On 11/4/19 7:56 PM, Sundara Mohan M wrote: >>>> Hi, >>>> I ran into this issue where ZGC is unable to reclaim memory for few >>>> hours/days. It just keep printing "Exception in thread "RMI TCP >>>> Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and >>>> Allocation Stall happening on that thread. 
>>>> >>>> >>>> Here is the metrics which shows for some reason even though there is >>>> Garbage but it is unable to Reclaim >>>> >>>> .... >>>> [2019-11-04T*08:39:53.986+0000*][1765465.981s][info][gc,heap ] >>>> GC(112126) Live: - 6366M (78%) 6366M >>> (78%) >>>> 6366M (78%) >>>> - - >>>> *[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] >>>> GC(112126) Garbage: - 1735M (21%) 1735M >>> (21%) >>>> 1731M (21%)* >>>> - - >>>> [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] >>> GC(112126) >>>> Reclaimed: - - 0M (0%) >>>> 4M (0%) >>>> ... >>>> >>>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] >>> GC(135520) >>>> Live: - 6367M (78%) 6367M (78%) >>>> 6367M (78%) >>>> - - >>>> *[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] >>>> GC(135520) Garbage: - 1730M (21%) 1730M >>> (21%) >>>> 1724M (21%)* >>>> - - >>>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] >>> GC(135520) >>>> Reclaimed: - - 0M (0%) >>>> 6M (0%) >>>> >>>> Here it was in this state for ~8hours and it is still happening. It says >>>> has a Garbage of 21G but it is not able to Reclaim it everytime it >>> reclaims >>>> only 4-6M. >>>> >>>> Any idea what might be the issue here. >>>> >>>> >>>> TIA >>>> Sundar >>>> >>> > From m.sundar85 at gmail.com Wed Nov 6 20:07:31 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Wed, 6 Nov 2019 12:07:31 -0800 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: Message-ID: HI Per, Thanks. Will try changing ZFragmentationLimit value to see if it works. Regards Sundar On Wed, Nov 6, 2019 at 2:38 AM Per Liden wrote: > Hi, > > On 11/5/19 2:27 AM, Sundara Mohan M wrote: > > HI Per, > > This explains why it didn't work to reclaim memory, also my heap memory > > was 8G and 6G was strongly reachable (when i took heap dump). Agreed > > increasing heap memory will help in this case. > > > > Still trying to understand better on ZGC, > > 1. 
So shouldn't GC try to be more aggressive and try to put more effort > > to reclaim without additional settings? > > 2. Is there a reason why it shouldn't give more CPU to GC threads and > > reclaim garbage (say after X run of GC it could not reclaim memory)? In > > this case it would be good to reclaim existing garbage instead of doing > > Allocation Stall and failing with heap out of memory. > > The tricky part is knowing/detecting when to be more aggressive, since > it tends to become an exercise in trying to predict the future. Reacting > when something bad happens (e.g. allocation stall) tends to be too late. > > However, before thinking too much about heuristics, we might just want > to reconsider the ZFragmentationLimit default value, as it is perhaps a > bit too generous today. Most apps I've looked at tend to stabilize > somewhere between 2-10% fragmentation/waste (i.e. way below 25%), so > lowering the default might not hurt most apps, but help some apps. > > cheers, > Per > > > > > > > Thanks > > Sundar > > > > On Mon, Nov 4, 2019 at 12:40 PM Per Liden > > wrote: > > > > Hi, > > > > When a workload produces a uniformly swiss-cheesy heap, i.e. where > all > > parts of the heap have roughly the same amount of garbage, then the > GC > > will face a situation where there are no free lunches and it will > have > > to work hard (compact a lot) to reclaim memory. Therefore, the GC > will > > tolerate a certain amount of fragmentation/waste, in the hope that > more > > object will die soon, making compaction less expensive (at the > expense > > of using more memory for a while). How many CPU cycles to spend on > > compaction vs. how much memory you can spare is of course a > trade-off. > > > > You can use -XX:ZFragmentationLimit to control this. It currently > > defaults to 25% and your workload seems to stabilize at 21%. If you > > want > > more aggressive compaction/reclamation, then set the > > -XX:ZFragmentationLimit to something below 21. 
This may or may not > be a > > good trade-off in your case. The alternative is to give the GC a > larger > > heap to work with. > > > > cheers, > > Per > > > > On 11/4/19 7:56 PM, Sundara Mohan M wrote: > > > Hi, > > > I ran into this issue where ZGC is unable to reclaim memory > > for few > > > hours/days. It just keep printing "Exception in thread "RMI TCP > > > Connection(idle)" java.lang.OutOfMemoryError: Java heap space" > and > > > Allocation Stall happening on that thread. > > > > > > > > > Here is the metrics which shows for some reason even though there > is > > > Garbage but it is unable to Reclaim > > > > > > .... > > > [2019-11-04T*08:39:53.986+0000*][1765465.981s][info][gc,heap ] > > > GC(112126) Live: - 6366M (78%) > > 6366M (78%) > > > 6366M (78%) > > > - - > > > *[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > > > GC(112126) Garbage: - 1735M (21%) > > 1735M (21%) > > > 1731M (21%)* > > > - - > > > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > > GC(112126) > > > Reclaimed: - - 0M (0%) > > > 4M (0%) > > > ... > > > > > > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > > GC(135520) > > > Live: - 6367M (78%) 6367M (78%) > > > 6367M (78%) > > > - - > > > *[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > > > GC(135520) Garbage: - 1730M (21%) > > 1730M (21%) > > > 1724M (21%)* > > > - - > > > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > > GC(135520) > > > Reclaimed: - - 0M (0%) > > > 6M (0%) > > > > > > Here it was in this state for ~8hours and it is still happening. > > It says > > > has a Garbage of 21G but it is not able to Reclaim it everytime > > it reclaims > > > only 4-6M. > > > > > > Any idea what might be the issue here. 
> > > > > > > > > TIA > > > Sundar > > > > > > From m.sundar85 at gmail.com Wed Nov 6 20:17:54 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Wed, 6 Nov 2019 12:17:54 -0800 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> Message-ID: Hi Per As per [1] https://wiki.openjdk.java.net/display/zgc/Main it says it can handle *few hundred megabytes* to multi terabytes*.* So my understanding was if my application is running with 8G before, with ZGC and same heap also it should run without issues. So far that is not the case i have to increase the heap size always to make sure it gets the same latency/RPS. For me this doesn't seem to be true always in my case(heap ranging from 8 - 48 G i need to change to higher value to make sure i am getting same RPS and latency). Again this is my observation and might vary for different workload. Thanks Sundar On Wed, Nov 6, 2019 at 2:46 AM Per Liden wrote: > On 11/5/19 4:48 PM, Peter Booth wrote: > > Reading this and similar threads I am struck by the fact that ZGC users > are experiencing things that users of Azul?s Zing JVM also go through. I > remember the amazement at seeing a JVM run without substantive GC pauses > and thinking that it was a free lunch. But the price was two parts - > ensuring adequate heap, and rewiring brains that are accustomed to seeing > cpu and memory as independent resources. The second turns out to be much > harder. > > > > From experience, I think a lot of pain can be avoided by clearly > communicating that an adequate heap is a prerequisite for a healthy JVM. > Most java developers have absorbed the notion that large heaps are > bad/risky and unlearning takes time. > > The documentation on the ZGC wiki [1] tries to be clear about this, but > I'm sure it could be improved. 
> > [1] https://wiki.openjdk.java.net/display/zgc/Main > > cheers, > Per > > > > > Sent from my iPhone > > > >> On Nov 4, 2019, at 8:28 PM, Sundara Mohan M > wrote: > >> > >> ?HI Per, > >> This explains why it didn't work to reclaim memory, also my heap memory > was > >> 8G and 6G was strongly reachable (when i took heap dump). Agreed > increasing > >> heap memory will help in this case. > >> > >> Still trying to understand better on ZGC, > >> 1. So shouldn't GC try to be more aggressive and try to put more effort > to > >> reclaim without additional settings? > >> 2. Is there a reason why it shouldn't give more CPU to GC threads and > >> reclaim garbage (say after X run of GC it could not reclaim memory)? In > >> this case it would be good to reclaim existing garbage instead of doing > >> Allocation Stall and failing with heap out of memory. > >> > >> > >> Thanks > >> Sundar > >> > >>> On Mon, Nov 4, 2019 at 12:40 PM Per Liden > wrote: > >>> > >>> Hi, > >>> > >>> When a workload produces a uniformly swiss-cheesy heap, i.e. where all > >>> parts of the heap have roughly the same amount of garbage, then the GC > >>> will face a situation where there are no free lunches and it will have > >>> to work hard (compact a lot) to reclaim memory. Therefore, the GC will > >>> tolerate a certain amount of fragmentation/waste, in the hope that more > >>> object will die soon, making compaction less expensive (at the expense > >>> of using more memory for a while). How many CPU cycles to spend on > >>> compaction vs. how much memory you can spare is of course a trade-off. > >>> > >>> You can use -XX:ZFragmentationLimit to control this. It currently > >>> defaults to 25% and your workload seems to stabilize at 21%. If you > want > >>> more aggressive compaction/reclamation, then set the > >>> -XX:ZFragmentationLimit to something below 21. This may or may not be a > >>> good trade-off in your case. The alternative is to give the GC a larger > >>> heap to work with. 
> >>> > >>> cheers, > >>> Per > >>> > >>>> On 11/4/19 7:56 PM, Sundara Mohan M wrote: > >>>> Hi, > >>>> I ran into this issue where ZGC is unable to reclaim memory for > few > >>>> hours/days. It just keep printing "Exception in thread "RMI TCP > >>>> Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and > >>>> Allocation Stall happening on that thread. > >>>> > >>>> > >>>> Here is the metrics which shows for some reason even though there is > >>>> Garbage but it is unable to Reclaim > >>>> > >>>> .... > >>>> [2019-11-04T*08:39:53.986+0000*][1765465.981s][info][gc,heap ] > >>>> GC(112126) Live: - 6366M (78%) 6366M > >>> (78%) > >>>> 6366M (78%) > >>>> - - > >>>> *[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > >>>> GC(112126) Garbage: - 1735M (21%) 1735M > >>> (21%) > >>>> 1731M (21%)* > >>>> - - > >>>> [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > >>> GC(112126) > >>>> Reclaimed: - - 0M (0%) > >>>> 4M (0%) > >>>> ... > >>>> > >>>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > >>> GC(135520) > >>>> Live: - 6367M (78%) 6367M (78%) > >>>> 6367M (78%) > >>>> - - > >>>> *[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > >>>> GC(135520) Garbage: - 1730M (21%) 1730M > >>> (21%) > >>>> 1724M (21%)* > >>>> - - > >>>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > >>> GC(135520) > >>>> Reclaimed: - - 0M (0%) > >>>> 6M (0%) > >>>> > >>>> Here it was in this state for ~8hours and it is still happening. It > says > >>>> has a Garbage of 21G but it is not able to Reclaim it everytime it > >>> reclaims > >>>> only 4-6M. > >>>> > >>>> Any idea what might be the issue here. 
> >>>> > >>>> > >>>> TIA > >>>> Sundar > >>>> > >>> > > > From fw at deneb.enyo.de Wed Nov 6 20:47:26 2019 From: fw at deneb.enyo.de (Florian Weimer) Date: Wed, 06 Nov 2019 21:47:26 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: (Sundara Mohan M.'s message of "Wed, 6 Nov 2019 12:17:54 -0800") References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> Message-ID: <8736f03f6p.fsf@mid.deneb.enyo.de> * Sundara Mohan M.: > Hi Per > As per [1] https://wiki.openjdk.java.net/display/zgc/Main it says it can > handle *few hundred megabytes* to multi terabytes*.* > > So my understanding was if my application is running with 8G before, with > ZGC and same heap also it should run without issues. So far that is not > the case i have to increase the heap size always to make sure it gets the > same latency/RPS. What do you mean with ?before?? Thanks. From m.sundar85 at gmail.com Wed Nov 6 21:04:16 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Wed, 6 Nov 2019 13:04:16 -0800 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: <8736f03f6p.fsf@mid.deneb.enyo.de> References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> <8736f03f6p.fsf@mid.deneb.enyo.de> Message-ID: To be clear "before" i was referring to my previous GC, Parallel/G1/CMS. Here is what i was seeing Instance1 - 8G heap, ParallelGC, 100RPS, 200ms Latency Instance2 - 8G heap, ZGC, 100RPS, 600ms Latency Instance3 - 32G heap, ZGC, 100RPS, 200ms Latency My expectation was Instance2 should give me same result as Instance1 but that is not the case. Instead i had to move to Instance3 setting to get what i want. This is just my observation and it might not be same for all workloads. But on the other hand with ZGC my throughput increased from *(ParallelGC)97%* to* (ZGC)99.7% *and my STW pauses have never crossed 20ms. 
Thanks Sundar On Wed, Nov 6, 2019 at 12:47 PM Florian Weimer wrote: > * Sundara Mohan M.: > > > Hi Per > > As per [1] https://wiki.openjdk.java.net/display/zgc/Main it says it > can > > handle *few hundred megabytes* to multi terabytes*.* > > > > So my understanding was if my application is running with 8G before, with > > ZGC and same heap also it should run without issues. So far that is not > > the case i have to increase the heap size always to make sure it gets the > > same latency/RPS. > > What do you mean with ?before?? Thanks. > From conniall at amazon.com Wed Nov 6 21:15:38 2019 From: conniall at amazon.com (Connaughton, Niall) Date: Wed, 6 Nov 2019 21:15:38 +0000 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> <8736f03f6p.fsf@mid.deneb.enyo.de> Message-ID: <5C0FB2FF-2768-458E-9C60-7ECFCB8B8DAD@amazon.com> ZGC and other pauseless/low pause collectors are designed to allow your process to continue while the GC is running - hence why your pause times are low. The problem is that if your process is still running, it's still allocating. If you are filling up the heap faster than ZGC can keep up, you will have degraded performance. Assuming your allocations are mostly not long-lived, you give ZGC more time to collect by giving it a bigger heap, which takes longer for you to fill. This is the mindset shift that Peter Booth was referring to earlier. A lot of engineers have developed/been trained into a mindset where smaller heaps are better, particularly by STW GCs like ParallelGC. That approach is not helpful with concurrent collectors, and it's something you can already see with G1GC. In most cases giving G1GC a larger heap will improve performance, but it does depend on your workload and allocation/lifetime pattern. 
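To put rough numbers on this mindset shift: a concurrent collector has to finish its cycle before the application fills the free headroom (roughly heap size minus live set). The sketch below uses assumed figures (a 6 GB live set and a 0.5 GB/s allocation rate, not numbers measured in this thread) purely to illustrate the arithmetic:

```java
// Back-of-the-envelope: how long a concurrent GC cycle has to finish
// before the application exhausts the free heap. All inputs are
// illustrative assumptions, not measurements from this thread.
public class GcHeadroom {
    static double headroomSeconds(double heapGb, double liveGb, double allocGbPerSec) {
        // Time for the mutator to fill the free space (heap - live)
        return (heapGb - liveGb) / allocGbPerSec;
    }

    public static void main(String[] args) {
        double liveSet = 6.0;   // GB of live objects (assumed)
        double allocRate = 0.5; // GB/s allocation rate (assumed)

        // 8 GB heap: the collector has only a few seconds of headroom.
        System.out.printf("8G heap:  %.1fs to complete a cycle%n",
                headroomSeconds(8.0, liveSet, allocRate));
        // 32 GB heap: the same cycle gets 13x more time before a stall.
        System.out.printf("32G heap: %.1fs to complete a cycle%n",
                headroomSeconds(32.0, liveSet, allocRate));
    }
}
```

Under these assumed numbers an 8G heap leaves the collector about 4 seconds per cycle while a 32G heap leaves about 52 seconds, which is why the larger heap behaves so much better with a concurrent collector even though the live set is unchanged.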
So the fact ZGC can handle heaps that are TBs in size doesn't mean that you can expect better performance by moving to ZGC and keeping the heap size the same, even if the heap is only a few GB. Niall ?On 11/6/19, 13:05, "zgc-dev on behalf of Sundara Mohan M" wrote: To be clear "before" i was referring to my previous GC, Parallel/G1/CMS. Here is what i was seeing Instance1 - 8G heap, ParallelGC, 100RPS, 200ms Latency Instance2 - 8G heap, ZGC, 100RPS, 600ms Latency Instance3 - 32G heap, ZGC, 100RPS, 200ms Latency My expectation was Instance2 should give me same result as Instance1 but that is not the case. Instead i had to move to Instance3 setting to get what i want. This is just my observation and it might not be same for all workloads. But on the other hand with ZGC my throughput increased from *(ParallelGC)97%* to* (ZGC)99.7% *and my STW pauses have never crossed 20ms. Thanks Sundar On Wed, Nov 6, 2019 at 12:47 PM Florian Weimer wrote: > * Sundara Mohan M.: > > > Hi Per > > As per [1] https://wiki.openjdk.java.net/display/zgc/Main it says it > can > > handle *few hundred megabytes* to multi terabytes*.* > > > > So my understanding was if my application is running with 8G before, with > > ZGC and same heap also it should run without issues. So far that is not > > the case i have to increase the heap size always to make sure it gets the > > same latency/RPS. > > What do you mean with ?before?? Thanks. > From fw at deneb.enyo.de Wed Nov 6 21:22:42 2019 From: fw at deneb.enyo.de (Florian Weimer) Date: Wed, 06 Nov 2019 22:22:42 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: (Sundara Mohan M.'s message of "Wed, 6 Nov 2019 13:04:16 -0800") References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> <8736f03f6p.fsf@mid.deneb.enyo.de> Message-ID: <87pni41yzh.fsf@mid.deneb.enyo.de> * Sundara Mohan M.: > To be clear "before" i was referring to my previous GC, Parallel/G1/CMS. 
> > Here is what i was seeing > Instance1 - 8G heap, ParallelGC, 100RPS, 200ms Latency > Instance2 - 8G heap, ZGC, 100RPS, 600ms Latency > Instance3 - 32G heap, ZGC, 100RPS, 200ms Latency Latency is measured end-to-end, including processing time and stalls etc.? Keep in mind that ZGC does not support compressed pointers, so if your workload is heavy on pointers, the VM has to do more work, and the memory requirements are also higher. > My expectation was Instance2 should give me same result as Instance1 but > that is not the case. Instead i had to move to Instance3 setting to get > what i want. It's difficult to beat ParallelGC in terms of overall CPU efficiency. If you do not have CPU cycles to spare and the GC does not have sufficient extra heap to work with beyond the live object set (to some extent, you can trade RAM vs CPU), application performance will suffer. You could also give Shenandoah a try. 8-) From per.liden at oracle.com Wed Nov 6 23:08:44 2019 From: per.liden at oracle.com (Per Liden) Date: Thu, 7 Nov 2019 00:08:44 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> <8736f03f6p.fsf@mid.deneb.enyo.de> Message-ID: <2d0a4a9b-55d2-f7be-bb4c-e9a4387586a0@oracle.com> On 11/6/19 10:04 PM, Sundara Mohan M wrote: [...] > But on the other hand with ZGC my throughput increased from > *(ParallelGC)97%* to*(ZGC)99.7% *and my STW pauses have never crossed 20ms. Could you please paste the last printout of the "Garbage Collection Statistics" table? If you have pauses close to 20ms, it would be interesting to see where that time is spent. I would assume it's accounted to "Subphase: Pause Roots Threads", but seeing the whole table would tell me more.
thanks, Per From m.sundar85 at gmail.com Thu Nov 7 01:28:25 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Wed, 6 Nov 2019 17:28:25 -0800 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: <2d0a4a9b-55d2-f7be-bb4c-e9a4387586a0@oracle.com> References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> <8736f03f6p.fsf@mid.deneb.enyo.de> <2d0a4a9b-55d2-f7be-bb4c-e9a4387586a0@oracle.com> Message-ID: Unfortunately i don't have that stats. I was running application with this option for gc log -Xlog:gc,gc+init,gc+start,gc+phases,gc+heap,gc+cpu,gc+reloc,gc+ref,gc+marking,gc+metaspace=info. Will see if i can get that issue reproducible and get that data for you. Thanks Sundar On Wed, Nov 6, 2019 at 3:08 PM Per Liden wrote: > On 11/6/19 10:04 PM, Sundara Mohan M wrote: > [...] > > But on the other hand with ZGC my throughput increased from > > *(ParallelGC)97%* to*(ZGC)99.7% *and my STW pauses have never crossed > 20ms. > > Could you please paste the last printout of the "Garbage Collection > Statistics" table? If you have pauses close to 20ms, it would be > interesting to see where that time is spent. I would assume it's > accounted to "Subphase: Pause Roots Threads", but seeing the whole table > would tell the me more. > > thanks, > Per > From per.liden at oracle.com Thu Nov 7 07:25:10 2019 From: per.liden at oracle.com (Per Liden) Date: Thu, 7 Nov 2019 08:25:10 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> <8736f03f6p.fsf@mid.deneb.enyo.de> <2d0a4a9b-55d2-f7be-bb4c-e9a4387586a0@oracle.com> Message-ID: On 11/7/19 2:28 AM, Sundara Mohan M wrote: > Unfortunately i don't have that stats. I was running application with > this option for gc > log?-Xlog:gc,gc+init,gc+start,gc+phases,gc+heap,gc+cpu,gc+reloc,gc+ref,gc+marking,gc+metaspace=info. > Will see if i can get that issue reproducible and get that data for you. Ok, thanks. 
I'd recommend using -Xlog:gc*, that way you catch pretty much all you need to tell what's going on, without being super verbose. /Per > > > Thanks > Sundar > > On Wed, Nov 6, 2019 at 3:08 PM Per Liden > wrote: > > On 11/6/19 10:04 PM, Sundara Mohan M wrote: > [...] > > But on the other hand with ZGC my throughput increased from > > *(ParallelGC)97%* to*(ZGC)99.7% *and my STW pauses have never > crossed 20ms. > > Could you please paste the last printout of the "Garbage Collection > Statistics" table? If you have pauses close to 20ms, it would be > interesting to see where that time is spent. I would assume it's > accounted to "Subphase: Pause Roots Threads", but seeing the whole > table > would tell the me more. > > thanks, > Per > From m.sundar85 at gmail.com Tue Nov 12 00:42:30 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Mon, 11 Nov 2019 16:42:30 -0800 Subject: Upgrade to JDK13 for ZGC? Message-ID: Hi, I am using ZGC and trying to see if any bug fixes gone in to JDK13 other than "Uncommit memory feature". 1. Was "memory uncommit" is the only feature gone in to JDK13? 2. Is there a way to find all the bug fixed in JDK13 and categorize it by ZGC, so in future i can do it myself. TIA, Sundar From thomas.schatzl at oracle.com Tue Nov 12 10:26:50 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Nov 2019 11:26:50 +0100 Subject: Upgrade to JDK13 for ZGC? In-Reply-To: References: Message-ID: Hi, On 12.11.19 01:42, Sundara Mohan M wrote: > Hi, > I am using ZGC and trying to see if any bug fixes gone in to JDK13 > other than "Uncommit memory feature". > > 1. Was "memory uncommit" is the only feature gone in to JDK13? > 2. Is there a way to find all the bug fixed in JDK13 and categorize it by > ZGC, so in future i can do it myself. 
> > Something like this JBS query: https://bugs.openjdk.java.net/browse/JDK-8225227?jql=fixVersion%20in%20(%2213%22%2C%2013.0.1%2C%2013.0.2)%20AND%20labels%20%3D%20zgc%20order%20by%20lastViewed%20DESC should do the trick, i.e. showing everything with the "zgc" label and fix version of the various jdk 13 releases. Thanks, Thomas From m.sundar85 at gmail.com Tue Nov 12 21:27:55 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Tue, 12 Nov 2019 13:27:55 -0800 Subject: Upgrade to JDK13 for ZGC? In-Reply-To: References: Message-ID: Thank you. Regards Sundar On Tue, Nov 12, 2019 at 2:29 AM Thomas Schatzl wrote: > Hi, > > On 12.11.19 01:42, Sundara Mohan M wrote: > > Hi, > > I am using ZGC and trying to see if any bug fixes gone in to JDK13 > > other than "Uncommit memory feature". > > > > 1. Was "memory uncommit" is the only feature gone in to JDK13? > > 2. Is there a way to find all the bug fixed in JDK13 and categorize it by > > ZGC, so in future i can do it myself. > > > > > > Something like this JBS query: > > > https://bugs.openjdk.java.net/browse/JDK-8225227?jql=fixVersion%20in%20(%2213%22%2C%2013.0.1%2C%2013.0.2)%20AND%20labels%20%3D%20zgc%20order%20by%20lastViewed%20DESC > > should do the trick, i.e. showing everything with the "zgc" label and > fix version of the various jdk 13 releases. > > Thanks, > Thomas > From m.sundar85 at gmail.com Thu Nov 14 18:58:30 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Thu, 14 Nov 2019 10:58:30 -0800 Subject: Why does load average on host increases as Allocation Stall happens? Message-ID: Hi, I have notices Load average on the host increases 5 - 10 times when Allocation Stall happens, trying to understand what causes load average to increase when this happens. 
Looking at the code zPageAllocator.cpp do { // Start asynchronous GC ZCollectedHeap::heap()->collect(GCCause::_z_allocation_stall); // Wait for allocation to complete or fail page = request.wait(); } while (page == gc_marker); Seems request.wait() is internally doing a get call on ZFuture. 1. Will this use this thread to spin on CPU or it is async (mean this thread will go to sleep and can be woken up when it is ready and other process can occupy this CPU)? 2. Since load average increase matches exactly with allocation stall, is there any other operation (like Flushing page) can cause this behavior? Since i haven't enabled "gc,stats" tag in my logging i missed some information there. Will try to get that information when i can reproduce it. TIA, Sundar From fw at deneb.enyo.de Thu Nov 14 19:24:41 2019 From: fw at deneb.enyo.de (Florian Weimer) Date: Thu, 14 Nov 2019 20:24:41 +0100 Subject: Why does load average on host increases as Allocation Stall happens? In-Reply-To: (Sundara Mohan M.'s message of "Thu, 14 Nov 2019 10:58:30 -0800") References: Message-ID: <877e42w9ae.fsf@mid.deneb.enyo.de> * Sundara Mohan M.: > I have notices Load average on the host increases 5 - 10 times > when Allocation Stall happens, trying to understand what causes load > average to increase when this happens. > 2. Since load average increase matches exactly with allocation stall, is > there any other operation (like Flushing page) can cause this behavior? I don't know ZGC internals, but I think it stalls the application when the GC cannot keep up. This is a last resort. Before that happens, more GC threads will try hard to reclaim memory. That work increases system load. An alternative explanation could be that something else consumes CPU resources, taking it away from the GC threads, so that they cannot keep up, and ZGC has to introduce allocation stalls. 
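The waiting side of the quoted stall loop can be illustrated with a toy analogy in plain Java (this is not ZGC's actual ZFuture/C++ implementation, and the names are made up): the stalled thread blocks on a future and is parked by the OS scheduler, consuming no CPU, while it is the concurrent GC work that drives the load.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Toy analogy of a stalled allocation request (NOT ZGC's real code):
// the caller parks on a future; a worker thread later completes it.
public class StallWaitSketch {
    // Parks the calling thread (WAITING state) until the request is
    // satisfied; join() does not spin on the CPU.
    static String awaitAllocation(CompletableFuture<String> request) {
        return request.join();
    }

    public static void main(String[] args) throws InterruptedException {
        CompletableFuture<String> page = new CompletableFuture<>();

        Thread gc = new Thread(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(100); // simulate a GC cycle
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            page.complete("page"); // wakes the stalled thread
        });
        gc.start();

        // Analogous to "page = request.wait()": blocked, not spinning.
        System.out.println("allocation satisfied: " + awaitAllocation(page));
        gc.join();
    }
}
```

In this sketch the waiting thread contributes nothing to CPU usage while parked, which matches the observation that any load-average spike around a stall comes from the GC (or competing application) threads doing runnable work, not from the wait itself.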
From m.sundar85 at gmail.com Fri Nov 15 18:55:38 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Fri, 15 Nov 2019 10:55:38 -0800 Subject: MMU drops suddenly Message-ID: Hi, Have noticed following in gc log [2019-11-13T19:24:13.095+0000][69629.984s][info][gc,mmu ] GC(10952) MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, 100ms/90.8% [2019-11-13T20:12:55.339+0000][72552.228s][info][gc,mmu ] GC(11441) MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, *100ms/90.8%* [2019-11-13T21:00:53.415+0000][75430.304s][info][gc,mmu ] GC(11927) MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, *100ms/70.7%* [2019-11-13T21:52:46.244+0000][78543.133s][info][gc,mmu ] GC(12450) MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7% [2019-11-13T22:40:35.887+0000][81412.776s][info][gc,mmu ] GC(12946) MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7% [2019-11-13T23:27:23.807+0000][84220.696s][info][gc,mmu ] GC(13410) MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/0.0%, *100ms/43.0%* Was trying to understand what it means and here is my understanding, This says how much minimum CPU available for mutator thread in last Xms 1. Is this correct? 2. Why is this suddenly dropping from (100ms 90% -> 40%) ? Also other time unit it is 0% does that mean my application doesn't get a chance to run? Also i see it never goes back to higher value. 3. Does this measure indicates something good or bad? 3. If this is bad what should i look further to get more insights? Can someone help me to get better understanding on this. 
TIA, Sundar From per.liden at oracle.com Mon Nov 18 08:43:14 2019 From: per.liden at oracle.com (Per Liden) Date: Mon, 18 Nov 2019 09:43:14 +0100 Subject: MMU drops suddenly In-Reply-To: References: Message-ID: Hi, On 11/15/19 7:55 PM, Sundara Mohan M wrote: > Hi, > Have noticed following in gc log > [2019-11-13T19:24:13.095+0000][69629.984s][info][gc,mmu ] GC(10952) > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, 100ms/90.8% > [2019-11-13T20:12:55.339+0000][72552.228s][info][gc,mmu ] GC(11441) > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, *100ms/90.8%* > [2019-11-13T21:00:53.415+0000][75430.304s][info][gc,mmu ] GC(11927) > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, *100ms/70.7%* > [2019-11-13T21:52:46.244+0000][78543.133s][info][gc,mmu ] GC(12450) > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7% > [2019-11-13T22:40:35.887+0000][81412.776s][info][gc,mmu ] GC(12946) > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7% > [2019-11-13T23:27:23.807+0000][84220.696s][info][gc,mmu ] GC(13410) > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/0.0%, *100ms/43.0%* > > Was trying to understand what it means and here is my understanding, This > says how much minimum CPU available for mutator thread in last Xms > 1. Is this correct? Not quite. The MMU printout tells you the minimum amount of time Java threads could execute in the different time windows. Note that it's the worst case since the VM started. For example, 10ms/23.7% means there has been at least one 10ms window, where Java threads could only execute for 23.7% of that time (2.37ms). > 2. Why is this suddenly dropping from (100ms 90% -> 40%) ? Also other time > unit it is 0% does that mean my application doesn't get a chance to run? Right, 2ms/0.0% means there was at least one 2ms windows, where the Java threads didn't get a chance to run at all. > Also i see it never goes back to higher value. Correct. 
Since it shows the worst case since the VM started it will never go back to a higher value. > 3. Does this measure indicates something good or bad? In general, 0% is bad, 100% is good. Exactly which time window you're interested in depends on what response time requirements you have. A simplified example to show the principle: Assume a request takes 5ms to process in your application, and you have a response time requirement of 10ms, then 10ms/60% would be good, but 10ms/40% would not be good enough. > 3. If this is bad what should i look further to get more insights? Look at the GC pauses, how long they are and how far apart they are. The GC statistics printed by gc+stats shows you where you're spending time in pauses. If the GC pauses are long, then ZGC is likely starved on CPU. If the GC pauses are close to each other, then ZGC is likely doing back-to-back GCs and needs more heap to work with. cheers, Per From per.liden at oracle.com Mon Nov 18 08:59:16 2019 From: per.liden at oracle.com (Per Liden) Date: Mon, 18 Nov 2019 09:59:16 +0100 Subject: Why does load average on host increases as Allocation Stall happens? In-Reply-To: References: Message-ID: On 11/14/19 7:58 PM, Sundara Mohan M wrote: > Hi, > I have notices Load average on the host increases 5 - 10 times when > Allocation Stall happens, trying to understand what causes load average to > increase when this happens. It's impossible to say with certainty without inspecting what's actually going on in the system. Florian's explanations are good. It could just be that your application workload is peaking, which in turn causes the allocation stalls. > > Looking at the code > zPageAllocator.cpp > do { > // Start asynchronous GC > ZCollectedHeap::heap()->collect(GCCause::_z_allocation_stall); > > // Wait for allocation to complete or fail > page = request.wait(); > } while (page == gc_marker); > > Seems request.wait() is internally doing a get call on ZFuture. > > 1. 
Will this use this thread to spin on CPU or it is async (mean this > thread will go to sleep and can be woken up when it is ready and other > process can occupy this CPU)? It's async. /Per > 2. Since load average increase matches exactly with allocation stall, is > there any other operation (like Flushing page) can cause this behavior? > > Since i haven't enabled "gc,stats" tag in my logging i missed some > information there. Will try to get that information when i can reproduce it. > > > TIA, > Sundar > From per.liden at oracle.com Mon Nov 18 12:32:47 2019 From: per.liden at oracle.com (Per Liden) Date: Mon, 18 Nov 2019 13:32:47 +0100 Subject: Upgrade to JDK13 for ZGC? In-Reply-To: References: Message-ID: The ZGC wiki also has a high-level change log, which highlights the most interesting user visible enhancements: https://wiki.openjdk.java.net/display/zgc/Main#Main-ChangeLog /Per On 11/12/19 10:27 PM, Sundara Mohan M wrote: > Thank you. > > Regards > Sundar > > On Tue, Nov 12, 2019 at 2:29 AM Thomas Schatzl > wrote: > >> Hi, >> >> On 12.11.19 01:42, Sundara Mohan M wrote: >>> Hi, >>> I am using ZGC and trying to see if any bug fixes gone in to JDK13 >>> other than "Uncommit memory feature". >>> >>> 1. Was "memory uncommit" is the only feature gone in to JDK13? >>> 2. Is there a way to find all the bug fixed in JDK13 and categorize it by >>> ZGC, so in future i can do it myself. >>> >>> >> >> Something like this JBS query: >> >> >> https://bugs.openjdk.java.net/browse/JDK-8225227?jql=fixVersion%20in%20(%2213%22%2C%2013.0.1%2C%2013.0.2)%20AND%20labels%20%3D%20zgc%20order%20by%20lastViewed%20DESC >> >> should do the trick, i.e. showing everything with the "zgc" label and >> fix version of the various jdk 13 releases. >> >> Thanks, >> Thomas >> From m.sundar85 at gmail.com Tue Nov 19 19:28:28 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Tue, 19 Nov 2019 11:28:28 -0800 Subject: Upgrade to JDK13 for ZGC? In-Reply-To: References: Message-ID: Cool, thanks! 
On Mon, Nov 18, 2019 at 4:32 AM Per Liden wrote: > The ZGC wiki also has a high-level change log, which highlights the most > interesting user visible enhancements: > > https://wiki.openjdk.java.net/display/zgc/Main#Main-ChangeLog > > /Per > > On 11/12/19 10:27 PM, Sundara Mohan M wrote: > > Thank you. > > > > Regards > > Sundar > > > > On Tue, Nov 12, 2019 at 2:29 AM Thomas Schatzl < > thomas.schatzl at oracle.com> > > wrote: > > > >> Hi, > >> > >> On 12.11.19 01:42, Sundara Mohan M wrote: > >>> Hi, > >>> I am using ZGC and trying to see if any bug fixes gone in to > JDK13 > >>> other than "Uncommit memory feature". > >>> > >>> 1. Was "memory uncommit" is the only feature gone in to JDK13? > >>> 2. Is there a way to find all the bug fixed in JDK13 and categorize it > by > >>> ZGC, so in future i can do it myself. > >>> > >>> > >> > >> Something like this JBS query: > >> > >> > >> > https://bugs.openjdk.java.net/browse/JDK-8225227?jql=fixVersion%20in%20(%2213%22%2C%2013.0.1%2C%2013.0.2)%20AND%20labels%20%3D%20zgc%20order%20by%20lastViewed%20DESC > >> > >> should do the trick, i.e. showing everything with the "zgc" label and > >> fix version of the various jdk 13 releases. > >> > >> Thanks, > >> Thomas > >> > From m.sundar85 at gmail.com Tue Nov 19 19:29:37 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Tue, 19 Nov 2019 11:29:37 -0800 Subject: Why does load average on host increases as Allocation Stall happens? In-Reply-To: References: Message-ID: Thank you for the clarification. I will try to get more gc log and system information during that time to get more detail. Thanks Sundar On Mon, Nov 18, 2019 at 12:59 AM Per Liden wrote: > On 11/14/19 7:58 PM, Sundara Mohan M wrote: > > Hi, > > I have notices Load average on the host increases 5 - 10 times when > > Allocation Stall happens, trying to understand what causes load average > to > > increase when this happens. 
> > It's impossible to say with certainty without inspecting what's actually > going on in the system. Florian's explanations are good. It could just > be that your application workload is peaking, which in turn causes the > allocation stalls. > > > > > Looking at the code > > zPageAllocator.cpp > > do { > > // Start asynchronous GC > > ZCollectedHeap::heap()->collect(GCCause::_z_allocation_stall); > > > > // Wait for allocation to complete or fail > > page = request.wait(); > > } while (page == gc_marker); > > > > Seems request.wait() is internally doing a get call on ZFuture. > > > > 1. Will this use this thread to spin on CPU or it is async (mean this > > thread will go to sleep and can be woken up when it is ready and other > > process can occupy this CPU)? > > It's async. > > /Per > > > 2. Since load average increase matches exactly with allocation stall, is > > there any other operation (like Flushing page) can cause this behavior? > > > > Since i haven't enabled "gc,stats" tag in my logging i missed some > > information there. Will try to get that information when i can reproduce > it. > > > > > > TIA, > > Sundar > > > From m.sundar85 at gmail.com Thu Nov 21 23:33:53 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Thu, 21 Nov 2019 15:33:53 -0800 Subject: MMU drops suddenly In-Reply-To: References: Message-ID: Got it. Thanks for the explanation. 
Regards Sundar On Mon, Nov 18, 2019 at 12:43 AM Per Liden wrote: > Hi, > > On 11/15/19 7:55 PM, Sundara Mohan M wrote: > > Hi, > > Have noticed following in gc log > > [2019-11-13T19:24:13.095+0000][69629.984s][info][gc,mmu ] GC(10952) > > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, 100ms/90.8% > > [2019-11-13T20:12:55.339+0000][72552.228s][info][gc,mmu ] GC(11441) > > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, > *100ms/90.8%* > > [2019-11-13T21:00:53.415+0000][75430.304s][info][gc,mmu ] GC(11927) > > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, *100ms/70.7%* > > [2019-11-13T21:52:46.244+0000][78543.133s][info][gc,mmu ] GC(12450) > > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7% > > [2019-11-13T22:40:35.887+0000][81412.776s][info][gc,mmu ] GC(12946) > > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7% > > [2019-11-13T23:27:23.807+0000][84220.696s][info][gc,mmu ] GC(13410) > > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/0.0%, *100ms/43.0%* > > > > Was trying to understand what it means and here is my understanding, This > > says how much minimum CPU available for mutator thread in last Xms > > 1. Is this correct? > > Not quite. The MMU printout tells you the minimum amount of time Java > threads could execute in the different time windows. Note that it's the > worst case since the VM started. For example, 10ms/23.7% means there has > been at least one 10ms window, where Java threads could only execute for > 23.7% of that time (2.37ms). > > > 2. Why is this suddenly dropping from (100ms 90% -> 40%) ? Also other > time > > unit it is 0% does that mean my application doesn't get a chance to run? > > Right, 2ms/0.0% means there was at least one 2ms windows, where the Java > threads didn't get a chance to run at all. > > > Also i see it never goes back to higher value. > > Correct. 
Since it shows the worst case since the VM started it will > never go back to a higher value. > > > 3. Does this measure indicates something good or bad? > > In general, 0% is bad, 100% is good. Exactly which time window you're > interested in depends on what response time requirements you have. A > simplified example to show the principle: Assume a request takes 5ms to > process in your application, and you have a response time requirement of > 10ms, then 10ms/60% would be good, but 10ms/40% would not be good enough. > > > 3. If this is bad what should i look further to get more insights? > > Look at the GC pauses, how long they are and how far apart they are. The > GC statistics printed by gc+stats shows you where you're spending time > in pauses. If the GC pauses are long, then ZGC is likely starved on CPU. > If the GC pauses are close to each other, then ZGC is likely doing > back-to-back GCs and needs more heap to work with. > > cheers, > Per > From m.sundar85 at gmail.com Thu Nov 21 23:37:54 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Thu, 21 Nov 2019 15:37:54 -0800 Subject: Heap dump is always around 8G on process with 80G heap Message-ID: Hi, I am trying to take a heap dump of java process with 80G heap with ZGC, this is always giving me around 8G dump file. Same application with ParallelGC running with 48G heap i am getting around 30G dump file. I am using following command on both process and verified both process has same no of request processed and Used memory from gc log is similar jcmd GC.heap_dump 1. Why is ZGC heap dump always less compared to process running with ParallelGC? 2. Is there something i am missing? 
Thanks Sundar From per.liden at oracle.com Fri Nov 22 08:31:03 2019 From: per.liden at oracle.com (Per Liden) Date: Fri, 22 Nov 2019 09:31:03 +0100 Subject: Heap dump is always around 8G on process with 80G heap In-Reply-To: References: Message-ID: On 11/22/19 12:37 AM, Sundara Mohan M wrote: > Hi, > I am trying to take a heap dump of java process with 80G heap with ZGC, > this is always giving me around 8G dump file. > Same application with ParallelGC running with 48G heap i am getting around > 30G dump file. > I am using following command on both process and verified both process has > same no of request processed and Used memory from gc log is similar > jcmd GC.heap_dump > > 1. Why is ZGC heap dump always less compared to process running with > ParallelGC? > 2. Is there something i am missing? > There are various reasons why a heap dump from one GC is larger or smaller compared to another GC. For example, ZGC only ever dumps reachable objects, while ParallelGC can also dump unreachable objects under some conditions (even though you didn't ask for them). It's hard to tell where the difference comes from in your case, without further inspection/debugging. /Per From m.sundar85 at gmail.com Fri Nov 22 19:39:08 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Fri, 22 Nov 2019 11:39:08 -0800 Subject: Heap dump is always around 8G on process with 80G heap In-Reply-To: References: Message-ID: Hi Per, "ZGC only ever dumps reachable objects" Does that mean we can never dump unreachable objects in ZGC or there are some options that can be passed to get it? I will try to see if the dump from other GC has unreachable objects which is showing as large file. Thanks Sundar On Fri, Nov 22, 2019 at 12:31 AM Per Liden wrote: > On 11/22/19 12:37 AM, Sundara Mohan M wrote: > > Hi, > > I am trying to take a heap dump of java process with 80G heap with > ZGC, > > this is always giving me around 8G dump file.
> > Same application with ParallelGC running with 48G heap i am getting > around > > 30G dump file. > > I am using following command on both process and verified both process > has > > same no of request processed and Used memory from gc log is similar > > jcmd GC.heap_dump > > > > 1. Why is ZGC heap dump always less compared to process running with > > ParallelGC? > > 2. Is there something i am missing? > > > > There are various reasons why a heap dump from one GC is larger or > smaller compared to another GC. For example, ZGC only ever dumps > reachable objects, while ParallelGC can also dump unreachable objects > under some conditions (even though you didn't ask for them). > > It's hard to tell where the difference comes from in your case, without > further inspection/debugging. > > /Per > From conniall at amazon.com Tue Nov 26 22:45:34 2019 From: conniall at amazon.com (Connaughton, Niall) Date: Tue, 26 Nov 2019 22:45:34 +0000 Subject: Is ZGC still in experimental? In-Reply-To: <7edbce9a-d89c-a16d-20a9-a20c48d51e5b@oracle.com> References: <7edbce9a-d89c-a16d-20a9-a20c48d51e5b@oracle.com> Message-ID: I wanted to double check on this. Is the intention that ZGC in JDK 11 is not intended for production use and will never have backports to bring it up to a production ready level? Put another way - if we want to evaluate using ZGC for a production service, is it an effective pre-requisite to move to a later JDK than 11? Obviously we can test with the JDK11 version, and if we don't happen to see any problems then can make an "at-your-own-risk" call on that. But the longer term plans for addressing issues will affect the appetite for whether to even start out on that version. Thanks, Niall On 10/31/19, 02:00, "zgc-dev on behalf of Per Liden" wrote: I would say that's unlikely at this time, given ZGC's experimental status in 11. /Per On 10/30/19 8:28 PM, Sundara Mohan M wrote: > Hi Per > Will these changes be merged back to JDK11 at any point? > For ex.
uncommit memory feature or C2 related changes will be merged > back to 11? > > Thanks > Sundar > > > On Tue, Oct 22, 2019 at 11:04 AM Sundara Mohan M > wrote: > > Ok, thanks for the update. > > On Tue, Oct 22, 2019 at 1:12 AM Per Liden > wrote: > > Hi, > > No decision has been made, but we're continuously evaluating > where we > stand. The new C2 load barriers (JDK-8230565) was a major milestone > towards making ZGC rock solid. We can hopefully make it > non-experimental > sooner rather than later. > > /Per > > On 10/22/19 12:14 AM, Sundara Mohan M wrote: > > Hi, > > Any idea when ZGC will be moved out of experimental flags? > > Understand it is too early to move it out of experimental but > do we have > > any plan to run it without +UnlockExperimentalVMOptions? > > > > Thanks > > Sundar > > > From per.liden at oracle.com Wed Nov 27 10:05:40 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Nov 2019 11:05:40 +0100 Subject: Is ZGC still in experimental? In-Reply-To: References: <7edbce9a-d89c-a16d-20a9-a20c48d51e5b@oracle.com> Message-ID: Hi, On 11/26/19 11:45 PM, Connaughton, Niall wrote: > I wanted to double check on this. Is the intention that ZGC in JDK 11 is not intended for production use and will never have backports to bring it up to a production ready level? Put another way - if we want to evaluate using ZGC for a production service, is it an effective pre-requisite to move to a later JDK than 11? > > Obviously we can test with the JDK11 version, and if we don't happen to see any problems then can make an "at-your-own-risk" call on that. But the longer term plans for addressing issues will affect the appetite for whether to even start out on that version. Experimental status basically means it's a technical preview giving people a chance to test it and provide feedback, without having to roll their own JDK. 
We have backported bug fixes on a few occasions, but that's typically
only done if we think it's critical for one reason or another, and the
bar is high. We don't intend to make ZGC in 11 non-experimental. In the
new JDK release model, LTS releases typically don't get "new features"
after GA.

ZGC and the supporting infrastructure in Hotspot are moving along at a
fairly rapid pace, and quite a lot of good stuff has gone into each
release since 11. Using the latest JDK is always recommended if you're
using ZGC. ZGC in JDK 11 vs. 13 can, for some workloads, be a quite
noticeable leap in terms of performance, latency, etc.

Hope that helps.

cheers,
/Per

> Thanks,
> Niall
>
> On 10/31/19, 02:00, "zgc-dev on behalf of Per Liden" wrote:
>
>     I would say that's unlikely at this time, given ZGC's experimental
>     status in 11.
>
>     /Per
>
>     On 10/30/19 8:28 PM, Sundara Mohan M wrote:
>     > Hi Per
>     > Will these changes be merged back to JDK11 at any point?
>     > For ex. uncommit memory feature or C2-related changes will be
>     > merged back to 11?
>     >
>     > Thanks
>     > Sundar
>     >
>     > On Tue, Oct 22, 2019 at 11:04 AM Sundara Mohan M wrote:
>     >
>     > Ok, thanks for the update.
>     >
>     > On Tue, Oct 22, 2019 at 1:12 AM Per Liden wrote:
>     >
>     > Hi,
>     >
>     > No decision has been made, but we're continuously evaluating
>     > where we stand. The new C2 load barriers (JDK-8230565) were a
>     > major milestone towards making ZGC rock solid. We can hopefully
>     > make it non-experimental sooner rather than later.
>     >
>     > /Per
>     >
>     > On 10/22/19 12:14 AM, Sundara Mohan M wrote:
>     > > Hi,
>     > > Any idea when ZGC will be moved out of the experimental flags?
>     > > Understand it is too early to move it out of experimental, but
>     > > do we have any plan to run it without +UnlockExperimentalVMOptions?
>     > > Thanks
>     > > Sundar

From conniall at amazon.com  Wed Nov 27 22:18:22 2019
From: conniall at amazon.com (Connaughton, Niall)
Date: Wed, 27 Nov 2019 22:18:22 +0000
Subject: Is ZGC still in experimental?
In-Reply-To:
References: <7edbce9a-d89c-a16d-20a9-a20c48d51e5b@oracle.com>
Message-ID:

Thanks, that helps clarify. For us, running on an LTS release is
preferred, as we don't necessarily want to be in a position of needing to
move to a newer JDK at high frequency. We're aware that ZGC is
experimental and know we're somewhat wading into unknown waters, so it's
a balance we have to think about: the potential benefits of a
fundamentally different GC vs. the potential pitfalls and additional
effort of keeping up to date. The Shenandoah team seem to be putting a
lot of effort into backporting to earlier JDKs - I guess this is down to
a choice they've specifically made.

On 11/27/19, 02:06, "Per Liden" wrote:

    Hi,

    On 11/26/19 11:45 PM, Connaughton, Niall wrote:
    > I wanted to double check on this. Is the intention that ZGC in JDK
    > 11 is not intended for production use and will never have backports
    > to bring it up to a production-ready level? Put another way: if we
    > want to evaluate ZGC for a production service, is it effectively a
    > prerequisite to move to a JDK later than 11?
    >
    > Obviously we can test with the JDK 11 version, and if we don't
    > happen to see any problems we can make an "at-your-own-risk" call
    > on that. But the longer-term plans for addressing issues will
    > affect the appetite for even starting out on that version.

    Experimental status basically means it's a technical preview, giving
    people a chance to test it and provide feedback without having to
    roll their own JDK. We have backported bug fixes on a few occasions,
    but that's typically only done if we think it's critical for one
    reason or another, and the bar is high. We don't intend to make ZGC
    in 11 non-experimental.
    In the new JDK release model, LTS releases typically don't get "new
    features" after GA.

    ZGC and the supporting infrastructure in Hotspot are moving along at
    a fairly rapid pace, and quite a lot of good stuff has gone into
    each release since 11. Using the latest JDK is always recommended if
    you're using ZGC. ZGC in JDK 11 vs. 13 can, for some workloads, be a
    quite noticeable leap in terms of performance, latency, etc.

    Hope that helps.

    cheers,
    /Per
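The heap-dump comparison earlier in the thread can be sketched as a pair of commands. This is a minimal sketch, not the poster's exact invocation: the pid (12345) and output paths are hypothetical placeholders, and the commands are only printed here since no running JVM is assumed. The point is that jmap's `live` option restricts the dump to reachable objects, which is all ZGC's `GC.heap_dump` ever writes, so the two dumps become comparable:

```shell
# Hypothetical target JVM pid:
PID=12345

# jcmd GC.heap_dump writes reachable objects only under ZGC:
ZGC_CMD="jcmd $PID GC.heap_dump /tmp/zgc.hprof"

# For the ParallelGC process, ask jmap for live objects only, so the
# dump is not inflated by unreachable objects awaiting collection:
PARALLEL_CMD="jmap -dump:live,format=b,file=/tmp/parallel.hprof $PID"

# Print the commands rather than running them (no JVM assumed here):
echo "$ZGC_CMD"
echo "$PARALLEL_CMD"
```

With both dumps limited to live objects, any remaining size difference points at actual live-set differences rather than at the collectors' dumping behavior.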