Strange G1 behavior

Kirk Pepperdine kirk at kodewerk.com
Fri Oct 20 13:59:28 UTC 2017


> On Oct 20, 2017, at 3:22 PM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi,
> 
>>> 
>> 
>> In my experience promotion rates are exacerbated by an overly small
>> young gen (which translates into an overly small to-space). In these
>> cases I believe it only adds to the overall pressure on tenured and
>> is part of the reason why the Full recovers as much as it does. Not
>> promoting has the benefit of not requiring a mixed collection to
>> clean things up. Thus larger survivors can still play a positive role,
>> as they do in generational collectors. Mileage will vary with each
>> application.
> 
> Yes. However, as mentioned, in this case the death rate is already quite
> low (1:4), so decreasing the young gen by >1/4th would even be a win if
> everything needed to be promoted (I suggested decreasing the young gen
> during mixed gc to 1/5th ;)).
> 
> In practice you probably won't get a 1:1 rate even with very small
> young gen.

Ok, it looks like we don’t agree on the cost model involved here, and that’s OK. Thinking about it made me realize that I jumped to a configuration that worked for other clients I’ve helped who’ve encountered this problem. The problem with that is it’s a recommendation made without any rigor or process. The one thing we have agreed upon is that this application becomes starved for memory, so it makes sense to give it more memory. In this case tenured encroaches to more than 15G before the Full is triggered, so a heap of 32G seems like a reasonable start. I would then get rid of any other heap configuration flags, hoping we can fall back to the default position of having only a max heap size set. We can then use the GC logs to see if any further configuration is required.
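
As a concrete starting point, that could look something like the following sketch (the exact flag set is illustrative, `app.jar` is a placeholder, and the logging flags shown are the standard JDK8 ones; only the max heap size is specified so everything else falls back to defaults):

```shell
# Illustrative JDK8 command line: only a max heap size, plus GC logging
# so the logs can drive any further tuning. All other heap-sizing flags
# (-Xms, -Xmn, survivor ratios, etc.) are deliberately omitted so the
# JVM's defaults and ergonomics apply.
java -Xmx32g \
     -XX:+UseG1GC \
     -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
     -Xloggc:gc.log \
     -jar app.jar
```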

>> 
>> This is great information. Unfortunately there isn’t any data to help
>> anyone understand what a reasonable setting should be. Would it also
> 
> For JDK8 the worst case is basically the number of references in your
> largest j.l.O. array times max number of j.l.O. arrays iirc.
> 
> In JDK9, 1024 * max number of j.l.O. arrays; also in JDK9 work
> distribution between mark threads is much better, i.e. the mark stack
> will be processed much faster.
> 
> (I did not think hard about either JDK's worst case, so I may be
> completely off).
> 
> As for when to increase mark stack size, you do get a message when/if
> the mark stack overflows... also, G1 could simply increase the mark
> stack size in that case and continue without restarting the marking if
> increasing space succeeded. There's a CR for that somewhere,
> contributions welcome as usual (G1 already does that iirc when there is
> a mark stack overflow during GC pause).
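
For reference, the knobs involved are HotSpot's real `MarkStackSize` and `MarkStackSizeMax` flags; a hedged sketch of the "increase it on overflow" suggestion might look like this (the specific values are examples only, `app.jar` is a placeholder, and the units these flags take should be checked against your JDK version):

```shell
# Illustrative: raise the G1 marking stack after the log shows mark
# stack overflow messages (e.g. "concurrent-mark-reset-for-overflow").
# The values below are examples, not recommendations; MarkStackSize
# must stay within MarkStackSizeMax.
java -Xmx32g -XX:+UseG1GC \
     -XX:MarkStackSize=8M -XX:MarkStackSizeMax=64M \
     -jar app.jar
```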

Ok, as an aside: I use some of the data in the GC log to estimate the level of parallelism in the collector. It’s a rough guess, but it’s sometimes useful in helping to identify cases such as this one.
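
One way such an estimate can be sketched is below (the JDK8-era `[Times: user=... sys=... real=...]` log line format is real, but the simple averaged `(user + sys) / real` ratio is my assumption about one way to do it, not necessarily Kirk's exact method):

```python
import re

# Matches JDK8-era HotSpot GC log timing lines, e.g.:
#   [Times: user=0.32 sys=0.01, real=0.05 secs]
TIMES_RE = re.compile(
    r"\[Times: user=(?P<user>[\d.]+) sys=(?P<sys>[\d.]+), real=(?P<real>[\d.]+) secs\]"
)

def estimate_parallelism(log_text):
    """Rough parallelism estimate: (user + sys) / real, averaged over pauses."""
    ratios = []
    for m in TIMES_RE.finditer(log_text):
        real = float(m.group("real"))
        if real > 0:
            ratios.append((float(m.group("user")) + float(m.group("sys"))) / real)
    return sum(ratios) / len(ratios) if ratios else None

# Hypothetical log fragment for illustration.
sample = """
2017-10-20T13:59:28.000+0000: 1.234: [GC pause (G1 Evacuation Pause) (young)
 [Times: user=0.32 sys=0.01, real=0.05 secs]
2017-10-20T13:59:29.000+0000: 2.345: [GC pause (G1 Evacuation Pause) (young)
 [Times: user=0.40 sys=0.00, real=0.08 secs]
"""
print(round(estimate_parallelism(sample), 2))
```

If the ratio is far below the configured number of GC threads, the pauses are not getting the parallelism you paid for.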
> 
>> be reasonable to double the mark stack size when you see these
>> failures. Also, is the max size of the stack bigger if you configure
>> a larger heap?
> 
> Use JDK9 - there, except in rather unlikely situations, you should not
> need to increase it.
> 
> Also in JDK9, the transition from marking to mixed gcs is faster,
> automatically decreasing the pressure.

I’m looking forward to being able to use 9.

Kind regards,
Kirk



