Re: G1 patch of elastic Java heap

Liang Mao maoliang.ml at alibaba-inc.com
Sat Oct 12 11:51:26 UTC 2019


Hi Thomas,

The manual generation limit can be set aside for now, since it might not be general enough for
 a GC. We can focus first on how to change the heap size and return memory at runtime. 

GCTimeRatio is a good metric to measure the health of a Java application, and I considered
 using it. But in the end I chose a simple approach, much like the periodic old GC: guaranteeing a long 
enough young GC interval is an alternative way to keep GCTimeRatio in a healthy state. 
I'm absolutely OK with using GCTimeRatio instead of a fixed young GC interval. In this respect it is the
 same as ZGC or Shenandoah with regard to balancing the desired memory size against GC frequency. I'm open to 
any good solution, and I think we are already on the same page on this issue :)
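As a rough sketch of what such a GCTimeRatio-driven resizing decision could look like (illustrative names and thresholds only, not our patch code):

```cpp
#include <cassert>

// Illustrative sketch: decide the heap resize direction from observed GC
// time, using the GCTimeRatio convention that GC may consume at most
// 1/(1 + ratio) of total time. The 0.5 shrink factor is an assumption.
enum class Resize { Expand, Keep, Shrink };

Resize decide_resize(double gc_time, double mutator_time, int gc_time_ratio) {
    double allowed = 1.0 / (1.0 + gc_time_ratio);      // max tolerable GC share
    double actual  = gc_time / (gc_time + mutator_time);
    if (actual > allowed)       return Resize::Expand;  // GC too busy: grow heap
    if (actual < 0.5 * allowed) return Resize::Shrink;  // far below budget: return memory
    return Resize::Keep;
}
```

With GCTimeRatio=99 the allowed GC share is 1%, so an application spending only 0.1% of its time in GC would be a shrink candidate.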

A big difference in our implementation is that heap resizing is evaluated at every young GC instead of in a 
concurrent GC cycle, which I think is swifter and more immediate. The concurrent map/unmap 
mechanism gets rid of the additional pause time. My thought is that heap shrinking/expansion can
 all be decided in the young GC pause and performed in a concurrent thread, which keeps the 
considerable cost of the OS interface off the pause path. Most of our Java users cannot tolerate the pause
 spikes caused by page faults, which can reach seconds. We also found the same map/unmap time 
cost issue in ZGC.
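A minimal sketch of this split, where selecting regions happens "in the pause" and the expensive OS work is done by a concurrent worker thread (illustrative only; the actual patch code differs):

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Illustrative sketch (not HotSpot code): region indices selected for
// release during the GC pause are only queued; a concurrent worker does
// the actual uncommit (the slow unmap/page-fault work) outside the pause.
class ConcurrentUncommitter {
public:
    ConcurrentUncommitter() : worker_(&ConcurrentUncommitter::run, this) {}
    ~ConcurrentUncommitter() {
        { std::lock_guard<std::mutex> g(mu_); done_ = true; }
        cv_.notify_one();
        worker_.join();
    }
    // Called inside the pause: just record which regions to free.
    void enqueue(const std::vector<int>& regions) {
        { std::lock_guard<std::mutex> g(mu_);
          for (int r : regions) pending_.push(r); }
        cv_.notify_one();
    }
    size_t uncommitted() const {
        std::lock_guard<std::mutex> g(mu_);
        return freed_;
    }
private:
    void run() {
        std::unique_lock<std::mutex> lk(mu_);
        for (;;) {
            cv_.wait(lk, [&]{ return done_ || !pending_.empty(); });
            while (!pending_.empty()) {
                int region = pending_.front(); pending_.pop();
                lk.unlock();
                uncommit(region);   // e.g. madvise/munmap; slow, but concurrent
                lk.lock();
                ++freed_;
            }
            if (done_) return;
        }
    }
    static void uncommit(int /*region*/) { /* placeholder for the OS call */ }

    mutable std::mutex mu_;
    std::condition_variable cv_;
    std::queue<int> pending_;
    std::thread worker_;
    size_t freed_ = 0;
    bool done_ = false;
};
```

The pause-side cost is then only a queue push under a lock; the page faults and system calls happen on the worker's time.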

A direct advantage of the young-GC resizing and the concurrent memory-freeing mechanism is for implementing
SoftMaxHeapSize: the heap size can be changed after the last mixed GC, young GC pauses do not get
 longer, and the memory can be freed concurrently without side effects.

Thanks,
Liang





------------------------------------------------------------------
From:Thomas Schatzl <thomas.schatzl at oracle.com>
Send Time:2019 Oct. 11 (Fri.) 19:02
To:"MAO, Liang" <maoliang.ml at alibaba-inc.com>; hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject:Re: G1 patch of elastic Java heap

Hi,

On 10.10.19 15:48, Liang Mao wrote:
> Hi Thomas,
> 
> Thank you for the feedback.
> You are right about some points: the present code does seem to separate 
> the heap into young- and old-gen pools. In OpenJDK 8 there is no adaptive IHOP, so a fixed IHOP 
> and MaxNewSize can clearly separate the young gen from the old gen. I'm also thinking about how to design this better 
> in upstream OpenJDK G1.
> 
> There is a tradeoff between memory and GC frequency: more frequent GC 
> uses less memory. We found that our online service applications keep a large young generation for 
> potential query traffic, but most of the time the young GC frequency is quite low, so memory could easily be saved 
> by using a smaller young gen.
> In Shenandoah or ZGC there is only one generation, so it is 
> straightforward to determine whether memory is wasted and can be returned. G1 has two generations; at remark, 
> MinHeapFreeRatio/MaxHeapFreeRatio cannot tell that the young generation is largely wasted when the application 
> has run for two minutes without a young GC, even though a lot of memory could be returned. Each generation's GC 
> interval, or the time ratio spent on mutator/GC that you mentioned, seems more intuitive.
> 
> An explicit limit on a generation may not be a good design from G1 
> GC's perspective, but from an operations point of view it makes the JVM easy to manipulate. There is a 
> simple relationship: larger network traffic -> higher memory allocation rate -> larger young 
> generation. So cluster operations can easily set the young generation of every Java instance to 10% of the max 
> young gen size if the network traffic is guaranteed to stay below 10% for a period of time.
> 
> I'm not wedded to the current implementation's clear boundary 
> between young and old gen, especially for newer OpenJDK versions, and I've been thinking of unifying 
> the two generations' resizing within the single memory pool of the heap, together with Xms. The periodic 
> uncommit mode does not strictly separate the young and old gen. The current implementation calculates the 
> average GC interval, keeps it within a range between a low bound and a high bound, and immediately 
> triggers an expansion if a single GC interval falls below a threshold. We could use a similar 
> policy to estimate a target young generation capacity and adjust the old generation capacity after a 
> concurrent cycle. The two parts together form the target heap capacity, which can vary between 
> Xms and Xmx. The difference from current G1 is that resizing can happen at any young GC, not only at remark.

Thank you for presenting your problem (and not insisting on a particular 
solution upfront).

Summary of this long text:

In case of "low" activity the user wants to limit the heap, resulting in 
memory being given back. Currently, all the user can do is specify the 
maximum amount of work the GC is allowed to use (GCTimeRatio). At least 
in G1, as soon as the time spent in GC compared to mutator time is lower 
than GCTimeRatio allows (typically achieved by expanding the heap), it "never" 
shrinks the heap back (at least not based on that ratio). That wastes 
lots of space, which is the problem.

We all agree that this is a problem :) I believe we only differ on what 
knobs the user should have available to achieve this.

Here are my current suggestions:

One option that I suggested earlier: instead of setting 
generation sizes (or heap sizes) manually (which could be fine in some 
cases for other reasons), we could think a bit differently about 
GCTimeRatio than we do now. Currently it is the maximum amount of GC 
activity the user can bear, so we make the GC use less.
The slight tweak here would be to assume that any GC activity below 
that is fine :)

I.e. if current GC activity is very low compared to mutator activity (far 
below what GCTimeRatio allows), and expected additional GC activity 
caused by this forced GC cycle would not exceed that GCTimeRatio, why 
not do the GC?
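A rough sketch of that check (all names illustrative, not G1 code): trigger a preventive GC only if, after adding its expected cost, total GC activity still stays within the budget GCTimeRatio allows.

```cpp
#include <cassert>

// Illustrative sketch of a "minimum GCTimeRatio" trigger: the forced cycle
// is permitted only while total GC activity, including the cycle's own
// expected cost, stays within the 1/(1 + ratio) budget.
bool should_force_gc(double gc_time, double total_time,
                     double expected_gc_cost, int gc_time_ratio) {
    double allowed = 1.0 / (1.0 + gc_time_ratio);
    double after   = (gc_time + expected_gc_cost) / (total_time + expected_gc_cost);
    return after <= allowed;
}
```

With GCTimeRatio=99 the budget is 1% of total time, so an almost-idle application can afford the extra cycle, while one already near its budget cannot.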

Think of it as a "minimum" GCTimeRatio; in some ways this is very much like 
minimum and maximum GC intervals, only with much more flexibility for the 
GC in meeting it (also, this metric is independent of the environment, e.g. 
hardware, while setting actual size values needs tuning).

I agree that there is then not an immediately obvious relation between 
external input (the traffic in your example) to what you should set that 
"minimum" GCTimeRatio to. However since there is a relation to young gen 
size and GCTimeRatio I think this can be figured out.

This is what ZGC does and I think would be worth trying out before 
thinking about adding a G1 specific way of achieving this or a similar 
effect.

The other option, which is more direct, would be implementing changes to the 
target heap size during runtime: it would also automatically shrink the 
heap. I believe that if you were able to modify the current adaptive 
IHOP's "target" heap size from the outside, G1 would already automatically 
give back memory; in conjunction with the "Promptly Return ..." mechanism, it 
would also make sure that the GC cycle continues even in cases of very low 
mutator activity.
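As a sketch, such an external cap could be as simple as clamping the adaptively computed target (all names here are hypothetical, not the actual adaptive IHOP code):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Illustrative sketch: an externally settable soft limit caps the adaptive
// target heap size, while the minimum heap size bounds it from below.
// Shrinking then falls out of the existing sizing machinery automatically.
size_t effective_target(size_t adaptive_target, size_t soft_max, size_t min_heap) {
    return std::max(min_heap, std::min(adaptive_target, soft_max));
}
```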

As for whether this feature would be accepted for inclusion into G1: 
there is already a SoftMaxHeapSize switch in the JDK, so I guess this is 
a non-issue.

Note that you can *already*, if you know that from a particular time on 
there will be little activity, modify the "Promptly Return..." settings 
so that it will immediately start cleaning up and compacting the heap; 
you can even force maximum compaction at that time by issuing a full gc 
if service interruption is not an issue.

> 
> In order to do swift heap resizing we have to overcome the overhead of 
> requesting/releasing memory from the OS. The memory unmap and map (including the page faults) cost significant 
> time, so we use an intuitive approach: a concurrent thread does the map/unmap/pretouch, and the free 
> regions are synchronized in the GC pause. In our applications, a typical G1 remark costs ~100 ms of pause. I 
> haven't tested the latest G1, but based on our experimental data the pause can easily double if it does 
> considerable map/unmap work.
> 

That's a related but distinct problem and a solution that seems at least 
worth trying :)

> 
> All of the above are our thoughts, and the present implementation is a kind of 
> reference. Please let me know if
> I have answered all your questions. I hope we can come to an agreement on some 
> points and conceive a good design
> for the latest G1 GC :)
> 

Thanks,
   Thomas


