Re: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics

Mon Feb 10 11:47:06 UTC 2020

Hi Thomas,

In my testing, I didn't change the value of Min/MaxHeapFreeRatio.

The heap had already shrunk to 5GB, but at remark it expanded to 6644M.
The default value of MinHeapFreeRatio is 40, so the minimum committed size
after remark is the used heap size * 1.67 (3979M * 1.67 = 6644M).
1.67 = 100/(100 - 40)
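
A minimal sketch of that arithmetic, for illustration only. The helper name
is hypothetical and this is not HotSpot code; it assumes the ratio is a
percentage below 100 and ignores region alignment.

```cpp
// Hypothetical helper, not HotSpot code: the smallest capacity that still
// leaves min_free_percent of the heap free for a given amount of used memory.
// Assumes 0 <= min_free_percent < 100; the result is in the same unit as `used`.
constexpr double min_capacity_for_free_ratio(double used, unsigned min_free_percent) {
  // used / (1 - p/100) == used * 100 / (100 - p)
  return used * 100.0 / (100.0 - min_free_percent);
}
```

With used = 3979 (in MB) and a ratio of 40, the unrounded factor 100/60 gives
roughly 6632 (MB), close to the 6644M in the log.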

[1031.322s][info][gc ] GC(741) Pause Young (Concurrent Start) (G1 Evacuation Pause) 4724M->4506M(5120M) 10.607ms
[1031.322s][info][gc,cpu         ] GC(741) User=0.42s Sys=0.00s Real=0.01s
[1031.322s][info][gc             ] GC(742) Concurrent Cycle
[1031.322s][info][gc,marking     ] GC(742) Concurrent Clear Claimed Marks
[1031.322s][info][gc,marking     ] GC(742) Concurrent Clear Claimed Marks 0.066ms
[1031.322s][info][gc,marking     ] GC(742) Concurrent Scan Root Regions
[1031.322s][info][gc,stringdedup ] Concurrent String Deduplication (1031.322s)
[1031.323s][info][gc,stringdedup ] Concurrent String Deduplication 14224.0B->0.0B(14224.0B) avg 51.1% (1031.322s, 1031.323s) 0.514ms
[1031.326s][info][gc,marking     ] GC(742) Concurrent Scan Root Regions 3.939ms
[1031.326s][info][gc,marking     ] GC(742) Concurrent Mark (1031.326s)
[1031.326s][info][gc,marking     ] GC(742) Concurrent Mark From Roots
[1031.326s][info][gc,task        ] GC(742) Using 16 workers of 16 for marking
[1031.483s][info][gc,marking     ] GC(742) Concurrent Mark From Roots 157.144ms
[1031.483s][info][gc,marking     ] GC(742) Concurrent Preclean
[1031.484s][info][gc,marking     ] GC(742) Concurrent Preclean 0.404ms
[1031.484s][info][gc,marking     ] GC(742) Concurrent Mark (1031.326s, 1031.484s) 157.587ms
[1031.485s][info][gc,start       ] GC(742) Pause Remark
[1031.496s][info][gc             ] GC(742) Pause Remark 4625M->3979M(6644M) 10.953ms
[1031.496s][info][gc,cpu         ] GC(742) User=0.22s Sys=0.04s Real=0.01s

In our production environment, we have never used JEP 346, mainly because of our JDK version,
so I cannot tell how well it would work. I agree the "idle" issue is not our main focus for now.

Using SoftMaxHeapSize to guide adaptive IHOP in deciding when to start a
concurrent mark cycle can work well with JEP 346 and the resize logic at remark.
I don't insist on shrinking the heap at every GC.

The capacity in resize_heap_if_necessary will be
Max2(min_desire_capacity_by_MinHeapFreeRatio,  Min2(soft_max_capacity(), max_desire_capacity_by_MaxHeapFreeRatio))
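
As a standalone sketch of that clamping (C++14 or later; the function and
parameter names are hypothetical stand-ins, not the HotSpot identifiers):

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for the formula above: the soft maximum caps the
// upper bound, but the MinHeapFreeRatio-derived minimum always wins.
constexpr std::size_t target_capacity(std::size_t min_desired,
                                      std::size_t soft_max,
                                      std::size_t max_desired) {
  return std::max(min_desired, std::min(soft_max, max_desired));
}
```

For example, target_capacity(6644, 5120, 8192) yields 6644 (values in MB, with
8192 as an illustrative upper bound): a minimum above the soft maximum still
expands the heap, which is the behavior in the log above.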

But both approaches have the problem that the default MinHeapFreeRatio is too large
at remark compared to full GC, because resize_heap_if_necessary
will keep a minimum heap size of 1.667x the used heap size. After remark,
the used size can be large, because it includes not only old regions full of garbage but
also the used young regions.

#############################
void G1CollectedHeap::resize_heap_if_necessary() {
...
const size_t capacity_after_gc = capacity();
const size_t used_after_gc = capacity_after_gc - unused_committed_regions_in_bytes();
#############################

used_after_gc is reasonable for full GC, but at remark it can contain young regions.
Do you think it should be changed like this?
#############################
const size_t used_after_gc = capacity_after_gc - unused_committed_regions_in_bytes() - young_regions_count() * HeapRegion::GrainBytes;
// young_regions_count is 0 after full GC
#############################
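
To illustrate the difference between the two definitions, here is a small
self-contained sketch. The HeapSnapshot struct and the function names are
hypothetical stand-ins for the values read from G1CollectedHeap, and all
sizes are in bytes.

```cpp
#include <cstddef>

// Hypothetical stand-in for the values read from G1CollectedHeap.
struct HeapSnapshot {
  std::size_t capacity_bytes;          // committed heap
  std::size_t unused_committed_bytes;  // committed but free regions
  std::size_t young_region_count;      // 0 after a full GC
  std::size_t region_bytes;            // size of one region
};

// Current definition: everything in a committed, non-free region counts.
constexpr std::size_t used_including_young(const HeapSnapshot& h) {
  return h.capacity_bytes - h.unused_committed_bytes;
}

constexpr std::size_t young_bytes(const HeapSnapshot& h) {
  return h.young_region_count * h.region_bytes;
}

// Proposed definition: leave the young regions out, clamping at zero.
constexpr std::size_t used_excluding_young(const HeapSnapshot& h) {
  return young_bytes(h) > used_including_young(h)
             ? 0
             : used_including_young(h) - young_bytes(h);
}
```

With 512MB committed, 112MB unused and 25 young regions of 4MB each
(illustrative numbers only), the first definition gives 400MB and the second
300MB, so the minimum capacity derived from it shrinks accordingly.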

Besides this, as you suggested, a lower MinHeapFreeRatio would be good.
But arbitrarily setting a fixed number does not seem to be a good way, because
a small number may not meet the pause time goal in later young GCs. I tried
using the following value in resize_heap_if_necessary:

##############################
void G1CollectedHeap::resize_heap_if_necessary() {
  ...
  // We can now safely turn them into size_t's.
  size_t minimum_desired_capacity = (size_t) minimum_desired_capacity_d;
  size_t maximum_desired_capacity = (size_t) maximum_desired_capacity_d;

  if (!collector_state()->in_full_gc()) {
    minimum_desired_capacity = MIN2(minimum_desired_capacity, policy()->minimum_desired_bytes(used_after_gc));
  }
  ....

size_t G1Policy::minimum_desired_bytes(size_t used_bytes) const {
  // Expected young gen size plus the reserve plus the bytes in use.
  return (_ihop_control->unrestrained_young_size() != 0 ?
            _ihop_control->unrestrained_young_size() :
            _young_list_max_length * HeapRegion::GrainBytes)
         + _reserve_regions * HeapRegion::GrainBytes + used_bytes;
}
#############################

I made minimum_desired_capacity small enough based on adaptive IHOP's
_last_unrestrained_young_size. Even without SoftMaxHeapSize, the test
keeps the memory under 3GB. It's a rough example, and I haven't tried to predict
the bytes promoted by the next young GC yet. Do you think that
a proper value of minimum_desired_capacity for the remark resize, plus a
G1AdaptiveIHOPControl::actual_target_threshold that takes soft_max_capacity
into account, is enough?

Thanks,
Liang

------------------------------------------------------------------
From:Thomas Schatzl <thomas.schatzl at oracle.com>
Send Time:2020 Feb. 7 (Fri.) 19:09
To:"MAO, Liang" <maoliang.ml at alibaba-inc.com>; hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject:Re: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics

Hi,

On 06.02.20 13:27, Liang Mao wrote:
> Hi Thomas,
> 
> Thanks for the testing and evaluating!
> 
> I tried your test with specjbb2015 and got slightly different
> results, maybe because of machine capability. The config I used is as below:
> -Xmx8g -Xms2g -Xlog:gc* -XX:GCTimeRatio=4
> -XX:+UseStringDeduplication
> -Dspecjbb.comm.connect.type=HTTP_Jetty
> -Dspecjbb.controller.type=PRESET
> -Dspecjbb.controller.preset.ir=5000
> -Dspecjbb.controller.preset.duration=10800000
> 
> The heap was around 6GB after running for a while (300s). And
> I was able to use SoftMaxHeapSize to let it shrink to 5GB. It
> should be like your scenario of shrinking the heap to 3GB.
> 
> The behavior is as I expected. But I thought you might expect
> a more aggressive result. In my mind, for a constant load,
> the JVM might not need to shrink the heap, since the JVM is supposed to expand
> the heap to the right capacity.

Did you change Min/MaxHeapFreeRatio for your test? It does not look like 
that, as I get roughly the same results if I don't. Given that we agree 
that it is wrong to use Min/MaxHeapFreeRatio during Remark, the 
observation is interesting, but does not seem to help here except 
reinforcing that Min/MaxHeapFreeRatio are not a good thing to use.

Also, I doubt that G1's current heap size selection is optimal. Some 
reasons off the top of my head:

- Min/MaxHeapFreeRatio has been chosen to avoid uncommit/commit 
ping-pong and frequent (un-)commits (i.e. performance), not heap 
compactness.

- adaptive IHOP (or at least the knowledge about expected amount of 
memory used during gc operation) has not been available, hence the very 
conservative values.

- the values have been chosen long before the uncommit at remark [2] has 
been implemented. As author of that change I can authoritatively say that 
fixing the policy had been out of scope for that change ;) however it 
had been needed for JEP 346 Promptly Uncommit unused memory [1] to do 
*something* without disrupting existing behavior too much to avoid 
lengthy re-evaluation of sizing policies.

The logic went something like: what concurrent mark does roughly equals 
full gc, so do the same sizing as during full gc. End.

- there is (rough) consensus that Min/MaxHeapFreeRatio is/has been a bad 
idea, starting from the naming. ZGC and Shenandoah do not use it afaict.

- optimal heap size depends on application phase (e.g. 
startup/operation/idle). Min/MaxHeapFreeRatio default values basically 
prevent shrinking in many cases. Sometimes they even expand the heap 
[3]. Given the high default value of MinHeapFreeRatio, G1 will most 
likely end up using too much memory.

I.e. we apply MinHeapFreeRatio at Remark, which means that the heap size 
will be kept at heap size at Remark + 40%. Given that Remark is where 
heap usage almost peaked anyway, you get a really large commit size. 
Unnecessarily large because (beginning with modestly large heaps in few 
GBs) the actual peak memory usage *at optimal operation* is what 
adaptive IHOP determined. This is typically a lot less than 40% of 
existing usage at Remark. So G1 keeps a lot of memory around for no 
reason. This can be particularly significant in large heaps (say, double 
digit GB) where those 40% can be a lot in absolute terms while G1 only 
ever uses single digit additional GB during the cycle.

In my tests, e.g. the suggested 10% seem sufficient for that particular 
case.

We also agree that uncommit at end of mixed gc is probably better, but 
again, how much do you uncommit? Keeping as much as you expect to 
use would be a good start, maybe a bit more. Not less, because then you 
are going to do an unnecessary commit during that cycle for sure. 
Currently the best idea about what we are going to need in the near 
future is given by the IHOP goal value imho.

So overall, please do not read too much into existing heap sizing policy :)

> The soft limit I imagine is
> to bring the heap size down after a load spike. In Alibaba's
> workload, heap shrinking is controlled by the cluster's unified
> control center, which has the prediction data, and the soft limit
> works more like a *hard* limit in our 8u implementation.
>
> So I think it is acceptable that the heap failed to shrink
> to 2GB in your test case. You can see that
> G1HeapSizingPolicy::can_shrink_heap_size_to is a bit conservative
> and we may be able to make it more aggressive.
> 
> 
> For an almost idle application which doesn't have a GC for a
> rather long time, the shrink cannot happen. In our previous 8u
> patch, we have a timer to trigger GC and the softmx is changed by
> a jcmd which will also trigger a GC (there was no SoftMaxHeapSize option
> in 8u yet). Shall we introduce a timer GC as well?
>

Please give the functionality JEP 346 added a try if you haven't. It 
should achieve what you suggest except that Min/MaxHeapFreeRatio may 
prevent G1 from achieving the compact heap you expect (again).

Min/MaxHeapFreeRatio were changed to be manageable exactly for this 
reason, i.e. if you are idle, and your control center knows that the 
machine is going to be idle, instead of adjusting (in this case) 
SoftMaxHeapSize it may as well set Min/MaxHeapFreeRatio to low values 
and JEP 346 would do the rest. Before JEP 346 you needed to send a 
manual system.gc in addition.

So a simpler solution than the one suggested by you would be to just 
drop usage of Min/MaxHeapFreeRatio and/or incorporate SoftMaxHeapSize in 
the uncommit at remark in your case and let JEP 346 functionality do its job.

If JEP 346 does not work for your use case, we are eager to hear back 
from you about your experience. We do know that it may be a little bit 
too much focused on what "idle" is, but that can be tweaked.

The reason I am suggesting to try JEP 346 is that from my understanding 
the suggested implementation seems to cover only exactly the same case 
as JEP 346, but only with side effects e.g.

- causing commit/uncommit ping-pong if the application is slightly 
active at worst, and no effect at best. While concurrent uncommit tries 
to mitigate this (and it is still very interesting to do), doing less 
commit/uncommit in the first place seems better.

- not covering e.g. the case where an existing Remark finishes after the 
last GC that decreased the heap to SoftMaxHeapSize even in the idle case 
(could be fixed as you mentioned above with a timer, but JEP 346 covers 
this already)

- only limited to reducing heap to SoftMaxHeapSize (why? Fixed as you 
said you were thinking about a more aggressive policy)

In a SoftMaxHeapSize solution in the JVM that I envision, the change 
should cover a wide(r) range of usage scenarios. We need to look a bit 
further than this single use case (which afaict G1 should already handle).

In the case you need a real hard limit I recommend looking at 
implementing that. There has been a proposal to do so some time ago, but 
is inactive at this time [0].

> 
> Honestly, I don't think Min/MaxHeapFreeRatio is a good way to determine
> heap expansion/shrinking in G1, and in our practical 8u experience we never
> have a full GC, so Min/MaxHeapFreeRatio is useless. Here when I reproduce
> your test, the only exception is that the heap expands to 6GB after
> shrinking to SoftMaxHeapSize=5g, because at remark we resize the
> heap.
> BTW, I don't think remark is a good point to resize the heap, since at remark
> the regions full of garbage haven't been reclaimed yet. IMHO we don't even
> need to resize at remark but can just resize after mixed GC according to
> GCTimeRatio.
> 
> Your change to make adaptive IHOP control aware of SoftMaxHeapSize
> seems a similar approach to ZGC's. ZGC is a single-generation GC whose
> scenario is much simpler. Maybe we don't need SoftMaxHeapSize to guide GC
> decisions in G1. Since we already have a policy to determine the shrinking of
> the heap by SoftMaxHeapSize, I'm not sure if we need to make adaptive IHOP
> follow SoftMaxHeapSize... We may encounter the situation that we cannot
> shrink the heap size to SoftMaxHeapSize, but concurrent marking becomes
> frequent after the IHOP policy is affected.

ZGC will be generational at some point. This has been on its roadmap 
since the beginning. Also, there is not much difference as you can see 
from the patch. The difference is currently 1 LOC to set young gen sizes 
in addition to the heap goal.

I also thought about the last point, i.e. when the user sets 
SoftMaxHeapSize too low, then you get continuous marking cycles. My 
answer to the user would be that, well, feel free to shoot yourself 
in the foot, but compared to an OOME with a hard limit, this behavior 
seems much better (but there are certainly situations where a hard limit 
is better for someone so both seem useful).
Ultimately the only thing I can say is that there is no free lunch in the 
throughput/latency/memory triangle, but there may be situations where 
memory is more important than performance too (widening the appeal of 
SoftMaxHeapSize).

In the test I gave, the 2g goal is maybe too low for this case, but the 
3g (instead of 3.8g) looks really attractive (and G1 seems to find an 
"optimal" size of 2.2-2.8g at that point; I think I found the reason for 
the spikes above 3g and looking into testing a fix).

The implementation suggested by me does not affect the idle case at all; 
JEP 346 functionality will clean up and compact the heap nicely (you 
would still need to fix the shrinking amount in the sizing policy, but 
we already agreed that it is not good, and that doing the evaluation 
at remark isn't the best idea either - but both are separate issues).

> 
>> In the log I have, the problem seems to be that we are re-setting the 
>> softmaxheapsize within the space reclamation phase (i.e. mixed gc) and 
>> G1 sizing policies got confused, i.e. it partially keeps on using the 2g 
>> goal for young gen sizing until the *2 problem expands it. That's a bug 
>> and needs to be fixed.
> 
> I don't think it's a problem that after mixed GC
> resize_heap_after_young_collection will evaluate if the heap can be
> shrunk to the new value of SoftMaxHeapSize.

Resizing (to SoftMaxHeapSize) after every gc will shrink and expand all 
the time unnecessarily. I.e. you expand one GC, the next gc it may 
happen that G1 can shrink to SoftMaxHeapSize again (e.g. because eager 
reclaim freed a lot), next gc G1 commits again because of failed pause 
time goal (or just commit during humongous allocation which can be 
immediately reversed because of eager reclaim).

Even with concurrent uncommit, such behavior seems a waste of time. Imho 
with concurrent (un-)commit unnecessary resizing should be avoided if 
possible.

One option is to base that decision on the value that adaptive IHOP 
gives you. It seems a very good start but there may be better 
approaches. Fixed percentages like Min/MaxFreeRatio are too simple as it 
seems :)

Thanks,
   Thomas

[0] https://bugs.openjdk.java.net/browse/JDK-8204088
[1] https://bugs.openjdk.java.net/browse/JDK-8204089
[2] https://bugs.openjdk.java.net/browse/JDK-6490394
[3]
https://bugs.openjdk.java.net/browse/JDK-6490394?focusedCommentId=14283475&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14283475
(only just noticed)