[G1GC] Evacuation failures with bursts of humongous object allocations

Thomas Schatzl thomas.schatzl at oracle.com
Mon Nov 9 11:03:37 UTC 2020


Hi Charlie,

   [resending, trying to fix some nomenclature to avoid too many 
"survivor"s, disambiguating better between objects that survive an 
evacuation, survival rates and survivor regions and space, and adding 
some comments about what age means in G1's survival rate tracking. Just 
ask if there are questions, I guess :)]

On 05.11.20 23:49, Charlie Gracie wrote:
> Hi,
> 
> We have been investigating an issue with G1GC and bursts of short lived
> humongous object allocations. Normally during the application, the humongous
> object allocation rate is about 1 humongous region between each GC. Occasionally,
> the humongous allocation rate climbs to 600 or more regions between 2 GC cycles
> and consumes 100% of the free regions. The subsequent GC has no free regions for
> to-space and not even a single object can be evacuated. Since to-space is exhausted
> immediately, the GC is extremely long due to dealing with evacuation failures. The
> workload is running on JDK 11 but we have been able to reproduce it on JDK 16 builds.
> About 1/40 GCs are impacted by these bursts of humongous allocations.
> 
> [3] is an example of a GC running on JDK 11 when the burst of humongous
> allocations happens. [4] is an example of the rest of the GCs.
> 
> It seems like -XX:G1ReservePercent is the recommended way to tune for humongous
> object allocations. Is this correct? 

Yes.

> We could tune around this behaviour by increasing
> the G1ReservePercent and heap size but since this happens rarely the JVM will be over
> provisioned most of the time. This is an ok work-around but I am hoping we can make
> G1GC more resilient to bursts of humongous object allocations.

You are probably missing "... that are short-lived" in your 
description. Otherwise the suggested workaround does not... work.
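
For reference, the reserve would be raised with something like

   -XX:G1ReservePercent=20

(the default is 10), at the cost of keeping roughly that percentage of 
the heap free as a to-space buffer most of the time.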

> 
> What we are experiencing seems related to JDK-8248783 [1] and I have been
> prototyping changes that may resolve one of their issues as well. My approach is to
> force a GC during the slow allocation path if the number of free regions is about to
> drop below a reasonable threshold to complete the next GC cycle. The check is inserted
> into the slow path for regular objects

Why regular objects too? Maybe to completely obsolete G1ReservePercent 
for this purpose?

> and humongous objects. In my current prototype [2]
> the G1 slow allocation path will only allow a free region to be consumed:
> 
> if (((ERC / SR) + ((SRC * TSR) / 100)) <= (FRC - CR))

This looks like: "do a GC if the space predicted to be needed for 
evacuating the currently allocated regions would no longer fit into the 
free regions remaining after the allocation".
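
In (made-up) code - hypothetical names, not the actual prototype - I 
read the check roughly like this:

  // Sketch only: ERC/SR estimates the to-space needed for eden
  // survivors, SRC*TSR/100 the space kept for survivor regions.
  static bool can_satisfy_allocation_without_gc(unsigned eden_regions,      // ERC
                                                unsigned survivor_regions,  // SRC
                                                unsigned free_regions,      // FRC
                                                unsigned requested_regions, // CR
                                                unsigned survivor_ratio,    // -XX:SurvivorRatio
                                                unsigned target_survivor_ratio) { // -XX:TargetSurvivorRatio
    unsigned predicted_to_space = eden_regions / survivor_ratio +
                                  survivor_regions * target_survivor_ratio / 100;
    if (requested_regions > free_regions) {
      return false;  // does not fit at all, a GC is needed anyway
    }
    // Hand out the region(s) only if the predicted to-space still fits
    // into what remains free after this allocation; otherwise force a
    // GC first.
    return predicted_to_space <= free_regions - requested_regions;
  }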

A few notes/problems on that suggested formula (which seems to err on 
the conservative side):

- for the first term, the eden regions, G1 already provides a more 
accurate(?) prediction of surviving bytes after evacuation using 
survival rate predictors (see G1SurvRateGroup); one use is in 
G1Policy::predict_eden_copy_time_ms(), another in 
G1Policy::predict_bytes_to_copy(). There is a rough sketch of that idea 
after these notes.

Note that this value is not adjusted for any kind of allocation 
fragmentation. E.g. due to the use of PLABs and regions, G1 might 
actually use more space than that.

I think for eden regions that prediction is fairly okay and better than 
some random value like SR.

- There is currently no good prediction of survival rate for survivor 
regions; the use of TSR for survivor space seems kind of random.

TSR is the share of the surviving objects that the collector intends to 
keep in survivor space. The idea is to limit the amount of objects kept 
there to limit copy costs in further evacuations. It does not have much 
to do with actual survival rates: objects live and die regardless of 
that value - they may just take space in old regions instead of 
survivor regions.

If you have ever looked at the gc+heap=debug output containing the 
expected number of regions in survivor space: from my experience this 
value is typically way too high. That's probably why it works well with 
your application.

Note that G1 does track a survival rate for survivor regions too, but 
in my experience it is not good: survival rate tracking assumes that 
objects within a region are of approximately the same age (the survival 
rate prediction mechanism assigns the same "age" to all objects 
allocated in the same region - this is not the object age generally 
kept in the object headers!), which objects in survivor regions tend to 
simply not be. The objects in there are typically jumbled together from 
many different ages, which completely violates that assumption.

(Assuming that age is a good indicator for death rate).

There were some attempts by me in the past to improve that, but they 
were not completed, mostly due to the testing effort: more tracking 
tends to add more code to the inner copy-loop, which typically requires 
a lot of scrutiny.

- potentially surviving objects from old region evacuations are 
completely missing from the formula. I presume in your case these were 
not interesting because (likely) this application mostly does 
short-lived humongous allocations and otherwise keeps a stable old gen?
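
To make the first and last points a bit more concrete, a very rough 
sketch (made-up names and types, not the actual 
G1Policy/G1SurvRateGroup code) of what a prediction based on tracked 
survival rates rather than SR/TSR could look like:

  #include <cstddef>
  #include <vector>

  // Very rough sketch, not actual G1 code: predict surviving bytes per
  // region from a tracked survival rate instead of fixed ratios. A real
  // version would also need some slack for PLAB/region-level allocation
  // fragmentation, which the survival rate prediction does not cover.
  struct RegionInfo {
    std::size_t used_bytes;
    double predicted_surv_rate; // from survival rate tracking ("age" = allocation region)
  };

  static std::size_t predict_surviving_bytes(const std::vector<RegionInfo>& regions) {
    std::size_t bytes = 0;
    for (const RegionInfo& r : regions) {
      bytes += static_cast<std::size_t>(r.used_bytes * r.predicted_surv_rate);
    }
    return bytes;
  }

  // Unlike the formula above, this also accounts for the old regions in
  // the candidate collection set.
  static std::size_t predict_to_space_regions(const std::vector<RegionInfo>& eden,
                                              const std::vector<RegionInfo>& survivors,
                                              const std::vector<RegionInfo>& old_candidates,
                                              std::size_t region_size) {
    std::size_t bytes = predict_surviving_bytes(eden) +
                        predict_surviving_bytes(survivors) +
                        predict_surviving_bytes(old_candidates);
    return (bytes + region_size - 1) / region_size; // round up to whole regions
  }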

> 
> ERC - eden region count
> SR - SurvivorRatio
> SRC - survivor region count
> TSR - TargetSurvivorRatio
> FRC - free region count
> CR - number of free regions required for allocation
> 
> Using this algorithm significantly improves G1GC's handling of bursts of humongous
> object allocations. I have not measured any degradations to "normal" workloads we
> run but that may not be a representative set. In theory, this should only impact workloads
> that consume more humongous regions than G1ReservePercent between GC cycles.

So basically the intent is to replace G1ReservePercent for this purpose, 
making it automatic, which is not a bad idea at all.

One problem I can see in this situation: what if that GC does not free 
the humongous object memory? Is the resulting behavior better than 
before, and in which situations is it (or isn't it)?

And, is there anything that can be done to speed up evacuation failure? 
:) Answering my rhetorical question: very likely, see the issues about 
evacuation failure collected lately under the gc-g1-pinned-regions 
label [1], in particular JDK-8254739 [2].

So it would be interesting to see the time distribution of the 
evacuation failure phases (gc+phases=trace) and the occupancy 
distribution of these failures.
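
E.g. something like

   -Xlog:gc*,gc+phases=trace:file=gc.log

(adjust the output target as needed) should show the evacuation failure 
sub-phase timings.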

> 
> I am curious about what other people think of the behaviour we are seeing and the
> solution I am experimenting with. Any feedback would be greatly appreciated.

Hth a bit,
   Thomas

> 
> Thanks,
> Charlie
> 
> [1] - https://bugs.openjdk.java.net/browse/JDK-8248783
> [2] - https://github.com/charliegracie/jdk/tree/humongous_regions
> 
> [3] - Example of a bad GC during the burst humongous object allocations
> GC(468) Pause Young (Prepare Mixed) (G1 Humongous Allocation)
> GC(468) Age table with threshold 15 (max threshold 15)
> GC(468) To-space exhausted
> GC(468)   Pre Evacuate Collection Set: 0.2ms
> GC(468)     Prepare TLABs: 0.2ms
> GC(468)     Choose Collection Set: 0.0ms
> GC(468)     Humongous Register: 0.2ms
> GC(468)   Evacuate Collection Set: 30.1ms
> GC(468)   Post Evacuate Collection Set: 253.3ms
> GC(468)     Evacuation Failure: 249.1ms
> GC(468) Eden regions: 404->0(64)
> GC(468) Survivor regions: 8->0(69)
> GC(468) Old regions: 182->594
> GC(468) Humongous regions: 686->2
> GC(468) Pause Young (Prepare Mixed) (G1 Humongous Allocation) 10225M->4755M(10240M) 285.057ms
> 
> [4] Regular GC from the same log for comparison.
> GC(465) Pause Young (Normal) (G1 Evacuation Pause)
> GC(465) Age table with threshold 15 (max threshold 15)
> GC(465) - age   1:   21586848 bytes,   21586848 total
> GC(465) - age   2:    7962712 bytes,   29549560 total
> GC(465) - age   3:    1033216 bytes,   30582776 total
> GC(465) - age   4:    4710920 bytes,   35293696 total
> GC(465) - age   5:     716064 bytes,   36009760 total
> GC(465) - age   6:    2387064 bytes,   38396824 total
> GC(465) - age   7:    2331208 bytes,   40728032 total
> GC(465) - age   8:     321680 bytes,   41049712 total
> GC(465) - age   9:    4974056 bytes,   46023768 total
> GC(465) - age  10:     106488 bytes,   46130256 total
> GC(465)   Pre Evacuate Collection Set: 0.0ms
> GC(465)   Evacuate Collection Set: 16.0ms
> GC(465)   Post Evacuate Collection Set: 1.2ms
> GC(465)   Other: 1.3ms
> GC(465) Eden regions: 494->0(537)
> GC(465) Survivor regions: 5->7(63)
> GC(465) Old regions: 182->182
> GC(465) Humongous regions: 1->1
> GC(465) Pause Young (Normal) (G1 Evacuation Pause) 5454M->1512M(10240M) 18.704ms

> 

[1] 
https://bugs.openjdk.java.net/issues/?jql=labels%20%3D%20gc-g1-pinned-regions
[2] https://bugs.openjdk.java.net/browse/JDK-8254739


