RFC: TLAB allocation and garbage-first policy

Wed Sep 20 12:16:08 UTC 2017

On 09/20/2017 02:09 PM, Christine Flood wrote:
> Changing our heuristics to compact mostly empty regions to solve a one-off sparse region due to a
> race condition doesn't make sense to me.  The common case would be compacting a bunch of live data
> from a bunch of usable regions for no reason.  

The problem is that this case is not really a one-off: the race is rather frequent and its effects
linger indefinitely. Those regions are not really usable either, because regions get retired when
allocations fail in them. The whole thing leads to fragmentation that kills the concurrent GC. Quote:

"You can clearly see the anomaly in Visualizer after 10+ minutes of LRUFragger run with 50 GB LDS on
100 GB heap (...and it drives into Full GC shortly afterwards, because free set got depleted due to
fragmentation!):
  http://cr.openjdk.java.net/~shade/shenandoah/wip-tlab-race/baseline-1.png"

I think our job here is to recover from the loss in a pragmatic way, and selecting regions for
collection based on their live data seems pragmatic. It is also in line with what G1 is doing. It
also benefits humongous allocs, because it defrags the heap more aggressively.
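
To make "collecting based on live data" concrete, here is a rough Java sketch of the idea. It is not
Shenandoah or G1 code: the class, the field names, the threshold and the evacuation budget are all
hypothetical, and the real heuristics may well weigh other inputs. It only shows the selection rule:
take regions whose live data is small, cheapest to evacuate first, regardless of how much garbage
they hold.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of live-data-based collection set selection.
// Names and parameters are illustrative assumptions, not HotSpot code.
class CollectionSetSketch {

    static final class Region {
        final int id;
        final long usedBytes; // bytes allocated in the region, live or dead
        final long liveBytes; // bytes found live by the last marking cycle

        Region(int id, long usedBytes, long liveBytes) {
            this.id = id;
            this.usedBytes = usedBytes;
            this.liveBytes = liveBytes;
        }

        long garbageBytes() {
            return usedBytes - liveBytes;
        }
    }

    // Select non-empty regions with at most liveThresholdBytes of live data,
    // in order of increasing live data, until evacBudgetBytes is spent.
    static List<Region> select(List<Region> regions,
                               long liveThresholdBytes,
                               long evacBudgetBytes) {
        List<Region> candidates = new ArrayList<>();
        for (Region r : regions) {
            if (r.usedBytes > 0 && r.liveBytes <= liveThresholdBytes) {
                candidates.add(r);
            }
        }
        // Cheapest evacuations first: less live data means less copying.
        candidates.sort(Comparator.comparingLong((Region r) -> r.liveBytes));

        List<Region> cset = new ArrayList<>();
        long budget = evacBudgetBytes;
        for (Region r : candidates) {
            if (r.liveBytes > budget) {
                break; // every remaining candidate is at least as expensive
            }
            cset.add(r);
            budget -= r.liveBytes;
        }
        return cset;
    }
}
```

Under this rule a sparse region that was retired after a failed TLAB allocation gets picked up even
though it holds almost no garbage, because its live data is small. A rule keyed on garbageBytes()
alone would skip such a region, which is how it ends up lingering.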

But I would be equally happy to see another way out of this. What would you suggest?

-Aleksey