Releasing unused memory
Per Liden
per.liden at oracle.com
Thu Oct 18 09:56:10 UTC 2018
On 10/18/2018 10:30 AM, Aleksey Shipilev wrote:
> On 10/18/2018 09:59 AM, Per Liden wrote:
>> For example, when this is enabled, we probably also want this to affect ZGC's heap growth
>> heuristics, so that we not only return memory we don't need but are also more
>> conservative about growing the heap in the first place. Also, the current patch returns memory
>> very aggressively. To reduce the impact on performance, we might want to only release memory that
>> has gone unused for some amount of time.
> I assume "being more conservative about growing the heap in the first place" basically means
> triggering the GC more often, and not actually stalling allocations when the current ephemeral
> limit is reached (that would be a questionable choice, IMO)?
With "being more conservative about growing the heap in first place" I
mean the case where we were given say 100G heap, but to maintain some
reasonable GC frequency we only seem to need, say, 10G. In other words,
we were given a much larger heap headroom than what we seem to need at
the moment. In that case, we might want to exploit that, but not letting
the heap grow to 100G. Of course, the hard part to answer here is
"what's a reasonable GC frequency?". And, of course, if it turns out
that the workload all of a sudden needs all of that that headroom we
were give, than it's of course free to use it.
So, this essentially just means that the heuristics for calculating
when to do a GC would use a "target max capacity" instead of the real
"max capacity" (Xmx) as input.
>
> Sharing some adoption experience below:
>
> Shenandoah used to have separate heuristics code that decided when to grow the heap vs.
> triggering the GC, but we eventually decided that was one feedback loop too many, which ran into
> peculiar corner cases on real workloads. So, as soon as we got timed uncommit integrated, we
> started to piggyback on that for "heap sizing": initialize all the heap (metadata) at once, but
> commit only the regions we currently need, and uncommit unused regions after some time.
>
> This ties heap sizing to actual use automatically, does not run into feedback potholes, and
> exposes sensible tuning options (i.e. a timed uncommit delay that users can relate to their
> deployments). On the downside, this might let the heap grow up to -Xmx until the driver code
> reacts with a GC request.
>
> We thought that would be a concern, but it seems adopters come with one of these three
> desires: a) fine with whatever active footprint, as long as throughput holds; b) fine with
> whatever throughput hit, as long as footprint is minimal; c) care very much about idle
> footprint. It seems a waste of time to try to come up with a heuristic that achieves medium
> throughput with medium footprint. Instead, it seems more prudent to handle idle footprint
> issues first, and then provide canned profiles for either the throughput or the footprint
> extreme, with some tunable options that can balance between them, if users are happy to tune
> for their specific workloads.
I agree, those three use cases are the most relevant ones. The
"conservative heap growth" discussion above would apply to "b". "a" is
what we do today, and "c" is essentially ZGC's proactive GCs (except
that it doesn't actually uncommit yet).
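For reference, a minimal sketch of what such timed uncommit could look
like (hypothetical code, not Shenandoah's or ZGC's actual
implementation; all names are made up):

    import java.util.List;

    // Sketch only: regions record when they last became empty, and a
    // periodic task uncommits those that have stayed empty longer than
    // a user-tunable delay.
    class Region {
        boolean committed;
        boolean empty;
        long emptySinceMillis;   // when the region last became empty
    }

    class TimedUncommit {
        private final long uncommitDelayMillis;  // user-tunable delay

        TimedUncommit(long uncommitDelayMillis) {
            this.uncommitDelayMillis = uncommitDelayMillis;
        }

        void run(List<Region> regions) {
            long now = System.currentTimeMillis();
            for (Region r : regions) {
                if (r.committed && r.empty
                        && now - r.emptySinceMillis >= uncommitDelayMillis) {
                    uncommit(r);
                }
            }
        }

        private void uncommit(Region r) {
            // In the VM this would return the pages to the OS (e.g.
            // via madvise/munmap); here it's just a placeholder.
            r.committed = false;
        }
    }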
cheers,
Per