RFR (M): 8212657: Implementation of JDK-8204089 Timely Reduce Unused Committed Memory

Thomas Schatzl thomas.schatzl at oracle.com
Thu Nov 15 13:32:52 UTC 2018


Hi,

On Wed, 2018-11-14 at 19:19 -0800, Man Cao wrote:
> Google has two similar features to reduce an idling process's memory
> footprint. Both features were initially implemented in CMS and have
> been reimplemented for G1.
> The features mainly target the Xms=Xmx case when a program is idling,
> as most of our desktop and production jobs set Xms=Xmx.
> 
> The first feature, which is responsible for uncommitting memory, is
> similar to Shenandoah's approach.
> It calls madvise(MADV_DONTNEED) on a subset of free heap pages that
> are committed by the collector.

By default, -Xms == -Xmx should just mean that; MADV_DONTNEED may
introduce unnecessary latency that would certainly be unexpected.

For the same reason I closed JDK-8196820 as WNF recently.

It could certainly be useful to use madvise to "uncommit" memory within
the usual heap sizing (with -Xms != -Xmx, see JDK-8210709). If the heap
sizing heuristic is not good enough, it is probably better to fix that
than to add a workaround on top of it (or at least try).

> For G1, we added a new type to HeapRegionType::Tag for uncommitted
> region. The new type is basically a minor type of Free region.
> The feature uses a heuristic to uncommit some regions
> in HeapRegionManager::_free_list after an old-gen or full collection.

These are just my initial thoughts after reading this: it seems
unnecessary to have an additional tag for that, as the GC algorithm
itself does not seem to need to know about different kinds of free
regions (and neither does the application).

I.e. unless there is some compelling reason, I would hide this
"madvised region" information in the HeapRegionManager class, which
already knows about different kinds of free regions (e.g. uncommitted
with no HeapRegion information available, uncommitted but with
HeapRegion information available, and committed with HeapRegion
information available).

_If_ keeping that information somewhere is actually needed at all
(maybe for allocation from the free list?). Memory that has been
madvise(MADV_DONTNEED)'ed seems to behave the same as regular free
memory from a GC point of view: if the mutator accesses it, it
automatically becomes available for use again, without any interaction
with a lower layer. The details are handled by the OS (i.e. whether it
does nothing or the memory is "committed" again).

Maybe my model of the behavior of madvise(DONTNEED/*) is too simple.
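
For reference, the behavior I am assuming, as a minimal Linux-only
sketch on an anonymous mapping (no HotSpot code involved):

  #include <sys/mman.h>
  #include <cassert>
  #include <cstring>

  int main() {
    const size_t len = 16 * 4096;
    void* mem = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(mem != MAP_FAILED);
    char* p = static_cast<char*>(mem);

    std::memset(p, 42, len);         // pages are now resident (RSS goes up)
    madvise(p, len, MADV_DONTNEED);  // "uncommit": RSS drops, mapping stays valid

    assert(p[0] == 0);               // next touch just faults in a fresh zero page
    munmap(p, len);
    return 0;
  }

I.e. there is no explicit "recommit" step needed before the mutator can
use such a region again.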

> The second feature controls when to trigger additional old-gen
> collections in order to trigger the first feature.
> We use a notion called "mutator utilization", which is the percentage
> of time the mutator has run since the last old-gen collection.
> For G1, it is:
> mutator_utilization = 1 - (last_concurrent_mark_duration +
> total_mixed_gc_duration_since_last_concurrent_mark +
> total_full_gc_duration_since_last_concurrent_mark) /
> time_since_start_of_last_concurrent_mark
> If mutator utilization exceeds a certain threshold (e.g., >98%), then
> a concurrent cycle is initiated.
> If mutator utilization is too low (e.g., <40%), it can be used to
> prevent concurrent collection from happening, reducing GC thrashing.

That is a very non-standard way of defining mutator utilization, and
some of the terms are not clearly defined :)
From what I understand, the formula in the end just reduces to periodic
old gen collections regardless of other activity (e.g. it apparently
does not take minor gcs into account).
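
As far as I can tell, the quoted formula amounts to something like the
following sketch (all names invented, this is not code from your
patch):

  struct OldGenCycleTimes {
    double last_concurrent_mark_sec;
    double mixed_gcs_since_last_mark_sec;
    double full_gcs_since_last_mark_sec;
    double time_since_start_of_last_mark_sec;
  };

  // Young-only pauses do not appear anywhere, so the longer the time
  // since the last marking cycle, the closer this gets to 1, regardless
  // of young gc activity.
  static double mutator_utilization(const OldGenCycleTimes& t) {
    double old_gen_gc_time = t.last_concurrent_mark_sec
                           + t.mixed_gcs_since_last_mark_sec
                           + t.full_gcs_since_last_mark_sec;
    return 1.0 - old_gen_gc_time / t.time_since_start_of_last_mark_sec;
  }

  static bool idle_enough_to_start_cycle(const OldGenCycleTimes& t) {
    return mutator_utilization(t) > 0.98;  // e.g. > 98%: start a concurrent cycle
  }

  static bool busy_enough_to_veto_cycle(const OldGenCycleTimes& t) {
    return mutator_utilization(t) < 0.40;  // e.g. < 40%: suppress further cycles
  }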

(Also, just pointing out that G1 already tracks things like recent
mutator utilization e.g. for heap sizing.)

We considered mutator utilization as a driver for the periodic gcs (ZGC
does it that way); however, it relies on some assumptions about how
much impact the triggered GCs have on throughput and particularly on
(max) latency. Unlike G1, ZGC does not have a problem with max
latencies, which is why we wanted to move the latency spikes (i.e. the
initial mark) into "idle" phases. Taking only mutator utilization into
account, particularly as far as I understand your formula, may cause
these GCs to happen in inconvenient situations.

One part of the thinking behind that is that, if the application is
active, it will typically run into a concurrent cycle sooner or later
anyway.

Also, a simple timeout, as implemented now, seemed easiest to
understand and configure while catching similar use cases.
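
Roughly, the timeout approach boils down to something like this (names
and the interval value are illustrative only, not the actual patch):

  #include <cstdint>

  static const uint64_t kPeriodicGCIntervalMs = 5 * 60 * 1000;  // example value

  static bool should_start_periodic_gc(uint64_t now_ms,
                                       uint64_t last_gc_activity_end_ms,
                                       bool other_cycle_already_pending) {
    if (other_cycle_already_pending) {
      return false;  // regular activity will start a concurrent cycle anyway
    }
    // No gc activity for the configured interval: consider the
    // application idle and start a concurrent cycle, at whose end the
    // heap can be shrunk/uncommitted.
    return (now_ms - last_gc_activity_end_ms) >= kPeriodicGCIntervalMs;
  }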

Not saying it is best, but it is apparently a good start for a
discussion. :)

> Compared to setting Xms != Xmx, the main benefits of using these two
> features are:
> - it avoids the cost of shrinking/expanding the heap, while saving
> memory when the process idles. The cost of re-committing pages
> previously marked as MADV_DONTNEED is lower than triggering more
> collections to resize the heap;
> - it saves users the configuration effort of figuring out different
> Xms and Xmx values.

To me it seems that the actual problem for the second part is that the
default values for -Xms are not good enough. It may be more useful to
improve them instead of opening the "-Xms==-Xmx does not mean what it
intuitively means" can of worms :)

> These two features do not reduce memory as much as setting a small
> Xms, because they do not eagerly compact the heap.
> Basically, they are better suited for workloads that prioritize
> throughput and performance over memory usage, but would still like
> memory savings while idling.

See above.

> It would be nice if we could merge these ideas/features into this
> JEP. Or maybe they can be in another JEP that targets the Xms=Xmx
> case?

The madvise() change is a different issue altogether and does not need
a JEP.
There will be a lot of resistance to making -Xmx == -Xms behave as you
suggest (in the default case...), and it seems that the problem in your
case is improper heuristics for -Xms in some (many?) cases, which seems
to be acknowledged above.

I am still not sure what the problem is with -Xms != -Xmx, or what
-Xmx == -Xms with subsequent uncommit solves. I find it hard to believe
that setting -Xms to -Xmx is easiest for an end user - I would consider
not setting -Xms at all easiest...

Maybe doing so improves startup time, where it is often advantageous to
have a large eden to get the long-lived working set into the old gen
quickly? Maybe some "startup boost" for heap sizing/some options would
help much better here?

> The idea of mutator utilization might be interesting by itself.
> Wessam (CCed) worked on this feature.
> It is orthogonal to G1UseAdaptiveIHOP to control when to start a
> concurrent cycle. We also found it is useful to reduce GC cost in
> production workloads by setting a higher minimum bound to prevent
> concurrent cycles.

I did not get that paragraph; you need to explain this in more detail
:)

> 
> (Sorry that I should have brought up these two features in the
> earlier discussion threads for the JEP)

Don't worry.

Thanks,
  Thomas
