Discussion: improve humongous objects handling for G1
Thomas Schatzl
thomas.schatzl at oracle.com
Fri Jan 17 10:00:23 UTC 2020
Hi,
On 17.01.20 01:53, Man Cao wrote:
> Hi all,
>
> While migrating our workload from CMS to G1, we found that many production
> applications suffer from humongous allocations.
> The default threshold for humongous objects is often too small for our
> applications with heap sizes between 2GB and 15GB.
> Humongous allocations caused a noticeable increase in the frequency of
> concurrent old-gen collections and mixed collections, as well as in CPU usage.
> We could advise applications to increase G1HeapRegionSize. But some
> applications still suffer with G1HeapRegionSize=32M.
> We could also advise applications to refactor code to break down large
> objects. But that is a high-cost effort that may not always be feasible.
>
> We'd like to work with the OpenJDK community together to improve G1's
> handling of humongous objects.
> Thomas Schatzl mentioned to me a few efforts/ideas on this front in an
> offline chat:
> a. Allocation into tail regions of humongous objects: JDK-8172713,
> JDK-8031381
> b. Commit additional virtual address space for humongous objects.
> c. Improve the region selection heuristics (e.g., first-fit, best-fit) for
> humongous objects.
>
> I didn't find open CRs for b. and c. Could someone give pointers?
> Are there any other ideas/prototypes on this front?
TLDR: we in the Oracle GC team have quite a few ideas that could reduce
this problem significantly, and we are happy to help with the implementation
of any of them.
We would also appreciate a sample application that exhibits the issue.
Long version:
The problems with humongous object allocation in G1 are:
- internal fragmentation: the unused tail end of the last region occupied by
a humongous object is wasted space.
- external fragmentation: sometimes there is not enough contiguous free
space for a humongous object, even though the total free space would suffice.
There are quite a few CRs related to this problem in the bug tracker; I
just now connected them using a "g1-humongous" label [0].
Here is a rundown of our ideas, roughly categorized (note that some of these
CRs predate significant changes in how G1 works, so the ideas may need to be
adapted to the current situation):
- try to get rid of humongous objects asap, i.e. improve eager reclaim
support by allowing eager reclaim of reference arrays (JDK-8048180) or
non-objArrays (JDK-8073288).
I remember the main problem with that was stale remembered set entries
after removal (and SATB marking, but you could simply not do eager reclaim
during marking).
In the applications we had at hand at that time, reference arrays tended
not to be eagerly reclaimable most of the time, and humongous regular
objects were rare.
So the benefit of looking into this might be small.
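For readers unfamiliar with eager reclaim, a very rough conceptual sketch of
what it does; all types and names below are made up for illustration, the
real logic is C++ inside G1 and differs in many details:

  import java.util.*;

  // Conceptual sketch of eager reclaim with made-up types; not HotSpot code.
  public class EagerReclaimSketch {
      static class Region {
          boolean isPrimitiveArray;     // today only non-reference arrays qualify
          boolean rememberedSetEmpty;   // no known references from other regions
          boolean referencedDuringGC;   // set if evacuation finds a reference into it
          boolean reclaimed;
      }

      static void youngGC(List<Region> humongousRegions) {
          List<Region> candidates = new ArrayList<>();
          for (Region r : humongousRegions) {
              // Extending this test to reference arrays (JDK-8048180) or other
              // object kinds (JDK-8073288) is what the CRs above are about.
              if (r.isPrimitiveArray && r.rememberedSetEmpty) {
                  candidates.add(r);
              }
          }
          // ... evacuation of the young generation happens here; any reference
          // discovered into a candidate sets referencedDuringGC ...
          for (Region r : candidates) {
              if (!r.referencedDuringGC) {
                  r.reclaimed = true;   // freed immediately at the end of the pause
              }
          }
      }
  }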
- allow allocation into the tail end of humongous objects (JDK-8172713);
there was once an internal prototype for that, but it was abandoned because
of implementation issues (it was a hack that was never brought to a stable
state, mainly because humongous object management was full of odd quirks
with respect to region management; this has since been fixed. Also, the
example application benefited more from eager reclaim).
While the argument from Aleksey about nepotism in the other thread is
valid (as far as I understand it), it depends on the implementation. The
area at the tail end could be treated as a separate evacuation source,
i.e. evacuated independently of the humongous object (and that would
actually improve the code to clean out HeapRegion ;)).
(This needs more care with single-region humongous objects, but it does not
seem completely problematic; single-region humongous objects may nowadays
not be a big issue to simply move during GC.)
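A very small sketch of the idea, with made-up names, just to show the
bump-pointer allocation into the otherwise wasted tail space (this is not
the abandoned prototype, only an illustration):

  // Illustration only (made-up names): the unused tail of the last region of a
  // humongous object is handed out via bump-pointer allocation and would be
  // treated as its own evacuation source, independent of the humongous object.
  public class TailAllocationSketch {
      private final long regionSize;
      private long tailTop;   // next free byte within the last region

      public TailAllocationSketch(long regionSize, long humongousObjectSize) {
          this.regionSize = regionSize;
          long rest = humongousObjectSize % regionSize;
          this.tailTop = (rest == 0) ? regionSize : rest;  // object end in last region
      }

      // Returns the offset of the new allocation within the last region,
      // or -1 if it does not fit and normal region allocation must be used.
      public long allocate(long size) {
          if (tailTop + size > regionSize) {
              return -1;
          }
          long offset = tailTop;
          tailTop += size;
          return offset;
      }
  }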
- external fragmentation can be approached in many ways:
- it can just be ignored by letting G1 reserve a multiple of MaxHeapSize
while only ever committing MaxHeapSize (JDK-8229373). The main drawback
here is that it impacts the range of heap sizes for which compressed oops
can be used, and it affects 32-bit (particularly Windows) VMs (if you still
care; the feature could also be disabled for those).
Compressed oops typically improve throughput significantly. Of course, as
long as the total size of the reservation stays below the compressed oops
threshold, it does not really matter.
Fwiw, when using the HeterogeneousHeapRegionManager, this is already
attempted (for other reasons).
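To put a rough number on the compressed oops concern: with the default
8-byte object alignment, compressed oops can only address about 32 GB, and
roughly speaking it is the whole reservation, not just the committed heap,
that has to fit. A purely illustrative calculation (the heap size and the
over-reservation factor are made-up examples):

  // Illustrative arithmetic only: a 32-bit compressed reference scaled by the
  // object alignment must be able to reach the end of the whole reservation.
  public class CompressedOopsRange {
      public static void main(String[] args) {
          long objectAlignment = 8;                                // default alignment
          long maxCompressedReach = (1L << 32) * objectAlignment;  // ~32 GB

          long maxHeapSize = 12L * 1024 * 1024 * 1024;             // example: -Xmx12g
          long reservationFactor = 2;                              // hypothetical factor
          long reserved = maxHeapSize * reservationFactor;         // 24 GB reserved

          System.out.println("compressed oops still possible: "
                  + (reserved <= maxCompressedReach));
      }
  }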
- improve the region allocator to decrease the problem (JDK-8229373).
The way G1 currently allocates regions is a first-fit approach, which
interferes a bit with destination region selection for old and survivor
regions, likely creating more fragmentation than necessary. (Basically:
it does not care at all, so go figure ;) ).
Also, during mixed GC one could explicitly prefer to evacuate regions that
break up long runs of free regions, weighing those regions higher
(evacuating them earlier). This needs to be done in conjunction with the
remembered set selection at the end of marking, before creating the
remembered sets.
A long time ago, on a different regional collector, I started looking into
this.
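To illustrate what a different placement policy could look like, here is a
toy sketch (not the actual G1 code) of finding a run of n contiguous free
regions in a free-region bitmap, once first-fit and once best-fit; best-fit
tends to keep long free runs intact for future humongous allocations:

  import java.util.BitSet;

  // Toy illustration, not G1 code: a set bit means the region is free.
  public class RegionFitSketch {
      // First-fit: start of the first free run of at least n regions.
      static int firstFit(BitSet free, int numRegions, int n) {
          int runStart = -1, runLen = 0;
          for (int i = 0; i < numRegions; i++) {
              if (free.get(i)) {
                  if (runLen == 0) runStart = i;
                  if (++runLen >= n) return runStart;
              } else {
                  runLen = 0;
              }
          }
          return -1;   // no fit found
      }

      // Best-fit: start of the smallest free run that still fits, preserving
      // larger runs for later, bigger allocations.
      static int bestFit(BitSet free, int numRegions, int n) {
          int bestStart = -1, bestLen = Integer.MAX_VALUE;
          int i = 0;
          while (i < numRegions) {
              if (!free.get(i)) { i++; continue; }
              int start = i;
              while (i < numRegions && free.get(i)) i++;
              int len = i - start;
              if (len >= n && len < bestLen) {
                  bestLen = len;
                  bestStart = start;
              }
          }
          return bestStart;
      }
  }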
- actively defragment the heap during GC. This could happen either during
full GC (JDK-8191565), like Shenandoah does, or during any young GC,
assuming that G1 first keeps remembered sets for potential candidates
(JDK-8038487).
- never create humongous objects in the first place
- potentially implement one of the various ideas from the literature for
breaking large objects down into smaller ones, J9's arraylets being one of
them (a small application-level sketch in the same spirit follows below).
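Until something like that exists in the VM, applications sometimes do the
chunking themselves, as mentioned above. A minimal sketch of that
workaround; the 1 MB chunk size is an assumption and needs to stay below
half of the actual region size to avoid humongous allocations:

  // Minimal application-level chunking in the spirit of arraylets: a large
  // logical byte array backed by chunks below the humongous threshold.
  public class ChunkedByteArray {
      private static final int CHUNK_SIZE = 1 << 20;  // 1 MB; assumed < regionSize/2
      private final byte[][] chunks;
      private final long length;

      public ChunkedByteArray(long length) {
          this.length = length;
          int numChunks = (int) ((length + CHUNK_SIZE - 1) / CHUNK_SIZE);
          this.chunks = new byte[numChunks][];
          for (int i = 0; i < numChunks; i++) {
              long remaining = length - (long) i * CHUNK_SIZE;
              chunks[i] = new byte[(int) Math.min(CHUNK_SIZE, remaining)];
          }
      }

      public byte get(long index) {
          return chunks[(int) (index / CHUNK_SIZE)][(int) (index % CHUNK_SIZE)];
      }

      public void set(long index, byte value) {
          chunks[(int) (index / CHUNK_SIZE)][(int) (index % CHUNK_SIZE)] = value;
      }

      public long length() { return length; }
  }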
There are other solutions, like completely separate allocation of
humongous objects as ZGC does, but that typically has the same problem
as reserving more space (i.e. the compressed oops range, although ZGC does
not care about that at this time).
I think it would help potential contributors if there were some
application available on which the impact of changes could be demonstrated
in some way. In the past, whenever somebody had this problem, they were
happy to just increase the heap region size - which is great for them, but
does not fix the underlying problem :)
We would in any case help anyone taking a stab at one of these ideas (or
others).
Thanks,
Thomas
[0]
https://bugs.openjdk.java.net/browse/JDK-8237466?jql=labels%20%3D%20g1-humongous