Discussion: improve humongous objects handling for G1
Man Cao
manc at google.com
Wed Jan 22 05:05:34 UTC 2020
Hi all,
Thanks to Thomas and Liang for the great discussion!
Regarding GC logs, a histogram of humongous allocations, and a more
concrete example, I think we are in the same boat here. So far we have only
advised users to increase G1HeapRegionSize, which works around many cases
of the problem. We have not yet closely studied the patterns of the
problematic humongous allocations. I will do such a study and follow up
with statistics and GC logs once I get my hands on them.
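For context, here is a minimal sketch of the classification such a study
would be based on. It only assumes G1's usual rule that an object of at
least half a region is humongous; the object and region sizes are
illustrative, not measured from our workloads:

  // Minimal sketch (not HotSpot code): G1 treats an object of at least half
  // a region as humongous, so the same allocation stops being humongous once
  // G1HeapRegionSize is large enough. Sizes below are illustrative only.
  #include <cstdio>
  #include <cstddef>

  static bool is_humongous(size_t object_bytes, size_t region_bytes) {
    return object_bytes >= region_bytes / 2;
  }

  int main() {
    const size_t MB = 1024 * 1024;
    const size_t object_sizes[] = { 1 * MB, 3 * MB, 10 * MB };  // hypothetical allocations
    const size_t region_sizes[] = { 2 * MB, 8 * MB, 32 * MB };  // -XX:G1HeapRegionSize values
    for (size_t region : region_sizes) {
      for (size_t object : object_sizes) {
        std::printf("region=%zuMB object=%zuMB humongous=%d\n",
                    region / MB, object / MB, is_humongous(object, region));
      }
    }
    return 0;
  }

This also shows why bumping G1HeapRegionSize works as a workaround: the
same 3 MB allocation is humongous with 2 MB regions but not with 8 MB ones.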
>> maybe the threshold for the amount of
>> remembered set entries to keep these humongous objects as eligible for
>> eager reclaim is too low, and increasing that one would just make it
>> work.
> JDK-8237500
Thanks for this. I will definitely try tuning this if the humongous objects
are non-objArrays.
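To make sure we are talking about the same heuristic, this is my rough
understanding of the candidacy rule as a sketch (simplified, not the actual
HotSpot code; the type and field names are just placeholders):

  // Rough sketch of eager-reclaim candidacy as I understand it: only
  // primitive (type) arrays qualify, and only if few enough remembered-set
  // entries point into the region.
  #include <cstddef>

  struct HumongousRegion {
    bool   is_type_array;     // primitive array such as byte[]/int[]
    size_t rem_set_entries;   // approximate number of incoming references
  };

  static bool is_eager_reclaim_candidate(const HumongousRegion& r,
                                         size_t rem_set_threshold) {
    // objArrays are excluded, which is exactly the case I cannot tune around.
    return r.is_type_array && r.rem_set_entries <= rem_set_threshold;
  }

If that is roughly right, raising the threshold only relaxes the second
condition, so it would not help with humongous objArrays.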
> You could double-map like
> https://blog.openj9.org/2019/05/01/double-map-arraylets/ does for
> native access.
> There is some remark in some tech paper about arraylets
> (https://www.ibm.com/developerworks/websphere/techjournal/1108_sciampacone/1108_sciampacone.html)
> that indicates that the balanced collector seems to not move the
> arrayoids too.
Thanks for digging into the details of arraylets. I haven't done much
research on them myself.
> Btw the same text also indicates that copying seems like a non-starter
> anyway, as, quoting from the text "One use case, SPECjbb2015 benchmark
> is not being able to finish RT curve...".
> Not sure what prevents arraylets in particular from being O(1); a
> particular access is slower though due to the additional indirection
> with the spine.
> ...
> Which means that there is significant optimization work needed to make
> array access "as fast" as before in jitted code
These two issues:
(1) copying for JNI Critical, and
(2) slowing down typical jitted code for array accesses
do sound like performance deal-breakers, particularly if they would only be
required for G1+arraylets but not for other collectors. There are use cases
where JNI Critical is applied to arrays solely for performance, and we'd
rather not slow them down.
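To be concrete about (1), the pattern I have in mind is something like the
following (a hypothetical native method, not from any real library), where
the critical API is used purely to avoid copying a large primitive array:

  // Hypothetical native method using the JNI critical API purely to avoid
  // copying a large byte[]. With arraylets, either this copies after all or
  // the array has to be made contiguous in some other way.
  #include <jni.h>

  extern "C" JNIEXPORT jlong JNICALL
  Java_com_example_Checksums_sum(JNIEnv* env, jclass, jbyteArray data) {
    jsize len = env->GetArrayLength(data);
    jboolean is_copy = JNI_FALSE;
    void* raw = env->GetPrimitiveArrayCritical(data, &is_copy);
    if (raw == nullptr) {
      return 0;  // allocation failure; a real implementation would throw
    }
    const jbyte* bytes = static_cast<const jbyte*>(raw);
    jlong sum = 0;
    for (jsize i = 0; i < len; i++) {
      sum += bytes[i];
    }
    // JNI_ABORT: the array was not modified, nothing needs to be copied back.
    env->ReleasePrimitiveArrayCritical(data, raw, JNI_ABORT);
    return sum;
  }

Code like this relies on is_copy staying false for large arrays; if
arraylets force a copy here, the whole point of using the critical API is
lost.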
> It could help with all problems but cases where you allocate a very
> large number of humongous objects and you can't keep the humongous object
> tails filled. This option still keeps the invariant that humongous
> objects need to be allocated at a region boundary.
>
> Most of the other ideas you propose below also (seem to) retain this
> property.
Agreed. It seems that JDK-8172713 would help most ideas anyway.
> Maybe it is sufficient as "most" applications only use single or low
> double-digit GB heaps at the moment where the entire reservation still
> fits into the 32gb barrier.
I also had the same thought. Most of our important workloads have heap
sizes less than 20GB.
If the "reserve multiple MaxHeapSize" approach could work with compressed
oops for heaps smaller than 16GB, then it would be quite acceptable.
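The back-of-the-envelope arithmetic I have in mind (assuming the idea is to
reserve k times MaxHeapSize of contiguous address space, with a hypothetical
k = 2, and that zero-based compressed oops with the default 3-bit shift
cover 32GB):

  // Back-of-the-envelope check: with a k * MaxHeapSize reservation,
  // compressed oops keep working as long as the whole reservation still
  // fits below the 32GB addressable with the default 3-bit shift.
  #include <cstdio>
  #include <cstdint>

  int main() {
    const uint64_t GB = 1024ull * 1024 * 1024;
    const uint64_t compressed_oops_limit = 32 * GB;
    const uint64_t reservation_factor = 2;              // hypothetical k
    const uint64_t heap_sizes_gb[] = { 8, 15, 20 };

    for (uint64_t heap_gb : heap_sizes_gb) {
      uint64_t reserved = reservation_factor * heap_gb * GB;
      std::printf("MaxHeapSize=%lluGB reserved=%lluGB compressed_oops_ok=%d\n",
                  (unsigned long long) heap_gb,
                  (unsigned long long) (reserved / GB),
                  reserved <= compressed_oops_limit);
    }
    return 0;
  }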
That said, I now agree that I should first study the patterns of
humongous allocations and look into improving eager reclamation.
For the approach from Liang/Alibaba, I'm optimistic that it could solve
many problems when migrating from ParNew+CMS to G1. Since it handles
humongous allocations in much the same way as ParNew+CMS does, and G1 has
the additional advantage of not copying humongous objects, pause duration
and frequency would probably not degrade compared to ParNew+CMS.
I also agree with Thomas that it may increase pause duration compared to
current G1 due to extra scanning, and allocation spikes might affect
other aspects of G1. I noticed that the description of JDK-8027959 mentions
"a) logically keep LOBs in young gen, doing in-place aging", which sounds
like the GC team has explored this approach for eager reclamation before? It
might be the best of both worlds if we could make eager reclamation of
humongous objArrays work without putting them in young gen, and further
improve eager reclamation in general.
-Man