Discussion: improve humongous objects handling for G1
Thomas Schatzl
thomas.schatzl at oracle.com
Mon Jan 20 11:11:18 UTC 2020
Hi Liang,
On 19.01.20 08:08, Liang Mao wrote:
> Hi Guys,
>
> We at Alibaba have experienced the same problem as Man introduced.
> Some applications got frequent concurrent mark cycles and high
> cpu usage and even some to-space exhausted failures because of
> large amount of humongous object allocation even with
> G1HeapRegionSize=32m. But those applications worked fine
> with ParNew/CMS. We are working on some enhancements for better
Can you provide logs? (with gc+heap=debug,gc+humongous=debug)
> reclamation of humongous objects. Our first intention is to reduce
> the frequent concurrent cycles and possible to-space exhausted so
> the heap utilization or arraylets are not taken into consideration yet.
>
> Our solution is more like a ParNew/CMS flow and will treat a
> humongous object as young or old.
> 1. Humongous object allocation in mutator will be considered into
> eden size and won't directly trigger concurrent mark cycle. That
> will avoid the possible to-space exhausted while concurrent mark
> is working and humongous allocations are "eating" the free regions.
(I am trying to imagine situations here where this would be a problem
since I do not have a log)
That helps if G1 is already trying to do a marking cycle while space is
tight, i.e. allocations are already eating into the reserve that has
explicitly been set aside for this case (G1ReservePercent - did you try
increasing that as a workaround?). Otherwise it makes young collections
much more frequent than necessary.
That is particularly true if these humongous regions are
eager-reclaimable: in those cases the humongous allocations would be
"free", while with that policy they would cause a young GC.
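As a standalone sketch of the accounting proposed in point 1 (all names here are invented for illustration; this is not actual G1 code), charging humongous allocations against an eden budget might look like:

```cpp
#include <cassert>
#include <cstddef>

// Toy model: humongous allocations are charged against the eden budget,
// so exhausting the budget triggers a young GC instead of immediately
// starting a concurrent mark cycle.
class EdenBudget {
  size_t _capacity;   // bytes the mutator may allocate before a young GC
  size_t _used;
public:
  explicit EdenBudget(size_t capacity) : _capacity(capacity), _used(0) {}

  // Returns true if this allocation should now trigger a young collection.
  bool record_allocation(size_t bytes) {
    _used += bytes;
    return _used >= _capacity;
  }

  void reset_after_gc() { _used = 0; }
  size_t used() const { return _used; }
};
```

In this model a burst of humongous allocations simply exhausts the budget sooner, which is exactly the "more frequent young collections" trade-off described above.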
The other issue, that these humongous allocations cause too many
concurrent cycles, could be managed by looking into canceling the
concurrent marking if that concurrent start GC freed lots and lots of
humongous objects, e.g. brought occupancy way below the marking
threshold again.
I did not think this through, though; of course at some point you do
need to start the concurrent mark.
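That cancellation check could take roughly the following shape as a standalone sketch (the function name and the fixed 10% margin are invented here, not anything G1 actually implements):

```cpp
#include <cassert>
#include <cstddef>

// Sketch: after the concurrent start pause has eagerly reclaimed humongous
// regions, decide whether continuing the marking cycle is still worthwhile.
// Marking is cancelled only if occupancy fell clearly below the threshold
// that triggered it; the margin avoids flip-flopping right at the threshold.
bool should_cancel_marking(size_t occupancy_after_gc,
                           size_t marking_threshold) {
  size_t margin = marking_threshold / 10; // arbitrary 10% hysteresis
  return occupancy_after_gc + margin < marking_threshold;
}
```

The margin is the interesting design choice: without some hysteresis, an occupancy hovering at the threshold would repeatedly start and cancel cycles.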
Some (or most) of that heap pressure might have been caused by the
internal fragmentation, so allowing allocation into the tail ends would
very likely decrease that pressure too.
This would likely be the first thing I would be looking into if the logs
indicate that.
> 2. Enhance the reclamation of short-live humongous object by
> covering object array that current eager reclaim only supports
> primitive type for now. This part looks same to JDK-8048180 and
> JDK-8073288 Thomas mentioned. The evacuation flow will iterate
> the humongous object array as a regular object if the humongous
> object is "young", which can be distinguished by the "age" field
> in markOop.
>
> The patch is being tested. We will share it once it proves to
> work fine with our applications. I don't know if any similar
> approach has been already tried and any advices?
The problem with treating humongous reference arrays as young is that
this heuristic significantly increases the garbage collection time if
the object survives the collection.
I.e. the collector needs to iterate over it like any other young object,
and while you save the time to copy the object by aging it in place,
scanning all those references tends to take more time than the copy
would.
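For reference, the "young via the age field" test from point 2 could be sketched like this. The bit layout mimics HotSpot's mark word (4 age bits starting at bit 3), but this struct is a simplified stand-in for illustration, not the real markOop:

```cpp
#include <cassert>
#include <cstdint>

// Simplified mark word: HotSpot keeps an object's GC age in four bits of
// the header (bits 3..6). An age of 0 means the object has never survived
// a collection, which the proposal uses to classify a humongous object
// as "young".
struct MarkWord {
  uintptr_t value;
  static const uintptr_t age_shift = 3;
  static const uintptr_t age_mask  = 0xF; // 4 age bits

  unsigned age() const {
    return static_cast<unsigned>((value >> age_shift) & age_mask);
  }
  bool is_young_candidate() const { return age() == 0; }
};
```

A freshly allocated object has age 0 and would be treated as young; any object that has been aged by a previous collection would stay on the old-style path.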
In that "different regional collector" I referenced in the other email,
exactly this had been implemented, with the above issues. That collector
also had configurable region sizes down to 64k (well, basically even
less, but anything below that was just for experimentation, and even 64k
had been very debatable), so the humongous object problem had been a lot
larger there. It might not be as pronounced with G1's "giant" humongous
objects.
Treating them as old, like G1 does now, allows you to be a lot more
selective about what you take in for garbage collection. Right now the
policy isn't particularly smart (just take humongous objects of a
particular type with fewer remembered set entries than a low, fixed
threshold), but that could be improved.
E.g. G1 has a measure of approximately how long scanning a remembered
set entry takes, so the threshold could be made dependent on the
available time.
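A time-based version of that admission policy could be sketched as follows (the struct, field names, and numbers are all invented for this sketch; G1's actual policy is the fixed entry threshold described above):

```cpp
#include <cassert>
#include <cstddef>

// Sketch of a cost-based eager-reclaim policy: instead of a fixed cap on
// remembered set entries, admit a humongous candidate while the predicted
// time to scan its remset entries still fits into the remaining pause
// budget.
struct CandidatePolicy {
  double cost_per_entry_ms;  // measured average cost to scan one remset entry
  double budget_ms;          // pause time still available for candidates

  // Returns true and reserves the predicted scan time if the candidate fits.
  bool admit(size_t remset_entries) {
    double predicted_ms = remset_entries * cost_per_entry_ms;
    if (predicted_ms > budget_ms) return false;
    budget_ms -= predicted_ms;
    return true;
  }
};
```

Compared to a fixed entry threshold, this naturally admits more candidates when remset scanning is cheap and fewer when the pause budget is nearly spent.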
Thanks,
Thomas