G1GC Full GCs
Peter Schuller
peter.schuller at infidyne.com
Mon Jul 12 09:02:34 PDT 2010
> Am I missing some tuning that should be done for G1GC for applications like
> this? Is 20ms out of 80ms too aggressive a target for the garbage rates
> we're generating?
I have never run HBase, but in an LRU stress test (I posted about it a
few months ago) I specifically observed remembered set scanning costs
go way up. In addition, I recently saw fallbacks to full GCs in
a slightly different test that I also posted about to hotspot-gc-use, and that
turned out to be a result of the estimated rset scanning costs being
so high that regions were never selected for eviction even though they
had very little live data. I would be very interested to hear if
you're having the same problem. My last post on the topic is here:
http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2010-June/000652.html
The (throw-away) patch that should tell you whether this is what's
happening is here:
http://distfiles.scode.org/mlref/g1/g1_region_live_stats_hack.patch
Out of personal curiosity I'd be very interested to hear whether this
is what's happening to you (in a real reasonable use-case rather than
a synthetic benchmark).
My sense (and hotspot/g1 developers please smack me around if I am
misrepresenting anything here) is that the effect I saw (with rset
scanning costs) could cause perpetual memory growth (until fallback to
full GC) in two ways:
(1) The estimated (and possibly real) cost of rset scanning for a
single region could be so high that it is never possible to select it
for eviction given the requested pause time goals. Hence, such a
region effectively "leaks" until full GC.
(2) The estimated (and possibly real) cost of rset scanning for
some regions may be so high that, in practice, other regions with
better pay-off/cost ratios are always selected instead, such that the
expensive regions end up never being collected even if, in theory, any
single one of them could be evicted within the pause time goal.
These are effectively the same thing, with (1) being an extreme case of (2).
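To make the two cases concrete, here is a toy model in Python. It is only a sketch of the kind of cost-based selection I am describing, not HotSpot's actual collection set code: the Region fields, the greedy choose_collection_set function, the region names and all of the numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical per-region predictions; not HotSpot's real data structure."""
    name: str
    garbage_bytes: int   # bytes reclaimed if the region is evacuated
    rset_scan_ms: float  # predicted remembered set scanning cost
    copy_ms: float       # predicted cost of copying the live data out

    @property
    def cost_ms(self) -> float:
        return self.rset_scan_ms + self.copy_ms

    @property
    def payoff_per_ms(self) -> float:
        return self.garbage_bytes / self.cost_ms


def choose_collection_set(regions, pause_goal_ms):
    """Greedy toy policy: best pay-off/cost first, skip whatever no longer fits."""
    chosen = []
    remaining_ms = pause_goal_ms
    for region in sorted(regions, key=lambda r: r.payoff_per_ms, reverse=True):
        if region.cost_ms <= remaining_ms:
            chosen.append(region)
            remaining_ms -= region.cost_ms
    return chosen


# Made-up heap state. Both expensive regions are almost entirely garbage,
# yet their predicted rset scanning cost dominates their total cost.
EXAMPLE_REGIONS = [
    Region("cheap-a", 600_000, 1.0, 4.0),
    Region("cheap-b", 550_000, 1.0, 4.0),
    Region("cheap-c", 520_000, 1.0, 4.0),
    Region("cheap-d", 505_000, 1.0, 4.0),
    Region("popular", 900_000, 8.0, 1.0),    # case (2): fits alone, always outbid
    Region("hot-rset", 950_000, 30.0, 0.5),  # case (1): over a 20 ms goal by itself
]

if __name__ == "__main__":
    for goal_ms in (20.0, 40.0, 60.0):
        names = [r.name for r in choose_collection_set(EXAMPLE_REGIONS, goal_ms)]
        print(f"pause goal {goal_ms:.0f} ms -> {names}")
```

With these numbers, a 20 ms goal only ever picks the four cheap regions. A 40 ms goal also picks up "popular" but still not "hot-rset", and a 60 ms goal fits everything. If the application keeps producing equally cheap regions on every cycle (an assumption of this model), the skipped regions stay skipped for as long as the goal is too low.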
In both cases, the effect should be mitigated (and has been in the
case where I did my testing), but as far as I can tell not generally
"fixed", by increasing the pause time goals.
It is unclear to me how this is intended to be handled. The original
g1 paper mentions an rset scanning thread that I suspect is
intended to do rset scanning in the background, so that regions
like these could be evicted more cheaply during the STW eviction
pause. I didn't find such a thread anywhere in the source code,
but I may very well just be missing it.
--
/ Peter Schuller