g1 not doing partial aggressively enough -> fallback to full gc

Wed Jun 23 22:11:25 UTC 2010

For the record (= mailing list archives): using a variant of [1]
(strictly throw-away, but just FYI) I finally realized that the
problem is/was that the predicted remembered set scanning cost is
extremely high for the regions that are not being selected for partial
gc. Individual regions, even those that are almost empty (< 10%
liveness), either never reach the top of the region candidates or else
are predicted to be so expensive that even a single region would blow
the pause time goal. For example:

predict_region_elapsed_time_ms: 26.896303ms total, 26.172088ms rs scan
(36045 cnum), 0.056240 copy time (34720 bytes), 0.667975 other time
predict_region_elapsed_time_ms: 59.700384ms total, 58.974873ms rs scan
(81222 cnum), 0.057536 copy time (35520 bytes), 0.667975 other time
predict_region_elapsed_time_ms: 79.066806ms total, 78.331835ms rs scan
(107881 cnum), 0.066996 copy time (41360 bytes), 0.667975 other time
predict_region_elapsed_time_ms: 76.376391ms total, 75.619144ms rs scan
(104145 cnum), 0.089272 copy time (55112 bytes), 0.667975 other time
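
To make the breakdown easier to see, here is a small throw-away sketch
that parses the four lines above and works out the rs scan share and
the cost per card. The regex and the field names are my own, and I am
assuming that the copy and other times are in ms like the rest (the
three components do add up to the total). On this data rs scan is over
97% of the predicted time for every region, and the cost comes out at
roughly 0.73 microseconds per card each time, so the prediction looks
linear in the card count.

```python
import re

# The four records from above, rejoined so that each is on one line.
LOG = """\
predict_region_elapsed_time_ms: 26.896303ms total, 26.172088ms rs scan (36045 cnum), 0.056240 copy time (34720 bytes), 0.667975 other time
predict_region_elapsed_time_ms: 59.700384ms total, 58.974873ms rs scan (81222 cnum), 0.057536 copy time (35520 bytes), 0.667975 other time
predict_region_elapsed_time_ms: 79.066806ms total, 78.331835ms rs scan (107881 cnum), 0.066996 copy time (41360 bytes), 0.667975 other time
predict_region_elapsed_time_ms: 76.376391ms total, 75.619144ms rs scan (104145 cnum), 0.089272 copy time (55112 bytes), 0.667975 other time
"""

# Hypothetical pattern matching the debug output format shown above.
PATTERN = re.compile(
    r"predict_region_elapsed_time_ms: (?P<total>[\d.]+)ms total, "
    r"(?P<rs>[\d.]+)ms rs scan \((?P<cards>\d+) cnum\), "
    r"(?P<copy>[\d.]+) copy time \((?P<bytes>\d+) bytes\), "
    r"(?P<other>[\d.]+) other time"
)


def parse(log):
    """Return one dict per prediction record found in the log text."""
    rows = []
    for m in PATTERN.finditer(log):
        rows.append({
            "total": float(m["total"]),
            "rs": float(m["rs"]),
            "cards": int(m["cards"]),
            "copy": float(m["copy"]),
            "bytes": int(m["bytes"]),
            "other": float(m["other"]),
        })
    return rows


rows = parse(LOG)
for r in rows:
    share = r["rs"] / r["total"]                # fraction spent on rs scan
    per_card_us = r["rs"] / r["cards"] * 1000   # ms -> microseconds
    print(f"{r['cards']:>7} cards: rs scan {share:.1%} of total, "
          f"{per_card_us:.3f} us/card")
```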

What I don't understand now is why such a large number of cards
accumulates in the remembered set waiting to be scanned. Reading the g1
paper I get the impression that mutator threads are supposed to do
rset scanning when the global queue becomes full (and that the size is
not huge), and that normally, for non-hot cards, a concurrent rs
scanning thread will do the scanning work. (Code-wise I have not yet
figured out whether there even is a dedicated remembered set scanning
thread anymore, though.)

In this case I have accumulations of > 100 000 cards to be scanned.
That's pretty significant. Presumably (though I have not looked at
this in detail yet) these remembered sets remain large (for whatever
reason), or else the regions would eventually be collected within some
reasonable time. Is there a dedicated thread which is just not
keeping up with the mutators (in which case one might want it to
prioritize low-liveness regions), or is it the case that there is only
mutator + gc rs scanning in the current g1? If the latter, and if the
mutators don't trigger rs scanning for these regions, and they are
never picked for collection for efficiency reasons, the regions might,
it seems to me, essentially be un-collectable forever.
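
To illustrate the worry, here is a deliberately simplified model of
budget-limited region selection. This is not g1's actual policy code,
just a greedy sketch under my own assumptions: the Region class, the
choose_collection_set function, the 1 MiB region size, the 20 ms pause
budget and the "cheap" region are all made up for the example. Only
the predicted times and live byte counts are taken from the log
output quoted earlier.

```python
from dataclasses import dataclass

REGION_SIZE = 1024 * 1024   # assumed region size in bytes (hypothetical)
PAUSE_BUDGET_MS = 20.0      # assumed pause time goal (hypothetical)


@dataclass
class Region:
    name: str
    live_bytes: int
    predicted_ms: float

    @property
    def reclaimable_bytes(self):
        return REGION_SIZE - self.live_bytes


def choose_collection_set(regions, pause_budget_ms):
    """Greedy sketch: most efficient regions first, skip any that no
    longer fit in the remaining pause budget."""
    by_efficiency = sorted(
        regions,
        key=lambda r: r.reclaimable_bytes / r.predicted_ms,
        reverse=True,
    )
    chosen = []
    remaining = pause_budget_ms
    for r in by_efficiency:
        if r.predicted_ms <= remaining:
            chosen.append(r)
            remaining -= r.predicted_ms
    return chosen


regions = [
    # Predicted times and live bytes from the log output above.
    Region("r1", 34720, 26.896303),
    Region("r2", 35520, 59.700384),
    Region("r3", 41360, 79.066806),
    Region("r4", 55112, 76.376391),
    # Made-up region with a small remembered set, for contrast.
    Region("cheap", 512 * 1024, 2.0),
]

chosen = choose_collection_set(regions, PAUSE_BUDGET_MS)
print([r.name for r in chosen])
```

In this model the four nearly empty regions are skipped on every
cycle, because each one alone exceeds the budget. Unless something
shrinks their remembered sets in the meantime, nothing ever changes
that outcome, which is the "un-collectable forever" case.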

[1] http://distfiles.scode.org/mlref/g1/g1_region_live_stats_hack.patch

-- 
/ Peter Schuller