RFR: 8140326: G1: Consider putting regions where evacuation failed into next collection set [v4]
Thomas Schatzl
tschatzl at openjdk.org
Tue Jun 13 08:55:50 UTC 2023
On Fri, 9 Jun 2023 11:43:35 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:
> I don't get why tams and CM-related data-structure need to be cleared here; shouldn't they be handled inside G1ClearBitmapClosure already?
First, evacuation failure already updated region statistics during concurrent start (as well as g1 updated TAMS during concurrent start).
It is not that evacuation failure and marking are linked, it is that bitmaps and TAMS and statistics are linked. Previous code also assumed that all old regions were marked through, so (as optimizations):
1) Bitmap: g1 keeps the marks on the bitmap of evac-fail regions
2) TAMS: could be kept to top() for all evac-failed regions during concurrent start
3) statistics: evacuation failure did already update them; clearing them in some cases simply been avoided by conditionally updating statistics (maybe remove that `if` now in evacuation failure handling now, what do you think?).
Generally, if you update one, you need to keep the other in sync for marking to work. However none of the marking data cleanup has been necessary before because of that assumption that all old regions are to be marked through.
`G1ClearBitmapClosure` doesn't interact with marking proper (as in: marking live objects on the bitmap), the regions that are cleared in that phase are only those that may not have their bitmap cleared yet due to the "clear bitmaps concurrently" optimization. If that were not the case, g1 would not need that phase.
At that time TAMS and statistics have already been reset by marking, so it does not need to (and both are marking data structures anyway).
That is, the reason to have `G1ClearBitmapClosure` is different compared to `ClearRetainedRegionBitmaps`.
Now that not all old regions may be marked through at concurrent start any more, g1 needs to undo the eagerly updated statistics and TAMS to not confuse other code (mainly marking, but there are asserts wrt to TAMS all over the place).
I do not think there is a way to avoid evacuation failure to interact with concurrent marking (during the concurrent start pause at least) if we want to clear out pinned regions efficiently and quickly.
Maybe making this explicit with something like (didn't test):
bool drop_mark_data = g1h->collector_state()->in_concurrent_start_gc() &&
g1h->policy()->retain_evac_failed_region(r);
if (drop_mark_data) {
// clear statistics and tams
} else {
// tams and statistics must be "set"
}
bool drop_bitmap = !g1h->collector_state()->in_concurrent_start_gc() ||
g1h->policy()->retain_evac_failed_region(r);
if (drop_bitmap) {
// clear bitmap
} else {
// bitmap must contain marks
}
would help?
I thought this would be too complicated for not much gain (resetting TAMS and statistics is little work compared to bitmaps), so I did not. The change I'm going to post is trying to improve the comment.
> (On a higher reasoning, evac-fail processing shouldn't interact with CM; I am assuming the bitmap is not related to CM.)
Evacuation failure already did and still does by a) using mark bitmaps and keeping the marks during concurrent start and b) needing to update marking statistics/tams if necessary (because it keeps marks on the bitmap). This particular change is only required extension to support the new functionality.
I assume we will keep the optimization to keep marks on the bitmap created by evacuation failure (we did in the past, the reason did not change), so evacuation failure be required to keep messing with CM.
I renamed `ClearRetainedRegionBitmaps` to `ClearRetainedRegionData` to make it more clear that retained regions need more processing than just bitmaps now.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14220#discussion_r1227773655
More information about the hotspot-gc-dev
mailing list