RFR: 8256265: G1: Improve parallelism in regions that failed evacuation [v5]

Wed Sep 14 10:26:41 UTC 2022

On Wed, 14 Sep 2022 09:31:15 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:

> > another that TAMS is intrinsically linked to the bitmap (and marking) so it should be managed (like TARS) by G1ConcurrentMark
> 
> I don't think tams and the bitmap are _intrinsically_ linked; they are two independent entities. Only in the marking context, are they linked. tams is intrinsically linked to marking but not the bitmap.

I see the use of the (marking) bitmap during evacuation failure as an optimization of memory usage. Evacuation failure and marking are completely different concerns and in an ideal world should not share data structures.

> 
> > That G1 messes with the bitmap for evacuation failure handling is from an abstraction POV not good, but overall acceptable (and adds these warts).
> 
> I don't view it as a leaky abstraction, since the bitmap is not really tied to marking in my mind. Recording evac-fail objs in the bitmap is completely isolated from the use of the bitmap during concurrent-marking.

We can probably only use the marking bitmap for evac-fail objs because we made the (prior, well-reasoned) design decision to completely separate mixed collection and marking. Marking during the mixed phase could be implemented; iirc it was actually possible and working in JDK7/8 (at least I remember a bug fix for some accidental enabling of it for a time...). In that case this reuse might have been a problem.

In the only case where concurrent marking currently overlaps the mixed phase (Concurrent Clear for Next Mark), it just happens to be okay to clear the bitmap twice: once at gc start and then again in the concurrent phase.

The marking bitmap contents (per region) are also only valid between bottom and tams of a region; that the bitmap always covers the whole region is an implementation detail. It is also something I would actually like to experiment with getting rid of at some point: why keep around so much memory if it isn't used? There are often regions that do not contain live objects at all, so why commit bitmap backing storage for them in advance? Or humongous objects: is it efficient to commit the bitmap for the whole area they span for a single mark at the header? There are also some marking algorithm improvements (with desirable properties) that add a few more per-region helper data structures that might need to be reset at that point.
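To illustrate what I mean by "only valid between bottom and tams", here is a minimal sketch. It is a toy model, not HotSpot code: `RegionModel`, `mark` and `has_valid_mark` are hypothetical names, and addresses are plain word indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of one heap region (not HotSpot code).
struct RegionModel {
  std::size_t bottom;      // first word index of the region
  std::size_t end;         // one past the last word index of the region
  std::size_t tams;        // top-at-mark-start, bottom <= tams <= end
  std::vector<bool> bits;  // one mark bit per word, covering [bottom, end)

  RegionModel(std::size_t b, std::size_t e)
      : bottom(b), end(e), tams(b), bits(e - b, false) {}

  // Sets the raw bit; the storage covers the whole region.
  void mark(std::size_t addr) {
    assert(addr >= bottom && addr < end);
    bits[addr - bottom] = true;
  }

  // A bit only carries marking information if it lies in [bottom, tams).
  bool has_valid_mark(std::size_t addr) const {
    assert(addr >= bottom && addr < end);
    return addr < tams && bits[addr - bottom];
  }
};
```

In this model a region whose tams equals bottom has no valid marks at all, regardless of what the backing storage happens to contain.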

Potentially having to enumerate all of these for clearing at all of the places where we now use `clear_bitmap_for_region` seems cumbersome and error-prone. Clearing additional data might be some extra work, but from what I have seen so far the cost isn't relevant.

This is why in my view the (per-region) bitmap and tams are both by definition marking related data structures, and should be treated as intrinsically linked. Imho evac-failure simply barges in and reuses them out of convenience (in some way) ;-)

> 
> > This omits that call in the second invocation of G1CollectedHeap::clear_bitmap_for_region() because TAMS must be bottom() at that point.
> 
> That's one of the reasons why I suggested moving tams assignment out.

I have found that doing a little more than the absolutely necessary thing very often makes the code clearer, because it allows raising the abstraction level of the code, i.e. "clear all marking data for region" vs. "clear bitmap for region". In this case the overhead is an assignment to a single variable in addition to clearing kBs of bitmap data...
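A rough sketch of the two abstraction levels, under stated assumptions: `RegionMarkingData`, `clear_bitmap_only`, `clear_all_marking_data` and the `live_words` field are hypothetical stand-ins for illustration, not the actual G1 types or methods.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-region marking state; names are illustrative only.
struct RegionMarkingData {
  std::size_t bottom;        // start of the region
  std::size_t tams;          // top-at-mark-start
  std::vector<bool> bitmap;  // one mark bit per word of the region
  std::size_t live_words;    // stand-in for additional per-region helper data

  RegionMarkingData(std::size_t start, std::size_t words)
      : bottom(start), tams(start), bitmap(words, false), live_words(0) {}
};

// Narrow operation: every caller has to remember what else needs resetting.
void clear_bitmap_only(RegionMarkingData& r) {
  r.bitmap.assign(r.bitmap.size(), false);
}

// Broader operation: a single place that knows all per-region marking state.
void clear_all_marking_data(RegionMarkingData& r) {
  clear_bitmap_only(r);
  r.tams = r.bottom;  // one assignment; harmless if tams is already bottom
  r.live_words = 0;
}
```

Calling `clear_all_marking_data` a second time on the same region leaves it unchanged, so a caller that already knows tams is at bottom only pays for a redundant assignment.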

-------------

PR: https://git.openjdk.org/jdk/pull/9980