RFR: 8256265 G1: Improve parallelism in regions that failed evacuation [v2]

Stefan Johansson sjohanss at openjdk.java.net
Mon Dec 13 14:31:10 UTC 2021


On Mon, 13 Dec 2021 02:59:00 GMT, Hamlin Li <mli at openjdk.org> wrote:

> > ...
> > I directly mark them in the bitmap (that is already in place). One drawback with this approach is that we need all failing regions to have clean bitmaps ready for use. I solve this by clearing them up front. We can probably make this more efficient, but in the PoC I just clear all old regions added to the collection set and all regions compacted in the Full GC. To see the additional cost of the CSet clearing I added a new phase showing this and it takes significant time in some cases, but still it seems to be a lot faster than the current approach. Your logging additions really showed that the sorting part of the current approach is quite slow when many objects are failing the evacuation (and this will be the normal case with region pinning).
> 
> Thanks a lot Stefan, I like the idea very much! :) I'll see what can be done to further optimize this approach.
> 
One thing we need to look at more closely is the clearing of the bitmap for old regions. This can probably be moved to a better location and done in parallel to minimize the impact.
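
As a rough illustration of the bitmap idea discussed above, here is a minimal sketch. It is not the PoC code: `RegionBitmap` and `failed_objects_in_address_order` are made-up names, and the bitmap is modeled as one bit per heap word for a single region. It only shows why an up-front clear is needed and why walking the bitmap yields the failed objects in address order without a separate sort.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the part of the mark bitmap covering one region:
// one bit per heap word, where a set bit marks the start of an object that
// failed evacuation.
class RegionBitmap {
 public:
  explicit RegionBitmap(std::size_t words)
      : _words(words), _bits((words + 63) / 64, 0) {}

  std::size_t size() const { return _words; }

  // Must happen before the region can record failures, otherwise stale
  // marks would be mistaken for failed objects.
  void clear() { std::fill(_bits.begin(), _bits.end(), std::uint64_t{0}); }

  void mark(std::size_t word) {
    assert(word < _words);
    _bits[word / 64] |= std::uint64_t{1} << (word % 64);
  }

  bool is_marked(std::size_t word) const {
    assert(word < _words);
    return (_bits[word / 64] >> (word % 64)) & 1u;
  }

 private:
  std::size_t _words;
  std::vector<std::uint64_t> _bits;
};

// Walking the bitmap visits marked objects in address order, regardless of
// the order in which they were marked, so no sorting step is required.
std::vector<std::size_t> failed_objects_in_address_order(const RegionBitmap& bm) {
  std::vector<std::size_t> result;
  for (std::size_t w = 0; w < bm.size(); ++w) {
    if (bm.is_marked(w)) result.push_back(w);
  }
  return result;
}
```

In this model the cost of `clear()` is proportional to the region size, which is the overhead the new phase in the PoC is meant to expose.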

> > I also think this approach will be easier to parallelize: we can just split each region into multiple chunks that are taken care of individually (we just need to make sure the zapping is handled correctly).
> 
> I think we can handle this zapping correctly, let me do some investigation and testing.
> 
Yeah, this is just a matter of coming up with a nice clean approach to handle it.
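
The chunking idea could look roughly like the sketch below. This is an assumption-laden illustration, not G1 code: `Chunk`, `ChunkClaimer` and `process_chunks` are hypothetical names, the marks are modeled as a plain `std::vector<bool>`, and the per-chunk "work" is just counting marked words. Workers would each call `process_chunks` on a shared claimer; the atomic counter ensures each chunk is handed out exactly once. The zapping mentioned above is deliberately left out, since handling it across chunk boundaries is the open question.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

struct Chunk {
  std::size_t begin;  // inclusive word index
  std::size_t end;    // exclusive word index
};

// Hands out fixed-size chunks of one region to whichever worker asks next.
class ChunkClaimer {
 public:
  ChunkClaimer(std::size_t region_words, std::size_t chunk_words)
      : _region_words(region_words), _chunk_words(chunk_words), _next(0) {
    assert(chunk_words > 0);
  }

  // Returns false once every chunk has been handed out.
  bool claim(Chunk& out) {
    std::size_t begin = _next.fetch_add(_chunk_words);
    if (begin >= _region_words) return false;
    out.begin = begin;
    out.end = std::min(begin + _chunk_words, _region_words);
    return true;
  }

 private:
  const std::size_t _region_words;
  const std::size_t _chunk_words;
  std::atomic<std::size_t> _next;
};

// Body of one worker: keep claiming chunks and count the marked words in
// each. marks must cover at least the region handled by the claimer.
std::size_t process_chunks(ChunkClaimer& claimer, const std::vector<bool>& marks) {
  std::size_t handled = 0;
  Chunk c;
  while (claimer.claim(c)) {
    for (std::size_t w = c.begin; w < c.end; ++w) {
      if (marks[w]) ++handled;
    }
  }
  return handled;
}
```

With this shape, load balancing comes from the chunk size: smaller chunks spread a region with many failed objects across more workers, at the cost of more claims on the shared counter.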

> > What do you think about this?
> 
> Of course, I agree!
> 
> > There are more things on the agenda that will affect this. Thomas and I have ideas on how to remove the need for two marking bitmaps. Currently we keep the prev bitmap to be able to parse regions with unloaded classes, but there are other ways to do this without keeping the second bitmap. We also need the bitmap for verification, but this can also be managed in other ways.
> > If we do these things, the approach in my PoC would also need to be revised a bit.
> 
> Thanks for sharing, it would be helpful to have more details in this direction. :)
> 

If we manage to remove one of the bitmaps, there will no longer be a "prev" and a "next" bitmap, just "the marking" bitmap. The bitmap is populated during marking, and during that time no old regions will be collected, so there will be no need to clear parts of it. Once we are ready to do mixed collections, the marking information in the bitmap is no longer needed and the bitmap will be cleared (concurrently, as is done today for the "next" bitmap). So what I mean by "revised" is that we need to adapt the solution to this new state of things.

> > So the question is how we should proceed. I'm a bit hesitant about this change because of its complexity, knowing that there is probably a simpler way to improve parallelism if we take a different initial approach.
> > Please let me know what you think.
> 
> I like the idea, and fully support the new direction, hope to get more details if it's convenient for now. :)
> 

Sounds great. The question is how we should move forward in the most efficient way. I could propose my PoC as a PR and then take it from there, allowing you to later work on making it parallelize better. Then, in the future, we can attack the other optimizations.

-------------

PR: https://git.openjdk.java.net/jdk/pull/6627


