RFR: 8357976: GenShen crash in swap_card_tables: Should be clean [v3]
Y. Srinivas Ramakrishna
ysr at openjdk.org
Tue Jun 17 00:15:27 UTC 2025
On Thu, 12 Jun 2025 22:10:12 GMT, William Kemper <wkemper at openjdk.org> wrote:
>> Degenerated cycles must also clean the `read` card table before swapping it with the `write` card table.
>
> William Kemper has updated the pull request incrementally with one additional commit since the last revision:
>
> Revert "Consolidate read table cleaning"
>
> This reverts commit 9552021cbeec699e7b04db81fa68819c23b7dab7.
Reviewed and approved; see previous comment.
🚢
@earthling-amzn :
> Genshen does not clean the read table at the end of a concurrent cycle (we could do this, but we'd still need to do the same in a degenerated cycle).
Yes, and I am suggesting we do that to clean up and consolidate the cleaning along multiple paths into the same method that is invoked at an identical point in the Collector's state machine, irrespective of whether we arrived at that state via a concurrent collection or a degeneration.
> The complexity comes from swapping the card tables in a concurrent cycle. We must not let a subsequent degenerated cycle swap them a second time. The degenerated cycle can only clean the read table if there was no active concurrent cycle (lest it throw away critical remembered set data).
The swapping must happen at the start of the marking cycle. In similar fashion, the state of the Collector's state machine should indicate if the swap has already been done in this cycle or not. The specifics of wheher the swap was done by a degenerate or concurrent collection should then not matter?
> However, the degenerated cycle must also clear the mark bitmaps if the concurrent cycle degenerated during the root scan (the degenerated cycle will rescan the roots). Of course, we cannot clear the mark bitmaps if the concurrent cycle degenerated during the mark phase. So, for the degenerated cycle, clearing the mark bitmaps and cleaning the read table need to happen in two different cases and it doesn't work if they're combined into the same method (and I don't think adding a method parameter to sometimes clear the mark bitmap and sometimes the read table will make the code any easier to understand).
That's OK, they can be in two different methods, with the clearing of the marking bitmap done in one case and not in the other based on when the degeeration occurred. I think may be we don't have sufficiently refined Collector state machine at the moment which may be part of the problem here.
That having been said, I can understand that:
> The refactoring you suggest is not so simple, alas.
In particular, if our eventual future plan is to do away with degeneration as we know it today, then the refactoring to achieve the clean up I suggest may be moot.
With that in mind, I am going to approve the changes in their current form as they look right if somewhat difficult to maintain and prove the correctness of. (The more refined state machine approach would likely make it easier, but will be moot once degeneration is eliminated in the future.)
-------------
Marked as reviewed by ysr (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25735#pullrequestreview-2933812072
PR Comment: https://git.openjdk.org/jdk/pull/25735#issuecomment-2978526794
More information about the shenandoah-dev
mailing list