RFR: 8338737: Shenandoah: Reset marking bitmaps after the cycle

William Kemper wkemper at openjdk.org
Thu Oct 17 21:29:00 UTC 2024


On Thu, 17 Oct 2024 08:29:43 GMT, Xiaolong Peng <xpeng at openjdk.org> wrote:

> All the shenandoah passed, still waiting for our test farm to verify performance.
> 
> Reset marking bitmaps after collection cycle, for GenShen only do this for young generation, also choose not do this for Degen and full GC since both are running at safepoint, we should leave safepoint as ASAP.
> 
> I have run same workload for 30s with shenandoah in generational mode and classic mode, average average time of concurrent reset dropped significantly(283us to 109us for GenShen, 276us to 167us for classic shenandoah. 
> 
> Classic shenandoah baseline:
> 
> [30.973s][info][gc,stats    ] Concurrent Reset               =    0.148 s (a =      276 us) (n =   536) (lvls, us =      141,      205,      240,      291,      694)
> [30.973s][info][gc,stats    ] Pause Init Mark (G)            =    0.136 s (a =      253 us) (n =   536) (lvls, us =      119,      242,      254,      264,      926)
> [30.973s][info][gc,stats    ] Pause Init Mark (N)            =    0.047 s (a =       87 us) (n =   536) (lvls, us =       62,       84,       88,       91,      110)
> ...
> 
> 
> Classic shenandoah reset bitmaps after cycle:
> 
> [30.967s][info][gc,stats    ] Concurrent Reset               =    0.109 s (a =      167 us) (n =   652) (lvls, us =       48,      113,      139,      160,      888)
> [30.967s][info][gc,stats    ] Concurrent Reset After Collect =    0.085 s (a =      132 us) (n =   648) (lvls, us =      107,      121,      125,      131,      532)
> [30.967s][info][gc,stats    ] Pause Init Mark (G)            =    0.189 s (a =      289 us) (n =   652) (lvls, us =      117,      260,      277,      291,     2297)
> [30.967s][info][gc,stats    ] Pause Init Mark (N)            =    0.058 s (a =       89 us) (n =   652) (lvls, us =       62,       85,       89,       94,      143)
> ...
> 
> 
> GenShen baseline
> 
> [31.008s][info][gc,stats    ] Concurrent Reset               =    0.107 s (a =      283 us) (n =   379) (lvls, us =      143,      225,      283,      330,      753)
> [31.008s][info][gc,stats    ] Pause Init Mark (G)            =    0.098 s (a =      259 us) (n =   379) (lvls, us =      104,      227,      262,      305,     1033)
> [31.008s][info][gc,stats    ] Pause Init Mark (N)            =    0.034 s (a =       90 us) (n =   379) (lvls, us =       67,       81,       89,       99,      130)
> ...
> 
> 
> GenShen reset bitmaps after cycle
> 
> 
> [30.977s][info][gc,stats    ] Concurrent Reset               =    0.050 s (a =      109 us) (n =   462) (lvls, us =       54,       77,      100,      125,      496)
> [30.977s][...

I have some concerns about the correctness here. Consider this scenario:

1. At the end of a young _or mixed_ collection, `op_reset_after_collect` will reset young regions and unset the `needs reset` flag for the young generation.
2. An old region (A) from the mixed collection could be repurposed as a young region (it's bitmap will not be reset when it is recycled).
3. The next young/mixed collection will not reset the bitmap for region A because the `needs reset` flag is clear.
4. The collector will see stale mark bits for region A and will probably crash.

src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 1206:

> 1204:     if (!_do_old_gc_bootstrap) {
> 1205:       // Only reset for young generation, bitmap for old generation must be retained,
> 1206:       // except there is collection(global/old/degen/full) trigged to collet regions in old gen.

s/collet/collect

src/hotspot/share/gc/shenandoah/shenandoahGeneration.cpp line 235:

> 233:   if (need_bitmap_reset()) {
> 234:     if (heap->mode()->is_generational() && is_global() && !heap->young_generation()->need_bitmap_reset()) {
> 235:       //Only need to reset bitmap for old generation.

I'd like to see an assert here that all the bitmaps for all the young regions are already clear.

-------------

Changes requested by wkemper (Committer).

PR Review: https://git.openjdk.org/shenandoah/pull/516#pullrequestreview-2376307456
PR Review Comment: https://git.openjdk.org/shenandoah/pull/516#discussion_r1805440742
PR Review Comment: https://git.openjdk.org/shenandoah/pull/516#discussion_r1805439068


More information about the shenandoah-dev mailing list