RFR: 8343468: GenShen: Enable relocation of remembered set card tables

Kelvin Nilsen kdnilsen at openjdk.org
Fri Jan 17 21:25:37 UTC 2025


On Fri, 17 Jan 2025 05:18:39 GMT, Cesar Soares Lucas <cslucas at openjdk.org> wrote:

> In the current Generational Shenandoah implementation, the pointers to the read and write card tables are established at JVM launch time and fixed during the whole of the application execution. Because they are considered constants, they are embedded as such in JIT-compiled code. 
> 
> The cleaning of dirty cards in the read card table is performed during the `init-mark` pause, and our experiments show that it represents a sizable portion of that phase's duration. This pull request makes the addresses of the read and write card tables dynamic, with the end goal of reducing the duration of the `init-mark` pause by moving the cleaning of the dirty cards in the read card table to the `reset` concurrent phase.
> 
> The idea is quite simple. Instead of using distinct read and write card tables for the entire duration of the JVM execution, we alternate which card table serves as the read/write table during each GC cycle. In the `reset` phase we concurrently clean the cards in the the current _read_ table so that when the cycle reaches the next `init-mark` phase we have a version of the card table totally clear. In the next `init-mark` pause we swap the pointers to the base of the read and write tables. When the `init-mark` finishes the mutator threads will operate on the table just cleaned in the `reset` phase; the GC will operate on the table that just turned the new _read_ table.
> 
> Most of the changes in the patch account for the fact that the write card table is no longer at a fixed address.
> 
> The primary benefit of this change is that it eliminates the need to copy and zero the remembered set during the init-mark Safepoint. A secondary benefit is that it allows us to replace the init-mark Safepoint with an `init-mark` handshake—something we plan to work on after this PR is merged.
> 
> Our internal performance testing showed a significant reduction in the duration of `init-mark` pauses and no statistically significant regression due to the dynamic loading of the card table address in JIT-compiled code.
> 
> Functional testing was performed on Linux, macOS, Windows running on x64, AArch64, and their respective 32-bit versions. I’d appreciate it if someone with access to RISC-V (@luhenry  ?) and PowerPC (@TheRealMDoerr ?)  platforms could review and test the changes for those platforms, as I have limited access to running tests on them.

Thanks for figuring out how to make all of this work.  It represents a very significant improvement to the GenShen implementation.

src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.cpp line 120:

> 118:   if (ShenandoahCardBarrier) {
> 119:     // Every thread always have a pointer to the _current_ _write_ version of the card table.
> 120:     // The JIT'ed code will use this address (+card entry offset) to marke card's as dirty.

typo here: "to mark cards as dirty"

src/hotspot/share/gc/shenandoah/shenandoahGeneration.cpp line 228:

> 226: // This avoids the need to synchronize reads of the table by the GC workers doing
> 227: // remset scanning, on the one hand, with the dirtying of the table by mutators
> 228: // and by the GC workers doing remset scans, on the other.

Remove "and by the the GC workers doing remset scans" from this comment, so it reads:
"This avoids the need to synchronize reads of the table by the GC workers doing
remset scanning, on the one hand, with the dirtying of the table by mutators on the other.

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 121:

> 119:   CardValue* bp = &(read_table)[0];
> 120:   CardValue* end_bp = &(read_table)[_card_table->last_valid_index()];
> 121: 

Looks like there may be an off-by-one error here, or a bad name for "last_valid_index()".  The following loop should continue while bp <= end_bp.

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 632:

> 630:   CardValue* end_bp = &(new_ptr)[_card_table->last_valid_index()];
> 631: 
> 632:   while (start_bp < end_bp) {

Same issue here as mentioned above.  So our validation test doesn't catch our failure to overwrite last index value with clean value.

-------------

PR Review: https://git.openjdk.org/jdk/pull/23170#pullrequestreview-2559986861
PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1920769089
PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1920775685
PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1920778272
PR Review Comment: https://git.openjdk.org/jdk/pull/23170#discussion_r1920779535


More information about the shenandoah-dev mailing list