RFR: 8358735: GenShen: block_start() may be incorrect after class unloading [v12]

Y. Srinivas Ramakrishna ysr at openjdk.org
Wed Nov 12 01:25:26 UTC 2025


On Mon, 10 Nov 2025 23:45:39 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:

>> When scanning a range of dirty cards within the GenShen remembered set, we need to find the object that spans the beginning of the left-most dirty card.  The existing code is not reliable following class unloading.
>> 
>> The new code uses the marking context when it is available to determine the location of live objects that reside below TAMS within each region.  Above TAMS, all objects are presumed live and parsable.
>
> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Add two comments

I am still going through this, but it seems as if there's a bunch of potential clean-ups to do here. I realize this has already been integrated; I can may be create a separate task to perhaps clean up some of the questions raised in this.

Since there are quite a few comments now, I am going to flush these for now as a record of some of my thoughts and create a separate task in which I can see if these concerns are real and if the code can be somewhat simplified in a few places.

Nothing specific to do here at this time in response to these stream-of-consciousness comments.

src/hotspot/share/gc/shenandoah/shenandoahMarkBitMap.cpp line 70:

> 68:   assert(limit >= r->bottom(), "limit must be more than bottom");
> 69:   assert(addr <= tams, "addr must be less than TAMS");
> 70: #endif

Wouldn't it make more sense for these checks to move to the caller? It appears to me to be a leakage of abstraction to test these conditions here. We should be able to return the address for the marked bit found without interpreting the semantics of the bits themselves?

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 251:

> 249: #endif
> 250: 
> 251:   HeapWord* right = MIN2(region->top(), end_range_of_interest);

This is a safe thing to do, but doesn't the caller already establish the invariant that
`region->top() >= end_range_of_interest` ? Can we just assert that instead of doing the clip/clamp? (And rename the formal parameter name from `end_range_of_interest` to `right`?)

If so, we might also want to change the name of the formal parameter from `card_index` to `left`, and change it to be a (card-aligned) _heap_ _address_ for symmetry in the API.

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 252:

> 250: 
> 251:   HeapWord* right = MIN2(region->top(), end_range_of_interest);
> 252:   HeapWord* end_of_search_next = MIN2(right, tams);

Does the caller ensure that `tams` is always valid (e.g. when `ctx == nullptr`)?

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 261:

> 259:       last_relevant_card_index--;
> 260:     }
> 261:   }

I am not sure this is necessary. I'd just adjust the caller so this is ensured, avoiding this computation here. I think the caller has the last dirty card address and can just use that? I realize there's a bit of an issue with the address for `tams` not necessarily being card-aligned, but I think we should be able to deal with that in the caller as well once we remember that all of this always happens within a single region. (We can add such an assertion so that future adjustments do not render this assumption invalid if the code is changed/adjusted later.)

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.hpp line 296:

> 294: //     for scanning the portion of the obj-array overlapping the dirty cards in
> 295: //     its cluster.
> 296: //  3. Non-array objects are precisely dirtied by the interpreter and the compilers

This should say "imprecisely" at line 296, I think?

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.hpp line 410:

> 408:   ShenandoahCardCluster(ShenandoahDirectCardMarkRememberedSet* rs) {
> 409:     _rs = rs;
> 410:     _end_of_heap = ShenandoahHeap::heap()->end();

It is interesting why we should need end of heap but not start of heap.

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.hpp line 657:

> 655: 
> 656: 
> 657:   // Given a card_index, return the starting address of the first live object in the heap

The interface/API comment should describe a Dijkstra-like pre- and post-condition. i.e. if these conditions are satisfied, we'll give you this result.

A description of what the method does (i.e. how it implements the functionality) belongs in the method implementation.

Here, the two are conflated making the interface description unnecessarily long and convoluted. Sometimes this might indicate that the interface isn't as frugal as it should be.

I might state this more succinctly as follows:


// Given:
// `card_index`: a valid index of a card in the old generation
// `ctx` : a valid marking context for the old generation
// `tams` : a valid top-at-mark-start address for the old generation
//                region in which the card_index is located
// `end_range_of_interest` : an address in that region beyond which we need
//                not locate an object
//
// Returns:
// the address of the object, if any, at the least address that overlaps with
// the address range between the start of card_index and end_range_of_interest,
// or nullptr if no such object exists in the given range.


Once you look at the spec in this manner, you realize that the first and last arguments go together and define a suitable address range, and the second and third arguments go together and provide a context. This allows you to divide the assertion checking and call interface most optimally between caller and callee.

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.inline.hpp line 168:

> 166:         // The range of addresses to be scanned is empty
> 167:         continue;
> 168:       }

When would this happen? We start off with dirty_l to the left of dirty_r, and with dirty_r having started at a card that would correspond to end_addr. I am not convinced this check is needed. I'd rather assert here that: ``` assert(left <= end_addr, "left should remain left of end_addr established above"); ```

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.inline.hpp line 178:

> 176:       // if its head card is dirty. If not, (i.e. its head card is clean)
> 177:       // we'll call it each time we process a new dirty range on the object.
> 178:       // This is always the case for large object arrays, which are typically more

Instead of `This is always the case ...` may be we can say `The latter is aways the case ...`?

(Mea culpa for the old comment.)

src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.inline.hpp line 181:

> 179:       // common.
> 180:       assert(ctx != nullptr || heap->old_generation()->is_parsable(), "Error");
> 181:       HeapWord* p = _scc->first_object_start(dirty_l, ctx, tams, right);

TODO: Wondering if we need to pass both tams and right, or just the max of the two. Will look at `first_object_start()`.

-------------

PR Review: https://git.openjdk.org/jdk/pull/27353#pullrequestreview-3445628617
PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2516048266
PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2512594318
PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2516272583
PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2512602336
PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2512514800
PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2512532407
PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2516276311
PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2512428956
PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2512489229
PR Review Comment: https://git.openjdk.org/jdk/pull/27353#discussion_r2512496763


More information about the hotspot-gc-dev mailing list