RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration [v2]

Patricio Chilano Mateo pchilanomate at openjdk.org
Fri Apr 12 15:49:44 UTC 2024


On Fri, 5 Apr 2024 09:35:22 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:

>> When we thaw the last frame from a stack chunk, we non-atomically set the stack pointer (sp), and set its argsize to 0. Unfortunately, GC threads may iterate over the frames of the stack chunk concurrently. When initializing their stack frame iterator, they read the sp and argsize racingly. Since there is no synchronization between the threads, we may observe inconsistent pairs of sp and argsize, for example the updated sp with a stale argsize, or the updated argsize with a stale sp.
>> 
>> At the core of the problem, the stack chunks define sp and argsize. The argsize is used to calculate where the bottom of the stack chunk is, which is required to determine if it is empty or not. This patch proposes to switch things around and store the bottom directly in the chunk, instead of argsize. Instead, argsize is calculated from the bottom. By changing the relationship of which property is stored and which property is calculated, we can simplify this code quite a bit.
>> 
>> In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing. The bottom is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source if inconsistency describing said frame (racing argsize).
>> 
>> Testing: tier1-5, manual testing of test/jdk/jdk/internal/vm/Continuation
>
> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Nits

Thanks, looks good to me. Only a few comments.

src/hotspot/share/oops/stackChunkOop.inline.hpp line 135:

> 133: 
> 134: inline bool stackChunkOopDesc::is_empty() const {
> 135:   assert(sp() <= stack_size(), "");

Maybe keep assert(sp() <= bottom(), "")?

src/hotspot/share/runtime/continuationFreezeThaw.cpp line 567:

> 565:   // Consider leaving the chunk's argsize set when emptying it and removing the following branch,
> 566:   // although that would require changing stackChunkOopDesc::is_empty
> 567:   if (!chunk->is_empty()) {

Seems you have implemented the suggestion in the comment so we can remove this branch and unconditionally decrement total_size_needed.

src/hotspot/share/runtime/continuationFreezeThaw.cpp line 631:

> 629:     chunk->set_max_thawing_size(cont_size());
> 630:     chunk->set_bottom(chunk_start_sp - _cont.argsize());
> 631:     chunk->set_sp(chunk->bottom());

Do we need to set sp? We didn't do it before.

src/hotspot/share/runtime/continuationFreezeThaw.cpp line 662:

> 660:   // They'll then be stored twice: in the chunk and in the parent chunk's top frame
> 661:   const int chunk_start_sp = cont_size() + frame::metadata_words;
> 662:   assert(chunk_start_sp == chunk->stack_size(), "");

Isn't this assert still valid?

-------------

PR Review: https://git.openjdk.org/jdk/pull/18643#pullrequestreview-1997313321
PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1562699073
PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1562687298
PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1562647851
PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1562654716


More information about the hotspot-dev mailing list