RFR: 8329088: Stack chunk thawing races with concurrent GC stack iteration
Stefan Karlsson
stefank at openjdk.org
Fri Apr 5 06:24:00 UTC 2024
On Fri, 5 Apr 2024 05:54:11 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:
> When we thaw the last frame from a stack chunk, we non-atomically set the stack pointer (sp), and set its argsize to 0. Unfortunately, GC threads may iterate over the frames of the stack chunk concurrently. When initializing their stack frame iterator, they read the sp and argsize racingly. Since there is no synchronization between the threads, we may observe inconsistent pairs of sp and argsize, for example the updated sp with a stale argsize, or the updated argsize with a stale sp.
>
> At the core of the problem, the stack chunks define sp and argsize. The argsize is used to calculate where the bottom of the stack chunk is, which is required to determine if it is empty or not. This patch proposes to switch things around and store the bottom directly in the chunk, instead of argsize. The argsize is then calculated from the bottom. By changing the relationship of which property is stored and which property is calculated, we can simplify this code quite a bit.
>
> In the new model, is_empty() is true iff sp and bottom are exactly the same. Bottom is only set during freezing, never during thawing: it is initialized whenever the bottom frame is frozen, and left untouched during thawing. Unlike thawing, the freeze operation does not race with the GC by design. Hence we have moved one of the racy mutations to the operation that doesn't race with the GC. The GC is now only exposed to changing sp(). It doesn't matter if it observes the old or new sp(), now that we have removed the only source of inconsistency describing said frame (racing argsize).
>
> Testing: tier1-5, manual testing of test/jdk/jdk/internal/vm/Continuation
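To make the description above concrete, here is a minimal sketch of the two models. It is a toy, not HotSpot code: the chunk size, the helper names, and the assumption that thawing the last frame moves sp to the end of the chunk are all made up for illustration, and frame metadata words are ignored. Each helper computes what a concurrent reader would conclude from one observed pair of values.

```cpp
#include <cassert>

// Toy model only. kStackSize and every function name below are hypothetical;
// none of this is taken from the HotSpot sources.
constexpr int kStackSize = 100;  // chunk size in words (assumed)

// Old model: sp and argsize are stored, bottom is derived from argsize.
inline int old_bottom(int argsize) { return kStackSize - argsize; }
inline bool old_is_empty(int sp, int argsize) { return sp >= old_bottom(argsize); }
// Number of words a reader would walk, given the pair it happened to observe.
inline int old_words_to_walk(int sp, int argsize) {
  return old_is_empty(sp, argsize) ? 0 : old_bottom(argsize) - sp;
}

// New model: sp and bottom are stored, argsize is derived from bottom.
// bottom is written only while freezing, so a concurrent reader sees it stable.
inline int new_argsize(int bottom) { return kStackSize - bottom; }
inline bool new_is_empty(int sp, int bottom) { return sp == bottom; }
inline int new_words_to_walk(int sp, int bottom) { return bottom - sp; }
```

With one 16-word frame at sp = 80 and argsize = 4 (so bottom = 96), the old model gives 16 words for the consistent "before" pair (80, 4) and 0 words for the consistent "after" pair (100, 0), but the torn pair (80, 0) yields 20 words, which overruns the frame. In the new model the reader sees either sp = 80 or sp = 96 against the same bottom = 96, and both answers (16 or 0 words) describe a valid state.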
Looks good. There are a few nits that could be worth considering.
src/hotspot/share/oops/stackChunkOop.inline.hpp line 115:
> 113: if (is_empty()) {
> 114: return 0;
> 115: } else {
Should this be removed?
src/hotspot/share/runtime/continuationFreezeThaw.cpp line 652:
> 650: const int chunk_start_sp = cont_size() + frame::metadata_words;
> 651:
> 652: chunk->set_max_thawing_size(cont_size());
Should this move be reverted?
src/hotspot/share/runtime/continuationJavaClasses.hpp line 110:
> 108: static inline void set_size(HeapWord* chunk, int value);
> 109:
> 110: static inline void set_bottom(HeapWord* chunk, int value);
Shouldn't this be moved down to the other set_bottom? The two set_sp functions are held together.
-------------
Marked as reviewed by stefank (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18643#pullrequestreview-1982024387
PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1552982540
PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1552983809
PR Review Comment: https://git.openjdk.org/jdk/pull/18643#discussion_r1552985724