RFR: 8261448: Preserve GC stack watermark across safepoints in StackWalk

Roman Kennke rkennke at openjdk.java.net
Wed Feb 10 12:40:38 UTC 2021


On Wed, 10 Feb 2021 10:07:20 GMT, Roman Kennke <rkennke at openjdk.org> wrote:

> I am observing the following assert:
> 
> # Internal Error (/home/rkennke/src/openjdk/loom/src/hotspot/share/runtime/stackWatermark.cpp:178), pid=54418, tid=54534
> # assert(is_frame_safe(f)) failed: Frame must be safe
> 
> (see issue for full hs_err)
> 
> In StackWalk::fetchNextBatch() we prepare the entire stack to be processed by calling StackWatermarkSet::finish_processing(jt, NULL, StackWatermarkKind::gc), but then subsequently, during frames scan, perform allocations to fill in the frame information (fill_in_frames => LiveFrameStream::fill_frame => fill_live_stackframe) at where we could safepoint for GC, which could reset the stack watermark.
> 
> This is only relevant for GCs that use the StackWatermark, e.g. ZGC and Shenandoah at the moment.
> 
> Solution is to preserve the stack-watermark across safepoints in StackWalk::fetchNextBatch(). StackWalk::fetchFirstBatch() doesn't look to be affected by this: it is not using the stack-watermark.
> 
> Testing:
>  - [x] StackWalk tests with Shenandoah/aggressive
>  - [x] StackWalk tests with ZGC/aggressive
>  - [ ] tier1 (+Shenandoah/ZGC)
>  - [ ] tier2 (+Shenandoah/ZGC)

I'm converting back to draft. The Loom tests (test/jdk/java/lang/Continuation/*) are still failing and it looks like fetchFirstBatch() does indeed require treatment, and it's complicated because fetchFirstBatch() may end up calling fetchNextBatch() and the KeepStackGCProcessedMark is not reentrant.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2500



More information about the hotspot-gc-dev mailing list