Conflicting use of StackWatermark in StackWalker vs GC?
Stefan Karlsson
stefan.karlsson at oracle.com
Tue Feb 9 14:45:52 UTC 2021
It's interesting that fetchNextBatch processes the entire stack in
preparation for filling in the information about the frames:
// If we have to get back here for even more frames, then 1) the user did not supply
// an accurate hint suggesting the depth of the stack walk, and 2) we are not just
// peeking at a few frames. Take the cost of flushing out any pending deferred GC
// processing of the stack.
StackWatermarkSet::finish_processing(jt, NULL /* context */, StackWatermarkKind::gc);
but further down in fill_in_frames => LiveFrameStream::fill_frame =>
fill_live_stackframe, we perform object allocation, which could
safepoint for a GC that would reset the watermark. After leaving that
safepoint we will have processed the top-most frames, but we won't have
processed down to the current frame the StackWalker is looking at. This
is my guess of what's happening, but I haven't been able to reproduce
the problem, so it's a bit hard to verify that this is what's happening.
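
To make the suspected sequence concrete, here is a toy model of it. This is only a sketch of the guess above, not HotSpot code: ToyStack, walk, and the "eager" count of frames processed after a safepoint are all made-up names and assumptions, and the real watermark logic is considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: none of these types exist in HotSpot.
// Frame 0 is the top-most (youngest) frame.
struct ToyStack {
    // processed[i] is true once frame i has been processed for the
    // current (hypothetical) GC cycle.
    std::vector<bool> processed;

    explicit ToyStack(std::size_t depth) : processed(depth, false) {
        assert(depth > 0);
    }

    // Stand-in for StackWatermarkSet::finish_processing: process every frame.
    void finish_processing() {
        processed.assign(processed.size(), true);
    }

    // Stand-in for a safepoint that starts a new GC cycle: every frame
    // becomes unprocessed again, then only the top `eager` frames are
    // processed when the thread leaves the safepoint.
    void safepoint_new_gc_cycle(std::size_t eager) {
        processed.assign(processed.size(), false);
        for (std::size_t i = 0; i < eager && i < processed.size(); ++i) {
            processed[i] = true;
        }
    }

    bool is_frame_safe(std::size_t i) const { return processed.at(i); }
};

// Walks the frames top-down after flushing all deferred processing.
// If gc_at_frame is a valid index, the allocation done while filling in
// that frame is assumed to hit a safepoint that starts a new GC cycle.
// Returns the index of the first frame found unsafe, or the stack depth
// if every frame was safe.
std::size_t walk(ToyStack& s, std::size_t gc_at_frame, std::size_t eager) {
    s.finish_processing();
    for (std::size_t i = 0; i < s.processed.size(); ++i) {
        if (!s.is_frame_safe(i)) {
            return i;  // this is where the real assert would fire
        }
        if (i == gc_at_frame) {
            s.safepoint_new_gc_cycle(eager);
        }
    }
    return s.processed.size();
}
```

With a 10-frame stack, a safepoint while filling frame 4, and only the top 2 frames processed afterwards, walk reports frame 5 as the first unsafe frame. Without a safepoint during the walk, every frame stays safe.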
StefanK
On 2021-02-09 15:08, Roman Kennke wrote:
> I am getting the same failure with ZGC:
>
> CONF=linux-x86_64-server-fastdebug make run-test
> TEST=java/lang/StackWalker
> TEST_VM_OPTS="-XX:+UnlockExperimentalVMOptions -XX:+UseZGC
> -XX:ZCollectionInterval=0.01"
>
>
>> Hello all,
>>
>> When running StackWalker tests with 'aggressive' Shenandoah mode
>> (i.e. run GCs all the time, even if there is no work), then I observe
>> crashes like this:
>>
>> # Internal Error
>> (/home/rkennke/src/openjdk/jdk/src/hotspot/share/runtime/stackWatermark.cpp:178),
>> pid=549168, tid=549230
>> # assert(is_frame_safe(f)) failed: Frame must be safe
>>
>> Full hs_err:
>> http://cr.openjdk.java.net/~rkennke/hs_err_pid549168.log
>>
>> I strongly suspect that this is happening because of StackWalker's
>> use of StackWatermark which conflicts with the GC's own use of
>> StackWatermark. IOW, it asserts that the frame has been processed, but
>> the GC is still on it.
>>
>> Are we missing some coordination between StackWalker and the GC here?
>>
>> It can be reproduced using:
>> CONF=linux-x86_64-server-fastdebug make run-test
>> TEST=java/lang/StackWalker
>> TEST_VM_OPTS="-XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC
>> -XX:ShenandoahGCHeuristics=aggressive"
>>
>> Thanks,
>> Roman
>