RFR(XS): VM times out in VM_HandshakeAllThreads::doit() with RunThese30M
David Holmes
david.holmes at oracle.com
Wed Aug 15 10:40:58 UTC 2018
Hi Robbin,
On 15/08/2018 7:13 PM, Robbin Ehn wrote:
> Hi all, please review.
>
> The issue is that external suspend don't set the thread state to
> blocked. The safepoint code handles this internally but, as handshakes
> do, using e.g. safepoint_safe() this was not handled. External
> suspending a thread causes the handshake not to come through until you
> resume that thread (bad for ZGC).
>
> The proper fix (at least IMHO) to this is to implement suspend (and the
> other uses of suspend_flag) with handshakes instead, but that will
> require asynch handshakes and would require all platforms to implement
> handshakes (or they would suffer a safepoint). So if you read this have
> a platform which you do not have implemented handshakes for, please
> start looking into it.
>
> Below patch adds the unhandle case for handshakes.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8207334
>
> Passed tier 1-5 and I could no longer get prolonged handshakes due to
> suspend.
>
> Thanks, Robbin
>
> diff -r f3cf91d5373f src/hotspot/share/runtime/handshake.cpp
> --- a/src/hotspot/share/runtime/handshake.cpp Mon Aug 13 14:04:43
> 2018 +0200
> +++ b/src/hotspot/share/runtime/handshake.cpp Wed Aug 15 11:00:22
> 2018 +0200
> @@ -340,3 +340,8 @@
> bool HandshakeState::vmthread_can_process_handshake(JavaThread* target) {
> - return SafepointSynchronize::safepoint_safe(target,
> target->thread_state());
> + // We must block the S/R protocol to resume a thread while the vm thread
> + // is executing the operation, by holding the Threads_lock.
> + // This is problematic, since we now can't remove the Threads_lock in
> handshakes.
> + assert(Threads_lock->owned_by_self(), "Not holding Threads_lock.");
> + return SafepointSynchronize::safepoint_safe(target,
> target->thread_state()) ||
> + target->is_ext_suspended();
> }
The context in which this is executed is not at all clear so assessing
the suitability of the fix is also not clear. IIRC a thread can only be
resumed at a safepoint so if the above code executes in the VMThread
then it can't be racing with a resume and the Threads_lock holding seems
unnecessary wrt. this action. ??
Thanks,
David
More information about the hotspot-runtime-dev
mailing list