RFR: 8266557: assert(SafepointMechanism::local_poll_armed(_handshakee)) failed: Must be [v3]

Robbin Ehn rehn at openjdk.java.net
Wed Jun 2 09:55:48 UTC 2021


On Tue, 1 Jun 2021 20:37:32 GMT, Daniel D. Daugherty <dcubed at openjdk.org> wrote:

>> src/hotspot/share/runtime/handshake.cpp line 330:
>> 
>>> 328:   // closure are visible to the VMThread/Handshaker after it reads
>>> 329:   // that the operation has completed.
>>> 330:   Atomic::dec(&_pending_threads);
>> 
>> Why do we need a full fence instead of a release as before?
>
> And does the comment on L326 now need to be updated?

We have three options here:
1: "If the handshake operation is not completed it must be on at least one queue."
2: "If the handshake operation is completed it is not on any queue."
3: "It may be on queue while completed and may not be queue when not completed, it depends."

With the polling mechanism saying you must stop while the queue is not empty, the operation is removed after it is executed, option 1 seemed preferable to me.
Specially if we want a simpler easier model to avoid any further such bugs as this.

This means removal from queue happens after decrement of _pending_threads.
remove_op() -> pop() gives no ordering guarantee, those comes from the mutex.

The technical note on the current implementation of the queue is:
Only first can be read without lock and the implementation of first manipulation is done with a CAS.
So a side-effect of current implementation is that the only thing that can be seen will be ordered.

Since we running in slowpath there is no need to reduce ordering to a minimum and we can thus change the queue implementation since it have no guarantees to do any ordering without worrying about someone (ab)using side-effects.

Do you agree?

-------------

PR: https://git.openjdk.java.net/jdk/pull/3973


More information about the hotspot-runtime-dev mailing list