RFR: 8266557: assert(SafepointMechanism::local_poll_armed(_handshakee)) failed: Must be [v3]

Robbin Ehn rehn at openjdk.java.net
Thu Jun 3 07:49:38 UTC 2021


On Wed, 2 Jun 2021 18:10:51 GMT, Patricio Chilano Mateo <pchilanomate at openjdk.org> wrote:

>> We have three options here:
>> 1: "If the handshake operation is not completed it must be on at least one queue."
>> 2: "If the handshake operation is completed it is not on any queue."
>> 3: "It may be on queue while completed and may not be queue when not completed, it depends."
>> 
>> With the polling mechanism saying you must stop while the queue is not empty, the operation is removed after it is executed, option 1 seemed preferable to me.
>> Specially if we want a simpler easier model to avoid any further such bugs as this.
>> 
>> This means removal from queue happens after decrement of _pending_threads.
>> remove_op() -> pop() gives no ordering guarantee, those comes from the mutex.
>> 
>> The technical note on the current implementation of the queue is:
>> Only first can be read without lock and the implementation of first manipulation is done with a CAS.
>> So a side-effect of current implementation is that the only thing that can be seen will be ordered.
>> 
>> Since we running in slowpath there is no need to reduce ordering to a minimum and we can thus change the queue implementation since it have no guarantees to do any ordering without worrying about someone (ab)using side-effects.
>> 
>> Do you agree?
>
> Right, but I don't see why the change of just using the queue to check for pending handshakes matters here. This release was to ensure that if the handshakee executed the closure, the memory operations executed in the handshake would be visible to the vmthread/handshaker after seeing the operation was completed (matched with the acquired after is_completed()). What could be the issue if this is not a full fence?

"Change of just using the queue to check for pending handshakes": also means we must remove the handshake from the queue after it have been executed.
I consider the decrement as the marking of "it have been executed".

Therefore anything after the decrement, should stay after and not float above the dec, to follow the protocol.
If disregard ordering side-effects when removing the last in queue with CAS, if the removal happens before the decrement means if the handshakee ask "should process" it could say no before we decrement.

This release to the removal of the operation is match with the acquire on is_empty().
Normally the removal would had used the release from the mutex lock, but here we hold the lock over several operations.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3973


More information about the hotspot-runtime-dev mailing list