RFR: 8267079: Support async handshakes that can be executed by a remote thread

Thu May 20 06:49:50 UTC 2021

On 20/05/2021 3:44 pm, Man Cao wrote:
> On Thu, 20 May 2021 02:50:07 GMT, David Holmes <david.holmes at oracle.com> wrote:
> 
>> I thought we had to preserve the order of handshake operations? (In the
>> same way the safepoint operations were previously well ordered.) If that
>> is not the case ... there might be some subtle interactions there that
>> might lead to very hard to diagnose bugs.
> 
> I'm not sure about this. Currently without this change, if there are both async self-executed ops and synchronous non-self executable ops on the queue, it would not preserve the execution order.
> I think non-self executable ops and self-executed ops shouldn't depend on each other. It would be a misuse of handshakes if that happens.

Hmmm. I had a mental model where order of ops was preserved. Introducing 
non-determinism in the order in which ops are executed seems potentially 
fragile - and a new mode of operation compared to safepoint VM 
operations. That said, taking suspension as an example, if the target 
thread is off in native when suspended, and so could not process the 
async handshake op for suspension yet, then we would still want a 
synchronous handshake op to dump its stack to work.

But I'm not at all convinced that there will never be any ordering 
dependencies. Maybe it is a misuse, but how hard will it be to spot 
this misuse, or debug it? (rhetorical question)

>> As I don't know anything about the epoch sync protocol I don't really
>> understand the requirements here. If you are prepared to have some
>> threads defer execution of the async handshake indefinitely (because
>> they aren't blocked) then why do you need to ensure you update the
>> counter for other threads, rather than have them do it themselves when
>> they are able to execute the async handshake op?
> 
> The epoch sync protocol only needs the target thread to execute a memory fence, or become blocked. The purpose is to flush out all potential stores to Java heap, or establish a release-acquire edge from the target thread to requesting thread. It is essentially an asymmetric Dekker synchronization (see [this article](https://blogs.oracle.com/dave/qpi-quiescence)). The role of handshake is like a "membarrier" Linux syscall on the target thread. This is why the actual handshake op is a no-op, and the arm-the-poll-only approach as Robbin suggested is superior.
> 
> For this question, it is because:
> - We want to minimize the number of deferred ops, because deferring means more work during the later GC pause. In fact, with a timeout of 2-3 milliseconds, it is extremely rare to have deferred ops already.

If the deferred op is a no-op then I'm not sure how this creates work 
for a later GC pause. Assume all the Java threads are executing in 
native and stay there for a long time - why should that impact the GC's 
work?

> - A blocked thread may not come back to in_Java for a long time, so it will not execute any self-executed async op. If we don't handle them, most ops will become deferred if there's a blocked thread. In a realistic large server, it is common to have hundreds or thousands of Java threads, and a large portion of them are blocked on Object.wait() and rarely become running.

So taking an extreme example, if a thread is blocked for a few minutes 
(or equivalently, but less likely, is in native) then you are concerned 
that many of these epoch-sync async ops will accumulate, and that could 
cause memory pressure and slow down the thread's return to Java. I can 
see that is a concern. But the first thought I had in relation to this 
problem was that perhaps we need to introduce the notion of coalescable 
operations: if all epoch-sync operations are equivalent then you only 
need at most one to get enqueued. Of course then we have to scan the 
queue for an existing occurrence. But that seems a more general solution 
to unbounded deferred operations than introducing a way to "skip" 
blocked threads.
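
To make the coalescing idea concrete, here is a minimal single-threaded
sketch in C++. It is not HotSpot code: OpKind, AsyncOp and
HandshakeQueue are hypothetical names invented for illustration, and a
real per-thread handshake queue would need proper synchronization. It
only shows the "scan for an existing occurrence" step for operations
that are declared equivalent to one another.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>

// Hypothetical operation kinds; these are NOT HotSpot's actual types.
enum class OpKind { EpochSync, Suspend, StackDump };

struct AsyncOp {
  OpKind kind;
  bool coalescable;  // true if any two ops of this kind are equivalent
};

// Illustrative per-thread queue. Single-threaded sketch only: a real
// queue would be accessed by the requester and the target concurrently.
class HandshakeQueue {
 public:
  // Returns true if the op was enqueued, false if it was coalesced
  // into an equivalent op that is already pending.
  bool enqueue(const AsyncOp& op) {
    if (op.coalescable) {
      // Linear scan for an existing occurrence of the same kind.
      bool pending = std::any_of(
          ops_.begin(), ops_.end(),
          [&op](const AsyncOp& queued) { return queued.kind == op.kind; });
      if (pending) {
        return false;
      }
    }
    ops_.push_back(op);
    return true;
  }

  std::size_t size() const { return ops_.size(); }

 private:
  std::deque<AsyncOp> ops_;
};
```

With this shape, a thread that stays blocked for minutes accumulates at
most one pending epoch-sync op, at the cost of a linear scan on each
enqueue. Non-coalescable ops are always appended, so their relative
order is unchanged.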

I think it is important to flesh out the requirements here to ensure 
we're making strategic design decisions about the overall architecture 
of the handshake mechanism, rather than just trying to tweak the 
mechanism to support a specific use case. So sorry for the delay this 
adds, but I think the discussions are important.

Thanks,
David

> -------------
> 
> PR: https://git.openjdk.java.net/jdk/pull/4005
>