RFR: 8264200: java/nio/channels/DatagramChannel/SRTest.java fails intermittently [v2]

Tue Apr 6 17:50:28 UTC 2021

On Tue, 6 Apr 2021 17:20:56 GMT, Conor Cleary <ccleary at openjdk.org> wrote:

>> I note in the senders (ClassicWriter and NioWriter) there are two send invocations while in the receivers there is only one receive invocation. That raises the question of why there are two send invocations, and which of these a receiver processes: is there no way of telling whether the first has been lost?
>> 
>> now a bit of conjecture:
>> In theory, based on the structure of the test, the receiver or reader thread could finish before the writer has executed the second send. So if the receiver has finished, executed close and released its port, making it available for re-allocation in another concurrently executing test, then the second send could be a stray send to another, now unrelated, UDP endpoint.
> 
> @msheppar it seems to me that the duplicate send is a feature from the old test which may no longer be needed. In the old test, the `invoke()` method contained the following code:
> 
> static void invoke(Sprintable reader, Sprintable writer) throws Exception {
>         Thread readerThread = new Thread(reader);
>         readerThread.start();
>         Thread.sleep(50);
> 
>         Thread writerThread = new Thread(writer);
>         writerThread.start();
>         ...
> This `Thread.sleep(50)` is, I'm guessing, there to ensure the readerThread has fully started and is waiting to receive before the writer thread starts. Following on from this, both writer classes (ClassicWriter & NioWriter) contain something like...
>        dc.send(bb, isa);
>        Thread.sleep(50);
>        dc.send(bb, isa);
>  
> I would subsequently guess that this serves to be extra sure that a datagram reaches the reader. Assuming the first packet doesn't make it (probably unlikely), waiting for 50 milliseconds ensures that the reader is at the very least waiting to receive.

Yes, the original test had this conundrum.
But for the refactored test there still exists the possibility that the second send could be received by another test process, because the test's receiver may have completed and released its socket resources before the second send is invoked. In your refactored test that possibility is diminished, as you start the writer thread prior to the reader thread, whereas in the original test the reader thread was started first and the writer next. But as with all multithreaded scenarios there is a strong element of nondeterminism, so the possibility still remains.

As it stands, there is no synchronization between the sender and the receiver in the test, other than that the receiver may block indefinitely if a datagram is not received, a risk that is now diminished by using the loopback address as both the receiver's bind address and the destination address. But the rationale for invoking two sends and one receive is obscure, and it remains a potential, if somewhat rare, problem.

So I'd proffer some symmetry between the sender and receiver, with one send and one receive. Alternatively, the receiver should remain extant until the sender has terminated; that is, it would wait on a "signal" from the sender that it has finished.
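
To make the suggestion concrete, here is a minimal sketch of the "one send, one receive, receiver waits on a signal" idea. It is not taken from SRTest.java: the class name `SignalledReceiverSketch`, the method `exchange`, the 10 second timeout and the use of a `CountDownLatch` as the signal are all assumptions made for illustration, and a real fix might coordinate the threads differently.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the code in SRTest.java.
class SignalledReceiverSketch {

    // Sends msg once over loopback and returns what the receiver read.
    static String exchange(String msg) throws Exception {
        // The "signal": counted down by the sender when it has finished.
        CountDownLatch senderDone = new CountDownLatch(1);
        InetAddress loopback = InetAddress.getLoopbackAddress();

        try (DatagramChannel reader = DatagramChannel.open()) {
            // Bind the receiver before the sender starts, so no sleep is needed.
            reader.bind(new InetSocketAddress(loopback, 0));
            InetSocketAddress dest = (InetSocketAddress) reader.getLocalAddress();

            Thread writer = new Thread(() -> {
                try (DatagramChannel dc = DatagramChannel.open()) {
                    // A single send, matching the single receive below.
                    dc.send(ByteBuffer.wrap(msg.getBytes(StandardCharsets.UTF_8)), dest);
                } catch (Exception e) {
                    e.printStackTrace();
                } finally {
                    senderDone.countDown();
                }
            });
            writer.start();

            // Blocks until a datagram arrives (assumed reliable over loopback).
            ByteBuffer bb = ByteBuffer.allocate(256);
            reader.receive(bb);
            bb.flip();
            String received = StandardCharsets.UTF_8.decode(bb).toString();

            // Keep the receiver's port bound until the sender has signalled,
            // so no late send can reach a re-allocated port.
            if (!senderDone.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("sender did not finish in time");
            }
            writer.join();
            return received;
        } // the receiver's channel is closed only here, after the sender is done
    }

    public static void main(String[] args) throws Exception {
        System.out.println(exchange("hello"));
    }
}
```

With a single send the latch is arguably redundant, since the datagram has already been received by the time it is awaited. It matters if the second send is kept, as that is the case where the receiver could otherwise close first.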

-------------

PR: https://git.openjdk.java.net/jdk/pull/3354