adding rsockets support into JDK

Chris Hegarty chris.hegarty at oracle.com
Tue Dec 11 19:30:45 UTC 2018


Lucy,

Sure, the small test scenarios can be modified to make them "work". The
bigger question is how the proposed JDK-RDMA implementation code can be
modified to provide the semantics of the `SocketChannel` API.

Issue #1: rread returns EAGAIN. One possible solution is for the
blocking code path in RdmaSocketChannelImpl to fall back to a
blocking rpoll for POLLIN if the IOUtil.read invocation returns
IOStatus.UNAVAILABLE. I think this should work, and should not have much
of a negative impact, since the fallback will only occur infrequently.
  Q: can rwrite return EAGAIN? I have not checked yet.
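
A minimal sketch of the fallback described for Issue #1, written against a
hypothetical `RsocketOps` interface rather than the real sun.nio.ch
internals. The names `RsocketOps`, `awaitReadable`, and `BlockingReadSketch`
are assumptions made for illustration only; the wait is assumed to be an
rpoll for readability (POLLIN), since the operation being retried is a read.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

/** Hypothetical stand-in for the native rsocket calls; not a JDK API. */
interface RsocketOps {
    /** Assumed sentinel with the meaning of IOStatus.UNAVAILABLE (EAGAIN). */
    int UNAVAILABLE = -2;

    /** Attempts one read; returns bytes read, -1 at EOF, or UNAVAILABLE. */
    int read(ByteBuffer dst) throws IOException;

    /** Blocks until the socket is readable (assumed to wrap a blocking rpoll). */
    void awaitReadable() throws IOException;
}

final class BlockingReadSketch {
    private BlockingReadSketch() { }

    /**
     * Blocking read: if a read attempt reports UNAVAILABLE, wait for
     * readability and try again, so the caller never sees UNAVAILABLE.
     */
    static int blockingRead(RsocketOps ops, ByteBuffer dst) throws IOException {
        for (;;) {
            int n = ops.read(dst);
            if (n != RsocketOps.UNAVAILABLE) {
                return n; // data, or -1 at end of stream
            }
            ops.awaitReadable(); // the infrequent fallback path
        }
    }
}
```

The loop only reaches the wait when a read attempt comes back empty, which
matches the expectation that the fallback is rare. A real implementation
would also need to handle interruption and channel close while waiting.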

Issue #2: This issue is likely to be encountered mainly during testing,
since a non-blocking connect followed by an accept, on the same thread,
is not all that common in non-test code. That said, the semantics of the
SocketChannel API would lead one to expect it to work. (I get that
rsocket is not asynchronous, but the semantics of non-blocking channels
imply some asynchronicity.) I wonder if the JDK-RDMA implementation
should have a dedicated thread that "pulls" on unfinished non-blocking
connects that are not subsequently registered with a Selector? Maybe
accepts too? I'm not sure yet.
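
To make the dedicated-thread idea a little more concrete, here is a rough
sketch of what such a helper might look like. Everything in it is
hypothetical: `PendingConnect`, `tryFinish`, and `ConnectProgressThread` are
illustrative names, and `tryFinish` is assumed to wrap whatever native call
(for example a short rpoll) advances the rsocket handshake. It ignores
Selector registration and accepts entirely.

```java
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical handle for a non-blocking connect that has not finished yet. */
interface PendingConnect {
    /** Makes one attempt to advance the handshake; returns true once connected. */
    boolean tryFinish();
}

/** Daemon thread that keeps driving unfinished non-blocking connects. */
final class ConnectProgressThread {
    private final Queue<PendingConnect> pending = new ConcurrentLinkedQueue<>();
    private final Thread worker;
    private volatile boolean running = true;

    ConnectProgressThread() {
        worker = new Thread(this::run, "rdma-connect-progress");
        worker.setDaemon(true);
        worker.start();
    }

    /** Called after a non-blocking connect returns without completing. */
    void register(PendingConnect c) {
        pending.add(c);
    }

    void shutdown() throws InterruptedException {
        running = false;
        worker.interrupt();
        worker.join();
    }

    private void run() {
        while (running) {
            // One pass over everything still pending; drop what has finished.
            for (Iterator<PendingConnect> it = pending.iterator(); it.hasNext(); ) {
                if (it.next().tryFinish()) {
                    it.remove();
                }
            }
            try {
                Thread.sleep(1); // crude pacing; a real version would block in a poll
            } catch (InterruptedException e) {
                return;
            }
        }
    }
}
```

With something like this in place, the thread that issued the connect could
go on to call accept while the helper thread makes progress on the
handshake. Whether one thread per process is acceptable, and how it would
hand over to a Selector, are open questions.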

-Chris.

> On 11 Dec 2018, at 00:36, Lu, Yingqi <yingqi.lu at intel.com> wrote:
> 
> Hi Alan/Chris,
> 
> I was able to confirm that connecting on a non-blocking socket causes issues. It happens whether connect/accept occur in the same thread or in different threads of the same process.
> 
> Then, I did a small tweak in Chris's sample application by spawning a thread doing rpoll on the connection_fd. Now, the connect/accept works in both of the cases above. Please let me know if this is a valid workaround for the issue. 
> 
> Performance-wise, this workaround should not impact send/receive at all. It might add a small overhead to the connection setup phase, and only for non-blocking RDMA sockets.
> 
> The modified app code is available at:
> 
> For connect/accept occurring in the same thread: https://cr.openjdk.java.net/~ylu/testNonBlocking_raccept_modified.c
> 
> For connect/accept occurring in two different threads: https://cr.openjdk.java.net/~ylu/testNonBlocking_raccept_modified_2threads.c
> 
> Thanks,
> Lucy
> 
>> -----Original Message-----
>> From: Alan Bateman [mailto:Alan.Bateman at oracle.com]
>> Sent: Saturday, December 8, 2018 8:10 AM
>> To: Chris Hegarty <chris.hegarty at oracle.com>; Lu, Yingqi
>> <yingqi.lu at intel.com>; nio-dev at openjdk.java.net
>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Aundhe,
>> Shirish <shirish.aundhe at intel.com>; Kaczmarek, Eric
>> <eric.kaczmarek at intel.com>
>> Subject: Re: adding rsockets support into JDK
>> 
>> On 08/12/2018 09:39, Chris Hegarty wrote:
>>> :
>>> 
>>> - It has become apparent that mixing blocking and non-blocking
>>>   connect/accept operations, in the same thread, may cause issues. For
>>>   example, attempting to set up a connected socket on the same host by
>>>   issuing a non-blocking connect followed by a blocking accept will
>>>   just hang and not make progress [3]. Upon further enquiries it
>>>   appears that the programming model for rsocket is subtly different
>>>   from that of the regular Berkeley sockets (at least for the
>>>   connection handshake). It is not immediately clear how to
>>>   reasonably work around this issue (it's not a bug in rdma-core, but
>>>   more a fundamental part of its thread-less operation).
>>> 
>> Would it be possible to expand on this to say whether the same issue arises
>> when the non-blocking connect is initiated on a different thread, or in a
>> different process, or even on a different machine on the fabric?
>> That is, if the socket is non-blocking and I do an rconnect and then delay before
>> doing anything else on the socket, will the peer doing the accept be
>> blocked/hung in the meantime?
>> 
>> -Alan


