[sctp-dev] Unexpected SocketException due to latency of SCTP ShutdownNotification

Danny danny at tower.telenet.be
Sat Feb 6 02:50:01 PST 2010


Hi,

In the process of using the SCTP client in combination with a Selector, 
to make it asynchronous, I received exceptions from the socket layer when 
calling the SCTP client send() method.
This problem didn't occur during standard debugging but only during stress 
testing, and was therefore somewhat harder to research, but I found the 
cause and would suggest an addition to the documentation of the send() 
method to address it.

The exception thrown was a generic "SocketException", which is due to a 
latency problem between the activity of the underlying SCTP stack (in my 
case lksctp on Fedora) and the propagation of notifications to the 
application level. The SocketException should be added explicitly to the 
send() documentation because it is unexpected: one has the impression 
that all socket-related exceptions are caught by the SCTP API and that 
the application is abstracted from them.

This is how it happens:

Under heavy stress, outbound SCTP client connections to an SCTP server 
are set up, resulting in inbound SCTP client connections that the server 
accepts using a Selector. The test is done with only the local host 
address.
The problem is consistent and, now that I know what it is, I can 
reproduce it with a single connection; no stress is needed.

Once the connection is set up, data can be exchanged. The test randomly 
decides whether it is the inbound or the outbound client that will call 
SCTP shutdown() to disconnect. The channel, however, is NOT closed with 
close(), so that new connections can be made using the same channel. The 
result is as expected: the one that calls shutdown() receives the 
AssociationChangeNotification(SHUTDOWN) and the other peer receives the 
ShutdownNotification. The problem always occurs on the client that 
receives the ShutdownNotification, if it calls send() in the window 
after the lksctp lib has already acted upon the SHUTDOWN message by 
disabling the socket for sending and keeping it open only for receiving 
(graceful shutdown), but before the ShutdownNotification reaches the 
application level and tells the application that it may no longer send.
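
For illustration, the application side of that window can be sketched 
with a notification handler that only records the shutdown once the 
notification is actually delivered. This is a minimal sketch, not my 
real code: the class name ShutdownFlagHandler and the field peerShutdown 
are hypothetical, and it assumes the handler is the one passed to 
receive() on the channel.

```java
import com.sun.nio.sctp.AbstractNotificationHandler;
import com.sun.nio.sctp.HandlerResult;
import com.sun.nio.sctp.ShutdownNotification;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical handler; the class and field names are illustrative only.
final class ShutdownFlagHandler extends AbstractNotificationHandler<Void> {
    // Becomes true only once the ShutdownNotification has reached the
    // application. Until then the sending code has no way to know that
    // the stack has already disabled the socket for sending.
    final AtomicBoolean peerShutdown = new AtomicBoolean(false);

    @Override
    public HandlerResult handleNotification(ShutdownNotification notification,
                                            Void attachment) {
        peerShutdown.set(true);
        // Return from receive() so the caller can react right away.
        return HandlerResult.RETURN;
    }
}
```

Checking peerShutdown before every send() narrows the window but cannot 
close it, because the flag is only set after the notification has been 
received by the application.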

At first one would think that this is not possible because, when using a 
Selector, the Selector would never trigger SEND readiness for the 
application, since it is in sync with the underlying provider. 
And indeed, there is no problem there. However, a Selector-based 
implementation that has nothing to send disables its interest in 
SEND triggers from the Selector, in order to avoid continuous SEND 
triggers while there is nothing to send. It re-enables that interest 
only when the next message becomes available and is sent, so that it 
receives a SEND trigger once the message has effectively been sent; at 
that point it can check whether it has more data to send or should 
disable its interest in SEND triggers again.
I think we are all more or less forced into that scenario when using 
Selectors in a CPU-efficient way.
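
The interest toggling described above could look roughly like this. It 
is a minimal sketch using the standard SelectionKey API, where the SEND 
trigger corresponds to OP_WRITE; the helper class WriteInterest and its 
method names are hypothetical.

```java
import java.nio.channels.SelectionKey;

// Hypothetical helper; names are illustrative only.
final class WriteInterest {
    private WriteInterest() {}

    // Ask the Selector to report write readiness for this key again,
    // e.g. right after a new message has been queued and sent.
    static void enable(SelectionKey key) {
        key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
    }

    // Stop write-readiness events while there is nothing to send, so
    // that select() does not return continuously.
    static void disable(SelectionKey key) {
        key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
    }

    // True if the key currently asks for write-readiness events.
    static boolean isEnabled(SelectionKey key) {
        return (key.interestOps() & SelectionKey.OP_WRITE) != 0;
    }
}
```

While the interest is disabled, the first send() of a new message is 
issued by application code and not in response to the Selector.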

As a consequence, once in a while send() is called not in response to a 
SEND trigger from the Selector but by class code, in order to restart 
the sending cycle. Therefore it becomes possible that send() is called 
between the moment the lksctp lib acts upon an SCTP SHUTDOWN message and 
the arrival of the ShutdownNotification at the application level, where 
action should be taken to stop the application from sending further data.

Because, from where I stand and after having had a look inside the 
lksctp source code, there is no way to avoid this, it would be a good 
idea to address it by documenting, or at least mentioning, the 
possibility of a SocketException in the list of exceptions thrown by 
the send() method. In my application I now simply catch that exception 
specifically and give it the appropriate treatment.
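
A minimal sketch of such a catch, under stated assumptions: the Sender 
interface is a hypothetical stand-in for the real call (for example 
buf -> channel.send(buf, messageInfo) on an SctpChannel), and treating 
every SocketException from send() as "peer already shut down" is my 
assumption for this scenario, not something the API guarantees.

```java
import java.io.IOException;
import java.net.SocketException;
import java.nio.ByteBuffer;

// Hypothetical wrapper; names are illustrative only.
final class GuardedSend {
    // Stand-in for the real send call, e.g.
    //   buf -> sctpChannel.send(buf, messageInfo)
    interface Sender {
        int send(ByteBuffer buf) throws IOException;
    }

    // Sentinel returned when send() failed with a SocketException.
    static final int PEER_SHUT_DOWN = -1;

    private GuardedSend() {}

    // Returns the number of bytes sent, or PEER_SHUT_DOWN if the socket
    // layer rejected the send. Other IOExceptions still propagate.
    static int trySend(Sender sender, ByteBuffer buf) throws IOException {
        try {
            return sender.send(buf);
        } catch (SocketException e) {
            // Assumption: the stack already processed the peer's SHUTDOWN
            // but the ShutdownNotification has not been delivered yet.
            return PEER_SHUT_DOWN;
        }
    }
}
```

On PEER_SHUT_DOWN the caller can stop the sending cycle and keep 
receiving until the ShutdownNotification arrives.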


Greetz,
Danny

