[sctp-dev] Unexpected SocketException due to latency of SCTP ShutdownNotification
Danny
danny at tower.telenet.be
Sat Feb 6 02:50:01 PST 2010
Hi,
While using the SCTP client in combination with a Selector to make it
asynchronous, I received exceptions from the socket layer when calling
the SCTP client send() method.
The problem did not occur during normal debugging, only during stress
testing, which made it harder to research. I have found the cause and
would suggest an addition to the documentation of the send() method to
address it.
The exception thrown was a generic "SocketException". It is caused by a
delay between the activity of the underlying SCTP stack (in my case
lksctp on Fedora) and the propagation of notifications to the
application level. SocketException should be listed explicitly in the
send() documentation because it is unexpected: one has the impression
that all socket-related exceptions are caught by the SCTP API and that
the application is abstracted from them.
This is how it happens:
Under heavy stress, outbound SCTP client connections are set up to an
SCTP server, resulting in inbound SCTP client connections that the
server accepts using a Selector. The test is done with only the local
host address.
The problem is consistent, and now that I know what it is I can
reproduce it with a single connection; no stress is needed.
Once the connection is set up, data can be exchanged. The test randomly
decides whether the inbound or the outbound client calls SCTP
shutdown() to disconnect. The channel, however, is NOT closed, so that
new connections can be made using the same channel. The result is as
expected: the side that calls shutdown() receives the
AssociationChangeNotification(SHUTDOWN) and the other peer receives the
ShutdownNotification. The problem always occurs on the client that
receives the ShutdownNotification, if it calls send() in the window
after the lksctp library has already acted on the SHUTDOWN message (by
disabling the socket for sending and keeping it open only for
receiving, i.e. graceful shutdown) but before the ShutdownNotification
reaches the application level and tells the application that it may no
longer send.
At first one would think that this is not possible: with a Selector,
the Selector would never trigger SEND to allow the application to send,
because it is in sync with the underlying provider. And indeed, there
is no problem there. However, a Selector-based implementation that has
nothing to send disables its interest in SEND triggers, in order to
avoid continuous SEND triggers from the Selector while there is nothing
to send. It only re-enables the interest when the next message becomes
available and is sent, so that it receives a SEND trigger once the
message has effectively been sent. At that point it can check whether
it has more data to send or should disable its interest in SEND
triggers again.
I think we are all more or less forced into that scenario when using
Selectors in a CPU-efficient way.
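
A minimal sketch of the interest toggling described above, assuming
that the "SEND trigger" corresponds to SelectionKey.OP_WRITE in
java.nio. The class name WriteInterest and its methods are hypothetical
helpers, and a Pipe sink channel stands in for the SCTP channel so the
example runs without an SCTP stack:

```java
import java.io.IOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical helper, not part of the SCTP API.
public final class WriteInterest {
    private WriteInterest() {}

    // Ask the Selector to report write readiness for this key.
    public static void enable(SelectionKey key) {
        key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
    }

    // Stop write-readiness events while there is nothing to send.
    public static void disable(SelectionKey key) {
        key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
    }

    public static boolean isEnabled(SelectionKey key) {
        return (key.interestOps() & SelectionKey.OP_WRITE) != 0;
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            pipe.sink().configureBlocking(false);
            // Register with no interest: nothing queued for sending yet.
            SelectionKey key = pipe.sink().register(selector, 0);

            enable(key);   // a message became available
            System.out.println("after enable: " + isEnabled(key));
            disable(key);  // queue drained again
            System.out.println("after disable: " + isEnabled(key));

            pipe.sink().close();
            pipe.source().close();
        }
    }
}
```

With interest disabled, the next send() is necessarily started by
application code rather than by the Selector, which is the situation
discussed next.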
As a consequence, once in a while the send() method is called not in
response to a SEND trigger from the Selector but by class code, in
order to restart the sending cycle. It therefore becomes possible that
send() is called between the moment the lksctp library acts on an SCTP
SHUTDOWN message and the arrival of the ShutdownNotification at the
application level, where action should be taken to stop the application
from sending further data.
From where I stand, and after having a look inside the lksctp source
code, there is no way to avoid this. It would therefore be a good idea
to address it in the documentation, or at least to mention the
possibility of a SocketException in the list of exceptions thrown by
the send() method. In my application I now simply catch that exception
specifically and give it the appropriate treatment.
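
A minimal sketch of catching the exception specifically, under stated
assumptions: GuardedSend, Sender and the -1 return value are
hypothetical names and conventions, not part of the SCTP API. The
Sender interface stands in for the channel's send() call so the example
runs without an SCTP stack; in real code the lambda would presumably be
something like buf -> channel.send(buf, messageInfo). Treating every
SocketException from send() as "peer shutdown in progress" is also an
assumption, since other causes are possible:

```java
import java.io.IOException;
import java.net.SocketException;
import java.nio.ByteBuffer;

// Hypothetical wrapper, not part of the SCTP API.
public final class GuardedSend {
    private GuardedSend() {}

    // Stand-in for the channel's send() call.
    @FunctionalInterface
    public interface Sender {
        int send(ByteBuffer data) throws IOException;
    }

    // Returns the number of bytes sent, or -1 if send() was rejected
    // with a SocketException. Other IOExceptions still propagate.
    public static int sendOrDetectShutdown(Sender sender, ByteBuffer data)
            throws IOException {
        try {
            return sender.send(data);
        } catch (SocketException e) {
            // Assumption: the stack already disabled sending, but the
            // ShutdownNotification has not reached the application yet.
            return -1;
        }
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer msg = ByteBuffer.wrap(new byte[] {1, 2, 3});

        // Normal case: the stand-in sender accepts the whole buffer.
        int sent = sendOrDetectShutdown(ByteBuffer::remaining, msg);
        System.out.println("sent: " + sent);

        // Race case: the stand-in sender rejects the call.
        int rejected = sendOrDetectShutdown(
                b -> { throw new SocketException("send disabled"); }, msg);
        System.out.println("rejected: " + rejected);
    }
}
```

On a -1 result the caller can stop queuing data and wait for the
ShutdownNotification instead of treating the failure as fatal.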
Greetz,
Danny