[sctp-dev] SendFailedNotification and fragmentation

Wed Feb 10 10:09:29 PST 2010

Chris,

Before replying to your question, a remark about
http://tools.ietf.org/html/draft-ietf-tsvwg-sctpsocket-21#section-5.3.5
so that the reply does not become confusing and unproductive.
May I assume that when lksctp provides an SCTP_SEND_FAILED
notification as documented in 5.3.5, you still receive the
sctp_send_failed structure with the SCTP_SNDRCV information inside it
(even though the spec declares SCTP_SNDRCV deprecated for future use)?
And if so, can any reasoning based on the fields it contains be assumed
future-proof, or is there a risk that they will suddenly be replaced by
SCTP_SNDINFO or SCTP_RCVINFO, in which case only SCTP_RCVINFO would
provide compatible information?

In the same line of reasoning: if it is SCTP_SNDRCV that is present,
and since it contains different data depending on whether it is accessed
via sendmsg() or recvmsg(), may I also assume that from the SCTP API's
perspective, since we are talking about a notification, you get the
structure filled in as for recvmsg()?

I will tune my answers to your questions within the limits of what is
available, taking into account that there is an opaque 'context' field
available in those structures.
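
A rough sketch, in plain Java, of how such an opaque context value could
serve as a lookup key back to the complete message. It assumes that the
context set at send time comes back unchanged in the send-failed
notification. PendingSend and ContextRegistry are names made up for this
illustration; they are not part of lksctp or of the Java SCTP API.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical send-side record: the whole message plus what send() was told. */
final class PendingSend {
    final int streamNumber;
    final int payloadProtocolId;
    final ByteBuffer wholeMessage;   // complete, unfragmented payload

    PendingSend(int streamNumber, int payloadProtocolId, ByteBuffer wholeMessage) {
        this.streamNumber = streamNumber;
        this.payloadProtocolId = payloadProtocolId;
        this.wholeMessage = wholeMessage.asReadOnlyBuffer();
    }
}

/** Hypothetical registry keyed by the opaque context value (an assumption). */
final class ContextRegistry {
    private final AtomicInteger nextContext = new AtomicInteger(1);
    private final Map<Integer, PendingSend> pending = new ConcurrentHashMap<>();

    /** Call before sending; every fragment of the message would carry this context. */
    int register(PendingSend message) {
        int context = nextContext.getAndIncrement();
        pending.put(context, message);
        return context;
    }

    /** Call from the send-failed handler; returns null if the context is unknown. */
    PendingSend recover(int context) {
        return pending.remove(context);
    }

    /** Number of messages still registered. */
    int size() {
        return pending.size();
    }
}
```

With something like this, a failure reported for any single fragment
would still lead back to the whole message and the stream it was meant
for, which is what is needed to bounce it back to the sender.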

I will then also be able to include SCTP_DATA_NOT_FRAG and the
ssf_flags values (SCTP_DATA_SENT and SCTP_DATA_UNSENT) in the discussion.

Finally, it would also come in handy to know whether at some point
lksctp provides you with feedback (through events or callbacks) about
delivery acknowledgment. The text in 5.3.5 mentions it, and since TCP
and SCTP are both protocols with an end-to-end delivery guarantee, they
have no choice but to keep data until it is acknowledged; lksctp may
therefore have a way for the SCTP API to tap into that. That is, by the
way, the missing link I have been looking for, the one that makes it
100% sure: if the API can turn that into a notification or similar, the
circle is closed and housekeeping can be performed.
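
A minimal sketch of that housekeeping in plain Java, assuming a delivery
acknowledgment can be surfaced per message and arrives in send order
within a stream (neither is something lksctp or the Java API is known to
offer today). UnackedStore and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical send-side store: keeps complete messages until acknowledged. */
final class UnackedStore {
    // One FIFO per stream, because the socket-level send buffer can hold
    // several messages for several streams at the same time.
    private final Map<Integer, Deque<byte[]>> byStream = new HashMap<>();

    /** Remember a complete message before it (or its fragments) is sent. */
    synchronized void retain(int stream, byte[] wholeMessage) {
        byStream.computeIfAbsent(stream, s -> new ArrayDeque<>())
                .addLast(wholeMessage.clone());
    }

    /** Housekeeping: the oldest retained message on this stream was acknowledged. */
    synchronized boolean acknowledged(int stream) {
        Deque<byte[]> queue = byStream.get(stream);
        if (queue == null || queue.isEmpty()) {
            return false;               // nothing retained for this stream
        }
        queue.removeFirst();
        if (queue.isEmpty()) {
            byStream.remove(stream);
        }
        return true;
    }

    /** Recovery: hand back, in send order, every unacknowledged message on a stream. */
    synchronized List<byte[]> drain(int stream) {
        Deque<byte[]> queue = byStream.remove(stream);
        return queue == null ? new ArrayList<>() : new ArrayList<>(queue);
    }

    /** Total number of messages still retained across all streams. */
    synchronized int pendingCount() {
        int total = 0;
        for (Deque<byte[]> queue : byStream.values()) {
            total += queue.size();
        }
        return total;
    }
}
```

The acknowledgment callback would call acknowledged(), and the
send-failed handler would call drain() to return whole messages to the
application instead of isolated fragments.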

-Danny.

Christopher Hegarty - Sun Microsystems Ireland wrote:
> Danny,
>
> I understand your comments and suggestions here, and I agree in 
> principle with what you are saying. When developing this API and 
> implementation we were greatly constrained by the limitations of the 
> native socket API extensions. Also, the fact that the native API is 
> not finalized and the various implementations support different 
> versions of the draft really caused headaches. Solaris is still on 
> version 10 of the draft API. For these reasons we did not put in any
> sophisticated error handling for send failed, or other exotic features.
>
> Looking at the latest socket extensions draft [1], it looks like they
> added support (since around version 18) similar to what you are
> suggesting (or at least part of it). I think if we could expose this
> information through the Java API it should greatly help your efforts,
> right? See SCTP_DATA_NOT_FRAG, SCTP_DATA_LAST_FRAG, etc.
>
> I don't believe that this draft version has been adopted by the
> mainstream implementations yet. At least not lksctp or Solaris. Have you
> had a chance to look at this?
>
> That being said, I think that returning the MessageInfo in the
> SendFailedNotification would be a useful addition. To confirm: you
> require information such as ppid, ordered, complete, ttl, right?
> The other members of MessageInfo are already exposed to the
> notification handler directly.
>
> -Chris.
>
> [1] 
> http://tools.ietf.org/html/draft-ietf-tsvwg-sctpsocket-21#section-5.3.5
>
> On 09/02/2010 17:46, Danny wrote:
>> Chris,
>>
>> As an intro I'd like to summarize, from other SCTP-related posts whose
>> content is partially relevant to the problem explained below (for
>> those reading only this post), that:
>> - SCTP_EXPLICIT_COMPLETE is supported at the SCTP API level, waiting for
>> lksctp to support SCTP_EXPLICIT_EOR, but a work-around can be used.
>> - the other fragmentation related socket options (SCTP_DISABLE_FRAGMENTS
>> and SCTP_FRAGMENT_INTERLEAVE) are already operational and functioning
>> since earlier lksctp versions
>> - it is confirmed that situations exist in which send() throws a
>> SocketException, namely when the call falls between SHUTDOWN processing
>> at lksctp level and the ShutdownNotification at application level
>>
>> Furthermore :
>> - if such a situation occurs after the first part of a non-blocking
>> send() has returned a non-zero value and before the data is actually
>> put on the wire, a SendFailedNotification occurs instead of a
>> SocketException being thrown
>> - the SendFailedNotification object does not carry the MessageInfo that
>> was provided to the send() method together with the buffer. It provides
>> address, association, stream, error and buffer, none of which lead to
>> the related MessageInfo.
>>
>> Statement :
>>
>> Given the above, and based on actual working code, I state that it is
>> possible to implement an asynchronous (non-blocking) class that offers
>> transparent fragmentation and that can guarantee 100% that no
>> message loss will occur between the moment an application calls a
>> send() method and the moment the related data is actually sent
>> (what wasn't sent is returned to the application = 100% recovery). There
>> is, however, one situation from which, as far as I can see, no
>> implementation that uses the fragmentation features of SCTP will be
>> able to recover, today or in the future, and for which it cannot
>> provide such a 100% guarantee. The result is data loss, unless the
>> interface is adapted somewhat.
>>
>> Problem:
>>
>> The specific case is when a SendFailedNotification must be handled. These
>> notifications occur extremely rarely: there must be data in the send
>> buffer when the socket suddenly becomes unfit for writing. In that case
>> one or more SendFailedNotifications can be triggered for the same
>> channel/stream combination, depending on how many non-blocking send()s
>> have been absorbed into the underlying send buffer because the buffer
>> was still sufficiently large to take the message(s) in. This situation
>> of multiple pending send messages equally arises if a Selector is
>> used, because the SelectionKey will keep getting selected as long as
>> there is free space in the buffer, unless the application cancels SEND
>> interest because it has nothing more to send.
>>
>> A robust implementation would need to recover inside the
>> SendFailedNotification handler. This is easy and straightforward if the
>> buffer contains a complete message (not fragmented), because one just
>> has to bounce it back to the sender by whatever means have been
>> implemented to do so. But whether the buffer is a complete message is
>> not known at SendFailedNotification time, because that information is
>> part of the MessageInfo, which is not provided. Even worse, if the
>> buffer is a fragment, and given that in a good implementation the
>> sender should be unaware of fragmentation, the implementation should
>> salvage the complete message, in order to conform to the SCTP contract
>> on message boundary guarantee, and then bounce that complete message,
>> not just that fragment, to the sender. This means that the
>> implementation must keep the complete message until the last part is
>> sent, because relying only on consecutive SendFailedNotifications
>> doesn't guarantee that the complete message can be recomposed: one
>> doesn't know on which fragment the first failure occurs, nor whether
>> the rest of the message is already in the send buffer. Moreover, the
>> last messages must be kept on a per-stream basis, because at the moment
>> the connection or the channel unexpectedly closes, the buffer can
>> contain messages for different streams (from an application viewpoint
>> the buffers are socket level, not stream level). And as if all this
>> isn't already enough, there could be more than one fragmented message
>> for the same stream in the buffer, which would imply that the
>> implementation must keep more than the last message per stream.
>>
>> Suggestion:
>>
>> All this brings me to say that, in my opinion and current state of
>> knowledge, there is a need for some local (send side) means that makes
>> it possible for code that receives SendFailedNotifications to access
>> the MessageInfo that was associated with the buffer in the send()
>> method, preferably with an extra field on the MessageInfo (in good
>> Java tradition, an Object) that the implementation can use to
>> associate data. This object must not be sent to the peer; it is a
>> local thing, in order to stay conformant with the SCTP specs and
>> compatible with other SCTP stacks.
>>
>> Unfortunately this is only half the solution. An implementation that
>> needs to keep messages until they are actually sent on the wire,
>> because until that moment a SendFailedNotification may still occur for
>> that message, needs some mechanism to be informed of the sending in
>> order to do the housekeeping of these kept messages. As far as I know
>> there is no mechanism that allows one to know this at the SCTP API
>> level (= one that includes the channel/stream number/MessageInfo).
>> A possible solution could be an overload of the send method taking an
>> extra argument to provide a notification handler (same type as the one
>> used for the receive() method), plus a new notification type (e.g.
>> SendReadyNotification or something). If the caller provides the
>> notification handler object to the send() method, then the handler is
>> called with the SendReadyNotification containing the MessageInfo
>> (already the second parameter of the send() method at this time). If
>> the caller does not provide the notification handler, then everything
>> stays as it is today.
>>
>> I would like feedback on this because I am under the impression that
>> this is going to be a generic problem, and therefore a generic need,
>> for the SCTP interface if it is going to be used with the SCTP
>> fragmentation features as intended, that is, to avoid application level
>> (in message or higher protocol) involvement. I think there may be
>> consensus that SCTP fragmentation was intended to perform fragmentation
>> without the need to know what is in the data buffers (audio, video,
>> text, etc.) and without the need to understand how to recombine it.
>>
>> Thanks,
>> Danny
>
>