websockets

Simone Bordet simone.bordet at gmail.com
Mon Feb 26 11:15:52 UTC 2018


James,

On Mon, Feb 26, 2018 at 4:37 AM, James Roper <james at lightbend.com> wrote:
> On the topic of error handling. A high level API doesn't need to report each
> individual error with sending.

Depends on what you mean by "high level APIs"... I guess there is no
formal definition of that; one could even argue that the Java socket
APIs are high level.

> So firstly, it is impossible to report *all*
> errors with sending, since it's impossible to know, once you send a message
> across the network, whether it got there or not. So if an application
> developer has a requirement to handle errors in sending (and realistically,
> there is never a business requirement to handle just the errors that can be
> detected in sending, it's always about handling all errors), do we expect
> them to implement that logic in two places, both on the sending side to
> handle errors that can be detected on the sending side, and then
> additionally write logic to handle errors that can't be detected on the
> sending side (such as, for example, having the remote side send ACKs, and
> track the ACKs on the receiving end)? I doubt it, having to handle logic on
> both sides is going to mean the application developer has to spend more time
> implementing boiler plate that could otherwise be spent solving their actual
> business problem.

I think you are oversimplifying this.

If my business is to write a WebSocket Proxy, I may well want to
write code that handles errors differently depending on whether they
happened on the read side or on the write side, if only for logging
reasons.

If my business is to write a protocol on top of WebSocket, I may want
to use custom WebSocket error codes (depending on the error) when I
close the connection.
I may want to send a WebSocket message the moment I receive a
WebSocket "close" event from a peer.

Sure, maybe in 80% of the cases read and write errors are handled the
same way, but as an application writer I would like to have a choice
between an API that is "lower" level and gives me full flexibility
with respect to the WebSocket protocol, and an opinionated (correctly
opinionated in the majority of cases) framework that in some cases
reduces the API surface at the cost of something else, such as
flexibility, increased allocation, etc.
Both have their use cases.

I believe the JDK net group has always stated they went for the first choice.
The lack of the second choice is under discussion and, as I have
already stated, may be a matter of resources.

> If an application developer needs to handle errors in
> sending messages, it would be much simpler for them to treat both types of
> errors in the same way. And hence, a high level API should expose detectable
> errors in sending on the receiving end, by cancelling upstream, terminating
> the WebSocket with an error, and then emitting the error in the downstream
> onError.
>
> When it comes to protocol specific concerns - a high level API should not
> expose that to developers.  A developer should not be allowed to send a text
> message between two split binary frames. A developer should not be
> responsible for implementing the closing handshake protocol. That is the
> responsibility of a high level API. WebSocket close messages are designed
> such that high level APIs don't need to expose them to end developers - you
> have one status for success, and then many statuses for close. This means,
> an application developer can signal a normal close by simply terminating the
> stream, and can signal any of the other closes by terminating with an error
> that gets mapped to an error status code - likewise, closes received can be
> mapped in the same way.  And, most of the error codes are for protocol level
> errors anyway that a user should never be generating and can't really
> handle. So I don't think, if providing a high level API, it makes any sense
> to expose close messages or the close handshake to application developers -
> the WebSocket protocol by design is not meant to be used that way.

I don't understand what you're saying here.
You want to notify application code when the other peer closes the connection.
If you do, then you're exposing the application code to the WebSocket
close event, which is part of the WebSocket close handshake.
An application may want to use custom WebSocket error codes, reply
with a different error code, or send a last message with information
about what has been processed so far (similar to HTTP/2's last stream
id).
All that can only be provided by application code, so the API must
allow for that.
Again, if you want to write an opinionated framework that covers 80%
of the cases in a simpler way, great, but I'd like to have other
choices too.
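
To make that concrete, here is a rough sketch of a listener that
answers the peer's close with one last message before finishing the
close handshake (same caveats as above on the names; lastProcessedId
is a hypothetical application counter, and I'm assuming the output
stays open until sendClose() is called or the returned stage
completes):

import java.net.http.WebSocket;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: on the peer's close event, send a last message telling the
// peer how far we got, similar in spirit to HTTP/2's last stream id,
// then close with a code of our choosing.
public class GracefulCloseListener implements WebSocket.Listener {

    private final AtomicLong lastProcessedId = new AtomicLong(); // hypothetical

    @Override
    public CompletionStage<?> onText(WebSocket webSocket, CharSequence data, boolean last) {
        if (last) {
            lastProcessedId.incrementAndGet(); // hypothetical "processed" bookkeeping
        }
        webSocket.request(1);
        return null;
    }

    @Override
    public CompletionStage<?> onClose(WebSocket webSocket, int statusCode, String reason) {
        return webSocket
                .sendText("processed up to " + lastProcessedId.get(), true)
                .thenCompose(ws -> ws.sendClose(WebSocket.NORMAL_CLOSURE, "done"));
    }
}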

> For fragmented frames, once again, I don't think the designers of the
> WebSocket protocol ever intended that this protocol level detail would ever
> be exposed to application developers. My reading of the spec is that the
> reason fragmentation is allowed is to allow endpoints to work with fixed
> buffer sizes, allowing messages to be split if they choose. In fact, this is
> exactly what the RFC says:
>
>> The primary purpose of fragmentation is to allow sending a message
>> that is of unknown size when the message is started without having to
>> buffer that message.  If messages couldn't be fragmented, then an
>> endpoint would have to buffer the entire message so its length could
>> be counted before the first byte is sent.  With fragmentation, a
>> server or intermediary may choose a reasonable size buffer and, when
>> the buffer is full, write a fragment to the network.
>
>
> https://tools.ietf.org/search/rfc6455#section-5.4
>
> This is a protocol level detail, not something that is supposed to be
> exposed to application developers.

It again depends on what you mean by "application developers".
If I'm writing a WebSocket Proxy, receiving WebSocket frames rather
than whole messages is exactly what I want.
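
Roughly something like this (again a sketch with Listener-style names;
"upstream" is a hypothetical WebSocket connected to the other side of
the proxy):

import java.net.http.WebSocket;
import java.nio.ByteBuffer;
import java.util.concurrent.CompletionStage;

// Sketch of the proxy case: forward each part as it arrives,
// preserving fragmentation, instead of buffering whole messages.
public class ForwardingListener implements WebSocket.Listener {

    private final WebSocket upstream; // hypothetical connection to the other side

    public ForwardingListener(WebSocket upstream) {
        this.upstream = upstream;
    }

    @Override
    public CompletionStage<?> onText(WebSocket webSocket, CharSequence data, boolean last) {
        // Pass the fragment through with the same "last" flag: no
        // aggregation and no implementation-chosen message size limit.
        return upstream.sendText(data, last)
                       .thenRun(() -> webSocket.request(1));
    }

    @Override
    public CompletionStage<?> onBinary(WebSocket webSocket, ByteBuffer data, boolean last) {
        return upstream.sendBinary(data, last)
                       .thenRun(() -> webSocket.request(1));
    }
}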

> It is reasonable to expect that an
> implementation will buffer and put the fragments back together, up to a
> configured buffer size, for application consumption. If a message does
> exceed that buffer size, then the implementation can close the socket with
> the close code designed exactly for that purpose - 1009.

Our experience with Jetty and its JSR 356 implementation is the opposite.
People are not aware that WebSocket frames can be exbibytes large, nor
that the implementation even has a limit.
They come from HTTP, where they can send/receive arbitrarily large content.
When they see error 1009 they are extremely surprised.
So "reasonable" here may not be something we all agree on.
Rather than having the implementation make a decision for me (to
buffer frames and possibly discard them because of some little-known
implementation limit), "reasonable" may mean "pass me all the
WebSocket frames, I know what to do with them", sticking to the
principle of least surprise.
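
In code, the difference is roughly this: with a per-frame API the
limit, if any, lives in application code where the developer can see
it (a sketch; MAX_MESSAGE_CHARS and handleMessage() are made up):

import java.net.http.WebSocket;
import java.util.concurrent.CompletionStage;

// Sketch: the application, not the implementation, decides whether to
// aggregate, how big is "too big", and when to send 1009.
public class BoundedAggregatingListener implements WebSocket.Listener {

    private static final int MAX_MESSAGE_CHARS = 64 * 1024; // app-chosen, visible limit
    private final StringBuilder message = new StringBuilder();

    @Override
    public CompletionStage<?> onText(WebSocket webSocket, CharSequence data, boolean last) {
        message.append(data);
        if (message.length() > MAX_MESSAGE_CHARS) {
            // 1009 = "message too big", sent by a deliberate application choice.
            return webSocket.sendClose(1009, "message too big");
        }
        if (last) {
            handleMessage(message.toString()); // hypothetical application handler
            message.setLength(0);
        }
        webSocket.request(1);
        return null;
    }

    private void handleMessage(String text) {
        // ... application logic ...
    }
}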

One last, orthogonal comment about the usage of the Flow APIs in the
new HTTP client.
As far as I know the Flow APIs are only used for request and response
content, not for the HTTP protocol lifecycle.
HTTP requests/responses are not modeled as Publishers, only their content is.
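
To illustrate with the HTTP client (a sketch; I'm using the
java.net.http style names, the incubating names differ slightly):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch showing where Flow appears in the HTTP client: only the
// bodies are reactive (BodyPublisher/BodySubscriber are Flow types);
// the request/response exchange itself is plain objects and futures.
public class FlowOnlyForContent {
    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();

        HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com/"))
                // The request body is a Flow.Publisher<ByteBuffer> under the covers...
                .POST(HttpRequest.BodyPublishers.ofString("hello"))
                .build();

        // ...but the exchange is not modeled as a Publisher: a request goes
        // in, a CompletableFuture<HttpResponse<String>> comes out.
        HttpResponse<String> response =
                client.sendAsync(request, HttpResponse.BodyHandlers.ofString()).join();

        System.out.println(response.statusCode());
    }
}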

In that light, perhaps we should reflect on whether WebSocket should
be modeled in the same way; that is, with the Flow APIs used only to
model message content, and not the WebSocket protocol lifecycle.

Thanks !

-- 
Simone Bordet
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless.   Victoria Livschitz

