WebSocket client API

Tue Sep 1 20:45:04 UTC 2015

Hi Joakim,

First of all, thank you so much for such a comprehensive list of questions! I'll
try to answer them to the best of my knowledge.

But before that, I need to clarify one thing. I've noticed many of the questions
are about the lack of some advanced features in the API. They are certainly
negotiable, but for the initial pass let's answer those with "[*]" (so as not to
repeat the same argument over and over again), which means the following:

  We have a design goal for this to be a very lightweight API. It focuses on
  simplicity and convenience of data exchange -- primary service provided by the
  WebSocket protocol. The feature mentioned seems to be an advanced one, and
  though might be very useful for other APIs, at first glance doesn't look like
  a necessary requirement from our point of view.

> On 31 Aug 2015, at 18:01, Joakim Erdfelt <joakim.erdfelt at gmail.com> wrote:
> 
> - There's 2 ways some of the websocket headers can be declared, and 1 way for others.
>    Origin                   - easy enough, only in HttpClient.

This particular header may only be set by an HttpClient.

>    Sec-WebSocket-Protocol   - easy enough to declare as header in HttpClient, but it also exists in WebSocket.Builder (with a nice interface), but what happens if both are used?

This is a good question. I think the expected behaviour would be to overwrite
headers specified in the HttpClient with those specified in the
WebSocket.Builder. This should be mentioned explicitly in the javadoc though.
Thanks.

It might be harder to overwrite the 'Sec-WebSocket-Extensions' header, since its
values might be spread across several fields. But that's more of an
implementation concern than an API one.

Another option would be to throw some sort of unchecked exception. But I think
it's less robust and might not be a good choice here.
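
A minimal sketch of the "Builder wins" rule described above, assuming handshake
headers are modeled as plain maps. HandshakeHeaders and merge are hypothetical
names used for illustration only; they are not part of the proposed API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; not part of the proposed API.
final class HandshakeHeaders {

    // Builds the header set for the opening handshake. A header set on the
    // builder replaces (rather than appends to) the same header set on the
    // client. Header names are compared case-insensitively.
    static Map<String, List<String>> merge(Map<String, List<String>> fromClient,
                                           Map<String, List<String>> fromBuilder) {
        Map<String, List<String>> result = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        fromClient.forEach((name, values) -> result.put(name, new ArrayList<>(values)));
        fromBuilder.forEach((name, values) -> result.put(name, new ArrayList<>(values)));
        return result;
    }
}
```

Replacing the whole value list sidesteps the question of merging individual
values, which is exactly where 'Sec-WebSocket-Extensions' would get awkward.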

>    Sec-WebSocket-Extensions - this is tricky to declare correctly, the extensions themselves should be the canonical source for a properly formatted Extension (with parameters) for this header.  
>                               I think it's a mistake to rely on only String here.
>                               See the history of extension configuration strings in the various PMCE specs.

Interesting. Could you please provide any links and/or examples? To be honest, I
suspect most users will simply copy-paste opaque strings with required
extensions' configurations for particular servers and since the syntax is
well-defined, there's nothing to worry about. But yes, I've noticed it's done
differently in other APIs.

There's one more reason why extensions have never been represented by anything
other than strings. As previously mentioned, the intention is to keep this API
lightweight, and as you can see, no abstraction of the negotiation process is
provided; it's basically a single-shot action. The user states what they need
(subprotocol, extensions, etc.). Then the user (hopefully) gets a connection.
Then the user checks with the accessors (extensions/subprotocol) what's in use
and if, for some reason, they don't like the result, they simply abandon the
WebSocket and try again differently.
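
The "state what you need, check what you got, retry" flow could be sketched
roughly as follows. Everything here is an assumption made for illustration: the
handshake is stood in for by a plain function from requested extensions to
negotiated ones, and SingleShot is a hypothetical name:

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical illustration of single-shot negotiation with retries.
final class SingleShot {

    // Tries each request in order. 'handshake' stands in for building a
    // WebSocket and reading back its negotiated extensions; 'acceptable'
    // is the user's own check on the result.
    static Optional<List<String>> connect(List<List<String>> attempts,
                                          Function<List<String>, List<String>> handshake,
                                          Predicate<List<String>> acceptable) {
        for (List<String> requested : attempts) {
            List<String> negotiated = handshake.apply(requested);
            if (acceptable.test(negotiated)) {
                return Optional.of(negotiated);
            }
            // Otherwise: abandon this connection and try the next request.
        }
        return Optional.empty();
    }
}
```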

> - Make sure that adding Cookies to the upgrade is bullet-proof, its the feature with the most questions on 
>  stackoverflow (for all languages, even the Javascript WebSocket API)

Thanks. I will keep an eye on that.

> - Errors during upgrade are communicated only via the WebSocket.create() and WebSocket.createAsync() exceptions.
>  Access to the HTTP Response (headers, and status codes) are very important, even when using the
>  non HttpClient version via WebSocket.newBuilder(String uri)

So if I understand you correctly you are saying that in case of a failed
negotiation, the Builder should throw an exception with HTTP response fields? Is
that for diagnostics or for some other reason?

> - Add new WebSocket.newBuilder(URI) method for formal URI use

OK. Btw, do you think the String version has any value, or are we better off
without it? In my opinion it looks nicer (shorter) in examples, and removes
additional boilerplate code. But I'm not too sure about real-life usage.

> Extensions:
> 
> - Extensions should be better formalized, real objects, or at least real configuration objects.
> - New extension implementations should be able to be provided.
> - WebSocket.extensions() should have apidoc indicating that it is list of extensions that were negotiated during the upgrade.
> - How can someone plug in a new extension implementation?

Initially there were some thoughts on designing an SPI for extensions. But
after analyzing the history and the current state of official extensions, these
ideas were set aside. I'm sure you know well that of the 2 official extensions
that hit the RFC road (mux [1], perframe-compress [2]) neither succeeded. mux
has been abandoned because of similar features in HTTP/2; after 5 consecutive
drafts, perframe-compress turned into PMCE [3], a spec for a family of
compression extensions. PMCE's author, Takeshi Yoshino, has just updated it to
draft #28, and given the recent dynamics I assume it will be finalized pretty
soon.

Agreed, in the early days of WebSocket many people seemed very enthusiastic
about the extensibility of the protocol. However, the number of official
extensions at the moment is close to zero.

Having said that, I believe it's easier to extend the API later with an SPI for
extensions if needed, rather than spend a lot of time now on it with probably
little or no gain.

Let me be clear. I'm not saying an SPI for extensions is useless, I just doubt
its value:weight ratio for this particular API.

To sum up, the only way for an extension to be used with this API at the moment
is to become supported by JDK's implementation. Anyway I'll keep an eye on the
situation.

-------------------------------------------------------------------------------
[1] https://tools.ietf.org/html/draft-ietf-hybi-websocket-multiplexing-11
[2] https://tools.ietf.org/html/draft-ietf-hybi-websocket-perframe-compression-04
[3] https://tools.ietf.org/html/draft-ietf-hybi-permessage-compression-28

> - Will JDK9 ship with permessage-deflate enabled by default?

I hope it will be shipped with it. In other words

	WebSocket.supportedExtensions().contains("permessage-deflate") == true

But I'm not sure whether that will be in the initial JDK 9 release or in one of
the later updates. And I haven't thought about turning it on by default.
Interesting idea. It would require some changes in the WebSocket.Builder's spec
though.

>  If so, will the java.util.zip.Deflater/Inflater support controlling the LZ77 Sliding Window Size?
>  See: https://tools.ietf.org/html/draft-ietf-hybi-permessage-compression-28#section-7.1.2

Not sure yet.

> - There should be possible to have extensions that are invisible and only exist on the local endpoint
>  (such as debugging / identity / analysis extensions)
> - Extensions should be configurable for features that exist outside of the extension negotiation parameters.
>  eg: permessage-deflate configured to only compress outgoing TEXT and not BINARY messages as the BINARY messages are images or other precompressed data.
>      permessage-deflate configured for a minimum message size before compression kicks in for outgoing messages.

If I understood correctly, you say that extensions can have 2 "facets", or 2
APIs. One is fully standardized in the form of extension parameters negotiated
between a client and a server, and the other is a sort of list of hints, or
commands to be supplied by clients to the extension on their side to control some
behaviour within negotiated boundaries. Interesting.

But wouldn't that require a standardized way to communicate the supplied hints
to an extension? And if so, then it means we're back to the question on SPI.

Of course we could provide such hint support without any SPI, just to be handled
by implementation for supported extensions, ignoring unknown or incorrect hints:

CompletableFuture<Void> sendAsync(Outgoing data, Hint... hints)

We will think about it. Thanks!
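
To make the "ignore unknown hints" idea a bit more concrete, here is a rough
sketch. Hint, Compression and shouldCompress are all hypothetical names; nothing
like them exists in the draft API, and the "last recognized hint wins" rule is
just one possible choice:

```java
// Hypothetical illustration of per-message hints; not part of the draft API.
final class Hints {

    // Marker type for anything a caller might pass as a hint.
    interface Hint {}

    // A hint that a permessage-deflate implementation could understand.
    enum Compression implements Hint { ALLOW, SKIP }

    // Decides whether an outgoing message should be compressed.
    // Hints of an unknown type are silently ignored; if several
    // Compression hints are given, the last one wins.
    static boolean shouldCompress(Hint... hints) {
        boolean compress = true; // default when no recognized hint is present
        for (Hint h : hints) {
            if (h instanceof Compression) {
                compress = (h == Compression.ALLOW);
            }
        }
        return compress;
    }
}
```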

> Async Send
> 
> - What happens if a user spams sendAsync() too fast?
>  Is there some sort of immediate feedback about write back-pressure?
>  or is this information entirely within the CompletableFuture for the user to handle?
>  if so, how?
>  Do subsequent sendAsync() event block?
>  or do all sendAsync() calls queue up?  If so, how big is the queue?
> - Is there a way to ask the if a write can proceed without blocking?

That's a very good question. The only way sending could be "back pressured" in
the current API is by blocking completion of the returned
CompletableFuture<Void>. But given the intention to provide a fully asynchronous
non-blocking API, that's not the right thing to do. I have to think about a
callback a user can install to be notified when they should put sending on hold:

public interface BackpressureHandler {

  /**
   * Called by {@link WebSocket} to notify a client that sending more data is
   * currently undesirable.
   */
  void suspendSending();

  /**
   * Called by {@link WebSocket} to notify a client that sending may be resumed.
   */
  void resumeSending();
}
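
On the application side, such a handler could be as small as a flag that the
sending code consults before each send. A rough sketch, assuming the interface
above (repeated here so the example is self-contained); SendGate and maySend are
hypothetical names:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// The interface as drafted above, repeated to keep the sketch self-contained.
interface BackpressureHandler {
    void suspendSending();
    void resumeSending();
}

// Hypothetical client-side gate: the WebSocket flips it, the application
// checks it before producing more data.
final class SendGate implements BackpressureHandler {
    private final AtomicBoolean suspended = new AtomicBoolean(false);

    @Override public void suspendSending() { suspended.set(true); }
    @Override public void resumeSending()  { suspended.set(false); }

    // True when the application may call sendAsync() again.
    boolean maySend() { return !suspended.get(); }
}
```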

> - If there is queueing? can it be controlled?
>  eg: 
>   1) queue up X frames or bytes then write? (useful for lots of small writes)
>   2) queue only X frames or bytes, then toss warnings (useful for backpressure)
>   3) no queue in java, only rely on the OS write queue, if that's full toss warning.
>   4) allow queue to build up, but also allow cancel/remove of queued items by user (for stream control).

There's no queuing at the moment. To provide correct back pressure for sending
we should use both socket features and Java queues.

> WebSocket.Outgoing	
> 
> - What does binary(Iterable<? extends java.nio.ByteBuffer> chunks)
>  and text(Iterable<? extends CharSequence> chunks) do?

They provide a way for an API user to construct a WebSocket message (m) from a
sequence (c[i], i: 0..n) of consecutive chunks, such that

m = c[0] + c[1] + c[2] + ... + c[n-1] + c[n]

where "+" is the concatenation operation. It's basically designed for cases when:

1. the overall size of the message is not known beforehand
2. the overall size of the message is too big to fit in a single
  ByteBuffer/CharSequence, or memory is a particularly scarce resource
3. it's more convenient for the user to provide something as a sequence of its
  parts rather than as a whole thing
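
A small sketch of that contract, for the text case. Chunks, concat and lines are
hypothetical names. concat only shows what the receiver should observe; a real
implementation would presumably stream the chunks rather than buffer them:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical illustration of message = c[0] + c[1] + ... + c[n].
final class Chunks {

    // Conceptual view of text(Iterable<? extends CharSequence>): the message
    // seen by the peer is the concatenation of all chunks, in order.
    static String concat(Iterable<? extends CharSequence> chunks) {
        StringBuilder message = new StringBuilder();
        for (CharSequence chunk : chunks) {
            message.append(chunk);
        }
        return message.toString();
    }

    // A lazily evaluated source: each chunk is created only when asked for,
    // so the whole message never has to exist up front on the sender's side.
    static Iterable<String> lines(int count) {
        return () -> new Iterator<String>() {
            private int next = 0;
            @Override public boolean hasNext() { return next < count; }
            @Override public String next() {
                if (!hasNext()) throw new NoSuchElementException();
                return "line " + next++ + "\n";
            }
        };
    }
}
```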

Think of it as some sort of data supplier. It also lets us avoid bringing the
"fragmentation" concept into the API.

>  Is there an expectation that each chunk is preserved as a websocket frame?

No.

>  Or is this treated similarly to how GatheringByteChannel.write(ByteBuffer[]) functions?

Yes.

>  I would advocate that the call represents a single WebSocket message, and the individual chunks
>  can be merged/split/fragmented/mangled to suit the implementation and/or extension behavior as well.

Exactly. That's what is meant.

> - Where's the .ping() or .pong() ?

* @apiNote Keep-alive features of WebSocket protocol are taken care of
* completely by implementations and are not exposed in this API.

We thought that a high-level API could live without this burden for the user. At
the same time the implementation will definitely have several configuration
parameters, tweaking its behaviour in respect to keep-alive features.

Yes, we've thought about application data that can be carried by those types of
messages, but due to restrictions to their size (125 bytes) and potential
out-of-order delivery we've decided not to expose them as data carriers.

> - text() should have apidoc indicating that *only* UTF-8 is supported.
> - text() based on CharSequence is an interesting choice.  how will you handle non-UTF8 encoded Strings?
>  Will you always (internally) use CharSequence.toString().getBytes(StandardCharset.UTF_8) to get
>  the raw bytes before sending them to the remote endpoint?  (thus potentially mangling / replacing
>  the bad code points with replacement chars)
>  You can't rely on the CharSequence.codePoints() or .chars() as the encoding could be invalid.
>  Will you validate the UTF8 locally before sending?
>  Or rely on the remote endpoint to close the connection with a 1002 Protocol violation or 1007 Invalid Data Consistency?

All CharSequences passed in will be encoded to byte streams. Any
encoding/decoding errors will be unrecoverable. (What if by the time the error
occurred we've already sent some bytes? Would it be a recoverable situation? I
don't think so. We have no choice but to close the WebSocket.)
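
For reference, strict encoding (failing instead of substituting replacement
characters) is already available in java.nio.charset. A minimal sketch;
StrictUtf8 is a hypothetical name, and whether the implementation would do
exactly this is not decided:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper showing strict UTF-8 encoding of a CharSequence.
final class StrictUtf8 {

    // Encodes the text as UTF-8. Returns an empty Optional if the text
    // cannot be encoded (for example, it contains an unpaired surrogate),
    // instead of silently replacing the offending characters.
    static Optional<ByteBuffer> encode(CharSequence text) {
        try {
            return Optional.of(StandardCharsets.UTF_8.newEncoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .encode(CharBuffer.wrap(text)));
        } catch (CharacterCodingException e) {
            return Optional.empty();
        }
    }
}
```

By contrast, String.getBytes(StandardCharsets.UTF_8) always replaces malformed
input with the charset's replacement bytes, so it can't be used to detect the
problem.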

> - If you will support half-closed connections, then the .close() methods should be on WebSocket.Outgoing

The API doesn't support half-closed connections.

> - If a single message results in multiple frames on the protocol, possibly fragmented by an extension, how does a fault half-way through the message get communicated to the CompleteableFuture<> ?

In this case a CompletableFuture<Void> returned from sendAsync completes
exceptionally.

> - If chunked/streaming send/write/outgoing is eventually supported, an exception occurs in 1 chunk or frame, that doesn't mean the message is invalid and the websocket connection is bad, it is quite likely to be recoverable by the client application.  

That's a tough question. It's probably the most controversial part of the API.
I've been thinking about error handling for some time now. And you know, I can't
really see how exceptions in WebSocket could be recoverable.

Let's take for example some well-known I/O primitives: InputStream and
OutputStream. What do people do (and what would you do?) when they encounter an
IOException while reading/writing? The vast majority of code I've seen does the
only imaginable thing: ioStream.close(). I believe it's exactly the same in NIO.

It seems like whenever an exception (either during sending or receiving) is
thrown, the sanest thing to do is to close the connection. And since closing the
WebSocket is a handshake (2 parties are involved), error handling could be
error-prone and tedious.

Now I think that if any exception occurs in the incoming or outgoing channel
(except for NullPointerException in sendAsync()), an implementation should close
the WebSocket and only notify the API user that the WebSocket has been closed.
This fact is communicated in two places: Incoming.onClose and the
CompletableFuture<Void> returned from sendAsync().

That behaviour (if accepted) needs to be specified very clearly in the javadoc.
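
From the sender's point of view, this would look roughly like the sketch below.
It uses only java.util.concurrent.CompletableFuture; SendOutcome and describe
are hypothetical names, and the join() is there only to keep the example short
(real code would stay asynchronous):

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration of observing a failed send.
final class SendOutcome {

    // Turns the result of a (stand-in) sendAsync() call into a description:
    // "sent" on success, "closed: <message>" if it completed exceptionally.
    static String describe(CompletableFuture<Void> sendResult) {
        return sendResult
                .handle((ignored, error) ->
                        error == null ? "sent" : "closed: " + error.getMessage())
                .join();
    }
}
```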

> Even backpressure indications on write should be able to be handled by the client application.

Agreed.

> - The binary() and text() methods should have options to allow metadata to tag along for the extension to work with.
>  eg: I want to have permessage-deflate enabled and negotiated, but i'm smart enough to know when what i'm about to send isn't worth compressing (it could be too small, or be already compressed).
>  having the ability to say .. text(msg) be compressed, and binary(myPng) not be compressed.

See the answer above.

> WebSocket.Incoming.Chunks
> 
> - Don't like the .beginData() and .endData() as there's no information about the type of message.

Absolutely agree! Though the information you're talking about could always be
recovered from the first call of .onTextChunk/.onBinaryChunk (the first call
defines the type of new data), it might not be that convenient.
A more serious flaw, I believe, is that these methods do not handle complex
data interleaving. Let me explain what I mean.

The RFC mentions the potential data interleaving [5.4. Fragmentation]:

  The fragments of one message MUST NOT be interleaved between the
  fragments of another message unless an extension has been
  negotiated that can interpret the interleaving.

Agreed, no official extension can do that now, and I don't think one will be
able to in the near future. But from the API design point of view it might be
better to be prepared. With Incoming.Chunks's .onBeginData() and .onEndData() we
could interpret simply nested data:

T1   T2    T2   T1
[----[-----]----]

(same type, properly nested)

or

T1   B2  B2     T1
[----[---]------]

T1   B2  T1     B2
[----[---]------]

(different types, nesting doesn't matter)

But not the case where data really interleaves (intersects):

T1   T2    T1   T2
[----[-----]----]

(same type, intersected)

This is because the user can't tell .onEndData(T1) from .onEndData(T2). Thus
data might need unique markers. Now, I'm not saying we should turn
WebSocket.Incoming.Chunks into an org.xml.sax.ContentHandler :)

  void startDocument();
  void endDocument();
  void startElement(String uri, String localName, String qName, Attributes atts);
  void endElement(String uri, String localName, String qName);
  ...

But the potential problem is there. We just need to evaluate how severe it is.
If we stay with the current signatures for .onBeginData() and .onEndData(), then
for small messages we could probably fall back to buffering and later reordering
of intersected data, but... I don't know. It requires more thinking.
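
Just to illustrate what "unique markers" would buy us: if each callback carried
a message id, a receiver could reassemble truly intersected messages. The id
parameter and the InterleavedAssembler type are purely hypothetical; the draft
has nothing of the kind:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical receiver that reassembles interleaved text messages,
// assuming every callback identifies the message it belongs to.
final class InterleavedAssembler {
    private final Map<Integer, StringBuilder> open = new HashMap<>();

    // A new message with the given id starts.
    void onBegin(int id) { open.put(id, new StringBuilder()); }

    // A chunk arrives for an already started message.
    void onChunk(int id, CharSequence chunk) { open.get(id).append(chunk); }

    // The message with the given id ends; returns its complete text.
    String onEnd(int id) { return open.remove(id).toString(); }
}
```

Without the id, the two onEnd calls in the intersected case would be
indistinguishable.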

What do you think about it? Is it an overestimation of risks?

>  If the MUX extension gets revived (again) then this interface is invalid.
>  Note: the mux extension is currently abandoned, as it was thought that HTTP/2 could provide this feature
>  instead.  But the lack of WebSocket over HTTP/2 spec is having people rethink mux.

The last time I read the MUX draft I got the impression that it completely
changes the semantics of the WebSocket protocol. It's not a transparent
extension like PMCEs. In other words, after this extension is applied, a
WebSocket is no longer a WebSocket; one can't simply continue to exchange data
on such an object thinking it would have the same semantics. It's something that
maps one WebSocket to a "collection" of WebSockets:

public class MultiplexedWebSocket {

  public MultiplexedWebSocket(WebSocket.Builder builder) { ... }

  public WebSocket addWebSocket() { ... }

  ...
} 

To me it's more like a subprotocol heavily using over-extensibility of WebSocket.

> - Consider using the signature .onChunk(WebSocket ws, T chunk, boolean final)
>  This handles the "reset" between messages cleanly, and contains the type information as well.

It's arguably better, but also doesn't solve the problem of interleaving.

> - Why is there no WebSocket.Outgoing chunking equivalent?  
>  This will be important for those wanting to deal with streaming behaviors over websocket.

See the answer on the purpose of Outgoing above.

> WebSocket.Incoming
> 
> - Where is .onPing() and .onPong() ?

See the answer on .ping() and .pong() above.

> - Will you allow half-closed .onClose() behavior? 

No ([*]).

>  meaning, if the local endpoint receives a onClose(), is it the responsibility of the user to issue the
>  WebSocket.close()? or does the implementation do this automatically?

Closures are dealt with automatically by implementations.

> - WebSocket.onClose() signature is invalid.
>  Not all exceptions cause a close.
>  the signature should not contain a Throwable, that should be part of an WebSocket level onError(Throwable) event.

See the answer on exception handling above.

> - WebSocket.onClose() should contain the close reason code.

What is it useful for? Is it ok to provide it as a part of description?

	String description = char code + ": " + String reason

> - WebSocket.text() should java apidoc indicating that it is *always* UTF-8 encoded.

Agreed.

> WebSocket
> 
> - WebSocket.close() handling ...
>  There are 3 close techniques to enable.
>    1) .close(int code, String reasonPhrase)
>    2) .close(int code)
>    3) .close()
> 
>  In case #1 the close can fail due to protocol reasons because the reasonPhrase is too large (or an invalid
>  close reason code is used)
>  If the reasonPhrase is too big, do we trim it? or throw an error to the user?
>  Trimming the reasonPhrase has to be on valid UTF8 boundaries.
>  In case #2 the close can fail for an invalid status code.
>  In case #3 the close is sent without a payload, this should not cause a failure or an exception or an error.
>  At this point the implementation should do its best to close (or disconnect if the opposite side has closed as well).

Closure is a delicate matter, a pretty complicated process. I think an
implementation would know better (at least in most cases) how to close the
connection in each particular case. Thus status code and the reason are
completely out of user's control. This behaviour is unlikely to be negotiable,
primarily because of [*].

> - WebSocket should implement java.io.Closeable

I’m not sure I see the value here. It's a push model of delivery. Working with
the WebSocket is spread across several threads. Thus I very much doubt a
WebSocket would be used in try-with-resources statements. In any other case it
could simply be wrapped with java.io.Closeable on the spot.
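
Wrapping "on the spot" can be a one-liner with a method reference. A sketch,
where Ws is a stand-in for the draft WebSocket type (only its close() method
matters here) and CloseableAdapter is a hypothetical name:

```java
import java.io.Closeable;

// Hypothetical adapter showing how a close() method can be exposed
// as a java.io.Closeable without the type implementing it.
final class CloseableAdapter {

    // Stand-in for the draft WebSocket type.
    interface Ws { void close(); }

    // Closeable has a single abstract method, so a method reference suffices.
    static Closeable asCloseable(Ws ws) {
        return ws::close;
    }
}
```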

> - Where would the onError notifications for Incoming issues arrive?

See the answer on exception handling above.

> - A thought, WebSocket.close() is a non-half-closed approach, with WebSocket.Outgoing.close() being for half-closed behavior.

See the answer on the purpose of Outgoing above.

> - big thumbs up for .suspendReceiving() and .resumeReceiving() concept (but the location/method names seem awkward)
>  wouldn't having them on WebSocket.Incoming make more sense? (again this is partly my hangup on half-closed support, and partly my desire to see the the API be easier)
>    WebSocket.Incoming.suspend()
>    WebSocket.Incoming.resume()
>  Benefit here is that the sequence could be ...
>    Receive Frame
>    Parse Frame
>    Call method WebSocket.Incoming.on*()
>    while in user code, they call this.suspend() to halt reception of more 
>  also, a isSuspended() is probably warranted here.

WebSocket.Incoming is a callback (handler) type. Its methods are to be called by
the WebSocket, not by the user.

> General Questions
> 
> - Will the API eventually (jdk10?) grow into something to support server side WebSocket? 
>  (aka java.net.ServerWebSocket ?)
>  If so, then some of the API choices now could use small tweaks to make them reusable later.

It might. What tweaks do you have in mind?

> - Is this WebSocket API layer in stone? 

No, but we’re trying to lock things down soon (next couple of weeks).

>  or based on some sort of user/developer feedback? 
>  or encouraged based on guidelines for java, the JDK, or classlib?
>  can alternate APIs be proposed still?

It depends. What's on your mind?

-Pavel