HTTP 2 client API

Fri Jul 31 13:56:59 UTC 2015

Thanks for the quick review, Simone.
Some answers below.

Michael

On 31/07/15 14:10, Simone Bordet wrote:
> Hi,
>
> On Fri, Jul 31, 2015 at 9:00 AM, Michael McMahon
> <michael.x.mcmahon at oracle.com> wrote:
>> Hi Wenbo,
>>
>> The latest version of the docs is available at:
>> http://cr.openjdk.java.net/~michaelm/8087112/2/
>> I am hoping to finalize this very soon.
> Comments below.
>
> == HttpRequest.create(URI) and related.
> Using a URI is not correct since requests may be made for targets that
> are not URIs, see https://tools.ietf.org/html/rfc7230#section-5.3.
> Examples: OPTIONS * HTTP/1.1, CONNECT foo:9090 HTTP/1.1 where "*" and
> "foo:9090" do not parse as URIs.
> I suggest to rename uri() to target() and allow it to take a string
> rather than a URI.

I did find that to be problematic when implementing CONNECT
in particular.  That might be a good solution.

> == HttpRequestBodyProcessor & HttpResponseBodyProcessor
> Just a nit here, the methods should consistently have "Body" in their name ?
> onRequest*Body*Start() & onRequest*Body*Error(), etc. and same for
> HttpResponseBodyProcessor.

Ok. Now, on the whole flow-control issue, here is my thinking.
First that isFast() method is a relic from a previous version and should not
have been there.

My thinking is that the body processing methods of the response
body processor should be simply allowed to block. That exerts back pressure
in HTTP/1 by the socket buffer filling up and TCP flow control kicking in.
For HTTP/2, the protocol's own buffering and window management functions 
achieve the same
thing.

For sending, we already control the rate of outgoing data by how regularly
the implementation pulls data from the processor.

What the isFast() idea was getting at was a way to tag a particular
processor as non-blocking and therefore optimizable. But, I think there may
be better ways to do it, and I was planning to use internal APIs
for the response processor implementations I am providing for now.

> In my experience, one important use case to design the API for request
> (and response) body processing is the proxy use case.
> In particular, the API must allow an implementation to provide backpressure.
> Imagine a proxy, where HttpClient is used in a webapp to proxy
> requests from a client to an upstream server, reading  byte[] from
> ServletInputStream and sending these bytes to the upstream server.
> If writing to the server is slow and blocks, we don't want to read
> anything from the client until we know we can write more to the
> server, thus providing backpressure.
> This typically means that we need some sort of "callback" associated
> with the ByteBuffer to signal that the ByteBuffer has been written to
> the server, and resume reading.
>
> The same is true for the response side in HttpResponseBodyProcessor.
> The semantic of isFast() is unclear to me, and I think it's an
> implementation detail that surfaces into the API, but that is actually
> hiding the real issue, which is again backpressure.
> I think that if the API takes into account backpressure properly,
> isFast() would not be needed.
>
> A typical solution is to use a Callback interface (or
> CompletableFuture), so that the pair (ByteBuffer, Callback) is
> considered a single unit of what is provided for reads and writes.
> Another solution would be to look at the reactive stream APIs that
> seem to be scheduled for inclusion in JDK 9 via the Flow class
> proposed by Doug Lea.
> I'm sure there are other solutions, but I strongly recommend to
> implement a proxy using HttpClient and be sure to provide
> backpressure, both on the read side and the write side.
> The exercise of implementing a proxy is revelatory for the API design.

I agree that is a useful exercise.

> With the current APIs, I don't think it's possible to provide
> backpressure. On the response side, the content from the upstream
> server will arrive to onResponseBodyChunk(), where it is written
> asynchronously to a slow client.
> Returning from onResponseBodyChunk() has the implicit meaning that the
> ByteBuffer has been fully processed, but that is not true because the
> write to the client was asynchronous and may be pending.
> Upon returning from onResponseBodyChunk() it would be allowed to read
> more from the server, and call again onResponseBodyChunk(), which is
> not ready to write to the client yet because the previous write is
> still pending, so we will have to buffer, and this will cause lack of
> backpressure and memory is blown in case of large downloads.
> A similar case may be constructed for the read side.
> The solution to block inside onResponseBodyChunk() I don't think it's
> acceptable; the effort to change the API to support backpressure is
> not that large, so I'd strongly encourage to look at this change.
>
> == Missing "end exchange" event ?
> The end of the request *and* response event is crucial for
> applications to be able to send additional requests (and crucial in
> proxies too).
> HttpRequest.send*() methods have the semantic of waiting for the
> response headers, but that does not mean that the request is finished:
> a big upload may still be in progress.
> Is writing your own HttpRequestBodyProcessor and implement
> onRequestBodyComplete() / onRequestBodyError() the only way to know if
> a request is complete ?
> The javadocs for HttpRequest, in the async example, seem to assume
> that if the response are arrived, then the requests are finished too,
> but that is not necessarily the case.
> The line:
>
>        String lastResponse = last.join(); // -- wait for all work to complete
>
> is just an indication that the responses completed, not that the
> requests completed as well: they may still be uploading content to the
> server.
> My experience here is that applications are way more interested in
> "end of both request and response (exchange)" event than in "end of
> response", so I think the API should expose such facility that is
> currently missing.

Yes, the API is designed that responses are handled in two stages
response headers and then response body.

Using the asynchronous API it is easy to create a composition of
the two stages and have one CompletableFuture object complete
after both stages. Probably, the most useful will be CompletableFuture<T>
where T is the actual body type. Similar can be achieved with the blocking
API.

> == HttpResponse
> A nit: HttpResponse.responseCode() seems a repetition of "response",
> since other methods don't have the "response" qualifier (e.g. it's
> HttpResponse.headers(), not HttpResponse.responseHeaders()), so I'd
> better see it renamed to statusCode() - which is also the way it's
> defined in RFC 7230.

Hmm. Ok I'll think about that one.

> == Aborting
> There is currently no ability to abort an exchange.
> My experience is that providing such feature complicates the
> implementation by *a lot*, since the abort may come from any thread at
> any time.
> However, it's a feature that has always been requested in my
> experience... not sure if it is in scope for this effort.

Cancellation is part of the CompletableFuture API
but not in the blocking API currently. HTTP/2 should be
able to implement it by resetting streams.

> == HttpRequest.Builder
> The semantic of headers(String...) is not clear to me.
> It's headers(name1, value1, name2, value2) or headers(name1, name2,
> value1, value2), and how does it work for multiple values ?
> Too confusing, I would remove it, and just keep header(name, value),
> which is chainable and easy to use.

I think the docs could use a simple example, but I thought
it seemed a useful addition.

eg builder.headers("Foo", "foovalue", "Bar", "barvalue).

The effect is additive. So, there can be multiple values
for any header name

> A clarification on the semantic of Builder is also required,
> especially whether it is immutable or not.
> Example:
>
> Builder b1 = client.request();
> Builder b2 = b1.header("foo", "bar");
> Builder b3 = b1.header("a", "1");
>
> Does b3 contain the header "foo" ?
> This is important when the builder is passed to methods that may
> further customize it, since they have to either return the new
> immutable version produced by the modifications, or do nothing because
> the builder is mutable.
> The code that customizes the builder *must* know whether it's immutable or not.

Builders are mutable and each method returns this. maybe that should be
spelled out better

> Also, I don't see any way to set a cookie on a request (without
> polluting the global CookieManager) apart by explicitly setting the
> HTTP header ? Setting a cookie seem a common operation that would
> benefit of an explicit API.
>
> == Timeouts
> There are no definition of timeouts, including idle timeouts and
> exchange timeouts ?

Timeouts are all handled by the CompletableFuture API

> == Proxying
> I would make sure that both a HTTP proxy (normal and CONNECT) and a
> SOCKS proxy can be implemented with the current API proposal.
> They are very different, and implementing them will make sure the API
> is correct.

Will think about. No plans to implement SOCKs but it would be
important not to preclude its support in future

> == Pipelining
> I think it's a relic of the past. Nobody uses it by default, and using
> it has serious semantic effects. I would frankly remove support for
> this feature; implementations may be configured to support it, but I
> would not expose this in the APIs.

Interesting point. It's true that pipelining has problems in Http1
(that are solved in http/2) and the api possibly will cause confusion
in an http/2 context. will think about it.

  Thanks !