Connection timeouts in JEP 321 (HTTP Client)

Wed Aug 1 19:49:34 UTC 2018

Simone,

> On 1 Aug 2018, at 20:20, Simone Bordet <simone.bordet at gmail.com> wrote:
> 
> Hi,
> 
> On Wed, Aug 1, 2018 at 5:10 PM Chris Hegarty <chris.hegarty at oracle.com> wrote:
>> Before trying to draft the javadoc, the semantic we want for this
>> overall timeout is that it should approximately related to the observed
>> time spent in send[Async].
> 
> I'm not sure I agree, especially for sendAsync(), which should setup
> things for sending and then immediately return.

The CompletableFuture returned by sendAsync must complete
exceptionally with an HttpTimeoutException, say if the response is
not received, before the timeout elapses. I would expect that if one
was to observe the amount of time for the CF to complete it should
be approximately equal to the overall timeout.

>> Ignoring connect for now, this means that
>> the timer starts from the initial DNS lookup/TCP connection
> 
> Confused. You are not ignoring the connect if you take into account
> DNS and TCP connection establishment.

Sorry, I meant ignoring an explicit `connectTimeout` method for now, 
then the amount of time spent setting up the connection should be
considered part of the overall time spent for the timer. More on this
below.

>> and ends
>> when the complete request has been sent and the complete response
>> has been received. This seems like the most useful overall timeout
>> semantic.
> 
> Indeed.
> Note that it's indeed request *and* response completion, including
> data for both.

Right. I just wanted to confirm this explicitly ( since the current timeout
implementation stops the timer after the response headers are
received ). Changing this will take time.

> That means if the request takes longer than the response, it'll
> possible trigger _after_ the complete response has arrived to the
> client.

True, at which point there may not be a mechanism available to
report the error, beyond cancelling the request body’s publisher,
which may be enough in itself to indicate that the whole request
body was not sent.

>> There is a question about whether or not the response body is part
>> of the timeout. It currently is not since the body handler API has the
>> ability to either stream the response body or return it as a higher-level
>> Java Object, e.g String, etc.
> 
> IMHO it should be.

I think so too. I’m concerned though that attempting to change this 
now could destabilise things.  I’m working on a prototype, but we may
need a backup evolutionary plan that allows this to be adjusted post
JDK 11.

> Response content is where things may go really
> slow, as the HTTP response headers are typically delivered in just one
> TCP frame (for both HTTP1 and HTTP2).

Right.

>> I agree with adding `connectTimeout` to HttpClient.
> 
> Great.

Ok. We have agreement here.

>> The low-level details
>> of what constitutes a new underlying connection is somewhat known. I
>> agree that it should take into account such things as DNS lookup, TCP
>> connect, and SSL handshake.
> 
> As I said, I'm not sure about the latter, but either works for me.

Ok.

>> Summary of modes:
>> 
>>  1) HttpRequest `timeout` only - times both connect ( if any ) and
>>     until both the request and the response are fully send and received,
>>     respectively.
> 
> Yes, including - for both - their respective content.

Yes. 

>>  2) HttpClient `connectTimeout` only - times only the connect phase,
>>     if any. Timer stops after connection has been established.
> 
> Yes.
> 
>>  3) HttpClient `connectTimeout` and HttpRequest `timeout` - start
>>     connect phase timer at DNS lookup, stop at TCP connect finished
>>     or SSL handshake complete. Start request specific timer shortly
>>     after connect timer stops, i.e. just before the HTTP request is sent.
>>     Stop request specific timer when both the request and the response
>>     are fully send and received, respectively.
> 
> Seems to me that this contradicts what you said before (total timeout
> includes the connectTimeout).

What I meant was: Total timeout includes the time to connect if an explicit
connect timeout is not specified. If an explicit connect timeout is specified
then the connect part is timed by it, and the total timeout only starts after
the connection is setup. Anyway, what you outlined below may be more
straight forward.

> Given exchange = request+response, then:
> 1. send request
> 2. start "total" timer
> 3a. need to open connection, start "connectTimeout" timer and try to
> open connection; if connectTimeout elapses, "total" timeout is
> cancelled, and exchange is failed; if connection is opened, cancel
> connectTimeout.
> 3b. connection available, send request.
> 3c. connections exist, but none is available to send (all busy with
> other exchanges and cannot open a new connection due to capping of
> number of connections to same origin), request is queued.
> 4. "total" timeout elapses, exchange can be: queued, sending or
> receiving; in all cases fail the exchange.
> 5. exchange completes in time, cancel "total" timeout.

Implementing the connect timeout part of this seems reasonably straight
forward, and I have a few prototypes that I hacked on over the weekend.
They are stable and well tested ( some work remains on intermediate
responses/requests, like authentication and redirection ).

What this discussion has uncovered is the, justifiable, desire to time the
complete response body. The current timeout mechanism does not do
this, and has never done this ( the timer stops once the response headers
are received ). I want to prototype changing this to cover the complete
response body and see how it goes.

-Chris.