Connection timeouts in JEP 321 (HTTP Client)

Fri Aug 3 17:33:12 UTC 2018

We’ve almost come to agreement on this ( at least as much as is possible
without further prototyping ). The point where we reached, which I think is
a good solution, leaves connect timeout and response timeout as
orthogonal mechanisms. I would like to separate the issues.

1) 8208391 [1] will be used to implement and support a connect timeout
    as discussed below. I’ll send a separate review request for this.

2) 8208693 [2] will track the issue around the existing timeout and 
    the possibility of extending the current implementation to cover the 
    request+body AND response+body

-Chris.

[1] https://bugs.openjdk.java.net/browse/JDK-8208391
[2]https://bugs.openjdk.java.net/browse/JDK-8208693

> On 1 Aug 2018, at 20:49, Chris Hegarty <chris.hegarty at oracle.com> wrote:
> 
> Simone,
> 
>> On 1 Aug 2018, at 20:20, Simone Bordet <simone.bordet at gmail.com> wrote:
>> 
>> Hi,
>> 
>> On Wed, Aug 1, 2018 at 5:10 PM Chris Hegarty <chris.hegarty at oracle.com> wrote:
>>> Before trying to draft the javadoc, the semantic we want for this
>>> overall timeout is that it should approximately related to the observed
>>> time spent in send[Async].
>> 
>> I'm not sure I agree, especially for sendAsync(), which should setup
>> things for sending and then immediately return.
> 
> The CompletableFuture returned by sendAsync must complete
> exceptionally with an HttpTimeoutException, say if the response is
> not received, before the timeout elapses. I would expect that if one
> was to observe the amount of time for the CF to complete it should
> be approximately equal to the overall timeout.
> 
>>> Ignoring connect for now, this means that
>>> the timer starts from the initial DNS lookup/TCP connection
>> 
>> Confused. You are not ignoring the connect if you take into account
>> DNS and TCP connection establishment.
> 
> Sorry, I meant ignoring an explicit `connectTimeout` method for now, 
> then the amount of time spent setting up the connection should be
> considered part of the overall time spent for the timer. More on this
> below.
> 
>>> and ends
>>> when the complete request has been sent and the complete response
>>> has been received. This seems like the most useful overall timeout
>>> semantic.
>> 
>> Indeed.
>> Note that it's indeed request *and* response completion, including
>> data for both.
> 
> Right. I just wanted to confirm this explicitly ( since the current timeout
> implementation stops the timer after the response headers are
> received ). Changing this will take time.
> 
>> That means if the request takes longer than the response, it'll
>> possible trigger _after_ the complete response has arrived to the
>> client.
> 
> True, at which point there may not be a mechanism available to
> report the error, beyond cancelling the request body’s publisher,
> which may be enough in itself to indicate that the whole request
> body was not sent.
> 
>>> There is a question about whether or not the response body is part
>>> of the timeout. It currently is not since the body handler API has the
>>> ability to either stream the response body or return it as a higher-level
>>> Java Object, e.g String, etc.
>> 
>> IMHO it should be.
> 
> I think so too. I’m concerned though that attempting to change this 
> now could destabilise things.  I’m working on a prototype, but we may
> need a backup evolutionary plan that allows this to be adjusted post
> JDK 11.
> 
>> Response content is where things may go really
>> slow, as the HTTP response headers are typically delivered in just one
>> TCP frame (for both HTTP1 and HTTP2).
> 
> Right.
> 
>>> I agree with adding `connectTimeout` to HttpClient.
>> 
>> Great.
> 
> Ok. We have agreement here.
> 
>>> The low-level details
>>> of what constitutes a new underlying connection is somewhat known. I
>>> agree that it should take into account such things as DNS lookup, TCP
>>> connect, and SSL handshake.
>> 
>> As I said, I'm not sure about the latter, but either works for me.
> 
> Ok.
> 
>>> Summary of modes:
>>> 
>>> 1) HttpRequest `timeout` only - times both connect ( if any ) and
>>>    until both the request and the response are fully send and received,
>>>    respectively.
>> 
>> Yes, including - for both - their respective content.
> 
> Yes. 
> 
>>> 2) HttpClient `connectTimeout` only - times only the connect phase,
>>>    if any. Timer stops after connection has been established.
>> 
>> Yes.
>> 
>>> 3) HttpClient `connectTimeout` and HttpRequest `timeout` - start
>>>    connect phase timer at DNS lookup, stop at TCP connect finished
>>>    or SSL handshake complete. Start request specific timer shortly
>>>    after connect timer stops, i.e. just before the HTTP request is sent.
>>>    Stop request specific timer when both the request and the response
>>>    are fully send and received, respectively.
>> 
>> Seems to me that this contradicts what you said before (total timeout
>> includes the connectTimeout).
> 
> What I meant was: Total timeout includes the time to connect if an explicit
> connect timeout is not specified. If an explicit connect timeout is specified
> then the connect part is timed by it, and the total timeout only starts after
> the connection is setup. Anyway, what you outlined below may be more
> straight forward.
> 
>> Given exchange = request+response, then:
>> 1. send request
>> 2. start "total" timer
>> 3a. need to open connection, start "connectTimeout" timer and try to
>> open connection; if connectTimeout elapses, "total" timeout is
>> cancelled, and exchange is failed; if connection is opened, cancel
>> connectTimeout.
>> 3b. connection available, send request.
>> 3c. connections exist, but none is available to send (all busy with
>> other exchanges and cannot open a new connection due to capping of
>> number of connections to same origin), request is queued.
>> 4. "total" timeout elapses, exchange can be: queued, sending or
>> receiving; in all cases fail the exchange.
>> 5. exchange completes in time, cancel "total" timeout.
> 
> Implementing the connect timeout part of this seems reasonably straight
> forward, and I have a few prototypes that I hacked on over the weekend.
> They are stable and well tested ( some work remains on intermediate
> responses/requests, like authentication and redirection ).
> 
> What this discussion has uncovered is the, justifiable, desire to time the
> complete response body. The current timeout mechanism does not do
> this, and has never done this ( the timer stops once the response headers
> are received ). I want to prototype changing this to cover the complete
> response body and see how it goes.
> 
> -Chris.
>