RFR(s): 8228580: DnsClient TCP socket timeout

Mon Sep 9 13:04:17 UTC 2019

* Milan Mimica:

> On Thu, 5 Sep 2019 at 18:59, Florian Weimer <fweimer at redhat.com> wrote:
>>
>> But I think in the UDP case, the client will retry.  I think the total
>> timeout in the TCP case should equal the total timeout in the UDP case.
>> That's what I'm going to implement for glibc.  The difference is that in
>> the TCP case, the TCP stack will take care of the retries, not the
>> application code.
>
> I understand that, and it does make sense, but we have to put it in
> context of how current DnsClient.java works:
>             //
>             // The UDP retry strategy is to try the 1st server, and then
>             // each server in order. If no answer, double the timeout
>             // and try each server again.
>             //

Ahh.  The other option is to stick with one server and keep resending
with larger and larger timeouts.  Switching has the advantage that in
case of a server problem, you get to a working server more quickly.
Staying means that if the answer is delayed and you resend exactly the
same query, you might still pick up the answer to the original query and
process it, after the first timeout.

> Fallback to TCP happens within this process. Going immediately with
> timeout*2^maxRetry could yield significantly larger delays, if there
> happens to be some other server on the list that works better.
> I would rather look into reusing TCP connections, not to close them immediately.

But we know that the server is up because it responded to our UDP
query, so waiting more than one second for the TCP handshake to succeed
might be worthwhile, yes.
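
To put rough numbers on the trade-off, the sketch below compares the
worst-case total wait of the doubling UDP strategy with a single TCP
timeout of timeout*2^maxRetry.  The class and method names are
hypothetical, and the 1000 ms base timeout and 4 rounds are assumed
values for illustration, not taken from the patch.

```java
public class TimeoutBudget {
    /**
     * Worst-case total wait when every server times out in every round
     * and the per-attempt timeout doubles after each round.
     */
    static long udpWorstCaseMillis(int servers, long baseMillis, int rounds) {
        long total = 0;
        for (int round = 0; round < rounds; round++) {
            total += servers * (baseMillis << round);
        }
        return total;
    }

    /** A single timeout of base * 2^maxRetry, as in the quoted expression. */
    static long singleTcpTimeoutMillis(long baseMillis, int maxRetry) {
        return baseMillis << maxRetry;
    }

    public static void main(String[] args) {
        // One server: 1000 + 2000 + 4000 + 8000
        System.out.println(udpWorstCaseMillis(1, 1000, 4));  // prints 15000
        // Two servers: each round costs twice as much
        System.out.println(udpWorstCaseMillis(2, 1000, 4));  // prints 30000
        // One TCP attempt with the fully scaled timeout
        System.out.println(singleTcpTimeoutMillis(1000, 4)); // prints 16000
    }
}
```

Under these assumptions a single scaled TCP timeout is about as long as
the whole UDP retry sequence against one server, which is the cost being
weighed against trying another server sooner.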

> What about read() and non-handshake TCP retransmissions? Do those
> usually happen faster?

I think so, yes.

Thanks,
Florian