RFR(s): 8228580: DnsClient TCP socket timeout

Chris Yin xu.y.yin at oracle.com
Mon Sep 9 08:22:23 UTC 2019


> On 6 Sep 2019, at 9:20 PM, Pavel Rappo <pavel.rappo at oracle.com> wrote:
> 
> Milan,
> 
> In order to simulate the "Address already in use" scenario I did this:
> 
>    TcpDnsServer(int port) {
>    try {
> *       new ServerSocket(port, 0, InetAddress.getLoopbackAddress());
>        serverSocket = new ServerSocket(port, 0, InetAddress.getLoopbackAddress());
>    }
>    catch (BindException ex) {
> 
> The test failed, instead of being skipped. That's what I saw in the logs:
> 
> 
> jtreg.SkippedException: Cannot start TCP server
> 	at TcpTimeout$TcpDnsServer.<init>(TcpTimeout.java:97)
> 	at TcpTimeout.initTest(TcpTimeout.java:71)
> 	at TestBase.run(TestBase.java:48)
> 	at TcpTimeout.main(TcpTimeout.java:47)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
> 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
> 	at java.base/java.lang.Thread.run(Thread.java:830)
> Caused by: java.net.BindException: Address already in use: bind
> 	at java.base/sun.nio.ch.Net.bind0(Native Method)
> 	at java.base/sun.nio.ch.Net.bind(Net.java:469)
> 	at java.base/sun.nio.ch.Net.bind(Net.java:458)
> 	at java.base/sun.nio.ch.NioSocketImpl.bind(NioSocketImpl.java:643)
> 	at java.base/java.net.ServerSocket.bind(ServerSocket.java:355)
> 	at java.base/java.net.ServerSocket.<init>(ServerSocket.java:241)
> 	at TcpTimeout$TcpDnsServer.<init>(TcpTimeout.java:91)
> 	... 9 more
> 
> JavaTest Message: Test threw exception: jtreg.SkippedException
> JavaTest Message: shutting down test
> 
> 
> JavaTest Message: Problem cleaning up the following threads:
> Thread-0
>  at java.base at 14-internal/java.net.DualStackPlainDatagramSocketImpl.socketReceiveOrPeekData(Native Method)
>  at java.base at 14-internal/java.net.DualStackPlainDatagramSocketImpl.peekData(DualStackPlainDatagramSocketImpl.java:113)
>  at java.base at 14-internal/java.net.DatagramSocket.receive(DatagramSocket.java:746)
>  at DNSServer.receiveQuery(DNSServer.java:178)
>  at DNSServer.run(DNSServer.java:119)
> 
> 
> What's going on here? It looks like the test didn't call `cleanupTest()` (and therefore `server.stopServer()`) because the exception was thrown before it had reached `runTest()`. I think we must address this, otherwise our jtreg.SkippedException is only half measure.
> 
> I thought about how to do that. I'm thinking of going the full way up to TestBase as I believe it makes the most sense. Having said that, we'd better ask the original author (cc'ed), Chris Yin,  what he thinks about it. Here's the proposal:

The change to TestBase looks good to me. Just after try to understand what you want to achieve, some other thoughts in my mind:
1. Whether we give up too earlier when the Tcp port been used
2. Whether we put too many init related check logic in TcpDnsServer

Here is just pseudo code for your reference. First, we may have a few tries on different ports by reinit DNSServer, if all failed, then give up and skip test. Second, move socket check and throw SkppedExpection logic to initTest method, that looks more clear and TcpDnsServer could simply focus on server logic. Third, explicit cleanup for each retry fails, no need to hide cleanup in TestBase

@Override
public void initTest(String[] args) {
    for (loop a few times as you prefer as max retries) {
        super.initTest(args);
        var udpServer = (Server) env().get(DNSTestUtils.TEST_DNS_SERVER_THREAD);
        try {
            var serverSocket = new ServerSocket(udpServer.getPort(), 0, InetAddress.getLoopbackAddress());
            tcpDnsServer = new TcpDnsServer(serverSocket);
            break;
        } catch (BindException be) {
            DNSTestUtils.debug("Failed to bind server socket on port " + udpServer.getPort() + ", retry ...");
        } catch (Exception ex) {
            throw new RuntimeException("Unexpected exception during initTest", ex);
        } finally {
            if (tcpDnsServer == null) {
                super.cleanupTest();
            }
        }
    }

    if (tcpDnsServer == null) {
        throw new SkippedException("Cannot start TCP server after a few tries, skip test");
    }
}

Thanks,
Chris

> 
> 
> diff -r 53e139e55605 test/jdk/com/sun/jndi/dns/lib/TestBase.java
> --- a/test/jdk/com/sun/jndi/dns/lib/TestBase.java	Thu Sep 05 14:59:29 2019 +0100
> +++ b/test/jdk/com/sun/jndi/dns/lib/TestBase.java	Fri Sep 06 14:08:48 2019 +0100
> @@ -1,5 +1,5 @@
> /*
> - * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved.
> + * Copyright (c) 2018, 2019, Oracle and/or its affiliates. All rights reserved.
>  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>  *
>  * This code is free software; you can redistribute it and/or modify it
> @@ -44,10 +44,14 @@
>      * @throws Exception if any exception
>      */
>     public void run(String[] args) throws Exception {
> -        initRes();
> -        initTest(args);
> -        setupTest();
> -        launch();
> +        try {
> +            initRes();
> +            initTest(args);
> +            setupTest();
> +            launch();
> +        } finally {
> +            cleanupTest();
> +        }
>     }
> 
>     /**
> @@ -84,8 +88,6 @@
>             if (!handleException(e)) {
>                 throw e;
>             }
> -        } finally {
> -            cleanupTest();
>         }
>     }
> 
> @@ -108,6 +110,11 @@
> 
>     /**
>      * Cleanup test.
> +     *
> +     * This method may be called by the test at any time, including before all
> +     * the resources, initialization, and setup have been completed. This might
> +     * require careful attention. For example, before closing a resource this
> +     * method should check that resource for being {@code null}.
>      */
>     public abstract void cleanupTest();
> }
> 
> 
> -Pavel
> 
>> On 5 Sep 2019, at 11:20, Milan Mimica <milan.mimica at gmail.com> wrote:
>> 
>> On Wed, 4 Sep 2019 at 20:32, Florian Weimer <fweimer at redhat.com> wrote:
>>> 
>>> If you use the initial UDP timeout (one second, I think), the kernel
>>> will not complete the TCP handshake if the initial SYN segment is lost
>>> because the retransmit delay during the handshake is longer than that.
>> 
>> We could set a higher timeout value, but I would not prefer that.
>> After all, 1 second is just default value, and can be changed. If the
>> user wants us to give up on DNS query after the specified timeout then
>> lets do just that. True, some queries that previously worked might
>> start failing, but that is true even for slow socket.read()
>> operations.
>> 
>> New webrev: http://cr.openjdk.java.net/~mmimica/8228580/webrev.01/
>> 
>> * simplified test with no thread (nice catch Pavel)
>> * set connect timeout and account for it
>> 
>> -- 
>> Milan Mimica
> 



More information about the core-libs-dev mailing list