RFR: 8304885: Reuse stale data to improve DNS resolver resiliency [v12]

Sergey Bylokhov serb at openjdk.org
Thu Jun 1 21:46:14 UTC 2023


On Thu, 1 Jun 2023 18:13:37 GMT, Daniel Fuchs <dfuchs at openjdk.org> wrote:

>> patch and the CSR are updated as requested.
>
> @mrserb I'm seeing the new test failing intermittently on at least 3 different platforms when run in our CI.
> 
> Here is a failure log example (this one is on macOS):
> 
> 
> ----------System.out:(1/84)----------
> The following provider will be used by current test:impl.SimpleResolverProviderImpl
> ----------System.err:(22/1638)----------
> Jun 01, 2023 2:37:32 PM testlib.ResolutionRegistry <init>
> INFO: Creating ResolutionRegistry instance from file:/.../open/test/jdk/java/net/spi/InetAddressResolverProvider/addresses.txt
> Jun 01, 2023 2:37:32 PM testlib.ResolutionRegistry parseDataFile
> INFO: Constructed addresses registry:
> 	javaTest.org: /1.2.3.4 /ca:fe:ba:be:0:0:0:1 
> 
> Jun 01, 2023 2:37:32 PM impl.SimpleResolverProviderImpl$1 lookupByName
> INFO: Looking-up addresses for 'javaTest.org'. Lookup characteristics:111
> Jun 01, 2023 2:37:32 PM testlib.ResolutionRegistry lookupHost
> INFO: Looking-up 'javaTest.org' address
> Jun 01, 2023 2:37:37 PM impl.SimpleResolverProviderImpl$1 lookupByName
> INFO: Looking-up addresses for 'javaTest.org'. Lookup characteristics:111
> java.lang.AssertionError: Only one positive lookup is expected with caching enabled expected [1051960105228] but found [1046805905970]
> 	at org.testng.Assert.fail(Assert.java:99)
> 	at org.testng.Assert.failNotEquals(Assert.java:1037)
> 	at org.testng.Assert.assertEqualsImpl(Assert.java:140)
> 	at org.testng.Assert.assertEquals(Assert.java:122)
> 	at org.testng.Assert.assertEquals(Assert.java:797)
> 	at AddressesStaleCachingTest.doLookup(AddressesStaleCachingTest.java:137)
> 	at AddressesStaleCachingTest.lambda$testOnlyOneThreadIsBlockedDuringRefresh$0(AddressesStaleCachingTest.java:108)
> 	at java.base/java.lang.Thread.run(Thread.java:1583)
> STATUS:Failed.`main' threw exception: java.lang.AssertionError: Only one positive lookup is expected with caching enabled expected [1051960105228] but found [1046805905970]

@dfuch the test is updated, please take a look. The update includes:

 - A fix for slow systems, where there was not enough time to complete the test.
 - A fix for a possible race in SimpleResolverProviderImpl:
    1 Thread_1 decides to refresh the record and takes the lock in InetAddress.java, but does not call the ResolverProvider yet.
    2 Thread_2 tries to take the lock, fails, takes the stale data, and saves the timestamp.
    3 Thread_1 calls the ResolverProvider and updates the timestamp, then intentionally hangs (because of the blocker).
    4 Thread_2 tries to take the lock, fails again, takes the stale data, and saves the timestamp.
    5 Thread_2 compares the timestamps from steps 2 and 4, and the comparison fails.
    This was fixed in SimpleResolverProviderImpl: the blocked thread should update the timestamp at the end.
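
The ordering behind that fix can be illustrated with a minimal sketch. It is not the actual SimpleResolverProviderImpl code: the class, the method names, and the latches are hypothetical, and a counter stands in for the real timestamp so that the outcome is deterministic. The point is only that a lookup which blocks first and publishes its timestamp last cannot change the value seen by a thread that reads stale data twice while the refresh is still blocked.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the test resolver; not the real SimpleResolverProviderImpl.
final class StaleTimestampSketch {
    // Logical "timestamp" of the last completed lookup (a counter keeps the sketch deterministic).
    static final AtomicLong lastLookupTimestamp = new AtomicLong();

    // Simulates a blocked refresh: signal entry, wait for release, and only then publish.
    static void blockingLookup(CountDownLatch entered, CountDownLatch blocker) {
        entered.countDown();
        try {
            blocker.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Updating at the end means readers see no change while the refresh hangs.
        lastLookupTimestamp.incrementAndGet();
    }

    // Returns true if two stale reads taken during the blocked refresh agree,
    // and the timestamp advances only after the refresh is released.
    static boolean staleReaderSeesStableTimestamp() {
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch blocker = new CountDownLatch(1);
        lastLookupTimestamp.incrementAndGet(); // initial lookup that filled the cache

        Thread refresher = new Thread(() -> blockingLookup(entered, blocker));
        refresher.setDaemon(true);
        refresher.start();
        try {
            entered.await();                          // refresh is in progress and blocked
            long first = lastLookupTimestamp.get();   // corresponds to step 2
            long second = lastLookupTimestamp.get();  // corresponds to step 4
            blocker.countDown();                      // release the refresh
            refresher.join();
            long afterRefresh = lastLookupTimestamp.get();
            return first == second && afterRefresh > first;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("stable while blocked: " + staleReaderSeesStableTimestamp());
    }
}
```

If the increment were moved before `blocker.await()`, the first read could land before the update and the second after it, which is the mismatch described in step 5.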


Note that I was not able to reproduce the problems above out of the box, but I was able to reproduce them by adding some custom delays here and there in the JDK code.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13285#issuecomment-1572828425

