RFR [XS]: 8229370: make jdk/jfr/event/runtime/TestNetworkUtilizationEvent.java more stable
David Holmes
david.holmes at oracle.com
Wed Aug 28 12:40:10 UTC 2019
Hi Matthias,
On 28/08/2019 10:02 pm, Baesken, Matthias wrote:
> Hi David, thanks for reaching out to the net folks.
>
> I now use the stream of InetAddresses from NetworkInterface.networkInterfaces().flatMap(NetworkInterface::inetAddresses).
>
> However, on the machine where I noticed the issue, I still have to do a couple of retries.
> A single round of sends to the InetAddresses from NetworkInterface.networkInterfaces().flatMap(NetworkInterface::inetAddresses) does not always give
> the expected result ☹.
That doesn't make sense to me, if one of the interfaces is the real
network interface and we write to the real IP address. I think we need
to add some diagnostics to see exactly which interface and IP address we
are writing to, and then see what each interface reports on the read
back of the data.
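
A minimal sketch of what such diagnostics might look like, not taken from the webrev. It assumes Linux, where the per-interface byte counters are readable under /sys/class/net/<name>/statistics/, and the class name NetDiag, the helper names and the port 12345 are all hypothetical choices for illustration. For each interface it prints every address a datagram is sent to and the tx_bytes counter before and after the sends:

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class NetDiag {
    // Hypothetical destination port; nothing needs to listen on it.
    static final int PORT = 12345;

    // Reads one counter (e.g. "tx_bytes") from sysfs.
    // Returns -1 if the file is missing or unreadable (e.g. not on Linux).
    public static long readCounter(String ifName, String counter) {
        Path p = Path.of("/sys/class/net", ifName, "statistics", counter);
        try {
            return Long.parseLong(Files.readString(p).trim());
        } catch (IOException | NumberFormatException e) {
            return -1L;
        }
    }

    // Sends one small UDP datagram to the given address.
    // Returns false if the send itself failed.
    public static boolean sendTo(InetAddress addr) {
        byte[] payload = "jfr-net-test".getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(payload, payload.length, addr, PORT));
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws SocketException {
        NetworkInterface.networkInterfaces().forEach(nif -> {
            String name = nif.getName();
            long txBefore = readCounter(name, "tx_bytes");
            // Report exactly which interface and address each write goes to.
            nif.inetAddresses().forEach(addr ->
                System.out.println(name + " -> " + addr.getHostAddress()
                        + " sent=" + sendTo(addr)));
            // Report what the interface's counter shows after the writes.
            System.out.println(name + " tx_bytes: " + txBefore + " -> "
                    + readCounter(name, "tx_bytes"));
        });
    }
}
```

Comparing the before/after values per interface should show which counters, if any, move in response to each write.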
Thanks,
David
> New webrev :
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8229370.2/
>
>
> Thanks, Matthias
>
>
>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Wednesday, 28 August 2019 09:41
>> To: mikhailo.seledtsov at oracle.com; Baesken, Matthias
>> <matthias.baesken at sap.com>
>> Cc: 'hotspot-dev at openjdk.java.net' <hotspot-dev at openjdk.java.net>;
>> hotspot-jfr-dev at openjdk.java.net
>> Subject: Re: RFR [XS]: 8229370: make
>> jdk/jfr/event/runtime/TestNetworkUtilizationEvent.java more stable
>>
>> Hi Misha,
>>
>> On 28/08/2019 12:53 pm, mikhailo.seledtsov at oracle.com wrote:
>>>
>>> On 8/27/19 6:14 PM, David Holmes wrote:
>>>> On 28/08/2019 6:47 am, Mikhailo Seledtsov wrote:
>>>>> On 8/27/19, 1:15 AM, Baesken, Matthias wrote:
>>>>>> Hi David, thanks for the info about
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8228990
>>>>>>
>>>>>>
>>>>>> regarding your comment in the bug :
>>>>>>
>>>>>>> So it makes no sense. I finally found an example where the test
>>>>>>> passed and failed on the same machine.
>>>>>> I've seen this too .
>>>>>>
>>>>>> Looks like my change only increased the probability of
>>>>>> incidental network traffic happening on the real network interfaces .
>>>>>>
>>>>>> Should we exclude the test? In its current state it might indeed be
>>>>>> problematic.
>>>>>>
>>>>>> (Otherwise we could make the test pass on Linux when just 1
>>>>>> network interface is found; this might be a legitimate case, isn’t
>>>>>> it?)
>>>>> Based on David's analysis in the "JDK-8228990: [TESTBUG] JFR
>>>>> TestNetworkUtilizationEvent.java expects 2+ Network interfaces on
>>>>> Linux but finding 1", my opinion is to remove the check for number of
>>>>> interfaces altogether. Or just check that there is 1 interface.
>>>>
>>>> The test expects there to be two interfaces always present: the
>>>> loopback interface and a real network interface. There could be
>>>> additional ones. The problem is that the test fails to generate
>>>> traffic on the real network interface due to the use of
>>>> 10.0.0.0:12345. I have no idea why someone thought sending a packet to
>>>> that address would necessarily cause the kind of traffic that would
>>>> show up in the JFR event.
>>>>
>>>> Are we really likely to be running this test on a machine without a
>>>> real network interface or the loopback interface? The former seems
>>>> very unlikely. The latter may be something configurable but it seems
>>>> very unlikely to me that anyone would configure a test system that
>>>> way. So I don't think the "expected number of interfaces" is the
>>>> issue. The issue is generating observable traffic on the real network
>>>> interface - at least that is what we see in our test failures (the
>>>> output for the "lo" interface is always present).
>>> Thank you for detailed explanation. Sorry, I did not understand this at
>>> first.
>>>>
>>>> So it should be as simple as changing 10.0.0.0:12345 into something
>>>> guaranteed to work?
>>>>
>>>> I think this needs to be looked at by the JFR folk and net-dev folk to
>>>> come up with a valid testing scenario.
>>>
>>> Perhaps, we can problem list the test for now, until we find a good
>>> solution. Several options come to mind:
>>>
>>> - pick a suitable destination address for "real network interface"
>>> and test it first; if no traffic is generated after sufficient retries,
>>> throw jtreg.SkippedException (test skipped) instead of failing
>>>
>>> - try a range of suitable destination addresses; also have a
>>> fallback to jtreg.SkippedException if none of them work
>>>
>>> - in addition, a suitable address can be passed as a test property;
>>> this way test will be configurable for a given infrastructure
>>>
>>>
>>> What do you think?
>>
>> I spoke to the net folk and one thing that should be guaranteed to work
>> is to write to the IP address of the machine itself (the real IP address
>> not the loopback address). Now to get that without incurring a name
>> resolution we need to get the network interfaces. From Alan Bateman:
>>
>> NetworkInterface.networkInterfaces().flatMap(NetworkInterface::inetAddresses)
>> will give you a stream of the InetAddress objects.
>>
>> So we could write to one or more of those addresses and then check for
>> the utilization info.
>>
>> David
>> -----
>>
>>> Thank you,
>>>
>>> Misha
>>>
>>>> Cheers,
>>>> David
>>>>
>>>>> Misha
>>>>>>
>>>>>>
>>>>>> Best regards, Matthias
>>>>>>
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: David Holmes<david.holmes at oracle.com>
>>>>>>> Sent: Tuesday, 27 August 2019 09:56
>>>>>>> To: Baesken, Matthias<matthias.baesken at sap.com>;
>>>>>>> 'hotspot-dev at openjdk.java.net'<hotspot-dev at openjdk.java.net>;
>>>>>>> hotspot-jfr-dev at openjdk.java.net
>>>>>>> Subject: Re: RFR [XS]: 8229370: make
>>>>>>> jdk/jfr/event/runtime/TestNetworkUtilizationEvent.java more stable
>>>>>>>
>>>>>>> Hi Matthias,
>>>>>>>
>>>>>>> On 27/08/2019 5:41 pm, Baesken, Matthias wrote:
>>>>>>>> Hello, any reviews for this small change ?
>>>>>>> I missed the initial request - sorry.
>>>>>>>
>>>>>>> Seems we have a double up of effort here as we also have JDK-8228990
>>>>>>> for the exact same problem that we see on some of our test machines.
>>>>>>>
>>>>>>> Our analysis suggests that this test often passes by accident due to
>>>>>>> incidental activity on the real network interface when the logic
>>>>>>> intended to generate that activity (the packet sent to 10.0.0.0:12345)
>>>>>>> actually had no effect (unreachable address). If there is no incidental
>>>>>>> network activity then the real network interface is not seen and so
>>>>>>> the test fails.
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>> Thanks , Matthias
>>>>>>>>
>>>>>>>> From: Baesken, Matthias
>>>>>>>> Sent: Monday, 12 August 2019 14:33
>>>>>>>> To: 'hotspot-dev at openjdk.java.net'<hotspot-dev at openjdk.java.net>;
>>>>>>>> 'hotspot-jfr-dev at openjdk.java.net'<hotspot-jfr-dev at openjdk.java.net>
>>>>>>>> Subject: RFR [XS]: 8229370: make
>>>>>>>> jdk/jfr/event/runtime/TestNetworkUtilizationEvent.java more stable
>>>>>>>> Hello, please review this small test enhancement.
>>>>>>>>
>>>>>>>> We noticed that on some of our Linux machines (SLES12 based) the
>>>>>>>> TestNetworkUtilizationEvent.java test reported just 1 interface
>>>>>>>> (the test expects more than 1 on Linux).
>>>>>>>>
>>>>>>>> Looking into the HotSpot code, os_perf_linux.cpp collects the
>>>>>>>> interfaces plus additional information about bytes read/written
>>>>>>>> (by looking at /sys/class/net/eth<X>/statistics/<countername>),
>>>>>>>> and this info is given to JFR.
>>>>>>>>
>>>>>>>> However, it seems to need (at least on some machines / setups) more
>>>>>>>> packet send operations / potential retries to really get counter
>>>>>>>> updates (and without updates in the counters, no interfaces are found).
>>>>>>>> So I adjusted the test accordingly.
>>>>>>>>
>>>>>>>>
>>>>>>>> Bug/webrev :
>>>>>>>>
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8229370
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8229370.0/
>>>>>>>>
>>>>>>>>
>>>>>>>> Best regards, Matthias
>>>>>>>>