RFR [XS]: 8229370: make jdk/jfr/event/runtime/TestNetworkUtilizationEvent.java more stable

David Holmes david.holmes at oracle.com
Sun Sep 29 00:17:00 UTC 2019


Hi Matthias,

On 27/09/2019 8:56 pm, Baesken, Matthias wrote:
> Hi David / Mikhailo, I adjusted the test a bit more, and also added (and enabled) UL-based jfr,event tracing in src/hotspot/share/jfr/periodic/jfrNetworkUtilization.cpp
> to better see the recorded event information.
> 
> The current revision
> 
> http://cr.openjdk.java.net/~mbaesken/webrevs/8229370.3/
> 
> sends DatagramPackets to all InetAddresses of all network interfaces of the machine.
> I observed that on our "problematic" machine, where the test fails, we still need a short delay before the read/write counters (fetched by os_perf and then used in JFR)
> increase on the machine (that's why I wait a bit before every send operation).
> 
> Could you please also check 8229370.3 in your infrastructure, where you noticed sporadic failures in jdk/jfr/event/runtime/TestNetworkUtilizationEvent.java, and tell me
> about the results?

I've submitted a test run to our system.

I'm unclear about the details of the test. Does this:

  77         Stream<InetAddress> si = NetworkInterface.networkInterfaces().flatMap(NetworkInterface::inetAddresses);

not also return the loopback address that was already tested? Could it 
return interfaces that we really don't want to be trying to test?
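To make the concern concrete, here is a rough sketch of the kind of filtering I have in mind. It is not taken from the webrev; the class name CandidateAddresses and the helpers isUsable and candidates are hypothetical, and whether "up and not loopback" is the right criterion is an assumption that the JFR and net-dev folk would need to confirm.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.List;
import java.util.stream.Collectors;

public class CandidateAddresses {

    // Hypothetical helper (not in the webrev): keep only interfaces that
    // are up and are not the loopback interface.
    static boolean isUsable(NetworkInterface nif) {
        try {
            return nif.isUp() && !nif.isLoopback();
        } catch (SocketException e) {
            // Report the problem instead of hiding it, then skip the interface.
            System.out.println("Skipping " + nif.getName() + ": " + e);
            return false;
        }
    }

    // Collect the addresses of all usable interfaces, dropping any
    // loopback addresses that might still be attached to them.
    static List<InetAddress> candidates() throws SocketException {
        return NetworkInterface.networkInterfaces()
                .filter(CandidateAddresses::isUsable)
                .flatMap(NetworkInterface::inetAddresses)
                .filter(a -> !a.isLoopbackAddress())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws SocketException {
        // Print what the test would target, for visibility in the test log.
        candidates().forEach(a -> System.out.println("Would send to " + a));
    }
}
```

Printing the selected addresses would also give the test-level visibility I asked for earlier in this thread.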

  88             } catch(IOException ioe) {
  89             }

Why are we silently swallowing exceptions here?
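If some sends are expected to fail on some interfaces, I would at least expect the failure to show up in the test output. A minimal sketch of what I mean follows; the class SendWithLogging and the helper trySend are hypothetical names and not part of the webrev, and the port 12345 is only borrowed from the address mentioned in the bug.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

public class SendWithLogging {

    // Hypothetical helper (not in the webrev): send one small datagram and
    // report whether it worked, logging any IOException instead of dropping it.
    static boolean trySend(InetAddress target, int port) {
        byte[] payload = "hello".getBytes();
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(payload, payload.length, target, port));
            return true;
        } catch (IOException ioe) {
            // The failure stays non-fatal, but it is now visible in the log.
            System.out.println("Send to " + target + " failed: " + ioe);
            return false;
        }
    }

    public static void main(String[] args) {
        boolean sent = trySend(InetAddress.getLoopbackAddress(), 12345);
        System.out.println("sent=" + sent);
    }
}
```

The test could then decide whether "no send succeeded at all" should be treated as a failure or as a reason to skip.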

Thanks,
David

> 
> Best regards, Matthias
> 
> 
>> Subject: Re: RFR [XS]: 8229370: make
>> jdk/jfr/event/runtime/TestNetworkUtilizationEvent.java more stable
>>
>> Hi Matthias,
>>
>> On 24/09/2019 12:23 am, Baesken, Matthias wrote:
>>> Hi David / Mikhailo, I was busy with other tasks but today got back to 8229370.
>>>
>>> I noticed that in the meantime the test was excluded with
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8230115
>>>
>>> "Problemlist JFR TestNetworkUtilization test"
>>>
>>>
>>> Do you think we should still rely on the OS counters and expect to get 2+ network interfaces, or keep the test excluded (or just relax the check and check for 1+ network interfaces on Linux)?
>>
>> Exclusion is just a temporary measure to clean up the testing results,
>> so this still needs to be fixed. I have nothing further to add from my
>> comments in the bug:
>>
>>   > So it should be as simple as changing 10.0.0.0:12345 into something
>>   > guaranteed to work?
>>   >
>>   > I think this needs to be looked at by the JFR folk and net-dev folk to
>>   > come up with a valid testing scenario.
>>
>> It's not the number of interfaces that is the issue, it is generating
>> traffic on the real interface.
>>
>> Thanks,
>> David
>>
>>>
>>> Best regards, Matthias
>>>
>>>
>>>
>>>>
>>>> On 29/08/2019 12:24 am, Baesken, Matthias wrote:
>>>>> Hi David, I could add some optional UL logging to see what happens.
>>>>
>>>> I just want to see more visibility at the test level to ensure it is
>>>> finding the interfaces and addresses I would expect it to find.
>>>>
>>>> David
>>>>
>>>>> Maybe the OS counters that are fetched by os_perf are not that reliable on some kernels.
>>>>>
>>>>>
>>>>> Best regards, Matthias
>>>>>
>>>

