RFR(S): 8223217: [TESTBUG] Create JFR tests with JMX across container boundary

Erik Gahlin erik.gahlin at oracle.com
Fri Feb 21 00:36:39 UTC 2020


Looks good.

I think null could be avoided all together and instead have something 
like this:

static JMXConnector waitForJmxConnection(String host, int port) throws 
Exception {
         String urlPath = "/jndi/rmi://" + host + ":" + port + "/jmxrmi";
         JMXServiceURL url = new JMXServiceURL("rmi", "", 0, urlPath);
         while (true) {
             try {
                 return JMXConnectorFactory.connect(url);
             } catch (IOException e) {
                 System.out.println("establishJmxConnection() thrown 
IOException: " + e.getMessage());
             }
             Thread.sleep(1000);
         }
     }

No need to send updated webrev if you change into the above.

Thanks
Erik



On 2020-02-21 01:20, mikhailo.seledtsov at oracle.com wrote:
>
> Hi Erik
>
>   Thank you for review. See my comments inline:
>
> On 2/20/20 3:48 PM, Erik Gahlin wrote:
>> Hi Misha,
>>
>> Overall looks good.
>>
>> Does the consumer used with ProcessTools.startProcess get output one 
>> line at a time? If so, there is no risk that the test reads half an 
>> IP address.
> Yes, it does.
>>
>> In waitForJmxConnection you loop until the connector is not null. 
>> This seems incorrect. I would expect JMXCinnectoryFactory to throw an 
>> exception on failure, not return null. You may want to catch the 
>> exception and retry.
>>
> I have updated the establishJmxConnection() method with try/catch, 
> returning null in case if IOException occurs.
>
>> Not sure if it matters, but if the test fails with an exception, for 
>> example in the file transfer,  killContainer(containerName) will not 
>> run and get cleaned up? Is that a problem? Try-finally would ensure 
>> that it is always run.
>
> Good catch. It is always good to cleanup resources, otherwise orphaned 
> container will keep running. I added try/finally as you recommend.
>
> Here is the updated webrev: 
> http://cr.openjdk.java.net/~mseledtsov/8223217.04/
>
>
> Thanks,
>
> Misha
>
>>
>> Thanks
>> Erik
>>
>>> On 21 Feb 2020, at 00:23, mikhailo.seledtsov at oracle.com 
>>> <mailto:mikhailo.seledtsov at oracle.com> wrote:
>>>
>>> I have resumed working on this issue, after some break for other 
>>> activities.
>>>
>>> Here is the latest webrev: 
>>> http://cr.openjdk.java.net/~mseledtsov/8223217.03/
>>>
>>> Please review.
>>>
>>> Summary of changes since last webrev:
>>>
>>>   - simplified the observee "EventProducer.java" to a minimum: it 
>>> communicates the self IP address, then produces events in a loop
>>>
>>>   - the main test class TestJFRWithJMX.java now controls the 
>>> observee; when done recording, it transfers the recording over JMX 
>>> and kills the observee container
>>>
>>>   - general cleanup
>>>
>>>
>>> Thanks,
>>>
>>> Misha
>>>
>>> *
>>> *
>>>
>>>
>>> On 12/10/19 6:27 AM, Erik Gahlin wrote:
>>>> Hi Misha,
>>>>
>>>> On 2019-12-05 19:06, mikhailo.seledtsov at oracle.com wrote:
>>>>> Hi Erik,
>>>>>
>>>>> On 12/4/19 11:30 PM, Erik Gahlin wrote:
>>>>>> Hi Misha,
>>>>>>
>>>>>> The recording may complete before:
>>>>>>
>>>>>> while (r.getRecordings().get(0).getState() != 
>>>>>> RecordingState.STOPPED)
>>>>>>
>>>>>> is reached, or the recording may mot even be created (added to 
>>>>>> the list). This will result in an IndexOutOfBoundsException.
>>>>> Right. I did not account for that.
>>>>>>
>>>>>> In my example I worked around it by looping over the list 
>>>>>> (possibly empty) and by not closing the recording I could be sure 
>>>>>> that the STOPPED state will eventually be seen. When the 
>>>>>> recording is closed it will disappear from the list.
>>>>>
>>>>> I will use your example then, with one change. Instead of existing 
>>>>> the test once (r.getState() == RecordingState.STOPPED) is true, 
>>>>> the test will break. If child process exits right away, this will 
>>>>> result in closing of the container, depriving the main process the 
>>>>> opportunity to transfer the recording. 
>>>>
>>>> Agree, it will not work with transfers.
>>>>
>>>>> Perhaps, I should break instead of System.exit(), and then have 
>>>>> another loop to wait for all the recordings to close 
>>>>> (r.getRecordings() become empty). After stopping the recording the 
>>>>> main process transfers the recordings to the "host" system from 
>>>>> the container, and then closes the recording. This will signal the 
>>>>> "child/observed" process to "exit". E.g.:
>>>>>
>>>>>         // Wait for recording to be transferred and be closed
>>>>>         while 
>>>>> (!FlightRecorder.getFlightRecorder().getRecordings().isEmpty()) {
>>>>>             Thread.sleep(100);
>>>>>         }
>>>>>
>>>>>         System.out.println("Leaving main()");
>>>>>
>>>>> Let me know if you see any problems with this approach.
>>>>
>>>> The transfer may complete before the child process sees the STOPPED 
>>>> state, which means it will be stuck in the first loop. I think you 
>>>> need a state(i.e a second recording or some other mechanism)  that 
>>>> is not associated with the recording that you want to operate on.
>>>>
>>>> Erik
>>>>
>>>>>
>>>>> Thank you,
>>>>>
>>>>> Misha
>>>>>
>>>>>> If you think the test recording should be closed, you could find 
>>>>>> some other means to communicate to the child process. For 
>>>>>> example, create two new recordings when the test recording has 
>>>>>> been closed and then in the loop check the count and if it is two 
>>>>>> exit the application.
>>>>>>
>>>>>> Erik
>>>>>>
>>>>>>> On 5 Dec 2019, at 05:53, mikhailo.seledtsov at oracle.com wrote:
>>>>>>>
>>>>>>> Hi Erik,
>>>>>>>      I have followed your suggestions, but have modified the 
>>>>>>> code a bit in the event producer class from
>>>>>>>      your original. First the child "event producer" process 
>>>>>>> waits for the JFR system/recorder to be initialized,
>>>>>>>      then waits for recording to stop, and after that waiting 
>>>>>>> for recording(s) to close; the main test process
>>>>>>>      will transfer recording first, then close it.
>>>>>>>
>>>>>>>      Updated webrev: 
>>>>>>> http://cr.openjdk.java.net/~mseledtsov/8223217.02/
>>>>>>>
>>>>>>>      I ran the test 20 times in our test lab, all PASS
>>>>>>>
>>>>>>>
>>>>>>> Thank you,
>>>>>>> Misha
>>>>>>>
>>>>>>> On 12/4/19 10:02 AM, Mikhailo Seledtsov wrote:
>>>>>>>> Hi Erik,
>>>>>>>>
>>>>>>>>    Thank you for reviewing the test, and thanks for 
>>>>>>>> recommendations. I will make the updates according to your 
>>>>>>>> recommendations, and will post updated webrev after some testing.
>>>>>>>>
>>>>>>>>
>>>>>>>> Misha
>>>>>>>>
>>>>>>>> On 12/3/19, 11:46 AM, Erik Gahlin wrote:
>>>>>>>>> Hi Misha,
>>>>>>>>>
>>>>>>>>> Looks good overall, but it seems brittle to expect that the 
>>>>>>>>> test will finish within the 10 s the EvenyGenerator runs
>>>>>>>>>
>>>>>>>>> Maybe you could make it deterministic by (for example) 
>>>>>>>>> checking if the recording has stopped:
>>>>>>>>>
>>>>>>>>> int i = 0;
>>>>>>>>> while (true) {
>>>>>>>>>      SimpleEvent event = new SimpleEvent();
>>>>>>>>>      event.msg = "Hello";
>>>>>>>>>      event.count = i++;
>>>>>>>>>      event.commit();
>>>>>>>>>      Thread.sleep(10);
>>>>>>>>>      if (FlightRecorder.isInitialized()) {
>>>>>>>>>          for (Recording r : 
>>>>>>>>> FlightRecorder.getFlightRecorder().getRecordings()) {
>>>>>>>>>              if (r.getState() == RecordingState.STOPPED) {
>>>>>>>>>                  System.exit(0);
>>>>>>>>>              }
>>>>>>>>>          }
>>>>>>>>>      }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> and then not close the recording in the test?
>>>>>>>>>
>>>>>>>>> The test could also be made a bit more realistic if you run a 
>>>>>>>>> default recording
>>>>>>>>>
>>>>>>>>> bean.setPredefinedConfiguration(recordingId, "default");
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Erik
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2019-12-03 05:06, mikhailo.seledtsov at oracle.com wrote:
>>>>>>>>>> Please review this change that tests JFR operations across 
>>>>>>>>>> container boundary while using JMX to control JFR.
>>>>>>>>>> The test starts a container running a simple event producer, 
>>>>>>>>>> establishes a JMX connection, uses FlightRecorderMXBean to 
>>>>>>>>>> start the recording, stop and dump the recording, then 
>>>>>>>>>> transfers recording from the container.
>>>>>>>>>> Simple verification is performed on the transferred 
>>>>>>>>>> recording, making sure that the expected test event is present.
>>>>>>>>>>
>>>>>>>>>>      JBS: https://bugs.openjdk.java.net/browse/JDK-8223217
>>>>>>>>>>      Webrev: http://cr.openjdk.java.net/~mseledtsov/8223217.01/
>>>>>>>>>>      Testing:
>>>>>>>>>>          1. Ran the newly created test at least 30 times on a 
>>>>>>>>>> Linux-x64 host with docker engine configured - PASSED
>>>>>>>>>>          2. Ran the test at least 20 times on a test cluster: 
>>>>>>>>>> ALL PASS
>>>>>>>>>>          3. Ran all hotspot/containers tests on the test 
>>>>>>>>>> cluster: PASS
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thank you,
>>>>>>>>>> Misha
>>>>>>>>>>
>>


More information about the hotspot-jfr-dev mailing list