RFR(S): 8223217: [TESTBUG] Create JFR tests with JMX across container boundary
mikhailo.seledtsov at oracle.com
mikhailo.seledtsov at oracle.com
Fri Feb 21 00:37:46 UTC 2020
Thank you. I will update per your suggestion.
On 2/20/20 4:36 PM, Erik Gahlin wrote:
>
> Looks good.
>
> I think null could be avoided all together and instead have something
> like this:
>
> static JMXConnector waitForJmxConnection(String host, int port) throws
> Exception {
> String urlPath = "/jndi/rmi://" + host + ":" + port + "/jmxrmi";
> JMXServiceURL url = new JMXServiceURL("rmi", "", 0, urlPath);
> while (true) {
> try {
> return JMXConnectorFactory.connect(url);
> } catch (IOException e) {
> System.out.println("establishJmxConnection() thrown
> IOException: " + e.getMessage());
> }
> Thread.sleep(1000);
> }
> }
>
> No need to send updated webrev if you change into the above.
>
> Thanks
> Erik
>
>
>
> On 2020-02-21 01:20, mikhailo.seledtsov at oracle.com wrote:
>>
>> Hi Erik
>>
>> Thank you for review. See my comments inline:
>>
>> On 2/20/20 3:48 PM, Erik Gahlin wrote:
>>> Hi Misha,
>>>
>>> Overall looks good.
>>>
>>> Does the consumer used with ProcessTools.startProcess get output one
>>> line at a time? If so, there is no risk that the test reads half an
>>> IP address.
>> Yes, it does.
>>>
>>> In waitForJmxConnection you loop until the connector is not null.
>>> This seems incorrect. I would expect JMXCinnectoryFactory to throw
>>> an exception on failure, not return null. You may want to catch the
>>> exception and retry.
>>>
>> I have updated the establishJmxConnection() method with try/catch,
>> returning null in case if IOException occurs.
>>
>>> Not sure if it matters, but if the test fails with an exception, for
>>> example in the file transfer, killContainer(containerName) will not
>>> run and get cleaned up? Is that a problem? Try-finally would ensure
>>> that it is always run.
>>
>> Good catch. It is always good to cleanup resources, otherwise
>> orphaned container will keep running. I added try/finally as you
>> recommend.
>>
>> Here is the updated webrev:
>> http://cr.openjdk.java.net/~mseledtsov/8223217.04/
>>
>>
>> Thanks,
>>
>> Misha
>>
>>>
>>> Thanks
>>> Erik
>>>
>>>> On 21 Feb 2020, at 00:23, mikhailo.seledtsov at oracle.com
>>>> <mailto:mikhailo.seledtsov at oracle.com> wrote:
>>>>
>>>> I have resumed working on this issue, after some break for other
>>>> activities.
>>>>
>>>> Here is the latest webrev:
>>>> http://cr.openjdk.java.net/~mseledtsov/8223217.03/
>>>>
>>>> Please review.
>>>>
>>>> Summary of changes since last webrev:
>>>>
>>>> - simplified the observee "EventProducer.java" to a minimum: it
>>>> communicates the self IP address, then produces events in a loop
>>>>
>>>> - the main test class TestJFRWithJMX.java now controls the
>>>> observee; when done recording, it transfers the recording over JMX
>>>> and kills the observee container
>>>>
>>>> - general cleanup
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Misha
>>>>
>>>> *
>>>> *
>>>>
>>>>
>>>> On 12/10/19 6:27 AM, Erik Gahlin wrote:
>>>>> Hi Misha,
>>>>>
>>>>> On 2019-12-05 19:06, mikhailo.seledtsov at oracle.com wrote:
>>>>>> Hi Erik,
>>>>>>
>>>>>> On 12/4/19 11:30 PM, Erik Gahlin wrote:
>>>>>>> Hi Misha,
>>>>>>>
>>>>>>> The recording may complete before:
>>>>>>>
>>>>>>> while (r.getRecordings().get(0).getState() !=
>>>>>>> RecordingState.STOPPED)
>>>>>>>
>>>>>>> is reached, or the recording may mot even be created (added to
>>>>>>> the list). This will result in an IndexOutOfBoundsException.
>>>>>> Right. I did not account for that.
>>>>>>>
>>>>>>> In my example I worked around it by looping over the list
>>>>>>> (possibly empty) and by not closing the recording I could be
>>>>>>> sure that the STOPPED state will eventually be seen. When the
>>>>>>> recording is closed it will disappear from the list.
>>>>>>
>>>>>> I will use your example then, with one change. Instead of
>>>>>> existing the test once (r.getState() == RecordingState.STOPPED)
>>>>>> is true, the test will break. If child process exits right away,
>>>>>> this will result in closing of the container, depriving the main
>>>>>> process the opportunity to transfer the recording.
>>>>>
>>>>> Agree, it will not work with transfers.
>>>>>
>>>>>> Perhaps, I should break instead of System.exit(), and then have
>>>>>> another loop to wait for all the recordings to close
>>>>>> (r.getRecordings() become empty). After stopping the recording
>>>>>> the main process transfers the recordings to the "host" system
>>>>>> from the container, and then closes the recording. This will
>>>>>> signal the "child/observed" process to "exit". E.g.:
>>>>>>
>>>>>> // Wait for recording to be transferred and be closed
>>>>>> while
>>>>>> (!FlightRecorder.getFlightRecorder().getRecordings().isEmpty()) {
>>>>>> Thread.sleep(100);
>>>>>> }
>>>>>>
>>>>>> System.out.println("Leaving main()");
>>>>>>
>>>>>> Let me know if you see any problems with this approach.
>>>>>
>>>>> The transfer may complete before the child process sees the
>>>>> STOPPED state, which means it will be stuck in the first loop. I
>>>>> think you need a state(i.e a second recording or some other
>>>>> mechanism) that is not associated with the recording that you
>>>>> want to operate on.
>>>>>
>>>>> Erik
>>>>>
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> Misha
>>>>>>
>>>>>>> If you think the test recording should be closed, you could find
>>>>>>> some other means to communicate to the child process. For
>>>>>>> example, create two new recordings when the test recording has
>>>>>>> been closed and then in the loop check the count and if it is
>>>>>>> two exit the application.
>>>>>>>
>>>>>>> Erik
>>>>>>>
>>>>>>>> On 5 Dec 2019, at 05:53, mikhailo.seledtsov at oracle.com wrote:
>>>>>>>>
>>>>>>>> Hi Erik,
>>>>>>>> I have followed your suggestions, but have modified the
>>>>>>>> code a bit in the event producer class from
>>>>>>>> your original. First the child "event producer" process
>>>>>>>> waits for the JFR system/recorder to be initialized,
>>>>>>>> then waits for recording to stop, and after that waiting
>>>>>>>> for recording(s) to close; the main test process
>>>>>>>> will transfer recording first, then close it.
>>>>>>>>
>>>>>>>> Updated webrev:
>>>>>>>> http://cr.openjdk.java.net/~mseledtsov/8223217.02/
>>>>>>>>
>>>>>>>> I ran the test 20 times in our test lab, all PASS
>>>>>>>>
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>> Misha
>>>>>>>>
>>>>>>>> On 12/4/19 10:02 AM, Mikhailo Seledtsov wrote:
>>>>>>>>> Hi Erik,
>>>>>>>>>
>>>>>>>>> Thank you for reviewing the test, and thanks for
>>>>>>>>> recommendations. I will make the updates according to your
>>>>>>>>> recommendations, and will post updated webrev after some testing.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Misha
>>>>>>>>>
>>>>>>>>> On 12/3/19, 11:46 AM, Erik Gahlin wrote:
>>>>>>>>>> Hi Misha,
>>>>>>>>>>
>>>>>>>>>> Looks good overall, but it seems brittle to expect that the
>>>>>>>>>> test will finish within the 10 s the EvenyGenerator runs
>>>>>>>>>>
>>>>>>>>>> Maybe you could make it deterministic by (for example)
>>>>>>>>>> checking if the recording has stopped:
>>>>>>>>>>
>>>>>>>>>> int i = 0;
>>>>>>>>>> while (true) {
>>>>>>>>>> SimpleEvent event = new SimpleEvent();
>>>>>>>>>> event.msg = "Hello";
>>>>>>>>>> event.count = i++;
>>>>>>>>>> event.commit();
>>>>>>>>>> Thread.sleep(10);
>>>>>>>>>> if (FlightRecorder.isInitialized()) {
>>>>>>>>>> for (Recording r :
>>>>>>>>>> FlightRecorder.getFlightRecorder().getRecordings()) {
>>>>>>>>>> if (r.getState() == RecordingState.STOPPED) {
>>>>>>>>>> System.exit(0);
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> and then not close the recording in the test?
>>>>>>>>>>
>>>>>>>>>> The test could also be made a bit more realistic if you run a
>>>>>>>>>> default recording
>>>>>>>>>>
>>>>>>>>>> bean.setPredefinedConfiguration(recordingId, "default");
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Erik
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2019-12-03 05:06, mikhailo.seledtsov at oracle.com wrote:
>>>>>>>>>>> Please review this change that tests JFR operations across
>>>>>>>>>>> container boundary while using JMX to control JFR.
>>>>>>>>>>> The test starts a container running a simple event producer,
>>>>>>>>>>> establishes a JMX connection, uses FlightRecorderMXBean to
>>>>>>>>>>> start the recording, stop and dump the recording, then
>>>>>>>>>>> transfers recording from the container.
>>>>>>>>>>> Simple verification is performed on the transferred
>>>>>>>>>>> recording, making sure that the expected test event is present.
>>>>>>>>>>>
>>>>>>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8223217
>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8223217.01/
>>>>>>>>>>> Testing:
>>>>>>>>>>> 1. Ran the newly created test at least 30 times on
>>>>>>>>>>> a Linux-x64 host with docker engine configured - PASSED
>>>>>>>>>>> 2. Ran the test at least 20 times on a test
>>>>>>>>>>> cluster: ALL PASS
>>>>>>>>>>> 3. Ran all hotspot/containers tests on the test
>>>>>>>>>>> cluster: PASS
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thank you,
>>>>>>>>>>> Misha
>>>>>>>>>>>
>>>
More information about the hotspot-jfr-dev
mailing list