RFR(S): 8229209: [TESTBUG] test for cross-process JFR event streaming
Mikhailo Seledtsov
mikhailo.seledtsov at oracle.com
Thu Nov 14 00:42:54 UTC 2019
Here is the updated webrev:
http://cr.openjdk.java.net/~mseledtsov/8229209.04/
I have incorporated feedback from Igor, as well as feedback from
chatting with Erik, and some fixes.
Feedback from Erik:
- use VistualMachine.attach() instead of JCMD - preferred way,
simplifies code
- rework the crash logic in a producer a bit - simplifies code
Also, I have added exports for child process "--add-exports" for Unsafe,
added check for child process exit code, plus comment cleanup.
Thank you,
Misha
On 11/13/19, 2:12 PM, Erik Gahlin wrote:
>
> On 2019-11-13 22:38, Mikhailo Seledtsov wrote:
>> Hi Igor,
>>
>> Thank you for reviewing the test.
>>
>> The original question in your first email: I do not think it is
>> necessary to run the test in othervm mode; I will remove the
>> 'othervm' from at-run. The crashing process is the child of the main
>> test process, so this should not affect the state of the main test VM.
>>
>> On 11/12/19, 9:59 PM, Igor Ignatyev wrote:
>>> Hi Misha,
>>>
>>> I've finished reading, several more comments:
>>>
>>> 2) in consumeEvents at line #167, it'd be clearer and safer to use
>>> TestEvent.class.getName().
>> Fixed
>>> 3) instead of asking JVM which you create its system properties in
>>> getJfrRepository(), can we just start "producer" process w/
>>> jdk.jfr.repository being set to a well-know location?
>> AFAIK, JFR sub-system picks the location automatically. Part of
>> location is a base directory, e.g. "/private/var/folders/" on Linux,
>> while the second part is picked at runtime based on time stamp and
>> PID. I would like to leave this as is for now, while I start a
>> discussion with JFR team on how to improve this.
>
> It was our initial plan to provide means to set the repository on the
> command line, but we ran into issues so we decided against it, at
> least for the time being.
>
> -XX:FligtRecorderOptions:repository=... today means the base
> repository so it may break code if we change it.
>
> Adding a second repository option (what would it be called?) would
> confuse users, but also force us to deal with a situation where the
> repository already exist. This can happen if the JVM crashes, or if on
> Windows and another process is reading a file, and the directory can't
> be removed. It can also happen, if you start two JVMs with the same
> repository path.
>
> One can imagine a production deployment where a server is restarted if
> it crashes, but it will not start up again because the repository of
> the previous process was left behind.
>
>> This is a good point. As a consideration, the user should be able to
>> specify this property, or at least the base directory for JFR
>> repositories. Once this is resolved, I will update the test accordingly.
>>> 4) to make the test more robust, at line #118 I'd compare
>>> eventsDeliveredWhileProducerIsAlive w/ 1 instead of 1/2 of
>>> eventsDeliveredTotal.
>> OK.
>>> 5) at line #110 please use Asserts.assertEqual instead of
>>> Assert.assertTrue(a == b) as it not just compare the values but also
>>> prints them out in case !(a==b)
>> Fixed
>>> 6) at line #118 please use Asserts.assertGreaterThanOrEqual instead
>>> of Assert.assertTrue(a>= b) as it not just compare the values but
>>> also prints them out in case !(a>=b)
>> Fixed
>>> 7) is it really necessary to wait till "producer" process main is
>>> started?
>> This is necessary to make sure that VM has initialized to a point
>> where it can correctly receive JCMD requests. We ran into this type
>> of issue before.
>>> 7.1) if it is, you should use String::equals at line #204 instead of
>>> contains
>>> 7.2) if the answer of 7) is true, I'd recommend you to use
>>> j.u.c.CountDownLatch (w/ count = 1) instead of AtomicBoolean for
>>> producerMainStarted (you might need to find a better name, e.g.
>>> producerStartedSignal) to convey a signal that "producer" process
>>> reached its main, so line#205 will become
>>> producerMainStarted.countDown(), and line#142 --
>>> producerMainStarted.await()
>> Thanks for suggesting during our conversation the use of
>> ProcessTools.startProcess() with wait-until-predicate. I will use
>> this, it will simplify the code and be more declarative.
>>> 8) I find usage of AtomicIntegeres as a way to return two values
>>> from consumeEvents() at a very least confusing, could you please
>>> create a "record" class for that purpose instead?
>> I can do that, as we have discussed.
>>> 9) local 'id' (line #169) isn't use anywhere. will it be beneficial
>>> to check that its value is b/w 0 and (NUMBER_OF_EVENTS_PRODUCED - 1)
>>> and, for testProducerExitsNormally case, that sum of all id is
>>> ((NUMBER_OF_EVENTS_PRODUCED - 1) * NUMBER_OF_EVENTS_PRODUCED / 2) ?
>> The id is not necessary. I used it for development and
>> experimentation. I will remove it.
>>> 9.1) and if EventStream::onEvent is expected to preserve events
>>> order, that 'id' is strictly increasing?
>> No guarantee of that.
>>
>> I will make the updates, test, and post the updated webrev.
>>
>> Thank you,
>> Misha
>>>
>>> Thanks,
>>> -- Igor
>>>
>>>> On Nov 12, 2019, at 9:12 PM, Igor
>>>> Ignatyev<igor.ignatyev at oracle.com> wrote:
>>>>
>>>> Hi Misha,
>>>>
>>>> I haven't read the test code thoroughly yet, but I have a question
>>>> -- is there a reason for TestCrossProcessStreaming to be run in
>>>> othervm mode?
>>>>
>>>> Thanks,
>>>> -- Igor
>>>>
>>>>> On Nov 12, 2019, at 4:20 PM, Mikhailo
>>>>> Seledtsov<mikhailo.seledtsov at oracle.com> wrote:
>>>>>
>>>>> Please review this new JFR test. In this test the producer of JFR
>>>>> events and the consumer run in separate processes.
>>>>> This is expected to be a common scenario with JFR streaming for
>>>>> monitoring applications and services.
>>>>>
>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8229209
>>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8229209.01/
>>>>> Testing:
>>>>> - ran this new test locally (Mac, Linux-x64) multiple times -
>>>>> PASS
>>>>> - running this test on the test cluster - in progress
>>>>>
>>>>>
>>>>> Thank you,
>>>>> Misha
>>>>>
More information about the hotspot-jfr-dev
mailing list