RFR(S): 8229209: [TESTBUG] test for cross-process JFR event streaming
Erik Gahlin
erik.gahlin at oracle.com
Wed Nov 13 22:12:08 UTC 2019
On 2019-11-13 22:38, Mikhailo Seledtsov wrote:
> Hi Igor,
>
> Thank you for reviewing the test.
>
> The original question in your first email: I do not think it is
> necessary to run the test in othervm mode; I will remove the 'othervm'
> from at-run. The crashing process is the child of the main test
> process, so this should not affect the state of the main test VM.
>
> On 11/12/19, 9:59 PM, Igor Ignatyev wrote:
>> Hi Misha,
>>
>> I've finished reading, several more comments:
>>
>> 2) in consumeEvents at line #167, it'd be clearer and safer to use
>> TestEvent.class.getName().
> Fixed
>> 3) instead of asking JVM which you create its system properties in
>> getJfrRepository(), can we just start "producer" process w/
>> jdk.jfr.repository being set to a well-know location?
> AFAIK, JFR sub-system picks the location automatically. Part of
> location is a base directory, e.g. "/private/var/folders/" on Linux,
> while the second part is picked at runtime based on time stamp and
> PID. I would like to leave this as is for now, while I start a
> discussion with JFR team on how to improve this.
It was our initial plan to provide means to set the repository on the
command line, but we ran into issues so we decided against it, at least
for the time being.
-XX:FligtRecorderOptions:repository=... today means the base
repository so it may break code if we change it.
Adding a second repository option (what would it be called?) would
confuse users, but also force us to deal with a situation where the
repository already exist. This can happen if the JVM crashes, or if on
Windows and another process is reading a file, and the directory can't
be removed. It can also happen, if you start two JVMs with the same
repository path.
One can imagine a production deployment where a server is restarted if
it crashes, but it will not start up again because the repository of the
previous process was left behind.
> This is a good point. As a consideration, the user should be able to
> specify this property, or at least the base directory for JFR
> repositories. Once this is resolved, I will update the test accordingly.
>> 4) to make the test more robust, at line #118 I'd compare
>> eventsDeliveredWhileProducerIsAlive w/ 1 instead of 1/2 of
>> eventsDeliveredTotal.
> OK.
>> 5) at line #110 please use Asserts.assertEqual instead of
>> Assert.assertTrue(a == b) as it not just compare the values but also
>> prints them out in case !(a==b)
> Fixed
>> 6) at line #118 please use Asserts.assertGreaterThanOrEqual instead
>> of Assert.assertTrue(a>= b) as it not just compare the values but
>> also prints them out in case !(a>=b)
> Fixed
>> 7) is it really necessary to wait till "producer" process main is
>> started?
> This is necessary to make sure that VM has initialized to a point
> where it can correctly receive JCMD requests. We ran into this type of
> issue before.
>> 7.1) if it is, you should use String::equals at line #204 instead of
>> contains
>> 7.2) if the answer of 7) is true, I'd recommend you to use
>> j.u.c.CountDownLatch (w/ count = 1) instead of AtomicBoolean for
>> producerMainStarted (you might need to find a better name, e.g.
>> producerStartedSignal) to convey a signal that "producer" process
>> reached its main, so line#205 will become
>> producerMainStarted.countDown(), and line#142 --
>> producerMainStarted.await()
> Thanks for suggesting during our conversation the use of
> ProcessTools.startProcess() with wait-until-predicate. I will use
> this, it will simplify the code and be more declarative.
>> 8) I find usage of AtomicIntegeres as a way to return two values from
>> consumeEvents() at a very least confusing, could you please create a
>> "record" class for that purpose instead?
> I can do that, as we have discussed.
>> 9) local 'id' (line #169) isn't use anywhere. will it be beneficial
>> to check that its value is b/w 0 and (NUMBER_OF_EVENTS_PRODUCED - 1)
>> and, for testProducerExitsNormally case, that sum of all id is
>> ((NUMBER_OF_EVENTS_PRODUCED - 1) * NUMBER_OF_EVENTS_PRODUCED / 2) ?
> The id is not necessary. I used it for development and
> experimentation. I will remove it.
>> 9.1) and if EventStream::onEvent is expected to preserve events
>> order, that 'id' is strictly increasing?
> No guarantee of that.
>
> I will make the updates, test, and post the updated webrev.
>
> Thank you,
> Misha
>>
>> Thanks,
>> -- Igor
>>
>>> On Nov 12, 2019, at 9:12 PM, Igor
>>> Ignatyev<igor.ignatyev at oracle.com> wrote:
>>>
>>> Hi Misha,
>>>
>>> I haven't read the test code thoroughly yet, but I have a question
>>> -- is there a reason for TestCrossProcessStreaming to be run in
>>> othervm mode?
>>>
>>> Thanks,
>>> -- Igor
>>>
>>>> On Nov 12, 2019, at 4:20 PM, Mikhailo
>>>> Seledtsov<mikhailo.seledtsov at oracle.com> wrote:
>>>>
>>>> Please review this new JFR test. In this test the producer of JFR
>>>> events and the consumer run in separate processes.
>>>> This is expected to be a common scenario with JFR streaming for
>>>> monitoring applications and services.
>>>>
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8229209
>>>> Webrev: http://cr.openjdk.java.net/~mseledtsov/8229209.01/
>>>> Testing:
>>>> - ran this new test locally (Mac, Linux-x64) multiple times -
>>>> PASS
>>>> - running this test on the test cluster - in progress
>>>>
>>>>
>>>> Thank you,
>>>> Misha
>>>>
More information about the hotspot-jfr-dev
mailing list