8202466: Test serviceability/tmtools/jstat/GcTest01.java fails: Number of concurrent GC events is 1, but CGCT is 0

Yasumasa Suenaga yasuenag at gmail.com
Sun May 6 12:44:23 UTC 2018


Hi Chris,

> As I pointed out in my initial review, this is not code I'm that familiar with. One question I have is if it is even necessary to fail if the value is .000? It seems the assumption is that if there was 1 GC event, the total time spent will be at least .0005ms, rounded up to .001ms, so it shows up in the output as a non-zero value.

I agree with you.


> If that is the case, maybe just drop the error check.

I want to do that.
Another option(s) affects jstat output, so I think we need to CSR.


Thanks,

Yasumasa


On 2018/05/05 7:03, Chris Plummer wrote:
> Hi Yasumasa,
> 
> I just noticed that GcTest01.java and GcCauseTest03.java have also failed for this reason. I see 9 total failures between the 3 tests so far.
> 
> On 5/4/18 6:17 AM, Yasumasa Suenaga wrote:
>> Hi Chris,
>>
>> Thank you for your evaluation !
>> It is very helpful for me.
>> (I've not reproduced this issue on linux-x64 !)
>>
>>> It's just very intermittent.
>>
>> I do not yet decided how do we fix this problem.
>>
>> IMHO we can add fallback code to get raw PerfCounter data through `jcmd PerfCounter.print`.
>> But I think we shouldn't be so because that fallback (jcmd) code might hide jstat problem(s).
>>
>> Also we cannot reproduce this issue completely, so we might not check the change for this issue.
>>
>>
>> Do you have any idea?
> As I pointed out in my initial review, this is not code I'm that familiar with. One question I have is if it is even necessary to fail if the value is .000? It seems the assumption is that if there was 1 GC event, the total time spent will be at least .0005ms, rounded up to .001ms, so it shows up in the output as a non-zero value. If that is the case, maybe just drop the error check. Another option is to use something other than .000 when the true value rounds to .000. Does it have to be 3 digits? Maybe always print enough digits so you never end up with all 0's.
> 
> thanks,
> 
> Chris
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>
>> On 2018/05/04 2:08, Chris Plummer wrote:
>>> Hi Yasumasa,
>>>
>>> Here are my results. I ran the test on macosx-x64-debug 500 times using random machines. I saw the failure twice, both times on different machines. I then ran 25 times on each of the 3 macosx machines that have shown the failure, and did not see it again. So I don't think this is necessarily an issue that is more likely to turn up on one macosx machine than any other. It's just very intermittent.
>>>
>>> I then ran 200 times each on all our supported platforms. So that would be both debug and product builds on macosx, linux-x64, solaris-sparc, and windows-x64. During those runs it turned up once on linux-x64 (product, not debug), so this issue does not seem to be limited to macosx.
>>>
>>> best regards,
>>>
>>> Chris
>>>
>>> On 5/3/18 8:26 AM, Chris Plummer wrote:
>>>> Hi Yasumasa,
>>>>
>>>> I only see the one reported failure in our recent test history for this test, so it doesn't look like it happens every time. I'll try running the test on the machine it failed on, both with the binary it failed with and a more recent binary.
>>>>
>>>> best regards,
>>>>
>>>> chris
>>>>
>>>> On 5/3/18 4:48 AM, Yasumasa Suenaga wrote:
>>>>> FYI: jdk-11-ea+11 passed GcTest01.java
>>>>> https://download.java.net/openjdk/testresults/11/archives/11/diff-hotspot-10-11.txt
>>>>>
>>>>>
>>>>> On 2018/05/02 21:46, Yasumasa Suenaga wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I've tried to fix 8202466, but I do not yet certain which solution is the best for it.
>>>>>> So I want your opinion for it.
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202466
>>>>>>
>>>>>> David reports serviceability/tmtools/jstat/GcTest01.java failed in Mach5. It seems to appear on OS X only.
>>>>>> He has shared jtreg report, and I found as below:
>>>>>>
>>>>>> ```
>>>>>> stdout: S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT CGC CGCT GCT
>>>>>>   0.0 0.0 0.0 0.0 3072.0 0.0 11264.0 2825.6 9472.0 9003.3 1024.0 845.3 26 0.204 3 0.292 1 0.000 0.497
>>>>>> ```
>>>>>>
>>>>>> YGCT: 0.204
>>>>>> FGCT: 0.292
>>>>>> CGCT: 0.000
>>>>>>
>>>>>>   GCT: 0.497
>>>>>>
>>>>>> I guess this failure was caused by rounding error because (YGCT + FGCT + CGCT) < GCT.
>>>>>> CGC is 1, so CGC operation might finish in very short time.
>>>>>>
>>>>>>
>>>>>> GcTest01.java:
>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/4da7dce7e2bf/test/hotspot/jtreg/serviceability/tmtools/jstat/GcTest01.java#l57
>>>>>>
>>>>>> GcCauseTest01.java:
>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/4da7dce7e2bf/test/hotspot/jtreg/serviceability/tmtools/jstat/GcCauseTest01.java#l53
>>>>>>
>>>>>> GcTest01 and GcCauseTest01 are very similar, but GcCauseTest01 passed.
>>>>>> Both tests use GcProvoker::provokeGc() to inflate memory usage. So I wonder why GcTest01 just only failed.
>>>>>> I guess we might encounter similar issue(s) in the future if we get more fast machines.
>>>>>>
>>>>>> Hence I think we can take two approaches as below:
>>>>>>
>>>>>>    1. Add all tests in serviceability/tmtools/jstat to ProblemList.
>>>>>>    2. Change all JstatGc*Tool to use custom jstat_options - to show raw values in PerfCounters
>>>>>>
>>>>>>
>>>>>> What do you think?
>>>>>> I can start to work for it if they are OK.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>
>>>>
>>>
>>>
> 
> 


More information about the serviceability-dev mailing list