8202466: Test serviceability/tmtools/jstat/GcTest01.java fails: Number of concurrent GC events is 1, but CGCT is 0

Chris Plummer chris.plummer at oracle.com
Fri May 4 22:03:53 UTC 2018


Hi Yasumasa,

I just noticed that GcTest01.java and GcCauseTest03.java have also 
failed for this reason. I see 9 total failures between the 3 tests so far.

On 5/4/18 6:17 AM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> Thank you for your evaluation !
> It is very helpful for me.
> (I've not reproduced this issue on linux-x64 !)
>
>> It's just very intermittent.
>
> I do not yet decided how do we fix this problem.
>
> IMHO we can add fallback code to get raw PerfCounter data through 
> `jcmd PerfCounter.print`.
> But I think we shouldn't be so because that fallback (jcmd) code might 
> hide jstat problem(s).
>
> Also we cannot reproduce this issue completely, so we might not check 
> the change for this issue.
>
>
> Do you have any idea?
As I pointed out in my initial review, this is not code I'm that 
familiar with. One question I have is if it is even necessary to fail if 
the value is .000? It seems the assumption is that if there was 1 GC 
event, the total time spent will be at least .0005ms, rounded up to 
.001ms, so it shows up in the output as a non-zero value. If that is the 
case, maybe just drop the error check. Another option is to use 
something other than .000 when the true value rounds to .000. Does it 
have to be 3 digits? Maybe always print enough digits so you never end 
up with all 0's.

thanks,

Chris
>
>
> Thanks,
>
> Yasumasa
>
>
>
> On 2018/05/04 2:08, Chris Plummer wrote:
>> Hi Yasumasa,
>>
>> Here are my results. I ran the test on macosx-x64-debug 500 times 
>> using random machines. I saw the failure twice, both times on 
>> different machines. I then ran 25 times on each of the 3 macosx 
>> machines that have shown the failure, and did not see it again. So I 
>> don't think this is necessarily an issue that is more likely to turn 
>> up on one macosx machine than any other. It's just very intermittent.
>>
>> I then ran 200 times each on all our supported platforms. So that 
>> would be both debug and product builds on macosx, linux-x64, 
>> solaris-sparc, and windows-x64. During those runs it turned up once 
>> on linux-x64 (product, not debug), so this issue does not seem to be 
>> limited to macosx.
>>
>> best regards,
>>
>> Chris
>>
>> On 5/3/18 8:26 AM, Chris Plummer wrote:
>>> Hi Yasumasa,
>>>
>>> I only see the one reported failure in our recent test history for 
>>> this test, so it doesn't look like it happens every time. I'll try 
>>> running the test on the machine it failed on, both with the binary 
>>> it failed with and a more recent binary.
>>>
>>> best regards,
>>>
>>> chris
>>>
>>> On 5/3/18 4:48 AM, Yasumasa Suenaga wrote:
>>>> FYI: jdk-11-ea+11 passed GcTest01.java
>>>> https://download.java.net/openjdk/testresults/11/archives/11/diff-hotspot-10-11.txt 
>>>>
>>>>
>>>>
>>>> On 2018/05/02 21:46, Yasumasa Suenaga wrote:
>>>>> Hi all,
>>>>>
>>>>> I've tried to fix 8202466, but I do not yet certain which solution 
>>>>> is the best for it.
>>>>> So I want your opinion for it.
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8202466
>>>>>
>>>>> David reports serviceability/tmtools/jstat/GcTest01.java failed in 
>>>>> Mach5. It seems to appear on OS X only.
>>>>> He has shared jtreg report, and I found as below:
>>>>>
>>>>> ```
>>>>> stdout: S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC 
>>>>> FGCT CGC CGCT GCT
>>>>>   0.0 0.0 0.0 0.0 3072.0 0.0 11264.0 2825.6 9472.0 9003.3 1024.0 
>>>>> 845.3 26 0.204 3 0.292 1 0.000 0.497
>>>>> ```
>>>>>
>>>>> YGCT: 0.204
>>>>> FGCT: 0.292
>>>>> CGCT: 0.000
>>>>>
>>>>>   GCT: 0.497
>>>>>
>>>>> I guess this failure was caused by rounding error because (YGCT + 
>>>>> FGCT + CGCT) < GCT.
>>>>> CGC is 1, so CGC operation might finish in very short time.
>>>>>
>>>>>
>>>>> GcTest01.java:
>>>>> http://hg.openjdk.java.net/jdk/jdk/file/4da7dce7e2bf/test/hotspot/jtreg/serviceability/tmtools/jstat/GcTest01.java#l57 
>>>>>
>>>>>
>>>>> GcCauseTest01.java:
>>>>> http://hg.openjdk.java.net/jdk/jdk/file/4da7dce7e2bf/test/hotspot/jtreg/serviceability/tmtools/jstat/GcCauseTest01.java#l53 
>>>>>
>>>>>
>>>>> GcTest01 and GcCauseTest01 are very similar, but GcCauseTest01 
>>>>> passed.
>>>>> Both tests use GcProvoker::provokeGc() to inflate memory usage. So 
>>>>> I wonder why GcTest01 just only failed.
>>>>> I guess we might encounter similar issue(s) in the future if we 
>>>>> get more fast machines.
>>>>>
>>>>> Hence I think we can take two approaches as below:
>>>>>
>>>>>    1. Add all tests in serviceability/tmtools/jstat to ProblemList.
>>>>>    2. Change all JstatGc*Tool to use custom jstat_options - to 
>>>>> show raw values in PerfCounters
>>>>>
>>>>>
>>>>> What do you think?
>>>>> I can start to work for it if they are OK.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>
>>>
>>
>>




More information about the serviceability-dev mailing list