RFR: 8202466: Test serviceability/tmtools/jstat/GcTest01.java fails: Number of concurrent GC events is 1, but CGCT is 0

Chris Plummer chris.plummer at oracle.com
Fri May 18 19:16:01 UTC 2018


Looks good.

Chris

On 5/18/18 5:41 AM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> Thank you for your comment. I uploaded new webrev:
>   http://cr.openjdk.java.net/~ysuenaga/JDK-8202466/webrev.01/
>
> webrev html for JstatGcNewResults.java does not contain the change 
> enough, so please see raw patch (jdk.patch).
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2018/05/18 6:43, Chris Plummer wrote:
>> Hi Yasumasa,
>>
>> Your changes look fine. Can you fix the alignment of the output in 
>> the following comment in JstatGcNewResults.java:
>>
>>    27  * Output example:
>>    28  * S0C           S1C         S0U    S1U   TT   MTT DSS 
>> EC                EU       YGC     YGCT
>>    29  * 11264.0  11264.0    0.0    0.0      15  15 0.0 67584.0 
>> 1351.7      0      0.000
>>
>> If it looks aligned to you in this email, it's because you are 
>> looking at it using a variable width font. Look at is using a fixed 
>> width font.
>>
>> thanks,
>>
>> Chris
>>
>>
>>
>> On 5/15/18 6:52 AM, Yasumasa Suenaga wrote:
>>> Hi,
>>>
>>>>> If that is the case, maybe just drop the error check.
>>>>
>>>> I want to do that.
>>>
>>> I removed them in following webrev. Could you review it?
>>>
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8202466/webrev.00/
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2018/05/06 21:44, Yasumasa Suenaga wrote:
>>>> Hi Chris,
>>>>
>>>>> As I pointed out in my initial review, this is not code I'm that 
>>>>> familiar with. One question I have is if it is even necessary to 
>>>>> fail if the value is .000? It seems the assumption is that if 
>>>>> there was 1 GC event, the total time spent will be at least 
>>>>> .0005ms, rounded up to .001ms, so it shows up in the output as a 
>>>>> non-zero value.
>>>>
>>>> I agree with you.
>>>>
>>>>
>>>>> If that is the case, maybe just drop the error check.
>>>>
>>>> I want to do that.
>>>> Another option(s) affects jstat output, so I think we need to CSR.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2018/05/05 7:03, Chris Plummer wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> I just noticed that GcTest01.java and GcCauseTest03.java have also 
>>>>> failed for this reason. I see 9 total failures between the 3 tests 
>>>>> so far.
>>>>>
>>>>> On 5/4/18 6:17 AM, Yasumasa Suenaga wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> Thank you for your evaluation !
>>>>>> It is very helpful for me.
>>>>>> (I've not reproduced this issue on linux-x64 !)
>>>>>>
>>>>>>> It's just very intermittent.
>>>>>>
>>>>>> I do not yet decided how do we fix this problem.
>>>>>>
>>>>>> IMHO we can add fallback code to get raw PerfCounter data through 
>>>>>> `jcmd PerfCounter.print`.
>>>>>> But I think we shouldn't be so because that fallback (jcmd) code 
>>>>>> might hide jstat problem(s).
>>>>>>
>>>>>> Also we cannot reproduce this issue completely, so we might not 
>>>>>> check the change for this issue.
>>>>>>
>>>>>>
>>>>>> Do you have any idea?
>>>>> As I pointed out in my initial review, this is not code I'm that 
>>>>> familiar with. One question I have is if it is even necessary to 
>>>>> fail if the value is .000? It seems the assumption is that if 
>>>>> there was 1 GC event, the total time spent will be at least 
>>>>> .0005ms, rounded up to .001ms, so it shows up in the output as a 
>>>>> non-zero value. If that is the case, maybe just drop the error 
>>>>> check. Another option is to use something other than .000 when the 
>>>>> true value rounds to .000. Does it have to be 3 digits? Maybe 
>>>>> always print enough digits so you never end up with all 0's.
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2018/05/04 2:08, Chris Plummer wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> Here are my results. I ran the test on macosx-x64-debug 500 
>>>>>>> times using random machines. I saw the failure twice, both times 
>>>>>>> on different machines. I then ran 25 times on each of the 3 
>>>>>>> macosx machines that have shown the failure, and did not see it 
>>>>>>> again. So I don't think this is necessarily an issue that is 
>>>>>>> more likely to turn up on one macosx machine than any other. 
>>>>>>> It's just very intermittent.
>>>>>>>
>>>>>>> I then ran 200 times each on all our supported platforms. So 
>>>>>>> that would be both debug and product builds on macosx, 
>>>>>>> linux-x64, solaris-sparc, and windows-x64. During those runs it 
>>>>>>> turned up once on linux-x64 (product, not debug), so this issue 
>>>>>>> does not seem to be limited to macosx.
>>>>>>>
>>>>>>> best regards,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 5/3/18 8:26 AM, Chris Plummer wrote:
>>>>>>>> Hi Yasumasa,
>>>>>>>>
>>>>>>>> I only see the one reported failure in our recent test history 
>>>>>>>> for this test, so it doesn't look like it happens every time. 
>>>>>>>> I'll try running the test on the machine it failed on, both 
>>>>>>>> with the binary it failed with and a more recent binary.
>>>>>>>>
>>>>>>>> best regards,
>>>>>>>>
>>>>>>>> chris
>>>>>>>>
>>>>>>>> On 5/3/18 4:48 AM, Yasumasa Suenaga wrote:
>>>>>>>>> FYI: jdk-11-ea+11 passed GcTest01.java
>>>>>>>>> https://download.java.net/openjdk/testresults/11/archives/11/diff-hotspot-10-11.txt 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2018/05/02 21:46, Yasumasa Suenaga wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I've tried to fix 8202466, but I do not yet certain which 
>>>>>>>>>> solution is the best for it.
>>>>>>>>>> So I want your opinion for it.
>>>>>>>>>>
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202466
>>>>>>>>>>
>>>>>>>>>> David reports serviceability/tmtools/jstat/GcTest01.java 
>>>>>>>>>> failed in Mach5. It seems to appear on OS X only.
>>>>>>>>>> He has shared jtreg report, and I found as below:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> stdout: S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT 
>>>>>>>>>> FGC FGCT CGC CGCT GCT
>>>>>>>>>>   0.0 0.0 0.0 0.0 3072.0 0.0 11264.0 2825.6 9472.0 9003.3 
>>>>>>>>>> 1024.0 845.3 26 0.204 3 0.292 1 0.000 0.497
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> YGCT: 0.204
>>>>>>>>>> FGCT: 0.292
>>>>>>>>>> CGCT: 0.000
>>>>>>>>>>
>>>>>>>>>>   GCT: 0.497
>>>>>>>>>>
>>>>>>>>>> I guess this failure was caused by rounding error because 
>>>>>>>>>> (YGCT + FGCT + CGCT) < GCT.
>>>>>>>>>> CGC is 1, so CGC operation might finish in very short time.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> GcTest01.java:
>>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/4da7dce7e2bf/test/hotspot/jtreg/serviceability/tmtools/jstat/GcTest01.java#l57 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> GcCauseTest01.java:
>>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/4da7dce7e2bf/test/hotspot/jtreg/serviceability/tmtools/jstat/GcCauseTest01.java#l53 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> GcTest01 and GcCauseTest01 are very similar, but 
>>>>>>>>>> GcCauseTest01 passed.
>>>>>>>>>> Both tests use GcProvoker::provokeGc() to inflate memory 
>>>>>>>>>> usage. So I wonder why GcTest01 just only failed.
>>>>>>>>>> I guess we might encounter similar issue(s) in the future if 
>>>>>>>>>> we get more fast machines.
>>>>>>>>>>
>>>>>>>>>> Hence I think we can take two approaches as below:
>>>>>>>>>>
>>>>>>>>>>    1. Add all tests in serviceability/tmtools/jstat to 
>>>>>>>>>> ProblemList.
>>>>>>>>>>    2. Change all JstatGc*Tool to use custom jstat_options - 
>>>>>>>>>> to show raw values in PerfCounters
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> What do you think?
>>>>>>>>>> I can start to work for it if they are OK.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>
>>




More information about the serviceability-dev mailing list