RFR: 8202466: Test serviceability/tmtools/jstat/GcTest01.java fails: Number of concurrent GC events is 1, but CGCT is 0

serguei.spitsyn at oracle.com serguei.spitsyn at oracle.com
Sun May 20 01:48:19 UTC 2018


Hi Yasumasa,

It looks good to me.

Thanks,
Serguei


On 5/18/18 21:06, Yasumasa Suenaga wrote:
> Thanks Chris!
> I'm waiting for a Reviewer.
>
>
> Yasumasa
>
>
> On 2018/05/19 4:16, Chris Plummer wrote:
>> Looks good.
>>
>> Chris
>>
>> On 5/18/18 5:41 AM, Yasumasa Suenaga wrote:
>>> Hi Chris,
>>>
>>> Thank you for your comment. I uploaded new webrev:
>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8202466/webrev.01/
>>>
>>> webrev html for JstatGcNewResults.java does not contain the change 
>>> enough, so please see raw patch (jdk.patch).
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2018/05/18 6:43, Chris Plummer wrote:
>>>> Hi Yasumasa,
>>>>
>>>> Your changes look fine. Can you fix the alignment of the output in 
>>>> the following comment in JstatGcNewResults.java:
>>>>
>>>>    27  * Output example:
>>>>    28  * S0C           S1C         S0U    S1U   TT   MTT DSS 
>>>> EC                EU       YGC     YGCT
>>>>    29  * 11264.0  11264.0    0.0    0.0      15  15 0.0 67584.0 
>>>> 1351.7      0      0.000
>>>>
>>>> If it looks aligned to you in this email, it's because you are 
>>>> looking at it using a variable width font. Look at is using a fixed 
>>>> width font.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>>
>>>>
>>>> On 5/15/18 6:52 AM, Yasumasa Suenaga wrote:
>>>>> Hi,
>>>>>
>>>>>>> If that is the case, maybe just drop the error check.
>>>>>>
>>>>>> I want to do that.
>>>>>
>>>>> I removed them in following webrev. Could you review it?
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8202466/webrev.00/
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2018/05/06 21:44, Yasumasa Suenaga wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>>> As I pointed out in my initial review, this is not code I'm that 
>>>>>>> familiar with. One question I have is if it is even necessary to 
>>>>>>> fail if the value is .000? It seems the assumption is that if 
>>>>>>> there was 1 GC event, the total time spent will be at least 
>>>>>>> .0005ms, rounded up to .001ms, so it shows up in the output as a 
>>>>>>> non-zero value.
>>>>>>
>>>>>> I agree with you.
>>>>>>
>>>>>>
>>>>>>> If that is the case, maybe just drop the error check.
>>>>>>
>>>>>> I want to do that.
>>>>>> Another option(s) affects jstat output, so I think we need to CSR.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2018/05/05 7:03, Chris Plummer wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> I just noticed that GcTest01.java and GcCauseTest03.java have 
>>>>>>> also failed for this reason. I see 9 total failures between the 
>>>>>>> 3 tests so far.
>>>>>>>
>>>>>>> On 5/4/18 6:17 AM, Yasumasa Suenaga wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> Thank you for your evaluation !
>>>>>>>> It is very helpful for me.
>>>>>>>> (I've not reproduced this issue on linux-x64 !)
>>>>>>>>
>>>>>>>>> It's just very intermittent.
>>>>>>>>
>>>>>>>> I do not yet decided how do we fix this problem.
>>>>>>>>
>>>>>>>> IMHO we can add fallback code to get raw PerfCounter data 
>>>>>>>> through `jcmd PerfCounter.print`.
>>>>>>>> But I think we shouldn't be so because that fallback (jcmd) 
>>>>>>>> code might hide jstat problem(s).
>>>>>>>>
>>>>>>>> Also we cannot reproduce this issue completely, so we might not 
>>>>>>>> check the change for this issue.
>>>>>>>>
>>>>>>>>
>>>>>>>> Do you have any idea?
>>>>>>> As I pointed out in my initial review, this is not code I'm that 
>>>>>>> familiar with. One question I have is if it is even necessary to 
>>>>>>> fail if the value is .000? It seems the assumption is that if 
>>>>>>> there was 1 GC event, the total time spent will be at least 
>>>>>>> .0005ms, rounded up to .001ms, so it shows up in the output as a 
>>>>>>> non-zero value. If that is the case, maybe just drop the error 
>>>>>>> check. Another option is to use something other than .000 when 
>>>>>>> the true value rounds to .000. Does it have to be 3 digits? 
>>>>>>> Maybe always print enough digits so you never end up with all 0's.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2018/05/04 2:08, Chris Plummer wrote:
>>>>>>>>> Hi Yasumasa,
>>>>>>>>>
>>>>>>>>> Here are my results. I ran the test on macosx-x64-debug 500 
>>>>>>>>> times using random machines. I saw the failure twice, both 
>>>>>>>>> times on different machines. I then ran 25 times on each of 
>>>>>>>>> the 3 macosx machines that have shown the failure, and did not 
>>>>>>>>> see it again. So I don't think this is necessarily an issue 
>>>>>>>>> that is more likely to turn up on one macosx machine than any 
>>>>>>>>> other. It's just very intermittent.
>>>>>>>>>
>>>>>>>>> I then ran 200 times each on all our supported platforms. So 
>>>>>>>>> that would be both debug and product builds on macosx, 
>>>>>>>>> linux-x64, solaris-sparc, and windows-x64. During those runs 
>>>>>>>>> it turned up once on linux-x64 (product, not debug), so this 
>>>>>>>>> issue does not seem to be limited to macosx.
>>>>>>>>>
>>>>>>>>> best regards,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 5/3/18 8:26 AM, Chris Plummer wrote:
>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>
>>>>>>>>>> I only see the one reported failure in our recent test 
>>>>>>>>>> history for this test, so it doesn't look like it happens 
>>>>>>>>>> every time. I'll try running the test on the machine it 
>>>>>>>>>> failed on, both with the binary it failed with and a more 
>>>>>>>>>> recent binary.
>>>>>>>>>>
>>>>>>>>>> best regards,
>>>>>>>>>>
>>>>>>>>>> chris
>>>>>>>>>>
>>>>>>>>>> On 5/3/18 4:48 AM, Yasumasa Suenaga wrote:
>>>>>>>>>>> FYI: jdk-11-ea+11 passed GcTest01.java
>>>>>>>>>>> https://download.java.net/openjdk/testresults/11/archives/11/diff-hotspot-10-11.txt 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2018/05/02 21:46, Yasumasa Suenaga wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> I've tried to fix 8202466, but I do not yet certain which 
>>>>>>>>>>>> solution is the best for it.
>>>>>>>>>>>> So I want your opinion for it.
>>>>>>>>>>>>
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202466
>>>>>>>>>>>>
>>>>>>>>>>>> David reports serviceability/tmtools/jstat/GcTest01.java 
>>>>>>>>>>>> failed in Mach5. It seems to appear on OS X only.
>>>>>>>>>>>> He has shared jtreg report, and I found as below:
>>>>>>>>>>>>
>>>>>>>>>>>> ```
>>>>>>>>>>>> stdout: S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC 
>>>>>>>>>>>> YGCT FGC FGCT CGC CGCT GCT
>>>>>>>>>>>>   0.0 0.0 0.0 0.0 3072.0 0.0 11264.0 2825.6 9472.0 9003.3 
>>>>>>>>>>>> 1024.0 845.3 26 0.204 3 0.292 1 0.000 0.497
>>>>>>>>>>>> ```
>>>>>>>>>>>>
>>>>>>>>>>>> YGCT: 0.204
>>>>>>>>>>>> FGCT: 0.292
>>>>>>>>>>>> CGCT: 0.000
>>>>>>>>>>>>
>>>>>>>>>>>>   GCT: 0.497
>>>>>>>>>>>>
>>>>>>>>>>>> I guess this failure was caused by rounding error because 
>>>>>>>>>>>> (YGCT + FGCT + CGCT) < GCT.
>>>>>>>>>>>> CGC is 1, so CGC operation might finish in very short time.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> GcTest01.java:
>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/4da7dce7e2bf/test/hotspot/jtreg/serviceability/tmtools/jstat/GcTest01.java#l57 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> GcCauseTest01.java:
>>>>>>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/4da7dce7e2bf/test/hotspot/jtreg/serviceability/tmtools/jstat/GcCauseTest01.java#l53 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> GcTest01 and GcCauseTest01 are very similar, but 
>>>>>>>>>>>> GcCauseTest01 passed.
>>>>>>>>>>>> Both tests use GcProvoker::provokeGc() to inflate memory 
>>>>>>>>>>>> usage. So I wonder why GcTest01 just only failed.
>>>>>>>>>>>> I guess we might encounter similar issue(s) in the future 
>>>>>>>>>>>> if we get more fast machines.
>>>>>>>>>>>>
>>>>>>>>>>>> Hence I think we can take two approaches as below:
>>>>>>>>>>>>
>>>>>>>>>>>>    1. Add all tests in serviceability/tmtools/jstat to 
>>>>>>>>>>>> ProblemList.
>>>>>>>>>>>>    2. Change all JstatGc*Tool to use custom jstat_options - 
>>>>>>>>>>>> to show raw values in PerfCounters
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> What do you think?
>>>>>>>>>>>> I can start to work for it if they are OK.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>
>>>>
>>
>>



More information about the serviceability-dev mailing list