RFR (trivial): 8214217: [TESTBUG] runtime/appcds/LotsOfClasses.java failed on solaris sparcv9

Jiangli Zhou jiangli.zhou at oracle.com
Mon Nov 26 23:55:15 UTC 2018


Hi Ioi,

On 11/26/18 3:35 PM, Ioi Lam wrote:

> As I commented on the bug report, we should improve the error message. 
> Also, maybe we can force GC to allow the test to run with less heap.

Updating the error message sounds good to me.
>
> A 3GB heap seems excessive. I was able to run the test with -Xmx256M 
> on Linux.

Using a small heap (with only a little extra headroom) might still run 
into the issue in the future. As I pointed out, alignment and GC 
activity are also factors, and allocation sizes might change in the 
future as well.

An alternative approach is to fix the test to recognize the 
fragmentation issue and not report a failure in that case. I'm now in 
favor of that approach since it's more flexible. We can also safely set 
a smaller heap size (such as 256M) in the test.
>
> Also, I don't understand what you mean by "all observed allocations 
> were done in the lower 2G range.". Why would heap fragmentation be 
> related to the location of the heap?

In my test run, only the heap regions in the lower 2G heap range were 
used for object allocations. It's not related to the heap location.

Thanks,
Jiangli
>
> Thanks
>
> - Ioi
>
>
> On 11/26/18 3:23 PM, Jiangli Zhou wrote:
>> Hi Ioi,
>>
>>
>> On 11/26/18 2:00 PM, Ioi Lam wrote:
>>> Hi Jiangli,
>>>
>>> -Xms3G will most likely fail on 32-bit platforms.
>>
>> We can make the change for 64-bit platforms only, since the problem is 
>> 64-bit specific. We do not archive Java objects on 32-bit platforms.
>>>
>>> BTW, why would this test fail only on Solaris and not linux? The 
>>> test doesn't specify heap size, so the initial heap size setting is 
>>> picked by Ergonomics. Can you reproduce the failure on Linux by 
>>> using the same heap size settings used by the failed Solaris runs?
>>
>> The failed Solaris run didn't set the heap size explicitly. The heap 
>> size was determined by GC ergonomics, as you pointed out above. I ran 
>> the test this morning on the same solaris sparc machine using the same 
>> binary that was reported for the issue. In my test run, a very large 
>> heap (>26G) was used according to the GC region logging output, so the 
>> test didn't run into the heap fragmentation issue. All observed 
>> allocations were done in the lower 2G range.
>>
>> I don't think it is a Solaris-only issue. If the heap size is small 
>> enough, you could run into the issue on any supported platform. The 
>> issue could appear intermittent due to alignment and GC activity, even 
>> with the same heap size at which the failure was reported.
>>
>> On a linux-x64 machine, I can force the test to fail with the 
>> fragmentation error using a 200M Java heap.
>>>
>>> I think it's better to find out the root cause than just to mask it. 
>>> The purpose of LotsOfClasses.java is to stress the system to find 
>>> out potential bugs.
>>
>> I think this is a test issue, not a CDS/GC issue. The test loads 
>> >20000 classes but doesn't set the Java heap size. Relying on GC 
>> ergonomics to determine the 'right' heap size is incorrect in this 
>> case, since dumping objects requires consecutive GC regions. 
>> Specifying the GC heap size explicitly doesn't 'mask' the issue; it 
>> is the right thing to do. :)
>>
>> Thanks,
>> Jiangli
>>
>>>
>>> Thanks
>>>
>>> - Ioi
>>>
>>>
>>> On 11/26/18 1:41 PM, Jiangli Zhou wrote:
>>>> Please review the following test fix, which sets the Java heap size 
>>>> to 3G for dumping with a large number of classes.
>>>>
>>>>   webrev: http://cr.openjdk.java.net/~jiangli/8214217/webrev.00/
>>>>
>>>>   bug: https://bugs.openjdk.java.net/browse/JDK-8214217
>>>>
>>>> Tested with tier1 and tier3. Also ran the test 100 times on 
>>>> solaris-sparcv9 via mach5.
>>>>
>>>> Thanks,
>>>>
>>>> Jiangli
>>>>
>>>
>>
>


