Request for review (M): 8002144: G1: large number of evacuation failures may lead to large c heap memory usage

Wed Feb 6 14:46:57 UTC 2013

Hi everyone,

An updated webrev. Using the existing Stack<> data structure rather than 
introducing a new one. Thanks John Coomes for pointing this out!

http://cr.openjdk.java.net/~brutisso/8002144/webrev.02/

With this fix I have run several hundreds of iterations of the 
ManyObjects test without any native OOME.

I also noticed the PreservedMark class that is being used by 
PSMarkSweep. It would probably be good to make all collectors use this, 
but I think I'll file a separate RFE for that.

Thanks,
Bengt

On 2/5/13 10:21 PM, Bengt Rutisson wrote:
> On 2/5/13 10:13 PM, John Coomes wrote:
>> Bengt Rutisson (bengt.rutisson at oracle.com) wrote:
>>> Hi John,
>>>
>>> Thanks for looking at this!
>>>
>>> On 2/4/13 10:18 PM, John Coomes wrote:
>>>> Bengt Rutisson (bengt.rutisson at oracle.com) wrote:
>>>>> ...
>>>> It doesn't appear that G1 needs to process these in FIFO order.  If
>>>> not and LIFO order is ok, you can use the existing Stack<> template
>>>> which is a chunked stack with chunks a page in size (by default).  The
>>>> other collectors use this, e.g.,
>>>>
>>>>     Stack<markOop, mtGC> PSScavenge::_preserved_mark_stack;
>>>>     Stack<oop, mtGC> PSScavenge::_preserved_oop_stack;
>>>>
>>>> It would be nice to avoid a new data structure if a Stack will do.
>>>>
>>>> If FIFO order really is needed, consider adding a more generic data
>>>> structure (e.g., ChunkedQueue<T>) and making G1PreserveMarkQueue a
>>>> typedef or simple wrapper.
>>> I agree that FIFO order is probably not needed. The reason I preserved
>>> the FIFO order was mostly to reduce the risk of introducing 
>>> regressions.
>>> So, I think using the Stack data structure should be fine. I'll change
>>> my patch to do that instead and try it out. Thanks for pointing me to
>>> the Stack!
>>>
>>> I've been at JFokus all day today and I'll be there tomorrow as well.
>>> I'll send out an updated webrev as soon as I get a chance to try it 
>>> out.
>>>
>>> Any idea why G1 used the GrowableArray in the first place when the 
>>> other
>>> GCs use Stack?
>> All the GCs used GrowableArray until Stack<> was added for 6423256,
>> and a number of GrowableArrays were converted as part of that fix.  I
>> checked some saved email and notes but didn't find any clues as to why
>> these weren't changed at the same time.
>
> Thanks for digging around to find the bug. It definitely sounds like 
> that bug addresses the same issue that I see now in G1. I'll put 
> together a fix using Stack<> instead of the GrowableArray and see if 
> that passes my tests.
>
> Bengt
>
>>
>> -John
>>
>>>>> Without this fix I get native out of memory about every three runs 
>>>>> of the test.
>>>>> With this fix the test has been running for several days and more 
>>>>> than 5600
>>>>> iterations.
>>>>>
>>>>> The chunk size is variable but has a max limit. I use 40 entries 
>>>>> as initial
>>>>> size since this is what the GrowableArrays used. I picked 10000 as 
>>>>> the maximum
>>>>> size. The value 10000 can probably be tuned further, but 100000 
>>>>> was too much (I
>>>>> still got native OOME) and 10000 works fine.
>>>>>
>>>>> I have been comparing GC pause times with and without my fix for the
>>>>> ManyObjects test. I don't see any large differences in the pause 
>>>>> times. This
>>>>> will only affect performance for runs that have a lot of 
>>>>> evacuation failures.
>>>>> These runs will benefit form the fact that we don't have to do as 
>>>>> much copying
>>>>> as before, but they will also do several more mallocs compared to 
>>>>> before my
>>>>> fix. The runs I've done indicate that this evens out. I can't see 
>>>>> any large
>>>>> differences.
>>>> I hope we don't start caring about performance when there are many
>>>> evacuation failures :-).
>>>>
>>>> -John
>