CRR (XS): 7113006: G1: excessive ergo output when an evac failure happens

Wed Dec 28 17:38:59 UTC 2011

Ship it!

On 12/28/2011 8:49 AM, Tony Printezis wrote:
> Hi all,
>
> New webrev based on what I discussed with Jon below:
>
> http://cr.openjdk.java.net/~tonyp/7113006/webrev.1/
>
> Tony
>
> On 12/28/2011 11:06 AM, Tony Printezis wrote:
>> Hi Jon,
>>
>> Thanks for the prompt code review! Inline.
>>
>> On 12/28/2011 10:16 AM, Jon Masamitsu wrote:
>>> Tony,
>>>
>>> Change looks fine.   Is it possible for a second thread to come in 
>>> and take
>>> the new region just created by the expand?
>>
>> No. When a to-space region is replaced during a GC (whether it's 
>> survivor or old) we take the FreeList_lock (which guards the region 
>> free list). In the case of survivors, new_region() will (eventually) 
>> be called from attempt_allocation_locked():
>>
>> inline HeapWord* G1CollectedHeap::survivor_attempt_allocation(size_t
>>                                                               
>> word_size) {
>>   assert(!isHumongous(word_size),
>>          "we should not be seeing humongous-size allocations in this 
>> path");
>>
>>   HeapWord* result = 
>> _survivor_gc_alloc_region.attempt_allocation(word_size,
>>                                                       false /* 
>> bot_updates */);
>>   if (result == NULL) {
>>     MutexLockerEx x(FreeList_lock, Mutex::_no_safepoint_check_flag);
>>     result = 
>> _survivor_gc_alloc_region.attempt_allocation_locked(word_size,
>>                                                       false /* 
>> bot_updates */);
>>   }
>>   ...
>>
>> (for old regions, the code is similar to the above)
>>
>> Also note that the only time new_region() will try to expand the heap 
>> (i.e., do_expand == true) is when we allocate a to-space region 
>> during GC. That method is also used for mutator allocation regions 
>> but in that case do_expand == false. I'll actually put an extra 
>> assert to make sure that it's clear in the code.
>>
>>
>>> Not part of your change, but is this comment warranted?
>>>
>>>>   601       // Even though the heap was expanded, it might not have 
>>>> reached
>>>>   602       // the desired size. So, we cannot assume that the 
>>>> allocation
>>>>   603       // will succeed.
>>>
>>> I like comments but this one confused me a bit.  Where are we assuming
>>> that the allocation will succeed?
>>
>> Apologies, the comment is clearly unclear. :-) After expansion 
>> succeeds we call:
>>
>>       res = _free_list.remove_head_or_null();
>>
>> which will do what you expect (return the head or NULL). We could 
>> have called:
>>
>>       res = _free_list.remove_head();
>>
>> which assumes the free list head is non-NULL. So "we cannot assume 
>> that the allocation will succeed" refers to the use of the 
>> remove_head_or_null() method, instead of remove_head(). (As I said, 
>> you're totally right that this was very obscure.)
>>
>> Having said that, I think that if expansion succeeds the free list is 
>> also guaranteed to be non-empty. new_region() will always allocate a 
>> single region and we always expand the heap by an amount aligned to 
>> the heap region size (AFAIK at least). So, I _think_ using 
>> remove_head() there is probably OK. But I don't like tempting fate 
>> :-) and we just expanded the heap, so the extra check whether head is 
>> NULL or not is unlikely to cause any performance issues. So I'll 
>> leave the code as is and update the comment.
>>
>> I'll send out a new webrev shortly.
>>
>> Tony
>>
>>> Seems like we're just doing the necessary
>>> thing by getting a region off the freelist.  Yes, the result might 
>>> be null but
>>> we still need to try and get a region.
>>>
>>> Jon
>>>
>>>
>>> On 12/28/2011 4:04 AM, Tony Printezis wrote:
>>>> Hi all,
>>>>
>>>> Can I have a couple of code review for this very small change?
>>>>
>>>> http://cr.openjdk.java.net/~tonyp/7113006/webrev.0/
>>>>
>>>> The excessive ergo output was generated by repeated failed heap 
>>>> expansion attempts during a GC. I slightly expanded (pun 
>>>> unintended!) the scope of the CR to not only stop generating the 
>>>> ergo output after the first failed heap expansion but also to not 
>>>> re-attempt the heap expansion. If heap expansion fails once, it 
>>>> will probably fail again so it's pointless to keep trying. Without 
>>>> doing a careful and in-depth performance analysis I see a mild 
>>>> improvement in GC times when an alloc failure happens with this fix.
>>>>
>>>> Tony
>>>>