Request for code review - 8130265: gctests/LargeObjects/large001 fails with OutOfMemoryError: Java heap space

Mon Aug 24 12:55:22 UTC 2015

Hi,

On Fri, 2015-08-21 at 11:44 -0400, Alexander Harlap wrote:
> 
> 
> On 8/21/2015 8:18 AM, Jesper Wilhelmsson wrote:
> 
> > Hi Alex, 
> > 
> > Alexander Harlap skrev den 20/8/15 21:41: 
> > > Kim and Thomas, 
> > > 
> > > Here is new version with your suggestions in: 
> > > 
> > > http://cr.openjdk.java.net/~aharlap/8130265/webrev.03/ 
> > > <http://cr.openjdk.java.net/%7Eaharlap/8130265/webrev.03/> 
> > > Alex 
> > A few comments on this latest version. 
> > 
> > * Why do you assert_at_safepoint and set succeeded to true in your
> > helper? The original code did not do this at these places. I think
> > you can remove these lines. 
> > 
> > 1736   assert_at_safepoint(true /* should_be_vm_thread */); 
> > 1737 
> > 1738   *succeeded = true; 
> > 
> Even method called satisfy_failed_allocation_helper , there is no way
> to make sure that it is called only  from satisfy_failed_allocation.

I do not think we guarantee that for any method, i.e. that a given
method somehow checks its call chain. This would require much better
encapsulation, and in some cases this does not help you either. Also it
would require you to state the allowed callers somehow.

The helper may be private too, not protected.

> satisfy_failed_allocation_helper is a separate method and those lines
> are needed.
> So I would like to keep
> 1736   assert_at_safepoint(true /* should_be_vm_thread */); 
> 1737 
> 1738   *succeeded = true; 
> > 
> > * Please align the arguments in 
> > 
> > 1732 G1CollectedHeap::satisfy_failed_allocation_helper(size_t
> > word_size, 
> > 1733                                            AllocationContext_t
> > context, 
> > 1734                                            bool
> > clear_all_soft_refs, 
> > 1735                                            bool* succeeded) { 
> > 
> > and 
> > 
> > 1748   HeapWord* result =
> > attempt_allocation_at_safepoint(word_size, 
> > 1749                                            context, 
> > 1750                                            true /*
> > expect_null_mutator_alloc_region */); 
> > 
> > and also in the header file. 

Please also avoid method definitions like

<return value>
<class name>::<method-name> ...

(or even worse

<return value>
<class-name>::
<method-name>

(this change does not do it that way)
)
> > 
> Sure.
>
> > * Your last version did: 
> > 
> > attempt_allocation_at_safepoint(word_size, context, false); 
> > expand_and_allocate(word_size, context); 
> > do_collection(false, false, word_size); 
> > attempt_allocation_at_safepoint(word_size, context, true); 
> > expand_and_allocate(word_size, context); 
> > do_collection(false, true, word_size); 
> > attempt_allocation_at_safepoint(word_size, context, true); 
> > assert(!collector_policy()->should_clear_all_soft_refs()); 
> > expand_and_allocate(word_size, context); 
> > 
> > You have chosen to break this into: 
> > 
> > attempt_allocation_at_safepoint(word_size, context, false); 
> > expand_and_allocate(word_size, context); 
> > 
> > satisfy_failed_allocation_helper(word_size, context, false,
> > succeeded); 
> >    do_collection(false, false, word_size); 
> >    attempt_allocation_at_safepoint(word_size, context, true); 
> >    expand_and_allocate(word_size, context); 
> > 
> > satisfy_failed_allocation_helper(word_size, context, true,
> > succeeded); 
> >    do_collection(false, true, word_size); 
> >    attempt_allocation_at_safepoint(word_size, context, true); 
> >    expand_and_allocate(word_size, context); 
> > 
> > assert(!collector_policy()->should_clear_all_soft_refs()); 
> > 
> > This is not entirely identical, depending on how important the
> > assert is that could be an issue since the assert is now skipped in
> > some cases where it was previously used. 

The assert just states that after setting clear_all_soft_refs to true in
the call to do_collection, it must be false after collection.

So actually one could change the assert to

assert(!clear_all_soft_refs || !
collector_policy()->should_clear_all_soft_refs(), "...") and put it
directly in the helper.

Actually I think CollectorPolicy::clear_all_soft_refs() must be false
after any full gc, so it should be placed after the do_collection call.
> > 
> > My suggestion would be to break the code up like this: 
> > 
> > satisfy_failed_allocation_helper(word_size, context, do_gc,
> > clear_softrefs) 
> >    attempt_allocation_at_safepoint(word_size, context, false); 
> >    expand_and_allocate(word_size, context); 
> >    do_collection(false, false, word_size); 
> > 
> > satisfy_failed_allocation_helper(word_size, context, do_gc,
> > clear_softrefs) 
> >    attempt_allocation_at_safepoint(word_size, context, true); 
> >    expand_and_allocate(word_size, context); 
> >    do_collection(false, true, word_size); 
> > 
> > satisfy_failed_allocation_helper(word_size, context, do_gc,
> > clear_softrefs) 
> >    attempt_allocation_at_safepoint(word_size, context, true); 
> >    if (!do_gc) 
> >       assert(!collector_policy()->should_clear_all_soft_refs()); 
> >    expand_and_allocate(word_size, context); 
> > 
> > This would make the code in satisfy_failed_allocation() a lot more
> > clean. 
> > 
> In webrev.03 I decided to put same way as it was in the original code
> (before any of my changes), 
> at the very end of method just when we already exhausted all
> possibilities and about to return NULL.
> 
> attempt_allocation_at_safepoint(word_size, context, false); 
> expand_and_allocate(word_size, context); 
> do_collection(false, false, word_size); 
> attempt_allocation_at_safepoint(word_size, context, true); 
> do_collection(false, true, word_size); 
> attempt_allocation_at_safepoint(word_size, context, true); 
> assert(!collector_policy()->should_clear_all_soft_refs()); 
> 
>  // What else? We might try synchronous finalization later. If the
> total 1799 // space available is large enough for the allocation, then
> a more 
>  // complete compaction phase than we've tried so far might be 
> // appropriate. 
> assert(*succeeded, "sanity");
> return NULL; 1804 
> }
> 
> webrev.03 has this assert at same position, and I prefer it this way.
> 
> > 
> > * Unless I'm missing some subtle detail here, the usage of the
> > variable 'succeeded' is unnecessary as 'result' will always contain
> > the same information. I don't think we should pass on 'succeeded' to
> > the helper, but simply set it afterwards before we return from
> > satisfy_failed_allocation(). 

> No, helper may return NULL in two differnt cases:
> 1. Full GC failed, *succeeded = false, no attempts to allocate was
> made
> 2. Full GC succeeded, *succeeded = true, all attempts to allocate
> failed 
> 
> So we need to pass 'succeeded' to the helper.

Can we rename this to something more descriptive? I had to look into
do_collection() what it means. Even the gc_succeeded name in the helper
is better.

Also, the only condition where do_collection() fails is if the gc locker
is active. This return value is actually ignored in
do_full_collection().

So the gc locker check could be moved out of do_collection() (adding an
assert there), and the value of gc_succeeded/succeeded determined using
this check immediately.

At least, in the helper, move the "assert(*succeeded, ...)" up.

Like:
--------------

*succeeded = !GCLocker::check_active_before_gc(); // maybe use a temp
variable here, that is not negated because we check the negated result
again just later

if (!*succeeded) {
  return NULL;
}

do_collection(....);
assert(
   !collector_policy()->should_clear_all_soft_refs(), ...); // see above

assert(*succeeded, "..."); // for the paranoid, but I would just skip
it.

result = attempt_allocation_at_safepoint(...);

if (result != NULL) {
  return result;
}

result = expand_and_allocate(...);

return result;
---------------

(And the bool return value of do_collection removed).

I also have a question about the order of the order of
attempt_allocation_at_safepoint() and the expand_and_allocate(), and
could not find this answered before: so full gc might shrink the heap
too much after full gc, and we should expand the heap.
If I were to do this change, there would always (unconditionally) be an
expansion after full gc, because the current change only expands if the
allocation after gc fails. Also the 02 change.

The review thread so far suggests that full gc might shrink too much,
but this still seems to be the case here in some cases.

Thanks,
  Thomas