RFR (M): 8027959: Investigate early reclamation of large objects in G1

Tue Jul 22 11:28:15 UTC 2014

Hi Thomas,

On Monday 21 July 2014 15.20.57 Thomas Schatzl wrote:
> Hi Mikael,
> 
>   thanks for the review, and finding these issues!
> 
> On Thu, 2014-07-17 at 16:52 +0200, Mikael Gerdin wrote:
> > Hi Thomas,
> > 
> > On Tuesday 15 July 2014 11.10.53 Thomas Schatzl wrote:
> > > Hi all,
> > > 
> > >   could I have reviews for the following change that allows G1 to
> > > 
> > > eagerly/early reclaim humongous objects on every GC?
> > > 
> > > Problem:
> > > 
> > > In G1 large objects are always allocated in the old generation,
> > > currently requiring a complete heap liveness analysis (full gc, marking)
> > > to reclaim them.
> > > [...]
> > > The CR contains a graph showing large improvements on average humongous
> > > object reclamation delay. In total we have seen some benchmarks
> > > reclaiming GBs of heap space over time using this functionality (instead
> > > of waiting for the marking/full GC). This improves throughput
> > > significantly as there is more space available for the young gen on
> > > average now.
> > > 
> > > Also it might avoid users to manually increase heap region sizes just to
> > > avoid humongous object troubles.
> > > 
> > > CR:
> > >  https://bugs.openjdk.java.net/browse/JDK-8027959
> > > 
> > > Webrev:
> > >  http://cr.openjdk.java.net/~tschatzl/8027959/webrev/
> > 
> > g1CollectedHeap.hpp:
> > 
> > Most of the new functions you add for humongous regions operate on region
> > indices, except for
> > +  bool humongous_region_is_always_live(HeapRegion* region);
> > Is there any reason for it not operating on a region index as well? I
> > think
> > that would make the API nice and consistent.
> 
> Fixed, although internally the code will immediately convert to
> HeapRegion* anyway.

Of course, it looks a tiny bit cleaner :)

> 
> > g1CollectedHeap.cpp:
> > 
> > This comment is pretty confusing to me, can you try to rephrase it?
> > 3798     if (is_candidate) {
> > 3799       // Do not even try to reclaim a humongous object that we
> > already
> > know will
> > 3800       // not be treated as live later. A young collection will not
> > decrease the
> > 3801       // amount of remembered set entries for that region.
> 
> Hopefully fixed (changed to):
> 
> +// Is_candidate already filters out humongous regions with some
> remembered set.
> +// This will not lead to humongous object that we mistakenly keep alive
> because +// during young collection the remembered sets will only be added
> to.

Ok

> > forwardee can only be null here for humongous objects, correct?
> > Would it make sense to add an else-clause asserting that to keep some of
> > the old assert functionality?
> > 4674     if (forwardee != NULL) {
> > 4675       oopDesc::encode_store_heap_oop(p, forwardee);
> > 4676       if (do_mark_object != G1MarkNone && forwardee != obj) {
> > 4677         // If the object is self-forwarded we don't need to
> > explicitly
> > 4678         // mark it, the evacuation failure protocol will do so.
> > 4679         mark_forwarded_object(obj, forwardee);
> > 4680       }
> > 4681
> > 4682       if (barrier == G1BarrierKlass) {
> > 4683         do_klass_barrier(p, forwardee);
> > 4684       }
> > 4685       needs_marking = false;
> > 4686     }
> 
> I refactored the change so that the closure do_oop_work() methods
> directly use the value from the in_cset table, so this is obsolete now.
> 
> > g1OopClosures.inline.hpp
> > 
> > Are you sure that the change to is_in_cset_or_humongous is correct here?
> > I think that when FilterIntoCSClosure is used for card refinement we are
> > not actually interested in references to humongous objects since they
> > don't move. So this could cause us to needlessly add cards to the
> > into_cset_dcq.
> > 
> > However in the case of scanning remembered sets it's the new check is
> > probably exactly what you are looking for.
> > 
> >   43 template <class T>
> >   44 inline void FilterIntoCSClosure::do_oop_nv(T* p) {
> >   45   T heap_oop = oopDesc::load_heap_oop(p);
> >   46   if (!oopDesc::is_null(heap_oop) &&
> >   47       _g1-
> > >
> > >is_in_cset_or_humongous(oopDesc::decode_heap_oop_not_null(heap_oop))) {
> > >
> >   48     _oc->do_oop(p);
> >   49   }
> >   50 }
> 
> I think it does not hurt too much.

I agree.

> 
> > Perhaps this should be left for a further cleanup, one approach could be
> > to
> > fold the logic using is_in_cset_or_humongous into HeapRegion_DCTOC since
> > scanCard seems to be the only place where that closure is used.
> 
> I agree.
> 
> > Previsously the else-clause here would cause a remembered set entry of the
> > region containing "obj" to be added, to remember the reference "p"
> > Now that you're removing humongous objects from this consideration, can we
> > miss remembered set entries for humongous regions?
> > 
> >   65 inline void G1ParScanClosure::do_oop_nv(T* p) {
> >   66   T heap_oop = oopDesc::load_heap_oop(p);
> >   67
> >   68   if (!oopDesc::is_null(heap_oop)) {
> >   69     oop obj = oopDesc::decode_heap_oop_not_null(heap_oop);
> >   70     if (_g1->is_in_cset_or_humongous(obj)) {
> >   71       // We're not going to even bother checking whether the object
> >   is
> >   72       // already forwarded or not, as this usually causes an
> >   immediate
> >   73       // stall. We'll try to prefetch the object (for write, given
> >   that
> >   74       // we might need to install the forwarding reference) and we'll
> >   75       // get back to it when pop it from the queue
> >   76       Prefetch::write(obj->mark_addr(), 0);
> >   77       Prefetch::read(obj->mark_addr(), (HeapWordSize*2));
> >   78
> >   79       // slightly paranoid test; I'm trying to catch potential
> >   80       // problems before we go into push_on_queue to know where the
> >   81       // problem is coming from
> >   82       assert((obj == oopDesc::load_decode_heap_oop(p)) ||
> >   83              (obj->is_forwarded() &&
> >   84                  obj->forwardee() ==
> >   oopDesc::load_decode_heap_oop(p)),
> >   85              "p should still be pointing to obj or to its
> >   forwardee");
> >   86
> >   87       _par_scan_state->push_on_queue(p);
> >   88     } else {
> >   89       _par_scan_state->update_rs(_from, p, _worker_id);
> >   90     }
> 
> Fixed.
> 
> > I also somewhat agree with Bengt's opinion about explicitly checking for
> > humongous objects.
> 
> Done.
> 
> There were some more changes that are related to Bengt's suggestions. I
> will answer there.
> 
> Diff to last:
> http://cr.openjdk.java.net/~tschatzl/8027959/webrev.0_to_1
> Complete:
> http://cr.openjdk.java.net/~tschatzl/8027959/webrev.1

I had a look through the entirety of webrev.1 again.

g1CollectedHeap.hpp:
The comment before G1FastCSetBiasedMappedArray about failing TLAB allocations 
is no longer relevant, since the code has been changed to explicitly handle 
IsHumongous, right?.

The enum in_cset_state_t seems like it should live in G1CollectedHeap instead 
of in the scope of the datastructure used to maintain the information.
Not only will that reduce the length of 
G1FastCSetBiasedMappedArray::InCSet to
G1CollectedHeap::InCSet
but I think that callers of G1CollectedHeap::in_cset_state should not be 
required to know that it is in fact backed by a G1FastCSetBIasedMappedArray 
and that the values' enum is scoped inside that class.

g1CollectedHeap.cpp:
line 5535 contains trailing white space

TestEagerReclaimHumongousRegions.java:
The test looks good to me, but as you know anything may happen in the testing 
lab :)

I think the rest of the change looks fine.

/Mikael

> 
> Thanks,
>   Thomas