RFR (M): 8136681: Factor out IHOP calculation from G1CollectorPolicy

Fri Nov 20 16:26:47 UTC 2015

On 2015-11-16, Thomas Schatzl wrote:
> Hi Jon,
> 
>   thanks a lot for all these reminders for better documentation. I have
> been working too long on this functionality so that "everything is
> clear" to me :)
> 
> New webrevs with hopefully more complete explanations at:
> http://cr.openjdk.java.net/~tschatzl/8136681/webrev.1_to_2/
> (incremental)
> http://cr.openjdk.java.net/~tschatzl/8136681/webrev.2/ (changes)

Hi Thomas,

here comes a small partial review:

g1IHOPControl.cpp:
  109 #endif
  110 
  111 #ifndef PRODUCT
- Please continue using the existing #ifndef

  82   size_t threshold;
  83   
  84   threshold = ctrl.get_conc_mark_start_threshold();
- Please initialize the variable when you declare it

Other:
- Maybe rename _last_allocated_old_bytes to
  _bytes_allocated_in_old_since_last_gc?

Thanks,
Erik

> 
> On Fri, 2015-11-13 at 07:58 -0800, Jon Masamitsu wrote:
> >Thomas,
> > 
> > This is partial.  If you send out a second webrev based on Mikael's
> > review, I'll finish with that.
> > 
> > 
> > http://cr.openjdk.java.net/~tschatzl/8136681/webrev/src/share/vm/gc/g1/g1CollectedHeap.hpp.frames.html
> > 
> > > 1370 // Returns the number of regions the humongous object of the 
> > > given word size
> > > 1371 // covers.
> > 
> > "covers" is not quite right since to me it says that the humongous 
> > object completely uses the
> > region.  I'd use "requires".
> 
> Fixed.
> 
> > http://cr.openjdk.java.net/~tschatzl/8136681/webrev/src/share/vm/gc/g1/g1IHOPControl.hpp.html
> > 
> >    49   // Update information about recent time during which allocations happened,
> >    50   // how many allocations happened and an additional safety buffer.
> > 
> > // Update information about
> > 
> > I DON'T KNOW WHICH OF THESE IS MORE PRECISE.
> > //   Time during which allocations occurred (sum of mutator execution time + GC pause times)
> > OR
> > //   Concurrent marking time (concurrent mark end - concurrent mark start)
> > 
> > //   Allocations in bytes during that time
> > //   Safety buffer ???
> > 
> > I couldn't figure out what the safety buffer is supposed to be.  It seems to
> > be the young gen size but don't know why.
> 
> I tried to explain in in the text. In short, in G1 the IHOP value is
> based on old gen occupancy only. The problem is that the young gen also
> needs to be allocated somewhere too.
> 
> Now you could just say, use the maximum young gen size. However this is
> 60% of the heap by default... so the adaptive IHOP algorithm uses a
> measure of the young gen that is not bounded by G1ReservePercent.
> 
> The reason to use the unbounded value is because if the code used the
> bounded one, it would cancel out with G1ReservePercent, because the
> closer we get to G1ReservePercent, the smaller that bounded value would
> get, which would make the current IHOP value rise etc, which would delay
> the initiation of the marking.
> 
> That would end up loosing throughput as then the young gen gets smaller
> and smaller (and GC frequency increases), it can take a long time until
> G1 gets close enough to G1ReservePercent so that the other factors
> (allocation rate, marking time) are used.
> 
> Basically initial mark will be delayed until young gen reaches its
> minimum size, at which time G1 will continue to use that young gen size
> until marking occurs. Which means that typically G1 will eat into
> G1ReservePercent, which we also do not want.
> 
> Additionally it would get G1 more in trouble in regards to pause time,
> giving it less room during that time.
> 
> Note that G1 young gen sizing already has some issues with young gen
> sizing if it operates close to G1ReservePercent. If there is time I will
> look at it again later.
> 
> >    33 // Manages the decision about the threshold when concurrent marking should start.
> > 
> > // IHOP = Initiating heap occupancy in percent of the entire heap
> > 
> >    34 class G1IHOPControl : public CHeapObj<mtGC> {
> >    35  protected:
> >    36   double _ihop_percent;
> > // Occupancy in bytes.  This is always less than or equal to the (maximum or current maximum?) heap occupancy
> >    37   size_t _target_occupancy;
> > 
> > 
> > 	// Most recent complete allocation period (time from first young-only GC to
> > 	// end of marking) in seconds.  Any period which is interrupted by a full GC
> > 	// is ignored.
> >    65    _last_allocation_time_s(0.0),
> > 	// Number of bytes allocated during that period (mutator allocation + promotions)
> >    66    _last_allocated_bytes(0),
> > 	// Length of most recent marking cycle in seconds
> >    67    _last_marking_length_s(0.0) { }
> > 
> > 	// Occupany in bytes at which a concurrent marking cycle is started. This count
> > 	// includes the waste in regions occupied by humongous objects and waste at
> > 	// the end of any old region that is not the currently being used for promotions
> > 	// and the waste at the end of survivor regions.
> 
> I skipped that paragraph about which waste e.g. at end of TLABs or
> humongous regions is used, and which not. IHOP calculation needs to
> consider actual free space for allocation.
> 
> Unfortunately G1 is in two minds about this, i.e. used() for humongous
> objects does not contain the "waste" at the end of the las region, but
> used() for regular regions does.
> 
> > 
> >    69   size_t get_conc_mark_start_threshold() { return (size_t) (_ihop_percent * _target_occupancy / 100.0); }
> > 
> > This might be better named update_marking_length() but can be done in a later cleanup.
> > 
> 
> Done.
> 
> Thanks a lot for all these reminders for documentation. I have been
> working too long on this functionality so that "everything is clear" to
> me :)
> 
> New webrevs with hopefully more complete explanations at:
> http://cr.openjdk.java.net/~tschatzl/8136681/webrev.1_to_2/
> (incremental)
> http://cr.openjdk.java.net/~tschatzl/8136681/webrev.2/ (changes)
> 
> >    	// Update marking length (concurrent marking end - concurret marking start)
> >    77   virtual void update_time_to_mixed(double marking_length_s) {
> > 
> > Have you given much thought to what affect a to-space exhaustion should 
> > have on
> > IHOP?  I understand that it is not the design center for this but I 
> > think that to-space
> > exhaustion can confuse the statistics.   Maybe a reset of the statistics 
> > and a dropping
> > the IHOP to a small value (current heap occupancy or even 0) until you 
> > get 3 successful
> > marking cycles.  Or think about it later.
> 
> I already thought a little about how this should interact with regards
> to the calculation: the idea is that basically the algorithm will notice
> that there is a significant amount of additional allocation, and will
> lower the IHOP threshold automatically. (Looking through the code I
> think I saw some problems here, I will see to fix that)
> 
> If evacuation failure happens in a gc during marking, there are a few
> options, I have not decided on what's best:
> 
> - do nothing because the user asked to run an impossible to manage
> workload (combination of live set, allocation rate, and other options).
> 
>   - there is already some log output which that information can be
> derived from.
> 
> - allow the user to set a maximum IHOP threshold. He could base this
> value on the log messages he gets.
>   - the user can already do that by increasing G1ReservePercent btw
> 
> - make sure that marking completes in time
> 
>   - let the mutator threads (or during young gcs while marking is
> running) do some marking if we notice that we do not have enough time.
> Not sure if it is worth the effort.
> 
>   - fix some bugs in marking :) that prevent that in extraordinary
> conditions.
> 
>   - make sure that we always start marking early enough by making sure
> that mixed gc reclaims enough memory. Planning some work here as part of
> generic work on improving gc policy.
> 
>   - improve predictions: some work being done here.
> 
> - make evacuation failure really fast and hope it works out
> 
>   - do not immediately move young gen regions with evacuation failures
> into young gen regions, but make them survivor.
> 
>   - there have already been lots of improvements here as you might have
> noticed.
> 
> Thanks,
>   Thomas
> 
>