RFR (M): 8136681: Factor out IHOP calculation from G1CollectorPolicy

Jon Masamitsu jon.masamitsu at oracle.com
Mon Nov 16 22:21:00 UTC 2015



On 11/16/2015 05:31 AM, Thomas Schatzl wrote:
> Hi Jon,
>
>    thanks a lot for all these reminders for better documentation. I have
> been working on this functionality for so long that "everything is
> clear" to me :)
>
> New webrevs with hopefully more complete explanations at:
> http://cr.openjdk.java.net/~tschatzl/8136681/webrev.1_to_2/
> (incremental)
> http://cr.openjdk.java.net/~tschatzl/8136681/webrev.2/ (changes)
http://cr.openjdk.java.net/~tschatzl/8136681/webrev.1_to_2/src/share/vm/gc/g1/g1IHOPControl.hpp.frames.html

35 // The initial IHOP value relative to the target occupancy.
36 double _initial_ihop_percent;


Does _initial_ihop_percent ever change?  That is, is it a value that is
saved when the control is initialized and then stays fixed?

http://cr.openjdk.java.net/~tschatzl/8136681/webrev.1_to_2/src/share/vm/gc/g1/g1CollectedHeap.cpp.frames.html

Can you check whether there is any live data in the region?  An empty
region should not be counted as old-allocated, should it?

5643 if (cur->is_young()) {
5644 policy->add_last_old_allocated_bytes(HeapRegion::GrainBytes);
5645 }

Most of the changes looked like they were in response to Mikael's comments
and I'm assuming he'll know what needed to be done there.

Jon


>
>
> On Fri, 2015-11-13 at 07:58 -0800, Jon Masamitsu wrote:
>> Thomas,
>>
>> This is partial.  If you send out a second webrev based on Mikael's
>> review, I'll finish with that.
>>
>>
>> http://cr.openjdk.java.net/~tschatzl/8136681/webrev/src/share/vm/gc/g1/g1CollectedHeap.hpp.frames.html
>>
>>> 1370 // Returns the number of regions the humongous object of the
>>> given word size
>>> 1371 // covers.
>> "covers" is not quite right since to me it says that the humongous
>> object completely uses the
>> region.  I'd use "requires".
> Fixed.
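(For reference, my understanding is that this count is simply the object's
word size divided by HeapRegion::GrainWords, rounded up.)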
>
>> http://cr.openjdk.java.net/~tschatzl/8136681/webrev/src/share/vm/gc/g1/g1IHOPControl.hpp.html
>>
>>     49   // Update information about recent time during which allocations happened,
>>     50   // how many allocations happened and an additional safety buffer.
>>
>> // Update information about
>>
>> I DON'T KNOW WHICH OF THESE IS MORE PRECISE.
>> //   Time during which allocations occurred (sum of mutator execution time + GC pause times)
>> OR
>> //   Concurrent marking time (concurrent mark end - concurrent mark start)
>>
>> //   Allocations in bytes during that time
>> //   Safety buffer ???
>>
>> I couldn't figure out what the safety buffer is supposed to be.  It seems to
>> be the young gen size but I don't know why.
> I tried to explain it in the text. In short, in G1 the IHOP value is
> based on old gen occupancy only. The problem is that the young gen
> needs to be allocated somewhere too.
>
> Now you could just say, use the maximum young gen size. However, that is
> 60% of the heap by default... so the adaptive IHOP algorithm uses a
> measure of the young gen that is not bounded by G1ReservePercent.
>
> The reason to use the unbounded value is that the bounded one would
> cancel out with G1ReservePercent: the closer we get to G1ReservePercent,
> the smaller that bounded value would get, which would make the current
> IHOP value rise, which in turn would delay the initiation of marking.
>
> That would end up losing throughput: as the young gen gets smaller
> and smaller (and GC frequency increases), it can take a long time until
> G1 gets close enough to G1ReservePercent for the other factors
> (allocation rate, marking time) to take over.
>
> Basically, initial mark will be delayed until the young gen reaches its
> minimum size, at which time G1 will continue to use that young gen size
> until marking occurs. This means that typically G1 will eat into
> G1ReservePercent, which we also do not want.
>
> Additionally, it would get G1 into more trouble with regard to pause
> times, giving it less room during that time.
>
> Note that G1 already has some issues with young gen sizing if it
> operates close to G1ReservePercent. If there is time I will look at it
> again later.
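If I follow, the adaptive threshold then ends up being computed roughly along
these lines (the names below are made up, just to restate my understanding;
this is not the actual code):

    #include <stddef.h>  // size_t

    // Sketch only: leave room for the allocation expected while marking runs,
    // plus a young gen estimate that is deliberately NOT clamped by
    // G1ReservePercent, so the two do not cancel each other out.
    static size_t sketch_adaptive_threshold(size_t target_occupancy,
                                            double alloc_rate_bytes_per_s,
                                            double predicted_marking_time_s,
                                            size_t unrestrained_young_size_bytes) {
      size_t predicted_alloc = (size_t)(alloc_rate_bytes_per_s * predicted_marking_time_s);
      size_t buffer = predicted_alloc + unrestrained_young_size_bytes;
      return (target_occupancy > buffer) ? (target_occupancy - buffer) : 0;
    }

Is that about right?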
>
>>     33 // Manages the decision about the threshold when concurrent marking should start.
>>
>> // IHOP = Initiating heap occupancy in percent of the entire heap
>>
>>     34 class G1IHOPControl : public CHeapObj<mtGC> {
>>     35  protected:
>>     36   double _ihop_percent;
>> // Occupancy in bytes.  This is always less than or equal to the (maximum or current maximum?) heap occupancy
>>     37   size_t _target_occupancy;
>>
>>
>> 	// Most recent complete allocation period (time from first young-only GC to
>> 	// end of marking) in seconds.  Any period which is interrupted by a full GC
>> 	// is ignored.
>>     65    _last_allocation_time_s(0.0),
>> 	// Number of bytes allocated during that period (mutator allocation + promotions)
>>     66    _last_allocated_bytes(0),
>> 	// Length of most recent marking cycle in seconds
>>     67    _last_marking_length_s(0.0) { }
>>
>> 	// Occupancy in bytes at which a concurrent marking cycle is started. This count
>> 	// includes the waste in regions occupied by humongous objects and waste at
>> 	// the end of any old region that is not currently being used for promotions
>> 	// and the waste at the end of survivor regions.
> I skipped that paragraph about which waste (e.g. at the end of TLABs or
> humongous regions) is counted, and which is not. The IHOP calculation
> needs to consider actual free space for allocation.
>
> Unfortunately G1 is in two minds about this, i.e. used() for humongous
> objects does not contain the "waste" at the end of the last region, but
> used() for regular regions does.
>
>>     69   size_t get_conc_mark_start_threshold() { return (size_t) (_ihop_percent * _target_occupancy / 100.0); }
>>
>> This might be better named update_marking_length() but can be done in a later cleanup.
>>
> Done.
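(Just to make the formula concrete: with e.g. _ihop_percent == 45 and a
target occupancy of 8 GB, get_conc_mark_start_threshold() returns
0.45 * 8 GB = 3.6 GB.)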
>
> Thanks a lot for all these reminders for documentation. I have been
> working on this functionality for so long that "everything is clear" to
> me :)
>
> New webrevs with hopefully more complete explanations at:
> http://cr.openjdk.java.net/~tschatzl/8136681/webrev.1_to_2/
> (incremental)
> http://cr.openjdk.java.net/~tschatzl/8136681/webrev.2/ (changes)
>
>>     	// Update marking length (concurrent marking end - concurrent marking start)
>>     77   virtual void update_time_to_mixed(double marking_length_s) {
>>
>> Have you given much thought to what effect a to-space exhaustion should
>> have on IHOP?  I understand that it is not the design center for this, but
>> I think that to-space exhaustion can confuse the statistics.  Maybe reset
>> the statistics and drop the IHOP to a small value (current heap occupancy
>> or even 0) until you get 3 successful marking cycles.  Or think about it
>> later.
> I already thought a little about how this should interact with the
> calculation: the idea is that the algorithm will notice that there is a
> significant amount of additional allocation, and will lower the IHOP
> threshold automatically. (Looking through the code I think I saw some
> problems here, I will see to fixing that.)
>
> If evacuation failure happens in a GC during marking, there are a few
> options; I have not decided what's best:
>
> - do nothing, because the user asked to run an impossible-to-manage
> workload (combination of live set, allocation rate, and other options).
>
>    - there is already some log output from which that information can be
> derived.
>
> - allow the user to set a maximum IHOP threshold. He could base this
> value on the log messages he gets.
>    - the user can already do that by increasing G1ReservePercent btw
>
> - make sure that marking completes in time
>
>    - let the mutator threads (or young GCs, while marking is running) do
> some marking if we notice that we do not have enough time. Not sure if
> it is worth the effort.
>
>    - fix some bugs in marking :) that prevent it from completing in time
> under extraordinary conditions.
>
>    - make sure that we always start marking early enough by making sure
> that mixed gc reclaims enough memory. Planning some work here as part of
> generic work on improving gc policy.
>
>    - improve predictions: some work being done here.
>
> - make evacuation failure really fast and hope it works out
>
>    - do not immediately move young gen regions with evacuation failures
> into the old gen, but make them survivor regions instead.
>
>    - there have already been lots of improvements here as you might have
> noticed.
>
> Thanks,
>    Thomas
>
>
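Regarding the "maximum IHOP threshold" option above: G1ReservePercent is an
existing product flag, so increasing the reserve today is just a matter of,
e.g. (the value is only illustrative)

    java -XX:+UseG1GC -XX:G1ReservePercent=20 ...

whereas a cap on the adaptive IHOP value itself would be a new flag.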



