RFR: 8244505: G1 pause time ratio calculation does not consider Remark/Cleanup pauses

Ivan Walulya iwalulya at openjdk.java.net
Wed Sep 16 10:07:11 UTC 2020


On Tue, 15 Sep 2020 16:48:53 GMT, Ivan Walulya <iwalulya at openjdk.org> wrote:

>> The computation for long and short term pause time ratios does not include Remark/Cleanup pause times. Isolate updates
>> to _prev_collection_pause_end_ms from update_recent_gc_times(), then have all gc pauses (including Remark/Cleanup)
>> recorded by update_recent_gc_times(). Patch also includes trivial clean ups to g1Policy.* files. Testing: Tier 1 - 7
>> Performance testing did not show any significant changes in performance
>
> Thanks @tschatzl for the initial comments, I will split the changes.

> 
> - I'm worried that the assignment of "update_stats" in
> G1Policy::record_collection_pause_end() and G1Policy::record_pause()
> would get out of sync. It would be nice if the change either passed it
> to G1Policy::record_pause(), or create a new member function that
> returns that and is used in both places.

Will create a helper function to sync usage of 'update_stats'

> 
> - update_gc_pause_time_ratios() does not need to be public afaict

Right, this will be change

> 
> - I'm okay that now full gcs are considered in the pause time ratio
> calculation.
> This may cause (with JDK-8238687) that the heap will be kept larger
> after the full gc, which okay with me for naturally occurring full gcs.
> 
> One concern I have are full gcs caused by System.gc() calls - one of
> their purposes is to compact the heap, but now they will have impact
> beyond that one. Maybe try to exclude those from the gc time ratio
> calculation?
> 
> I think the previous (undocumented) exclusion of all full gcs has been
> to not overcomplicate the code assuming that most if any full gcs with
> G1 should be System.gcs() for heap compaction purposes.
> 
> Ivan, what do you think?

Right,  Full-gcs were excluded from the compute_pause_time_ratios immediately after the full-gc, but not from the
successive computations. Bug/Enhancement will be filed.

> 
> - I think the change in G1Analytics::compute_pause_time_ratios() is
> 
> double long_interval_ms = (end_time_sec -
> oldest_known_gc_end_time_sec()) * 1000.0;
> - _long_term_pause_time_ratio = _recent_gc_times_ms->sum() /
> long_interval_ms;
> + _long_term_pause_time_ratio = (_recent_gc_times_ms->sum() +
> pause_time_ms) / long_interval_ms;
> 
> wrong, or even worse than the current calculation. Sorry for not being
> able to pinpoint it yesterday evening.
> 
> Let's assume that we have the following pauses on the timeline in
> G1Analytics, where "X" means pause (the number at the end of an "X"
> block is just a running number), "-" means mutator time.
> 
> XXX1-----XX2---XXX3---------XXXX4---XXXXXX5--------XX6
> 
> ^--------------long_interval_ms-------------------^
> 
> (optimized for fixed font view)
> 
> Currently oldest_known_gc_end_time_sec() is at the end of pause 1, and
> _recent_gc_times_ms contains all gc pause times from 1 to 5. We did not
> call update_recent_gc_times() yet.
> 
> Now the (existing) calculation calculates "long_interval_ms" from the
> end of pause 1 to the end of pause 6 (current), and divides by the sum
> of pauses 1 to 5 - which is wrong as pause 1 can be different to pause 6
> in length. However, there is the the code permeating assumption (not
> documented, sorry :)) that pauses are fairly regular. So this somewhat
> works out.
> 
> With that change, the dividend is too large, i.e. in addition to taking
> the wrong long_interval_ms, the code now also contains additional pauses.
> 
> Maybe this could be fixed somehow? Unless it's a very small fix (or I'm
> wrong about the above), I would prefer to fix this separately.

Right, I will revert to the previous version, and file a bug to fix this separately.

> 
> Thanks,
> Thomas

-------------

PR: https://git.openjdk.java.net/jdk/pull/183



More information about the hotspot-gc-dev mailing list