RFR: 8231153: Improve concurrent refinement statistics
Thomas Schatzl
thomas.schatzl at oracle.com
Wed Sep 25 09:40:21 UTC 2019
Hi Kim,
On 24.09.19 04:58, Kim Barrett wrote:
> Please review this change to collection additional statistics about
> the generation and processing of dirty cards. These are needed for
> future improvements to the control of the concurrent refinement
> threads (JDK-8137022).
>
> This change adds tracking and prediction of the rate at which
> concurrent refinement threads process cards. It also adds tracking
> and prediction of the rate at which mutator threads generate cards.
>
> As part of that, there is now tracking of total cards processed by the
> concurrent refinement threads and the mutator threads. This is now
> used by G1RemsetSummary instead of the buffer counts previously reported.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8231153
>
> Webrev:
> https://cr.openjdk.java.net/~kbarrett/8231153/open.00/
>
> Testing:
> mach5 tier1-5
> some local by-hand testing to look at the rate tracking.
>
I think the change in functionality is good. I can't really judge the
appropriateness of the input values to the predictors since I do not
know their actual use and haven't gotten a feeling about whether there
is a problem with that :)
My experience is that sometimes it is better to be slightly inaccurate
in the values passed to reduce impact of outliers; in this case this may
e.g. be more appropriate to use concurrently scanned cards instead of
the actual concurrently refined cards as the difference are cards
filtered out during refinement, and actual work is only done on
*scanned* cards anyway. That may not be the case here, just an example -
I can't tell.
E.g. there may be a huge change in refined cards, but scanned cards do
not change, and you are actually really interested in the latter.
When reading the change I had the following thoughts to improve readability:
- maybe some comment somewhere what "scanned" really means compared to
"refined". Initially I was surprised with the change at
G1RemSet::_num_conc_scanned_cards, but some thinking made me aware of
the difference.
- the change reuses the "processed" term for counted cards in a few
places, and it is unclear to me what the difference to just "refined"
cards would be in some cases.
I.e. I think the additional term for apparently the same thing seems
unnecessary, and I would recommend changing it to just "refined".
E.g. G1DirtyCardQueueSet::refine_buffer(, size_t* cards_processed) ->
...(, size_t* num_cards_refined).
There is already a distinction between concurrently and mutator
processed cards by adding the appropriate term to the names, so
"processed" seems superfluous.
Also seen in G1ConcurrentRefineThread::run_service().
- I would also suggest to add a "num_" prefix to numbers/counts of values.
E.g. in the above method declaration it is unclear whether
"cards_processed" is a pointer to an array of cards or simply the
number/count of cards given only the name.
Other occurrences:
G1ConcurrentRefineThread::refined_cards -> num_refined_cards
G1DirtyCardQueueSet::mutator_refined_cards -> num_mutator_refined_cards
I saw that the change actually removes a few of these prefixes in
G1RemSetSummary, which I am not really happy about; this extra
redundancy does help with readability imho, particularly when you pass
them around (also when you start dealing with both at the same time :))
- in G1Policy::_pending_cards should be renamed to
"_pending_cards_at_start_of_gc" since we also now have a
"_pending_cards_after_last_gc" to distinguish their use a little better?
- pre-existing: probably rename G1RemSet::_num_conc_scanned_cards and
G1RemSetSummary::_conc_scanned_cards to "_concurrent_scanned_cards" to
match the "_concurrent_refined_cards".
- I would remove the two "Bleh" in g1ConcurrentRefine.cpp - while I can
see their point I would probably rather file an enhancement request :)
- not sure, but I think exposing size() and start() and in G1FreeIdSet
seems unnecessary: the only user is G1DirtyCardQueueSet anyway, and it
is already owner of G1FreeIdSet. I.e. it knows these values already (and
passes it to the initializer of the G1FreeIdSet instance, and already
has a getter for the size() value), so getting it back from G1FreeIdSet
seems a bit strange to me, but I am okay with current code.
Thanks,
Thomas
More information about the hotspot-gc-dev
mailing list