RFR (S): 8227719: G1 pending cards estimation too conservative for cost prediction

Thu Jul 18 23:05:52 UTC 2019

> On Jul 18, 2019, at 4:33 PM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi Kim,
> 
> On Thu, 2019-07-18 at 15:43 -0400, Kim Barrett wrote:
>>> On Jul 18, 2019, at 10:26 AM, Thomas Schatzl <
>>> thomas.schatzl at oracle.com> wrote:
>>> 
>>> Hi all,
>>> 
>>> can I have reviews for this change that instead of using a very
>>> conservative estimate of the number of pending cards in the DCQS,
>>> that is assuming that all completed buffers are full, simply
>>> calculates it as buffers are completed, i.e. giving the exact
>>> value.
>> 
>> Before I review this, I have a high-level comment that strongly
>> affects this.
>> 
>> I suspect that we don't really want the number of buffers at all, but
>> only the number of cards.  I think the number of buffers is just
>> being used to estimate the number of cards.  Rather than adding a
>> second counter, perhaps consider replacement of one by the other?
> 
> I did not look at whether there are other interesting uses for the
> number of buffers, but I will look at this tomorrow.
> 
> Without looking, I guess that the refinement thread activation needs
> the number of buffers too?
> 
>> DirtyCardQueueSet::completed_buffers_num() could be retained (at
>> least for now) as an estimate based on the card count.  That would
>> probably simplify the replacement effort.
>> 
> 
> Sure, it does not make sense to carry around a value that is effectively
> not used anywhere.
> 
> Do you have an idea how this would affect refinement thread activation?
> It will likely be less aggressive then.
> 
> Could we base refinement thread activation on the entry count instead
> then? If so, this seems to be a change that is better made separately
> though. Not against doing this myself, but it just seems a change big
> enough that would imho warrant a separate investigation.
> 
> Thanks,
>  Thomas

I think it won't have any significant effect on refinement.  Most of
the buffers in or going through the DCQS during concurrent refinement
are full.  And the number of buffers is just used as an estimate of work
to be done, with the number of cards being a better estimate.  I think a
much larger source of inaccuracy in that estimate is card duplication,
though your recent changes have eliminated redirty buffers as a source
for that.
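
To make the two estimates concrete, here is a minimal sketch of the
conservative calculation (every completed buffer assumed full) next to
the exact one (cards summed as buffers are completed).  The function
names and the vector of per-buffer fill counts are hypothetical and
purely for illustration; this is not the HotSpot code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: conservative estimate that treats every
// completed buffer as full.
size_t conservative_pending_cards(const std::vector<size_t>& cards_per_buffer,
                                  size_t buffer_size) {
  return cards_per_buffer.size() * buffer_size;
}

// Hypothetical helper: exact count, summing the cards actually
// recorded in each completed buffer.
size_t exact_pending_cards(const std::vector<size_t>& cards_per_buffer) {
  size_t total = 0;
  for (size_t cards : cards_per_buffer) {
    total += cards;
  }
  return total;
}
```

When every buffer is full the two values agree, which is why I expect
little change for refinement; they only diverge when partially filled
buffers are completed.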

I suggested keeping completed_buffers_num() for now, using a
definition of ceiling(card_count / buffer_size()), to reduce the
fanout of the change to internally carry around card counts rather
than buffer counts.  I can deal with followup on that as part of
working on improving DCQS and refinement thread control, which I think
I'm finally ready to start working on again, now that my PtrQueue[Set]
refactoring work seems to be approaching completion.
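
A rough sketch of that definition, assuming a single card counter and a
fixed buffer size.  The class CardCounter and its members are made up
for illustration and do not reflect the actual DirtyCardQueueSet
layout; only the ceiling calculation is the point.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the card bookkeeping; not the real
// DirtyCardQueueSet.
class CardCounter {
  size_t _num_cards;     // the only counter that is maintained
  size_t _buffer_size;   // capacity of one buffer, in cards

public:
  explicit CardCounter(size_t buffer_size)
    : _num_cards(0), _buffer_size(buffer_size) {
    assert(buffer_size > 0);
  }

  // Record a completed buffer holding cards_in_buffer cards.
  void add_completed_buffer(size_t cards_in_buffer) {
    assert(cards_in_buffer <= _buffer_size);
    _num_cards += cards_in_buffer;
  }

  size_t num_cards() const { return _num_cards; }

  // Derived estimate: ceiling(card_count / buffer_size), computed
  // with integer arithmetic.
  size_t completed_buffers_num() const {
    return (_num_cards + _buffer_size - 1) / _buffer_size;
  }
};
```

With a buffer size of 256, one full buffer plus one buffer holding a
single card gives 257 cards and a derived buffer count of 2, so
existing callers of completed_buffers_num() would keep seeing a
plausible value.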

I would prefer to only be managing one counter rather than two,
especially as I start looking at mitigations for the DCQS lock.