[16] RFR: 8247319: Remove on-stack nmethod hotness counter sampling from safepoints

Fri Jun 12 07:36:29 UTC 2020

Hi Erik,

I can't comment on your rationale as I know nothing about how the 
hotness counters operate, but I can verify that what you propose indeed 
removes the processing from the safepoint cleanup logic. :) So in that 
limited sense LGTM.

Cheers,
David
-----

On 12/06/2020 5:29 pm, Erik Österlund wrote:
> Hi,
> 
> The sweeper is moving away from using safepoints for its heuristics. It 
> used to count safepoints to figure out when to sweep, but no longer does 
> that. At the same time, we have for a while been removing more and more 
> safepoints. Safepoints are becoming increasingly rare events, dominated 
> by when we need to GC (GuaranteedSafepointInterval is going to 
> disappear). The frequency of how often we need to GC does not have an 
> obvious connection to how often we need to sweep the code cache... any 
> more.
> 
> What still remains from the safepoint-based heuristics is the nmethod 
> hotness counter sampling that is performed in safepoint cleanup. I would 
> like to get rid of this.
> 
> The rationale is that hotness counters come into play when the code 
> cache is starting to fill up quite a bit and nmethods need to be killed 
> off heuristically, rather than because they are invalid. 
> But when the code cache fills up, we sweep more and more aggressively. 
> And during these sweeper cycles, we perform nmethod marking using 
> handshakes. That operation also fills in hotness counters for all 
> sampled nmethods.
> 
> In other words, when there is need for acting on the hotness counters, 
> we are in a state where we may be getting more nmethod hotness counter 
> sample information from the sweeping cycles than we are from safepoint 
> sampling. Conversely, when code cache pressure is high and we need more 
> samples, we might end up getting very few from safepoint-based sampling 
> (because the heap is large). The correlation between safepoint frequency 
> and code cache pressure is simply not there any more. And for us to walk 
> all stacks in the system in every single safepoint (which for ZGC is 
> starting to dominate pauses when I remove our stack sampling from 
> safepoints), there had better be a really good reason to do this 
> sampling in safepoints. And there simply isn't. So I propose we delete 
> it in favour of the hotness counter samples we get from the sweeping 
> cycles instead, whose frequency is indeed proportional to the code cache 
> pressure.
> 
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8247319
> 
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8247319/webrev.00/
> 
> Thanks,
> /Erik
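
-----

For readers unfamiliar with the mechanism in the quoted rationale, here is a
toy C++ model of it. It is a sketch under stated assumptions, not HotSpot
code: the names ToyNMethod, kResetValue, mark_active, sweep and run_cycles
and all constants are hypothetical. It only illustrates that if each sweeper
cycle first samples on-stack nmethods and then decays the counters, the
number of samples follows the number of sweeper cycles, with no safepoint
involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: names and constants are hypothetical, not HotSpot's.
struct ToyNMethod {
    int  hotness;   // reset when sampled on a stack, decays on every sweep
    bool on_stack;  // whether some thread is currently executing it
    bool alive;     // false once the toy sweeper has evicted it
};

constexpr int kResetValue = 4;  // assumed value a sampled nmethod is reset to

// Marking step of a sweeper cycle (done with handshakes in the proposal):
// every live nmethod found on a stack gets its hotness counter reset.
void mark_active(std::vector<ToyNMethod>& cache) {
    for (auto& nm : cache) {
        if (nm.alive && nm.on_stack) {
            nm.hotness = kResetValue;
        }
    }
}

// Sweep step: decay every live counter and evict nmethods that fall
// below the threshold. Returns the number of nmethods evicted.
std::size_t sweep(std::vector<ToyNMethod>& cache, int threshold) {
    std::size_t evicted = 0;
    for (auto& nm : cache) {
        if (!nm.alive) continue;
        --nm.hotness;
        if (nm.hotness < threshold) {
            nm.alive = false;
            ++evicted;
        }
    }
    return evicted;
}

// More code cache pressure means more cycles, and therefore more samples.
std::size_t run_cycles(std::vector<ToyNMethod>& cache, int cycles, int threshold) {
    std::size_t evicted = 0;
    for (int i = 0; i < cycles; ++i) {
        mark_active(cache);
        evicted += sweep(cache, threshold);
    }
    return evicted;
}
```

In this model an nmethod that stays on a stack is re-sampled in every cycle
and survives, while an idle one decays and is eventually evicted, however
rarely a safepoint happens.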