RFR: JDK-8132849: Increased stop time in cleanup phase because of single-threaded walk of thread stacks in NMethodSweeper::mark_active_nmethods()

Mon Sep 24 11:47:16 UTC 2018

Hi Erik,

> I think that by answering the meta question "wait why are we doing this"
> in this email, I will cover the questions in the previous email too.
> 
> The nmethod marking is strictly required so that after you have selected
> your not entrant nmethods that you want to nuke, you know that at some
> snapshot in time, they were not on the stack (and can't have become so
> afterwards because they are not entrant). As I mentioned earlier, doing
> this both in safepoint cleanup for every safepoint, as well as in the VM
> operation itself, is "questionable". Doing it in just the VM
> operation/handshake should be enough.
> 
> The hotness counting is not strictly necessary at all. In fact, you can
> turn it off with the JVM flag -XX:-UseCodeAging.
> 
> So the hotness counter updating is part of the code aging mechanism.
> This is more of a heuristic thing than a correctness thing. You can just
> wait until you run out of space in the code heap, and then nuke a bunch
> of stuff (using the nmethod marking mechanism), and you are good. But
> similar to how you in your GC algorithm want to avoid running into full
> GCs because they are expensive, you also want to avoid filling up the
> code heap, because the consequences of that are also very expensive. The
> code aging mechanism was therefore introduced as a way of figuring out
> if there are seemingly inactive nmethods that can be discarded before
> running out of code heap memory.
> 
> So the way that works is that you give each nmethod a counter that you
> decay every now and then, but heat up again when you see said nmethods
> on the stack. That way, the sweeper can look for nmethods that do not
> seem to have been found on the stack "for a while", and select them as
> good candidates for being inactive.
> 
> So to answer the question whether you can update hotness counters only
> when you mark nmethods... you can. But by doing that, it no longer
> serves its purpose of finding inactive nmethods, and becomes more of a
> piece of logic that we run occasionally for the fun of it. So we should
> not do that.
> 
> The reason that hotness counters are in safepoint cleanup, is to provide
> fresh stack samples to the sweeper.
> 
> So my suggestion for now is:
> Do nmethod marking in VM operation/handshake operation.
> Do hotness counter updating when UseCodeAging in safepoint cleanup.
> 
> And now you might be wondering if it really makes sense to walk all
> stacks in the system every safepoint, to provide some heuristic about
> whether nmethods are inactive or not. Arguably not. I have an idea about
> a much better way of doing this. I will get back to you in a few days
> about that.

Thanks for your explanations. That's more or less what I figured out
from studying the code too.
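
A minimal C++ sketch of the code aging scheme described in the quote above, for illustration only. ToyNMethod, kHotnessReset, sample_hotness and looks_inactive are made-up names and values, not HotSpot's actual classes or constants, and the sketch folds the decay and the heat-up into a single pass, which the real code may well not do.

```cpp
#include <vector>

// Hypothetical stand-in for an nmethod; HotSpot's real class looks different.
struct ToyNMethod {
    int  hotness;   // decays over time, reset when the nmethod is seen on a stack
    bool on_stack;  // result of a (simulated) stack walk
};

// Assumed reset value, chosen only for illustration.
constexpr int kHotnessReset = 4;

// One sampling pass: heat up nmethods found on a stack, decay all others.
inline void sample_hotness(std::vector<ToyNMethod>& methods) {
    for (ToyNMethod& m : methods) {
        if (m.on_stack) {
            m.hotness = kHotnessReset;
        } else if (m.hotness > 0) {
            --m.hotness;
        }
    }
}

// Sweeper-side heuristic: a counter that has decayed to zero means the
// nmethod has not been seen on any stack "for a while".
inline bool looks_inactive(const ToyNMethod& m) {
    return m.hotness == 0;
}
```

With this toy model, an nmethod that never shows up in a stack sample becomes a discard candidate after kHotnessReset passes, while one that keeps appearing stays hot. How well that works depends on how often, and how regularly, the sampling pass runs.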

Couldn't we have a CodeAgeInterval flag (or similar), so that every
that-many milliseconds we do the hotness-reset scan, fired from the
sweeper thread as either a thread-local handshake (TLHS) or a VM_Op?
That should give us more regular sampling than doing it in the somewhat
random safepoint prologue.
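
A rough sketch of the interval check this would need, in C++. It is an assumption-laden illustration: CodeAgeInterval is only the name proposed here, not an existing flag, and IntervalSampler and due() are hypothetical. The sketch only decides when a scan is due. How the scan itself gets fired (handshake or VM_Op) is left out.

```cpp
#include <chrono>

// Hypothetical helper for the sweeper thread: decides whether the next
// hotness scan is due. Times are milliseconds elapsed since some fixed
// start point (for example VM start).
struct IntervalSampler {
    std::chrono::milliseconds interval;   // the proposed CodeAgeInterval
    std::chrono::milliseconds last_scan;  // time of the most recent scan

    // Returns true at most once per interval and records the scan time.
    bool due(std::chrono::milliseconds now) {
        if (now - last_scan >= interval) {
            last_scan = now;
            return true;
        }
        return false;
    }
};
```

The sweeper thread would call due() from its loop and trigger the scan only when it returns true, so the sampling rate is bounded by the interval and no longer depends on how often safepoints happen to occur.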

Roman
