RFR: 8253180: ZGC: Implementation of JEP 376: ZGC: Concurrent Thread-Stack Processing

Erik Österlund eosterlund at openjdk.java.net
Wed Sep 23 08:36:25 UTC 2020


On Tue, 22 Sep 2020 13:45:16 GMT, Stefan Karlsson <stefank at openjdk.org> wrote:

>> This PR is the implementation of "JEP 376: ZGC: Concurrent Thread-Stack Processing" (cf.
>> https://openjdk.java.net/jeps/376).
>> Basically, this patch modifies the epilog safepoint when returning from a frame (supporting interpreter frames, c1, c2,
>> and native wrapper frames), to compare the stack pointer against a thread-local value. This turns return polls into
>> more of a swiss army knife that can be used to poll for safepoints, handshakes, but also returns into not yet safe to
>> expose frames, denoted by a "stack watermark".  ZGC will leave frames (and other thread oops) in a state of a mess in
>> the GC checkpoint safepoints, rather than processing all threads and their stacks. Processing is initialized
>> automagically when threads wake up for a safepoint, or get poked by a handshake or safepoint. Said initialization
>> processes a few (3) frames and other thread oops. The rest, the bulk of the frame processing, is deferred until it is
>> actually needed. It is needed when a frame is exposed to either 1) execution (returns or unwinding due to exception
>> handling), or 2) stack walker APIs. A hook is then run to go and finish the lazy processing of frames.  Mutator and GC
>> threads can compete for processing. The processing is therefore performed under a per-thread lock. Note that disarming
>> of the poll word (that the returns are comparing against) is only performed by the thread itself. So sliding the
>> watermark up will require one runtime call for a thread to note that nothing needs to be done, and then update the poll
>> word accordingly. Downgrading the poll word concurrently by other threads was simply not worth the complexity it
>> brought (and is only possible on TSO machines). So I left that one out.
>
> src/hotspot/share/compiler/oopMap.cpp line 243:
> 
>> 241:   } else {
>> 242:     all_do(fr, reg_map, f, process_derived_oop, &do_nothing_cl);
>> 243:   }
> 
> I wonder if we shouldn't hide the StackWatermarkSet in the GC code, and not "activate" the DerivedPointerTable when a
> SWS is used? Isn't it the case that we already don't enable the table for ZGC? Couldn't this simply be: `
>   if (DerivedPointerTable::is_active()) {
>     all_do(fr, reg_map, f, add_derived_oop, &do_nothing_cl);
>   } else {
>     all_do(fr, reg_map, f, process_derived_oop, &do_nothing_cl);
>   }
> `

The problem isn't the GC code deciding whether or not to use the derived pointer "table". The problem is the shared
code that uses it. The table is explicitly activated by shared code, for example when using the JFR leak profiler,
JVMTI heap walks, etc. This code makes the selection a GC choice when invoked by GC code, and a shared runtime choice
when invoked by the shared runtime.
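To illustrate the distinction, here is a minimal, self-contained sketch of the idea. It is not the code in the PR and
not HotSpot API: DerivedMode, g_table_active, walk_frame, gc_walk and runtime_walk are all hypothetical names made up
for this example. The walker receives the mode from its caller, so a GC-invoked walk is unaffected by a table that
shared code happened to activate.

```cpp
#include <string>

// Hypothetical model, not HotSpot code: every name below is illustrative.
enum class DerivedMode { WithTable, Directly };

// Stands in for a derived pointer table that shared code (e.g. a heap walk)
// may have activated.
static bool g_table_active = false;

// The frame walker takes the mode from its caller instead of consulting
// global state. It returns the name of the handler it would have used.
std::string walk_frame(DerivedMode mode) {
  switch (mode) {
    case DerivedMode::WithTable: return "add_derived_oop";
    case DerivedMode::Directly:  return "process_derived_oop";
  }
  return "unreachable";
}

// Walk invoked by GC code: the collector decides, whatever the table state is.
std::string gc_walk() {
  return walk_frame(DerivedMode::Directly);
}

// Walk invoked by the shared runtime: honors the table if it was activated.
std::string runtime_walk() {
  return walk_frame(g_table_active ? DerivedMode::WithTable
                                   : DerivedMode::Directly);
}
```

With a single global is_active() check inside the walker, a concurrent GC walk would start adding entries to the table
as soon as shared code activated it. Passing the choice in from the caller keeps the two uses apart.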

-------------

PR: https://git.openjdk.java.net/jdk/pull/296


More information about the serviceability-dev mailing list