8202377: Modularize C2 GC barriers

Tue May 1 13:32:03 UTC 2018

Hi,

The GC barriers for C2 are not as modular as they could be. C2 currently 
uses switch statements to check which GC barrier set is in use, and 
calls one barrier or another based on that, in a way that only works 
for write barriers.

My proposed solution is to follow the same pattern that has been used by 
C1 (and the rest of HotSpot), which is to provide a GC barrier set code 
generation helper for C2. Its name is BarrierSetC2. Each barrier set 
class has its own BarrierSetC2, following an inheritance hierarchy that 
mirrors the BarrierSet hierarchy. You generate the accesses using the 
access_* member functions on GraphKit, which call into BarrierSetC2.
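
To illustrate the dispatch described above, here is a minimal sketch of 
a GraphKit-like facade that calls into a per-GC code generation helper 
instead of switching on the barrier set kind. All class names carry a 
Sketch suffix because they are hypothetical stand-ins, not the classes 
in the webrev, and strings stand in for ideal graph nodes.

```cpp
#include <string>

// Hypothetical stand-in for the C2 code generation helper; one subclass
// per barrier set, mirroring the BarrierSet hierarchy.
struct BarrierSetC2Sketch {
  virtual ~BarrierSetC2Sketch() = default;
  // Default: a plain load with no GC barrier.
  virtual std::string load_at(const std::string& addr) const {
    return "Load(" + addr + ")";
  }
};

// Hypothetical collector that needs a barrier around loads.
struct LoadBarrierSetC2Sketch : BarrierSetC2Sketch {
  std::string load_at(const std::string& addr) const override {
    return "LoadBarrier(" + BarrierSetC2Sketch::load_at(addr) + ")";
  }
};

// Stand-in for GraphKit: parser code calls access_load_at and never
// inspects which GC is selected.
class GraphKitSketch {
  const BarrierSetC2Sketch& _bs;
public:
  explicit GraphKitSketch(const BarrierSetC2Sketch& bs) : _bs(bs) {}
  std::string access_load_at(const std::string& addr) const {
    return _bs.load_at(addr);
  }
};
```

Because the dispatch is virtual, the same facade works for loads as 
well as stores, which the switch-based scheme could not offer.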

A lot of the design looks very similar to BarrierSetC1. In C1, there was 
a wrapper object called LIRAccess that wrapped a bunch of context 
parameters that were passed around in the barrier set hierarchy. There 
is a similar wrapper for C2 that I call C2Access. Users of the API do 
not see it. They call access_load_at, for example, in GraphKit during 
parsing. The access functions wrap the access in a C2Access object with 
a bunch of context parameters, and call the currently selected 
BarrierSetC2 backend accessor with this context. For the atomic accesses, there is a 
C2AtomicAccess, inheriting from C2Access. It contains more context, as 
required by the atomic accesses (e.g. explicit alias_idx, whether the 
node needs pinning with an SCM projection, and a memory node).
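
As a rough illustration of the wrapper idea, the sketch below bundles 
the context into one object and extends it for atomics. The field set 
is an assumption based only on the description above; NodeSketch, the 
accessor names and needs_scm_projection are hypothetical.

```cpp
#include <cstdint>

using DecoratorSetSketch = std::uint64_t;
struct NodeSketch { int id; };  // stand-in for a C2 ideal graph node

// Context shared by all accesses (hypothetical field selection).
class C2AccessSketch {
  DecoratorSetSketch _decorators;
  NodeSketch* _base;
  NodeSketch* _addr;
public:
  C2AccessSketch(DecoratorSetSketch d, NodeSketch* base, NodeSketch* addr)
    : _decorators(d), _base(base), _addr(addr) {}
  virtual ~C2AccessSketch() = default;
  DecoratorSetSketch decorators() const { return _decorators; }
  NodeSketch* base() const { return _base; }
  NodeSketch* addr() const { return _addr; }
  virtual bool is_atomic() const { return false; }
};

// Extra context that only atomic accesses need.
class C2AtomicAccessSketch : public C2AccessSketch {
  int _alias_idx;
  bool _needs_pinning;   // pin with an SCM projection?
  NodeSketch* _memory;
public:
  C2AtomicAccessSketch(DecoratorSetSketch d, NodeSketch* base,
                       NodeSketch* addr, int alias_idx,
                       bool needs_pinning, NodeSketch* memory)
    : C2AccessSketch(d, base, addr), _alias_idx(alias_idx),
      _needs_pinning(needs_pinning), _memory(memory) {}
  int alias_idx() const { return _alias_idx; }
  bool needs_pinning() const { return _needs_pinning; }
  NodeSketch* memory() const { return _memory; }
  bool is_atomic() const override { return true; }
};

// A backend accessor receives one context object, not a parameter list.
inline bool needs_scm_projection(const C2AccessSketch& access) {
  if (!access.is_atomic()) return false;
  return static_cast<const C2AtomicAccessSketch&>(access).needs_pinning();
}
```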

Apart from the normal shared decorators, C2 also uses some additional 
decorators of its own:
* C2_MISMATCHED and C2_UNALIGNED (describing properties of unsafe accesses)
* C2_WEAK_CMPXCHG: describing if a cmpxchg may have false negatives
* C2_CONTROL_DEPENDENT_LOAD: use when a load should have control dependency
* C2_PINNED_LOAD: use for loads that must be pinned
* C2_UNSAFE_ACCESS: marks the access as an unsafe access. This 
decorator implies that loads have control dependency and need pinning, 
unless it can be proven that the access is within the bounds of an 
object.
* C2_READ_ACCESS and C2_WRITE_ACCESS: denote whether the access reads 
or writes memory (atomics do both). They are useful for figuring out 
what fencing a given access and its ordering semantics require, and 
they let Shenandoah figure out what type of barrier to use to ensure 
memory consistency.
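
The decorators listed above can be pictured as bits in a set. In the 
sketch below only the decorator names come from this mail; the bit 
values and the helpers has, fixup_unsafe_load and reads_and_writes are 
assumptions made up for illustration.

```cpp
#include <cstdint>

using DecoratorSet = std::uint64_t;

// Bit values are invented for this sketch; only the names are real.
constexpr DecoratorSet C2_MISMATCHED             = DecoratorSet{1} << 0;
constexpr DecoratorSet C2_UNALIGNED              = DecoratorSet{1} << 1;
constexpr DecoratorSet C2_WEAK_CMPXCHG           = DecoratorSet{1} << 2;
constexpr DecoratorSet C2_CONTROL_DEPENDENT_LOAD = DecoratorSet{1} << 3;
constexpr DecoratorSet C2_PINNED_LOAD            = DecoratorSet{1} << 4;
constexpr DecoratorSet C2_UNSAFE_ACCESS          = DecoratorSet{1} << 5;
constexpr DecoratorSet C2_READ_ACCESS            = DecoratorSet{1} << 6;
constexpr DecoratorSet C2_WRITE_ACCESS           = DecoratorSet{1} << 7;

// True if any bit of 'flag' is present in 'set'.
constexpr bool has(DecoratorSet set, DecoratorSet flag) {
  return (set & flag) != 0;
}

// Hypothetical helper: an unsafe load gets control dependency and
// pinning unless the access is proven to be within the object.
constexpr DecoratorSet fixup_unsafe_load(DecoratorSet d,
                                         bool proven_in_bounds) {
  return (has(d, C2_UNSAFE_ACCESS) && !proven_in_bounds)
      ? (d | C2_CONTROL_DEPENDENT_LOAD | C2_PINNED_LOAD)
      : d;
}

// Atomics both read and write memory.
constexpr bool reads_and_writes(DecoratorSet d) {
  return has(d, C2_READ_ACCESS) && has(d, C2_WRITE_ACCESS);
}
```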

The accesses go through a similar process as they do in C1. Let's take 
BarrierSetC2::store_at for example. It uses the C2AccessFence scoped 
object helper to figure out what membars are required to surround the 
access, resolves the address (a no-op for all GCs with a to-space 
invariant, which is all GCs except Shenandoah in HotSpot at the moment), 
and then calls store_at_resolved. The store_at_resolved member function 
generates the access and the barriers around it. The abstract 
ModRefBarrierSetC2 barrier set introduces the notion of pre/post write 
barriers, and lets concrete barrier sets sprinkle their GC barriers in 
there. It calls BarrierSetC2::store_at_resolved to generate the actual 
access. For example, CardTableBarrierSet only needs to override its 
post barrier for this to work as expected. The other accesses follow a 
similar pattern.
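
The store_at flow above can be sketched as follows. This is a 
simplified model, not the code in the webrev: the Sketch-suffixed 
classes, the Trace type and emit_store are hypothetical, signatures are 
reduced to the bare minimum, and strings stand in for generated nodes.

```cpp
#include <string>
#include <vector>

using Trace = std::vector<std::string>;  // records what gets "emitted"

// RAII stand-in for the scoped fence helper: membars surround the access.
class FenceScopeSketch {
  Trace& _t;
  bool _needed;
public:
  FenceScopeSketch(Trace& t, bool needed) : _t(t), _needed(needed) {
    if (_needed) _t.push_back("leading membar");
  }
  ~FenceScopeSketch() {
    if (_needed) _t.push_back("trailing membar");
  }
};

class BarrierSetC2Sketch {
public:
  virtual ~BarrierSetC2Sketch() = default;
  void store_at(Trace& t, bool needs_fence) const {
    FenceScopeSketch fence(t, needs_fence);
    resolve_address(t);
    store_at_resolved(t);
  }
protected:
  // No-op for GCs with a to-space invariant.
  virtual void resolve_address(Trace&) const {}
  virtual void store_at_resolved(Trace& t) const { t.push_back("store"); }
};

// Adds the notion of pre/post write barriers around the raw store.
class ModRefBarrierSetC2Sketch : public BarrierSetC2Sketch {
protected:
  virtual void pre_barrier(Trace&) const {}
  virtual void post_barrier(Trace&) const {}
  void store_at_resolved(Trace& t) const override {
    pre_barrier(t);
    BarrierSetC2Sketch::store_at_resolved(t);
    post_barrier(t);
  }
};

// A card-marking collector only overrides the post barrier.
class CardTableBarrierSetC2Sketch : public ModRefBarrierSetC2Sketch {
protected:
  void post_barrier(Trace& t) const override { t.push_back("card mark"); }
};

inline Trace emit_store(const BarrierSetC2Sketch& bs, bool needs_fence) {
  Trace t;
  bs.store_at(t, needs_fence);
  return t;
}
```

In this model the card mark lands between the store and the trailing 
membar, since the fence scope ends only when store_at returns.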

The Compile class now has a type-erased (void*) per-compilation-unit 
state that is created for each compilation unit (with 
BarrierSetC2::create_barrier_state). For the GCs in HotSpot today, this 
is always NULL. But for GCs that have their own macro nodes, this state 
can be used for, e.g., lists of barrier-specific macro nodes that 
should not pollute the Compile object. Such macro nodes can be expanded 
during macro expansion using the BarrierSetC2::expand_macro_nodes 
member function.
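
A minimal sketch of the type-erased state idea follows. The member 
function names create_barrier_state and expand_macro_nodes come from 
this mail, but their signatures here are simplified assumptions, and 
destroy_barrier_state, the Sketch-suffixed classes and the explicit 
new/delete are hypothetical choices made to keep the example 
self-contained.

```cpp
#include <vector>

struct MacroNodeSketch { int id; bool expanded; };

class BarrierSetC2Sketch {
public:
  virtual ~BarrierSetC2Sketch() = default;
  // Default: no per-compilation state at all.
  virtual void* create_barrier_state() const { return nullptr; }
  virtual void destroy_barrier_state(void*) const {}  // hypothetical hook
  // Returns how many macro nodes were expanded (assumed return type).
  virtual int expand_macro_nodes(void*) const { return 0; }
};

// State that only this hypothetical GC knows how to interpret.
struct BarrierStateSketch { std::vector<MacroNodeSketch> macro_nodes; };

class MacroBarrierSetC2Sketch : public BarrierSetC2Sketch {
public:
  void* create_barrier_state() const override {
    return new BarrierStateSketch();
  }
  void destroy_barrier_state(void* s) const override {
    delete static_cast<BarrierStateSketch*>(s);
  }
  int expand_macro_nodes(void* s) const override {
    BarrierStateSketch* state = static_cast<BarrierStateSketch*>(s);
    int count = 0;
    for (MacroNodeSketch& m : state->macro_nodes) {
      if (!m.expanded) { m.expanded = true; ++count; }
    }
    return count;
  }
};

// Stand-in for Compile: it stores the state but never looks inside it.
class CompileSketch {
  const BarrierSetC2Sketch& _bs;
  void* _barrier_state;
public:
  explicit CompileSketch(const BarrierSetC2Sketch& bs)
    : _bs(bs), _barrier_state(bs.create_barrier_state()) {}
  ~CompileSketch() { _bs.destroy_barrier_state(_barrier_state); }
  CompileSketch(const CompileSketch&) = delete;
  CompileSketch& operator=(const CompileSketch&) = delete;
  void* barrier_state() const { return _barrier_state; }
  int expand_macro_nodes() { return _bs.expand_macro_nodes(_barrier_state); }
};
```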

There are a few other helpers that may be good for a GC to have, like 
figuring out if a node is a GC barrier (for escape analysis), whether a 
GC barrier can be eliminated (for example using ReduceInitialCardMarks), 
whether array_copy requires GC barriers, and how to step over a GC 
barrier. There is also a helper for loop-optimizing GC barrier nodes.

This work will help pave the way for a new class of collectors that use 
load barriers (ZGC and Shenandoah) for concurrent compaction.

Webrev:
http://cr.openjdk.java.net/~eosterlund/8202377/webrev.00/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8202377

Thanks,
/Erik