RFR: 8290892: C2: Intrinsify Reference.reachabilityFence

Aleksey Shipilev shade at openjdk.org
Thu May 22 10:15:07 UTC 2025


On Tue, 20 May 2025 00:49:49 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

> This PR introduces C2 support for `Reference.reachabilityFence()`.
> 
> After [JDK-8199462](https://bugs.openjdk.org/browse/JDK-8199462) went in, it was discovered that C2 may break the invariant the fix relied upon [1]. So, this is an attempt to introduce proper support for `Reference.reachabilityFence()` in C2. C1 is left intact for now, because there are no signs yet it is affected.
> 
> `Reference.reachabilityFence()` can be used in performance critical code, so the primary goal for C2 is to reduce its runtime overhead as much as possible. The ultimate goal is to ensure liveness information is attached to interfering safepoints, but it takes multiple steps to properly propagate the information through compilation pipeline without negatively affecting generated code quality.
> 
> Also, I don't consider this fix as complete. It does fix the reported problem, but it doesn't provide any strong guarantees yet. In particular, since `ReachabilityFence` is CFG-only node, nothing explicitly forbids memory operations to float past `Reference.reachabilityFence()` and potentially reaching some other safepoints current analysis treats as non-interfering. Representing `ReachabilityFence` as memory barrier (e.g., `MemBarCPUOrder`) would solve the issue, but performance costs are prohibitively high. Alternatively, the optimization proposed in this PR can be improved to conservatively extend referent's live range beyond `ReachabilityFence` nodes associated with it. It would meet performance criteria, but I prefer to implement it as a followup fix.
> 
> Another known issue relates to reachability fences on constant oops. If such constant is GCed (most likely, due to a bug in Java code), similar reachability issues may arise. For now, RFs on constants are treated as no-ops, but there's a diagnostic flag `PreserveReachabilityFencesOnConstants` to keep the fences. I plan to address it separately. 
> 
> [1] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/ref/Reference.java#L667
> "HotSpot JVM retains the ref and does not GC it before a call to this method, because the JIT-compilers do not have GC-only safepoints."
> 
> Testing:
> - [x] hs-tier1 - hs-tier8
> - [x] hs-tier1 - hs-tier6 w/ -XX:+StressReachabilityFences -XX:+VerifyLoopOptimizations
> - [x] java/lang/foreign microbenchmarks

This is an impressive piece of work. It still makes me yearn for simpler solution. (I am impressed how much it snowballed from my original `@DontInline` implementation!)

> Representing ReachabilityFence as memory barrier (e.g., MemBarCPUOrder) would solve the issue, but performance costs are prohibitively high. 

How bad is it? `MemBarCPUOrder` pinches all memory, so I assume this breaks a lot of optimizations when `RF` is sitting in the hot loop? I remember we went through a similar exercise with `Blackholes`: [JDK-8296545](https://bugs.openjdk.org/browse/JDK-8296545) -- and decided to pinch only the control. I _guessing_ this is not enough to fix RF, or is it?

Some other comments:

src/hotspot/share/opto/block.cpp line 189:

> 187:          !get_node(end_idx)->is_Mach() &&
> 188:          !get_node(end_idx)->is_BoxLock() &&
> 189:          !(get_node(end_idx)->is_ReachabilityFence() && C->print_assembly())) {

Um. So this fairly generic method is predicated of whether a diagnostic VM option is enabled. Which risks that the compiler behavior with/without printing assembly is different? Which might hide the very issues we are trying to diagnose with printing assembly? Can we handle `RF` here without checking for `print_assembly`?

This will also obviate a need to pass `Compile* C` around.

src/hotspot/share/opto/callnode.cpp line 969:

> 967:             projs->exobj = e;
> 968:           } else  {
> 969:             // exception table for rethrow case

Feels like we want to assert other values of `e->in(0)->as_CatchProj()->_con` here? From the switch statement in the previous hunk, there seems to be `fall_through_index` that is not "rethrow case" (or is it?).

src/hotspot/share/opto/classes.cpp line 50:

> 48: #include "opto/subtypenode.hpp"
> 49: #include "opto/vectornode.hpp"
> 50: #include "utilities/macros.hpp"

This `macros.hpp` include is for `#if INCLUDE_SHENANDOAHGC` a line below, please keep it intact.

src/hotspot/share/opto/compile.cpp line 2586:

> 2584:  C->node_hash()->clear();
> 2585: 
> 2586:   // A method with only infinite loops has no edges entering loops from root

Redundant.

src/hotspot/share/opto/compile.cpp line 3912:

> 3910: // requires that the walk visits a node's inputs before visiting the node.
> 3911: 
> 3912: static bool has_non_debug_uses(Node* n) {

This got inserted right between  `------------------------------final_graph_reshaping_walk--------------------` comment and the `final_graph_reshaping_walk` implementation. Also, put `has_non_debug_uses` into `Compile`?

src/hotspot/share/opto/compile.cpp line 3968:

> 3966:     return;
> 3967: 
> 3968:   // Go over ReachabilityFence nodes to skip DecodeN nodes for referents.

This is a cute optimization. Does it happen in our code anywhere? I would have expected `DecodeN` to be near the heap loads, and suppose `RF` is mostly called on locals, which are already uncompressed?

src/hotspot/share/opto/compile.cpp line 3995:

> 3993:     for (int j = start; j < end; j++) {
> 3994:       Node* in = n->in(j);
> 3995:       if (in->is_DecodeNarrowPtr() && (is_uncommon || has_non_debug_uses(in))) {

The comment says we can skip when node is only referenced in debug info. Here we skip when there _are_ non-debug uses. Did you mean `!has_non_debug_uses(in)`?

src/hotspot/share/opto/parse1.cpp line 1225:

> 1223: 
> 1224:   if (StressReachabilityFences) {
> 1225:     // Keep all oop arguments alive until method return.

Comment says "arguments", but we save locals. Aren't arguments on "stack" in `JVMState`? For stress mode, would make sense to hook up both locals/stack from `JVMState`, maybe?

src/hotspot/share/opto/reachability.cpp line 211:

> 209:     }
> 210:   }
> 211:   return found;

`found` is always `false` here. I don't think this function even needs a return value, judging by the uses.

src/hotspot/share/opto/reachability.cpp line 431:

> 429:       }
> 430:     }
> 431:     redundant_rfs.push(rf);

I see `PhaseIdealLoop::optimize_reachability_fences` asks for `redundant_rfs.member(rf)` before going into this analysis. Is it because we can have duplicate RFs in the `C->reachability_fence` list? Can we have duplicate here? Should we check `.member(rf)` here as well?

src/hotspot/share/opto/reachability.hpp line 48:

> 46:     // Fake the incoming arguments mask for blackholes: accept all registers
> 47:     // and all stack slots. This would avoid any redundant register moves
> 48:     // for blackhole inputs.

Still mentions blackholes.

src/java.base/share/classes/java/lang/ref/Reference.java line 662:

> 660:      * @since 9
> 661:      */
> 662:     @IntrinsicCandidate

Sounds like we also want to restore `@DontInline` to cover the case when intrinsic is not available / disabled for some compiler. I vaguely remember some intrinsic handling code checks whether method is prohibited from inlining (maybe affects only global `-XX:-Inline`, not sure), so it might be as straightforward.

-------------

PR Review: https://git.openjdk.org/jdk/pull/25315#pullrequestreview-2860354171
PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2102055970
PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2102067844
PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2102002006
PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2102069381
PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2102071571
PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2102080895
PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2102104522
PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2102130716
PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2102147078
PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2102170286
PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2102011606
PR Review Comment: https://git.openjdk.org/jdk/pull/25315#discussion_r2102028883


More information about the hotspot-compiler-dev mailing list