[9 or 10?] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash

Tue Mar 14 17:15:01 UTC 2017

Roland,

I find it problematic to decide whether an unsafe access is always 
on-heap or not during parsing. Considering we plan to remove membars 
around suspicious accesses (as a fix for JDK-8176513), I would like to 
avoid conservatively treating them as raw.

Can we just disable loop unswitching when it produces raw accesses? How 
risky would it be?

There are cases when C2 already skips some optimizations when the base 
is NULL (e.g., LoadNode::split_through_phi or MemNode::Ideal_common).

FTR I faced a similar problem before (JDK-8155635 [1]) and initially 
experimented with a different approach: convert null-based oop pointers 
to raw pointers [2], but after additional discussions decided to abandon it.

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8155635
[2] 
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-April/022743.html

On 3/13/17 12:35 PM, Roland Westrelin wrote:
>
> http://cr.openjdk.java.net/~roland/8176506/webrev.00/
>
> In test1:
>
> 1) the call to test_helper is inlined and the checkcast first checks
> whether o is null or not with both branches possibly taken.
>
> 2) the compiler can't tell whether the unsafe access is off heap or
> not. It assumes it's on heap and compiles it as a load from an oop
> (address type is a TypeOopPtr).
>
> 3) the loop is unswitched using the null check from the checkcast. One
> of the loop copies is for the case where o is null. The memory access
> for that copy of the loop is a load from a null object and that causes a
> crash in C2.
>
> The fix I'm proposing is that for every unsafe access for which we can't
> tell whether the access is on heap or off heap we default to using raw
> memory instead of assuming it is on heap. This way we stop abusing the
> c2 type system by pretending something is an oop when we don't know and
> then crash when the assumption that that something is an oop becomes
> false. In the patch, I CheckCastPP the Object base to raw memory and
> makes sure we have one CheckCastPP per unsafe access so an address
> possibly computed from an oop is not live at a safepoint.
>
> This has drawbacks: we have an address computation per unsafe access, if
> the object becomes a constant or a stable array during optimizations we
> can't optimize the memory access anymore etc. but we already restrict
> optimizations a lot in the case where we do not know whether the access
> is off heap or not by putting membars around the memory access.
>
> There are cases where we actually implicitly know that the base is not
> null. test2 is such a case. The offset is small, so the object input can
> only be non null. Another example is an object field access which
> doesn't make sense off heap. In those cases, I suggest, c2 assumes the
> base is non null and uses a CastPP to cast it to non null.
>
> The problem is that the CastPP alone is not sufficient. In test2, if we
> cast the base to non null after loop unswitching we have a cast to non
> null with a null input in one branch. That data path dies but that
> branch of the if doesn't: that causes c2 to crash. So we actually need a
> proper null check that the CastPP depends on but, ideally, we wouldn't
> want to pay the price of the check at runtime. The trick I suggest is to
> have c2 add a null check with a null branch that goes to a Halt
> node. The If node input is a new opaque node with 2 inputs: input 1 is
> the null comparison, input 2 is the actual value of the null check (we
> know it never fails). After loop optimizations, the opaque node is
> replaced by its second input, the If goes away and there's no overhead
> at runtime. That new opaque node allows constant folding: if during
> optimizations input 1 of the opaque node becomes true or false, the
> opaque node is replaced by true or false. In the case of test2, the
> final code is the check for null from the checkcast with 2 branches: an
> impossible branch that ends up in an halt and the "good" branch where
> the unswitched loop is.
>
> Note that there are other intrinsics where we know from the context that
> a null object is impossible, so skip a null check. So for those
> intrinsics, we might need to cast to non null as well and need to have a
> null check that the cast depends on for consistency.
>
> Roland.
>