[9 or 10?] RFR(M): 8176506: C2: loop unswitching and unsafe accesses cause crash

Tue Mar 14 17:18:48 UTC 2017

> There are cases when C2 already skips some optimizations when the base
> is NULL (e.g., LoadNode::split_through_phi or MemNode::Ideal_common).

s/is NULL/can be NULL/

Best regards,
Vladimir Ivanov

> On 3/13/17 12:35 PM, Roland Westrelin wrote:
>>
>> http://cr.openjdk.java.net/~roland/8176506/webrev.00/
>>
>> In test1:
>>
>> 1) the call to test_helper is inlined and the checkcast first checks
>> whether o is null or not with both branches possibly taken.
>>
>> 2) the compiler can't tell whether the unsafe access is off heap or
>> not. It assumes it's on heap and compiles it as a load from an oop
>> (address type is a TypeOopPtr).
>>
>> 3) the loop is unswitched using the null check from the checkcast. One
>> of the loop copies is for the case where o is null. The memory access
>> for that copy of the loop is a load from a null object and that causes a
>> crash in C2.
>>
>> The fix I'm proposing is that for every unsafe access for which we can't
>> tell whether the access is on heap or off heap we default to using raw
>> memory instead of assuming it is on heap. This way we stop abusing the
>> c2 type system by pretending something is an oop when we don't know and
>> then crash when the assumption that that something is an oop becomes
>> false. In the patch, I CheckCastPP the Object base to raw memory and
>> makes sure we have one CheckCastPP per unsafe access so an address
>> possibly computed from an oop is not live at a safepoint.
>>
>> This has drawbacks: we have an address computation per unsafe access, if
>> the object becomes a constant or a stable array during optimizations we
>> can't optimize the memory access anymore etc. but we already restrict
>> optimizations a lot in the case where we do not know whether the access
>> is off heap or not by putting membars around the memory access.
>>
>> There are cases where we actually implicitly know that the base is not
>> null. test2 is such a case. The offset is small, so the object input can
>> only be non null. Another example is an object field access which
>> doesn't make sense off heap. In those cases, I suggest, c2 assumes the
>> base is non null and uses a CastPP to cast it to non null.
>>
>> The problem is that the CastPP alone is not sufficient. In test2, if we
>> cast the base to non null after loop unswitching we have a cast to non
>> null with a null input in one branch. That data path dies but that
>> branch of the if doesn't: that causes c2 to crash. So we actually need a
>> proper null check that the CastPP depends on but, ideally, we wouldn't
>> want to pay the price of the check at runtime. The trick I suggest is to
>> have c2 add a null check with a null branch that goes to a Halt
>> node. The If node input is a new opaque node with 2 inputs: input 1 is
>> the null comparison, input 2 is the actual value of the null check (we
>> know it never fails). After loop optimizations, the opaque node is
>> replaced by its second input, the If goes away and there's no overhead
>> at runtime. That new opaque node allows constant folding: if during
>> optimizations input 1 of the opaque node becomes true or false, the
>> opaque node is replaced by true or false. In the case of test2, the
>> final code is the check for null from the checkcast with 2 branches: an
>> impossible branch that ends up in an halt and the "good" branch where
>> the unswitched loop is.
>>
>> Note that there are other intrinsics where we know from the context that
>> a null object is impossible, so skip a null check. So for those
>> intrinsics, we might need to cast to non null as well and need to have a
>> null check that the cast depends on for consistency.
>>
>> Roland.
>>