[9] RFR(S): 8158228: C1 incorrectly folds mismatched loads from stable arrays

Tobias Hartmann tobias.hartmann at oracle.com
Thu Jun 2 13:32:22 UTC 2016


Hi Vladimir,

thanks for the feedback!

On 01.06.2016 15:38, Vladimir Ivanov wrote:
> Tobias,
> 
> Thanks for taking care of the bug.
> 
>> https://bugs.openjdk.java.net/browse/JDK-8158228
>> http://cr.openjdk.java.net/~thartmann/8158228/webrev.00/
>>
>> Multiple tests fail since JDK 9 b110 because String.charAt() returns a wrong value for a constant string. In b110, JDK-8143407 [1] enabled C1 to constant fold stable array elements and since JDK-8150180 [2] String.value is constant. The problem is that C1 does not correctly fold mismatched loads from stable arrays. In this case, String.value is a stable byte[] and StringUTF16.charAt() is intrinsified as char load. Instead of a char load, C1 emits a byte load because it does not check for mismatched loads.
>>
>> I fixed this by passing the BasicType of the mismatched load to ciArray::element_value() to ensure the correct constant is emitted. I verified that both 'Lex06403.java' and 'NewTestCase.java' pass with the fix. Also tested with JPRT and RBT (running).
> 
> As I understand the only source of mismatched accesses are String intrinsics (getCharStringU/putCharStringU).

Yes, the mismatched access is emitted by the getCharStringU intrinsic.

> (Unsafe accesses are the main source of mismatched accesses, but C1 doesn't optimize them).
> 
> Can you issue UnsafeObjectOp instead of LoadIndexed/StoreIndexed in GraphBuilder::append_char_access or does C1 produce less efficient code for unsafe accesses?

It's possible but some boilerplate code to compute the offset is necessary:
http://cr.openjdk.java.net/~thartmann/8158228/webrev.unsafe/

However, the code is far less than optimal:

  0x00007fe4d159746a: mov    $0x1c,%rdx
  0x00007fe4d1597474: mov    $0x6d4e80818,%rsi
  0x00007fe4d159747e: movzwl (%rsi,%rdx,1),%edx
  0x00007fe4d1597482: and    $0xffff,%edx
  0x00007fe4d1597488: and    $0xffff,%edx
  0x00007fe4d159748e: mov    %rdx,%rbx
  0x00007fe4d1597491: cmp    $0x67,%ebx

vs.
  
  0x00007fe14559744a: mov    $0x67,%ebx
  0x00007fe14559744f: cmp    $0x67,%ebx

The load is not folded, we don't use addressing modes (because we compute the long offset beforehand) and there is a useless 'and'.

> Otherwise, I suggest to mark LoadIndexed/StoreIndexed as mismatched (or introduce UnsafeLoadIndexed/UnsafeStoreIndexed node) and check that in Canonicalizer::do_LoadIndexed.
> 
> Mismatch detection is more complex than just comparing basic types (see Type::make_constant_from_array_element() & check_mismatched_access(), though some checks are C2-specific).

Right, I agree. I would like to go with marking LoadIndexed/StoreIndexed as mismatched when emitting the intrinsic:
http://cr.openjdk.java.net/~thartmann/8158228/webrev.01/

Like this we still have optimal code. What do you think?

Thanks,
Tobias

> Since C1 doesn't optimize unsafe accesses, I suggest to confine the fix to String intrinsics.
> 
> Best regards,
> Vladimir Ivanov
> 
>>
>> Thanks,
>> Tobias
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8143407
>> [2] https://bugs.openjdk.java.net/browse/JDK-8150180


More information about the hotspot-compiler-dev mailing list