[9] RFR(S): 8158228: C1 incorrectly folds mismatched loads from stable arrays
Tobias Hartmann
tobias.hartmann at oracle.com
Thu Jun 2 13:32:22 UTC 2016
Hi Vladimir,
thanks for the feedback!
On 01.06.2016 15:38, Vladimir Ivanov wrote:
> Tobias,
>
> Thanks for taking care of the bug.
>
>> https://bugs.openjdk.java.net/browse/JDK-8158228
>> http://cr.openjdk.java.net/~thartmann/8158228/webrev.00/
>>
>> Multiple tests fail since JDK 9 b110 because String.charAt() returns a wrong value for a constant string. In b110, JDK-8143407 [1] enabled C1 to constant fold stable array elements and since JDK-8150180 [2] String.value is constant. The problem is that C1 does not correctly fold mismatched loads from stable arrays. In this case, String.value is a stable byte[] and StringUTF16.charAt() is intrinsified as char load. Instead of a char load, C1 emits a byte load because it does not check for mismatched loads.
>>
>> I fixed this by passing the BasicType of the mismatched load to ciArray::element_value() to ensure the correct constant is emitted. I verified that both 'Lex06403.java' and 'NewTestCase.java' pass with the fix. Also tested with JPRT and RBT (running).
>
> As I understand the only source of mismatched accesses are String intrinsics (getCharStringU/putCharStringU).
Yes, the mismatched access is emitted by the getCharStringU intrinsic.
> (Unsafe accesses are the main source of mismatched accesses, but C1 doesn't optimize them).
>
> Can you issue UnsafeObjectOp instead of LoadIndexed/StoreIndexed in GraphBuilder::append_char_access or does C1 produce less efficient code for unsafe accesses?
It's possible but some boilerplate code to compute the offset is necessary:
http://cr.openjdk.java.net/~thartmann/8158228/webrev.unsafe/
However, the code is far less than optimal:
0x00007fe4d159746a: mov $0x1c,%rdx
0x00007fe4d1597474: mov $0x6d4e80818,%rsi
0x00007fe4d159747e: movzwl (%rsi,%rdx,1),%edx
0x00007fe4d1597482: and $0xffff,%edx
0x00007fe4d1597488: and $0xffff,%edx
0x00007fe4d159748e: mov %rdx,%rbx
0x00007fe4d1597491: cmp $0x67,%ebx
vs.
0x00007fe14559744a: mov $0x67,%ebx
0x00007fe14559744f: cmp $0x67,%ebx
The load is not folded, we don't use addressing modes (because we compute the long offset beforehand) and there is a useless 'and'.
> Otherwise, I suggest to mark LoadIndexed/StoreIndexed as mismatched (or introduce UnsafeLoadIndexed/UnsafeStoreIndexed node) and check that in Canonicalizer::do_LoadIndexed.
>
> Mismatch detection is more complex than just comparing basic types (see Type::make_constant_from_array_element() & check_mismatched_access(), though some checks are C2-specific).
Right, I agree. I would like to go with marking LoadIndexed/StoreIndexed as mismatched when emitting the intrinsic:
http://cr.openjdk.java.net/~thartmann/8158228/webrev.01/
Like this we still have optimal code. What do you think?
Thanks,
Tobias
> Since C1 doesn't optimize unsafe accesses, I suggest to confine the fix to String intrinsics.
>
> Best regards,
> Vladimir Ivanov
>
>>
>> Thanks,
>> Tobias
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8143407
>> [2] https://bugs.openjdk.java.net/browse/JDK-8150180
More information about the hotspot-compiler-dev
mailing list