[9] RFR(S): 8158228: C1 incorrectly folds mismatched loads from stable arrays

Tobias Hartmann tobias.hartmann at oracle.com
Mon Jun 6 10:56:58 UTC 2016


Hi Vladimir,

On 03.06.2016 15:28, Vladimir Ivanov wrote:
> 
>>> Primarily to keep the code sane :-) Unsafe accesses are the source of mismatched accesses and additional checks are required to filter out all problematic cases. So, I decided to leave them out when enabled constant folding of unsafe loads. My main motivation for the change was to make method handles for field / array element accesses as fast as using bytecode, and mismatched accesses didn't justify all the complications they cause.
>>
>> Okay, but in the case of the getChar intrinsic it should be safe to fold the mismatched access without additional checks, right?
> Yes, it will. But naive Java implementation does the job as good.
> 
> C2 intrinsic has the following code [1]:
> 
> bool LibraryCallKit::inline_string_char_access(bool is_store) {
> ...
>   // Bail when getChar over constants is requested: constant folding would
>   // reject folding mismatched char access over byte[]. A normal inlining for getChar
>   // Java method would constant fold nicely instead.
>   if (!is_store && value->is_Con() && index->is_Con()) {
>     return false;
>   }
> 
> You can do the same in C1.

Right, this code was added by JDK-8150180. When bailing out from the intrinsic we still have the chance to constant fold the accesses from the Java implementation. 

However, this seems to only work fine with C2 but not with C1:
http://cr.openjdk.java.net/~thartmann/8158228/webrev.03/

The generated code looks like this:

  0x00007f7b19597d12: mov    $0x6d4e01688,%rdx	// Array address
  0x00007f7b19597d1c: cmpl   $0xd,0xc(%rdx)
  0x00007f7b19597d23: jbe    0x00007f7b19597ef9
  0x00007f7b19597d29: movsbl 0x1d(%rdx),%edx    // Byte load
  0x00007f7b19597d2d: and    $0xff,%edx
  0x00007f7b19597d33: shl    $0x8,%edx
  0x00007f7b19597d36: or     $0x67,%edx
  0x00007f7b19597d39: and    $0xffff,%edx
  0x00007f7b19597d3f: and    $0xffff,%edx
  0x00007f7b19597d45: and    $0xffff,%edx
  0x00007f7b19597d4b: and    $0xffff,%edx
  0x00007f7b19597d51: mov    %rdx,%rbx
  0x00007f7b19597d54: cmp    $0x67,%ebx

C1 is not able to fold the second byte load that is shifted by 8 bits.

With webrev.02 (using the intrinsic but not constant folding the loads) it looks like this:

  0x00007fb780714cea: mov    $0x6d4e01688,%rdx
  0x00007fb780714cf4: movzwl 0x1c(%rdx),%edx
  0x00007fb780714cf8: and    $0xffff,%edx
  0x00007fb780714cfe: and    $0xffff,%edx
  0x00007fb780714d04: mov    %rdx,%rbx
  0x00007fb780714d07: cmp    $0x67,%ebx

I would therefore prefer to go with webrev.02. What do you think?

> While browsing the code, I noticed the following in j.l.StringUTF16:
> 
>     @HotSpotIntrinsicCandidate
>     public static char getChar(byte[] val, int index) {
>         index <<= 1;
>         return (char)(((val[index++] & 0xff) << HI_BYTE_SHIFT) |
>                       ((val[index]   & 0xff) << LO_BYTE_SHIFT));
>     }
> 
>     public static char charAt(byte[] value, int index) {
>         if (index < 0 || index >= value.length >> 1) {
>             throw new StringIndexOutOfBoundsException(index);
>         }
>         return getChar(value, index);
>     }
> 
> _getCharStringU doesn't do any out-of-bounds checks, but StringUTF16.getChar() is pervasively used across java.lang.
> 
> Is it intentional? I'd expect to see StringUTF16.getChar() private and users calling StringUTF16.charAt().

Yes, this is/was intentional. See my explanation here:
http://mail.openjdk.java.net/pipermail/hotspot-dev/2016-May/023047.html

"The problem is that some of the intrinsified String methods are heavily used and C2 is not always able to remove the range checks introduced by the public wrapper method. For example, StringUTF16::getChar() is used in places where we implicitly know that the access is in bounds (for example, because the array length is always a multiple of two) but C2 cannot prove this. Therefore, if we want to avoid a regression, we have to call the unchecked version of the intrinsic in some cases."

I completely agree that we should try to avoid this and I plan to revisit the issue with JDK-8156534.

Thanks,
Tobias

> 
> Best regards,
> Vladimir Ivanov
> 
> [1] http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/7f42e988b083/src/share/vm/opto/library_call.cpp#l1629


More information about the hotspot-compiler-dev mailing list