RFR: 8278518: String(byte[], int, int, Charset) constructor and String.translateEscapes() miss bounds check elimination

amirhadadi duke at openjdk.java.net
Sat Dec 18 21:51:28 UTC 2021


On Fri, 17 Dec 2021 22:33:30 GMT, Dean Long <dlong at openjdk.org> wrote:

>> Originally this was spotted by by Amir Hadadi in https://stackoverflow.com/questions/70272651/missing-bounds-checking-elimination-in-string-constructor
>> 
>> It looks like in the following code in `String(byte[], int, int, Charset)`
>> 
>> while (offset < sl) {
>>     int b1 = bytes[offset];
>>     if (b1 >= 0) {
>>         dst[dp++] = (byte)b1;
>>         offset++;                                  // <---
>>         continue;
>>     }
>>     if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) &&
>>             offset + 1 < sl) {
>>         int b2 = bytes[offset + 1];
>>         if (!isNotContinuation(b2)) {
>>             dst[dp++] = (byte)decode2(b1, b2);
>>             offset += 2;
>>             continue;
>>         }
>>     }
>>     // anything not a latin1, including the repl
>>     // we have to go with the utf16
>>     break;
>> }
>> 
>> bounds check elimination is not executed when accessing byte array via `bytes[offset].
>> 
>> The reason, I guess, is that offset variable is modified within the loop (marked with arrow).
>> 
>> Possible fix for this could be changing:
>> 
>> `while (offset < sl)` ---> `while (offset >= 0 && offset < sl)`
>> 
>> However the best is to invest in C2 optimization to handle all such cases.
>> 
>> The following benchmark demonstrates good improvement:
>> 
>> @State(Scope.Thread)
>> @BenchmarkMode(Mode.AverageTime)
>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>> public class StringConstructorBenchmark {
>>   private byte[] array;
>>   private String str;
>> 
>>   @Setup
>>   public void setup() {
>>     str = "Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen. Я";//Latin1 ending with Russian
>>     array = str.getBytes(StandardCharsets.UTF_8);
>>   }
>> 
>>   @Benchmark
>>   public String newString()  {
>>       return new String(array, 0, array.length, StandardCharsets.UTF_8);
>>   }
>> 
>>   @Benchmark
>>   public String translateEscapes()  {
>>       return str.translateEscapes();
>>   }
>> }
>> 
>> Results:
>> 
>> //baseline
>> Benchmark Mode Cnt Score Error Units
>> StringConstructorBenchmark.newString avgt 50 173,092 ± 3,048 ns/op
>> 
>> //patched
>> Benchmark Mode Cnt Score Error Units
>> StringConstructorBenchmark.newString avgt 50 126,908 ± 2,355 ns/op
>> 
>> The same is observed in String.translateEscapes() for the same String as in the benchmark above:
>> 
>> //baseline
>> Benchmark Mode Cnt Score Error Units
>> StringConstructorBenchmark.translateEscapes avgt 100 53,627 ± 0,850 ns/op
>> 
>> //patched
>> Benchmark Mode Cnt Score Error Units
>> StringConstructorBenchmark.translateEscapes avgt 100 48,087 ± 1,129 ns/op
>> 
>> Also I've looked into this with `LinuxPerfAsmProfiler`, full output for baseline is available here https://gist.github.com/stsypanov/d2524f98477d633fb1d4a2510fedeea6 and for patched code here https://gist.github.com/stsypanov/16c787e4f9fa3dd122522f16331b68b7.
>> 
>> Here's the part of baseline assembly responsible for `while` loop: 
>> 
>>   3.62%           ││            │  0x00007fed70eb4c1c:   mov    %ebx,%ecx
>>   2.29%           ││            │  0x00007fed70eb4c1e:   mov    %edx,%r9d
>>   2.22%           ││            │  0x00007fed70eb4c21:   mov    (%rsp),%r8                   ;*iload_2 {reexecute=0 rethrow=0 return_oop=0}
>>                   ││            │                                                            ; - java.lang.String::<init>@107 (line 537)
>>   2.32%           ↘│            │  0x00007fed70eb4c25:   cmp    %r13d,%ecx
>>                    │            │  0x00007fed70eb4c28:   jge    0x00007fed70eb5388           ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
>>                    │            │                                                            ; - java.lang.String::<init>@110 (line 537)
>>   3.05%            │            │  0x00007fed70eb4c2e:   cmp    0x8(%rsp),%ecx
>>                    │            │  0x00007fed70eb4c32:   jae    0x00007fed70eb5319
>>   2.38%            │            │  0x00007fed70eb4c38:   mov    %r8,(%rsp)
>>   2.64%            │            │  0x00007fed70eb4c3c:   movslq %ecx,%r8
>>   2.46%            │            │  0x00007fed70eb4c3f:   mov    %rax,%rbx
>>   3.44%            │            │  0x00007fed70eb4c42:   sub    %r8,%rbx
>>   2.62%            │            │  0x00007fed70eb4c45:   add    $0x1,%rbx
>>   2.64%            │            │  0x00007fed70eb4c49:   and    $0xfffffffffffffffe,%rbx
>>   2.30%            │            │  0x00007fed70eb4c4d:   mov    %ebx,%r8d
>>   3.08%            │            │  0x00007fed70eb4c50:   add    %ecx,%r8d
>>   2.55%            │            │  0x00007fed70eb4c53:   movslq %r8d,%r8
>>   2.45%            │            │  0x00007fed70eb4c56:   add    $0xfffffffffffffffe,%r8
>>   2.13%            │            │  0x00007fed70eb4c5a:   cmp    (%rsp),%r8
>>                    │            │  0x00007fed70eb4c5e:   jae    0x00007fed70eb5319
>>   3.36%            │            │  0x00007fed70eb4c64:   mov    %ecx,%edi                    ;*aload_1 {reexecute=0 rethrow=0 return_oop=0}
>>                    │            │                                                            ; - java.lang.String::<init>@113 (line 538)
>>   2.86%            │           ↗│  0x00007fed70eb4c66:   movsbl 0x10(%r14,%rdi,1),%r8d       ;*baload {reexecute=0 rethrow=0 return_oop=0}
>>                    │           ││                                                            ; - java.lang.String::<init>@115 (line 538)
>>   2.48%            │           ││  0x00007fed70eb4c6c:   mov    %r9d,%edx
>>   2.26%            │           ││  0x00007fed70eb4c6f:   inc    %edx                         ;*iinc {reexecute=0 rethrow=0 return_oop=0}
>>                    │           ││                                                            ; - java.lang.String::<init>@127 (line 540)
>>   3.28%            │           ││  0x00007fed70eb4c71:   mov    %edi,%ebx
>>   2.44%            │           ││  0x00007fed70eb4c73:   inc    %ebx                         ;*iinc {reexecute=0 rethrow=0 return_oop=0}
>>                    │           ││                                                            ; - java.lang.String::<init>@134 (line 541)
>>   2.35%            │           ││  0x00007fed70eb4c75:   test   %r8d,%r8d
>>                    ╰           ││  0x00007fed70eb4c78:   jge    0x00007fed70eb4c04           ;*iflt {reexecute=0 rethrow=0 return_oop=0}
>>                                ││                                                            ; - java.lang.String::<init>@120 (line 539)
>> 
>> and this one is for patched code:
>> 
>>  17.28%           ││  0x00007f6b88eb6061:   mov    %edx,%r10d                   ;*iload_2 {reexecute=0 rethrow=0 return_oop=0}
>>                   ││                                                            ; - java.lang.String::<init>@107 (line 537)
>>   0.11%           ↘│  0x00007f6b88eb6064:   test   %r10d,%r10d
>>                    │  0x00007f6b88eb6067:   jl     0x00007f6b88eb669c           ;*iflt {reexecute=0 rethrow=0 return_oop=0}
>>                    │                                                            ; - java.lang.String::<init>@108 (line 537)
>>   0.39%            │  0x00007f6b88eb606d:   cmp    %r13d,%r10d
>>                    │  0x00007f6b88eb6070:   jge    0x00007f6b88eb66d0           ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
>>                    │                                                            ; - java.lang.String::<init>@114 (line 537)
>>   0.66%            │  0x00007f6b88eb6076:   mov    %ebx,%r9d
>>  13.70%            │  0x00007f6b88eb6079:   cmp    0x8(%rsp),%r10d
>>   0.01%            │  0x00007f6b88eb607e:   jae    0x00007f6b88eb6671
>>   0.14%            │  0x00007f6b88eb6084:   movsbl 0x10(%r14,%r10,1),%edi       ;*baload {reexecute=0 rethrow=0 return_oop=0}
>>                    │                                                            ; - java.lang.String::<init>@119 (line 538)
>>   0.37%            │  0x00007f6b88eb608a:   mov    %r9d,%ebx
>>   0.99%            │  0x00007f6b88eb608d:   inc    %ebx                         ;*iinc {reexecute=0 rethrow=0 return_oop=0}
>>                    │                                                            ; - java.lang.String::<init>@131 (line 540)
>>  12.88%            │  0x00007f6b88eb608f:   movslq %r9d,%rsi                    ;*bastore {reexecute=0 rethrow=0 return_oop=0}
>>                    │                                                            ; - java.lang.String::<init>@196 (line 548)
>>   0.17%            │  0x00007f6b88eb6092:   mov    %r10d,%edx
>>   0.39%            │  0x00007f6b88eb6095:   inc    %edx                         ;*iinc {reexecute=0 rethrow=0 return_oop=0}
>>                    │                                                            ; - java.lang.String::<init>@138 (line 541)
>>   0.96%            │  0x00007f6b88eb6097:   test   %edi,%edi
>>   0.02%            │  0x00007f6b88eb6099:   jl     0x00007f6b88eb60dc           ;*iflt {reexecute=0 rethrow=0 return_oop=0}
>> 
>> Between instructions mapped to `if_icmpge` and `aload_1` in baseline we have bounds check which is missing from patched code.
>
> This does look like a HotSpot JIT compiler issue to me.  My guess is that it is related to how checkBoundsOffCount() checks for offset < 0:
> 
> 396          if ((length | fromIndex | size) < 0 || size > length - fromIndex)
> 
> using | to combine three values.

@dean-long actually the issue reproduces with Java 17 where `checkBoundsOffCount` was implemented in a more straight forward way:


    static void checkBoundsOffCount(int offset, int count, int length) {
        if (offset < 0 || count < 0 || offset > length - count) {
            throw new StringIndexOutOfBoundsException(
                "offset " + offset + ", count " + count + ", length " + length);
        }
    }



Here's a [gist](https://gist.github.com/amirhadadi/9505c3f5d9ad68cad2fbfd1b9e01f0b8) with a benchmark you can run. In this benchmark I'm comparing safe and unsafe reads from the byte array  (I didn't modify the code to add the offset >= 0 condition).

Here are the results:


Benchmark                       Mode  Cnt    Score    Error  Units
StringBenchmark.safeDecoding    avgt   20  120.312 ± 11.674  ns/op
StringBenchmark.unsafeDecoding  avgt   20   72.628 ±  0.479  ns/op

-------------

PR: https://git.openjdk.java.net/jdk/pull/6812


More information about the hotspot-compiler-dev mailing list