RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals

Nick Gasson ngasson at openjdk.java.net
Tue Jun 15 03:57:48 UTC 2021


On Wed, 9 Jun 2021 03:10:45 GMT, Wang Huang <whuang at openjdk.org> wrote:

> Dear all,
>      Could you please review this patch? It improves the performance of the `String.equals` intrinsic on the AArch64 Neon backend.
>      We measured the performance with this JMH benchmark:
>  
> 
>    ```java
>     package com.huawei.string;
>     import java.util.Arrays;
>     import java.util.concurrent.TimeUnit;
>     
>     import org.openjdk.jmh.annotations.Benchmark;
>     import org.openjdk.jmh.annotations.CompilerControl;
>     import org.openjdk.jmh.annotations.Level;
>     import org.openjdk.jmh.annotations.OutputTimeUnit;
>     import org.openjdk.jmh.annotations.Param;
>     import org.openjdk.jmh.annotations.Scope;
>     import org.openjdk.jmh.annotations.Setup;
>     import org.openjdk.jmh.annotations.State;
>     
>     @State(Scope.Thread)
>     @OutputTimeUnit(TimeUnit.MILLISECONDS)
>     public class StringEqual {
>         @Param({"8", "64", "4096"})
>         int size;
>     
>         String str1;
>         String str2;
>     
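>         // The two strings have the same length and differ only in their
>         // last character, so equals() has to scan the whole content.
>         @Setup(Level.Trial)
>         public void init() {
>             str1 = newString(size, 'c', '1');
>             str2 = newString(size, 'c', '2');
>         }
>     
>         // Builds a string of `length` chars: all charToFill except the
>         // final char, which is lastChar. Returns "" for length <= 0.
>         public String newString(int length, char charToFill, char lastChar) {
>             if (length > 0) {
>                 char[] array = new char[length];
>                 Arrays.fill(array, charToFill);
>                 array[length - 1] = lastChar;
>                 return new String(array);
>             }
>             return "";
>         }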
>     
>         @Benchmark
>         @CompilerControl(CompilerControl.Mode.DONT_INLINE)
>         public boolean EqualString() {
>             return str1.equals(str2);
>         }
>     }
> 
>    ```
> The results are listed below (Linux aarch64 with 128 cores):
> 
> Benchmark                      | (size) | Mode  | Cnt | Score      | Error      | Units
> -------------------------------|--------|-------|-----|------------|------------|-------
> StringEqual.EqualString        | 8      | thrpt | 10  | 123971.994 | ± 1462.131 | ops/ms
> StringEqual.EqualString        | 64     | thrpt | 10  | 56009.960  | ± 999.734  | ops/ms
> StringEqual.EqualString        | 4096   | thrpt | 10  | 1943.852   | ± 8.159    | ops/ms
> StringEqual.EqualStringWithNEON | 8     | thrpt | 10  | 120319.271 | ± 1392.185 | ops/ms
> StringEqual.EqualStringWithNEON | 64    | thrpt | 10  | 72914.767  | ± 1814.173 | ops/ms
> StringEqual.EqualStringWithNEON | 4096  | thrpt | 10  | 2579.155   | ± 15.589   | ops/ms
> 
> Yours, 
> WANG Huang

With this change the size of the `string_equals` intrinsic increases by ~60%, from 120 bytes to 196 bytes, and it is expanded at every `String.equals` call site. It looks good on a micro-benchmark, but I wonder whether in a larger program the improvement would be outweighed by the negative effect of methods taking up more space in the icache.
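
For a rough sense of the trade-off, the arithmetic can be sketched as below. This is only back-of-the-envelope: it uses the figures quoted above, reads the `EqualStringWithNEON` rows as the patched results (my reading of the table), and `TradeoffSketch` and `percentChange` are made-up names for illustration.

```java
// Back-of-the-envelope arithmetic on the numbers quoted in this thread.
// The class and method names are hypothetical, chosen for illustration.
final class TradeoffSketch {
    // Relative change from `before` to `after`, in percent.
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Intrinsic code size in bytes, before and after the patch.
        System.out.printf("code size:     %+.1f%%%n", percentChange(120, 196));
        // Throughput in ops/ms for each benchmark size, baseline vs. Neon rows.
        System.out.printf("thrpt (8):     %+.1f%%%n", percentChange(123971.994, 120319.271));
        System.out.printf("thrpt (64):    %+.1f%%%n", percentChange(56009.960, 72914.767));
        System.out.printf("thrpt (4096):  %+.1f%%%n", percentChange(1943.852, 2579.155));
    }
}
```

None of this says anything about icache behaviour in a real application; it only puts the code-size growth next to the micro-benchmark gains.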

src/hotspot/cpu/aarch64/aarch64.ad line 16676:

> 16674:   format %{ "String Equals $str1,$str2,$cnt -> $result" %}
> 16675:   ins_encode %{
> 16676:     // Count is in 8-bit bytes; non-Compact chars are 8 bits.

This change is a bit confusing: non-compact chars are still 16 bits; it's just that at this point we know the string contains only 8-bit Latin-1 characters. I think it would be better to delete everything after the ";" instead (or leave the comment as it is).
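
To illustrate the distinction, here is a minimal sketch in plain Java. It is not the HotSpot code: `CompactStringSketch`, `fitsLatin1` and `backingBytes` are hypothetical names, and it assumes compact strings are enabled (the default since JDK 9), where a string whose chars all fit in Latin-1 is stored with one byte per char and any other string with two.

```java
// Illustrative sketch only; this is not how HotSpot implements the check.
// The class and method names are hypothetical.
final class CompactStringSketch {
    // True if every char fits in one byte (the Latin-1 range), so the
    // string could be stored with one byte per char.
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Size of the backing array in bytes under the compact-strings scheme:
    // one byte per char for Latin-1 content, two bytes per char otherwise.
    static int backingBytes(String s) {
        return fitsLatin1(s) ? s.length() : 2 * s.length();
    }

    public static void main(String[] args) {
        // Latin-1 content: the byte count equals the char count.
        System.out.println(backingBytes("cccccccc"));
        // One non-Latin-1 char forces two bytes for every char.
        System.out.println(backingBytes("ccc\u4e2d"));
    }
}
```

So a byte count equal to the char count is a property of the Latin-1 case, not of chars in general, which is why the original wording of the comment reads better.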

-------------

PR: https://git.openjdk.java.net/jdk/pull/4423

