RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals [v3]
Andrew Haley
aph at openjdk.java.net
Mon Jul 5 16:50:50 UTC 2021
On Fri, 2 Jul 2021 09:54:32 GMT, Wang Huang <whuang at openjdk.org> wrote:
>> Dear all,
>> Could you do me a favor and review this patch? It improves the performance of the `String.equals` intrinsic on the Neon backend of AArch64.
>> We measured the performance with this JMH benchmark:
>>
>>
>> ```java
>> package com.huawei.string;
>> import java.util.*;
>> import java.util.concurrent.TimeUnit;
>>
>> import org.openjdk.jmh.annotations.CompilerControl;
>> import org.openjdk.jmh.annotations.Benchmark;
>> import org.openjdk.jmh.annotations.Level;
>> import org.openjdk.jmh.annotations.OutputTimeUnit;
>> import org.openjdk.jmh.annotations.Param;
>> import org.openjdk.jmh.annotations.Scope;
>> import org.openjdk.jmh.annotations.Setup;
>> import org.openjdk.jmh.annotations.State;
>> import org.openjdk.jmh.annotations.Fork;
>> import org.openjdk.jmh.infra.Blackhole;
>>
>> @State(Scope.Thread)
>> @OutputTimeUnit(TimeUnit.MILLISECONDS)
>> public class StringEqual {
>>     @Param({"8", "64", "4096"})
>>     int size;
>>
>>     String str1;
>>     String str2;
>>
>>     @Setup(Level.Trial)
>>     public void init() {
>>         str1 = newString(size, 'c', '1');
>>         str2 = newString(size, 'c', '2');
>>     }
>>
>>     public String newString(int length, char charToFill, char lastChar) {
>>         if (length > 0) {
>>             char[] array = new char[length];
>>             Arrays.fill(array, charToFill);
>>             array[length - 1] = lastChar;
>>             return new String(array);
>>         }
>>         return "";
>>     }
>>
>>     @Benchmark
>>     @CompilerControl(CompilerControl.Mode.DONT_INLINE)
>>     public boolean EqualString() {
>>         return str1.equals(str2);
>>     }
>> }
>>
>> ```
>> The results are listed below (Linux aarch64 with 128 cores):
>>
>> Benchmark | (size) | Mode | Cnt | Score | Error | Units
>> ----------------------------------|-------|---------|-------|------------|------------|----------
>> StringEqual.EqualString | 8 | thrpt | 10 | 123971.994 | ± 1462.131 | ops/ms
>> StringEqual.EqualString | 64 | thrpt | 10 | 56009.960 | ± 999.734 | ops/ms
>> StringEqual.EqualString | 4096 | thrpt | 10 | 1943.852 | ± 8.159 | ops/ms
>> StringEqual.EqualStringWithNEON | 8 | thrpt | 10 | 120319.271 | ± 1392.185 | ops/ms
>> StringEqual.EqualStringWithNEON | 64 | thrpt | 10 | 72914.767 | ± 1814.173 | ops/ms
>> StringEqual.EqualStringWithNEON | 4096 | thrpt | 10 | 2579.155 | ± 15.589 | ops/ms
>>
>> Yours,
>> WANG Huang
>
> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>
> unroll when small string sizes
I'm still seeing a slight advantage for `ldp` on Graviton 2:
Benchmark           (size)  Mode  Cnt   Score   Error  Units
StringEquals.equal     256  avgt    5  15.592 ± 0.080  us/op
StringEquals.equal     512  avgt    5  28.467 ± 0.245  us/op
StringEquals.equal    1024  avgt    5  53.883 ± 0.272  us/op
Versus the latest Neon version:
Benchmark           (size)  Mode  Cnt   Score   Error  Units
StringEquals.equal     256  avgt    5  16.848 ± 0.158  us/op
StringEquals.equal     512  avgt    5  29.640 ± 0.024  us/op
StringEquals.equal    1024  avgt    5  55.257 ± 0.050  us/op
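
To put "slight" in numbers, the gap can be expressed as a relative slowdown of the Neon run against the `ldp` run. This is only a sketch: the class and method names are made up for illustration, and the inputs are just the avgt scores quoted above (lower is better), ignoring the error bars.

```java
public class LdpVsNeon {
    // Relative slowdown of `candidate` versus `baseline`, in percent.
    // Both inputs are average times (lower is better), e.g. JMH avgt scores in us/op.
    public static double slowdownPercent(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Scores copied from the two tables above (Graviton 2, us/op).
        int[] sizes = {256, 512, 1024};
        double[] ldp = {15.592, 28.467, 53.883};
        double[] neon = {16.848, 29.640, 55.257};

        for (int i = 0; i < sizes.length; i++) {
            System.out.printf("size %4d: Neon is %.2f%% slower than ldp%n",
                    sizes[i], slowdownPercent(ldp[i], neon[i]));
        }
    }
}
```

On these figures the gap is roughly 8%, 4%, and 2.5% at sizes 256, 512, and 1024, so it narrows as the strings get longer.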
-------------
PR: https://git.openjdk.java.net/jdk/pull/4423