RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals [v3]
Andrew Haley
aph at openjdk.java.net
Mon Jul 5 12:16:51 UTC 2021
On Fri, 2 Jul 2021 09:54:32 GMT, Wang Huang <whuang at openjdk.org> wrote:
>> Dear all,
>> Could you do me a favor and review this patch? It improves the performance of the `String.equals` intrinsic on the Neon backend of AArch64.
>> We measured the performance with this JMH benchmark:
>>
>>
>> ```java
>> package com.huawei.string;
>> import java.util.*;
>> import java.util.concurrent.TimeUnit;
>>
>> import org.openjdk.jmh.annotations.CompilerControl;
>> import org.openjdk.jmh.annotations.Benchmark;
>> import org.openjdk.jmh.annotations.Level;
>> import org.openjdk.jmh.annotations.OutputTimeUnit;
>> import org.openjdk.jmh.annotations.Param;
>> import org.openjdk.jmh.annotations.Scope;
>> import org.openjdk.jmh.annotations.Setup;
>> import org.openjdk.jmh.annotations.State;
>> import org.openjdk.jmh.annotations.Fork;
>> import org.openjdk.jmh.infra.Blackhole;
>>
>> @State(Scope.Thread)
>> @OutputTimeUnit(TimeUnit.MILLISECONDS)
>> public class StringEqual {
>>     @Param({"8", "64", "4096"})
>>     int size;
>>
>>     String str1;
>>     String str2;
>>
>>     @Setup(Level.Trial)
>>     public void init() {
>>         str1 = newString(size, 'c', '1');
>>         str2 = newString(size, 'c', '2');
>>     }
>>
>>     public String newString(int length, char charToFill, char lastChar) {
>>         if (length > 0) {
>>             char[] array = new char[length];
>>             Arrays.fill(array, charToFill);
>>             array[length - 1] = lastChar;
>>             return new String(array);
>>         }
>>         return "";
>>     }
>>
>>     @Benchmark
>>     @CompilerControl(CompilerControl.Mode.DONT_INLINE)
>>     public boolean EqualString() {
>>         return str1.equals(str2);
>>     }
>> }
>>
>> ```
>> The results are listed below (Linux aarch64 with 128 cores):
>>
>> Benchmark | (size) | Mode | Cnt | Score | Error | Units
>> ----------------------------------|-------|---------|-------|------------|------------|----------
>> StringEqual.EqualString | 8 | thrpt | 10 | 123971.994 | ± 1462.131 | ops/ms
>> StringEqual.EqualString | 64 | thrpt | 10 | 56009.960 | ± 999.734 | ops/ms
>> StringEqual.EqualString | 4096 | thrpt | 10 | 1943.852 | ± 8.159 | ops/ms
>> StringEqual.EqualStringWithNEON | 8 | thrpt | 10 | 120319.271 | ± 1392.185 | ops/ms
>> StringEqual.EqualStringWithNEON | 64 | thrpt | 10 | 72914.767 | ± 1814.173 | ops/ms
>> StringEqual.EqualStringWithNEON | 4096 | thrpt | 10 | 2579.155 | ± 15.589 | ops/ms
>>
>> Yours,
>> WANG Huang
>
> Wang Huang has updated the pull request incrementally with one additional commit since the last revision:
>
> unroll when small string sizes
There is one other thing I should mention, of which you may not be aware.
Whenever you expand a macro inline, you reduce the opportunities for methods to be inlined. That's because if a method is bigger than 2500 bytes (the default limit), we do not inline it into other methods. Inlining is the most powerful optimization we have, but we need to prevent code size explosion.
So expanding code inline not only adds pressure on the machine's icache, the HotSpot code cache, and so on, but also prevents other optimizations. That's why we are very wary of increasing the size of String.equals to benefit unusual cases.
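For anyone who wants to check the limit on their own build, here is a minimal sketch. It assumes the 2500-byte figure corresponds to the HotSpot flag `InlineSmallCode` (that mapping is my assumption, as are the class name `InlineLimit` and the helper `flagValue`), and it reads the flag through `HotSpotDiagnosticMXBean`. The default can differ between platforms and compiler configurations, so treat the printed value as what your JVM uses, not as a constant.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

public class InlineLimit {
    // Returns the current value of a HotSpot flag as a string,
    // or null if this JVM does not expose the flag or the diagnostic bean.
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return null;
            }
            VMOption option = bean.getVMOption(name);
            return option.getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the flag (or the MXBean) does not exist on this JVM.
            return null;
        }
    }

    public static void main(String[] args) {
        // Assumption: InlineSmallCode is the flag behind the limit discussed above.
        System.out.println("InlineSmallCode = " + flagValue("InlineSmallCode"));
    }
}
```

Running it with `-XX:InlineSmallCode=<n>` on the command line should show the overridden value, which is one way to experiment with how the limit affects inlining decisions.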
-------------
PR: https://git.openjdk.java.net/jdk/pull/4423