RFR: 8268229: Aarch64: Use Neon in intrinsics for String.equals [v2]
Wang Huang
whuang at openjdk.java.net
Thu Jun 24 09:26:30 UTC 2021
On Thu, 17 Jun 2021 09:28:19 GMT, Andrew Haley <aph at openjdk.org> wrote:
>>> With this change the size of the `string_equals` intrinsic increases by ~60% from 120 bytes to 196 bytes and this gets expanded at every `String.equals` call site. It looks good on a micro-benchmark but I wonder if on a larger program this improvement is outweighed by the negative effects of methods taking up more space in the icache.
>>
>> That's an excellent point. There's no need at all for the Neon part to be expanded inline: it could be a subroutine. We'd have to use fixed Neon registers at the call site.
>
> Thinking some more, we could use this opportunity to move as much of the bulk comparison code as we can out of line, hopefully achieving a reduction in footprint as well as an improvement in performance.
Dear @theRealAph @dgbo @nick-arm @mdinacci,
I have pushed my latest patch. In this commit:
* I tested some cases as @theRealAph suggested and found the following:
1) We varied the position of the first difference in the strings and collected the data for the case where Neon is used for all sizes.

Based on this result, we use the old implementation when the string is small.
2) The `8:64` result in this figure looked like a bug, and I fixed it by unrolling the loop:
```c++
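bind(LOOP); {
  ldr(tmp1, Address(post(a1, wordSize)));  // load one word from a1, post-increment a1
  ldr(tmp2, Address(post(a2, wordSize)));  // load one word from a2, post-increment a2
  subs(cnt1, cnt1, wordSize);              // cnt1 -= wordSize, setting the flags
  eor(tmp1, tmp1, tmp2);                   // non-zero iff the two words differ
  cbnz(tmp1, DONE);                        // mismatch: leave the loop
  br(LT, SHORT);                           // subs went negative: continue at SHORT
  // Second, unrolled copy of the word comparison.
  ldr(tmp1, Address(post(a1, wordSize)));
  ldr(tmp2, Address(post(a2, wordSize)));
  subs(cnt1, cnt1, wordSize);
  eor(tmp1, tmp1, tmp2);
  cbnz(tmp1, DONE);
} br(GE, LOOP);                            // loop while the last subs result is >= 0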
```
3) `UseSimpleStringEquals` is added in this patch. If the option is `true`, we use the old implementation.
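
For readers less familiar with the macro assembler, the unrolling in point 2 can be modelled in portable C++. This is only a sketch under assumptions: `load_word`, `equals_unrolled`, `strings_equal` and `kWordSize` are hypothetical names, the tail is simplified to a byte loop, and it reflects neither the Neon path nor the actual code in the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical portable model of the unrolled loop; not HotSpot code.
constexpr std::size_t kWordSize = 8;  // stands in for wordSize on AArch64

// Load 8 bytes without any alignment assumption (models one ldr).
static std::uint64_t load_word(const unsigned char* p) {
    std::uint64_t w;
    std::memcpy(&w, p, sizeof w);
    return w;
}

// Compare len bytes, checking two words per loop iteration.
bool equals_unrolled(const unsigned char* a1, const unsigned char* a2,
                     std::size_t len) {
    std::size_t i = 0;
    // Unrolled body: two word comparisons per loop-back branch.
    while (len - i >= 2 * kWordSize) {
        // XOR is non-zero iff the words differ (models eor + cbnz).
        if ((load_word(a1 + i) ^ load_word(a2 + i)) != 0) return false;
        if ((load_word(a1 + i + kWordSize) ^
             load_word(a2 + i + kWordSize)) != 0) return false;
        i += 2 * kWordSize;
    }
    // At most one full word is left after the unrolled loop.
    if (len - i >= kWordSize) {
        if ((load_word(a1 + i) ^ load_word(a2 + i)) != 0) return false;
        i += kWordSize;
    }
    // Simplified tail: remaining bytes one at a time.
    for (; i < len; ++i) {
        if (a1[i] != a2[i]) return false;
    }
    return true;
}

// Convenience wrapper: compare lengths first, then the contents.
bool strings_equal(const std::string& x, const std::string& y) {
    if (x.size() != y.size()) return false;
    return equals_unrolled(
        reinterpret_cast<const unsigned char*>(x.data()),
        reinterpret_cast<const unsigned char*>(y.data()), x.size());
}
```

The point of the unrolling is that each pass through the loop does two word comparisons for a single backward branch; whether that pays off for a given size is what the JMH numbers are meant to show.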
* My JMH results are listed here:
**Diff position is in the LAST 2/3 of the whole string**
Benchmark                      | (size) | Mode | Cnt |  Score | Error   | Units
-------------------------------|--------|------|-----|--------|---------|------
StringEquals.equalsLenT        |      8 | avgt |  10 |  7.869 | ± 0.063 | ns/op
StringEquals.equalsLenT        |     16 | avgt |  10 |  8.651 | ± 0.201 | ns/op
StringEquals.equalsLenT        |     32 | avgt |  10 |  9.869 | ± 0.049 | ns/op
StringEquals.equalsLenT        |     64 | avgt |  10 | 11.379 | ± 0.134 | ns/op
StringEquals.equalsLenT        |    128 | avgt |  10 | 17.312 | ± 0.274 | ns/op
StringEquals.equalsLenT_simple |      8 | avgt |  10 |  7.912 | ± 0.439 | ns/op
StringEquals.equalsLenT_simple |     16 | avgt |  10 |  8.764 | ± 0.061 | ns/op
StringEquals.equalsLenT_simple |     32 | avgt |  10 | 30.452 | ± 0.065 | ns/op
StringEquals.equalsLenT_simple |     64 | avgt |  10 | 14.550 | ± 0.199 | ns/op
StringEquals.equalsLenT_simple |    128 | avgt |  10 | 20.071 | ± 2.465 | ns/op
**Diff position is in the FIRST 1/3 of the whole string**
Benchmark                      | (size) | Mode | Cnt |  Score | Error   | Units
-------------------------------|--------|------|-----|--------|---------|------
StringEquals.equalsLenH        |      8 | avgt |  10 |  7.822 | ± 0.148 | ns/op
StringEquals.equalsLenH        |     16 | avgt |  10 |  7.631 | ± 0.179 | ns/op
StringEquals.equalsLenH        |     32 | avgt |  10 |  8.553 | ± 0.064 | ns/op
StringEquals.equalsLenH        |     64 | avgt |  10 | 11.944 | ± 0.554 | ns/op
StringEquals.equalsLenH        |    128 | avgt |  10 | 12.691 | ± 0.091 | ns/op
StringEquals.equalsLenH_simple |      8 | avgt |  10 |  7.873 | ± 0.141 | ns/op
StringEquals.equalsLenH_simple |     16 | avgt |  10 |  7.972 | ± 0.556 | ns/op
StringEquals.equalsLenH_simple |     32 | avgt |  10 |  8.383 | ± 0.100 | ns/op
StringEquals.equalsLenH_simple |     64 | avgt |  10 | 29.364 | ± 0.344 | ns/op
StringEquals.equalsLenH_simple |    128 | avgt |  10 | 14.748 | ± 0.354 | ns/op
-------------
PR: https://git.openjdk.java.net/jdk/pull/4423