RFR: 8269559: AArch64: Implement string_compare intrinsic in SVE
Nick Gasson
ngasson at openjdk.java.net
Tue Aug 17 07:17:30 UTC 2021
On Mon, 16 Aug 2021 20:59:55 GMT, TatWai Chong <github.com+78814694+tatwaichong at openjdk.org> wrote:
> This patch implements string_compare intrinsic in SVE.
> It supports all LL, LU, UL and UU comparisons.
>
> As we haven't found an existing benchmark to measure performance impact,
> we created a benchmark derived from the test [1] for this evaluation.
> This benchmark is attached to this patch.
>
> In addition, this patch removes the unused temporary register `vtmp3` from
> the existing match rules for StrCmp.
>
> The results below show that all variants benefit substantially.
> Command: make exploded-test TEST="micro:StringCompareToDifferentLength"
>
> Benchmark    (size)  Mode  Cnt  Score  Speedup  Units
> compareToLL      24  avgt   10            1.0x  ms/op
> compareToLL      36  avgt   10            1.0x  ms/op
> compareToLL      72  avgt   10            1.0x  ms/op
> compareToLL     128  avgt   10            1.4x  ms/op
> compareToLL     256  avgt   10            1.8x  ms/op
> compareToLL     512  avgt   10            2.7x  ms/op
> compareToLU      24  avgt   10            1.6x  ms/op
> compareToLU      36  avgt   10            1.8x  ms/op
> compareToLU      72  avgt   10            2.3x  ms/op
> compareToLU     128  avgt   10            3.8x  ms/op
> compareToLU     256  avgt   10            4.7x  ms/op
> compareToLU     512  avgt   10            6.3x  ms/op
> compareToUL      24  avgt   10            1.6x  ms/op
> compareToUL      36  avgt   10            1.7x  ms/op
> compareToUL      72  avgt   10            2.2x  ms/op
> compareToUL     128  avgt   10            3.3x  ms/op
> compareToUL     256  avgt   10            4.4x  ms/op
> compareToUL     512  avgt   10            6.1x  ms/op
> compareToUU      24  avgt   10            1.0x  ms/op
> compareToUU      36  avgt   10            1.0x  ms/op
> compareToUU      72  avgt   10            1.4x  ms/op
> compareToUU     128  avgt   10            2.2x  ms/op
> compareToUU     256  avgt   10            2.6x  ms/op
> compareToUU     512  avgt   10            3.7x  ms/op
>
> [1] https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/intrinsics/string/TestStringCompareToDifferentLength.java
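
For readers unfamiliar with the LL/LU/UL/UU naming in the quoted description: the letters refer to the internal encodings (Latin-1 or UTF-16) of the two strings passed to `String.compareTo`. The sketch below is only an illustration and is not taken from the patch or its benchmark; the class name `StringCompareEncodings` and the helper `compareSign` are hypothetical. It assumes compact strings are enabled (the JDK default), so a string holding only Latin-1 characters is stored as Latin-1 and any other string as UTF-16. Whether the intrinsic is actually used depends on the JIT and the CPU; the Java-level result is the same either way.

```java
public class StringCompareEncodings {
    // Hypothetical helper: reduce compareTo's result to -1, 0 or 1.
    static int compareSign(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        // Only Latin-1 characters: stored as Latin-1 with compact strings.
        String latinA = "abcdefgh";
        String latinB = "abcdefgz";
        // Contains a Cyrillic character: stored as UTF-16.
        String utf16A = "abcdefg\u0444";
        String utf16B = "abcdefg\u0445";

        System.out.println("LL: " + compareSign(latinA, latinB)); // prints "LL: -1"
        System.out.println("LU: " + compareSign(latinA, utf16A)); // prints "LU: -1"
        System.out.println("UL: " + compareSign(utf16A, latinA)); // prints "UL: 1"
        System.out.println("UU: " + compareSign(utf16A, utf16B)); // prints "UU: -1"
    }
}
```

Each pair first differs at index 7, so every comparison has to scan past a common prefix before it can return, which is the part of the work a vectorized compare loop can speed up.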
src/hotspot/cpu/aarch64/aarch64.ad line 16515:
> 16513: instruct string_compareUL(iRegP_R1 str1, iRegI_R2 cnt1, iRegP_R3 str2, iRegI_R4 cnt2,
> 16514: iRegI_R0 result, iRegP_R10 tmp1, iRegL_R11 tmp2,
> 16515: vRegD_V0 vtmp1, vRegD_V1 vtmp2, vRegD_V2 vtmp3, rFlagsReg cr)
I think vtmp3 (=V2) is still used by the non-SVE compare-long-strings stub? (see `generate_compare_long_string_different_encoding`)
-------------
PR: https://git.openjdk.java.net/jdk/pull/5129