RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7]
Vladimir Kozlov
kvn at openjdk.org
Tue Jan 16 22:31:30 UTC 2024
On Thu, 11 Jan 2024 23:06:32 GMT, Scott Gibbons <sgibbons at openjdk.org> wrote:
>> Rewrite the IndexOf code to use only AVX2 instructions, without the pcmpestri instruction. This change accelerates String.indexOf by 1.3x on average on AVX2. The benchmark numbers:
>>
>>
>> Benchmark                              Score      Latest     Latest/Score
>> StringIndexOf.advancedWithMediumSub    343.573    317.934    0.925375393x
>> StringIndexOf.advancedWithShortSub1    1039.081   1053.96    1.014319384x
>> StringIndexOf.advancedWithShortSub2    55.828     110.541    1.980027943x
>> StringIndexOf.constantPattern          9.361      11.906     1.271872663x
>> StringIndexOf.searchCharLongSuccess    4.216      4.218      1.000474383x
>> StringIndexOf.searchCharMediumSuccess  3.133      3.216      1.02649218x
>> StringIndexOf.searchCharShortSuccess   3.76       3.761      1.000265957x
>> StringIndexOf.success                  9.186      9.713      1.057369911x
>> StringIndexOf.successBig               14.341     46.343     3.231504079x
>> StringIndexOfChar.latin1_AVX2_String   6220.918   12154.52   1.953814533x
>> StringIndexOfChar.latin1_AVX2_char     5503.556   5540.044   1.006629895x
>> StringIndexOfChar.latin1_SSE4_String   6978.854   6818.689   0.977049957x
>> StringIndexOfChar.latin1_SSE4_char     5657.499   5474.624   0.967675646x
>> StringIndexOfChar.latin1_Short_String  7132.541   6863.359   0.962260014x
>> StringIndexOfChar.latin1_Short_char    16013.389  16162.437  1.009307711x
>> StringIndexOfChar.latin1_mixed_String  7386.123   14771.622  1.999915517x
>> StringIndexOfChar.latin1_mixed_char    9901.671   9782.245   0.987938803x
>
> Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits:
>
> - Merge branch 'openjdk:master' into indexof
> - Merge branch 'openjdk:master' into indexof
> - Addressing review comments.
> - Fix for JDK-8321599
> - Support UU IndexOf
> - Only use optimization when EnableX86ECoreOpts is true
> - Fix whitespace
> - Merge branch 'openjdk:master' into indexof
> - Comments; added exhaustive-ish test
> - Subtracting 0x10 twice.
> - ... and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2
src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4111:
> 4109: if ((UseAVX == 2) && EnableX86ECoreOpts && VM_Version::supports_avx2()) {
> 4110: StubRoutines::_string_indexof = generate_string_indexof();
> 4111: }
What is the motivation for adding this extensive new code only for AVX2? A 30% speedup is nice (for some cases), but the code is enabled only when UseAVX is exactly 2 and not for AVX-512, which all modern x86 CPUs have, so it will not be used there.
Or is it a typo?
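
To make the question concrete, here is a minimal sketch of how the two possible guards differ. It is a standalone model, not HotSpot code: `CpuConfig`, `stub_enabled_exact`, and `stub_enabled_at_least` are hypothetical names, and the `>=` variant is only an assumption about what the "typo" reading would imply. It relies on UseAVX=3 being the level that selects AVX-512.

```cpp
// Hypothetical model of the stub-selection guard; not HotSpot code.
struct CpuConfig {
    int  use_avx;        // models the UseAVX flag (3 selects AVX-512)
    bool ecore_opts;     // models EnableX86ECoreOpts
    bool supports_avx2;  // models VM_Version::supports_avx2()
};

// Guard as written in the quoted patch: matches only UseAVX == 2.
constexpr bool stub_enabled_exact(const CpuConfig& c) {
    return (c.use_avx == 2) && c.ecore_opts && c.supports_avx2;
}

// Alternative guard (assumed, for comparison): UseAVX of 2 or higher.
constexpr bool stub_enabled_at_least(const CpuConfig& c) {
    return (c.use_avx >= 2) && c.ecore_opts && c.supports_avx2;
}
```

With the exact comparison, a configuration running with UseAVX=3 skips the new stub even though the CPU also supports AVX2; with the `>=` form it would be picked up.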
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1454139710