RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v7]

Vladimir Kozlov kvn at openjdk.org
Tue Jan 16 22:31:30 UTC 2024


On Thu, 11 Jan 2024 23:06:32 GMT, Scott Gibbons <sgibbons at openjdk.org> wrote:

>> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions.  This change accelerates String.IndexOf on average 1.3x for AVX2.  The benchmark numbers:
>> 
>> 
>> Benchmark                               Score       Latest      Latest/Score
>> StringIndexOf.advancedWithMediumSub     343.573     317.934     0.925375393x
>> StringIndexOf.advancedWithShortSub1     1039.081    1053.96     1.014319384x
>> StringIndexOf.advancedWithShortSub2     55.828      110.541     1.980027943x
>> StringIndexOf.constantPattern           9.361       11.906      1.271872663x
>> StringIndexOf.searchCharLongSuccess     4.216       4.218       1.000474383x
>> StringIndexOf.searchCharMediumSuccess   3.133       3.216       1.02649218x
>> StringIndexOf.searchCharShortSuccess    3.76        3.761       1.000265957x
>> StringIndexOf.success                   9.186       9.713       1.057369911x
>> StringIndexOf.successBig                14.341      46.343      3.231504079x
>> StringIndexOfChar.latin1_AVX2_String    6220.918    12154.52    1.953814533x
>> StringIndexOfChar.latin1_AVX2_char      5503.556    5540.044    1.006629895x
>> StringIndexOfChar.latin1_SSE4_String    6978.854    6818.689    0.977049957x
>> StringIndexOfChar.latin1_SSE4_char      5657.499    5474.624    0.967675646x
>> StringIndexOfChar.latin1_Short_String   7132.541    6863.359    0.962260014x
>> StringIndexOfChar.latin1_Short_char     16013.389   16162.437   1.009307711x
>> StringIndexOfChar.latin1_mixed_String   7386.123    14771.622   1.999915517x
>> StringIndexOfChar.latin1_mixed_char     9901.671    9782.245    0.987938803x
>
> Scott Gibbons has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits:
> 
>  - Merge branch 'openjdk:master' into indexof
>  - Merge branch 'openjdk:master' into indexof
>  - Addressing review comments.
>  - Fix for JDK-8321599
>  - Support UU IndexOf
>  - Only use optimization when EnableX86ECoreOpts is true
>  - Fix whitespace
>  - Merge branch 'openjdk:master' into indexof
>  - Comments; added exhaustive-ish test
>  - Subtracting 0x10 twice.
>  - ... and 12 more: https://git.openjdk.org/jdk/compare/8e12053e...3e58d0c2

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4111:

> 4109:   if ((UseAVX == 2) && EnableX86ECoreOpts && VM_Version::supports_avx2()) {
> 4110:     StubRoutines::_string_indexof = generate_string_indexof();
> 4111:   }

What is the motivation for adding this extensive new code only for AVX2? A 30% speedup is nice (for some cases), but the stub is enabled only for AVX2 and not for AVX-512, which all modern x86 CPUs have, so the code will not be used.

Or is it a typo?
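
To make the concern concrete, here is a minimal standalone sketch of how the quoted condition behaves. It assumes UseAVX takes the value 3 when AVX-512 is selected. `CpuConfig` and both helper functions are hypothetical stand-ins, not HotSpot code, and the `>= 2` variant is only one possible reading of the "typo" question, not something the PR proposes.

```cpp
// Hypothetical model of the flags used in the quoted condition.
// None of these names exist in HotSpot; they only mirror the snippet above.
struct CpuConfig {
  int  use_avx;            // models UseAVX (assumed: 2 = AVX2, 3 = AVX-512)
  bool enable_ecore_opts;  // models EnableX86ECoreOpts
  bool supports_avx2;      // models VM_Version::supports_avx2()
};

// The condition exactly as quoted: the stub is generated only when UseAVX is 2.
constexpr bool stub_enabled_exact(const CpuConfig& c) {
  return (c.use_avx == 2) && c.enable_ecore_opts && c.supports_avx2;
}

// Hypothetical relaxed form: the stub is generated for AVX2 and anything above.
constexpr bool stub_enabled_at_least(const CpuConfig& c) {
  return (c.use_avx >= 2) && c.enable_ecore_opts && c.supports_avx2;
}

// An AVX2-only configuration enables the stub under both forms.
static_assert(stub_enabled_exact(CpuConfig{2, true, true}), "AVX2 is accepted");
static_assert(stub_enabled_at_least(CpuConfig{2, true, true}), "AVX2 is accepted");

// With UseAVX modeled as 3, only the relaxed form enables the stub.
static_assert(!stub_enabled_exact(CpuConfig{3, true, true}), "== 2 rejects 3");
static_assert(stub_enabled_at_least(CpuConfig{3, true, true}), ">= 2 accepts 3");
```

Under these assumptions, a machine running with UseAVX set to 3 never takes the new path with the `== 2` check, even though it supports AVX2.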

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1454139710

