RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v4]
Nils Eliasson
neliasso at openjdk.java.net
Tue Oct 6 20:21:09 UTC 2020
On Fri, 2 Oct 2020 08:40:58 GMT, Jason Tatton <github.com+70893615+jasontatton-aws at openjdk.org> wrote:
>> This is an implementation of the indexOf(char) intrinsic for StringLatin1 (1 byte encoded Strings). It is provided for
>> x86 and ARM64. The implementation is greatly inspired by the indexOf(char) intrinsic for StringUTF16. To incorporate it
>> I had to make a small change to StringLatin1.java (refactor of functionality to intrinsified private method) as well as
>> code for C2. Submitted to: hotspot-compiler-dev and core-libs-dev as this patch contains a change to hotspot and
>> java/lang/StringLatin1.java https://bugs.openjdk.java.net/browse/JDK-8173585
>>
>> Details of testing:
>> ============
>> I have created a jtreg test “compiler/intrinsics/string/TestStringLatin1IndexOfChar” to cover this new intrinsic. Note
>> that, particularly for the x86 implementation of the intrinsic, the code path taken is dependent upon the length of the
>> input String. Hence the test has been designed to cover all these cases. In summary they are:
>> - A “short” string of < 16 characters.
>> - A SIMD String of 16 – 31 characters.
>> - An AVX2 SIMD String of 32+ characters.
>>
>> Hardware used for testing:
>> -----------------------------
>>
>> - Intel Xeon CPU E5-2680 (JVM did not recognize this as having AVX2 support).
>> - Intel i7 processor (with AVX2 support).
>> - AWS Graviton 2 (ARM 64 processor).
>>
>> I also ran ‘run-test-tier1’ and ‘run-test-tier2’ for x86_64 and aarch64.
>>
>> Possible future enhancements:
>> ====================
>> For the x86 implementation there may be two further improvements we can make in order to improve performance of both
>> the StringUTF16 and StringLatin1 indexOf(char) intrinsics:
>> 1. Make use of AVX-512 instructions.
>> 2. For “short” Strings (see above), I think it may be possible to modify the existing algorithm to still use SSE SIMD
>> instructions instead of a loop.
>>
>> Benchmark results:
>> ============
>> **Without** the new StringLatin1 indexOf(char) intrinsic:
>>
>> | Benchmark | Mode | Cnt | Score | Error | Units |
>> | ------------- | ------------- |------------- |------------- |------------- |------------- |
>> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **26,389.129** | ± 182.581 | ns/op |
>> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 17,885.383 | ± 435.933 | ns/op |
>>
>>
>> **With** the new StringLatin1 indexOf(char) intrinsic:
>>
>> | Benchmark | Mode | Cnt | Score | Error | Units |
>> | ------------- | ------------- |------------- |------------- |------------- |------------- |
>> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **17,875.185** | ± 407.716 | ns/op |
>> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 18,292.802 | ± 167.306 | ns/op |
>>
>>
>> The objective of the patch is to bring the performance of StringLatin1 indexOf(char) in line with StringUTF16
>> indexOf(char) for x86 and ARM64. We can see above that this has been achieved. Similar results were obtained when
>> running on ARM.
>
> Jason Tatton has updated the pull request incrementally with one additional commit since the last revision:
>
> 8173585: Intrinsify StringLatin1.indexOf(char)
>
> Rewrite of unit test and newlines added to end of files
>
> Changes to unit test:
> - main test adjusted such that Strings generated are much longer (up to
> 2048 characters) and of the form: azaza, aazaazaa, aaazaaazaaa, etc with
> 'z' being the character searched for. Multiple instances of the
> search character are included in the String in order to validate that
> the starting offset is correctly handled. Results are compared to the
> non-intrinsified version of the code. Longer strings mean that the looping
> functionality of the various paths is entered.
> - Run configurations introduced to check behaviour where use
> of SSE and AVX instructions is restricted.
> - Tier4InvocationThreshold adjusted so as to ensure C2 code is invoked.
>
> Other changes:
> - newlines added at end of files
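For readers following along, the string shape described in the commit message (azaza, aazaazaa, aaazaaazaaa, ...) and the comparison against a non-intrinsified search could be sketched roughly as below. This is not the code from the patch: the class name, `patterned`, `referenceIndexOf` and `mismatches` are hypothetical names chosen for illustration, and the plain loop only stands in for whatever reference implementation the real test uses.

```java
// Illustrative sketch only; all names here are hypothetical, not taken from the patch.
class PatternedStringSketch {

    // Builds run*'a', 'z', run*'a', 'z', ... ending on a full run of 'a',
    // never exceeding maxLen characters. Requires Java 11+ for String.repeat.
    static String patterned(int run, int maxLen) {
        StringBuilder sb = new StringBuilder();
        while (sb.length() + run <= maxLen) {
            sb.append("a".repeat(run));
            // Only add a separator if another full run of 'a' still fits after it.
            if (sb.length() + 1 + run <= maxLen) {
                sb.append('z');
            } else {
                break;
            }
        }
        return sb.toString();
    }

    // Plain-loop reference search, standing in for the non-intrinsified code path.
    static int referenceIndexOf(String s, char ch, int from) {
        for (int i = Math.max(from, 0); i < s.length(); i++) {
            if (s.charAt(i) == ch) {
                return i;
            }
        }
        return -1;
    }

    // Counts starting offsets at which String.indexOf disagrees with the reference.
    static int mismatches(String s, char ch) {
        int bad = 0;
        for (int from = 0; from <= s.length(); from++) {
            if (s.indexOf(ch, from) != referenceIndexOf(s, ch, from)) {
                bad++;
            }
        }
        return bad;
    }
}
```

Walking every starting offset is what makes the repeated 'z' useful: each offset past one occurrence has to find the next one. Whether the intrinsic is actually exercised still depends on the method being C2-compiled, which this sketch does not attempt to force.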
test/hotspot/jtreg/compiler/intrinsics/string/TestStringLatin1IndexOfChar.java line 25:
> 23: import jdk.test.lib.Asserts;
> 24:
> 25: public class TestStringLatin1IndexOfChar{
Can you please add testing for these edge cases:
- when the search char is the first char
- when the search char is the last char
- when the string has length 1
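Something along the following lines is what I have in mind. It is only a sketch: the class and helper names are hypothetical, the lengths are picked to sit around the 16 and 32 character boundaries mentioned in the PR description, and the real test would presumably report through jdk.test.lib.Asserts rather than counting failures.

```java
import java.util.Arrays;

// Illustrative sketch only; names are hypothetical and not part of the patch.
class IndexOfEdgeCaseSketch {

    // Returns a string of 'a' of the given length with a single 'z' at position pos.
    static String withCharAt(int len, int pos) {
        char[] cs = new char[len];
        Arrays.fill(cs, 'a');
        cs[pos] = 'z';
        return new String(cs);
    }

    // Returns the number of edge-case checks that produced an unexpected index.
    static int failures() {
        int failures = 0;
        // Lengths around the short (< 16), 16-31 and 32+ boundaries, plus length 1.
        int[] lengths = {1, 2, 15, 16, 31, 32, 33, 64};
        for (int len : lengths) {
            // Search char is the first char.
            if (withCharAt(len, 0).indexOf('z') != 0) {
                failures++;
            }
            // Search char is the last char.
            if (withCharAt(len, len - 1).indexOf('z') != len - 1) {
                failures++;
            }
        }
        // Length-1 string that does not contain the search char.
        if ("a".indexOf('z') != -1) {
            failures++;
        }
        return failures;
    }
}
```

For length 1 the first-char and last-char cases coincide, so that entry covers the single-character string directly.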
-------------
PR: https://git.openjdk.java.net/jdk/pull/71