RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v2]
Jason Tatton
github.com+70893615+jasontatton-aws at openjdk.java.net
Tue Oct 6 15:46:12 UTC 2020
On Fri, 2 Oct 2020 08:44:48 GMT, Jason Tatton <github.com+70893615+jasontatton-aws at openjdk.org> wrote:
>> Hi Jason,
>>
>> thanks for bringing String.indexOf() for latin strings up to date with the Unicode version.
>>
>> Your changes look good except a few minor issues I've commented on right in the code.
>>
>> I'd only like to ask you if you could possibly improve your test a little bit. As far as I understand, your search text
>> is a consecutive sequence of "abc" characters, so you'll always find the character your searching for within the next
>> three characters of the source text. This won't exercise the loops of your intrinsic. Maybe you can also add some test
>> versions where the search character will be found beyond the first 32/64 characters after "fromIndex"?
>
> @simonis Thank you for the corrections, I have ammended them in the latest comit as follows:
>
> Changes to unit test:
> - main test adjusted such that Strings gennerated are much longer (up to 2048 characters) and of the form: `azaza`,
> `aazaazaa`, `aaazaaazaaa`, etc with `'z'` being the search character searched for. Multiple instances of the search
> character are included in the String in order to validate that the starting offset is correctly handleded. Results are
> compared to non intrinsified version of the code. Longer strings means that the looping functionality of the various
> paths is entered into.
> - Run configurations introduced such that it checks behaviour where use of SSE and AVX instructions are restricted.
> - Tier4InvocationThreshold adjusted so as to ensure C2 code iis invoked.
>
> Other changes:
> - newlines added at end of files
>
> @vnkozlov here are the performance numbers as requested. I have included performance of the UTF16 version of the
> intrinsic for reference:
> | UseAVX= | UseSSE= | Benchmark | Mode | Cnt | Score | Error | Units |
> |---------|---------|-----------------------------------|------|-----|-------------|-------------|-------|
> | | 0 | IndexOfBenchmark.latin1_long_char | avgt | 5 | **447,493.398** | ± 4,666.386 | ns/op |
> | 0 | | IndexOfBenchmark.latin1_long_char | avgt | 5 | **104,735.941** | ± 2,484.403 | ns/op |
> | 1 | | IndexOfBenchmark.latin1_long_char | avgt | 5 | **104,342.844** | ± 2,656.343 | ns/op |
> | 2 | | IndexOfBenchmark.latin1_long_char | avgt | 5 | **61,000.418** | ± 1,543.951 | ns/op |
> | 3 | | IndexOfBenchmark.latin1_long_char | avgt | 5 | **60,607.988** | ± 1,466.354 | ns/op |
> | | 0 | IndexOfBenchmark.utf16_long_char | avgt | 5 | 672,475.302 | ± 4,998.596 | ns/op |
> | 0 | | IndexOfBenchmark.utf16_long_char | avgt | 5 | 175,521.654 | ± 7,549.094 | ns/op |
> | 1 | | IndexOfBenchmark.utf16_long_char | avgt | 5 | 172,514.981 | ± 3,561.040 | ns/op |
> | 2 | | IndexOfBenchmark.utf16_long_char | avgt | 5 | 120,725.748 | ± 2,004.400 | ns/op |
> | 3 | | IndexOfBenchmark.utf16_long_char | avgt | 5 | 120,664.623 | ± 1,988.419 | ns/op |
>
> I think the results are as expected, we see improvements in performance as the range of SSE and AVX instructions which
> can be used is expanded upon. Note that no improvement is observed with UseAVX=3 because there is no AVX-512 code in
> these intrinsics.
Hi All,
Just wondering if there is anything you'd like me to do in order to assist with moving this patch forward?
Thanks,
Jason
-------------
PR: https://git.openjdk.java.net/jdk/pull/71
More information about the core-libs-dev
mailing list