RFR: 8173585: Intrinsify StringLatin1.indexOf(char)

Mon Sep 14 08:37:39 UTC 2020

On 12/09/2020 00:06, Jason Tatton wrote:

> The current indexOf char intrinsics for StringUTF16 and the new one
> here for StringLatin1 both use the AVX2 – i.e. 256 bit instructions,
> these are also affected by the frequency scaling which affects the
> AVX-512 instructions you pointed out. Of course in a world where all
> the work taking place is AVX instructions this wouldn’t be an issue
> but in mixed mode execution this is a problem.

Right.

> However, the compiler does have knowledge of the capability of the
> CPU upon which it’s optimizing code for and is able to decide
> whether to use AVX instructions if they are supported by the CPU AND
> if it wouldn’t be detrimental for performance. In fact, there is a
> flag which one can use to interact with this: -XX:UseAVX=version.

Sure. But the question to be answered is this: we can use AVX for
this, but should we? "It's there, so we should use it" isn't sound
reasoning.

My guess is that there are no (non-FP) workloads where AVX dominates,
and that throwing AVX at trivial jobs may therefore make our programs
slower overall, which would be a pessimization.

I have to emphasize that I don't know the answer to this question.

> This of course made testing this patch an interesting experience as
> the AVX2 instructions were not enabled on the Xeon processors which
> I had access to at AWS, but in the end I was able to use an i7 on my
> corporate macbook to validate the code.

So: is this worth doing? Are there workloads where this helps? Where
this hinders? This is the kind of systems thinking we should be doing
when optimizing.
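
One rough way to start probing the mixed-mode question is to interleave
indexOf(char) calls with ordinary scalar work and compare timings with
and without AVX. The following is only a minimal sketch, not a proper
benchmark: the class name, helper names, string length, and iteration
counts are all made up for illustration, and a JMH harness would be
needed before drawing any real conclusion.

```java
public class MixedIndexOfSketch {
    // Hypothetical helper: a Latin-1 string of n 'a' characters followed by one 'x'.
    static String haystack(int n) {
        return "a".repeat(n) + "x";
    }

    // Interleaves indexOf(char) (the call the intrinsic would replace)
    // with trivial scalar arithmetic, and returns a checksum so the JIT
    // cannot discard the work.
    static long mixedWork(String s, int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += s.indexOf('x'); // candidate for the intrinsic
            sum += i & 1;          // stand-in for unrelated scalar work
        }
        return sum;
    }

    public static void main(String[] args) {
        String s = haystack(64);   // assumed size; real workloads vary widely
        mixedWork(s, 200_000);     // crude warm-up so the loop gets compiled
        int iterations = 2_000_000;
        long t0 = System.nanoTime();
        long checksum = mixedWork(s, iterations);
        long t1 = System.nanoTime();
        System.out.printf("checksum=%d ns/iter=%.2f%n",
                checksum, (t1 - t0) / (double) iterations);
    }
}
```

Running it once with default settings and once with -XX:UseAVX=0 gives
a first, very coarse comparison. It says nothing about frequency
effects on other threads or cores, which is where the real concern lies.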

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671