RFR: 8282664: Unroll by hand StringUTF16 and StringLatin1 polynomial hash loops [v12]

Fri Nov 11 12:43:12 UTC 2022

On Mon, 31 Oct 2022 12:25:43 GMT, Claes Redestad <redestad at openjdk.org> wrote:

>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 3484:
>> 
>>> 3482:   decrementl(index);
>>> 3483:   jmpb(LONG_SCALAR_LOOP_BEGIN);
>>> 3484:   bind(LONG_SCALAR_LOOP_END);
>> 
>> You can share this loop with the scalar ones above.
>
> This might be messier than it first looks, since the two loops use different temp registers (the long scalar loop can scratch cnt1, while the short scalar loop scratches the coef register). I'll have to think about this for a bit.

As it happens, in the latest version the vector loop drops into the scalar loop after all 32-element chunks have been processed.
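
For readers following along, the control flow described above can be sketched in plain Java. This is only an illustration of the shape (a 32-element chunk loop that falls through into one shared scalar loop), not the intrinsic code in c2_MacroAssembler_x86.cpp. The names ChunkedHash, CHUNK and POWERS are hypothetical, and the Latin1-style `& 0xff` masking is an assumption made for the example.

```java
// Illustrative sketch only: mirrors "chunk loop, then scalar tail" for the
// polynomial hash h = 31*h + a[i]. All names here are hypothetical.
final class ChunkedHash {
    static final int CHUNK = 32;

    // POWERS[k] == 31^k, with int overflow wrapping mod 2^32 as in hashCode.
    static final int[] POWERS = new int[CHUNK + 1];
    static {
        POWERS[0] = 1;
        for (int k = 1; k <= CHUNK; k++) {
            POWERS[k] = 31 * POWERS[k - 1];
        }
    }

    static int hash(byte[] a) {
        int h = 0;
        int i = 0;
        // Largest multiple of CHUNK that fits in the input.
        int limit = a.length - (a.length % CHUNK);

        // Chunk loop: stands in for the vector loop. Each pass folds
        // CHUNK elements into h in one step.
        for (; i < limit; i += CHUNK) {
            int sum = 0;
            for (int k = 0; k < CHUNK; k++) {
                sum += (a[i + k] & 0xff) * POWERS[CHUNK - 1 - k];
            }
            h = h * POWERS[CHUNK] + sum;
        }

        // Shared scalar loop: handles the remaining (< CHUNK) elements,
        // and the whole input when it is shorter than one chunk.
        for (; i < a.length; i++) {
            h = 31 * h + (a[i] & 0xff);
        }
        return h;
    }
}
```

Because the chunk loop leaves i and h in a consistent state, a single scalar loop can serve both short inputs and the tail of long ones, which is the sharing suggested in the review comment.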

-------------

PR: https://git.openjdk.org/jdk/pull/10847