RFR: 8322174: RISC-V: C2 VectorizedHashCode RVV Version [v8]
Yuri Gaevsky
duke at openjdk.org
Tue Aug 19 09:30:44 UTC 2025
On Wed, 4 Jun 2025 06:04:46 GMT, Robbin Ehn <rehn at openjdk.org> wrote:
>> As you might expect, I am trying to implement the following code with RVV:
>>
>> for (; i + (N-1) < cnt; i += N) {
>>   h = 31^^N * h
>>     + 31^^(N-1) * val[i + 0]
>>     + 31^^(N-2) * val[i + 1]
>>     ...
>>     + 31^^1 * val[i + (N-2)]
>>     + 31^^0 * val[i + (N-1)];
>> }
>> for (; i < cnt; i++) {
>>   h = 31 * h + val[i];
>> }
>>
>> where `N` is the number of array elements processed per "chunk".
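For readers following the thread, the chunked recurrence quoted above can be checked against the scalar loop with a small plain-Java sketch. This is not the RVV code from the PR; the class and method names (`ChunkedHashSketch`, `powersDescending`, `chunkedHash`) are hypothetical, and `^^` is taken to mean exponentiation with the usual 32-bit wraparound of Java `int` arithmetic.

```java
import java.util.Arrays;

// Illustrative sketch only: scalar Java model of the chunked loop, not the RVV intrinsic.
final class ChunkedHashSketch {
    // coeffs[k] = 31^(n-1-k), so coeffs[0] pairs with val[i + 0] in a full chunk.
    static int[] powersDescending(int n) {
        int[] coeffs = new int[n];
        int p = 1;
        for (int k = n - 1; k >= 0; k--) {
            coeffs[k] = p;
            p *= 31; // wraps modulo 2^32, the same as the scalar hash does
        }
        return coeffs;
    }

    // Processes n elements per iteration, then finishes with the scalar tail loop.
    // Assumes n >= 1.
    static int chunkedHash(int h, int[] val, int n) {
        int[] coeffs = powersDescending(n);
        int pow31N = 31 * coeffs[0]; // 31^n
        int i = 0;
        for (; i + (n - 1) < val.length; i += n) {
            int sum = 0;
            for (int k = 0; k < n; k++) {
                sum += coeffs[k] * val[i + k];
            }
            h = pow31N * h + sum;
        }
        for (; i < val.length; i++) {
            h = 31 * h + val[i];
        }
        return h;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5};
        // Arrays.hashCode(int[]) uses the same recurrence with an initial value of 1.
        System.out.println(chunkedHash(1, data, 4) == Arrays.hashCode(data));
    }
}
```

The inner `for (k ...)` loop stands in for what a vector multiply plus reduction would do on one chunk; the coefficient array is computed once and reused for every full chunk.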
>> IIUC, the main issue with your approach is the "reverse" order of array elements versus the preloaded `31^^X` coeffs when the remaining number of elements is less than `N`, say `M=N-1`:
>>
>> h = 31^^M * h
>> + 31^^(M-1) * val[i + 0]
>> + 31^^(M-2) * val[i + 1]
>> ...
>> + 31^^1 * val[i + (M-2)]
>> + 31^^0 * val[i + (M-1)];
>>
>> or returning to our `N` for clarity
>>
>> h = 31^^(N-1) * h
>> + 31^^(N-2) * val[i + 0]
>> + 31^^(N-3) * val[i + 1]
>> ...
>> + 31^^1 * val[i + (N-3)]
>> + 31^^0 * val[i + (N-2)];
>>
>> Now we need to "slide down" the preloaded multiplier coeffs in the designated vector register by one (as `M=N-1`) so that they are in "sync" with `val[i + X]` (possibly moving them into a temporary VR in the process), and moreover do this operation IFF the remaining `cnt` is less than `N` (==> an additional check on every iteration). That's probably acceptable only in the tail phase as a one-time operation, but NOT inside the main loop...
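The "slide down" described above can be modelled in plain Java as an index offset into the preloaded coefficient array. This is a hedged sketch under the assumption that the coefficients are stored as `coeffs[k] = 31^(N-1-k)`; the names `TailSlideSketch`, `powersDescending` and `tailWithSlide` are hypothetical and do not come from the PR, and the real RVV code would slide a vector register rather than index an array.

```java
// Illustrative sketch only: the offset `shift` plays the role of the vector slide-down.
final class TailSlideSketch {
    // coeffs[k] = 31^(n-1-k), the layout assumed for the preloaded multipliers.
    static int[] powersDescending(int n) {
        int[] coeffs = new int[n];
        int p = 1;
        for (int k = n - 1; k >= 0; k--) {
            coeffs[k] = p;
            p *= 31;
        }
        return coeffs;
    }

    // Handles the last m = val.length - i elements, assuming 0 <= m < coeffs.length.
    static int tailWithSlide(int h, int[] val, int i, int[] coeffs) {
        int n = coeffs.length;
        int m = val.length - i;
        if (m == 0) {
            return h;
        }
        int shift = n - m; // slide amount; 1 when m == n - 1
        int sum = 0;
        for (int k = 0; k < m; k++) {
            // After sliding, val[i + k] lines up with 31^(m-1-k).
            sum += coeffs[k + shift] * val[i + k];
        }
        // coeffs[shift - 1] = 31^m is the multiplier for the running hash.
        return coeffs[shift - 1] * h + sum;
    }

    public static void main(String[] args) {
        int[] val = {7, 8, 9};
        int expected = 5;
        for (int v : val) {
            expected = 31 * expected + v; // scalar reference
        }
        System.out.println(tailWithSlide(5, val, 0, powersDescending(4)) == expected);
    }
}
```

Because the offset depends on the remaining count, it costs nothing extra when applied once after the main loop, which matches the point that the check and the slide are tolerable in the tail phase but not per iteration.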
>
> @ygaevsky @RealFYang how can we proceed?
> Thanks for the update. Latest version LGTM. Please get approval from @robehn
Sure. Many thanks for your thorough review.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/17413#issuecomment-3199967971