RFR: 8300208: Optimize Adler32 stub for AVX-512 targets. [v2]

Fri Jan 27 17:35:20 UTC 2023

On Fri, 27 Jan 2023 11:22:36 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64_adler.cpp line 155:
>> 
>>> 153:       __ vpaddd(yb, yb, ya, Assembler::AVX_512bit);
>>> 154:       __ addptr(data, CHUNKSIZE);
>>> 155:       __ cmpptr(data, end);
>> 
>> This still processes 16 bytes of data per loop iteration, the same as the AVX2 loop. Have you given any thought to processing twice as much with AVX3?
>
> If we double the number of input bytes processed within the loop from 16 to 32, then the subsequent ADLER_SCALE table would also have to change to span 0-31. Since we cannot accommodate more than 16 integers in a 512-bit vector, it will not be possible to scale up the loop block size.

It could be done with something like the patch below:
[adler32.patch](https://github.com/openjdk/jdk/files/10521481/adler32.patch)
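
For reference, the arithmetic that makes a 32-byte block possible with only 16 scale factors can be sketched in scalar C++. This is not the content of the attached patch, only an illustration: the weights 32..17 needed for the first 16 bytes equal 16 + (16..1), so both halves can reuse one 16-entry scale table. The names `adler32_scalar`, `weighted16` and `adler32_block32` are hypothetical, and the buffer length is assumed to be a multiple of 32.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr uint32_t MOD_ADLER = 65521;

// Reference: byte-at-a-time Adler-32 update.
uint32_t adler32_scalar(uint32_t adler, const uint8_t* data, size_t len) {
  uint32_t a = adler & 0xffff, b = adler >> 16;
  for (size_t i = 0; i < len; i++) {
    a = (a + data[i]) % MOD_ADLER;
    b = (b + a) % MOD_ADLER;
  }
  return (b << 16) | a;
}

// Weighted sum of 16 bytes with scale factors 16..1 (stand-in for a
// 16-lane scale table); also returns the plain byte sum through *sum.
static uint32_t weighted16(const uint8_t* p, uint32_t* sum) {
  uint32_t w = 0, s = 0;
  for (uint32_t i = 0; i < 16; i++) {
    s += p[i];
    w += (16 - i) * p[i];
  }
  *sum = s;
  return w;
}

// Hypothetical 32-byte block update built from two 16-byte halves.
uint32_t adler32_block32(uint32_t adler, const uint8_t* data, size_t len) {
  assert(len % 32 == 0);  // assumption: tail bytes are handled elsewhere
  uint32_t a = adler & 0xffff, b = adler >> 16;
  for (size_t off = 0; off < len; off += 32) {
    uint32_t s0, s1;
    uint32_t w0 = weighted16(data + off, &s0);       // bytes 0..15
    uint32_t w1 = weighted16(data + off + 16, &s1);  // bytes 16..31
    // Full weights are 32..1. First half: 16 + (16..1), second half: 16..1.
    b = (b + 32 * a + 16 * s0 + w0 + w1) % MOD_ADLER;
    a = (a + s0 + s1) % MOD_ADLER;
  }
  return (b << 16) | a;
}
```

In a vectorized loop the extra `16 * s0` term would amount to one shift and add on the byte-sum accumulator of the first half, so the scale table itself would not need to grow.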

-------------

PR: https://git.openjdk.org/jdk/pull/12045