RFR: 8300208: Optimize Adler32 stub for AVX-512 targets. [v2]
Sandhya Viswanathan
sviswanathan at openjdk.org
Fri Jan 27 17:35:20 UTC 2023
On Fri, 27 Jan 2023 11:22:36 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64_adler.cpp line 155:
>>
>>> 153: __ vpaddd(yb, yb, ya, Assembler::AVX_512bit);
>>> 154: __ addptr(data, CHUNKSIZE);
>>> 155: __ cmpptr(data, end);
>>
>> This still processes 16 bytes of data per loop iteration, the same as the AVX2 loop. Have you considered processing twice as much per iteration with AVX3?
>
> If we double the number of input bytes processed per loop iteration from 16 to 32, the subsequent ADLER_SCALE range would also change to 0-31. Since we cannot accommodate more than 16 integers in a 512-bit vector, it will not be possible to scale up the loop block size.
It could be done along the lines of the patch below:
[adler32.patch](https://github.com/openjdk/jdk/files/10521481/adler32.patch)
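
For readers without the patch at hand, here is a minimal scalar sketch of why a 32-byte block does not strictly need 32 distinct scale factors. It is not the contents of the linked patch and not the HotSpot stub; the function names (`adler32_scalar`, `group16`, `adler32_block32`) are hypothetical, and the 16..1 weights only presumably correspond to what the thread calls ADLER_SCALE. The idea: for a 32-byte block, byte j carries weight 32 - j in the `b` sum. For the first 16 bytes that is 16 + (16 - j), so the same 16-entry weight table can be applied to both halves, with an extra 16 * (plain sum of the first half) as correction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Adler-32 modulus (largest prime below 2^16).
constexpr uint32_t kMod = 65521;

// Reference implementation: one byte at a time.
uint32_t adler32_scalar(const uint8_t* data, size_t len) {
    uint32_t a = 1, b = 0;
    for (size_t i = 0; i < len; ++i) {
        a = (a + data[i]) % kMod;
        b = (b + a) % kMod;
    }
    return (b << 16) | a;
}

// Plain sum and weighted sum (weights 16..1) of one 16-byte group.
// Assumption: this stands in for 16 x 32-bit lanes of one 512-bit vector.
static void group16(const uint8_t* p, uint32_t& sum, uint32_t& weighted) {
    sum = 0;
    weighted = 0;
    for (uint32_t i = 0; i < 16; ++i) {
        sum += p[i];
        weighted += (16 - i) * p[i];
    }
}

// Hypothetical blocked variant: 32 bytes per iteration, handled as two
// 16-byte groups that share the same 16-entry weight table.
uint32_t adler32_block32(const uint8_t* data, size_t len) {
    assert(data != nullptr || len == 0);
    uint32_t a = 1, b = 0;
    size_t i = 0;
    for (; i + 32 <= len; i += 32) {
        uint32_t s0, w0, s1, w1;
        group16(data + i, s0, w0);       // bytes 0..15 of the block
        group16(data + i + 16, s1, w1);  // bytes 16..31 of the block
        // Byte j of the block has weight 32 - j:
        //   first half:  16 + (16 - j)  -> w0 + 16 * s0
        //   second half: 16 - (j - 16)  -> w1
        b = (b + 32 * a + w0 + 16 * s0 + w1) % kMod;
        a = (a + s0 + s1) % kMod;
    }
    // Remaining bytes (fewer than 32) are handled one at a time.
    for (; i < len; ++i) {
        a = (a + data[i]) % kMod;
        b = (b + a) % kMod;
    }
    return (b << 16) | a;
}
```

The sketch reduces modulo 65521 after every block to keep the arithmetic obviously overflow-free; a vectorized stub would normally defer the reduction across many blocks, which this example does not attempt to model.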
-------------
PR: https://git.openjdk.org/jdk/pull/12045