RFR: 8300208: Optimize Adler32 stub for AVX-512 targets.

Tue Jan 17 17:33:04 UTC 2023

This patch optimizes the Adler32 stub for AVX-512 targets.

The main computation loop now uses a zero-extending, lane-widening vector load.
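
For context on why the load must zero-extend: Adler-32 (RFC 1950) sums input bytes as unsigned values in the range 0..255, so each byte has to be widened without sign extension before it is accumulated. The scalar Java sketch below only illustrates that arithmetic. It is not the stub code from this patch, and the class name `Adler32Scalar` is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;

// Hypothetical illustration class; not part of the patch.
class Adler32Scalar {
    static final int BASE = 65521; // largest prime below 2^16, per RFC 1950

    // Scalar reference: every byte is zero-extended to an int before accumulation.
    static long adler32(byte[] data) {
        int a = 1;
        int b = 0;
        for (byte v : data) {
            a = (a + (v & 0xff)) % BASE; // zero extension: 0x80 becomes 128, not -128
            b = (b + a) % BASE;
        }
        return ((long) b << 16) | a;
    }

    public static void main(String[] args) {
        byte[] data = "Wikipedia".getBytes(StandardCharsets.US_ASCII);
        Adler32 ref = new Adler32();
        ref.update(data, 0, data.length);
        // Print both values side by side so they can be compared.
        System.out.printf("sketch=%08x jdk=%08x%n", adler32(data), ref.getValue());
    }
}
```

A vectorized version performs the same widening on many bytes at once, which is roughly what a zero-extending widening load provides in a single instruction.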

The new sequence also honors AVX3Threshold: on relevant targets, the implementation uses the existing AVX2 instruction sequence
if the input size is smaller than the threshold (default 4096 bytes).
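
The size-based selection described above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not HotSpot code: the names `ThresholdDispatch`, `choose`, and `Path` are hypothetical, the threshold is hard-coded to the default quoted above, and the exact boundary comparison used by the stub is assumed.

```java
// Hypothetical sketch of size-based dispatch; names are illustrative only.
class ThresholdDispatch {
    // Assumed to mirror the default threshold quoted in this message.
    static final int AVX3_THRESHOLD = 4096;

    enum Path { AVX2, AVX512 }

    // Inputs smaller than the threshold stay on the AVX2 sequence;
    // larger inputs on AVX-512 capable hardware take the new sequence.
    static Path choose(boolean hasAvx512, int length) {
        if (hasAvx512 && length >= AVX3_THRESHOLD) {
            return Path.AVX512;
        }
        return Path.AVX2;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, 1024));
        System.out.println(choose(true, 8192));
        System.out.println(choose(false, 8192));
    }
}
```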

Following are the results of an [existing JMH microbenchmark](https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/util/TestAdler32.java) on various targets.

**System configurations: Turbo frequency scaling is disabled; all data is collected at a fixed frequency of 2.8 GHz.
SUT1   : Intel(R) Xeon(R) Platinum 8480+ Processor (Sapphire Rapids) 56C 2S
SUT2   : Intel(R) Xeon(R) Platinum 8380 CPU (Ice Lake Server) 40C 2S
SUT3   : Intel(R) Xeon(R) Platinum 8280 CPU (Cascade Lake Server) 28C 2S**

![image](https://user-images.githubusercontent.com/59989778/212934730-68717a61-191f-4dba-8c83-2eddf6007a47.png)

![image](https://user-images.githubusercontent.com/59989778/212934945-cada95ad-c93c-487f-bacc-928a2e3b5c21.png)

![image](https://user-images.githubusercontent.com/59989778/212935059-511aca3b-c736-40a2-bff6-89caf0664828.png)

Please review and share your feedback.

Best Regards,
Jatin

-------------

Commit messages:
 - 8300208: Optimize Adler32 stub for AVX-512 targets.

Changes: https://git.openjdk.org/jdk/pull/12045/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12045&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8300208
  Stats: 142 lines in 4 files changed: 84 ins; 28 del; 30 mod
  Patch: https://git.openjdk.org/jdk/pull/12045.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12045/head:pull/12045

PR: https://git.openjdk.org/jdk/pull/12045