RFR: 8294588: Auto vectorize half precision floating point conversion APIs [v4]

Tue Dec 6 19:16:16 UTC 2022

> Hi All, 
> 
> I have added changes to auto-vectorize the Float.float16ToFloat and Float.floatToFloat16 APIs.
> The following are the performance numbers from the JMH microbenchmark Fp16ConversionBenchmark.
> Before the code changes:
> Benchmark | (size) | Mode | Cnt | Score | Error | Units
> Fp16ConversionBenchmark.float16ToFloat | 2048 | thrpt | 3 | 1044.653 | ± 0.041 | ops/ms
> Fp16ConversionBenchmark.float16ToFloatMemory | 2048 | thrpt | 3 | 2341529.9 | ± 11765.453 | ops/ms
> Fp16ConversionBenchmark.floatToFloat16 | 2048 | thrpt | 3 | 2156.662 | ± 0.653 | ops/ms
> Fp16ConversionBenchmark.floatToFloat16Memory | 2048 | thrpt | 3 | 2007988.1 | ± 361.696 | ops/ms
> 
> After:
> Benchmark | (size) | Mode | Cnt | Score | Error | Units
> Fp16ConversionBenchmark.float16ToFloat | 2048 | thrpt | 3 | 20460.349 | ± 372.327 | ops/ms
> Fp16ConversionBenchmark.float16ToFloatMemory | 2048 | thrpt | 3 | 2342125.200 | ± 9250.899 | ops/ms
> Fp16ConversionBenchmark.floatToFloat16 | 2048 | thrpt | 3 | 22553.977 | ± 483.034 | ops/ms
> Fp16ConversionBenchmark.floatToFloat16Memory | 2048 | thrpt | 3 | 2007899.797 | ± 150.296 | ops/ms
> 
> Kindly review and share your feedback.
> 
> Thanks.
> Smita
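
For readers unfamiliar with the two APIs, the kind of loop that such auto-vectorization would target might look like the sketch below. This is an illustration only, not the benchmark source from the pull request: the class name Fp16ConversionSketch and the helper names halfToFloat and floatToHalf are hypothetical, and the example assumes JDK 20 or later, where Float.float16ToFloat(short) and Float.floatToFloat16(float) are available. Whether the JIT actually vectorizes these loops depends on the CPU and the JVM build.

```java
// Hypothetical illustration; not taken from the pull request.
// Assumes JDK 20+ for Float.float16ToFloat and Float.floatToFloat16.
final class Fp16ConversionSketch {

    // Converts each IEEE 754 binary16 value (stored in a short) to a float.
    // A simple counted loop over arrays like this is a typical candidate
    // for auto-vectorization by the JIT compiler.
    static void halfToFloat(short[] src, float[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Float.float16ToFloat(src[i]);
        }
    }

    // Converts each float to its binary16 bit pattern (stored in a short).
    static void floatToHalf(float[] src, short[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Float.floatToFloat16(src[i]);
        }
    }

    public static void main(String[] args) {
        float[] input = {1.0f, -2.0f, 0.5f};
        short[] halves = new short[input.length];
        float[] roundTrip = new float[input.length];

        floatToHalf(input, halves);
        halfToFloat(halves, roundTrip);

        // These inputs are exactly representable in binary16,
        // so the round trip reproduces them.
        for (int i = 0; i < input.length; i++) {
            System.out.println(input[i] + " -> 0x"
                    + Integer.toHexString(halves[i] & 0xFFFF)
                    + " -> " + roundTrip[i]);
        }
    }
}
```

The values in main are chosen to be exactly representable in half precision; other inputs may be rounded by floatToFloat16.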

Smita Kamath has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits:

 - Merge master
 - Updated instruction definition
 - Updated code as per review comments
 - Auto vectorize half precision floating point conversion APIs

-------------

Changes: https://git.openjdk.org/jdk/pull/11471/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11471&range=03
  Stats: 214 lines in 11 files changed: 212 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/11471.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/11471/head:pull/11471

PR: https://git.openjdk.org/jdk/pull/11471