[vectorIntrinsics+compress] RFR: 8274975: Add micro benchmark: ALIBABA selective store use case

Tue Oct 12 14:29:58 UTC 2021

On Tue, 12 Oct 2021 11:10:00 GMT, Eric Liu <eliu at openjdk.org> wrote:

> How about the performance for byte and short types? Do you have performance data comparing the Java implementation against a scalar version? Some platforms may not support the compress/expand operation well. E.g., SVE doesn’t support compress on byte and short types. AVX512 may be the same without the VBMI2 extension.

Eric, on platforms without hardware support, such as AVX2 or Neon, vector-to-vector compression can be composed as mask -> shuffle -> rearrange. A precomputed permutation table can be used to look up the shuffle from the mask. John Rose described an algorithm for computing a compression shuffle from a mask; see the design discussion at https://github.com/openjdk/panama-vector/pull/115
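
To make the mask -> shuffle -> rearrange composition concrete, here is a minimal sketch that models an 8-lane int vector with plain Java arrays, so it runs without the incubator module. The class name, the table layout, and the helper names are all hypothetical, and the table construction is a simple illustration, not the algorithm from the PR 115 discussion. With the Vector API, the table row would presumably be wrapped with VectorShuffle.fromArray and applied with Vector.rearrange.

```java
import java.util.Arrays;

// Hypothetical scalar model of table-driven compression; not code from the PR.
public class CompressShuffleSketch {
    static final int LANES = 8; // assumed lane count, e.g. 8 ints in 256 bits

    // TABLE[mask][i] is the source lane that lands in output lane i.
    static final int[][] TABLE = buildTable();

    static int[][] buildTable() {
        int[][] table = new int[1 << LANES][LANES];
        for (int mask = 0; mask < (1 << LANES); mask++) {
            int out = 0;
            // Selected lanes first, in their original order.
            for (int lane = 0; lane < LANES; lane++) {
                if ((mask & (1 << lane)) != 0) table[mask][out++] = lane;
            }
            // Unselected lanes fill the tail so each row stays a permutation.
            // (A hardware compress would typically zero these lanes instead.)
            for (int lane = 0; lane < LANES; lane++) {
                if ((mask & (1 << lane)) == 0) table[mask][out++] = lane;
            }
        }
        return table;
    }

    // Scalar stand-in for a vector rearrange: dst[i] = src[shuffle[i]].
    static int[] rearrange(int[] src, int[] shuffle) {
        int[] dst = new int[LANES];
        for (int i = 0; i < LANES; i++) dst[i] = src[shuffle[i]];
        return dst;
    }

    // Selective store: pack the selected lanes, then copy only those to dst.
    // Returns the next write position in dst.
    static int selectiveStore(int[] src, int mask, int[] dst, int dstPos) {
        int[] packed = rearrange(src, TABLE[mask]);
        int count = Integer.bitCount(mask);
        System.arraycopy(packed, 0, dst, dstPos, count);
        return dstPos + count;
    }

    public static void main(String[] args) {
        int[] src = {10, 11, 12, 13, 14, 15, 16, 17};
        int mask = 0b10100110; // lanes 1, 2, 5 and 7 are selected
        int[] dst = new int[LANES];
        int next = selectiveStore(src, mask, dst, 0);
        System.out.println(Arrays.toString(Arrays.copyOf(dst, next)));
    }
}
```

The table costs 2^LANES rows, which is affordable for 8 lanes but grows quickly for byte or short vectors with many lanes; that is one reason to compute the shuffle from the mask rather than look it up.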

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/149