RFR: 8309583: AArch64: Optimize firstTrue() when amount of elements < 8 [v3]
Chang Peng
duke at openjdk.org
Mon Jun 19 02:06:27 UTC 2023
> This patch optimizes VectorMask.firstTrue() on Neon when there are 2 or 4 elements in vector registers.
>
> VectorMask.firstTrue() should return VLENGTH when the vector mask is all false [1]. The current implementation uses rbit followed by clz [2] to count the unset lanes before the first set lane, then uses csel [3] (conditional select) to take the smaller of VLENGTH and that count, which ensures correctness.
>
> When a boolean mask holds only 2 or 4 elements, this patch sets the 16th or 32nd bit to 1 before the rbit and clz steps. With this trick, the maximum value computed in these cases is VLENGTH (2 or 4).
>
> Test:
> All vector and vectorapi tests passed.
>
> Performance:
> The benchmark functions are in MaskQueryOperationsBenchmark.java [4]. This patch also modifies that benchmark so that it measures the performance of mask operations more effectively.
>
> The following data was collected on a 128-bit Neon machine.
>
> Benchmark                                       (inputs)   Mode    Before     After  Units
> MaskQueryOperationsBenchmark.testFirstTrueInt          1  thrpt  5952.670  7298.491  ops/ms
> MaskQueryOperationsBenchmark.testFirstTrueInt          2  thrpt  5951.513  7297.620  ops/ms
> MaskQueryOperationsBenchmark.testFirstTrueInt          3  thrpt  5953.048  7298.072  ops/ms
> MaskQueryOperationsBenchmark.testFirstTrueLong         1  thrpt  3496.990  4003.188  ops/ms
> MaskQueryOperationsBenchmark.testFirstTrueLong         2  thrpt  3497.755  4002.577  ops/ms
> MaskQueryOperationsBenchmark.testFirstTrueLong         3  thrpt  3500.085  4002.471  ops/ms
>
> [1]: https://docs.oracle.com/en/java/javase/20/docs/api/jdk.incubator.vector/jdk/incubator/vector/VectorMask.html#firstTrue()
> [2]: https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/aarch64_vector.ad#L5540
> [3]: https://developer.arm.com/documentation/ddi0602/2021-12/Base-Instructions/CSEL--Conditional-Select-
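
The sentinel-bit idea in the quoted description can be illustrated with a small scalar model in plain Java. This is only a sketch under stated assumptions, not the code in the patch: it assumes each mask lane occupies one byte of a 64-bit value (0x01 for true, 0x00 for false), that "16th or 32nd bit" means bit index 8 * VLENGTH counted from 0, and it uses Long.numberOfTrailingZeros as a stand-in for the rbit + clz pair. The class and method names (FirstTrueSketch, pack, firstTrueClamped, firstTrueSentinel) are hypothetical.

```java
public class FirstTrueSketch {

    // Hypothetical helper: pack boolean lanes into a long, one byte per lane.
    static long pack(boolean... lanes) {
        long bits = 0L;
        for (int i = 0; i < lanes.length; i++) {
            if (lanes[i]) {
                bits |= 1L << (8 * i);
            }
        }
        return bits;
    }

    // Baseline model: count trailing zero bits, convert bits to lanes,
    // then clamp to vlength (the role the description assigns to csel).
    static int firstTrueClamped(long packedMask, int vlength) {
        int lane = Long.numberOfTrailingZeros(packedMask) >>> 3; // 8 when all false
        return Math.min(lane, vlength);
    }

    // Sentinel model: set bit (8 * vlength) first, so the trailing-zero
    // count can never exceed 8 * vlength and no clamp is needed.
    static int firstTrueSentinel(long packedMask, int vlength) {
        if (vlength != 2 && vlength != 4) {
            // Assumption: the sketch covers only the 2- and 4-element cases.
            throw new IllegalArgumentException("sketch covers vlength 2 or 4 only");
        }
        long withSentinel = packedMask | (1L << (8 * vlength));
        return Long.numberOfTrailingZeros(withSentinel) >>> 3;
    }

    public static void main(String[] args) {
        long[] masks2 = { pack(false, false), pack(true, false), pack(false, true) };
        for (long m : masks2) {
            System.out.println("vlength=2 clamped=" + firstTrueClamped(m, 2)
                    + " sentinel=" + firstTrueSentinel(m, 2));
        }
        long[] masks4 = { pack(false, false, false, false), pack(false, false, true, false) };
        for (long m : masks4) {
            System.out.println("vlength=4 clamped=" + firstTrueClamped(m, 4)
                    + " sentinel=" + firstTrueSentinel(m, 4));
        }
    }
}
```

In this model both methods agree for every 2- and 4-lane mask, and the sentinel variant reaches the all-false answer without the final clamp.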
Chang Peng has updated the pull request incrementally with one additional commit since the last revision:
Update MaskQueryOperationsBenchmark.java
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/14373/files
- new: https://git.openjdk.org/jdk/pull/14373/files/d8507105..62a6522c
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=14373&range=02
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=14373&range=01-02
Stats: 195 lines in 1 file changed: 103 ins; 34 del; 58 mod
Patch: https://git.openjdk.org/jdk/pull/14373.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/14373/head:pull/14373
PR: https://git.openjdk.org/jdk/pull/14373