RFR: 8309583: AArch64: Optimize firstTrue() when amount of elements < 8

Chang Peng duke at openjdk.org
Thu Jun 8 02:52:19 UTC 2023


This patch optimizes VectorMask.firstTrue() on Neon when there are 2 or 4 elements in vector registers.

VectorMask.firstTrue() should return VLENGTH when the vector mask is all false [1]. The current implementation uses rbit followed by clz [2], which together count the trailing zero bits of the mask, and then uses csel [3] (conditional select) to pick the smaller of VLENGTH and the number of leading false lanes to ensure correctness.

When the boolean mask has only 2 or 4 elements, this patch sets bit 16 or bit 32 (counting from 0) respectively to 1 before the rbit and clz. With this trick, the maximum value calculated in such cases is VLENGTH (2 or 4), so the result needs no further clamping.
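
The arithmetic behind the trick can be checked with a small scalar model. This is only a sketch and not the HotSpot code: it assumes the mask has been moved into a 64-bit general register with one byte per lane and lane 0 in the lowest byte, and it uses Long.numberOfTrailingZeros to stand in for the rbit + clz pair. The class and method names are hypothetical.

```java
final class FirstTrueModel {
    // Pack boolean lanes into a long, one byte per lane, lane 0 in the low byte
    // (an assumed layout for illustration).
    static long pack(boolean... lanes) {
        long bits = 0L;
        for (int i = 0; i < lanes.length; i++) {
            if (lanes[i]) {
                bits |= 1L << (8 * i);
            }
        }
        return bits;
    }

    // Baseline model: trailing-zero count (rbit + clz), divide by 8 to get a
    // lane index, then clamp to vlen (the role played by csel).
    static int firstTrueWithClamp(long packedMask, int vlen) {
        int lane = Long.numberOfTrailingZeros(packedMask) >>> 3;
        return Math.min(lane, vlen);
    }

    // Patched model: set a sentinel bit just past the last lane so the
    // trailing-zero count can never exceed 8 * vlen; no clamp is needed.
    static int firstTrueWithSentinel(long packedMask, int vlen) {
        if (vlen != 2 && vlen != 4) {
            // With 8 lanes there is no spare bit in a 64-bit register.
            throw new IllegalArgumentException("sentinel model covers vlen 2 or 4 only");
        }
        long sentinel = 1L << (8 * vlen); // bit 16 for 2 lanes, bit 32 for 4 lanes
        return Long.numberOfTrailingZeros(packedMask | sentinel) >>> 3;
    }

    public static void main(String[] args) {
        long allFalse = pack(false, false, false, false);
        long thirdTrue = pack(false, false, true, false);
        System.out.println(firstTrueWithClamp(allFalse, 4) + " " + firstTrueWithSentinel(allFalse, 4));
        System.out.println(firstTrueWithClamp(thirdTrue, 4) + " " + firstTrueWithSentinel(thirdTrue, 4));
    }
}
```

In this model both variants agree for every 2-lane and 4-lane mask; the sentinel version simply trades the compare-and-select for a single bitwise OR.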

Test:
All vector and vectorapi tests passed.

Performance:
The benchmark function looks like this:


@Benchmark
public static int testInt() {
    int res = 0;
    for (int i = 0; i < LENGTH; i += INT_SPECIES.length()) {
        VectorMask<Integer> m = VectorMask.fromArray(INT_SPECIES, ia, i);
        res += m.firstTrue();
    }

    return res;
}


The following data was collected on a 128-bit Neon machine.

Benchmark     Before     After     Unit
testInt       22214.740  25627.833 ops/ms
testLong      11649.898  13698.535 ops/ms

[1]: https://docs.oracle.com/en/java/javase/20/docs/api/jdk.incubator.vector/jdk/incubator/vector/VectorMask.html#firstTrue()
[2]: https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/aarch64_vector.ad#L5540
[3]: https://developer.arm.com/documentation/ddi0602/2021-12/Base-Instructions/CSEL--Conditional-Select-

Change-Id: I4a2de805ffa4469f88d510c96617eae165f0e025

-------------

Commit messages:
 - 8309583: AArch64: Optimize firstTrue() when amount of elements < 8

Changes: https://git.openjdk.org/jdk/pull/14373/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14373&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8309583
  Stats: 84 lines in 2 files changed: 14 ins; 58 del; 12 mod
  Patch: https://git.openjdk.org/jdk/pull/14373.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14373/head:pull/14373

PR: https://git.openjdk.org/jdk/pull/14373

