RFR: 8356760: VectorAPI: Optimize VectorMask.fromLong for all-true/all-false cases [v3]
erifan
duke at openjdk.org
Fri Jul 4 11:10:40 UTC 2025
On Fri, 4 Jul 2025 06:18:02 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
> public static final VectorSpecies<Float> FSP = FloatVector.SPECIES_512;
>
> public static long micro1(long a) {
>     long mask = Math.min(-1, Math.max(-1, a));
>     return VectorMask.fromLong(FSP, mask).toLong();
> }
>
> public static long micro2() {
>     return FSP.maskAll(true).toLong();
> }
With this JMH method we cannot see an obvious performance improvement, because the hot spots are in other instructions. Adding a loop works better:
    @Benchmark
    public long micro_3() {
        long result = 0;
        for (int i = 0; i < ITERATION; i++) {
            long mask = Math.min(-1, Math.max(-1, result));
            result += VectorMask.fromLong(FSP, mask).toLong();
        }
        return result;
    }
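
For reference, the clamp expression in the loop evaluates to -1 for every input, so the mask passed to `fromLong` is all-true on each iteration. A minimal plain-Java sketch that checks this, with no Vector API involved (the class and method names `ClampCheck` and `clamp` are hypothetical, used only for illustration):

```java
public class ClampCheck {
    // Same expression as in the benchmark loop.
    // Math.max(-1, a) is always >= -1, and Math.min(-1, ...) then caps it at -1.
    static long clamp(long a) {
        return Math.min(-1, Math.max(-1, a));
    }

    public static void main(String[] args) {
        long[] samples = {Long.MIN_VALUE, -2L, -1L, 0L, 1L, Long.MAX_VALUE};
        for (long a : samples) {
            // Prints the input and the clamped value for each sample.
            System.out.println(a + " -> " + clamp(a));
        }
    }
}
```

Because the result does not depend on `result`, the loop only keeps the `fromLong` call from being hoisted trivially at the source level; the mask value itself never changes.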
For non-floating-point types, however, there is no obvious performance improvement. The pattern `VectorMaskToLong(VectorLongToMask(l))` for integer types is already optimized, whereas `VectorMaskToLong(VectorMaskCast(VectorLongToMask(l)))` for floating-point types is not. So if we add JMH benchmarks for this optimization, we will only see a good performance gain for floating-point types. Do you think it is still necessary?
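
To make the round trip in these patterns concrete: at the API level, `VectorMask.fromLong(species, l).toLong()` keeps only the low `species.length()` bits of `l`. The scalar model below is only an illustration of that documented behavior, not of the C2 implementation. `MaskRoundTripModel`, `roundTrip` and `laneCount` are hypothetical names, and 16 is the lane count of `FloatVector.SPECIES_512` (512 bits / 32-bit float lanes).

```java
public class MaskRoundTripModel {
    // Scalar model of fromLong(...).toLong(): only the low laneCount bits survive.
    static long roundTrip(long bits, int laneCount) {
        // Avoid shifting by 64, which Java would treat as a shift by 0.
        long laneMask = (laneCount >= 64) ? -1L : (1L << laneCount) - 1;
        return bits & laneMask;
    }

    public static void main(String[] args) {
        int floatLanes512 = 16; // assumed lane count for a 512-bit float species
        // All-true input: every one of the 16 lanes is set.
        System.out.println(Long.toHexString(roundTrip(-1L, floatLanes512))); // ffff
        // All-false input: no lane is set.
        System.out.println(Long.toHexString(roundTrip(0L, floatLanes512)));  // 0
    }
}
```

Under this model, the all-true and all-false inputs are the two cases where the result is known without inspecting individual lanes.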
@jatin-bhateja Thanks for your review!
-------------
PR Comment: https://git.openjdk.org/jdk/pull/25793#issuecomment-3035646085
More information about the hotspot-compiler-dev mailing list