RFR: 8373026: C2 SuperWord and Vector API: vector algorithms test and benchmark [v10]
Emanuel Peter
epeter at openjdk.org
Tue Jan 20 07:23:36 UTC 2026
On Tue, 20 Jan 2026 07:12:05 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:
> > If the Vector API code can be fully intrinsified, we get fast code. But sometimes, the Vector API is horribly slow, much slower than scalar loop performance.
> > linux_aarch64_server: filterI, scanAddI, reduceAddIFieldsX4 are very slow
>
> Hi @eme64, I noticed that these benchmarks perform much worse on AArch64. May I ask whether the machine supports the SVE/SVE2 features? Thanks!
@XiaohongGong thanks for reviewing!
No, I only have access to NEON machines, unfortunately. I added a comment to the PR description for that just now :)
> test/micro/org/openjdk/bench/vm/compiler/VectorAlgorithmsImpl.java line 598:
>
>> 596: if (mask.anyTrue()) {
>> 597: var ml = mask.toLong();
>> 598: return i + Long.numberOfTrailingZeros(ml);
>
> Can we use `mask.firstTrue()` here?
Yes we can! Great suggestion. I applied the change :)
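
For readers following along, here is a rough scalar sketch of why the two forms agree inside the `mask.anyTrue()` guard. It does not use the Vector API itself: `MaskModel` and its helpers are hypothetical names that only model the documented behavior, assuming `toLong()` packs lane i into bit i and `firstTrue()` returns the vector length when no lane is set.

```java
// Scalar model of a vector mask, for illustration only.
// MaskModel, toLong and firstTrue here are hypothetical helpers,
// not the jdk.incubator.vector API.
final class MaskModel {
    // Pack lane i into bit i (least significant bit first),
    // mirroring the lane order documented for VectorMask.toLong().
    static long toLong(boolean[] lanes) {
        long bits = 0L;
        for (int i = 0; i < lanes.length && i < 64; i++) {
            if (lanes[i]) {
                bits |= 1L << i;
            }
        }
        return bits;
    }

    // Index of the first set lane, or lanes.length if no lane is set,
    // modelled on the documented result of VectorMask.firstTrue().
    static int firstTrue(boolean[] lanes) {
        for (int i = 0; i < lanes.length; i++) {
            if (lanes[i]) {
                return i;
            }
        }
        return lanes.length;
    }

    public static void main(String[] args) {
        boolean[] lanes = {false, false, true, false, true, false, false, false};
        // With at least one lane set, both forms give the same index.
        System.out.println(Long.numberOfTrailingZeros(toLong(lanes))); // prints 2
        System.out.println(firstTrue(lanes));                          // prints 2

        boolean[] none = new boolean[8];
        // With no lane set the two forms differ, which is why the guard matters.
        System.out.println(Long.numberOfTrailingZeros(toLong(none)));  // prints 64
        System.out.println(firstTrue(none));                           // prints 8
    }
}
```

So under the existing `anyTrue()` check the replacement should not change the result, and it avoids the explicit round trip through a `long`.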
-------------
PR Comment: https://git.openjdk.org/jdk/pull/28639#issuecomment-3771397665
PR Review Comment: https://git.openjdk.org/jdk/pull/28639#discussion_r2707076289