RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API [v2]
Xiaohong Gong
xgong at openjdk.org
Tue Jul 1 06:43:44 UTC 2025
On Tue, 1 Jul 2025 06:07:03 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:
>> Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits:
>>
>> - Address review comments
>> - Merge 'jdk:master' into JDK-8355563
>> - 8355563: VectorAPI: Refactor current implementation of subword gather load API
>
> Ping again! Thanks in advance!
> @XiaohongGong I'm a little busy at the moment, and soon going on a summer vacation, so I cannot promise a full review soon. Feel free to ask someone else to have a look.
>
> I quickly looked through the new benchmark results you published after the integration of #25539. There still seem to be a few cases where `Gain < 1`, especially:
>
> ```
> GatherOperationsBenchmark.microShortGather512_MASK 256 thrpt 30 ops/ms 11587.465 10674.598 0.92
> GatherOperationsBenchmark.microShortGather512_MASK 1024 thrpt 30 ops/ms 2902.731 2629.739 0.90
> GatherOperationsBenchmark.microShortGather512_MASK 4096 thrpt 30 ops/ms 741.546 671.124 0.90
> ```
>
> and
>
> ```
> GatherOperationsBenchmark.microShortGather256_MASK 256 thrpt 30 ops/ms 11339.217 10951.141 0.96
> GatherOperationsBenchmark.microShortGather256_MASK 1024 thrpt 30 ops/ms 2840.081 2718.823 0.95
> GatherOperationsBenchmark.microShortGather256_MASK 4096 thrpt 30 ops/ms 725.334 696.343 0.96
> ```
>
> and
>
> ```
> GatherOperationsBenchmark.microByteGather512_MASK 64 thrpt 30 ops/ms 50588.210 48220.741 0.95
> ```
>
> Do you know what happens in those cases?
Thanks for your input! Yes, I spent some time analyzing these small regressions. They seem to come from hardware-level effects such as cache misses or code alignment. When I tried a larger loop alignment, such as 32, the performance improved and the regressions disappeared. Since I'm not very familiar with x86 architectures, I'm not sure of the exact cause. Any suggestions on that?
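
For anyone who wants to sanity-check the regressions quoted above, here is a minimal sketch that recomputes the throughput ratio for the listed rows. It assumes the two numeric columns before `Gain` are the baseline and patched throughput in ops/ms, which the table itself does not state. `GainCheck` and `gain` are hypothetical names used only for illustration, and the ratio is printed unrounded because the table's rounding rule is not known.

```java
import java.util.Locale;

// Recomputes the "Gain" column for the rows quoted in the review comment.
// Assumption: the two throughput columns are baseline and patched, in that order.
final class GainCheck {
    // Throughput ratio; for ops/ms, a value below 1.0 means the patched run is slower.
    static double gain(double baseline, double patched) {
        return patched / baseline;
    }

    public static void main(String[] args) {
        String[] names = {
            "microShortGather512_MASK 256",
            "microShortGather512_MASK 1024",
            "microShortGather512_MASK 4096",
            "microShortGather256_MASK 256",
            "microShortGather256_MASK 1024",
            "microShortGather256_MASK 4096",
            "microByteGather512_MASK 64",
        };
        // {baseline ops/ms, patched ops/ms}, copied from the quoted benchmark rows.
        double[][] rows = {
            {11587.465, 10674.598},
            {2902.731, 2629.739},
            {741.546, 671.124},
            {11339.217, 10951.141},
            {2840.081, 2718.823},
            {725.334, 696.343},
            {50588.210, 48220.741},
        };
        for (int i = 0; i < rows.length; i++) {
            double g = gain(rows[i][0], rows[i][1]);
            System.out.printf(Locale.ROOT, "%-32s %.4f%n", names[i], g);
        }
    }
}
```

Every row should come out below 1.0, which matches the small regressions being discussed.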
-------------
PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-3022088710