RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API

Xiaohong Gong xgong at openjdk.org
Tue May 20 05:42:51 UTC 2025


On Tue, 20 May 2025 02:22:13 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:

>> Ping again~  could any one please take a look at this PR? Thanks a lot!
>
>> Hi @XiaohongGong , Very nice work!, Looks good to me, will do some testing and get back.
>> 
>> Do you have any idea about following regression?
>> 
>> ```
>> GatherOperationsBenchmark.microByteGather256              thrpt  30  ops/ms  64    55844.814   48311.847  0.86
>> GatherOperationsBenchmark.microByteGather256              thrpt  30  ops/ms  256   15139.459   13009.848  0.85
>> GatherOperationsBenchmark.microByteGather256              thrpt  30  ops/ms  1024   3861.834    3284.944  0.85
>> GatherOperationsBenchmark.microByteGather256              thrpt  30  ops/ms  4096    938.665     817.673  0.87
>> ```
>> 
>> Best Regards
> 
> Yes, I also observed such regression. After analyzing, I found it was caused by the java side changes, which influences the range check elimination inside `IntVector.fromArray()` and `ByteVector.intoArray()` in the benchmark. The root cause is the counted loop in following benchmark case is not recognized by compiler as expected:
> 
> public void microByteGather256() {
>         for (int i = 0; i < SIZE; i += B256.length()) {
>             ByteVector.fromArray(B256, barr, 0, index, i)
>                             .intoArray(bres, i);
>         }
>  }
> ``` 
> The loop iv phi node is not recognized successfully when C2 recognize the counted loop pattern, because it was casted twice with `CastII` in this case. The ideal graph looks like:
> 
>                            Loop
>                                \
>                                     \        / -----------------------------
>                                        \  /                                 |
>                                        Phi                                  |
>                                          |                                  | 
>                                       CastII                                |
>                                           |                                 |
>                                       CastII                                |
>                                           |                                 |
>                                              \   ConI                       |
>                                                 \  |                        |
>                                                 AddVI                       |
>                                                   |-------------------------|                           
> 
> Relative code is https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/loopnode.cpp#L1667. 
> 
> Befor...

> @XiaohongGong Thanks for splitting this one out, and for investigating the regressions here.
> 
> Putting the permalink here, fixed to the current change (the link you pasted will always refer to the newest, which may later on point to the wrong line when lines above are inserted / deleted):
> 
> https://github.com/openjdk/jdk/blob/7077535c0b0a6ea0a2a167f9135b1504a3d71fb3/src/hotspot/share/opto/loopnode.cpp#L1659-L1661
> 
> I wonder if we should just use `Node::uncast` there? But I'm quite unsure about that.

Sounds good to me. I will have a deep investigation for it. Thanks!



> > Yes, I also observed such regression.
> > It would be nice if you proactively mentioned regressions, so it does not have to be pointed out by reviewers.
> 
> For me, it could be ok to fix it in a follow-up patch. I think we are too close to RDP1 for JDK25 now anyway, and so we could push this patch here into JDK26, and then we have enough time in JDK26 to investigate the regression. Even better would be if we could do the other patch first, so we never even encounter a regression.

Sounds good to me. Thanks!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-2893026228


More information about the core-libs-dev mailing list