[code-reflection] RFR: Float4 arrayView support
Juan Fumero
jfumero at openjdk.org
Fri Oct 31 22:04:52 UTC 2025
On Tue, 28 Oct 2025 17:16:23 GMT, Ruby Chen <duke at openjdk.org> wrote:
>> We will need to propagate this change also for the CUDA codegen.
>
> Good point; currently, a line like `vC[index * 4] = Float4.add(vA[index * 4], vB[index * 4])` won't be handled properly (the add operation won't be recursed over, and the resulting code will look like `vstore4(vA, 0, &c->array[index*4])` instead of storing the add operation's result).
>
> (Edit: I originally thought that if we extracted the array accesses `vA[index * 4]` and `vB[index * 4]` into Float4 variables first before operating on them, the original code would work, but it doesn't seem to be the case)
>
> There might also be something I'm missing that causes a case like `vC[index * 4] = Float4.add(vA[index * 4], vB[index * 4])` to fail with the original code; I'll give it another pass.
OK, got it. Thanks for the clarification. It looks to me like some `HATVectorVarLoadOp` ops may be missing in this case.
For example:
`Float4.add(vA[index * 4], vB[index * 4])` would have two `HATVectorVarLoadOp` ops, each loading a `float4`. But I see now that, with the array views, we will actually see two plain array loads instead. If that's the case, it makes sense to invoke `recurse`.
I am not suggesting that we use `HATVectorVarLoadOp`, but it is good that we have these cases identified.
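To make the point about recursing concrete, here is a minimal toy sketch. It is not the HAT codegen: `RecurseSketch`, `Expr`, `VectorLoad`, `VectorAdd` and `emitStore` are hypothetical names made up for illustration, and the emitted strings only imitate the OpenCL-style `vload4`/`vstore4` shape mentioned above. The idea is that the stored value is produced by walking the operand tree, so the add and both array loads end up inside the store.

```java
public class RecurseSketch {
    // Hypothetical toy nodes for illustration; these are NOT the Babylon/HAT op classes.
    interface Expr {
        String emit();
    }

    // A plain array access that is loaded as a float4, e.g. vA[index * 4].
    record VectorLoad(String array, String index) implements Expr {
        public String emit() {
            return "vload4(0, &" + array + "->array[" + index + "])";
        }
    }

    // A vector add whose operands are emitted by recursing into them.
    record VectorAdd(Expr lhs, Expr rhs) implements Expr {
        public String emit() {
            return "(" + lhs.emit() + " + " + rhs.emit() + ")";
        }
    }

    // Emits the store; the value to store comes from recursing over the right-hand side.
    static String emitStore(String array, String index, Expr value) {
        return "vstore4(" + value.emit() + ", 0, &" + array + "->array[" + index + "])";
    }
}
```

Calling `RecurseSketch.emitStore("c", "index*4", new RecurseSketch.VectorAdd(new RecurseSketch.VectorLoad("a", "index*4"), new RecurseSketch.VectorLoad("b", "index*4")))` in this toy model yields a store whose first argument is the add of the two loads, rather than a bare array name.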
-------------
PR Review Comment: https://git.openjdk.org/babylon/pull/646#discussion_r2472182044