RFR: 8323582: C2 SuperWord AlignVector: misaligned vector memory access with unaligned native memory

Tue Feb 25 07:15:56 UTC 2025

On Tue, 25 Feb 2025 00:34:14 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> @vnkozlov I mean the issue is this: once I implement aliasing-analysis runtime-checks with this multiversion approach, we would get regressions if we do not optimize the slow path loop. Currently, we do not vectorize such loops (because we have to be ready for aliasing cases), but we at least unroll them and apply whatever other optimizations we can, except vectorization. But if we do not optimize the slow path loop, then we would get performance regressions in aliasing cases because we have no unrolling for them any more. I think we need to avoid that - would you agree?
>
>> But if we do not optimize the slow path loop, then we would get performance regressions in aliasing cases because we have no unrolling for them any more. 
> 
> Okay, we are back to our previous conversation - we will wait for your aliasing-analysis runtime-checks implementation and do performance runs to see if the "slow" path affects performance.
> 
> Okay.
> 
> PS: "slow" path implies that it is not taken frequently and it should not affect the general performance of the application.

@vnkozlov @rwestrel Let me summarize the tasks left to do here:
- Rename `stalled` -> `delayed`, and `unstall` -> `resume_optimizations` or similar. Improve some comments.
- File a follow-up RFE for more verification (we must be able to find the multiversion-if from a multiversioned loop) - currently blocked by a predicate traversal issue. Maybe we can also assert that we can always find the pre-loop from the main-loop, at least during loop-opts.
- When working on the aliasing-analysis runtime-check, we have to do more performance analysis, and show the need for both the fast and slow path loops.
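
For readers less familiar with the fast/slow path discussion, here is a rough Java-level sketch of what multiversioning on an aliasing runtime check amounts to conceptually. It is only an illustration under assumptions: C2 performs this on its IR rather than on source code, and the names `MultiversionSketch`, `mayAlias` and `copyAdd` are hypothetical, not taken from the PR.

```java
// Conceptual sketch only: hypothetical names, not C2 code.
final class MultiversionSketch {
    // Hypothetical runtime check: true if the two accessed ranges
    // [aOff, aOff + n) and [bOff, bOff + n) can overlap in the same array.
    static boolean mayAlias(int[] a, int aOff, int[] b, int bOff, int n) {
        return a == b && aOff < bOff + n && bOff < aOff + n;
    }

    static void copyAdd(int[] a, int aOff, int[] b, int bOff, int n) {
        if (!mayAlias(a, aOff, b, bOff, n)) {
            // Fast path loop: no aliasing, so the compiler would be
            // free to vectorize this copy of the loop.
            for (int i = 0; i < n; i++) {
                a[aOff + i] = b[bOff + i] + 1;
            }
        } else {
            // Slow path loop: same body, but the sequential semantics
            // must be preserved. It can still be unrolled, which is
            // the optimization we would lose if this loop were never
            // optimized.
            for (int i = 0; i < n; i++) {
                a[aOff + i] = b[bOff + i] + 1;
            }
        }
    }
}
```

Both loops compute the same result; the only difference is which optimizations are legal for each copy. The open question from the list above is whether the slow path copy matters enough in practice to justify optimizing it.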

Let me know if there is more ;)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22016#issuecomment-2680894298