RFR: 8374349: [VectorAPI]: AArch64: Prefer merging mode SVE CPY instruction
Eric Fang
erfang at openjdk.org
Thu Jan 29 01:24:48 UTC 2026
On Wed, 28 Jan 2026 10:17:30 GMT, Andrew Haley <aph at openjdk.org> wrote:
> > Therefore, when you test this change using the C case, you will see a significant performance improvement.
> > > I see 2% uplift on these numbers.
> >
> >
> > @theRealAph And I think this also explains your question on these numbers.
>
> Not at all.
>
> The performance claim above was:
>
> > Microbenchmarks show this change brings performance uplift ranging from 11% to 33%, depending on the specific operation and data types.
>
> But the real performance uplift, as measured in Java microbenchmarks, is 2%.
Sorry, this was my mistake; I should have been more precise. What I should have said is that when this optimization takes effect, the performance improvement is 11%-33%, depending on the specific operation and data types. Thanks for pointing this out!
> Definitions in Assembler should generate the instructions in the Architecture Reference Manual. When doing this, please override sve_cpy in MacroAssembler instead of here.
Agreed, that's exactly what I was thinking too.
@theRealAph Thank you for your suggestion. I will address the issue you pointed out in the next commit.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/29359#issuecomment-3814815084