Vector API: blend() performance on AArch64
August Nagro
augustnagro at gmail.com
Tue Mar 9 09:20:04 UTC 2021
In my limited experience, `-XX:+PrintIntrinsics` can be misleading: the
output is large, and some things that are reported as failing to inline
are in fact inlined eventually.
On Tue, Mar 9, 2021 at 12:11 AM Gunnar Morling <gunnar at hibernate.org> wrote:
> Paul, Ningsheng,
>
> > you can use JMH with profiling, on the Mac using the command line
> > "-prof dtraceasm"
>
> Ah, yes, good point. I had shied away from this so far, as it requires
> system integrity protection to be disabled, but I'll look into it.
>
> > However, on AArch64 NEON, the max hardware vector size is 128 bits
>
> I see, that makes sense.
>
> Thanks a lot for taking your time to look into this and your replies! I've
> updated this part of the post as per that info.
>
> --Gunnar
>
>
> On Tue, Mar 9, 2021 at 06:59, Ningsheng Jian <ningsheng.jian at arm.com> wrote:
>
> > Hi Gunnar,
> >
> > Thanks for trying the Vector API on AArch64. I see you were using the
> > IntVector.SPECIES_256 species in your benchmarks. However, on AArch64
> > NEON, the max hardware vector size is 128 bits. So for 256-bit species,
> > we are not able to intrinsify the operations to use SIMD directly, and
> > they fall back to the Java implementation of those APIs, blend() for
> > example. You can use the -XX:+PrintIntrinsics option to see some details.
> >
> > For the benchmarks, I would suggest writing them in a more
> > (performance-)portable way, e.g. use IntVector.SPECIES_PREFERRED and do
> > not assume the actual vector length in the code logic.
> >
> > Thanks,
> > Ningsheng
> >
> > On 3/9/21 4:25 AM, Gunnar Morling wrote:
> > > Hi,
> > >
> > > I was exploring the Vector API a bit [1] and noticed that the
> > > performance of my vectorized FizzBuzz implementation is pretty poor
> > > on AArch64. I first thought this may be specific to the Apple M1 chip
> > > on which I was running this; but numbers don't look better with Linux
> > > (AWS Graviton2, see the repo [2] for all numbers) either. My
> > > implementation is using the blend() API method, is this not (yet)
> > > supported on AArch64 perhaps?
> > >
> > > Thanks for any hints,
> > >
> > > --Gunnar
> > >
> > > [1] https://www.morling.dev/blog/fizzbuzz-simd-style/
> > > [2] https://github.com/gunnarmorling/simd-fizzbuzz
> > >
> >
> >
>
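For illustration, here is a minimal sketch of the "do not assume the actual
vector length" advice quoted above. It is a plain-Java scalar model, not code
from the thread or from the simd-fizzbuzz repo: the class name
LaneAgnosticBlend and the method blendAll are hypothetical, and the lane
count is passed in as a parameter so the same loop can be exercised at
several widths without the incubator module. The comments name the Vector
API calls each step is meant to stand in for (IntVector.SPECIES_PREFERRED,
VectorSpecies.length(), VectorSpecies.loopBound(), IntVector.fromArray(),
blend(), intoArray()).

```java
// Scalar model of a lane-count-agnostic blend loop (hypothetical names).
final class LaneAgnosticBlend {

    // Returns out[i] = mask[i] ? b[i] : a[i], processed in chunks of `lanes`.
    // With the real API, `lanes` would come from
    // IntVector.SPECIES_PREFERRED.length() rather than from a hard-coded
    // width such as the 8 int lanes of SPECIES_256.
    static int[] blendAll(int[] a, int[] b, boolean[] mask, int lanes) {
        int[] out = new int[a.length];

        // Largest multiple of `lanes` that fits; models species.loopBound(a.length).
        int bound = a.length - (a.length % lanes);

        int i = 0;
        for (; i < bound; i += lanes) {
            // Models: IntVector.fromArray(species, a, i)
            //             .blend(IntVector.fromArray(species, b, i), m)
            //             .intoArray(out, i);
            for (int lane = 0; lane < lanes; lane++) {
                int idx = i + lane;
                out[idx] = mask[idx] ? b[idx] : a[idx];
            }
        }

        // Scalar tail for the elements that do not fill a whole chunk.
        for (; i < a.length; i++) {
            out[i] = mask[i] ? b[i] : a[i];
        }
        return out;
    }
}
```

Because nothing in the loop depends on a particular width, calling blendAll
with 4 lanes (128-bit NEON with ints) or 8 lanes (256-bit) produces the same
array; only the split between the chunked loop and the tail changes.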