RFR: 8261553: Efficient mask generation using BMI2 BZHI instruction [v2]
Jatin Bhateja
jbhateja at openjdk.java.net
Thu Feb 11 13:54:38 UTC 2021
On Thu, 11 Feb 2021 12:56:24 GMT, Claes Redestad <redestad at openjdk.org> wrote:
> > Hi Claes,
> > Here is the JMH performance data over CLX for arraycopy benchmarks:
> > http://cr.openjdk.java.net/~jbhateja/8261553/JMH_PERF_CLX_BASELINE.txt
> > http://cr.openjdk.java.net/~jbhateja/8261553/JMH_PERF_CLX_WITH_OPTS.txt
> > Regards,
> > Jatin
>
> Thanks! Eyeballing the results it looks like a mixed bag. There even seems to be a few regressions such as this:
>
> ```
> o.o.b.java.lang.ArrayCopyUnalignedSrc.testLong 1200 N/A avgt 2 61.663 ns/op
> -->
> o.o.b.java.lang.ArrayCopyUnalignedSrc.testLong 1200 N/A avgt 2 74.160 ns/op
> ```
Hi Claes, this could be run-to-run variation. In general, the new sequence has fewer instructions than the previous mask generation sequence (one shift operation is saved per mask computation), and thus it will always offer better execution latency.
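
For illustration, here is a minimal Java model of the two ways of computing a mask with the low n bits set. This is only a sketch: the actual change lives in the HotSpot x86 code generation, the class and method names below are hypothetical, and the shift-based form is an assumption about the previous sequence inferred from the "one shift operation saved" remark. The `bzhi` method models the documented BMI2 semantics (copy the source and zero all bits from the index upward) in plain Java.

```java
final class MaskSketch {
    // Assumed previous approach: shift a 1 left by n, then decrement.
    // Java masks shift counts to 6 bits for long, so n == 64 is handled separately.
    static long maskByShift(int n) {
        return n >= 64 ? -1L : (1L << n) - 1;
    }

    // Software model of BZHI: the index is taken from the low 8 bits;
    // if it is at least the operand size (64), the source is returned unchanged,
    // otherwise bits [63:index] are cleared.
    static long bzhi(long src, int idx) {
        int n = idx & 0xFF;
        return n >= 64 ? src : src & ~(-1L << n);
    }

    // BZHI-based approach: start from all ones and zero the high bits.
    static long maskByBzhi(int n) {
        return bzhi(-1L, n);
    }
}
```

Both methods return the same mask for every n from 0 to 64; the BZHI form just gets there without the separate shift and decrement steps.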
-------------
PR: https://git.openjdk.java.net/jdk/pull/2522
More information about the hotspot-compiler-dev
mailing list