RFR: 8335444: Generalize implementation of AndNode mul_ring

Tue Aug 6 04:07:36 UTC 2024

On Mon, 5 Aug 2024 08:03:52 GMT, Damon Fenacci <dfenacci at openjdk.org> wrote:

>> Thank you for running testing @chhagedorn! I think I didn't run into this because my device doesn't support AVX-512. Does the failure have an ideal node printout as well? I think that could help in diagnosing the issue. Thanks!
>
> @jaskarth out of curiosity: could you by chance notice any measurable performance difference (e.g. for specific/ad-hoc benchmarks)?

@dafedafe I added a microbenchmark based on the case I saw above, and got these results:

                                                                                             Baseline                    Patch           Improvement
Benchmark                                                   (COUNT)  (seed)  Mode  Cnt    Score   Error  Units     Score   Error  Units
TypeVectorOperations.TypeVectorOperationsNonSuperWord.andZ      512       0  avgt    8  155.288 ± 1.175  ns/op   188.844 ± 4.189  ns/op  (+ 19.5%)
TypeVectorOperations.TypeVectorOperationsNonSuperWord.andZ     2048       0  avgt    8  629.098 ± 7.489  ns/op   732.558 ± 3.983  ns/op  (+ 15.2%)
TypeVectorOperations.TypeVectorOperationsSuperWord.andZ         512       0  avgt    8   22.917 ± 0.338  ns/op    23.578 ± 1.003  ns/op  (+ 2.8%)
TypeVectorOperations.TypeVectorOperationsSuperWord.andZ        2048       0  avgt    8   35.683 ± 0.232  ns/op    37.546 ± 1.063  ns/op  (+ 5.1%)

In general, though, I've found that it's unfortunately pretty difficult to identify specific places where performance is improved, since this analysis doesn't improve nodes locally but instead strengthens other idealizations that use int types. With a more precise type we might be able to find more operations that evaluate to constants or prune out redundant comparisons, either directly or through another node that transforms the type further. I've been wanting to make our type analysis stronger, so that we can find more nontrivial optimizations without needing specialized idealization rules.
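
To make the "better type lets a later comparison fold" point concrete, here is a rough Java sketch. It is not the code from the patch and not C2's actual type lattice: `IntRange`, `andRange`, and `alwaysLessThan` are hypothetical names made up for illustration, and the rule shown is only the simple fact that for non-negative x, `x & y` lies in [0, x].

```java
public class AndRangeSketch {
    /** Inclusive int range [lo, hi]; a hypothetical stand-in for a compiler's int type. */
    static final class IntRange {
        final int lo;
        final int hi;

        IntRange(int lo, int hi) {
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        public String toString() {
            return "[" + lo + ", " + hi + "]";
        }
    }

    /**
     * Conservative range for (x & y) given ranges for x and y.
     * If an operand is known non-negative, the result keeps a clear sign bit
     * and can only clear bits of that operand, so it lies in [0, operand.hi].
     * Otherwise nothing is assumed and the full int range is returned.
     */
    static IntRange andRange(IntRange a, IntRange b) {
        boolean aNonNeg = a.lo >= 0;
        boolean bNonNeg = b.lo >= 0;
        if (aNonNeg && bNonNeg) {
            return new IntRange(0, Math.min(a.hi, b.hi));
        }
        if (aNonNeg) {
            return new IntRange(0, a.hi);
        }
        if (bNonNeg) {
            return new IntRange(0, b.hi);
        }
        return new IntRange(Integer.MIN_VALUE, Integer.MAX_VALUE);
    }

    /** True when every value in the range is below bound, so "v < bound" could fold to true. */
    static boolean alwaysLessThan(IntRange r, int bound) {
        return r.hi < bound;
    }

    public static void main(String[] args) {
        // An index with no known bounds, masked with the constant 0xFF.
        IntRange index = new IntRange(Integer.MIN_VALUE, Integer.MAX_VALUE);
        IntRange mask = new IntRange(0xFF, 0xFF);

        IntRange masked = andRange(index, mask);
        System.out.println("type of (index & 0xFF): " + masked);
        System.out.println("(index & 0xFF) < 256 is redundant: " + alwaysLessThan(masked, 256));
    }
}
```

Nothing about the and itself gets faster here; the gain only shows up when a later check (such as the `< 256` comparison) can be removed because of the narrower type, which is why it is hard to pin down in a microbenchmark.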

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20066#issuecomment-2270339413