RFR: 8312570: [TESTBUG] Jtreg compiler/loopopts/superword/TestDependencyOffsets.java fails on 512-bit SVE
Pengfei Li
pli at openjdk.org
Tue Jul 25 07:53:42 UTC 2023
On Tue, 25 Jul 2023 07:42:59 GMT, Pengfei Li <pli at openjdk.org> wrote:
> The HotSpot jtreg test `compiler/loopopts/superword/TestDependencyOffsets.java` fails on AArch64 CPUs with 512-bit SVE. The reason is that many test loops in the code cannot be vectorized due to data dependence, but the IR tests assume they can.
>
> On AArch64, these IR tests only check the `asimd` CPU feature and incorrectly assume that AArch64 vectors are at most 256 bits. In fact, `asimd` on AArch64 only represents NEON vectors, which are at most 128 bits. AArch64 CPUs may also have the `sve` feature, which represents scalable vectors of up to 2048 bits. Vectorization won't succeed on 512-bit SVE CPUs if the memory offset between some read and write is less than 512 bits.
>
> As this jtreg test is auto-generated by a Python script, we have updated the script and regenerated the test. The new version checks auto-vectorization on both NEON-only and NEON+SVE platforms. Below is the diff of the generator script. We have also attached the new script to the JBS page.
>
>
> @@ -321,7 +321,8 @@ class Type:
> p.append(Platform("avx512", ["avx512", "true"], 64))
> else:
> assert False, "type not implemented" + self.name
> - p.append(Platform("asimd", ["asimd", "true"], 32))
> + p.append(Platform("asimd", ["asimd", "true", "sve", "false"], 16))
> + p.append(Platform("sve", ["sve", "true"], 256))
> return p
>
> class Test:
> @@ -457,7 +458,7 @@ class Generator:
> lines.append(" * and various MaxVectorSize values, and +- AlignVector.")
> lines.append(" *")
> lines.append(" * Note: this test is auto-generated. Please modify / generate with script:")
> - lines.append(" * https://bugs.openjdk.org/browse/JDK-8308606")
> + lines.append(" * https://bugs.openjdk.org/browse/JDK-8312570")
> lines.append(" *")
> lines.append(" * Types: " + ", ".join([t.name for t in self.types]))
> lines.append(" * Offsets: " + ", ".join([str(o) for o in self.offsets]))
> @@ -598,7 +599,8 @@ class Generator:
> # IR rules
> for p in test.t.platforms():
> elements = p.vector_width // test.t.size
> - lines.append(f" // CPU: {p.name} -> vector_width: {p.vector_width} -> elements in vector: {elements}")
> + max_pre = "max " if p.name == "sve" else ""
> + lines.append(f" // CPU: {p.name} -> {max_pre}vector_width: {p.vector_width} -> {max_pre}elements in vector: {elements}")
> ############### -Align...
@eme64 Please help take a look at this. Also, how about adding the test generator script you wrote to the jdk repo?
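
To illustrate the offset argument in the description above, here is a minimal Python sketch. It is not taken from the generator script: the helper names `elements_per_vector` and `full_width_vectorization_safe` are hypothetical, and the dependence model is deliberately simplified to a loop of the form `a[i + offset] = f(a[i])`, ignoring `AlignVector`, `MaxVectorSize` and everything else the real IR rules account for. Vector widths are in bytes, as in the `Platform(...)` entries of the diff.

```python
# Simplified model (an assumption, not the generator's actual logic):
# for a loop "a[i + offset] = f(a[i])", a positive offset means a later
# iteration reads what an earlier iteration stored, so a full-width vector
# is only safe if it holds no more elements than that offset.

def elements_per_vector(vector_width_bytes: int, element_size_bytes: int) -> int:
    """Number of elements that fit in one vector (mirrors `p.vector_width // test.t.size`)."""
    return vector_width_bytes // element_size_bytes


def full_width_vectorization_safe(offset: int, vector_width_bytes: int,
                                  element_size_bytes: int) -> bool:
    """Return True if the simplified model allows a full-width vector."""
    elements = elements_per_vector(vector_width_bytes, element_size_bytes)
    # Zero or negative offsets carry no store-to-load dependence in this model.
    return offset <= 0 or offset >= elements


if __name__ == "__main__":
    int_size = 4   # bytes per Java int
    offset = 8     # elements, i.e. 256 bits for int
    # 16 = NEON, 32 = the old (incorrect) assumption, 64 = 512-bit SVE hardware
    for width in (16, 32, 64):
        safe = full_width_vectorization_safe(offset, width, int_size)
        print(f"vector_width={width} bytes -> full-width vectorization safe: {safe}")
```

With an `int` offset of 8 elements, this model accepts the 16-byte and 32-byte widths but rejects the 64-byte width of a 512-bit SVE machine, which is the case the old IR rules missed.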
-------------
PR Comment: https://git.openjdk.org/jdk/pull/15010#issuecomment-1649310477