RFR: 8291809: Convert compiler/c2/cr7200264/TestSSE2IntVect.java to IR verification test [v2]
Daniel Lundén
dlunden at openjdk.org
Thu Jan 25 13:41:27 UTC 2024
On Thu, 25 Jan 2024 07:58:14 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:
>> @chhagedorn: Do you mean that `test_divc` and `test_divc_n` vectorize after JDK-8282365? They don't vectorize on my machine (on this PR).
>
> I just checked on my machine (on top of commit fb822e49f2a84423c8fd17db2e95bbdd5e7ec191) and these division tests do seem to vectorize. This is, e.g., the innermost loop in `test_divc` right before code emission:
>
> ![test_divc](https://github.com/openjdk/jdk/assets/8792647/129d51c2-a1ad-4d02-ab81-02cd849af36f)
>
> Here are my processor features in case it helps (subset of `lscpu` output):
>
>
> Architecture: x86_64
> CPU op-mode(s): 32-bit, 64-bit
> Address sizes: 39 bits physical, 48 bits virtual
> Byte Order: Little Endian
> CPU(s): 12
> On-line CPU(s) list: 0-11
> Vendor ID: GenuineIntel
> Model name: Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz
> CPU family: 6
> Model: 158
> Thread(s) per core: 2
> Core(s) per socket: 6
> (...)
> Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
> syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs
> bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni
> pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma
> cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt
> tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm
> 3dnowprefetch cpuid_fault epb invpcid_single ssbd ibrs ibpb stibp
> ibrs_enhanced tpr_shadow flexpriority ept vpid ept_ad fsgsbase
> tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap
> clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat
> pln pts hwp hwp_notify hwp_act_window hwp_epp vnmi md_clear
> flush_l1d arch_capabilities
> (...)
Thanks for the clarification, @robcasloz and @chhagedorn. I've now investigated, and the tests do vectorize on my machine as well. I was confused because, before the change below, the IR framework did not register the nodes: they have a vector size of 4, not the default of 8. Is that expected, and should we specify something else instead of the catch-all `IRNode.VECTOR_SIZE_ANY`?
@Test
- @IR(counts = { IRNode.ADD_VI, "> 0",
- IRNode.RSHIFT_VI, "> 0",
- IRNode.SUB_VI, "> 0" },
+ @IR(counts = { IRNode.ADD_VI, IRNode.VECTOR_SIZE_ANY, "> 0",
+ IRNode.RSHIFT_VI, IRNode.VECTOR_SIZE_ANY, "> 0",
+ IRNode.SUB_VI, IRNode.VECTOR_SIZE_ANY, "> 0" },
applyIfCPUFeatureOr = {"sse2", "true", "asimd", "true"})
void test_divc(int[] a0, int[] a1) {
for (int i = 0; i < a0.length; i+=1) {
@@ -519,9 +519,9 @@ void test_divc(int[] a0, int[] a1) {
}
@Test
- @IR(counts = { IRNode.ADD_VI, "> 0",
- IRNode.RSHIFT_VI, "> 0",
- IRNode.SUB_VI, "> 0" },
+ @IR(counts = { IRNode.ADD_VI, IRNode.VECTOR_SIZE_ANY, "> 0",
+ IRNode.RSHIFT_VI, IRNode.VECTOR_SIZE_ANY, "> 0",
+ IRNode.SUB_VI, IRNode.VECTOR_SIZE_ANY, "> 0" },
applyIfCPUFeatureOr = {"sse2", "true", "asimd", "true"})
void test_divc_n(int[] a0, int[] a1) {
for (int i = 0; i < a0.length; i+=1) {
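
To make the two sizes concrete, here is a minimal sketch of the lane arithmetic. It assumes that the default size is the number of `int` lanes in the widest vector the machine supports (32 bytes with `avx2`, which is among the flags quoted above) and that the division loops end up with 16-byte vectors. Both are assumptions I have not verified against the framework, and `VectorLanes` and `lanes` are hypothetical names that are not part of the IR framework.

```java
public class VectorLanes {
    // Number of elements (lanes) that fit in one vector register.
    // vectorBytes: vector width in bytes (assumed: 32 for AVX2, 16 for SSE2).
    // elementBytes: element size in bytes (4 for int).
    static int lanes(int vectorBytes, int elementBytes) {
        return vectorBytes / elementBytes;
    }

    public static void main(String[] args) {
        // Assumed default: widest vector on an AVX2 machine.
        System.out.println("32-byte vector, int lanes: " + lanes(32, Integer.BYTES));
        // Size observed for the division tests, if they use 16-byte vectors.
        System.out.println("16-byte vector, int lanes: " + lanes(16, Integer.BYTES));
    }
}
```

Under these assumptions, a rule that only counts nodes of the default size sees 8-lane nodes and misses the 4-lane ones, which would match what I observed before switching to `IRNode.VECTOR_SIZE_ANY`.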
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/17428#discussion_r1466392626