[14] RFR (XS): 8234392: C2: Extend Matcher::match_rule_supported_vector() with element type information
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Tue Dec 10 20:03:03 UTC 2019
>> Similar checks for byte operations are redundant, because
>> Matcher::vector_width_in_bytes() explicitly checks for AVX512BW:
>>
>> if (UseAVX > 2 && (bt == T_BYTE || bt == T_SHORT || bt == T_CHAR))
>> size = (VM_Version::supports_avx512bw()) ? 64 : 32;
>
> It’s true that 512bw and 512dq are different branches of 512, so it
> is permissible to detect them in different places. But… it would
> seem easier to follow to check both in the one place:
>
> if (UseAVX > 2 && (bt == T_BYTE || bt == T_SHORT || bt == T_CHAR))
> size = (VM_Version::supports_avx512bw()) ? 64 : 32;
> + if (UseAVX > 2 && (bt == T_FLOAT || bt == T_DOUBLE))
> + size = (VM_Version::supports_avx512dq()) ? 64 : 32;
>
> This is probably an oversimplification, so file it as BS (brain storming).
> If such a consolidation of sizing logic is possible, it can be done as
> a separate cleanup.
Unfortunately, single- and double-precision FP operations are scattered
between AVX512F and AVX512DQ, so I doubt it makes sense to limit FP
operations to 256-bit vectors when AVX512DQ is not available.
It would severely penalize Xeon Phis (which lack BW, DQ, and VL
extensions), but maybe there'll be a moment when Skylake (F+CD+BW+DQ+VL)
can be chosen as the baseline.
Anyway, I got your idea and it makes perfect sense to me to collect such
ideas.
> (Side observation: Since Abs and Neg are treated together in the vabsnegf
> macro-instruction, it seems odd that there would be special processing
> of NegVF and not AbsVF also. More cleanups to do?)
Yes, it's a first round and there are other opportunities left. More
will come after the patches are upstreamed.
>> I can add a comment that AVX512DQ is required, but it doesn't look
>> like an improvement since the checks say exactly the same.
>
> Maybe add a comment cross-referencing the two hunks of logic,
> since they are not parallel in the sources? (For whatever reasons;
> it doesn’t matter…) It’s important to be able to find all of the
> vector-sizing logic so it can be analyzed together, even if it must
> be distributed into different locations.
>
>> tricky reason they must be called out explicitly, it would be
>>> good to explain. The CMoveV rules probably stem from limited
>>> support for CMove, but again a comment would be good.
>>
>> Yes, there are only vcmov8F_reg and vcmov4D_reg present, so the checks
>> are there to avoid matching failures. I'll put a comment that it's an
>> implementation limitation.
>>
>
> Yes, comments. The matcher logic is tricky, and the structure
> of AVX512 is tricky, and comments would help navigate it all.
Yes, the complexity (very) quickly adds up... (Or "multiplies up"?..)
Best regards,
Vladimir Ivanov
More information about the hotspot-compiler-dev
mailing list