RFR: 8279651: [vectorapi] Implement the missing intrinsics for casting from integrals on x64 [v8]
Vladimir Ivanov
vlivanov at openjdk.java.net
Mon Jan 17 19:32:26 UTC 2022
On Fri, 14 Jan 2022 04:08:03 GMT, Quan Anh Mai <duke at openjdk.java.net> wrote:
>> Hi,
>>
>> This patch implements the remaining casts from integral types on x64.
>>
>> Thank you very much.
>
> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
>
> fix
Overall,
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1496:
> 1494: }
> 1495:
> 1496: void C2_MacroAssembler::load_vector(XMMRegister dst, Address src, int vlen_in_bytes) {
`loadV` and `storeV` in `x86.ad` can be migrated to these new methods as well.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4150:
> 4148: jmp(done);
> 4149:
> 4150: bind(slow_path);
How common the slow path is expected to be?
Depending on the frequency, it may be sensible to extract the code into a stub shared across all the vector casts in generated code.
src/hotspot/cpu/x86/x86.ad line 1787:
> 1785: case Op_VectorCastB2X:
> 1786: case Op_VectorCastS2X:
> 1787: case Op_VectorCastI2X:
I don't see any new code added for `VectorCastI2X`. Why is it so?
src/hotspot/cpu/x86/x86.ad line 1793:
> 1791: break;
> 1792: case Op_VectorCastL2X:
> 1793: if (is_integral_type(bt) && size_in_bits == 256 && UseAVX < 2) {
No new code for `VectorCastL2X` to integral types? Why?
src/hotspot/cpu/x86/x86.ad line 6881:
> 6879: instruct vcastBtoX(vec dst, vec src) %{
> 6880: predicate(UseAVX > 1 ||
> 6881: Matcher::vector_element_basic_type(n) != T_FLOAT ||
Before the change, the implicit predicate was `bt != T_DOUBLE && size_in_bits == 256 && UseAVX < 2`.
Why do you check `T_FLOAT` then?
Why `Matcher::vector_length_in_bytes(n) < 32` and not `Matcher::vector_length_in_bytes(n) != 32`?
src/hotspot/cpu/x86/x86.ad line 6951:
> 6949:
> 6950: instruct vcastStoX_evex(vec dst, vec src) %{
> 6951: predicate((Matcher::vector_element_basic_type(n) == T_BYTE && UseAVX > 2 && VM_Version::supports_avx512vlbw()) ||
The following code is redundant:
switch (to_elem_bt) {
case T_BYTE:
if (!VM_Version::supports_avx512vl()) {
vlen_enc = Assembler::AVX_512bit;
}
src/hotspot/cpu/x86/x86.ad line 6952:
> 6950: instruct vcastStoX_evex(vec dst, vec src) %{
> 6951: predicate((Matcher::vector_element_basic_type(n) == T_BYTE && UseAVX > 2 && VM_Version::supports_avx512vlbw()) ||
> 6952: (Matcher::vector_element_basic_type(n) == T_FLOAT && (UseAVX > 1 || Matcher::vector_length_in_bytes(n) < 32)) ||
Same question here (as for `VectorCastB2X`): why doesn't it match the condition removed from `Matcher::match_rule_supported_vector()` (e.g., `T_FLOAT` vs `T_DOUBLE`)?
src/hotspot/cpu/x86/x86.ad line 6992:
> 6990: %}
> 6991:
> 6992: instruct vcastBStoF(vec dst, vec src, vec xtmp) %{
Maybe `vcastBStoF_avx`?
src/hotspot/cpu/x86/x86.ad line 7150:
> 7148:
> 7149: instruct vcastLtoFD_avx(vec dst, vec src, vec xtmp1, vec xtmp2, rRegI tmp, rFlagsReg cr) %{
> 7150: predicate((UseAVX <= 2 || !VM_Version::supports_avx512dq()) &&
`!VM_Version::supports_avx512dq()` is equivalent to `UseAVX <= 2 || !VM_Version::supports_avx512dq()` here, isn't it?
src/hotspot/cpu/x86/x86.ad line 7167:
> 7165:
> 7166: instruct vcastLtoFD_evex(vec dst, vec src, vec xtmp1, vec xtmp2, rRegI tmp, kReg ktmp, rFlagsReg cr) %{
> 7167: predicate((UseAVX <= 2 || !VM_Version::supports_avx512dq()) &&
Same here. Moreover, `UseAVX <= 2` doesn't make sense for `_evex` case.
-------------
Changes requested by vlivanov (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/7002
More information about the hotspot-compiler-dev
mailing list