RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v8]
Vladimir Kozlov
kvn at openjdk.java.net
Mon May 23 23:11:36 UTC 2022
On Fri, 20 May 2022 09:51:24 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> Hi All,
>>
>> Patch adds the planned support for new vector operations and APIs targeted for [JEP 426: Vector API (Fourth Incubator).](https://bugs.openjdk.java.net/browse/JDK-8280173)
>>
>> Following is the brief summary of changes:-
>>
>> 1) Extends the scope of existing lanewise API for following new vector operations.
>> - VectorOperations.BIT_COUNT: counts the number of one-bits
>> - VectorOperations.LEADING_ZEROS_COUNT: counts the number of leading zero bits
>> - VectorOperations.TRAILING_ZEROS_COUNT: counts the number of trailing zero bits
>> - VectorOperations.REVERSE: reversing the order of bits
>> - VectorOperations.REVERSE_BYTES: reversing the order of bytes
>> - compress and expand bits: Semantics are based on Hacker's Delight section 7-4 Compress, or Generalized Extract.
>>
>> 2) Adds following new APIs to perform cross lane vector compress and expansion operations under the influence of a mask.
>> - Vector.compress
>> - Vector.expand
>> - VectorMask.compress
>>
>> 3) Adds predicated and non-predicated versions of following new APIs to load and store the contents of vector from foreign MemorySegments.
>> - Vector.fromMemorySegment
>> - Vector.intoMemorySegment
>>
>> 4) C2 Compiler IR enhancements and optimized X86 and AARCH64 backend support for each newly added operation.
>>
>>
>> Patch has been regressed over AARCH64 and X86 targets different AVX levels.
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
> 8284960: Integrating incremental patches.
src/hotspot/cpu/x86/assembler_x86.cpp line 7934:
> 7932:
> 7933: void Assembler::evplzcntd(XMMRegister dst, KRegister mask, XMMRegister src, bool merge, int vector_len) {
> 7934: assert(VM_Version::supports_avx512cd() && (vector_len == AVX_512bit || VM_Version::supports_avx512vl()), "");
Please, split assert as in other instructions - it will help to understand failure better.
src/hotspot/cpu/x86/assembler_x86.cpp line 7946:
> 7944:
> 7945: void Assembler::evplzcntq(XMMRegister dst, KRegister mask, XMMRegister src, bool merge, int vector_len) {
> 7946: assert(VM_Version::supports_avx512cd() && (vector_len == AVX_512bit || VM_Version::supports_avx512vl()), "");
Split assert.
src/hotspot/cpu/x86/assembler_x86.cpp line 8173:
> 8171:
> 8172: void Assembler::vinsertf32x4(XMMRegister dst, XMMRegister nds, XMMRegister src, uint8_t imm8) {
> 8173: assert(VM_Version::supports_evex(), "");
Hmm, did we never trigger this wrong assert because the use was guarded by correct check?
src/hotspot/cpu/x86/assembler_x86.cpp line 11720:
> 11718:
> 11719: void Assembler::evpcompressb(XMMRegister dst, KRegister mask, XMMRegister src, bool merge, int vector_len) {
> 11720: assert(VM_Version::supports_avx512_vbmi2() && (vector_len == AVX_512bit || VM_Version::supports_avx512vl()), "");
Split assert in this and following new instructions.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4455:
> 4453: break;
> 4454: default:
> 4455: fatal("Unsupported type");
Print wrong type: `fatal("Unsupported type : %s", type2name(type));`
Below too.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4561:
> 4559: case 4 : evpbroadcastd(dst, rtmp, vec_enc); break;
> 4560: case 8 : evpbroadcastq(dst, rtmp, vec_enc); break;
> 4561: default : ShouldNotReachHere(); break;
`ShouldNotReachHere` does not give any information in case of failure. Use `fatal()` which prints wrong `lane_size`.
Same below.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4666:
> 4664: break;
> 4665: default:
> 4666: ShouldNotReachHere();
Use `fatal()`.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4693:
> 4691: break;
> 4692: default:
> 4693: ShouldNotReachHere();
Use `fatal()`.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4732:
> 4730: vector_reverse_byte(bt, dst, xtmp2, rtmp, vec_enc);
> 4731:
> 4732: } else if(!VM_Version::supports_avx512vlbw() && vec_enc == Assembler::AVX_512bit) {
No need to check `!VM_Version::supports_avx512vlbw()`.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4759:
> 4757: vpandn(xtmp2, xtmp2, xtmp1, vec_enc);
> 4758: vpsrlq(xtmp2, xtmp2, 1, vec_enc);
> 4759: vporq(xtmp1, dst, xtmp2, vec_enc);
All 3 code snippets are the same except constants. Also similar code in `vector_reverse_byte64` for `short` type.
Consider factoring out it into separate method.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4819:
> 4817: break;
> 4818: default:
> 4819: fatal("Unsupported type");
Print wring type.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4846:
> 4844: break;
> 4845: default:
> 4846: fatal("Unsupported type");
Print wring type.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4892:
> 4890: break;
> 4891: default:
> 4892: ShouldNotReachHere();
Use `fatal` and print type.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 5018:
> 5016: break;
> 5017: default:
> 5018: ShouldNotReachHere();
Use fatal and print type.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 5037:
> 5035: break;
> 5036: default:
> 5037: ShouldNotReachHere();
Use fatal and print type.
src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 5056:
> 5054: break;
> 5055: default:
> 5056: ShouldNotReachHere();
Use fatal and print type.
src/hotspot/cpu/x86/x86.ad line 8879:
> 8877: // special handling should be removed.
> 8878: if (bt == T_LONG && rbt == T_INT) {
> 8879: if (VM_Version::supports_avx512vl()) {
Predicate say `!VM_Version::supports_avx512vl()`
src/hotspot/share/opto/node.hpp line 1006:
> 1004:
> 1005: // The node is a CountedLoopEnd with a mask annotation so as to emit a restore context
> 1006: bool has_vector_mask_set() const { return (_flags & Flag_has_vector_mask_set) != 0; }
I don't see use of this flag.
src/hotspot/share/opto/vectorIntrinsics.cpp line 86:
> 84: if ((mask_use_type & VecMaskUseLoad) != 0) {
> 85: if (!Matcher::match_rule_supported_vector(Op_VectorLoadMask, num_elem, elem_bt) ||
> 86: !Matcher::match_rule_supported_vector(Op_LoadVector, num_elem, T_BOOLEAN)) {
Add comment explaining new check. In follow ing places too.
src/hotspot/share/runtime/vmStructs.cpp line 1779:
> 1777: declare_c2_type(CMoveVDNode, VectorNode) \
> 1778: declare_c2_type(CompressVNode, VectorNode) \
> 1779: declare_c2_type(ExpandVNode, VectorNode) \
Not all new nodes listed.
-------------
PR: https://git.openjdk.java.net/jdk/pull/8425
More information about the hotspot-compiler-dev
mailing list