[vectorIntrinsics] RFR: 8285281: [x86] Add C2 mid-end and back-end implementation for COMPRESS_BITS and EXPAND_BITS operations [v2]
Xiaohong Gong
xgong at openjdk.java.net
Sun Apr 24 04:29:48 UTC 2022
On Thu, 21 Apr 2022 13:27:41 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> Summary of changes:
>> - The patch intrinsifies the following newly added Java SE APIs:
>> 1. Integer.compress
>> 2. Integer.expand
>> 3. Long.compress
>> 4. Long.expand
>> - Adds C2 IR nodes and corresponding ideal transformations for new operations.
>> - Inline expansion of the new vector operations COMPRESS_BITS and EXPAND_BITS is performed using their scalar counterparts and lane insertion/extraction operations.
>> - Performance of the JIT sequence generated using the above approach is within +/-10% of directly vectorizing the scalar algorithm using existing vector APIs, depending on the width of the operation. Since X86 offers direct instructions PEXT/PDEP for parallel bit extraction and deposition, the scalar loop always performs better than the corresponding vector operations.
>> - Adds an IR framework based test to validate newly introduced IR transformations.
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
> 8285281: Removing CompressExpand.java since fallback implementation directly calls new [Integer/Long].[compress/expand] Java SE APIs
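For readers unfamiliar with the two operations, here is a minimal loop-based sketch of what compress and expand compute for a 32-bit value. It is only a reference model of the semantics (the same bit movement PEXT and PDEP perform), not the code from the patch; the function names `compress_bits` and `expand_bits` are made up for illustration.

```cpp
#include <cstdint>

// Reference model of compress: the bits of `x` selected by `mask` are packed
// contiguously into the low-order bits of the result (what PEXT computes).
// Hypothetical helper name; this is not the HotSpot or JDK implementation.
inline uint32_t compress_bits(uint32_t x, uint32_t mask) {
  uint32_t result = 0;
  uint32_t out_bit = 1;                       // next free low-order position
  for (; mask != 0; mask &= mask - 1) {       // visit set mask bits, low to high
    uint32_t lowest = mask & (~mask + 1);     // isolate the lowest set mask bit
    if (x & lowest) result |= out_bit;
    out_bit <<= 1;
  }
  return result;
}

// Reference model of expand: the low-order bits of `x` are scattered to the
// positions of the set bits in `mask` (what PDEP computes).
inline uint32_t expand_bits(uint32_t x, uint32_t mask) {
  uint32_t result = 0;
  uint32_t in_bit = 1;                        // next low-order bit of x to place
  for (; mask != 0; mask &= mask - 1) {
    uint32_t lowest = mask & (~mask + 1);
    if (x & in_bit) result |= lowest;
    in_bit <<= 1;
  }
  return result;
}
```

With this model, `compress_bits(0xCAFEBABE, 0xFF00FFF0)` yields `0x000CABAB`, and `expand_bits(0x000CABAB, 0xFF00FFF0)` yields `0xCA00BAB0`. A mask of all ones returns `x` unchanged, which is the identity that the comment `compress(x, -1) == x` quoted further down refers to.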
src/hotspot/share/opto/intrinsicnode.cpp line 171:
> 169: // compress(x, -1) == x
> 170: if(phase->type(n->in(2))->higher_equal( TypeLong::MINUS_1)) return n->in(1);
> 171: }
The code is almost the same for the int and long types except for the "ZERO" and "MINUS_1" nodes. Could you please remove the duplicated code by just defining different `ZERO` and `MINUS_1` nodes for int and long?
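To illustrate the kind of sharing meant here, below is a standalone sketch in which only the two constants depend on the element type and the folding logic is written once. It does not use the HotSpot type system; `MaskConstants` and `try_fold_compress` are hypothetical names, and plain integers stand in for the `TypeInt`/`TypeLong` constants.

```cpp
#include <cstdint>

// Hypothetical stand-in for the per-type ZERO and MINUS_1 constants.
// In the real patch these would be type-lattice constants, not raw integers.
template <typename T>
struct MaskConstants {
  static constexpr T zero()    { return static_cast<T>(0); }
  static constexpr T minus_1() { return static_cast<T>(-1); }
};

// Shared folding logic for int and long. Returns true and stores the folded
// value when the mask is one of the special constants, false otherwise.
template <typename T>
bool try_fold_compress(T x, T mask, T* folded) {
  if (mask == MaskConstants<T>::zero()) {     // compress(x, 0) == 0
    *folded = static_cast<T>(0);
    return true;
  }
  if (mask == MaskConstants<T>::minus_1()) {  // compress(x, -1) == x
    *folded = x;
    return true;
  }
  return false;                               // no identity applies
}
```

The same function then serves both widths, for example `try_fold_compress<int32_t>(x, -1, &out)` and `try_fold_compress<int64_t>(x, -1, &out)`, so the int and long paths cannot drift apart.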
src/hotspot/share/opto/intrinsicnode.hpp line 264:
> 262: virtual uint ideal_reg() const { return Op_RegF; }
> 263: };
> 264: //----------------------------CompressBits/ExpandBits---------------------------
Style: please insert a blank line between lines 263 and 264.
src/hotspot/share/opto/intrinsicnode.hpp line 285:
> 283: virtual Node* Ideal(PhaseGVN* phase, bool can_reshape);
> 284: virtual Node* Identity(PhaseGVN* phase);
> 285:
Please remove the blank line at line 285. Thanks!
src/hotspot/share/opto/library_call.cpp line 2212:
> 2210: switch (id) {
> 2211: case vmIntrinsics::_compress_i: n = new CompressBitsNode(argument(0), argument(1), TypeInt::INT); break;
> 2212: case vmIntrinsics::_expand_i: n = new ExpandBitsNode(argument(0), argument(1), TypeInt::INT); break;
Style: one more space after `new ExpandBitsNode(argument(0), `
src/hotspot/share/opto/vectorIntrinsics.cpp line 70:
> 68: }
> 69:
> 70: bool LibraryCallKit::arch_supports_vector_bitshuffle(int opc, int num_elem, BasicType elem_bt) {
Could you please add an assertion for `opc` that it should be `Op_CompressBitsV` or `Op_ExpandBitsV`?
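A rough standalone sketch of the kind of guard meant here follows. The enum values, the helper `is_vector_bitshuffle_opcode`, and the placeholder function body are all assumptions made for illustration; the real function body is not shown in this thread, and HotSpot's own `assert` macro takes the message as a second argument rather than using the `&&` idiom below.

```cpp
#include <cassert>

// Hypothetical opcode values, only so the sketch compiles on its own.
enum Opcode { Op_Other = 1, Op_CompressBitsV, Op_ExpandBitsV };

// Hypothetical helper: true only for the two vector bit-shuffle opcodes.
inline bool is_vector_bitshuffle_opcode(int opc) {
  return opc == Op_CompressBitsV || opc == Op_ExpandBitsV;
}

// Sketch of the entry check. Everything after the assert is a placeholder
// and does not reflect the actual support query in the patch.
inline bool arch_supports_vector_bitshuffle_sketch(int opc, int num_elem) {
  assert(is_vector_bitshuffle_opcode(opc) &&
         "opc must be Op_CompressBitsV or Op_ExpandBitsV");
  return num_elem > 0;  // placeholder result
}
```

Failing fast at the top of the function documents the expected callers and catches an unrelated opcode being passed in by mistake in debug builds.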
-------------
PR: https://git.openjdk.java.net/panama-vector/pull/195