RFR: 8266720: Wrong implementation in LibraryCallKit::inline_vector_shuffle_iota [v2]
Wang Huang
whuang at openjdk.java.net
Sat May 8 03:30:06 UTC 2021
> Dear All,
> Here is the patch of JDK-8266720. Could you do me a favor to review this?
> * Reproduce:
> * cherry-pick JDK-8265956
> * run patch's `TestVectorShuffleIotaByteWrongImpl.java`
> * However, this wrong of this code is obvious.
> * Reason :
> 1. In interpreter:
>
> static int partiallyWrapIndex(int index, int laneCount) {
> return checkIndex0(index, laneCount, (byte)-1);
> }
>
> @ForceInline
> static int checkIndex0(int index, int laneCount, byte mode) {
> int wrapped = VectorIntrinsics.wrapToRange(index, laneCount);
> if (mode == 0 || wrapped == index) { // NOTE here
> return wrapped;
> }
> if (mode < 0) {
> return wrapped - laneCount; // special mode for internal storage
> }
> throw checkIndexFailed(index, laneCount);
> }
>
> @ForceInline
> static int wrapToRange(int index, int size) {
> if ((size & (size - 1)) == 0) {
> // Size is zero or a power of two, so we got this.
> return index & (size - 1);
> } else {
> return wrapToRangeNPOT(index, size);
> }
> }
>
> 2. However, we have this intrinsics in
> src/hotspot/share/opto/vectorIntrinsics.cpp [jdk/jdk]
> ```c++
> 386 } else {
> 387 ConINode* pred_node = (ConINode*)gvn().makecon(TypeInt::make(1)); // BoolTest::gt here
> 388 Node * lane_cnt = gvn().makecon(TypeInt::make(num_elem));
> 389 Node * bcast_lane_cnt = gvn().transform(VectorNode::scalar2vector(lane_cnt, num_elem, type_bt));
> // here BoolTest::ge != 1 (which means BoolTest::gt)
> 390 Node* mask = gvn().transform(new VectorMaskCmpNode(BoolTest::ge, bcast_lane_cnt, res, pred_node, vt));
>
> 3. In `aarch64` neon backend, we use `BoolTest::ge` for generated code:
> ```c++
> // cond is useless here
> instruct vcmge8B(vecD dst, vecD src1, vecD src2, immI cond)
> %{
> predicate(n->as_Vector()->length() == 8 &&
> n->as_VectorMaskCmp()->get_predicate() == BoolTest::ge &&
> n->in(1)->in(1)->bottom_type()->is_vect()->element_basic_type() == T_BYTE);
> match(Set dst (VectorMaskCmp (Binary src1 src2) cond));
> format %{ "cmge $dst, T8B, $src1, $src2\t# vector cmp (8B)" %}
> ins_cost(INSN_COST);
> ins_encode %{
> __ cmge(as_FloatRegister($dst$$reg), __ T8B,
> as_FloatRegister($src1$$reg), as_FloatRegister($src2$$reg));
> %}
> ins_pipe(vdop64);
> %}
>
>
> However, we use cond (=1 or BoolTest::gt). So X86 is **right** on jdk/jdk
> ```c++
> instruct vcmp(legVec dst, legVec src1, legVec src2, immI8 cond, rRegP scratch) %{
> predicate(vector_length_in_bytes(n->in(1)->in(1)) >= 8 && // src1
> vector_length_in_bytes(n->in(1)->in(1)) <= 32 && // src1
> is_integral_type(vector_element_basic_type(n->in(1)->in(1)))); // src1
> match(Set dst (VectorMaskCmp (Binary src1 src2) cond));
> effect(TEMP scratch);
> format %{ "vector_compare $dst,$src1,$src2,$cond\t! using $scratch as TEMP" %}
> ins_encode %{
> int vlen_enc = vector_length_encoding(this, $src1);
> Assembler::ComparisonPredicate cmp = booltest_pred_to_comparison_pred($cond$$constant);
> Assembler::Width ww = widthForType(vector_element_basic_type(this, $src1));
> __ vpcmpCCW($dst$$XMMRegister, $src1$$XMMRegister, $src2$$XMMRegister, cmp, ww, vlen_enc, $scratch$$Register);
> %}
> ins_pipe( pipe_slow );
> %}
>
> 4. In repo `panama-vector`, both of them are wrong, because the IR is fixed:
> ```c++
> 455 } else {
> 456 ConINode* pred_node = (ConINode*)gvn().makecon(TypeInt::make(BoolTest::ge));// WRONG here
> 457 Node * lane_cnt = gvn().makecon(TypeInt::make(num_elem));
> 458 Node * bcast_lane_cnt = gvn().transform(VectorNode::scalar2vector(lane_cnt, num_elem, type_bt));
> 459 Node* mask = gvn().transform(new VectorMaskCmpNode(BoolTest::ge, bcast_lane_cnt, res, pred_node, vt));
>
> Yours,
> Wang Huang
Wang Huang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
8266720: Wrong implementation in LibraryCallKit::inline_vector_shuffle_iota
-------------
Changes:
- all: https://git.openjdk.java.net/jdk/pull/3933/files
- new: https://git.openjdk.java.net/jdk/pull/3933/files/65e20a50..9f5577f0
Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3933&range=01
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3933&range=00-01
Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.java.net/jdk/pull/3933.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/3933/head:pull/3933
PR: https://git.openjdk.java.net/jdk/pull/3933
More information about the hotspot-compiler-dev
mailing list