RFR: 8266720: Wrong implementation in LibraryCallKit::inline_vector_shuffle_iota

Wang Huang whuang at openjdk.java.net
Sat May 8 02:43:23 UTC 2021


Dear All,
  Here is the patch of JDK-8266720. Could you do me a favor to review this?
* Reproduce:
   * cherry-pick JDK-8265956 
   * run patch's `TestVectorShuffleIotaByteWrongImpl.java`
   * However, this wrong of this code is obvious.
* Reason :
 1. In interpreter: 

static int partiallyWrapIndex(int index, int laneCount) {
    return checkIndex0(index, laneCount, (byte)-1);
}

@ForceInline
static int checkIndex0(int index, int laneCount, byte mode) {
    int wrapped = VectorIntrinsics.wrapToRange(index, laneCount);
    if (mode == 0 || wrapped == index) { // NOTE here
        return wrapped;
    }
    if (mode < 0) {
        return wrapped - laneCount;  // special mode for internal storage
    }
    throw checkIndexFailed(index, laneCount);
}

@ForceInline
static int wrapToRange(int index, int size) {
    if ((size & (size - 1)) == 0) {
        // Size is zero or a power of two, so we got this.
        return index & (size - 1);
    } else {
        return wrapToRangeNPOT(index, size);
    }
}

2. However, we have this intrinsics in   
src/hotspot/share/opto/vectorIntrinsics.cpp [jdk/jdk]
```c++
 386   } else {
 387     ConINode* pred_node = (ConINode*)gvn().makecon(TypeInt::make(1)); // BoolTest::gt here
 388     Node * lane_cnt  = gvn().makecon(TypeInt::make(num_elem));
 389     Node * bcast_lane_cnt = gvn().transform(VectorNode::scalar2vector(lane_cnt, num_elem, type_bt));
// here BoolTest::ge != 1 (which means BoolTest::gt)
 390     Node* mask = gvn().transform(new VectorMaskCmpNode(BoolTest::ge, bcast_lane_cnt, res, pred_node, vt));

3. In `aarch64` neon backend, we use `BoolTest::ge` for generated code:
```c++
// cond is useless here
instruct vcmge8B(vecD dst, vecD src1, vecD src2, immI cond)
%{
  predicate(n->as_Vector()->length() == 8 &&
            n->as_VectorMaskCmp()->get_predicate() == BoolTest::ge &&
            n->in(1)->in(1)->bottom_type()->is_vect()->element_basic_type() == T_BYTE);
  match(Set dst (VectorMaskCmp (Binary src1 src2) cond));
  format %{ "cmge  $dst, T8B, $src1, $src2\t# vector cmp (8B)" %}
  ins_cost(INSN_COST);
  ins_encode %{
    __ cmge(as_FloatRegister($dst$$reg), __ T8B,
            as_FloatRegister($src1$$reg), as_FloatRegister($src2$$reg));
  %}
  ins_pipe(vdop64);
%}


However, we use cond (=1 or BoolTest::gt). So X86 is **right** on jdk/jdk
```c++
instruct vcmp(legVec dst, legVec src1, legVec src2, immI8 cond, rRegP scratch) %{
  predicate(vector_length_in_bytes(n->in(1)->in(1)) >=  8 && // src1
            vector_length_in_bytes(n->in(1)->in(1)) <= 32 && // src1
            is_integral_type(vector_element_basic_type(n->in(1)->in(1)))); // src1
  match(Set dst (VectorMaskCmp (Binary src1 src2) cond));
  effect(TEMP scratch);
  format %{ "vector_compare $dst,$src1,$src2,$cond\t! using $scratch as TEMP" %}
  ins_encode %{
    int vlen_enc = vector_length_encoding(this, $src1);
    Assembler::ComparisonPredicate cmp = booltest_pred_to_comparison_pred($cond$$constant);
    Assembler::Width ww = widthForType(vector_element_basic_type(this, $src1));
    __ vpcmpCCW($dst$$XMMRegister, $src1$$XMMRegister, $src2$$XMMRegister, cmp, ww, vlen_enc, $scratch$$Register);
  %}
  ins_pipe( pipe_slow );
%}

4. In repo `panama-vector`, both of them are wrong, because the IR is fixed:
```c++
 455   } else {
 456     ConINode* pred_node = (ConINode*)gvn().makecon(TypeInt::make(BoolTest::ge));// WRONG here
 457     Node * lane_cnt  = gvn().makecon(TypeInt::make(num_elem));
 458     Node * bcast_lane_cnt = gvn().transform(VectorNode::scalar2vector(lane_cnt, num_elem, type_bt));
 459     Node* mask = gvn().transform(new VectorMaskCmpNode(BoolTest::ge, bcast_lane_cnt, res, pred_node, vt));

Yours, 
Wang Huang

-------------

Commit messages:
 - 8266720: Wrong implementation in LibraryCallKit::inline_vector_shuffle_iota

Changes: https://git.openjdk.java.net/jdk/pull/3933/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3933&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8266720
  Stats: 58 lines in 2 files changed: 56 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3933.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3933/head:pull/3933

PR: https://git.openjdk.java.net/jdk/pull/3933


More information about the hotspot-compiler-dev mailing list