[vectorIntrinsics] RFR: 8266720: Wrong implementation in LibraryCallKit::inline_vector_shuffle_iota

Xiaohong Gong xgong at openjdk.java.net
Fri May 14 03:12:42 UTC 2021


On Fri, 14 May 2021 01:45:30 GMT, Wang Huang <whuang at openjdk.org> wrote:

>> After exchanging notes with @XiaohongGong, I think we can also fix it like this:
>> 
>>     ConINode* pred_node = (ConINode*)gvn().makecon(TypeInt::make(BoolTest::ge));
>>     // mask = (num_elem - 1) >= res, computed on the raw indices before the AND below.
>>     Node * lane_cnt_tmp = gvn().makecon(TypeInt::make(num_elem - 1));
>>     Node * bcast_lane_cnt_tmp = gvn().transform(VectorNode::scalar2vector(lane_cnt_tmp, num_elem, type_bt));
>>     Node* mask = gvn().transform(new VectorMaskCmpNode(BoolTest::ge, bcast_lane_cnt_tmp, res, pred_node, vt));
>> 
>>     // Make the out-of-range indices negative. This matches the Java-side implementation.
>>     res = gvn().transform(VectorNode::make(Op_AndI, res, bcast_mod, num_elem, elem_bt));
>>     Node * lane_cnt  = gvn().makecon(TypeInt::make(num_elem)); // Add a mov & bcast here
>>     Node * bcast_lane_cnt = gvn().transform(VectorNode::scalar2vector(lane_cnt, num_elem, type_bt));
>>     Node * biased_val = gvn().transform(VectorNode::make(Op_SubI, res, bcast_lane_cnt, num_elem, elem_bt));
>>     res = gvn().transform(new VectorBlendNode(biased_val, res, mask));
>
>> Unsigned comparison adds overhead and is not supported on all architectures.
> 
> However, if we don't use `ugt`, we will run into problems once the vector length exceeds 1024 bits in the future. Changing `< num_elem` to `<= 128` only solves the 1024-bit case itself. If `num_elem > 128`, it will be invalid.

For now, making it work well for vectors of <= 1024 bits makes sense to me. We can revisit this once the API issues for vector lengths > 1024 bits are fixed.
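
To make the trade-off concrete, here is a rough scalar sketch of one byte lane. It is only a model, not the HotSpot code: the helper names are made up for illustration, `in_range_lt` is my reading of the `< num_elem` form mentioned above, and it assumes `bcast_mod` holds `num_elem - 1`, which the quoted snippet does not show.

```cpp
#include <cstdint>

// Hypothetical scalar model of one byte lane; names are illustrative only.
// Narrowing an out-of-range int to int8_t wraps modulo 256 (guaranteed since
// C++20, and what mainstream compilers did before that).

// My reading of the "< num_elem" form: the bound is narrowed to a byte first,
// as a broadcast byte constant would be. 128 becomes -128.
bool in_range_lt(int num_elem, int8_t idx) {
    int8_t bound = static_cast<int8_t>(num_elem);
    return idx < bound;
}

// Signed test from the proposal: (num_elem - 1) >= idx.
// 127 still fits in a signed byte, 255 does not (it becomes -1).
bool in_range_signed(int num_elem, int8_t idx) {
    int8_t bound = static_cast<int8_t>(num_elem - 1);
    return bound >= idx;
}

// Unsigned variant of the same test, for comparison.
bool in_range_unsigned(int num_elem, int8_t idx) {
    uint8_t bound = static_cast<uint8_t>(num_elem - 1);
    return bound >= static_cast<uint8_t>(idx);
}

// One lane of the quoted sequence: compare, AND, subtract, blend.
// ASSUMPTION: bcast_mod is num_elem - 1 (not shown in the snippet).
int8_t fixup_lane(int num_elem, int8_t idx) {
    bool keep = in_range_signed(num_elem, idx);            // mask, taken before the AND
    int8_t wrapped = static_cast<int8_t>(idx & (num_elem - 1));
    int8_t biased = static_cast<int8_t>(wrapped - num_elem);
    return keep ? wrapped : biased;                        // blend: keep in-range lanes
}
```

With 128 byte lanes (1024 bits), `in_range_lt(128, 5)` is false although index 5 is in range, while `in_range_signed(128, 5)` is true. With 256 byte lanes, `in_range_signed(256, 5)` is false again and only `in_range_unsigned(256, 5)` gives the expected answer, which is the case we are deferring. For a small vector, `fixup_lane(16, 20)` yields -12.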

-------------

PR: https://git.openjdk.java.net/panama-vector/pull/81

