RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2]
Pengfei Li
pli at openjdk.org
Tue Jul 4 02:29:14 UTC 2023
On Wed, 28 Jun 2023 10:20:25 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Address part of comments from Emanuel
>
> src/hotspot/share/opto/vmaskloop.cpp line 790:
>
>> 788: // Compute vector duplication count and the vmask tree level
>> 789: int dup_cnt = lane_size / _size_stats.smallest_size();
>> 790: int level = exact_log2(dup_cnt);
>
> Rename `level` to something more expressive. Maybe just `vmask_tree_level`. Also in all other methods. Otherwise it is not quite clear what it is supposed to be.
Renamed
> src/hotspot/share/opto/vmaskloop.cpp line 798:
>
>> 796: if (type2aelembytes(statement_bottom_type(stmt)) != lane_size) {
>> 797: continue;
>> 798: }
>
> You could assert here, that the max vector size for bt is as expected.
It's done.
> src/hotspot/share/opto/vmaskloop.cpp line 874:
>
>> 872: void VectorMaskedLoop::adjust_vector_node(Node* vn, Node_List* vmask_tree,
>> 873: int level, int mask_off) {
>> 874: Node* vmask = vmask_tree->at((1 << level) + mask_off);
>
> Again, rename `level`. Maybe it could be `vmask_tree_level` and `vmask_tree_level_offset`? Here I finally understood what you mean by the two variables `level` and `mask_off`.
Also renamed.
> src/hotspot/share/opto/vmaskloop.cpp line 876:
>
>> 874: Node* vmask = vmask_tree->at((1 << level) + mask_off);
>> 875: int lane_size = type2aelembytes(Matcher::vector_element_basic_type(vmask));
>> 876: uint vector_size_in_bytes = Matcher::max_vector_size(T_BYTE);
>
> Can you add an assert that this is the same as `Matcher::vector_width_in_bytes(Matcher::vector_element_basic_type(vmask))` ?
Asserts added.
> src/hotspot/share/opto/vmaskloop.cpp line 884:
>
>> 882: Node* ptr = vn->in(MemNode::Address);
>> 883: Node* base = ptr->in(AddPNode::Base);
>> 884: int mem_scale = Matcher::max_vector_size(T_BYTE);
>
> Duplicate of `vector_size_in_bytes`?
Aha, it's duplicated now. Previously we did some strided access support so we added this. Removed now.
> src/hotspot/share/opto/vmaskloop.cpp line 893:
>
>> 891: // 2) For populate index, update start index for non-zero mask offset
>> 892: if (mask_off != 0) {
>> 893: int v_stride = vector_size_in_bytes / lane_size * _cl->stride_con();
>
> Is there any test for PopulateIndex with stride that is not `1`? For now I guess only `-1` would even be allowed.
Good question, I will check it then.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251405695
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251405307
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251405747
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251405983
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251406373
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251406879
More information about the hotspot-compiler-dev
mailing list