RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2]

Pengfei Li pli at openjdk.org
Tue Jul 4 01:45:14 UTC 2023


On Tue, 27 Jun 2023 17:49:43 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

> General question: Do you have any tests with varying loop limit, and check that you stop exactly at the right iteration? Would be even more interesting with mixed type examples. Just to see that you do not over/under duplicate the vectors.

Yes, we previously tested this with a lot of fuzzer tests. We did find issues before but they are all fixed now.
(Previously we also supported reductions, and it's a bit tricky to duplicate reductions.)

> src/hotspot/share/opto/vmaskloop.cpp line 403:
> 
>> 401:       int opc = node->Opcode();
>> 402:       BasicType bt = elem_bt(node);
>> 403:       int vlen = Matcher::max_vector_size(bt);
> 
> Theoretically, different `bt` can have different `Matcher::vector_width_in_bytes`. So `vlen` would not always correspond to `MaxVectorSize / element_size`. It just means that here you would end up checking for a shorter length than maybe expected? But maybe that is intended, it depends on how you generate the nodes later.

I think it's good, at least for AArch64 SVE. Do you mean that other architecture may prefer using shorter vectors for better performances? (say, using 256-bit on AVX-512?) Does setting a smaller `MaxVectorSize` help?

> src/hotspot/share/opto/vmaskloop.cpp line 442:
> 
>> 440: // nodes to bail out for complex loops
>> 441: bool VectorMaskedLoop::analyze_loop_body_nodes() {
>> 442:   VectorSet tracked(_arena);
> 
> This is probably a good case where you could use `ResourceMark rm;` and just put the `VectorSet` on the default resource arena.

Updated here, and in another place. thanks.

> src/hotspot/share/opto/vmaskloop.cpp line 465:
> 
>> 463:   for (int idx = 0; idx < n_nodes; idx++) {
>> 464:     Node* node = _body_nodes.at(idx);
>> 465:     if ((node->is_Mem() && node->as_Mem()->is_Store())) {
> 
> Suggestion:
> 
>     if ((node->is_Store())) {

Done

> src/hotspot/share/opto/vmaskloop.cpp line 474:
> 
>> 472:         if (!in_body(out)) {
>> 473:           trace_msg(node, "Node has out-of-loop user found");
>> 474:           return false;
> 
> Can this be handled in the future with a extract node? I guess you would have to extract it from a variable element, as the last iteration is not always the same.

Probably, but I haven't tried it so far. Will do it in the future.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14581#issuecomment-1619339468
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251383442
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251383769
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251383913
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251384357


More information about the hotspot-compiler-dev mailing list