RFR: 8264104: Eliminate unnecessary vector mask conversion during VectorUnbox for floating point VectorMask [v2]

Thu Apr 8 03:45:42 UTC 2021

On Wed, 7 Apr 2021 16:06:44 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>> @iwanowww should confirm correctness of such optimization.
>> Regarding changes - they seem fine to me. I notice that VectorNode and its subclasses do not check for TOP inputs. Since Vector API introduce vectors in graph before SuperWord transformation their input could become dead. How such cases handled? And why we did not hit them yet? is_vect() should hit assert.
>
> I'm not fond of the proposed approach. 
> 
> It hard-codes some implicit assumptions about vector mask representation.  
> Also, vector type mismatch can trigger asserts since some vector-related code expects that types of vector inputs match precisely).
> 
> I suggest to introduce artificial cast nodes (`VectorLoadMask (VectorStoreMask vmask) ==> VectorMaskCast (vmask)`) which are then lowered into no-ops on backend side.

Hi @iwanowww , thanks for looking at this PR.

> It hard-codes some implicit assumptions about vector mask representation.
> Also, vector type mismatch can trigger asserts since some vector-related code expects that types of vector inputs match precisely).

Agree. I'm also anxious about the potential assertion although I didn't met the issue currently.

> I suggest to introduce artificial cast nodes (`VectorLoadMask (VectorStoreMask vmask) ==> VectorMaskCast (vmask)`) which are then lowered into no-ops on backend side.

It's a good idea to add a cast node for mask. I tried with it, and it can work well for the casting with same element size and vector   length. However, for the different element size (i.g. short->int) casting, I think the original `VectorLoadMask (VectorStoreMask vmask)` is better since there are some optimizations existed. Also using `VectorMaskCast` for all cases needs more match rules in the backend which I think is duplicated with `VectorLoadMask+VectorStoreMask`. So does it look good for you if the `VectorMaskCast` is limited to the masking cast with the same element size? The changes might look like:

diff --git a/src/hotspot/share/opto/vectornode.cpp b/src/hotspot/share/opto/vectornode.cpp
index 326b9c4a5a0..d7b53c247fb 100644
--- a/src/hotspot/share/opto/vectornode.cpp
+++ b/src/hotspot/share/opto/vectornode.cpp
@@ -1232,6 +1232,10 @@ Node* VectorUnboxNode::Ideal(PhaseGVN* phase, bool can_reshape) {
         bool is_vector_mask    = vbox_klass->is_subclass_of(ciEnv::current()->vector_VectorMask_klass());
         bool is_vector_shuffle = vbox_klass->is_subclass_of(ciEnv::current()->vector_VectorShuffle_klass());
         if (is_vector_mask) {
+          if (in_vt->length_in_bytes() == out_vt->length_in_bytes() &&
+              Matcher::match_rule_supported(Op_VectorMaskCast)) {
+            return new VectorMaskCastNode(value, out_vt);
+          }
           // VectorUnbox (VectorBox vmask) ==> VectorLoadMask (VectorStoreMask vmask)
           value = phase->transform(VectorStoreMaskNode::make(*phase, value, in_vt->element_basic_type(), in_vt->length()));
           return new VectorLoadMaskNode(value, out_vt);

-------------

PR: https://git.openjdk.java.net/jdk/pull/3238