[14] RFR (L): 8234391: C2: Generic vector operands

Tue Dec 3 11:49:14 UTC 2019

Thanks a lot for the review, John.

Incremental change:
   http://cr.openjdk.java.net/~vlivanov/jbhateja/8234391/webrev.01-00

(Also, extended idealreg2regmask refactoring to non-vector registers.)

> In machnode.cpp I think the code would be easier to read if a common
> subexpression were assigned a name ’num_edges’ as in similar places:
> 
>     uint num_edges = _opnds[opcnt]->num_edges();
> 
> (Alternatively, you could just increment “skipped” on the fly, which
> would be OK too.)  I’m being picky because it cost me a second to verify
> that the set of raw edges was processed exhaustively.  This is a task
> every reader of the code will have to do.

Agree, fixed.

> Another nit:  There is some dissonance between the seemingly general
> name postselect_cleanup and supports_generic_vector_operands, which
> is a very specific name.  But they refer to the same condition.  Pick the
> shorter name, maybe, and convert the longer one into a nice comment.
> It’s also not clear that helpers like process_mach_node and get_vector_operand
> are is part of postselect_cleanup.  Should they be cleanup_mach_node
> cleanup_vector_operand?  It’s a nit-pick, but might help future readers.

Agree, fixed.

> The name get_vector_operand is particularly bad, since it suggests that
> it accesses something previously computed, where in fact it transforms
> the graph.  Also, IMO, “get_” is one of those noise words which offers little
> help to the reader and just takes up valuable screen space.

Agree, fixed.

> The name clone_generic_vector_operand is confusing; I would expect it
> to be called something like [specialize,cleanup,…]_generic_vector_operand.

Agree, fixed.

> There’s a funny condition reported in this comment:
>    // RShiftCntV/RShiftCntV report wide vector type, but VecS as ideal register.
> 
> It seems to comes out of the blue sky.  Maybe add a cross-referencing comment
> between that and the vshiftcnt instruction in the AD file?  (Are there asserts
> that would catch similar oddities if they were to arise?  Was this one caught
> via an assert?  I certainly hope it was, rather than by debugging bad code!)

It comes not from vshiftcnt, but from ideal node declaration :

class LShiftCntVNode : public VectorNode {
  public:
   LShiftCntVNode(Node* cnt, const TypeVect* vt) : VectorNode(cnt,vt) {}
   virtual int Opcode() const;
   virtual uint ideal_reg() const { return 
Matcher::vector_shift_count_ideal_reg(vect_type()->length_in_bytes()); }
};

Yes, complete migration to generic vector operands help with catching 
such cases. It leads to concrete vector type mismatches on IR level and 
those are caught with asserts.

Anyway, I added the reference to vectornode.hpp.

> On a similar note (about asserts), I’m very glad to see verify_mach_nodes.
> The name is a little non-specific.  Maybe verify_after_postselect_cleanup.

Agree, fixed.

> Why do vector[xy]_reg_legacy and vectorz_reg_legacy get different treatments
> in the change set?  I’m mainly curious here.  The vectorz_reg_vl thing is a
> dynamic set (?) which is fine, but why is it needed for z and not xy?  A comment
> might help.  Also, this gripe is not part of this review, but I’ll make it anyway:
> The very very short acronym “vl” which appears here starts for “AVX512VL”
> referring to “variable length” but it bumps into the phrase “vector legacy”
> with an unfortunate occasion for confusion.  Suggest “_vl” be renamed to
> “_512vl” or some other more specific thing.

I agree that _vl postfix is cryptic. I'll file an RFE to investigate 
possible cleanups. (One of the ideas was to completely eliminate dynamic 
register masks for vectors and get rid of (leg)vec[SDXYZ] along the way, 
but it was left for a future enhancement.)

> For the string intrinsics, there’s a regular replacement of legRegS by legRegD.
> That strikes me as potentially a semantic change.  I suppose the register
> allocator will do the same things in both cases, and there’s no spill code
> generated by the JIT for such a temp.  I wonder, if the change was necessary,
> how do we know that all the occurrences were correctly found and changed?
> (Also, the string intrinsic macros use AVX512 instructions when available,
> and in theory those would require heftier register masks.)  Can someone
> comment on this, for the record…maybe even in the source code?

The change is from legVecS to legRegD.

It doesn't cause any behavioral changes, but aligns the declaration with 
the actual behavior. Regarding Vec=>Reg part, string intrinsic nodes 
don't advertise themselves as vector nodes. For example, they don't set 
C->max_vector_size() and are still used when vectors are disabled 
(-XX:MaxVectorSize=0). Such problems are caught during graph 
verification with -XX:MaxVectorSize=0 or when there are no vector 
operations present.

S=>D change aligns x86_32.ad and x86_64.ad. 32-bit version consistently 
uses regD everywhere. Instead of migrating both x86_32.ad and x86_64.ad, 
it was decided to stick with regD/legVecD.

> I suggest that the warning comments “(leg)Vec should be used instead” could
> be a little less cryptic.  An unconditional warning like this makes the reader
> wonder, “so why is it here at all?”  Maybe use a cross-reference as in:
>    // Replaces (leg)vec during post-selection cleanup.  See above.

Agree, fixed.

Best regards,
Vladimir Ivanov

> On Nov 19, 2019, at 6:30 AM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
>>
>> http://cr.openjdk.java.net/~vlivanov/jbhateja/8234391/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8234391
>>
>> Introduce generic vector operands and migrate existing usages from fixed sized operands (vec[SDXYZ]) to generic ones.
>>
>> (It's an updated version of generic vector support posted for review in August, 2019 [1] [2]. AD instruction merges will be handled separately.)
>>
>> On a high-level it is organized as follows:
>>
>>   (1) all AD instructions in x86.ad/x86_64.ad/x86_32.ad use vec/legVec;
>>
>>   (2) at runtime, right after matching is over, a special pass is performed which does:
>>
>>       * replaces vecOper with vec[SDXYZ] depending on mach node type
>>          - vector mach nodes capute bottom_type() of their ideal prototype;
>>
>>       * eliminates redundant reg-to-reg vector moves (MoveVec2Leg /MoveLeg2Vec)
>>          - matcher needs them, but they are useless for register allocator (moreover, may cause additional spills);
>>
>>
>>    (3) after post-selection pass is over, all mach nodes should have fixed-size vector operands.
>>
>>
>> Some details:
>>
>>    (1) vec and legVec are marked as "dynamic" operands, so post-selection rewriting works
>>
>>
>>    (2) new logic is guarded by new matcher flag (Matcher::supports_generic_vector_operands) which is enabled only on x86
>>
>>
>>    (3) post-selection analysis is implemented as a single pass over the graph and processing individual nodes using their own (for DEF operands) or their inputs (USE operands) bottom_type() (which is an instance of TypeVect)
>>
>>
>>    (4) most of the analysis is cross-platform and interface with platform-specific code through 3 methods:
>>
>>      static bool is_generic_reg2reg_move(MachNode* m);
>>      // distinguishes MoveVec2Leg/MoveLeg2Vec nodes
>>
>>      static bool is_generic_vector(MachOper* opnd);
>>      // distinguishes vec/legVec operands
>>
>>      static MachOper* clone_generic_vector_operand(MachOper* generic_opnd, uint ideal_reg);
>>      // constructs fixed-sized vector operand based on ideal reg
>>      //   vec    + Op_Vec[SDXYZ] =>    vec[SDXYZ]
>>      //   legVec + Op_Vec[SDXYZ] => legVec[SDXYZ]
>>
>>
>>    (5) TEMP operands are handled specially:
>>      - TEMP uses max_vector_size() to determine what fixed-sized operand to use
>>          * it is needed to cover reductions which don't produce vectors but scalars
>>      - TEMP_DEF inherits fixed-sized operand type from DEF;
>>
>>
>>    (6) there is limited number of special cases for mach nodes in Matcher::get_vector_operand_helper:
>>
>>        - RShiftCntV/RShiftCntV: though it reports wide vector type as Node::bottom_type(), its ideal_reg is VecS! But for vector nodes only Node::bottom_type() is captured during matching and not ideal_reg().
>>
>>        - vshiftcntimm: chain instructions which convert scalar to vector don't have vector type.
>>
>>
>>    (7) idealreg2regmask initialization logic is adjusted to handle generic vector operands (see Matcher::get_vector_regmask)
>>
>>
>>    (8) operand renaming in x86_32.ad & x86_64.ad to avoid name conflicts with new vec/legVec operands
>>
>>
>>    (9) x86_64.ad: all TEMP usages of vecS/legVecS are replaced with regD/legRegD
>>       - it aligns the code between x86_64.ad and x86_32.ad
>>       - strictly speaking, it's illegal to use vector operands on a non-vector node (e.g., string_inflate) unless its usage is guarded by C2 vector support checks (-XX:MaxVectorSize=0)
>>
>>
>> Contributed-by: Jatin Bhateja <jatin.bhateja at intel.com>
>> Reviewed-by: vlivanov, sviswanathan, ?
>>
>> Testing: tier1-tier4, jtreg compiler tests on KNL and SKL,
>>          performance testing (SPEC* + Octane + micros / G1 + ParGC).
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-August/034822.html
>>
>> [2] http://cr.openjdk.java.net/~jbhateja/genericVectorOperands/generic_operands_support_v1.0.pdf
>