RFR: 8310308: IR Framework: check for type and size of vector nodes [v26]

Mon Aug 14 10:56:29 UTC 2023

On Tue, 8 Aug 2023 17:20:01 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> For some changes to `SuperWord`, and maybe auto-vectorization in general, I want to strengthen the IR Framework.
>> 
>> **Motivation**
>> I want to not just find the relevant IR nodes, but also assert that they have the maximal length that they could have on the respective platform (given the CPU features and `MaxVectorSize`). Without this verification it is possible that a future change leads to a regression where we still vectorize but at shorter vector widths than before - leading to performance loss.
>> 
>> **How to use it**
>> 
>> All `IRNode`s in `test/hotspot/jtreg/compiler/lib/ir_framework/IRNode.java` that are created with `vectorNode` are now all matched with their `type` and `size`. The regex might now look something like this:
>> 
>> `"(\d+(\s){2}(VectorCastF2X.*)+(\s){2}===.*vector[A-Za-z]\[8\]:\{int\})"`
>> which would match IR nodes dumped like this:
>> `1150  VectorCastF2X  === _ 1151  [[ 1146 ]]  #vectory[8]:{int} ...`
>> 
>> The goal was to keep it simple and straightforward. In most cases, you can just use the nodes as before, and implicitly we now check for maximal size automatically. However, in some cases we want to ensure there are no nodes or only a limited number of nodes (`failOn` or comparison `<` or `<=` or `=0`) - in those cases we usually want to make sure there is not any node of any size, so we match with any size by default. The size can also explicitly be constrained using `IRNode.VECTOR_SIZE`.
>> 
>> Some examples:
>> 1. `@IR(counts = {IRNode.LOAD_VECTOR_I,  " >0 "})` -> search for a `LoadVector` node with `type` `int`, and maximal `size` possible on the machine (limited by CPU features and `MaxVectorSize`). This is the most common use case.
>> 2. `@IR(failOn = { IRNode.LOAD_VECTOR_L, IRNode.STORE_VECTOR })` -> fail if there is a `LoadVector` with type `long`, of `any` size.
>> 3. `@IR(counts = { IRNode.XOR_VI, IRNode.VECTOR_SIZE_4, " > 0 "})` -> find at least one `XorV` node with type `int` and exactly `4` elements. Useful for VectorAPI when the vector species is fixed.
>> 4. `@IR(counts = { IRNode.LOAD_VECTOR_D, IRNode.VECTOR_SIZE + "min(4, max_double)", " >0 " })` -> search for a `LoadVector` node with `type` `double`, and `size` exactly equal to `min(4, max_double)` (so 4 elements, or if the hardware allows fewer `doubles`, then that number).
>> 5. `@IR(counts = { IRNode.ABS_VF, IRNode.VECTOR_SIZE + "min(LoopMaxUnroll, max_float)", ">= 1" })` -> find at least one `AbsV` node with type `float`, and the `size` exactly equal to the smaller of...
>
> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 71 commits:
> 
>  - manual merge from master
>  - duplicate rules in VectorLogicalOpIdentityTest.java
>  - Merge branch 'master' into JDK-8310308
>  - Duplicated =1 counts for vector nodes in compiler/vectorapi/reshape/tests/TestVectorCast.java
>  - Merge branch 'master' into JDK-8310308
>  - Fix with canTrustVectorSize for Cascade Lake
>  - TestSpillTheBeans.java
>  - print VMInfo from Test VM
>  - merge from master, manual merge for VectorLogicalOpIdentityTest.java
>  - Response to Tobias' review
>  - ... and 61 more: https://git.openjdk.org/jdk/compare/509f80bb...48fa52ba
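
To make the regex from the quoted description concrete, here is a minimal standalone sketch, assuming the literal `[`, `]`, `{` and `}` of the dump have to be escaped in a Java pattern. The class `VectorNodeRegexSketch` and its helpers are hypothetical and not part of the IR Framework; only the shape of the pattern and the sample dump line are taken from the description.

```java
import java.util.regex.Pattern;

public class VectorNodeRegexSketch {
    // Hypothetical helper (not the framework's API): builds a pattern of the
    // shape shown in the PR description for a node name, element count and type.
    static String vectorNodeRegex(String nodeName, int size, String type) {
        return "(\\d+(\\s){2}(" + nodeName + ".*)+(\\s){2}===.*"
                + "vector[A-Za-z]\\[" + size + "\\]:\\{" + type + "\\})";
    }

    // Returns true if the dumped IR line contains a node of the given name
    // whose vector type annotation has exactly this size and element type.
    static boolean matches(String dumpLine, String nodeName, int size, String type) {
        return Pattern.compile(vectorNodeRegex(nodeName, size, type))
                .matcher(dumpLine)
                .find();
    }

    public static void main(String[] args) {
        // Sample line in the format quoted in the PR description.
        String line = "1150  VectorCastF2X  === _ 1151  [[ 1146 ]]  #vectory[8]:{int}";
        System.out.println(matches(line, "VectorCastF2X", 8, "int"));
        System.out.println(matches(line, "VectorCastF2X", 4, "int"));
    }
}
```

With the size baked into the pattern, the same node dumped at a narrower width no longer counts as a match, which is the regression the description wants to catch.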

test/hotspot/jtreg/compiler/lib/ir_framework/README.md line 90:

> 88: ```
> 89: 
> 90: However, the size does not have to be specified. In most cases, one either wants to have vectorization at the maximal possible vector width, or no vectorization at all. Hence, for lower bound counts ('>' or '>=') the default size is `IRNode.VECTOR_SIZE_MAX`, and for upper bound counts ('<' or '<=' or '=0' or failOn) the default is `IRNode.VECTOR_SIZE_ANY`. Equal count comparisons with a strictly positive count (e.g. '=2') are not allowed for vector nodes. On machines with 'canTrustVectorSize == false' (cascade lake) the maximal vector width is not predictable currently. Hence, on such a machine we have to automatically weaken the IR rules. All lower bound counts are performed checking with `IRNode.VECTOR_SIZE_ANY`. Upper bound counts with no user specified size are performed with `IRNode.VECTOR_SIZE_ANY` but upper bound counts with a user specified size are not checked at all. Details and reasoning can be found in [RawIRNode](./driver/irmatching/irrule/checkattribute/parsing/RawIRNode.java).

Suggestion:

However, the size does not have to be specified. In most cases, one either wants to have vectorization at the maximal possible vector width, or no vectorization at all. Hence, for lower bound counts ('>' or '>=') the default size is `IRNode.VECTOR_SIZE_MAX`, and for upper bound counts ('<' or '<=' or '=0' or failOn) the default is `IRNode.VECTOR_SIZE_ANY`. Equal count comparisons with a strictly positive count (e.g. '=2') are not allowed for vector nodes. On machines with 'canTrustVectorSize == false' (Cascade Lake) the maximal vector width is not predictable currently. Hence, on such a machine we have to automatically weaken the IR rules. All lower bound counts are performed checking with `IRNode.VECTOR_SIZE_ANY`. Upper bound counts with no user specified size are performed with `IRNode.VECTOR_SIZE_ANY` but upper bound counts with a user specified size are not checked at all. Details and reasoning can be found in [RawIRNode](./driver/irmatching/irrule/checkattribute/parsing/RawIRNode.java).

Same for other occurrences.
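
For readers of the README paragraph, the defaulting rules it describes could be summarized roughly as in the sketch below. This is only my reading of the paragraph, not the code in `RawIRNode.java` or `RawCountsConstraint.java`: the class, the enum, and the strings `"max"` and `"any"` (standing in for `IRNode.VECTOR_SIZE_MAX` and `IRNode.VECTOR_SIZE_ANY`) are all hypothetical.

```java
import java.util.Optional;

public class VectorSizeDefaultsSketch {
    enum Bound { LOWER, UPPER }

    // Classifies a count comparison as described in the README paragraph.
    // failOn would be treated like UPPER; it has no comparator, so it is
    // not modeled here.
    static Bound classify(String comparator, int count) {
        switch (comparator) {
            case ">":
            case ">=":
                return Bound.LOWER;
            case "<":
            case "<=":
                return Bound.UPPER;
            case "=":
                if (count == 0) {
                    return Bound.UPPER; // "=0" behaves like an upper bound
                }
                throw new IllegalArgumentException(
                        "equal comparison with a positive count is not allowed for vector nodes");
            default:
                throw new IllegalArgumentException("unknown comparator: " + comparator);
        }
    }

    // Returns the size to match with, or empty if the constraint is skipped.
    // userSize is null when the rule does not specify a size.
    static Optional<String> effectiveSize(Bound bound, String userSize, boolean canTrustVectorSize) {
        if (!canTrustVectorSize) {
            // Weakened rules: an upper bound with an explicit size cannot be
            // counted reliably, everything else falls back to any size.
            if (bound == Bound.UPPER && userSize != null) {
                return Optional.empty();
            }
            return Optional.of("any");
        }
        if (userSize != null) {
            return Optional.of(userSize);
        }
        return Optional.of(bound == Bound.LOWER ? "max" : "any");
    }

    public static void main(String[] args) {
        System.out.println(effectiveSize(classify(">", 0), null, true));
        System.out.println(effectiveSize(classify("=", 0), null, true));
        System.out.println(effectiveSize(classify("<", 2), "4", false));
    }
}
```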

test/hotspot/jtreg/compiler/lib/ir_framework/driver/irmatching/irrule/checkattribute/parsing/RawIRNode.java line 110:

> 108:                         // If we have a size specified but cannot trust the size, and must check an upper
> 109:                         // bound, this can be impossible to count correctly - if we have an incorrect size
> 110:                         // we may count either too many nodes. We just create a impossible regex which will

Suggestion:

                        // we may count either too many nodes. We just create an impossible regex which will

test/hotspot/jtreg/compiler/lib/ir_framework/driver/irmatching/irrule/constraint/raw/RawCountsConstraint.java line 76:

> 74:             }
> 75:             case "=" -> {
> 76:                 // "=0" is same as setting upper bound - just like for failOn. But i we compare equals a

Suggestion:

                // "=0" is same as setting upper bound - just like for failOn. But if we compare equals a

test/hotspot/jtreg/compiler/lib/ir_framework/driver/irmatching/irrule/constraint/raw/RawCountsConstraint.java line 77:

> 75:             case "=" -> {
> 76:                 // "=0" is same as setting upper bound - just like for failOn. But i we compare equals a
> 77:                 // strictly positive number it is like setting both and upper and lower bound (equal).

Suggestion:

                // strictly positive number it is like setting both upper and lower bound (equal).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14539#discussion_r1288279325
PR Review Comment: https://git.openjdk.org/jdk/pull/14539#discussion_r1288278270
PR Review Comment: https://git.openjdk.org/jdk/pull/14539#discussion_r1288283599
PR Review Comment: https://git.openjdk.org/jdk/pull/14539#discussion_r1288283843