RFR: 8342095: Add autovectorizer support for subword vector casts [v10]

Emanuel Peter epeter at openjdk.org
Fri Apr 4 06:06:56 UTC 2025


On Wed, 2 Apr 2025 14:15:19 GMT, Jasmine Karthikeyan <jkarthikeyan at openjdk.org> wrote:

>> Hi all,
>> This patch adds initial support for the autovectorizer to generate conversions between subword types. Currently, when superword sees two packs that have different basic types, it discards them and bails out of vectorization. This patch changes the behavior to ask the backend if a cast between the conflicting types is supported, and keeps the pack if it is. Later, when the `VTransform` graph is built, a synthetic cast is emitted when packs requiring casts are detected. Currently, only narrowing casts are supported as I wanted to re-use existing `VectorCastX2Y` logic for the initial version, but adding more conversions is simple and can be done with a subsequent RFE. I have attached a JMH benchmark and got these results on my Zen 3 machine:
>> 
>> 
>>                                                   Baseline                    Patch
>> Benchmark                  (SIZE)  Mode  Cnt    Score    Error  Units   Score    Error  Units    Improvement
>> VectorSubword.intToByte      1024  avgt   12  200.049 ± 19.787  ns/op   56.228 ± 3.535  ns/op  (3.56x)
>> VectorSubword.intToShort     1024  avgt   12  179.826 ±  1.539  ns/op   43.332 ± 1.166  ns/op  (4.15x)
>> VectorSubword.shortToByte    1024  avgt   12  245.580 ±  6.150  ns/op   29.757 ± 1.055  ns/op  (8.25x)
>> 
>> 
>> I've also added some IR tests and they pass on my linux x64 machine. Thoughts and reviews would be appreciated!
>
> Jasmine Karthikeyan has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix copyright after merge

This looks really good, thanks @jaskarth for the work you put in!

I have a few more comments below.

src/hotspot/share/opto/superword.cpp line 2329:

> 2327: // Check if the output type of def is compatible with the input type of use, i.e. if the
> 2328: // types have the same size.
> 2329: bool SuperWord::is_velt_basic_type_compatible_use_def(Node* use, Node* def, const uint def_size) const {

Suggestion:

bool SuperWord::is_velt_basic_type_compatible_use_def(Node* use, Node* def, const uint pack_size) const {

I think that would be more descriptive here. It would indicate we are not interested in the size of an element (i.e. bytes per element), but the size of the pack.

src/hotspot/share/opto/superword.cpp line 2361:

> 2359: 
> 2360:   // Input sizes differ, but platform supports a cast to change the def shape to the use shape
> 2361: 

Suggestion:

  // Subword cast: Element sizes differ, but the platform supports a cast to change the def shape to the use shape.

src/hotspot/share/opto/superwordVTransformBuilder.cpp line 195:

> 193:     // If the use and def types are different, emit a cast node
> 194:     if (use_bt != def_bt && !p0->is_Convert()
> 195:         && (is_subword_type(def_bt) || is_subword_type(use_bt)) && VectorCastNode::implemented(-1, pack->size(), def_bt, use_bt)) {

Suggestion:

    if (use_bt != def_bt && !p0->is_Convert() &&
        (is_subword_type(def_bt) || is_subword_type(use_bt)) &&
        VectorCastNode::implemented(-1, pack->size(), def_bt, use_bt)) {

Optional nit :)

test/hotspot/jtreg/compiler/loopopts/superword/TestCompatibleUseDefTypeSize.java line 113:

> 111:         tests.put("testShortToInt",  () -> { return testShortToInt(aS.clone(), bI.clone()); });
> 112:         tests.put("testByteToInt",   () -> { return testByteToInt(aB.clone(), bI.clone()); });
> 113:         tests.put("testByteToShort", () -> { return testByteToShort(aB.clone(), bS.clone()); });

What about a `testLongToShort` etc? It could be good to just have casts from/to all types, just to be sure ;)

test/micro/org/openjdk/bench/vm/compiler/VectorSubword.java line 44:

> 42:     private byte[] bytes;
> 43:     private short[] shorts;
> 44:     private int[] ints;

It would be nice if you covered also `char` and `long`, for completeness :)

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/23413#pullrequestreview-2739175207
PR Review Comment: https://git.openjdk.org/jdk/pull/23413#discussion_r2026615573
PR Review Comment: https://git.openjdk.org/jdk/pull/23413#discussion_r2028138291
PR Review Comment: https://git.openjdk.org/jdk/pull/23413#discussion_r2028139632
PR Review Comment: https://git.openjdk.org/jdk/pull/23413#discussion_r2028142164
PR Review Comment: https://git.openjdk.org/jdk/pull/23413#discussion_r2028145292


More information about the hotspot-compiler-dev mailing list