RFR: 8338023: Support two vector selectFrom API [v8]
Jatin Bhateja
jbhateja at openjdk.org
Mon Sep 16 03:02:08 UTC 2024
On Sun, 15 Sep 2024 07:16:17 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
> > > Can you please **define** somewhere what it means to `prune indexes`? It does not help me much more than the previous "massaging indexes" you had before I asked you to change it.
> > > > Also: I'm a little worried about the semantics change of the RearrangeNode that you did with the changes in RearrangeNode::Ideal. It looks a little "hacky", especially in conjunction with the vector_indexes_needs_massaging method. Can you give a clear definition of the semantics of RearrangeNode and vector_indexes_needs_massaging, please?
> > >
> > >
> > > You have also not responded to this yet. It seems to me that before your proposed change, `RearrangeNode` had a clear and easy semantic, and now you somehow "hack it" to work with your `vector_indexes_needs_pruning`. Can you explain please why this makes sense and add a comment to `RearrangeNode` what its semantics is?
> >
> >
> > In case target does not directly support two vector selection instruction we lower the IR to its constituents, this is better than intrinsification failure as it saves costly vector boxing penalties.
> > Consider in terms of desired compiler IR and not rearrange API semantics, VectorRearrange IR node generally expects shape conformance b/w vector to be permuted and index vector, since shuffle indices are held in byte array based backing storage hence compiler injects VectorLoadShuffle nodes to upcast the byte vector lanes holding indexes to match the input vector lane. Since selectFrom API already passes the indexes through vector hence we can save emitting redundant toShuffle() + toVector() operations in some cases apart from some target specific scenarios e.g. AVX2 targets [do not support direct short vector permute]instruction "VPERMW", hence we need to [massage the index vector](https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86.ad#L8771) to emulate desired permutation using byte permute instruction.
> > VectorLoadShuffle abstraction hides target specific index massaging which is why adding a target specific hook like Matcher::vector_indexes_needs_pruning compiler to selectively emit VectorLoadShuffle.
>
> I still do not see a **definition of the semantics of RearrangeNode**: what inputs does it accept and what does it do with them?
>
> Can you put this explanation as comment in the code, please?
>
> It sounds like this is what the `massaging` / `pruning` is: `emulate desired permutation using byte permute instruction.` You should find an accordingly more suiting name for the method name. Maybe it is something like `must_emulate_permutation_with...`. Or maybe it is rather a `supported` kind of question? I leave that up to you.
Hi @eme64 , As per discussion on [PR# 20634 ](https://github.com/openjdk/jdk/pull/20634#discussion_r1759701508), we plan to suppress VectorLoadShuffle bypassing optimization for now and address this as a follow up optimization for both the flavors of selectFrom API.
I have addressed your comments. Kindly verify.
Best Regards,
Jatin
-------------
PR Comment: https://git.openjdk.org/jdk/pull/20508#issuecomment-2351944720
More information about the core-libs-dev
mailing list