[vectorIntrinsics] RFR: 8287289: Gather/Scatter with Index Vector

Wed Jun 8 23:01:12 UTC 2022

On 2 Jun 2022, at 3:55, Joshua Zhu wrote:

> Make sense, I will follow your suggestion and implement Gather/Scatter 
> operation over memory segments by int/long index.
> I think besides Gather/Scatter API on MemorySegment, the existing 
> proof-of-concept APIs over Java arrays are still needed, right? As for 
> the implementation of them, I will figure out whether the way that 
> depends on memory segments is better.

If the POC APIs are easy to emulate over newer, more general ones, then 
we can probably phase them out.  The first step is to try to make ones 
with fewer assumptions about the use of memory locations to store 
indexes, as noted.

When you say “index” (stored in int or long vectors for S/G) do you 
mean an index to be scaled, that is, multiplied by the access element 
size, to get a non-scaled byte offset?  That is more convenient but less 
general.  I think if we find that use cases are *usually* based on 
indexes that are scaled, we probably want to avoid giving only the 
non-scaled (offset-based) versions.  But the non-scaled offset-based 
versions are more general, and under the hood they should be the 
primitive.  The JIT should be able to strength-reduce alignment checks, 
if those are important.  There’s no solid performance-related reason 
to do *only* the scaled indexes and not also the non-scaled offsets.
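
To make the scaled/unscaled distinction concrete, here is a minimal scalar 
sketch, not the Vector API itself.  It assumes a `ByteBuffer` as a stand-in 
for a memory segment, and the names `OffsetModes`, `gatherIntUnscaled` and 
`gatherIntScaled` are hypothetical.  The unscaled form takes byte offsets 
and acts as the primitive; the scaled form multiplies each index by the 
element size and delegates to it.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class OffsetModes {
    // Primitive form (offset-mode:unscaled): each entry is a byte offset.
    static int[] gatherIntUnscaled(ByteBuffer base, long[] byteOffsets) {
        int[] lanes = new int[byteOffsets.length];
        for (int i = 0; i < lanes.length; i++) {
            // ByteBuffer positions are ints; toIntExact rejects offsets that do not fit.
            lanes[i] = base.getInt(Math.toIntExact(byteOffsets[i]));
        }
        return lanes;
    }

    // Convenience form (offset-mode:scaled): each entry is an element index,
    // multiplied by the element size and passed to the unscaled primitive.
    static int[] gatherIntScaled(ByteBuffer base, long[] indexes) {
        long[] byteOffsets = new long[indexes.length];
        for (int i = 0; i < indexes.length; i++) {
            byteOffsets[i] = Math.multiplyExact(indexes[i], (long) Integer.BYTES);
        }
        return gatherIntUnscaled(base, byteOffsets);
    }

    // Eight ints 0, 10, 20, ... 70 stored back to back.
    static ByteBuffer sample() {
        ByteBuffer buf = ByteBuffer.allocate(8 * Integer.BYTES).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < 8; i++) {
            buf.putInt(i * Integer.BYTES, 10 * i);
        }
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = sample();
        // Index 3 and byte offset 12 name the same int, and so on.
        System.out.println(Arrays.toString(gatherIntScaled(buf, new long[] {3, 0, 7})));
        System.out.println(Arrays.toString(gatherIntUnscaled(buf, new long[] {12, 0, 28})));
    }
}
```

Both calls in `main` read the same three elements.  Only the unscaled form 
can also name a byte offset that is not a multiple of the element size, 
which is the sense in which it is more general.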

So I think the API should give choices that are related to:

{op:scatter/gather}{base-address:array/MemorySegment}{element:B/S/I/J/F/D}{offset-type:int/long}{offset-storage:array/vector}{offset-mode:scaled/unscaled}.

Giving both offset-types seems fundamental.  We are in a 32/64-bit world 
here.  The offset-mode:scaled can be built on top of 
offset-mode:unscaled but ease-of-use might warrant supplying both.

The offset-storage:array choice can be built on top of 
offset-storage:vector, and we will have to recheck ease-of-use 
considerations, but we should try to get rid of it.
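
As a rough illustration of that layering, here is a scalar sketch under 
stated assumptions: `OffsetVector` is a hypothetical stand-in for a long 
vector of offsets (not a real Vector API type), `LANES` is an arbitrary 
fixed length, and the base is a plain `int[]` addressed by element index.  
The array-storage overload does nothing but load the offsets into the 
vector stand-in and call the vector-storage form.

```java
import java.util.Arrays;
import java.util.Objects;

final class OffsetStorage {
    static final int LANES = 4; // hypothetical fixed vector length

    // Hypothetical stand-in for a vector of long offsets.
    static final class OffsetVector {
        final long[] lanes;

        private OffsetVector(long[] lanes) {
            this.lanes = lanes;
        }

        // Load LANES offsets from an array, like an ordinary vector load.
        static OffsetVector fromArray(long[] a, int start) {
            Objects.checkFromIndexSize(start, LANES, a.length);
            long[] copy = new long[LANES];
            System.arraycopy(a, start, copy, 0, LANES);
            return new OffsetVector(copy);
        }
    }

    // General form (offset-storage:vector): offsets already live in a vector.
    static int[] gather(int[] base, OffsetVector offsets) {
        int[] result = new int[LANES];
        for (int i = 0; i < LANES; i++) {
            result[i] = base[Math.toIntExact(offsets.lanes[i])];
        }
        return result;
    }

    // Convenience form (offset-storage:array), layered on the general form.
    static int[] gather(int[] base, long[] offsetArray, int start) {
        return gather(base, OffsetVector.fromArray(offsetArray, start));
    }

    public static void main(String[] args) {
        int[] base = {100, 101, 102, 103, 104, 105};
        long[] offsets = {9, 5, 0, 3, 1};
        // Starts at offsets[1], so the out-of-range 9 is never used.
        System.out.println(Arrays.toString(gather(base, offsets, 1)));
    }
}
```

If the convenience overload is this thin, users can write it themselves 
in one line, which is the argument for dropping it from the API.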

Similarly with base-address:array; it might be an ease-of-use win but I 
think that probably we can just ask users to build that on top of the 
more general base-address:MemorySegment.  Notice that the “from char 
array” variations can go away, since you can build the same sort of 
access to both `short[]` and `char[]` through `MemorySegment`.

Supporting all lane types is the clean design.  We could cut back to 
{element:I/J/F/D} omitting subword types, maybe, but that doesn’t feel 
right to me.  On the other hand, indexes stored in subword types 
don’t feel right either, so for subwords the index sizes are larger 
than the payload sizes.

— John