[vector] java doc for iota()

John Rose john.r.rose at oracle.com
Wed Nov 13 20:18:41 UTC 2019


On Nov 13, 2019, at 11:00 AM, Viswanathan, Sandhya <sandhya.viswanathan at intel.com> wrote:
> 
> We need advice from John on partial wrapping and the 256-element vector length.

Here’s my advice:  The design of shuffles to have *exactly one* partial bit per lane
is the thing I explained in my previous mail.  This design is (I claim) portable and
useful, and should be implemented as is.

The partial wrap state is useful to encode overflows, which otherwise (in the
case of mandatory full wrapping) would be suppressed, masking bugs.

The partial wrap state is *also* useful to intentionally encode the shuffling
of *two* vectors, a frequent operation and one supported by some hardware.

The implication is that if a vector’s lane size is too small to represent the full
range of shuffle lane values, the vector itself cannot be used as representation
for the shuffle.
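
To make the two uses and the lane-size implication concrete, here is a minimal sketch in plain Java. It is a model for discussion, not the Vector API implementation. It assumes one possible encoding of the partial bit: an out-of-range index is wrapped into [0, VLENGTH) and then offset by -VLENGTH, so the sign of the lane value acts as the partial bit. The class and method names are hypothetical.

```java
// Illustrative model only; names and encoding are assumptions, not the Vector API.
final class PartialWrapSketch {

    // Partially wrap one source index for a vector of vlen lanes (vlen a power of two).
    // In-range indexes pass through; out-of-range indexes are wrapped into [0, vlen)
    // and then moved to [-vlen, -1], so a negative value means "partial bit set".
    static int partialWrap(int index, int vlen) {
        if (index >= 0 && index < vlen) {
            return index;
        }
        return (index & (vlen - 1)) - vlen;
    }

    // Use 1: an overflow stays visible instead of being silently wrapped away.
    static boolean hasOverflow(int[] shuffle) {
        for (int s : shuffle) {
            if (s < 0) {
                return true;
            }
        }
        return false;
    }

    // Use 2: a two-vector shuffle, where the partial bit selects between a and b.
    static int[] rearrange(int[] a, int[] b, int[] shuffle) {
        int vlen = a.length;
        int[] r = new int[vlen];
        for (int i = 0; i < vlen; i++) {
            int s = shuffle[i];
            r[i] = (s >= 0) ? a[s] : b[s + vlen];
        }
        return r;
    }

    // Partially wrapped values span [-vlen, vlen - 1]; a signed byte lane can
    // hold that range only while vlen <= 128.
    static boolean fitsInByteLane(int vlen) {
        return -vlen >= Byte.MIN_VALUE && vlen - 1 <= Byte.MAX_VALUE;
    }
}
```

Under this encoding, fitsInByteLane(256) is false, which is the case the next paragraph addresses.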

SVE allows byte vectors of length 256, so in this case a partially wrapped
shuffle would require an extra 256 bits, somewhere.  I suggest using another
byte vector, for now, containing upper bytes, and later on (when we get smarter
about predicate registers) using a mask.

(Another way to do this would be 16-bit lanes in the shuffle.  You still need
two vectors, but the bits are organized differently.  I think this format is
less desirable, because it makes it harder to optimize the common case
where the shuffle is already known to be fully wrapped.  In that common
case, the companion vector, containing the high bytes, is a constant zero.)
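
A rough sketch of the two-byte-vector layout, again in plain Java with arrays standing in for vectors. The class name, the field names, and the choice of -1 as the "partial" marker in the companion vector are all assumptions made for illustration.

```java
// Illustrative model only; the class and its layout are assumptions for discussion.
final class SplitShuffleSketch {
    static final int VLEN = 256; // byte lanes in a 2048-bit SVE vector

    final byte[] low = new byte[VLEN];  // wrapped index 0..255, read as unsigned
    final byte[] high = new byte[VLEN]; // 0 = in range, -1 = partial bit set

    SplitShuffleSketch(int[] indexes) {
        for (int i = 0; i < VLEN; i++) {
            int idx = indexes[i];
            low[i] = (byte) idx; // keeps the low 8 bits, i.e. idx mod 256
            high[i] = (byte) ((idx >= 0 && idx < VLEN) ? 0 : -1);
        }
    }

    // Common case: the companion vector is all zero, so only `low` matters.
    boolean isFullyWrapped() {
        for (byte h : high) {
            if (h != 0) {
                return false;
            }
        }
        return true;
    }

    // Recombine both halves into a partially wrapped value in [-256, 255].
    int source(int lane) {
        int wrapped = low[lane] & 0xFF;
        return high[lane] == 0 ? wrapped : wrapped - VLEN;
    }
}
```

In this layout a fully wrapped shuffle is just the low vector plus a constant zero companion, so code that knows the shuffle is fully wrapped can ignore the companion entirely. With 16-bit lanes the same information would be interleaved across both vectors.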

We don’t have a framework now for working with vector tuples.  I regard
this as technical debt to be paid off at some point, probably after we convert
to value types.  So for the immediate future, I suggest that the 256-lane byte
vector shuffle simply not be optimized until that conversion happens.

— John



