Improve the efficiency of VectorShuffle usage
Quân Anh Mai
anhmdq at gmail.com
Wed Dec 18 07:27:34 UTC 2024
Hi,
I want to discuss how to improve the efficiency of creating and using a
VectorShuffle.
Currently, when a VectorShuffle is created, it wraps all out-of-bounds
indices into the interval [-VLENGTH, -1]. However, this is wasted work
most of the time, because the most frequent operation on a VectorShuffle,
rearrange on a single vector, wraps all indices to the interval
[0, VLENGTH - 1] regardless. The cost may be noticeable in lookup
algorithms such as UTF-8 validation, where the VectorShuffle is
loop-variant and is recomputed in each iteration.
This is a consequence of the fact that VectorShuffle is used for 2
operations: 1-operand rearrange and 2-operand rearrange.
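The redundancy described above can be illustrated with a scalar model in
plain Java. This is only a sketch: the class and method names are made up
for illustration, the formula for the creation-time step is an assumption
that VLENGTH is a power of two, and none of this is the actual JDK
implementation.

```java
// Simplified scalar model of index wrapping; NOT the JDK implementation.
// Class and method names are hypothetical.
final class ShuffleWrapModel {
    // Creation-time step: in-range indices are kept, out-of-bounds ones
    // are mapped into [-vlength, -1]. Assumes vlength is a power of two.
    static int partialWrap(int index, int vlength) {
        if (index >= 0 && index < vlength) {
            return index;
        }
        return (index & (vlength - 1)) - vlength;
    }

    // Use-time step of a 1-operand rearrange: every index ends up in
    // [0, vlength - 1]. Assumes vlength is a power of two.
    static int fullWrap(int index, int vlength) {
        return index & (vlength - 1);
    }

    public static void main(String[] args) {
        int vlength = 8;
        int[] raw = {0, 7, 9, -3, 20};
        for (int r : raw) {
            int twoSteps = fullWrap(partialWrap(r, vlength), vlength);
            int oneStep = fullWrap(r, vlength);
            System.out.println(r + ": two steps = " + twoSteps
                    + ", one step = " + oneStep);
        }
    }
}
```

In this model the two-step result always equals the one-step result, so
the creation-time wrapping contributes nothing when the shuffle is only
used for a 1-operand rearrange.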
As a result, I propose we add a field to VectorShuffle to discriminate
instances which are used for 1-operand rearrange and instances which are
used for 2-operand rearrange. For instances which are created for 1-operand
rearrange, all indices are wrapped to [0, VLENGTH - 1], while for ones which
are created for 2-operand rearrange, the indices are wrapped to
[0, 2 * VLENGTH - 1]. An instance created for one operation must not be
used for the other.
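A minimal sketch of what such a discriminating field could look like, again
as a scalar model in plain Java. Everything here is hypothetical: the class
TaggedShuffle, the Kind enum and the exception type are invented for
illustration. The sketch also assumes that, in a 2-operand rearrange,
indices below VLENGTH select from the first vector and the rest select from
the second, which the proposal does not spell out.

```java
import java.util.Arrays;

// Hypothetical scalar model of the proposal; names are NOT JDK API.
final class TaggedShuffle {
    enum Kind { ONE_OPERAND, TWO_OPERAND }

    private final Kind kind;
    private final int[] indices;

    private TaggedShuffle(Kind kind, int[] indices) {
        this.kind = kind;
        this.indices = indices;
    }

    // Wraps once, at creation, to the range the intended use needs.
    // VLENGTH is taken to be the number of indices supplied.
    static TaggedShuffle of(Kind kind, int... raw) {
        int vlength = raw.length;
        int range = (kind == Kind.ONE_OPERAND) ? vlength : 2 * vlength;
        int[] wrapped = new int[vlength];
        for (int i = 0; i < vlength; i++) {
            wrapped[i] = Math.floorMod(raw[i], range);
        }
        return new TaggedShuffle(kind, wrapped);
    }

    // 1-operand rearrange: no further wrapping is needed.
    int[] rearrange(int[] v) {
        require(Kind.ONE_OPERAND);
        int[] out = new int[indices.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = v[indices[i]];
        }
        return out;
    }

    // 2-operand rearrange. Assumption: indices below VLENGTH select
    // from v, the remaining ones select from w.
    int[] rearrange(int[] v, int[] w) {
        require(Kind.TWO_OPERAND);
        int n = indices.length;
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            int idx = indices[i];
            out[i] = idx < n ? v[idx] : w[idx - n];
        }
        return out;
    }

    private void require(Kind expected) {
        if (kind != expected) {
            throw new IllegalStateException("shuffle was created for " + kind);
        }
    }

    public static void main(String[] args) {
        int[] v = {10, 20, 30, 40};
        int[] w = {50, 60, 70, 80};
        TaggedShuffle one = of(Kind.ONE_OPERAND, 9, -1, 2, 3);
        TaggedShuffle two = of(Kind.TWO_OPERAND, 5, -1, 2, 0);
        System.out.println(Arrays.toString(one.rearrange(v)));
        System.out.println(Arrays.toString(two.rearrange(v, w)));
    }
}
```

With a field, misuse is only detected at run time, when a shuffle of the
wrong kind reaches a rearrange call.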
This distinction is preferable to simply changing the VectorShuffle
creation semantics so that elements are wrapped to the interval [0, 2 *
VLENGTH - 1], for 3 reasons:
- It aligns the wrapping in VectorShuffle creation with the wrapping in
VectorShuffle usage, which removes 1 unnecessary wrapping step.
- It is necessary to support 2048-bit SVE rearrange. For those vectors, a
1-operand byte rearrange is sensible while a 2-operand byte rearrange is
not: a table built from 2 vectors has 512 elements, which is more than the
256 values a byte index can represent. While we can catch the usage in
2-operand rearrange, the semantics are muddy for other operations such as
toVector or toString, because we inevitably lose information when casting
the elements to the implementation-detail type. It would be confusing if
elements were wrapped to the interval [0, 2 * VLENGTH - 1] for all other
species but, for 2048-bit byte shuffles only, to the interval
[0, VLENGTH - 1]. As a result, forbidding the creation of such a
VectorShuffle in the first place is the more sensible choice.
- Using a VectorShuffle for both 1-operand rearranges and 2-operand
rearranges is questionable, as they have different semantics. If users
have 1 index vector that can sensibly be converted to both a 1-operand
shuffle and a 2-operand shuffle, then performing 2 conversions seems to be
the more reasonable thing to do.
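The arithmetic behind the 2048-bit byte case in the second point can be
checked with a few lines of plain Java. The class and method names below
are hypothetical and only serve the illustration; the sketch simply counts
how many distinct index values a table needs and compares that with the
number of distinct values a byte can hold.

```java
// Hypothetical helper for the arithmetic only; names are NOT JDK API.
final class ByteIndexRange {
    // Number of byte lanes in a vector of the given size in bits.
    static int byteLanes(int vectorBits) {
        return vectorBits / Byte.SIZE;
    }

    // True if a table made of numTable vectors needs no more distinct
    // index values than a byte provides (2^8 = 256).
    static boolean indicesFitInByte(int vectorBits, int numTable) {
        return numTable * byteLanes(vectorBits) <= (1 << Byte.SIZE);
    }

    public static void main(String[] args) {
        System.out.println("2048-bit, 1 table: "
                + indicesFitInByte(2048, 1));
        System.out.println("2048-bit, 2 tables: "
                + indicesFitInByte(2048, 2));
    }
}
```

For 2048-bit byte vectors a single table needs exactly 256 index values,
which a byte can still distinguish, while two tables need 512, which it
cannot.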
API-wise, I propose removing Vector::toShuffle and adding 8 methods:
<T> VectorShuffle<T> ByteVector::toShuffle(VectorSpecies<T> species)
<T> VectorShuffle<T> ByteVector::toLookUpIndices(VectorSpecies<T> species, int numTable)
<T> VectorShuffle<T> ShortVector::toShuffle(VectorSpecies<T> species)
<T> VectorShuffle<T> ShortVector::toLookUpIndices(VectorSpecies<T> species, int numTable)
<T> VectorShuffle<T> IntVector::toShuffle(VectorSpecies<T> species)
<T> VectorShuffle<T> IntVector::toLookUpIndices(VectorSpecies<T> species, int numTable)
<T> VectorShuffle<T> LongVector::toShuffle(VectorSpecies<T> species)
<T> VectorShuffle<T> LongVector::toLookUpIndices(VectorSpecies<T> species, int numTable)
Here species must have the same length as the receiver and numTable must
be 1 or 2. XXXVector::toShuffle(species) is equivalent to
XXXVector::toLookUpIndices(species, 1).
In the opposite direction, I also propose removing VectorShuffle::toVector
and adding VectorShuffle::toVector(VectorSpecies species) where species
must be an integral vector species with the same length as the receiver.
Alternatively, we can split VectorShuffle into VectorShuffle and
VectorLookUpIndices. The former can only be used for 1-operand rearrange
while the latter can only be used for 2-operand rearrange. Personally, I
prefer this approach as it gives us stronger type safety than
discriminating VectorShuffle based on a field.
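To make the type-safety argument concrete, here is a scalar sketch in plain
Java of the split into two types. The nested classes Shuffle and
LookUpIndices stand in for the proposed VectorShuffle and
VectorLookUpIndices; their shape, the static rearrange helpers, and the
rule that indices below VLENGTH select from the first vector are all
assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical scalar model of the two-type alternative; NOT JDK API.
final class SplitTypesModel {
    // Indices for a 1-operand rearrange, wrapped to [0, VLENGTH - 1].
    static final class Shuffle {
        final int[] indices;

        Shuffle(int... raw) {
            indices = new int[raw.length];
            for (int i = 0; i < raw.length; i++) {
                indices[i] = Math.floorMod(raw[i], raw.length);
            }
        }
    }

    // Indices for a 2-operand rearrange, wrapped to [0, 2 * VLENGTH - 1].
    static final class LookUpIndices {
        final int[] indices;

        LookUpIndices(int... raw) {
            indices = new int[raw.length];
            for (int i = 0; i < raw.length; i++) {
                indices[i] = Math.floorMod(raw[i], 2 * raw.length);
            }
        }
    }

    // Only accepts a Shuffle.
    static int[] rearrange(int[] v, Shuffle s) {
        int[] out = new int[s.indices.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = v[s.indices[i]];
        }
        return out;
    }

    // Only accepts LookUpIndices. Assumption: indices below VLENGTH
    // select from v, the remaining ones select from w.
    static int[] rearrange(int[] v, int[] w, LookUpIndices idx) {
        int n = idx.indices.length;
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            int k = idx.indices[i];
            out[i] = k < n ? v[k] : w[k - n];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] v = {10, 20, 30, 40};
        int[] w = {50, 60, 70, 80};
        System.out.println(Arrays.toString(
                rearrange(v, new Shuffle(9, -1, 2, 3))));
        System.out.println(Arrays.toString(
                rearrange(v, w, new LookUpIndices(5, -1, 2, 0))));
        // rearrange(v, w, new Shuffle(0, 1, 2, 3)) would not compile.
    }
}
```

In this model, passing the wrong kind of index object is rejected by the
compiler rather than by a run-time check on a field.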
Please share your thoughts, thanks a lot,
Quan Anh
More information about the panama-dev
mailing list