[vectorIntrinsics] conversions and partial results
John Rose
john.r.rose at oracle.com
Fri May 31 07:26:00 UTC 2019
I just wrote the following javadoc for Vector,
concerning the difficult problem of expanding
and contracting conversion operations.
(It may be too much. We should let the CSR reviewers
help us to cut it down. For now more is better.)
Does this seem like a good direction to go with
these sorts of conversions? Is there a better one?
— John
———————————
+ * A <em>lane-wise conversion</em> operation takes one input vector,
+ * distributing a unary scalar conversion operator across the lanes,
+ * and produces a resulting vector of the converted values, or as
+ * many of them as can be fit into the required shape.
+ *
+ * <p> Unlike other lane-wise operations, conversions can change lane
+ * type, from the input "domain" type to the output "range" type. The
+ * lane size may change along with the type. In order to manage the
+ * size changes, lane-wise conversion methods can produce <em>partial
+ * results</em>, under the control of a {@code part} parameter, which
+ * is {@linkplain Vector.html#conversions explained elsewhere}.
+ *
+ * <p> The following pseudocode expresses the behavior of this operation
+ * category, including the handling of partial results:
+ *
+ * <pre>{@code
+ * ETYPE2 scalar_conversion_op(ETYPE s);
+ * EVector a = ...;
+ * int part = ...;
+ * VectorSpecies<E> dom = a.species();
+ * VectorSpecies<E2> ran = dom.withLanes(ETYPE2.class);
+ * assert dom.vectorShape() == ran.vectorShape();
+ * int domlen = dom.vectorLength();
+ * int ranlen = ran.vectorLength();
+ * ETYPE2[] ar = new ETYPE2[ranlen];  // unused lanes default to zero
+ * if (domlen == ranlen) {  // in-place
+ *     assert part == 0;
+ *     assert dom.elementSize() == ran.elementSize();
+ *     for (int i = 0; i < domlen; i++) {
+ *         ar[i] = scalar_conversion_op(a.lane(i));
+ *     }
+ * } else if (domlen > ranlen) {  // expanding
+ *     assert ran.elementSize() > dom.elementSize();
+ *     int origin = decodePartForExpand(dom, ran, part);
+ *     for (int i = 0; i < ranlen; i++) {
+ *         ETYPE s = a.lane(origin + i);
+ *         ar[i] = scalar_conversion_op(s);
+ *     }
+ * } else {  // (domlen < ranlen) contracting
+ *     assert ran.elementSize() < dom.elementSize();
+ *     int origin = decodePartForContract(dom, ran, part);
+ *     for (int i = 0; i < domlen; i++) {
+ *         ETYPE s = a.lane(i);
+ *         ar[origin + i] = scalar_conversion_op(s);
+ *     }
+ * }
+ * E2Vector r = E2Vector.fromArray(ran, ar, 0);
+ * }</pre>
+ * </li>
———————————
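For anyone who wants to experiment with these semantics outside the JDK, here is a minimal scalar sketch of the pseudocode above, specialized to byte and int lanes (so M=4). The class and method names are hypothetical and not part of the proposed API. It assumes the origin lane is part*B, where B is the smaller of the two lane counts, which is one plausible reading of decodePartForExpand and decodePartForContract.

```java
/** Scalar model of partial-result conversions; all names here are hypothetical. */
final class ConversionSketch {
    static final int M = Integer.BYTES / Byte.BYTES;  // size ratio of int to byte, 4

    /** Expanding byte->int: converts the block of B input lanes chosen by part. */
    static int[] expandByteToInt(byte[] a, int part) {
        if (a.length % M != 0) throw new IllegalArgumentException("length not a multiple of M");
        if (part < 0 || part >= M) throw new IndexOutOfBoundsException("part=" + part);
        int ranlen = a.length / M;          // B: the output has fewer, larger lanes
        int origin = part * ranlen;         // assumed origin lane in the input
        int[] ar = new int[ranlen];
        for (int i = 0; i < ranlen; i++) {
            ar[i] = a[origin + i];          // sign-extending scalar conversion
        }
        return ar;
    }

    /** Contracting int->byte: steers all B input lanes to the block chosen by part. */
    static byte[] contractIntToByte(int[] a, int part) {
        if (part < 0 || part >= M) throw new IndexOutOfBoundsException("part=" + part);
        int domlen = a.length;              // B: the input has fewer, larger lanes
        int origin = part * domlen;         // assumed origin lane in the output
        byte[] ar = new byte[domlen * M];   // unused lanes stay zero
        for (int i = 0; i < domlen; i++) {
            ar[origin + i] = (byte) a[i];   // truncating scalar conversion
        }
        return ar;
    }
}
```

With an 8-lane byte input, each call to expandByteToInt yields a 2-lane int result, so four calls (part 0 through 3) are needed to see every converted value.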
* <h1><a id="conversions">Conversions and Partial Results</a></h1>
* This API provides a set of lane-wise conversion operators.
* They are described by constants of type
* {@link VectorOperators.Conversion}, which are passed as
* arguments to the
* {@link Vector#convert(VectorOperators.Conversion,int) Vector.convert()}
* method.
*
* <p> Every conversion operator has a specified
* {@linkplain VectorOperators.Conversion#domainType() domain type} and
* {@linkplain VectorOperators.Conversion#rangeType() range type},
* which exactly match the lane types of the input and output
* vectors.
*
* <p> A conversion operator is classified as (respectively) in-place,
* expanding, or contracting, depending on whether the bit-size of its
* domain type is (respectively) equal, less than, or greater than the
* bit-size of its range type.
*
* An expanding conversion, such as {@code short} to {@code long},
* takes a scalar value and represents it in a larger format (always
* with some information redundancy). A contracting conversion, such
* as {@code double} to {@code float}, takes a scalar value and
* represents it in a smaller format (always with some information
* loss). Some in-place conversions may also include information
* loss, such as with conversions between {@code long} and {@code
* double}, and also {@code int} and {@code float}. Expanding
* conversions never "lose bits", but they may sometimes disturb the
* sign of a value, if a domain or range is unsigned.
*
* <p> This classification is important, because, unless otherwise
* documented, conversion operations <em>never change vector
* shape</em>, regardless of how they may change <em>lane sizes</em>.
*
* Therefore an <em>expanding</em> conversion cannot store all of its
* results in its output vector, because the output vector has fewer
* lanes of larger size, in order to have the same overall bit-size as
* its input.
*
* Likewise, a contracting conversion must store its relatively small
* results into a subset of the lanes of the output vector, defaulting
* the unused lanes to zero.
*
* In all cases, the number of lane values actually computed by a
* conversion of any kind is the smaller {@code VLENGTH} of the input
* and output vectors. We will call this important number {@code B},
* the block size of the conversion. If you need more than {@code B}
* values from a vector conversion operation, you must run the
* operation more than once.
*
* <p> Expanding and contracting conversions are further characterized
* by a factor {@code M} which is the (integer) ratio of the larger to
* the smaller of the domain and range type sizes. Since all element
* sizes are currently powers of two, one size always divides the
* other. In fact, {@code M} is {@code 2}, {@code 4}, or {@code 8}.
*
* As an example, a conversion from {@code byte} to {@code long}
* ({@code M=8}) will discard 87.5% of the input values in order to
* convert the remaining 12.5% into the roomy {@code long} lanes of
* the output vector. The inverse conversion will convert back all of
* the large results, but will waste 87.5% of the lanes in the output
* vector.
*
* Only <em>in-place</em> conversions ({@code M=1}) deliver all of
* their results in one output vector, without wasting lanes.
*
* <p> Given the ratio {@code M}, a second parameter called the block
* size {@code B} is derived from the {@code VSIZE} of the
*
* <p> To help manage the multiple outputs of expanding conversions,
* and merge the multiple inputs of the inverse contracting
* conversions, the conversion methods feature an additional parameter
* called {@code part}, which selects partial results from expansions,
* and also steers the results of contractions in the opposite
* direction. The value {@code part} is processed as follows
* for each kind of conversion:
*
* <ul>
* <li> expanding by {@code M}: {@code part} must be in the range
* {@code [0..M-1]}, and selects the block of {@code B} input lanes
* starting at the <em>origin lane</em> at {@code part*VLENGTH/M}
* (that is, {@code part*B}), where {@code VLENGTH} is the length of
* the <em>input</em>.
*
* <li> contracting by {@code M}: {@code part} must be in the range
* {@code [0..M-1]}, and steers all {@code B} input lanes into the
* output block starting at the <em>origin lane</em> at
* {@code part*VLENGTH/M} (that is, {@code part*B}), where
* {@code VLENGTH} is the length of the <em>output</em>.
*
* <li> in-place ({@code M=1}): {@code part} must be zero.
* The {@code VLENGTH} of both vectors is {@code B}, and the
* <em>origin lane</em> is always the first lane.
*
* </ul>
*
* <p> Thus, an expanding conversion can iterate over all possible
* output blocks (selected by {@code part} values) to obtain the full
* set of converted values, as a sequence of {@code M} output
* vectors of length {@code B}. If the reverse operation is
* necessary, a series of contracting conversions can iterate over all
* possible input blocks (again selected by {@code part} values)
* and merge the results into a single vector in which every lane
* holds a result value. In all cases, zero is a valid
* {@code part} argument, if the user accepts the resulting
* partial pattern of results.
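To make the iteration over part values concrete, here is a small scalar sketch of a full round trip between short and long lanes (M=4). It is only a model under stated assumptions: the class and method names are hypothetical, vectors are stood in for by plain arrays, and the origin lane is taken to be part*B. Because unused lanes of a contracted result are zero, writing each block at its origin has the same effect as merging the partial vectors lane-wise.

```java
/** Hypothetical scalar model: visit every part of a short<->long conversion. */
final class PartLoopSketch {
    static final int M = Long.BYTES / Short.BYTES;  // size ratio, 4

    /** Expand: part p converts input lanes [p*B, p*B+B) into one full output vector. */
    static long[][] expandAll(short[] in) {
        if (in.length % M != 0) throw new IllegalArgumentException("length not a multiple of M");
        int b = in.length / M;               // block size B, the output VLENGTH
        long[][] parts = new long[M][b];     // M output vectors of length B
        for (int part = 0; part < M; part++) {
            int origin = part * b;           // assumed origin lane in the input
            for (int i = 0; i < b; i++) {
                parts[part][i] = in[origin + i];
            }
        }
        return parts;
    }

    /** Contract: part p steers B converted lanes into output lanes [p*B, p*B+B). */
    static short[] contractAll(long[][] parts) {
        if (parts.length != M) throw new IllegalArgumentException("expected M input vectors");
        int b = parts[0].length;             // block size B, the input VLENGTH
        short[] out = new short[M * b];      // all lanes start as zero
        for (int part = 0; part < M; part++) {
            int origin = part * b;           // assumed origin lane in the output
            for (int i = 0; i < b; i++) {
                out[origin + i] = (short) parts[part][i];
            }
        }
        return out;
    }
}
```

Calling contractAll(expandAll(v)) returns an array equal to v for any short array whose length is a multiple of 4, since every long produced by the expansion fits back into a short.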