[vectorIntrinsics] lane order and byte order

John Rose john.r.rose at oracle.com
Fri May 31 07:28:13 UTC 2019


I just wrote the following.  It may be too much,
but that's the direction I'm erring in at the moment.
During CSR we can transfer some of the non-normative
observations into @apiNote or external documentation.

 * <h1><a id="lane-order"></a>Lane order and byte order</h1>
 *
 * The number of lane values stored in a given vector is referred to
 * as its {@linkplain #length() vector length} or {@code VLENGTH}.
 *
 * It is useful to consider vector lanes as ordered
 * <em>sequentially</em> from first to last, with the first lane
 * numbered {@code 0}, the next lane numbered {@code 1}, and so on to
 * the last lane numbered {@code VLENGTH-1}.  This is a temporal
 * order, where lower-numbered lanes are considered earlier than
 * higher-numbered (later) lanes.  This API uses these terms
 * in preference to spatial terms such as "left", "right", "high",
 * and "low".
 *
 * <p> Temporal terminology works well for vectors because they
 * (usually) represent small fixed-sized segments in a long sequence
 * of workload elements, where the workload is conceptually traversed
 * in time order from beginning to end.  (This is a mental model: it
 * does not exclude multicore divide-and-conquer techniques.)  Thus,
 * when a scalar loop is transformed into a vector loop, adjacent
 * scalar items (one earlier, one later) in the workload end up as
 * adjacent lanes in a single vector (again, one earlier, one later).
 * At a vector boundary, the last lane item in the earlier vector is
 * adjacent to (and just before) the first lane item in the
 * immediately following vector.
 *
 * <p> Vectors are also sometimes thought of in spatial terms, where
 * the first lane is placed at an edge of some virtual paper, and
 * subsequent lanes are presented in order next to it.  When using
 * spatial terms, all directions are equally plausible: Some vector
 * notations present lanes from left to right, and others from right
 * to left; still others present from top to bottom or vice versa.
 * Using the language of time (before, after, first, last) instead of
 * space (left, right, high, low) is often more likely to avoid
 * misunderstandings.
 *
 * <p> A second reason to prefer temporal to spatial language about
 * vector lanes is the fact that the terms "left", "right", "high" and
 * "low" are widely used to describe the relations between bits in
 * scalar values.  The leftmost or highest bit in a given type is
 * likely to be a sign bit, while the rightmost or lowest bit is
 * likely to be the arithmetically least significant, and so on.
 * Applying these terms to vector lanes risks confusion, however,
 * because it is relatively rare to find algorithms where, given two
 * adjacent vector lanes, one lane is somehow more arithmetically
 * significant than its neighbor, and even in those cases, there is no
 * general way to know which neighbor is the more significant.
 *
 * <p> Putting the terms together, we view the information structure
 * of a vector as a temporal sequence of lanes ("first", "next",
 * "earlier", "later", "last", etc.)  of bit-strings which are
 * internally ordered spatially (either "low" to "high" or "right" to
 * "left").  The primitive values in the lanes are decoded from these
 * bit-strings, in the usual way.  Most vector operations, like most
 * Java scalar operators, treat primitive values as atomic values, but
 * some operations reveal the internal bit-string structure.
 *
 * <p> When a vector is loaded from or stored into memory, the order
 * of vector lanes is <em>always consistent</em> with the inherent
 * ordering of the memory container.  This is true whether or not
 * individual lane elements are subject to "byte swapping" due to
 * details of byte order.  Thus, while the scalar lane elements of a
 * vector might be "byte swapped", the lanes themselves are never
 * reordered, except by an explicit method call that performs
 * cross-lane reordering.
 *
 * <p> When vector lane values are stored to Java variables of the
 * same type, byte swapping is performed if and only if the
 * implementation of the vector hardware requires such swapping.  It
 * is therefore unconditional and invisible.
 *
 * <p> As a useful fiction, this API presents a consistent illusion
 * that vector lane bytes are composed into larger lane scalars in
 * <em>little endian order</em>.  This means that storing a vector
 * into a Java byte array will reveal the successive bytes of the
 * vector lane values in little-endian order on all platforms,
 * regardless of native memory order, and also regardless of byte
 * order (if any) within vector unit registers.
 *
 * <p> This hypothetical little-endian ordering also appears when a
 * {@linkplain #reinterpret(VectorSpecies) reinterpret conversion} is
 * applied in such a way that lane boundaries are discarded and
 * redrawn differently, while maintaining vector bits unchanged.  In
 * such an operation, two adjacent lanes will contribute bytes to a
 * single new lane (or vice versa), and the sequential order of the
 * two lanes will determine the arithmetic order of the bytes in the
 * single lane.  In this case, the little-endian convention provides
 * portable results, so that on all platforms earlier lanes tend to
 * contribute lower (rightward) bits, and later lanes tend to
 * contribute higher (leftward) bits.  The {@linkplain #asByteVector
 * reinterpreting conversions} between {@link ByteVector}s and the
 * other non-byte vectors use this convention to clarify their
 * portable semantics.
 *
 * <p> The little-endian fiction for relating lane order to per-lane
 * byte order is slightly preferable to an equivalent big-endian
 * fiction, because some related formulas are much simpler,
 * specifically those which renumber bytes after lane structure
 * changes.  The earliest byte is invariantly earliest across all lane
 * structure changes, but only if little-endian conventions are used.
 * The root cause of this is that bytes in scalars are numbered from
 * the least significant (rightmost) to the most significant
 * (leftmost), and almost never vice-versa.  If we habitually numbered
 * sign bits as zero (as on some computers) then this API would reach
 * for big-endian fictions to create unified addressing of vector
 * bytes.
 *
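
To make the little-endian convention concrete, here is a rough sketch
that emulates it with plain java.nio.ByteBuffer rather than the vector
API itself.  The class and method names (LaneOrderSketch,
storeIntLanes, reinterpretAsLongLanes, littleEndianIndex,
bigEndianIndex) are made up for illustration and are not part of any
proposed API.  It stores four int lanes into a byte array, redraws the
lane boundaries as two long lanes, and then compares the byte
renumbering formulas under the two possible conventions.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LaneOrderSketch {
    // Hypothetical helper: store int lanes into a byte array, lane 0 first,
    // with the bytes of each lane in little-endian order.
    static byte[] storeIntLanes(int[] lanes) {
        ByteBuffer bb = ByteBuffer.allocate(lanes.length * Integer.BYTES)
                                  .order(ByteOrder.LITTLE_ENDIAN);
        for (int lane : lanes) {
            bb.putInt(lane);
        }
        return bb.array();
    }

    // Hypothetical helper: redraw lane boundaries over the same bytes,
    // reading them back as long lanes (again little-endian).
    static long[] reinterpretAsLongLanes(byte[] bytes) {
        ByteBuffer bb = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        long[] lanes = new long[bytes.length / Long.BYTES];
        for (int i = 0; i < lanes.length; i++) {
            lanes[i] = bb.getLong();
        }
        return lanes;
    }

    // Vector-wide index of a byte, where byteInLane 0 is the least
    // significant byte of its lane, under the little-endian convention.
    static int littleEndianIndex(int lane, int byteInLane, int laneSize) {
        return lane * laneSize + byteInLane;
    }

    // The same byte under a big-endian convention; the result depends
    // on laneSize even for lane 0, byte 0.
    static int bigEndianIndex(int lane, int byteInLane, int laneSize) {
        return lane * laneSize + (laneSize - 1 - byteInLane);
    }

    public static void main(String[] args) {
        int[] lanes = {0x03020100, 0x07060504, 0x0B0A0908, 0x0F0E0D0C};
        byte[] bytes = storeIntLanes(lanes);   // bytes are 0, 1, 2, ..., 15

        // Earlier int lanes contribute the lower bits of each long lane.
        for (long lane : reinterpretAsLongLanes(bytes)) {
            System.out.println(String.format("%016x", lane));
        }
        // prints 0706050403020100
        // prints 0f0e0d0c0b0a0908

        // The least significant byte of lane 0 stays at index 0 only
        // under the little-endian convention.
        for (int laneSize : new int[] {1, 2, 4, 8}) {
            System.out.println(laneSize + ": LE=" + littleEndianIndex(0, 0, laneSize)
                               + " BE=" + bigEndianIndex(0, 0, laneSize));
        }
    }
}
```

The last loop is the point of the final paragraph above: with
little-endian numbering the byte index is just lane * laneSize +
byteInLane, and the earliest byte keeps index 0 however the lanes are
redrawn, while the big-endian formula needs an extra laneSize-dependent
correction.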


