[vectorIntrinsics] lane order and byte order
John Rose
john.r.rose at oracle.com
Fri May 31 07:28:13 UTC 2019
I just wrote the following. It may be too much,
but that's the direction I'm erring in at the moment.
During CSR we can transfer some of the non-normative
observations into @apiNote or external documentation.
* <h1><a id="lane-order">Lane order and byte order</a></h1>
*
* The number of lane values stored in a given vector is referred to
* as its {@linkplain #length() vector length} or {@code VLENGTH}.
*
* It is useful to consider vector lanes as ordered
* <em>sequentially</em> from first to last, with the first lane
* numbered {@code 0}, the next lane numbered {@code 1}, and so on to
* the last lane numbered {@code VLENGTH-1}. This is a temporal
* order, where lower-numbered lanes are considered earlier than
* higher-numbered (later) lanes. This API uses these terms
* in preference to spatial terms such as "left", "right", "high",
* and "low".
*
* <p> Temporal terminology works well for vectors because they
* (usually) represent small fixed-sized segments in a long sequence
* of workload elements, where the workload is conceptually traversed
* in time order from beginning to end. (This is a mental model: it
* does not exclude multicore divide-and-conquer techniques.) Thus,
* when a scalar loop is transformed into a vector loop, adjacent
* scalar items (one earlier, one later) in the workload end up as
* adjacent lanes in a single vector (again, one earlier, one later).
* At a vector boundary, the last lane item in the earlier vector is
* adjacent to (and just before) the first lane item in the
* immediately following vector.
*
* <p> Vectors are also sometimes thought of in spatial terms, where
* the first lane is placed at an edge of some virtual paper, and
* subsequent lanes are presented in order next to it. When using
* spatial terms, all directions are equally plausible: Some vector
* notations present lanes from left to right, and others from right
* to left; still others present from top to bottom or vice versa.
* Using the language of time (before, after, first, last) instead of
* space (left, right, high, low) is often more likely to avoid
* misunderstandings.
*
* <p> A second reason to prefer temporal to spatial language about
* vector lanes is the fact that the terms "left", "right", "high" and
* "low" are widely used to describe the relations between bits in
* scalar values. The leftmost or highest bit in a given type is
* likely to be a sign bit, while the rightmost or lowest bit is
* likely to be the arithmetically least significant, and so on.
* Applying these terms to vector lanes risks confusion, however,
* because it is relatively rare to find algorithms where, given two
* adjacent vector lanes, one lane is somehow more arithmetically
* significant than its neighbor, and even in those cases, there is no
* general way to know which neighbor is the more significant.
*
* <p> Putting the terms together, we view the information structure
* of a vector as a temporal sequence of lanes ("first", "next",
* "earlier", "later", "last", etc.) of bit-strings which are
* internally ordered spatially (either "low" to "high" or "right" to
* "left"). The primitive values in the lanes are decoded from these
* bit-strings, in the usual way. Most vector operations, like most
* Java scalar operators, treat primitive values as atomic values, but
* some operations reveal the internal bit-string structure.
*
* <p> When a vector is loaded from or stored into memory, the order
* of vector lanes is <em>always consistent</em> with the inherent
* ordering of the memory container. This is true whether or not
* individual lane elements are subject to "byte swapping" due to
* details of byte order. Thus, while the scalar lane elements of
* a vector might be "byte swapped", the lanes themselves are never
* reordered, except by an explicit method call that performs
* cross-lane reordering.
*
* <p> When vector lane values are stored to Java variables of the
* same type, byte swapping is performed if and only if the
* implementation of the vector hardware requires such swapping. It
* is therefore unconditional and invisible.
*
* <p> As a useful fiction, this API presents a consistent illusion
* that vector lane bytes are composed into larger lane scalars in
* <em>little endian order</em>. This means that storing a vector
* into a Java byte array will reveal the successive bytes of the
* vector lane values in little-endian order on all platforms,
* regardless of native memory order, and also regardless of byte
* order (if any) within vector unit registers.
*
* <p> This hypothetical little-endian ordering also appears when a
* {@linkplain #reinterpret(VectorSpecies) reinterpret conversion} is
* applied in such a way that lane boundaries are discarded and
* redrawn differently, while maintaining vector bits unchanged. In
* such an operation, two adjacent lanes will contribute bytes to a
* single new lane (or vice versa), and the sequential order of the
* two lanes will determine the arithmetic order of the bytes in the
* single lane. In this case, the little-endian convention provides
* portable results, so that on all platforms earlier lanes tend to
* contribute lower (rightward) bits, and later lanes tend to
* contribute higher (leftward) bits. The {@linkplain #asByteVector
* reinterpreting conversions} between {@link ByteVector}s and the
* other non-byte vectors use this convention to clarify their
* portable semantics.
*
* <p> The little-endian fiction for relating lane order to per-lane
* byte order is slightly preferable to an equivalent big-endian
* fiction, because some related formulas are much simpler,
* specifically those which renumber bytes after lane structure
* changes. The earliest byte is invariantly earliest across all lane
* structure changes, but only if little-endian conventions are used.
* The root cause of this is that bytes in scalars are numbered from
* the least significant (rightmost) to the most significant
* (leftmost), and almost never vice versa. If we habitually numbered
* sign bits as zero (as on some computers) then this API would reach
* for big-endian fictions to create unified addressing of vector
* bytes.
*
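
To make the little-endian fiction concrete, here is a small sketch that models it with plain java.nio buffers rather than the vector API itself. Everything in it (the class LaneOrderSketch and its helper methods) is hypothetical and exists only for illustration. It assumes four int lanes stored lane 0 first, then redraws the lane boundaries over the same bytes to get two long lanes, as a reinterpret conversion would.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical model of the convention described above; not part of the vector API.
final class LaneOrderSketch {

    // Store int lanes into a byte array: lane 0 first, and within each
    // lane the least significant byte first (little-endian).
    static byte[] intLanesToBytes(int[] lanes) {
        ByteBuffer bb = ByteBuffer.allocate(lanes.length * Integer.BYTES)
                                  .order(ByteOrder.LITTLE_ENDIAN);
        for (int lane : lanes) {
            bb.putInt(lane);
        }
        return bb.array();
    }

    // Redraw lane boundaries over the same bytes: every 8 bytes form one
    // long lane, with earlier bytes supplying the lower-order bits.
    static long[] bytesToLongLanes(byte[] bytes) {
        ByteBuffer bb = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        long[] lanes = new long[bytes.length / Long.BYTES];
        for (int i = 0; i < lanes.length; i++) {
            lanes[i] = bb.getLong();
        }
        return lanes;
    }

    // Position, in the whole vector, of byte k of a lane (k = 0 is the
    // least significant byte), for lanes that are laneSize bytes wide.
    static int byteIndex(int lane, int k, int laneSize) {
        return lane * laneSize + k;
    }

    public static void main(String[] args) {
        int[] ints = {0x04030201, 0x08070605, 0x0C0B0A09, 0x100F0E0D};
        byte[] bytes = intLanesToBytes(ints);
        long[] longs = bytesToLongLanes(bytes);

        // The bytes come out in ascending order, 1 through 16.
        System.out.println(Arrays.toString(bytes));
        // Earlier int lanes land in the low halves of the long lanes:
        // 0807060504030201 100f0e0d0c0b0a09
        System.out.printf("%016x %016x%n", longs[0], longs[1]);
    }
}
```

In this model the renumbering formula is just lane * laneSize + k. The byte at vector position 4 is byte 0 of int lane 1 and also byte 4 of long lane 0, so its position does not move when the lane structure changes. A big-endian model would need the lane size in the per-byte term as well, which is the extra complexity the last paragraph alludes to.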