[foreign-memaccess] RFR 8228447: Remove kinds from value layouts

Mon Jul 22 23:38:27 UTC 2019

On Jul 22, 2019, at 8:50 AM, Jorn Vernee <jbvernee at xs4all.nl> wrote:
> 
>> 
>> We need to
>> invent a description that is general enough to capture all the details
>> the ABI cares about. But if we have layouts, why not double down and
>> reuse those?
> 
> Depending on layout annotations seems like a good choice to me, at least to get going. We can always evaluate and replace with a more ABI specific abstraction later.

I have one warning about this (another “siren song” to avoid):

Memory layouts are a crystal-clear notation for describing the local organization of blocks of memory.

As such, they can be composed sequentially.

But look, argument lists are also composed sequentially and made from individual items which are layouts.

So let’s reuse layout sequences for argument lists as well as memory structures.  ⇐ This is the false step.

The problem with the proposed reuse (which crops up pretty frequently as an “obvious” design move)
is that you would then create layout formulae which are superficially similar (sequences; what could be simpler?)
but with deeply different semantics.  Argument list layouts (in various ABIs) have their own rules for
composition that are not the same as memory juxtaposition.  Argument lists are not contiguously stored,
usually, so it makes no sense to ask for the “byte offset” of an argument within its list.

If we follow that siren song, we end up muddying the semantics of memory layouts, with a series of
questions like “is this a real memory struct or a parameter list?”  If you make a mistake about which
is which, you could get into real trouble, up to and including security vulnerabilities.  Because parameter
lists are not memory structures, they add no value to the task of describing memory layouts, so there’s
no benefit to offset the harm to clarity of the layout calculus.
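
To make the hazard concrete, here is a minimal Java sketch of keeping the two
notions in separate types.  Every name in it (Value, StructLayout,
ParameterList) is hypothetical and not part of the foreign-memaccess API, and
the offset rule assumes C-like natural alignment.  The point is only that a
struct can answer "what is the byte offset?" while a parameter list has no
such method to misuse.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical types for illustration; none of these names come from the
// foreign-memaccess API.
final class LayoutVsParams {
    /** A value layout, reduced to a size and an alignment in bytes. */
    static final class Value {
        final String name;
        final long byteSize;
        final long byteAlignment;

        Value(String name, long byteSize, long byteAlignment) {
            this.name = name;
            this.byteSize = byteSize;
            this.byteAlignment = byteAlignment;
        }
    }

    /** A memory struct: members are juxtaposed, so byte offsets make sense. */
    static final class StructLayout {
        final List<Value> members;

        StructLayout(Value... members) {
            this.members = Arrays.asList(members);
        }

        /** Offset of a member, assuming C-like natural alignment. */
        long byteOffset(int index) {
            long offset = 0;
            for (int i = 0; i <= index; i++) {
                Value m = members.get(i);
                offset = roundUp(offset, m.byteAlignment);
                if (i < index) {
                    offset += m.byteSize;
                }
            }
            return offset;
        }

        private static long roundUp(long n, long alignment) {
            return (n + alignment - 1) / alignment * alignment;
        }
    }

    /** A parameter list: ordered, but it deliberately has no byteOffset. */
    static final class ParameterList {
        final List<Value> parameters;

        ParameterList(Value... parameters) {
            this.parameters = Arrays.asList(parameters);
        }

        Value parameter(int index) {
            return parameters.get(index);
        }
    }

    public static void main(String[] args) {
        Value c = new Value("c", 1, 1);
        Value i = new Value("i", 4, 4);
        Value l = new Value("l", 8, 8);
        StructLayout struct = new StructLayout(c, i, l);
        ParameterList params = new ParameterList(c, i, l);
        System.out.println("struct offset of i: " + struct.byteOffset(1));
        System.out.println("second parameter: " + params.parameter(1).name);
    }
}
```

With this split, asking a ParameterList for an offset is a compile-time error
rather than a wrong answer.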

OTOH, parameter lists contain arguments, and arguments are helpful to describe as memory layouts.
One thing to remember, though, is that a memory layout is closely coupled to the form of the data you
*load from* and *store to* that memory layout.  For argument lists (and return values) it is this latter
aspect of the layout that is valuable.  This distinction, between a bare value and its memory layout,
can be seen arising when you ask certain subtle questions about either a value (on one hand) or
its memory layout:  What is the alignment requirement?  Well, there is none for a bare value, or
a register that contains it, but you need to know alignment when loading or storing (or allocating
heap or stack space).  If we want to locate and enumerate component bytes (or bits, for bitfields)
we need a numbering scheme that is unambiguous, even apart from memory ordering, since
different CPU instructions swap bytes differently going between memory and register.  Working
out the details of bit- and byte-numbering for values (not memory) leads to the conclusion that
memory endian-ness is a hindrance to a clear accounting of the locations of bits and bytes
in *values*.  In both cases, concepts designed for memory don’t transfer cleanly when
talking about the (scalar or composite) *values* we want to load and store.

We just went through the latter exercise with the Vector API, and chose to distinguish
vectors outside of memory from their images in memory.  Load and store operators
all take an optional ByteOrder argument, which is applied in the obvious way.
Operators which (for any reason) must number bits or bytes or lanes of vector
values rigorously avoid depending on byte order conventions, since those would
just make certain operators (reinterpret casts, for example) have one or another
semantics unpredictably depending on which CPU you are running, which is not
the way Java works.

The byte-order-independent convention we chose is to number lanes (starting
with zero) in the order they are loaded or stored into memory (since AFAIK no
CPU memory unit does lane swapping as a generalization of byte swapping).
Also we number bits inside a lane according to arithmetic order (as if they
were integral values, which they often are), with bit zero being the LSB and
bit #N-1 being the sign bit.  Byte order in a vector *value* is derived from
those two previous rules, and works out to be little-endian.  So when you
reinterpret the bits of a vector into a new type it is as if the vector were
stored using LE byte swapping into a temp buffer and loaded again
as the new type, again with LE rules.  But there’s also a (simple,
deterministic) rule for renumbering bits from one type to the other,
which does not appeal to LE vs. BE.
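
A small Java sketch of those two equivalent accounts, using plain arrays and
java.nio.ByteBuffer rather than the Vector API itself.  The class and method
names are made up for illustration, and the int-to-short case is just one
example of a reinterpret cast.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative only: models vector lanes as Java arrays, not as Vector API types.
final class LaneReinterpret {
    /** Reinterpret int lanes as short lanes by an LE store and an LE reload. */
    static short[] viaLittleEndianBuffer(int[] lanes) {
        ByteBuffer tmp = ByteBuffer.allocate(lanes.length * Integer.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (int lane : lanes) {
            tmp.putInt(lane);          // lanes are stored in lane-number order
        }
        tmp.flip();
        short[] out = new short[lanes.length * 2];
        for (int j = 0; j < out.length; j++) {
            out[j] = tmp.getShort();   // reloaded with the same LE rule
        }
        return out;
    }

    /**
     * Same result with no mention of LE or BE: short lane j takes
     * bits 16*(j%2) .. 16*(j%2)+15 of int lane j/2, where bit 0 is the LSB.
     */
    static short[] viaBitRenumbering(int[] lanes) {
        short[] out = new short[lanes.length * 2];
        for (int j = 0; j < out.length; j++) {
            out[j] = (short) (lanes[j / 2] >>> (16 * (j % 2)));
        }
        return out;
    }

    public static void main(String[] args) {
        int[] lanes = {0x11223344, 0xAABBCCDD};
        for (short s : viaBitRenumbering(lanes)) {
            System.out.printf("%04x%n", s & 0xFFFF);
        }
    }
}
```

Both methods return the same lanes on every platform, which is the property
we wanted from the convention.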

Of course, when a vector is stored to user-visible memory (memory
you can break up into bits and bytes), the ByteOrder setting is respected,
and that setting defaults to be compatible with the current platform’s
preferences.  (Thus, you can use Unsafe to pick apart a stored vector
and see the same byte order artifacts as you would if it were a series
of scalars.)
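
Sketched in Java with ByteBuffer standing in for the real load and store
operators (the LaneStore class and its methods are hypothetical), the same
lanes produce two different memory images, and either image loads back to
the same value when read with the matching order:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative stand-in for vector load/store; not the Vector API.
final class LaneStore {
    /** Store int lanes in lane order, using the requested byte order per lane. */
    static byte[] store(int[] lanes, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(lanes.length * Integer.BYTES).order(order);
        for (int lane : lanes) {
            buf.putInt(lane);
        }
        return buf.array();
    }

    /** Load int lanes back, interpreting each lane with the requested byte order. */
    static int[] load(byte[] bytes, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(order);
        int[] lanes = new int[bytes.length / Integer.BYTES];
        for (int i = 0; i < lanes.length; i++) {
            lanes[i] = buf.getInt();
        }
        return lanes;
    }

    public static void main(String[] args) {
        int[] lanes = {1, 2};
        // Default to the platform's preference, as the store operators do.
        byte[] image = store(lanes, ByteOrder.nativeOrder());
        System.out.println(java.util.Arrays.toString(image));
    }
}
```

Note that only the bytes within each lane move; lane 0 comes first in memory
under either order.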

In essence, a vector value is first and foremost a string of bits divided
in a mathematically rigorous way into lanes, bytes, and bits, without
appealing to LE/BE distinctions.

I’m going into detail about this, in part, because I think this way of looking
at vector values would also apply nicely to C structs passed by value.
When working with C arguments and return values you can ignore
LE/BE distinctions for the most part, and this is helpful, but structs
seem to claw back some of the complexity into the world of arguments
and return values, since when you store a C struct you clearly should
store it correctly.  And C structs seem to be *always* in memory.  But
it’s not quite true; they are sometimes bound to registers.  To me that
variability of storage is a clue that a C struct has a value which can be
contemplated alone, independently of its backing store.  (Another clue
is that memory structure can only be observed if you can alias a pointer
to the stored value, but C structs passed by value are not aliasable,
so they don’t need a predictable storage format!)

The net result is that we could agree that all ABI arguments and return
values, when passed by value, can be described as byte-strings, and a
portable (LE) convention is useful for numbering those bytes.  A related
agreement that probably is also possible is that all ABI arguments are
*word-sequences*, where each word is an intptr_t.  The two accounts
can be converted between each other, probably, since ABIs usually
“round up” their sub-word values into words; the padding concept from
layouts would be useful to note which bits are just slop.
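
A rough Java sketch of the conversion between the two accounts, assuming
64-bit words (so intptr_t is modeled as a Java long) and the LE numbering
above, where byte k of the value lands in word k/8 at bits 8*(k%8).  The
class and method names are invented for the example.

```java
// Illustrative only: assumes 8-byte words and zero-filled padding.
final class ByteStringWords {
    static final int WORD_BYTES = 8;

    /** Round a byte-string up into words; unused high bytes stay zero. */
    static long[] toWords(byte[] value) {
        long[] words = new long[(value.length + WORD_BYTES - 1) / WORD_BYTES];
        for (int k = 0; k < value.length; k++) {
            words[k / WORD_BYTES] |= (value[k] & 0xFFL) << (8 * (k % WORD_BYTES));
        }
        return words;
    }

    /** Recover the first byteLength bytes of the value from its words. */
    static byte[] toBytes(long[] words, int byteLength) {
        byte[] value = new byte[byteLength];
        for (int k = 0; k < byteLength; k++) {
            value[k] = (byte) (words[k / WORD_BYTES] >>> (8 * (k % WORD_BYTES)));
        }
        return value;
    }

    /** Number of trailing bytes in the last word that are just padding. */
    static int paddingBytes(int byteLength) {
        return (WORD_BYTES - byteLength % WORD_BYTES) % WORD_BYTES;
    }

    public static void main(String[] args) {
        byte[] value = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        for (long w : toWords(value)) {
            System.out.println(Long.toHexString(w));
        }
        System.out.println("padding bytes: " + paddingBytes(value.length));
    }
}
```

A real ABI may leave the padding bytes undefined rather than zero, which is
exactly what a padding element in the layout would record.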

If data values are treated as byte-strings, then layouts can have a
useful career in describing their *internal* structures.  (Not their
*external ordering* which is the siren song I started with above.)
For example, packing two arguments into two halves of one
register could be described (without LE/BE dependencies) by
simply naming them as components in a layout that describes
the register as a whole.
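
As a sketch of that last idea in Java: the Field class below is hypothetical,
the register is modeled as a 64-bit long, and which argument goes in which
half is an assumption, since that choice is ABI-specific.  Components are
named by bit position in arithmetic order (bit 0 is the LSB), so no LE/BE
question arises.

```java
// Illustrative only: a register described as a whole, with named components.
final class RegisterHalves {
    /** A named run of bits [bitOffset, bitOffset + bitSize) within a 64-bit register. */
    static final class Field {
        final String name;
        final int bitOffset;
        final int bitSize;

        Field(String name, int bitOffset, int bitSize) {
            this.name = name;
            this.bitOffset = bitOffset;
            this.bitSize = bitSize;
        }

        private long mask() {
            // A shift by 64 would wrap around in Java, so handle the full width separately.
            return bitSize == 64 ? -1L : (1L << bitSize) - 1;
        }

        /** Read this component out of a register value. */
        long extract(long register) {
            return (register >>> bitOffset) & mask();
        }

        /** Return the register value with this component replaced. */
        long insert(long register, long componentValue) {
            long cleared = register & ~(mask() << bitOffset);
            return cleared | ((componentValue & mask()) << bitOffset);
        }
    }

    // Assumed assignment: first argument in the low half, second in the high half.
    static final Field FIRST = new Field("first", 0, 32);
    static final Field SECOND = new Field("second", 32, 32);

    static long pack(int first, int second) {
        long register = FIRST.insert(0L, first);
        return SECOND.insert(register, second);
    }

    public static void main(String[] args) {
        long register = pack(0xDEADBEEF, 0x12345678);
        System.out.println(Long.toHexString(register));
        System.out.println(Long.toHexString(FIRST.extract(register)));
    }
}
```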

HTH
— John