tweaking layout API to work on bytes instead of bits
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Tue May 16 14:21:35 UTC 2023
Hi,
as we're going through the FFM API with a finer comb, we were reminded
again of an asymmetry between memory segments and memory layouts. Memory
segments are expressed as a "bag of bytes". All their sizes and offsets
are expressed in number of bytes, which obviously makes sense given that
(a) memory addressing is byte-oriented and (b) the ByteBuffer API also
works that way (so changing it would make the transition from BB a lot
more painful).
On the other hand, memory layouts are expressed in bits. The historical
reasons for this can be found in John's great LDL document [1].
Essentially, the layout language proposed in the LDL document was
originally intended to model both memory **and** registers. That said,
the memory layout API is firmly in the territory of modelling memory
structure/dereference, so it seems that ship has sailed already. If we
want to talk about sub-byte structure, we could still do so, in the
future, by adding a dedicated API for "register layouts" (e.g. so that a
JAVA_INT could be associated with a sub-byte layout which indicates how
the 32 bits are partitioned and used).
While this asymmetry can rarely be observed in practice, it is
bothersome for a number of reasons:
* Factories accepting layouts (e.g.
SegmentAllocator::allocate(MemoryLayout)) cannot be expressed as simple
sugar for factories expressed in byte size/alignment (e.g.
SegmentAllocator::allocate(long, long)). That is, there are always some
segments that can be allocated with one factory but not with the
other.
* Var handles generated using the memory layout API have different
constraints from those generated directly from MethodHandles. The latter
just accept a byte offset (in sync with what memory segments do), while
the former perform all internal computation, as well as range checking,
in bits - which again leads to asymmetries.
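
To make the range mismatch concrete, here is a minimal sketch in plain
Java arithmetic (no FFM API calls). It assumes that sizes and offsets
are held in a signed 64-bit long, so a quantity counted in bits runs out
of range eight times sooner than the same quantity counted in bytes. The
class and method names are hypothetical and only serve the illustration.

```java
// Illustrative only: plain arithmetic, not part of the FFM API.
final class BitVsByteRange {
    // Largest byte count whose equivalent bit count still fits in a long.
    static final long MAX_BYTES_EXPRESSIBLE_IN_BITS = Long.MAX_VALUE / 8;

    // Converts a byte count to bits, throwing ArithmeticException on
    // overflow instead of silently wrapping around.
    static long toBits(long byteSize) {
        return Math.multiplyExact(byteSize, 8L);
    }

    // True if a size (or offset) given in bytes can also be expressed
    // as a bit count in a long.
    static boolean representableInBits(long byteSize) {
        try {
            toBits(byteSize);
            return true;
        } catch (ArithmeticException overflow) {
            return false;
        }
    }

    public static void main(String[] args) {
        long fine = MAX_BYTES_EXPRESSIBLE_IN_BITS;
        long tooBig = fine + 1; // still a perfectly valid byte count
        System.out.println(fine + " bytes representable in bits: "
                + representableInBits(fine));
        System.out.println(tooBig + " bytes representable in bits: "
                + representableInBits(tooBig));
    }
}
```

A byte-denominated API can accept the second value, while a
bit-denominated one has to reject it, which is one way the two kinds of
factories and var handles can end up disagreeing.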
We would like to rectify this asymmetry now (possibly in Java 21).
Here's a draft PR that does just that:
https://github.com/openjdk/jdk/pull/14013
While the change is tedious, in reality not much leaks outside the API
and tests because:
* Clients access value layouts using one of the ready-made constants
(e.g. ValueLayout.JAVA_INT or, for unaligned access,
ValueLayout.JAVA_INT_UNALIGNED)
* I suspect clients are only using methods such as
MemoryLayout::byteSize() and MemoryLayout::byteAlignment() already
There is, however, some compatibility surface as well:
* The factory for padding layouts will need to take a byte size, not a
bit one (but note that we already required the bit size to be a multiple
of 8, so nothing becomes inexpressible). This is by far the most
annoying change, because existing code will "not get the memo": it will
still compile and, if left unchanged, will end up creating padding
layouts that are eight times too big.
* Instead of using MemoryLayout::withBitAlignment, clients will need to
use MemoryLayout::withByteAlignment. Since this will result in a
compilation error (and since the method name says "byte" and not "bit"),
I believe that, while annoying, this poses far fewer issues.
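
As a rough sketch of the padding pitfall, the snippet below converts a
legacy bit size into the byte size the new factory would expect. The
helper bitsToBytes and the class name are hypothetical (they are not
part of the FFM API); the multiple-of-8 check mirrors the restriction
mentioned above.

```java
// Illustrative only: a hypothetical migration helper, not an FFM API.
final class PaddingMigration {
    // Converts a legacy padding size in bits to the equivalent size in
    // bytes. Rejects sizes that were never valid padding sizes.
    static long bitsToBytes(long bitSize) {
        if (bitSize < 0 || bitSize % 8 != 0) {
            throw new IllegalArgumentException(
                    "bit size must be a non-negative multiple of 8: " + bitSize);
        }
        return bitSize / 8;
    }

    public static void main(String[] args) {
        long legacyBits = 32; // old code asked for 32 bits of padding

        // Passing the old literal unchanged to a byte-based factory would
        // silently request 32 bytes instead of the intended 4.
        long unchanged = legacyBits;
        long migrated = bitsToBytes(legacyBits);

        System.out.println("unchanged argument, read as bytes: " + unchanged);
        System.out.println("migrated argument, in bytes: " + migrated);
    }
}
```

The alignment change does not need this kind of care, since a leftover
call to withBitAlignment simply fails to compile.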
Of course, code that works against jextract bindings won't need any
updates, but bindings will need to be re-generated to have paddings and
alignments expressed in bytes, not bits.
I believe that, on the whole, these changes are rather sensible - having
segments and layouts using different "currencies" seems like a recipe
for future trouble. Of course if there are some objections, we'd like to
hear from you.
Thanks
Maurizio
[1] - https://cr.openjdk.org/~jrose/panama/minimal-ldl.html