tweaking layout API to work on bytes instead of bits

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Tue May 16 14:21:35 UTC 2023


Hi,
as we're going through the FFM API with a finer comb, we were reminded 
again of an asymmetry between memory segments and memory layouts. Memory 
segments are expressed as a "bag of bytes". All their sizes and offsets 
are expressed in number of bytes, which obviously makes sense given that 
(a) memory addressing is byte-oriented and (b) ByteBuffer API also works 
that way (so changing it would make transition from BB a lot more painful).

On the other hand, memory layouts are expressed in bits. The historical 
reasons for this can be found in John's great LDL document [1]. 
Essentially, the layout language proposed in the LDL document was 
originally intended to model both memory **and** registers. That said, 
the memory layout API is firmly in the territory of modelling memory 
structure/dereference, so it seems that ship has sailed already. If we 
want to talk about sub-byte structure, we could still do so, in the 
future, by adding a dedicated API for "register layouts" (e.g. so that a 
JAVA_INT could be associated with a sub-byte layout which indicates how 
the 32 bits are partitioned and used).

While this asymmetry can rarely be observed in practice, it is 
bothersome for a number of reasons:

* Factories accepting layouts (e.g. 
SegmentAllocator::allocate(MemoryLayout)) cannot be expressed as simple 
sugar for factories expressed in byte size/alignment (e.g. 
SegmentAllocator::allocate(long, long)). That is, there are always some 
segments that can be allocated with one factory but not with the 
other.

* Var handles generated using the memory layout API have different 
constraints from those generated directly from MethodHandles. The latter 
just accept a byte offset (in sync with what memory segments do), while 
the former perform all internal computation, as well as range checking, 
in bits - which again leads to asymmetries.
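
To make the two points above concrete, here is a minimal sketch in plain 
Java. It is not taken from the PR, and it deliberately avoids the FFM API 
so that it runs on any recent JDK. The class and helper names 
(UnitAsymmetry, bytesToBits, isWholeBytes, readIntAtByteOffset) are 
hypothetical. The only JDK method relied upon is 
MethodHandles::byteBufferViewVarHandle, used as one example of a var 
handle obtained directly from MethodHandles. The sketch assumes sizes are 
held in a long, which is one way to see why the bit and byte ranges 
cannot line up exactly.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class UnitAsymmetry {

    // Hypothetical helper: converts a byte size to a bit size.
    // Throws ArithmeticException when the result does not fit in a long,
    // i.e. for any byte size above Long.MAX_VALUE / 8.
    public static long bytesToBits(long byteSize) {
        return Math.multiplyExact(byteSize, 8L);
    }

    // Hypothetical helper: true when a bit size is a whole number of bytes.
    public static boolean isWholeBytes(long bitSize) {
        return bitSize % 8 == 0;
    }

    // Reads an int through a var handle created directly from MethodHandles.
    // The second coordinate is a byte offset into the buffer, not a bit offset.
    public static int readIntAtByteOffset(ByteBuffer buffer, int byteOffset) {
        VarHandle intHandle =
                MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.BIG_ENDIAN);
        return (int) intHandle.get(buffer, byteOffset);
    }

    public static void main(String[] args) {
        // A 4-byte size has a matching bit size.
        System.out.println("4 bytes in bits: " + bytesToBits(4L));

        // A 12-bit size has no matching byte size.
        System.out.println("12 bits is whole bytes: " + isWholeBytes(12L));

        // A very large byte size has no matching bit size in a long.
        try {
            bytesToBits(Long.MAX_VALUE);
        } catch (ArithmeticException e) {
            System.out.println("Long.MAX_VALUE bytes cannot be expressed in bits");
        }

        // Byte-offset access: write an int at byte offset 4, then read it back.
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putInt(4, 0x01020304);
        System.out.println("read back: 0x"
                + Integer.toHexString(readIntAtByteOffset(buffer, 4)));
    }
}
```

With a single unit (bytes) on both sides, neither conversion is needed, 
and the range checks on the two paths can agree.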

We would like to rectify this asymmetry now (possibly in Java 21). 
Here's a draft PR that does just that:

https://github.com/openjdk/jdk/pull/14013

While the change is tedious, in reality there's not much that leaks 
outside the API and tests because:

* Clients access value layouts using one of the ready-made constants 
(e.g. ValueLayout.JAVA_INT, or ValueLayout.JAVA_INT_UNALIGNED for 
unaligned access)
* I suspect clients are only using methods such as 
MemoryLayout::byteSize() and MemoryLayout::byteAlignment() already

There is, however, some compatibility surface as well:

* The factory for padding layouts will need to take a byte size, not a 
bit one (but note that we already required the bit size to be a multiple 
of 8, so no real change). This is by far the most annoying, because 
existing code will "not get the memo", and, if unchanged, will end up 
creating padding layouts that are eight times too big.
* Instead of using MemoryLayout::withBitAlignment, clients will need to 
use MemoryLayout::withByteAlignment. Since this will result in a 
compilation error (and since the method name says "byte" and not "bit"), 
I believe that, while annoying, this poses far fewer issues.
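
As a rough illustration of the two migration points above, here is a 
small sketch in plain Java. The class BitsToBytesMigration and its helper 
toByteCount are hypothetical and are not part of the FFM API or of the 
PR. They only show the arithmetic a client would apply to an existing 
bit-based argument before passing it to the byte-based padding factory 
or to MemoryLayout::withByteAlignment.

```java
public class BitsToBytesMigration {

    // Hypothetical migration helper: converts a legacy bit count (a padding
    // size or an alignment) into the equivalent byte count. Rejects negative
    // values and values that are not a multiple of 8, since those have no
    // byte equivalent.
    public static long toByteCount(long bitCount) {
        if (bitCount < 0 || bitCount % 8 != 0) {
            throw new IllegalArgumentException(
                    "Not a whole number of bytes: " + bitCount + " bits");
        }
        return bitCount / 8;
    }

    public static void main(String[] args) {
        // A call site that used to request 32 bits of padding.
        long legacyPaddingBits = 32;

        // Migrated: the byte-based factory should receive 4.
        long migratedPaddingBytes = toByteCount(legacyPaddingBits);

        // Unmigrated: the same literal is now read as 32 bytes, which is
        // eight times the intended size, and it still compiles.
        long unmigratedPaddingBytes = legacyPaddingBits;

        System.out.println("migrated padding:   " + migratedPaddingBytes + " bytes");
        System.out.println("unmigrated padding: " + unmigratedPaddingBytes + " bytes");

        // Alignment follows the same rule: a 64-bit alignment passed to
        // withBitAlignment becomes an 8-byte alignment for withByteAlignment.
        System.out.println("alignment: " + toByteCount(64) + " bytes");
    }
}
```

The padding case is the one to audit by hand, because the compiler 
cannot tell a bit count from a byte count when both are a long.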

Of course code that works against jextract bindings won't need any 
updates, but bindings will need to be re-generated to have paddings and 
alignments expressed in bytes, not bits.

I believe that, on the whole, these changes are rather sensible - having 
segments and layouts using different "currencies" seems like a recipe 
for future trouble. Of course if there are some objections, we'd like to 
hear from you.

Thanks
Maurizio

[1] - https://cr.openjdk.org/~jrose/panama/minimal-ldl.html


