Memory segment and unaligned access

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Mon Jun 13 14:35:46 UTC 2022


Hi Alexander,
MemorySegment and ByteBuffer are a bit different when it comes to 
dereferencing. ByteBuffers always allow you to dereference a value, 
regardless of its starting offset. So you can read an `int` value at 
offset 3 in a buffer - which might or might not be aligned (but of 
course I do realize that, when working with packed representations, 
alignment is not a concern).
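
To make the ByteBuffer side concrete, here is a minimal sketch using only 
java.nio. The class and method names are made up for illustration; the 
point is just that an absolute `int` access at offset 3 goes through 
without any alignment check.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class UnalignedBufferRead {
    // Hypothetical helper: writes an int at the given (possibly unaligned)
    // byte offset of a direct buffer and reads it back from the same offset.
    public static int roundTrip(int offset, int value) {
        ByteBuffer buf = ByteBuffer.allocateDirect(16).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(offset, value);   // absolute put, no alignment requirement
        return buf.getInt(offset);   // absolute get, no alignment requirement
    }

    public static void main(String[] args) {
        // Offset 3 is not a multiple of 4, yet the access succeeds.
        System.out.println(roundTrip(3, 42)); // prints 42
    }
}
```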

On the other hand, when a memory segment is dereferenced, the 
ValueLayout used for the dereference operation is consulted. This layout 
might have alignment information attached (all layout constants provided 
by default such as JAVA_INT and JAVA_LONG do).

So if you try something like

segment.get(JAVA_INT, 3);

This will likely fail with an alignment error.

If this is not the behavior you want, because you work on packed 
layouts, or you simply want the same behavior you get with the 
ByteBuffer API, the solution is to declare unaligned layout constants - 
like this:

static final ValueLayout.OfInt JAVA_INT_UNALIGNED = 
JAVA_INT.withBitAlignment(8);

And then:

segment.get(JAVA_INT_UNALIGNED, 3);

This will work without issues.
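
Conceptually, the difference between the two calls comes down to an 
alignment mask test on the accessed address. The sketch below is only a 
model of that idea in plain Java, not the JDK implementation; the class 
and method names are hypothetical. A bit alignment of 8 corresponds to a 
byte alignment of 1, which every address satisfies.

```java
public class AlignmentCheckSketch {
    // Hypothetical helper: true when 'address' is a multiple of
    // 'byteAlignment', which is assumed to be a power of two.
    public static boolean isAligned(long address, long byteAlignment) {
        long mask = byteAlignment - 1;
        return (address & mask) == 0;
    }

    // Hypothetical helper: models an access that rejects misaligned addresses.
    public static long checkAligned(long address, long byteAlignment) {
        if (!isAligned(address, byteAlignment)) {
            throw new IllegalArgumentException("Misaligned access at address: " + address);
        }
        return address;
    }

    public static void main(String[] args) {
        System.out.println(isAligned(3, 4)); // prints false: 4-byte alignment rejects offset 3
        System.out.println(isAligned(3, 1)); // prints true: 1-byte alignment accepts any offset
        System.out.println(isAligned(8, 4)); // prints true: 8 is a multiple of 4
    }
}
```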

I note that this tends to come up quite a bit, which probably suggests 
that working with unaligned layouts is not uncommon. We might consider 
adding unaligned layout constants to the ValueLayout API, so that both 
aligned and unaligned use cases are supported "out-of-the-box". But the 
capability to support unaligned access is already there.

I hope this helps
Maurizio

On 13/06/2022 14:06, Alexander Biryukov wrote:
> Hi, I'm testing MemorySegment and stumbled across some inconsistent
> behavior in the API.
>
> I have the following task:
> * read binary file to RAM, file structure is fixed, but no
> padding/alignment is present (lots of strings of variable length)
> * read file content depending on the user input, e.g. sometimes I need to
> read offset 13, sometimes offset 24 and so on
> * Stored values are typically primitives (byte, long, double etc) or arrays
> (byte[], short[], int[] and so on)
> * Primitives are decoded on the fly to variables, same is true for arrays
> (random access) or sometimes arrays are copied as a blob to another
> MemorySegment
>
> I'm trying to create a VarHandle for MemorySegment:
>
>> @Test
>> fun sample() {
>>      val mem = MemorySegment.allocateNative(100,
>> MemorySession.openImplicit())
>>      val buf = ByteBuffer.allocateDirect(100).order(ByteOrder.LITTLE_ENDIAN)
>>      buf.put(-20)        // 0
>>      buf.putDouble(5.0)  // 1
>>      buf.put(2)          // 9
>>      buf.putInt(5)       // 10
>>      buf.putInt(10)      // 14
>>      buf.flip()
>>      mem.copyFrom(MemorySegment.ofBuffer(buf))
>>
>>      val mh = MethodHandles.memorySegmentViewVarHandle(ValueLayout.JAVA_INT)
>>      val bh = MethodHandles.byteBufferViewVarHandle(IntArray::class.java,
>> ByteOrder.LITTLE_ENDIAN)
>>
>>      println(mh.get(mem, 14L) as Int)               // <-- throws
>> "Misaligned access at address"
>>      println(bh.get(mem.asByteBuffer(), 14) as Int) // works, prints "10"
>> }
>
> The variant with memorySegmentViewVarHandle throws, while variant with
> byteBufferViewVarHandle works.
> I checked the source code and found that for some reason MemorySegment-variant
> uses offsetNoVMAlignCheck, while ByteBuffer-variant doesn't.
> Here's the code from the source:
>
> ByteBuffer-variant
>
>> return UNSAFE.getLongUnaligned(
>>          ba,
>>          ((long) index(ba, index)) + Unsafe.ARRAY_BYTE_BASE_OFFSET,
>>          handle.be);
>>
> MemorySegment-variant
>
>> return SCOPED_MEMORY_ACCESS.getIntUnaligned(bb.sessionImpl(),
>>          bb.unsafeGetBase(),
>>          offsetNoVMAlignCheck(bb, base, handle.alignmentMask),
>>          handle.be);
>>
>
> I'm not sure, but it seems like  offsetNoVMAlignCheck shouldn't be present
> or there must be an alternative API for querying unaligned data?
> If nothing is wrong, is there a way to query data in MemorySegment without
> alignment?
> ByteBuffer is technically fine for the time being, but I'm expecting a high
> load on the service, and since I'm not using ByteBuffers I'd rather not pay
> for them (GC).
>
>
> Best regards,
> Alexander Biryukov