Using Layouts on heap - JDK18
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Fri Jun 3 21:51:11 UTC 2022
On 03/06/2022 21:57, leerho wrote:
> Maurizio,
> Thank you for your patience! You are correct. I was mistaken.
> Extracting an array of any type from ByteBuffer, MemorySegment or
> Unsafe requires a copy operation ...
> With the possible exception of ByteBuffer, which allows access to the
> underlying byte[] via array(). Similarly with IntBuffer.array() if it
> was initialized with an int[]. (Whether this is safe or a good idea or
> not is another issue.)
> I do note that MemorySegment does not allow this.
Yep, memory segments do not "cough up" info about the heap array.
ByteBuffer does that, but then if the BB is read-only it throws. I guess
I still have to make up my mind as to whether that's a precedent we want
to follow.
>
> When using unsafe:
> unsafe.getInt(Object, offsetBytes/*includes object header*/);
> The object can be practically anything including any of the primitive
> arrays. In other words, the unsafe view of the world is strictly
> bytes. If you give in a long[], it loses the information that it was a
> long array. It is just bytes.
True. But the same is true for MemorySegment::getInt. It will work on
native segments, heap segments backed by int[], long[], whatever. I
think the two APIs are equivalent in terms of expressiveness.
And there's something that MemorySegment does which, AFAIK, is not
available in other APIs (including BB). That is, like Unsafe, a
MemorySegment will let you view a float[] as an array of ints or shorts
- Buffers do not allow for that, as FloatBuffer can only get floats. So,
MemorySegment ~~ Unsafe in terms of dereference freedom.
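
A minimal sketch of that point, not taken from the thread. It assumes the finalized java.lang.foreign package (JDK 22 or later) rather than the JDK 18 incubator package jdk.incubator.foreign discussed here; the class and method names are made up for illustration.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical demo class: views a float[] through a heap segment and
// reads one element back as its raw 32-bit pattern. No copy is made.
final class FloatArrayAsInts {
    static int rawBitsAt(float[] data, int index) {
        MemorySegment segment = MemorySegment.ofArray(data);
        // JAVA_INT uses native byte order, the same order the float[] is stored in.
        return segment.get(ValueLayout.JAVA_INT, index * 4L);
    }

    public static void main(String[] args) {
        float[] data = { 1.0f, -2.5f };
        System.out.println(Integer.toHexString(rawBitsAt(data, 0)));
    }
}
```

A FloatBuffer wrapping the same array offers no comparable accessor, since it can only get and put floats.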
>
> When using MemorySegment I have to remember that it was initialized,
> say, from a long[]. Because, from reading the API, the
> copy(...ValueLayout ...) methods will blow up if the ValueLayout
> alignment does not match a long[]. And I can't find a way to
> determine from a given segment what its internal alignment assumptions
> are! Is this correct?
That's a different thing :-)
If you want to do everything w/o worrying about alignment constraints,
just declare unaligned layout constants - and store them into a static
final field, e.g.
static final ValueLayout.OfLong JAVA_LONG_UNALIGNED =
ValueLayout.JAVA_LONG.withBitAlignment(8);
Then use JAVA_LONG_UNALIGNED everywhere, and the alignment exceptions
will disappear. That's because, by default, layouts have the correct
alignment so, if you try to store an 8-byte aligned datum inside a
byte[] that won't work, because the heap-allocated memory for a byte[]
is not guaranteed to be 8-byte aligned. The lack of conservative
alignment checks like these has caused, in the past, intermittent
failures on less commonly used platforms (e.g. x86), where the JVM might
align array elements at different offsets than in x64.
Anyway, these are an extra safety belt to make sure that alignment checks
are respected. If you do not care about those, you can turn them off by
using unaligned layouts (I believe Lucene does that as well).
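
To illustrate the difference, here is a hedged sketch written against the finalized java.lang.foreign API (JDK 22 or later, an assumption; in the JDK 18 incubator API the constant is spelled with withBitAlignment(8) as shown earlier). The class and helper names are hypothetical.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical demo class contrasting aligned and unaligned long access.
final class UnalignedLongAccess {
    // Same idea as the constant above, in the later API spelling.
    static final ValueLayout.OfLong JAVA_LONG_UNALIGNED =
            ValueLayout.JAVA_LONG.withByteAlignment(1);

    // True if an 8-byte-aligned read at offset 0 is rejected for this segment.
    static boolean alignedReadRejected(MemorySegment segment) {
        try {
            segment.get(ValueLayout.JAVA_LONG, 0);
            return false;
        } catch (IllegalArgumentException e) {
            return true; // alignment constraint not satisfied
        }
    }

    // Writes and reads a long at a deliberately odd offset of a byte[].
    static long roundTrip(byte[] backing, long value) {
        MemorySegment segment = MemorySegment.ofArray(backing);
        segment.set(JAVA_LONG_UNALIGNED, 3, value);
        return segment.get(JAVA_LONG_UNALIGNED, 3);
    }
}
```

With these helpers, a segment over a byte[] is expected to reject the aligned read, a segment over a long[] is expected to accept it, and the unaligned layout works on the byte[] at any in-bounds offset.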
>
> ###
> Back to the original issue of Layouts. (and this is where my
> understanding is fuzzy)
>
> Suppose I have the struct from your wonderful documentation:
> typedef struct { char kind; int value; } TaggedValues[5];
Looking at this, I realize this layout is "wrong", as it's missing
padding after the char. E.g. the struct is 4-byte aligned, and the int
field should start at offset 4, so that there is the char, then 3 bytes
of padding, then the int. As written the struct is "packed". So that can
cause alignment issues down the road. But let's not dive into that for now.
> Clearly I can create a layout that models this struct.
> And, I can use it to dereference an *off-heap* MemorySegment:
> allocateNative(MemoryLayout layout, ResourceScope scope);
>
> But how do I use the power of the layout to dereference a segment
> derived from a heap byte array, or a mapped file, or a ByteBuffer? I
> could use an asSlice(long offset, MemoryLayout layout) to do that if
> it existed.
I think here is where the misunderstanding happens. You think
`allocateNative(MemoryLayout)` is more powerful than it really is. It
doesn't say, as you seem to imply, "give me a segment that is backed by
a given layout". It simply says: "use the information contained in this
layout to create a memory region that is compatible with the
size/alignment of that layout".
In other words, that method is just a shortcut for:
MemorySegment.allocateNative(layout.byteSize(), layout.byteAlignment(), scope)
(as the Javadoc says).
After you invoked that method, you just have a segment, and you can
start dereferencing it however you like. The API doesn't verify that you
dereference the segment in ways that are consistent with the layout you
provided at the beginning. The dereference API is completely orthogonal
(in a way that is much closer to Unsafe than you think).
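
A rough sketch of this, under the assumption that the finalized java.lang.foreign API (JDK 22 or later) is used, where Arena.allocate plays the role of allocateNative and Arena replaces ResourceScope. The layout and class names are hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical demo class: the layout only sizes and aligns the allocation.
final class LayoutAllocation {
    // Padded struct: 1-byte kind, 3 bytes of padding, 4-byte value.
    static final MemoryLayout TAGGED = MemoryLayout.structLayout(
            ValueLayout.JAVA_BYTE.withName("kind"),
            MemoryLayout.paddingLayout(3),
            ValueLayout.JAVA_INT.withName("value"));

    // Allocating from the layout or from its size/alignment yields the same size.
    static boolean sameSize() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment viaLayout = arena.allocate(TAGGED);
            MemorySegment viaSize = arena.allocate(TAGGED.byteSize(), TAGGED.byteAlignment());
            return viaLayout.byteSize() == viaSize.byteSize();
        }
    }

    // The segment does not remember TAGGED: an int written over the bytes
    // that the layout calls "kind" plus padding is accepted without complaint.
    static int writeIntOverKindAndPadding(int value) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(TAGGED);
            segment.set(ValueLayout.JAVA_INT, 0, value);
            return segment.get(ValueLayout.JAVA_INT, 0);
        }
    }
}
```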
>
> if you pass that [a layout] (along with a segment) to a
> dereference API, you have all the ingredients you need.
>
>
> It is not obvious to me how to do that from the MemorySegment, or
> MemoryLayout API ( I must be really dense!)
Here's a Gist that should help you:
https://gist.github.com/mcimadamore/424fcfc6ad36a8d4de322bc5f707c98b
(note that, since the struct layout is packed, and we want to use a
byte[] as backing buffer, the example has to use unaligned layouts. If
we wanted to work w/ aligned layouts, then we'd need to insert some
padding between the two fields, and use long[], double[], int[] or
float[] as a backing buffer).
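
For readers without access to the Gist, the following is an independent sketch in the same spirit and not its contents. It assumes the finalized java.lang.foreign API (JDK 22 or later), where layout var handles take an extra base-offset coordinate; the class and field names are made up for illustration.

```java
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SequenceLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

// Hypothetical demo class: a packed TaggedValues[5] layout over a heap byte[].
final class TaggedValuesOnHeap {
    // Packed struct (no padding), so the int member must be unaligned.
    static final SequenceLayout TAGGED_VALUES = MemoryLayout.sequenceLayout(5,
            MemoryLayout.structLayout(
                    ValueLayout.JAVA_BYTE.withName("kind"),
                    ValueLayout.JAVA_INT_UNALIGNED.withName("value")));

    // Coordinates (JDK 22+): segment, base offset, sequence index.
    static final VarHandle KIND = TAGGED_VALUES.varHandle(
            PathElement.sequenceElement(), PathElement.groupElement("kind"));
    static final VarHandle VALUE = TAGGED_VALUES.varHandle(
            PathElement.sequenceElement(), PathElement.groupElement("value"));

    static int[] fillAndRead() {
        // The layout supplies only the size; the segment is a plain byte[] view.
        MemorySegment segment =
                MemorySegment.ofArray(new byte[(int) TAGGED_VALUES.byteSize()]);
        for (long i = 0; i < 5; i++) {
            KIND.set(segment, 0L, i, (byte) ('a' + i));
            VALUE.set(segment, 0L, i, (int) (i * 10));
        }
        int[] values = new int[5];
        for (long i = 0; i < 5; i++) {
            values[(int) i] = (int) VALUE.get(segment, 0L, i);
        }
        return values;
    }
}
```

The layout and the segment meet only at the var handle call sites, which is the sense in which the two abstractions are orthogonal.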
Cheers
Maurizio
>
> Cheers, and thanks again!
> Lee.
>
> On Fri, Jun 3, 2022 at 11:09 AM Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>
>
> On 03/06/2022 18:59, leerho wrote:
>> Maurizio,
>>
>> ```
>> MemorySegment segment = MemorySegment.ofArray(new
>> byte[(int) layout.byteSize()]);
>> ```
>>
>>
>> I thought of that too. It is obvious that I can derive a byteSize
>> from the layout, which passes to the ofArray method, but it isn't
>> obvious to me that I can subsequently access the array using the
>> layout as I don't see any linkage between the layout and the
>> segment, because the ofArray method is just getting a size and is
>> oblivious to the fact that I'm using a layout to get it.
>> Similarly, I can't find any reference in the Layout API of a way
>> to "assign" a layout to a (previously created) segment.
> Layouts cannot be "assigned" to segments. Layouts and segments are
> orthogonal abstractions. You create a segment, which contains some
> bytes, then you use layouts to _dereference_ the segment. A layout
> has knowledge about size, alignment and endianness, so if you pass
> that (along with a segment) to a dereference API, you have all the
> ingredients you need.
>>
>> I am just learning about layouts so I may be missing something.
>>
>> ###
>>
>> * throw if the slice is not compatible with the layout alignment
>> constraints, or
>> * align the slice
>>
>> 1. If the slice is just bytes wouldn't that always work?
>> 2. I'm not sure what "align the slice" means? What would happen
>> underneath? If the segment was created from an int[], and the
>> layout alignment required byte alignment -- what would "align the
>> slice" do?
>>
>> #####
>> Your mentioning of the alignment issue brings up another
>> "asymmetry" that I am somewhat troubled by.
>>
>> The "segment.asByteBuffer()" method is very limited to the case
>> where the segment was created from pure bytes. If a segment
>> wraps a heap int[], and I try to acquire a ByteBuffer from the
>> segment it fails. (And I do see the caveat "Throws
>> UnsupportedOperationException ..." in the Javadocs.) What also
>> surprised me is that internally inside the segment you retain the
>> int[] structure! Why? Why don't you just convert all inputs to
>> just bytes?
> You mean copying int[] into a byte[] ? That would not be just a
> "view" then, it would allocate memory. All the "as" methods are
> meant to create views over memory, and not copy contents.
>>
>> This is very different from ByteBuffer, where I can extract any
>> of the multibyte primitive arrays from the raw bytes of the
>> buffer, because the raw resource under the ByteBuffer is just a
>> blob of bytes. And Unsafe provides even more flexibility. I can
>> start with an array of longs and extract an array of ints because
>> the underlying resource view of Unsafe is just bytes. So I am
>> surprised you didn't do that here. I would think that having the
>> internal view being just a blob of bytes would be the most
>> flexible. If you want me to provide you with a use case for
>> this I can provide you one.
>
> I think you are mistaken w.r.t. ByteBuffer. A ByteBuffer is backed
> by a byte[] - you can't turn that into an int[] w/o copying.
>
> I also don't think that there is an Unsafe facility that lets you
> view a portion of a long[] as an int[] (you need a new Object
> header for the int[], so... again... allocation and copy).
>
> Maybe you are talking about different things? Please provide a
> snippet of code using either Unsafe or ByteBuffer that does
> something that you think MemorySegment does not support. Perhaps
> the discussion would be easier that way.
>
> Thanks
> Maurizio
>
>> Lee.
>>
>>
>>
>> On Fri, Jun 3, 2022 at 10:06 AM Maurizio Cimadamore
>> <maurizio.cimadamore at oracle.com> wrote:
>>
>> Hi Lee,
>> in terms of heap allocation, how is your "allocate" different
>> from this:
>>
>> ```
>> MemorySegment segment = MemorySegment.ofArray(new
>> byte[(int) layout.byteSize()]);
>> ```
>>
>> The above doesn't do any copy, it just creates a segment view
>> over the
>> array. (note that the carrier type used is important here -
>> e.g. a
>> segment that wraps a byte[] has different properties when it
>> comes to
>> alignment, than a segment that wraps long[]).
>>
>> But then you bring up the slice method, which makes me think
>> that,
>> perhaps you aren't after allocation at all - you have an
>> existing
>> segment and you want to slice it, using a layout.
>>
>> I think the slicing API overload you describe is a nice one
>> (and one
>> that I found myself wanting at times too). Is
>> "layout-oriented" slicing
>> the main use case you are referring to here?
>>
>> If we do this the "right" way, the slicing method would have
>> to take
>> into account the required alignment of the provided layout
>> and either:
>>
>> * throw if the slice is not compatible with the layout alignment
>> constraints, or
>> * align the slice
>>
>> Of these I would probably prefer the former, because it's
>> "less magic".
>>
>> Maurizio
>>
>> On 03/06/2022 17:56, leerho wrote:
>> > Hi,
>> > It is straightforward to create a native MemorySegment
>> governed by a
>> > layout, but I can't see a way to create a heap segment
>> governed by a layout
>> > (without a copy). We need to be able to read and write to
>> structs on-heap
>> > as well as off-heap. Being able to overlay a Layout on a
>> segment derived
>> > from a ByteBuffer or from a mapped file would also be
>> useful. Perhaps these
>> > could be implemented via a slice?
>> >
>> > What I would like to see is something like:
>> >
>> > MemorySegment allocate(MemoryLayout layout); // base resource is byte[]
>> >
>> > OR
>> >
>> > MemorySegment asSlice(long offset, MemoryLayout layout)
>> >
>> > The second could be applied to any base resource, not just
>> off-heap.
>> >
>> > Am I missing something?
>> >
>> > Cheers,
>> > Lee.
>>