[vector-api] interoperability loading from non-byte[] heap MemorySegments
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Mon Oct 23 11:34:57 UTC 2023
Hi Chris,
this is a good question. The main reason the check is there is this:
the new segment-based method replaces code that used to accept a
ByteBuffer. The check you see there was added just to make sure we
didn't run into new and unexpected situations.
Moving forward I could see this relaxed, either by tweaking the API to
accept an explicit layout from the user (so that the API can check if
the user really wants unaligned floats), or by doing a different check
which validates the maximum supported alignment against the vector
species. E.g. I suppose loading an int[] into a DoubleVector is not ok,
but float[]/int[]/short[]/char[]/byte[] in FloatVector should be ok.
But, going down that path is messy: while heap segments keep track of
their underlying Java array, which can then be used to perform
validation, an off-heap segment doesn't have any particular alignment
constraint. So (I think) we're back to the user providing a layout
parameter to the load method.
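To illustrate what "the user providing a layout" could look like, here
is a minimal sketch that uses only the scalar accessors java.lang.foreign
already offers (assuming JDK 22 or later). The class LayoutSketch and its
dotProduct methods are hypothetical names made up for this illustration;
this is not a Vector API method, and not a proposal for the final shape
of the load method.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical illustration class, not part of the JDK.
final class LayoutSketch {

    // Scalar dot product over two segments. The caller picks the layout,
    // and with it the alignment constraint that each access is checked against.
    static float dotProduct(MemorySegment a, MemorySegment b, ValueLayout.OfFloat layout) {
        long count = Math.min(a.byteSize(), b.byteSize()) / Float.BYTES;
        float sum = 0f;
        for (long i = 0; i < count; i++) {
            sum += a.getAtIndex(layout, i) * b.getAtIndex(layout, i);
        }
        return sum;
    }

    // The float[] variant just wraps and delegates. A float[]-backed heap
    // segment satisfies the 4-byte alignment constraint of JAVA_FLOAT.
    static float dotProduct(float[] a, float[] b) {
        return dotProduct(MemorySegment.ofArray(a), MemorySegment.ofArray(b),
                ValueLayout.JAVA_FLOAT);
    }
}
```

With a byte[]-backed heap segment, passing ValueLayout.JAVA_FLOAT fails
with an IllegalArgumentException, whereas ValueLayout.JAVA_FLOAT_UNALIGNED
is accepted. The layout is where the caller states whether unaligned
floats are really intended.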
Maurizio
On 23/10/2023 12:10, Chris Hegarty wrote:
>
> Hi,
>
> I'm curious about the restriction when loading from heap backed memory
> segments. E.g.
>
> * @throws IllegalArgumentException if the memory segment is a heap segment that is
> * not backed by a {@code byte[]} array.
> * ...
> FloatVector fromMemorySegment(VectorSpecies<Float> species, MemorySegment ms, ...)
>
> Which results in:
>
> jshell> float[] arr = new float[] { 1, 2, 3, 4, 5, 6, 7, 8 }
> arr ==> float[8] { 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0 }
>
> jshell> var vec =
> FloatVector.fromMemorySegment(FloatVector.SPECIES_PREFERRED,
> MemorySegment.ofArray(arr), 0, ByteOrder.nativeOrder())
> | Exception java.lang.IllegalArgumentException
> | at ScopedMemoryAccess.loadFromMemorySegment
> (ScopedMemoryAccess.java:334)
> | at FloatVector.fromMemorySegment0Template (FloatVector.java:3353)
> | at Float128Vector.fromMemorySegment0 (Float128Vector.java:864)
> | at FloatVector.fromMemorySegment (FloatVector.java:2986)
> | at do_it$Aux (#43:1)
> | at (#43:1)
>
> I can see that this is deliberate ...
>
> V loadFromMemorySegmentMasked(...) {
>     // @@@ Smarter alignment checking if accessing heap segment backing non-byte[] array
>     if (msp.maxAlignMask() > 1) {
>         throw new IllegalArgumentException();
>     }
>
> Is this just temporary? I would expect the alignment to be as if
> `ValueLayout.JAVA_FLOAT.withByteAlignment(1)`, no? Is the intent to
> eventually support other non-byte primitive array types, and to
> require and check alignment? It should just work, right?
>
> --
>
> The reason I ask about this is that over in Luceneland we're
> considering switching the loading of vector data from float[] to
> MemorySegment, which would allow us to load search vectors directly
> from the mmapped index file. But we still have some code paths which
> have float[], which may or may not be coming from the mmapped file. So
> we end up with something like this:
>
> dotProduct(float[] a, float[] b)
>
> dotProduct(float[] a, MemorySegment b)
>
> dotProduct(MemorySegment a, MemorySegment b)
>
> ... and we have cosine and Euclidean distance too. We can of course
> write the three variants of the code (and we've done this), but it
> would be desirable to have the float[] accepting methods just wrap
> and delegate to the MemorySegment variant. In our use case, we don't
> slice or offset into the heap segment. I do see how things could get
> misaligned quite quickly, but again I would expect this to behave as
> if the layout had a byte alignment of 1.
>
> Thanks,
>
> -Chris.
>