[vector-api] interoperability loading from non-byte[] heap MemorySegments

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Mon Oct 23 11:34:57 UTC 2023


Hi Chris,
this is a good question. The main reason the check is there is that 
the new segment-based method replaces code that used to accept a 
ByteBuffer. The check you see there was added just to make sure we 
didn't run into new and unexpected situations.

Moving forward I could see this relaxed, either by tweaking the API to 
accept an explicit layout from the user (so that the API can check if 
the user really wants unaligned floats), or by doing a different check 
which validates the maximum supported alignment against the vector 
species. E.g. I suppose loading an int[] into a DoubleVector is not ok - 
but float[]/int[]/short[]/char[]/byte[] in FloatVector should be ok. 
But, going down that path is messy: while heap segments keep track of 
their underlying Java array, which can then be used to perform 
validation, an off-heap segment doesn't have any particular alignment 
constraint. So (I think) we're back to the user providing a layout 
parameter to the load method.
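
For comparison, scalar access in java.lang.foreign already works this way: the caller passes a layout, and the layout's alignment constraint decides whether the access is accepted. The sketch below only illustrates that existing scalar behaviour and is not a Vector API method. The class and helper names are made up for illustration, and it assumes a JDK where java.lang.foreign is final (22 or later).

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical helper class; not part of the JDK or of the Vector API.
final class LayoutAlignmentSketch {

    // Reads one float at byteOffset using the caller-supplied layout.
    // The layout's alignment constraint is checked by MemorySegment.get.
    static float read(MemorySegment segment, ValueLayout.OfFloat layout, long byteOffset) {
        return segment.get(layout, byteOffset);
    }

    // Returns true if the access is accepted, false if it is rejected
    // because it is incompatible with the layout's alignment constraint.
    static boolean accepted(MemorySegment segment, ValueLayout.OfFloat layout, long byteOffset) {
        try {
            read(segment, layout, byteOffset);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        MemorySegment floats = MemorySegment.ofArray(new float[] {1f, 2f, 3f, 4f});
        MemorySegment bytes = MemorySegment.ofArray(new byte[16]);

        // Aligned float read from a float[]-backed heap segment.
        System.out.println(accepted(floats, ValueLayout.JAVA_FLOAT, 4));
        // Same segment, offset that is not a multiple of 4.
        System.out.println(accepted(floats, ValueLayout.JAVA_FLOAT, 2));
        // Explicitly unaligned layout, i.e. byte alignment 1.
        System.out.println(accepted(floats, ValueLayout.JAVA_FLOAT_UNALIGNED, 2));
        // byte[]-backed heap segment with a 4-byte aligned layout.
        System.out.println(accepted(bytes, ValueLayout.JAVA_FLOAT, 0));
    }
}
```

With a layout parameter, the user states up front whether they want aligned or unaligned floats, and the same check applies to heap and off-heap segments.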

Maurizio

On 23/10/2023 12:10, Chris Hegarty wrote:
>
> Hi,
>
> I'm curious about the restriction when loading from heap backed memory 
> segments. E.g.
>
>   * @throws IllegalArgumentException if the memory segment is a heap segment that is
>   * not backed by a {@code byte[]} array.
>   * ...
>   FloatVector fromMemorySegment(VectorSpecies<Float> species, MemorySegment ms, ...)
>
> Which results in:
>
> jshell> float[] arr = new float[] { 1, 2, 3, 4, 5, 6, 7, 8 }
> arr ==> float[8] { 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0 }
>
> jshell> var vec = 
> FloatVector.fromMemorySegment(FloatVector.SPECIES_PREFERRED, 
> MemorySegment.ofArray(arr), 0, ByteOrder.nativeOrder())
> |  Exception java.lang.IllegalArgumentException
> |        at ScopedMemoryAccess.loadFromMemorySegment (ScopedMemoryAccess.java:334)
> |        at FloatVector.fromMemorySegment0Template (FloatVector.java:3353)
> |        at Float128Vector.fromMemorySegment0 (Float128Vector.java:864)
> |        at FloatVector.fromMemorySegment (FloatVector.java:2986)
> |        at do_it$Aux (#43:1)
> |        at (#43:1)
>
> I can see that this is deliberate ...
>
> V loadFromMemorySegmentMasked(...) {
>     // @@@ Smarter alignment checking if accessing heap segment backing non-byte[] array
>     if (msp.maxAlignMask() > 1) {
>       throw new IllegalArgumentException();
>     }
>
> Is this just temporary? I would expect the alignment to be as if 
> `ValueLayout.JAVA_FLOAT.withByteAlignment(1)`, no? Is the intent to 
> eventually support other non-byte primitive array types, and to 
> require and check alignment? It should just work, right?
>
> --
>
> The reason I ask about this is that over in Luceneland we're 
> considering a switch to loading vector data from float[] to 
> MemorySegment - which allows us to load search vectors directly from the 
> mmapped index file. But we still have some code paths which have 
> float[], which may or may not be coming from the mmapped file. So we 
> end up with something like this:
>
>  dotProduct(float[] a, float[] b)
>
>  dotProduct(float[] a, MemorySegment b)
>
>  dotProduct(MemorySegment a, MemorySegment b)
>
> ... and we have cosine and Euclidean distance too. We can of course 
> write the three variants of the code (and we've done this), just that 
> it would be desirable to have the float[] accepting methods just wrap 
> and delegate to the MemorySegment variant. In our use case, we don't 
> slice or offset into the heap segment. I do see how things could 
> get misaligned quite quickly, but again I expect this to behave as if 
> with byte alignment 1.
>
> Thanks,
>
> -Chris.
>
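
The "wrap and delegate" shape described in the quoted message can be sketched with plain scalar segment access, which does accept float[]-backed heap segments. This is only a scalar illustration under stated assumptions: the class name is hypothetical, the loop is not vectorized, and a vectorized version built on FloatVector.fromMemorySegment would still hit the IllegalArgumentException shown above for float[]-backed segments. It assumes a JDK where java.lang.foreign is final (22 or later).

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical class name; a scalar sketch, not Lucene's implementation.
final class DotProductSketch {

    // Float layout with byte alignment 1, so any byte offset is accepted.
    private static final ValueLayout.OfFloat FLOAT = ValueLayout.JAVA_FLOAT_UNALIGNED;

    // The single "real" implementation, written against MemorySegment.
    static float dotProduct(MemorySegment a, MemorySegment b) {
        if (a.byteSize() != b.byteSize()) {
            throw new IllegalArgumentException("segments differ in size");
        }
        float sum = 0f;
        for (long offset = 0; offset < a.byteSize(); offset += Float.BYTES) {
            sum += a.get(FLOAT, offset) * b.get(FLOAT, offset);
        }
        return sum;
    }

    // The float[] variants only wrap the array and delegate.
    static float dotProduct(float[] a, MemorySegment b) {
        return dotProduct(MemorySegment.ofArray(a), b);
    }

    static float dotProduct(float[] a, float[] b) {
        return dotProduct(MemorySegment.ofArray(a), MemorySegment.ofArray(b));
    }
}
```

A call such as DotProductSketch.dotProduct(new float[] {1, 2, 3}, new float[] {4, 5, 6}) goes through the same loop as a call with two mmapped segments, which is the code sharing the three variants are after.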