[vector-api] interoperability loading from non-byte[] heap MemorySegments

Chris Hegarty chegar999 at gmail.com
Mon Oct 23 11:52:55 UTC 2023


Hi Maurizio,

Thanks for the quick reply.

On 23/10/2023 12:34, Maurizio Cimadamore wrote:
>
> Hi Chris,
> this is a good question. The main reason as to why the check is there 
> is this: the code this new segment-based method replaces is code that 
> used to accept a ByteBuffer. The check you see there was added just to 
> make sure we didn't run into new and unexpected situations.
>
Ah yes. That's kinda what I thought.
>
> Moving forward I could see this relaxed, either by tweaking the API to 
> accept an explicit layout from the user (so that the API can check if 
> the user really wants unaligned floats). Or, by doing a different 
> check which validates the maximum supported alignment against the 
> vector species. E.g. I suppose loading an int[] into a DoubleVector is 
> not ok - but float[]/int[]/short[]/char[]/byte[] in FloatVector should 
> be ok. But, going down that path is messy: while heap segments keep 
> track of their underlying Java array, which can then be used to 
> perform validation, an off-heap segment doesn't have any particular 
> alignment constraint. So (I think) we're back to the user providing a 
> layout parameter to the load method.
>
Yeah, I think an explicit layout would be the way to go here. And maybe 
a default of 4-byte aligned, JAVA_FLOAT, when not explicitly passed for 
heap backed segments. The default should be relatively straightforward 
to enable, behaving as if JAVA_FLOAT with the passed byte order.
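
To make the layout idea a bit more concrete, here is a rough sketch using
scalar MemorySegment access, which already takes a layout today. The class
and method names (ExplicitLayoutSketch, defaultLayout, unalignedLayout,
accepts) are made up for illustration; nothing here is a proposed Vector API
signature, and it assumes the java.lang.foreign API as finalized in JDK 22.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteOrder;

final class ExplicitLayoutSketch {

    // Hypothetical default: a 4-byte aligned float layout in the caller's byte order.
    static ValueLayout.OfFloat defaultLayout(ByteOrder order) {
        return ValueLayout.JAVA_FLOAT.withOrder(order);
    }

    // Same element type and byte order, but with the alignment relaxed to one byte.
    static ValueLayout.OfFloat unalignedLayout(ByteOrder order) {
        return ValueLayout.JAVA_FLOAT.withByteAlignment(1).withOrder(order);
    }

    // True if a scalar float read at byteOffset passes the segment's alignment
    // check for the given layout. The offset is assumed to be in bounds.
    static boolean accepts(MemorySegment ms, ValueLayout.OfFloat layout, long byteOffset) {
        try {
            ms.get(layout, byteOffset);
            return true;
        } catch (IllegalArgumentException misaligned) {
            return false;
        }
    }
}
```

With a segment from MemorySegment.ofArray(new float[4]), the aligned layout
is accepted at byte offset 4 and rejected at byte offset 2, while the
unaligned layout is accepted at both. That is the kind of check a layout
parameter on the vector load method could presumably reuse.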

-Chris.

> Maurizio
>
> On 23/10/2023 12:10, Chris Hegarty wrote:
>>
>> Hi,
>>
>> I'm curious about the restriction when loading from heap backed 
>> memory segments. E.g.
>>
>>   * @throws IllegalArgumentException if the memory segment is a heap
>>   *         segment that is not backed by a {@code byte[]} array.
>>   * ...
>>   FloatVector fromMemorySegment(VectorSpecies<Float> species, MemorySegment ms, ...)
>>
>> Which results in:
>>
>> jshell> float[] arr = new float[] { 1, 2, 3, 4, 5, 6, 7, 8 }
>> arr ==> float[8] { 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0 }
>>
>> jshell> var vec = 
>> FloatVector.fromMemorySegment(FloatVector.SPECIES_PREFERRED, 
>> MemorySegment.ofArray(arr), 0, ByteOrder.nativeOrder())
>> |  Exception java.lang.IllegalArgumentException
>> |        at ScopedMemoryAccess.loadFromMemorySegment 
>> (ScopedMemoryAccess.java:334)
>> |        at FloatVector.fromMemorySegment0Template 
>> (FloatVector.java:3353)
>> |        at Float128Vector.fromMemorySegment0 (Float128Vector.java:864)
>> |        at FloatVector.fromMemorySegment (FloatVector.java:2986)
>> |        at do_it$Aux (#43:1)
>> |        at (#43:1)
>>
>> I can see that this is deliberate ...
>>
>> V loadFromMemorySegmentMasked(...) {
>>     // @@@ Smarter alignment checking if accessing heap segment backing
>>     // non-byte[] array
>>     if (msp.maxAlignMask() > 1) {
>>       throw new IllegalArgumentException();
>>     }
>>
>> Is this just temporary? I would expect the alignment to be as if 
>> `ValueLayout.JAVA_FLOAT.withByteAlignment(1)`, no? Is the intent to 
>> eventually support other non-byte primitive array types, and to 
>> require and check alignment? It should just work, right?
>>
>> --
>>
>> The reason I ask about this is that over in Luceneland we're
>> considering switching how we load vector data, from float[] to
>> MemorySegment, which would allow us to load search vectors directly
>> from the mmapped index file. But we still have some code paths that
>> take a float[], which may or may not be coming from the mmapped file.
>> So we end up with something like this:
>>
>>  dotProduct(float[] a, float[] b)
>>
>>  dotProduct(float[] a, MemorySegment b)
>>
>>  dotProduct(MemorySegment a, MemorySegment b)
>>
>> ... and we have cosine and Euclidean distance too. We can of course 
>> write the three variants of the code (and we've done this), just that 
>> it would be desirable to have the float[] accepting methods just wrap 
>> and delegate to the MemorySegment variant. In our use case we don't
>> slice or offset into the heap segment. I do see how things could get
>> misaligned quite quickly, but again I would expect this to behave as
>> if with byte alignment 1.
>>
>> Thanks,
>>
>> -Chris.
>>
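
For reference, the wrap-and-delegate arrangement described in the original
message above might look roughly like the sketch below. It is only an
illustration: the class name DotProductSketch is made up, and a plain scalar
loop stands in for the Vector API kernel, because FloatVector.fromMemorySegment
currently rejects float[]-backed heap segments. The loads use
ValueLayout.JAVA_FLOAT_UNALIGNED (native byte order), so they behave as if
with byte alignment 1.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

final class DotProductSketch {

    // Byte-aligned float layout in native order, so any byte offset is accepted.
    private static final ValueLayout.OfFloat FLOAT_LAYOUT = ValueLayout.JAVA_FLOAT_UNALIGNED;

    // float[] variant: wraps both arrays as heap segments and delegates.
    static float dotProduct(float[] a, float[] b) {
        return dotProduct(MemorySegment.ofArray(a), MemorySegment.ofArray(b));
    }

    // Mixed variant: wraps the array and delegates.
    static float dotProduct(float[] a, MemorySegment b) {
        return dotProduct(MemorySegment.ofArray(a), b);
    }

    // Single implementation. A scalar loop is used here in place of a
    // vectorized kernel, purely so that the sketch runs on heap segments.
    static float dotProduct(MemorySegment a, MemorySegment b) {
        if (a.byteSize() != b.byteSize()) {
            throw new IllegalArgumentException("segments differ in size");
        }
        long count = a.byteSize() / Float.BYTES;
        float sum = 0f;
        for (long i = 0; i < count; i++) {
            sum += a.getAtIndex(FLOAT_LAYOUT, i) * b.getAtIndex(FLOAT_LAYOUT, i);
        }
        return sum;
    }
}
```

The point of the arrangement is that cosine and Euclidean distance would each
need only one real implementation, with the float[] overloads reduced to
one-line wrappers.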