Using MemorySegment::byteSize as a loop bound is not being hoisted

Fri Jun 27 10:28:11 UTC 2025

Hi,

I've been rewriting parts of our codebase which currently uses the
Panama Vector API to provide optimised distance comparison functions for
vector search algorithms. We previously used float[]s, which necessitate
a copy from our off-heap storage onto the heap, so we simply want to use
MemorySegments to avoid this, since our stored vectors are in a file
on disk and mmap'ed.

I see that using MemorySegment::byteSize as a loop bound is not as
optimised as it could be. The bound is not getting hoisted out of the
loop body, whereas it is when the bound is an array length.
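
To make the two loop shapes concrete, here is a simplified scalar sketch.
It is not the benchmark code itself: the class and method names are made
up for illustration, and it uses plain scalar accesses rather than the
Vector API so that it compiles without the incubator module. The last
method hoists the bound by hand, purely for comparison; whether that
changes the generated code is not something this sketch measures.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical illustration class; not part of the linked benchmark.
public class LoopBoundSketch {

    // Array version: a.length is the loop bound.
    static float dotProductArray(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Segment version: byteSize() is evaluated in the loop condition.
    static float dotProductSegment(MemorySegment a, MemorySegment b) {
        float sum = 0f;
        for (long off = 0; off < a.byteSize(); off += Float.BYTES) {
            sum += a.get(ValueLayout.JAVA_FLOAT_UNALIGNED, off)
                 * b.get(ValueLayout.JAVA_FLOAT_UNALIGNED, off);
        }
        return sum;
    }

    // Same loop with the bound read once into a local before the loop.
    static float dotProductSegmentHoisted(MemorySegment a, MemorySegment b) {
        final long limit = a.byteSize();
        float sum = 0f;
        for (long off = 0; off < limit; off += Float.BYTES) {
            sum += a.get(ValueLayout.JAVA_FLOAT_UNALIGNED, off)
                 * b.get(ValueLayout.JAVA_FLOAT_UNALIGNED, off);
        }
        return sum;
    }
}
```

All three methods compute the same result; only the form of the loop
bound differs.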

I created a minimal JMH benchmark that demonstrates what I see
(some assumptions are made about unrolling and tail avoidance for
simplicity):
https://github.com/ChrisHegarty/memseg-vector-bench/tree/main

Example output:
Benchmark                               Mode  Cnt   Score   Error  Units
MemorySegmentBench.dotProductArray      avgt   20  61.154 ± 0.266  ns/op
MemorySegmentBench.dotProductHeapSeg    avgt   20  98.806 ± 3.143  ns/op
MemorySegmentBench.dotProductNativeSeg  avgt   20  95.282 ± 0.356  ns/op

I would have expected memory segments to perform better than this, but
maybe this is just not optimised yet on AArch64 (I've not tried x64
yet). Or am I doing something wrong?

For now, I'm working around this by writing my own native implementation 
and linking through FFI, but this is quite a bit of effort just to avoid 
this bug. And my native implementation only gets me back to the array perf.

-Chris.