FFM performance tweaks

Wed Nov 20 12:53:56 UTC 2024

> One thing that strikes me as odd is the use of a MemorySegment in a 
> context in which I really want an "unbounded" memory segment, which 
> cannot be represented by the MemorySegment interface.
I think this is a fair observation. Memory segments are designed to be 
essentially a safe API. They have all sorts of checks to make sure 
memory is accessed within bounds and while it's still alive. Even 
restricted methods like "reinterpret" were mostly designed around 
interaction with native functions, to attach bounds to pointers of 
(statically) unknown size/lifetime -- e.g. to make FFI interaction also 
safer.
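
To make the "reinterpret" point concrete, here is a minimal sketch (Java 22 
or later assumed). The class and method names are made up for illustration, 
and the "raw pointer" is simulated with MemorySegment.ofAddress rather than 
obtained from a real native call. It shows how a restricted method attaches 
both a size and a lifetime to an address that otherwise carries neither:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical example class, not part of any library.
public class ReinterpretSketch {

    // Sums `count` ints reached through a bare address, after attaching bounds to it.
    static int sumViaRawAddress(int count) {
        long bytes = ValueLayout.JAVA_INT.byteSize() * count;
        try (Arena arena = Arena.ofConfined()) {
            // Stand-in for memory owned by native code.
            MemorySegment backing = arena.allocate(bytes, ValueLayout.JAVA_INT.byteAlignment());
            for (int i = 0; i < count; i++) {
                backing.setAtIndex(ValueLayout.JAVA_INT, i, i + 1);
            }

            // A bare address becomes a zero-length segment: any read would fail the bounds check.
            MemorySegment raw = MemorySegment.ofAddress(backing.address());

            // Restricted method: we vouch for the size and tie the lifetime to `arena`.
            // No cleanup action is needed here, so the third argument is null.
            MemorySegment sized = raw.reinterpret(bytes, arena, null);

            int sum = 0;
            for (int i = 0; i < count; i++) {
                sum += sized.getAtIndex(ValueLayout.JAVA_INT, i); // checked against the new bounds
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        System.out.println(sumViaRawAddress(4));
    }
}
```

After the reinterpret call, every access goes through the usual bounds and 
liveness checks again, which is the sense in which even the restricted 
methods aim at making FFI interaction safer. Depending on the JDK version, 
running this may print a restricted-method warning unless native access is 
enabled for the calling module.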

The underlying expectation with memory segments is that, when used in 
idiomatic ways (e.g. counted loops etc.) C2 will try very hard (and 
often succeed) to amortize the costs of all those checks. On the other 
hand, with Unsafe you access memory at a given address (no questions 
asked) using fast JVM intrinsics. So, even when everything works, FFM 
and Unsafe get at the desired results in very different ways. And this 
difference cannot be eliminated (at least not in all cases) -- 
fundamentally, Unsafe is a much lower level API than FFM.
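
As an illustration of what "idiomatic" means here, the following is a 
minimal sketch of a counted loop over a memory segment (Java 22 or later 
assumed; the class and method names are hypothetical). Each access is 
specified to check bounds and liveness, but the loop shape (fixed trip 
count, unit stride, a constant layout) is the kind C2 can typically 
optimize. Whether the checks are actually hoisted depends on the JDK 
version and on inlining, so treat this as the intended pattern, not a 
performance guarantee:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical example class, not part of any library.
public class CountedLoopSketch {

    // Sums all ints in the segment using a plain counted loop.
    static long sum(MemorySegment segment) {
        int count = Math.toIntExact(segment.byteSize() / ValueLayout.JAVA_INT.byteSize());
        long total = 0;
        for (int i = 0; i < count; i++) {
            // Bounds and liveness are checked by specification on every access.
            total += segment.getAtIndex(ValueLayout.JAVA_INT, i);
        }
        return total;
    }

    // Fills a fresh segment with 0..count-1 and returns the sum of its elements.
    static long demo(int count) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(
                    ValueLayout.JAVA_INT.byteSize() * count,
                    ValueLayout.JAVA_INT.byteAlignment());
            for (int i = 0; i < count; i++) {
                segment.setAtIndex(ValueLayout.JAVA_INT, i, i);
            }
            return sum(segment);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(1000));
    }
}
```

The equivalent Unsafe code would compute an address by hand and read it 
with no check at all, which is why the two approaches can only converge 
when the JIT manages to prove the checks redundant.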

So, while we keep looking for opportunities to make memory segment 
access faster, I don't think it's particularly fruitful to chase dubious 
API changes which turn the memory segment API into what it's not, e.g. 
by adding first-class support for unbounded memory segments, or cramming 
additional access primitives into memory segments so that one can get 
more direct access to Unsafe. While these API changes will give power 
users what they want, they will inevitably create confusion for 
everybody else, and the temptation to use unchecked access will be too 
great, even in cases where safe access is enough (the majority).

The evidence *so far* does not seem to support the addition of a 
lower-level off-heap access primitive -- possibly through a completely 
different API. After all, in most cases FFM can be used in an idiomatic 
way, without issues. And, in less fortunate cases (e.g. random access), 
the cost of bounds checks can still be managed (even if not completely 
eliminated) with the workarounds described in the past. I think it's 
fair to say that, while we don't want to completely close the door on 
such an API, we also don't want to jump into it prematurely. It would be 
better to wait a bit more and see (a) how much memory segment access can 
be improved and (b) whether more use cases will emerge where such an API 
would bring a considerable performance advantage compared to FFM.

Maurizio