FFM performance tweaks

Wed Nov 20 21:50:00 UTC 2024

Thanks for confirming Ioannis.

As to why Brian's test doesn't show any improvement, the jury is out. My 
general sense is that we don't know _exactly_ why there's a regression 
there. We know there are some inlining failures, but sometimes 
correlation is not causation. So it is also possible that we're looking 
in the wrong place :-)

(but, keep looking we must :-) )

Cheers
Maurizio

On 20/11/2024 20:50, Ioannis Tsakpinis wrote:
> Hey Maurizio,
>
> FWIW I have verified that the fixes for both issues are working as
> intended in 24-ea+24. I'm not sure why Brian's use case does not see
> any benefits. For JDK-8343394 specifically, in a simple benchmark that
> emulates single-shot access, the improvement is very obvious compared
> to build 23 [1]. (the instruction offsets have been normalized to zero
> for easier comparison)
>
> [1] https://gist.github.com/Spasi/59c23452510bd7139b7bcdcbdac7dab9
>
> On Wed, 20 Nov 2024 at 14:54, Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>>
>>> One thing that strikes me as odd is the use of a MemorySegment in a
>>> context in which I really want an "unbounded" memory segment, which
>>> cannot be represented by the MemorySegment interface.
>> I think this is a fair observation. Memory segments are designed to be
>> essentially a safe API. They have all sorts of checks to make sure
>> memory is accessed within bounds and while it's still alive. Even
>> restricted methods like "reinterpret" were mostly designed around
>> interaction with native functions, to attach bounds to pointers of
>> (statically) unknown size/lifetime -- e.g. to make FFI interaction also
>> safer.
>>
>> The underlying expectation with memory segments is that, when used in
>> idiomatic ways (e.g. counted loops etc.) C2 will try very hard (and
>> often succeed) to amortize the costs of all those checks. On the other
>> hand, with Unsafe you access memory at a given address (no questions
>> asked) using fast JVM intrinsics. So, even when everything works, FFM
>> and Unsafe get at the desired results in very different ways. And this
>> difference cannot be eliminated (at least not in all cases) --
>> fundamentally, Unsafe is a much lower level API than FFM.
>>
>> So, while we keep looking for opportunities to make memory segment
>> access faster, I don't think it's particularly fruitful to chase dubious
>> API changes which turn the memory segment API into what it's not, e.g.
>> by adding first-class support for unbounded memory segments, or cramming
>> additional access primitives into memory segments so that one can get
>> more direct access to Unsafe. While these API changes will give power
>> users what they want, they will inevitably create confusion for
>> everybody else, and the temptation to use unchecked access will be too
>> great, even in cases where safe access is enough (the majority).
>>
>> The evidence *so far* does not seem to support the addition of a
>> lower-level off-heap access primitive -- possibly through a completely
>> different API. After all, in most cases FFM can be used in an idiomatic
>> way, without issues. And, in less fortunate cases (e.g. random access),
>> the cost of bound checks can still be managed (even if not completely
>> eliminated) with the workarounds described in the past. I think it's
>> fair to say that, while we don't want to completely close the door on
>> such an API, we also don't want to jump into it prematurely. It would be
>> better to wait a bit more and see (a) how much memory segment access can
>> be improved and (b) whether more use cases will emerge where such an API
>> would bring considerable performance advantage compared to FFM.
>>
>> Maurizio
>>
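
For readers following the thread, here is a minimal sketch of the two usage patterns discussed in the quoted text: idiomatic counted-loop access to a bounded segment, and using the restricted "reinterpret" method to attach bounds to a raw address. It assumes JDK 22 or later (where java.lang.foreign is final). The class and method names (SegmentAccessSketch, fillAndSum, readThroughRawAddress) are made up for illustration and do not come from the thread or from Brian's test. Calling reinterpret may print a restricted-method warning unless native access is enabled for the calling module.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical illustration class; not taken from the thread.
final class SegmentAccessSketch {

    // Idiomatic access: a counted loop over a bounded, confined segment.
    // Every access is bounds- and liveness-checked; this loop shape is the
    // kind of code the quoted text expects C2 to optimize well.
    static long fillAndSum(int count) {
        try (Arena arena = Arena.ofConfined()) {
            long byteSize = ValueLayout.JAVA_INT.byteSize() * count;
            MemorySegment segment =
                    arena.allocate(byteSize, ValueLayout.JAVA_INT.byteAlignment());
            for (int i = 0; i < count; i++) {
                segment.setAtIndex(ValueLayout.JAVA_INT, i, i);
            }
            long sum = 0;
            for (int i = 0; i < count; i++) {
                sum += segment.getAtIndex(ValueLayout.JAVA_INT, i);
            }
            return sum;
        }
    }

    // Attaching bounds to a raw address: MemorySegment.ofAddress yields a
    // zero-length segment, and the restricted reinterpret method gives it
    // a size. The caller vouches that the size is correct.
    static int readThroughRawAddress(int value) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment owner = arena.allocate(
                    ValueLayout.JAVA_INT.byteSize(),
                    ValueLayout.JAVA_INT.byteAlignment());
            owner.set(ValueLayout.JAVA_INT, 0, value);

            // Stand-in for a pointer of unknown size, e.g. one returned
            // by a native function.
            long rawAddress = owner.address();
            MemorySegment zeroLength = MemorySegment.ofAddress(rawAddress);
            MemorySegment bounded =
                    zeroLength.reinterpret(ValueLayout.JAVA_INT.byteSize());
            return bounded.get(ValueLayout.JAVA_INT, 0);
        }
    }

    public static void main(String[] args) {
        System.out.println(fillAndSum(4));
        System.out.println(readThroughRawAddress(42));
    }
}
```

Neither method measures anything; the sketch only shows the shape of the code under discussion. Whether the checks are actually hoisted or eliminated in a given build is exactly what would need to be confirmed with a benchmark and the generated assembly.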