[code-reflection] RFR: ONNX FFM Runtime initial work [v3]

Maurizio Cimadamore mcimadamore at openjdk.org
Mon Feb 10 13:21:24 UTC 2025


On Sun, 9 Feb 2025 14:31:14 GMT, Adam Sotona <asotona at openjdk.org> wrote:

>> This is initial work on ONNX FFM runtime with very raw connection with OnnxInterpreter and Tensor.
>> 
>> It is a rebase of https://github.com/PaulSandoz/babylon/pull/1
>
> Adam Sotona has updated the pull request incrementally with one additional commit since the last revision:
> 
>   minor rename

cr-examples/onnx/src/main/java/oracle/code/onnx/OnnxRuntime.java line 214:

> 212:         }
> 213: 
> 214:         public final class Session implements AutoCloseable {

There's another lifetime here - that of the session. I notice a few things:
* you expect access to be single-threaded (otherwise, reusing the `ret` output segment will backfire)
* you expect the session to no longer be used after its `close` method has been called

Both aspects could be modeled by having `Session` carry a new confined arena, and reinterpreting the session segment to that arena (so that access is thread-confined, and only possible while the session is alive). When the session is closed, you should close the confined arena. You can also register (while reinterpreting) a manual cleanup action, to be invoked on the session segment when the confined arena is closed (e.g. to call `releaseSession`).

This will guarantee that no method on session can be called _after_ the session has been closed (note any such call will likely result in a JVM crash because of use-after-free, so that seems valuable?)
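
A minimal sketch of that idea, assuming JDK 22 or later. The class name `ConfinedSession`, the `firstWord` accessor and the `released` flag are hypothetical, and the cleanup lambda only stands in for the real `releaseSession` downcall. Note that `reinterpret` is a restricted method, so the JVM may print a native-access warning.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical wrapper: ties a raw native pointer to a confined arena.
final class ConfinedSession implements AutoCloseable {
    private final Arena arena = Arena.ofConfined();
    private final MemorySegment session;

    ConfinedSession(MemorySegment rawSession, long size, AtomicBoolean released) {
        // Attach the pointer to the confined arena. The cleanup action runs
        // when the arena is closed; here it only flips a flag, where real
        // code would invoke the native release function.
        this.session = rawSession.reinterpret(size, arena, seg -> released.set(true));
    }

    // Any access goes through the reinterpreted segment, so it is checked
    // against the arena's liveness and owner thread.
    long firstWord() {
        return session.get(ValueLayout.JAVA_LONG, 0);
    }

    @Override
    public void close() {
        arena.close();
    }
}
```

With this shape, calling `firstWord()` after `close()` throws `IllegalStateException`, and calling it from a thread other than the one that created the session throws `WrongThreadException`, instead of touching freed native memory.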

cr-examples/onnx/src/main/java/oracle/code/onnx/OnnxRuntime.java line 224:

> 222:             public int getNumberOfInputs() {
> 223:                 try {
> 224:                     return retInt(sessionGetInputCount.invokeExact(sessionAddress, ret));

I note the use of this shared `ret` output segment. We plan to add more capabilities to FFM to address the case of recyclable allocation -- this will help code like this avoid the shared segment while also avoiding the cost of a malloc per call.
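
For comparison, here is a sketch of the per-call alternative that is available today, assuming JDK 22 or later. It removes the shared state but pays for one native allocation per call, which is the cost mentioned above. `PerCallOut` and `callWithFreshOut` are hypothetical names, and the `Consumer` only stands in for a downcall that writes its result through an out-pointer.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.util.function.Consumer;

final class PerCallOut {
    // Allocates a fresh output slot for every call and frees it on return.
    static long callWithFreshOut(Consumer<MemorySegment> nativeCall) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment out = arena.allocate(ValueLayout.JAVA_LONG);
            nativeCall.accept(out);                    // stand-in for the downcall
            return out.get(ValueLayout.JAVA_LONG, 0);  // read the result before the arena closes
        }
    }
}
```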

cr-examples/onnx/src/main/java/oracle/code/onnx/OnnxRuntime.java line 524:

> 522: 
> 523:     private void checkStatus(Object res) {
> 524:         if (!res.equals(MemorySegment.NULL) && res instanceof MemorySegment status) {

This is another example of "hardwired logic". There's a struct definition for this:


struct OrtStatus {
  OrtErrorCode code;
  char msg[1];  // a null-terminated string
};


And the code is optimistically assuming that this layout will hold (across different versions of the API and across different platforms). Ok for a prototype of course, but that's another thing jextract might help you with.
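
As an illustration of what a layout-driven version could look like, here is a hand-written sketch, assuming JDK 22 or later and assuming `OrtErrorCode` is a 4-byte C enum. The class `OrtStatusView` and its `fake` helper are hypothetical. jextract would generate the layout from the header instead, which is what protects against the definition changing.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.nio.charset.StandardCharsets;

final class OrtStatusView {
    // Mirrors the struct above; the int-sized enum is an assumption.
    static final StructLayout LAYOUT = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("code"),
            MemoryLayout.sequenceLayout(1, ValueLayout.JAVA_BYTE).withName("msg"));

    // Offsets are derived from the layout rather than hardwired.
    static final long CODE_OFFSET = LAYOUT.byteOffset(PathElement.groupElement("code"));
    static final long MSG_OFFSET = LAYOUT.byteOffset(PathElement.groupElement("msg"));

    static int code(MemorySegment status) {
        return status.get(ValueLayout.JAVA_INT, CODE_OFFSET);
    }

    static String message(MemorySegment status) {
        // Reads a null-terminated UTF-8 string starting at the msg field.
        return status.getString(MSG_OFFSET);
    }

    // Test helper: builds a status-shaped segment in Java-owned memory.
    static MemorySegment fake(Arena arena, int code, String msg) {
        byte[] bytes = msg.getBytes(StandardCharsets.UTF_8);
        MemorySegment s = arena.allocate(MSG_OFFSET + bytes.length + 1, LAYOUT.byteAlignment());
        s.set(ValueLayout.JAVA_INT, CODE_OFFSET, code);
        s.setString(MSG_OFFSET, msg); // also writes the terminating null byte
        return s;
    }
}
```

A pointer returned by a real downcall arrives as a zero-length segment, so it would first need to be reinterpreted to a suitable size before `code` or `message` can read from it.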

-------------

PR Review Comment: https://git.openjdk.org/babylon/pull/311#discussion_r1949032243
PR Review Comment: https://git.openjdk.org/babylon/pull/311#discussion_r1949038116
PR Review Comment: https://git.openjdk.org/babylon/pull/311#discussion_r1949035987


More information about the babylon-dev mailing list