[code-reflection] RFR: Model lifetimes of onnx session-related objects more explicitly

Maurizio Cimadamore mcimadamore at openjdk.org
Mon Mar 3 12:21:39 UTC 2025


On Fri, 28 Feb 2025 12:42:24 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

> The class representing an onnx session is auto closeable. But, in the current code, a session is closed immediately after its `run` method is called. This is problematic because a session returns some ORTValues (tensors) which also need to be freed, but which cannot be freed immediately after calling `run` (as clients still need to use them).
> 
> To address this problem, I tweaked the session code to accept an external arena. All the allocation of session-related data structures now happens using that external arena. This means that the client can now be in charge of managing the lifetime of a session (see changes to MNIST demo).
> 
> To test, I tweaked the MNIST code to do 10K iterations on each button press. Predictably, a single button press resulted in over 3g of memory being leaked. With these changes the memory usage settles at ~400K (there is still some minor leak, but I'm not sure it is worth pushing further).
> 
> If the changes to the demo are not deemed good, I can withdraw this PR -- I mostly wanted to capture the result of my exploration somewhere.
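
The external-arena idea described in the quoted text can be illustrated with a minimal sketch using the `java.lang.foreign` API (final since Java 22). `SketchSession` and `ExternalArenaDemo` are hypothetical names made up for this illustration, not classes from the PR; the "model" here just doubles its input. The point is only that the result is allocated from an arena the caller owns, so it stays valid after `run` returns and is released when the caller closes the arena.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical stand-in for a session; NOT the session class from the PR.
final class SketchSession {
    private final Arena arena; // owned by the caller, not by the session

    SketchSession(Arena arena) {
        this.arena = arena;
    }

    // Pretends to run a model: doubles each input value and stores the result
    // in memory allocated from the caller's arena, so it outlives this call.
    MemorySegment run(float[] input) {
        MemorySegment out = arena.allocate((long) input.length * Float.BYTES, Float.BYTES);
        for (int i = 0; i < input.length; i++) {
            out.setAtIndex(ValueLayout.JAVA_FLOAT, i, input[i] * 2f);
        }
        return out;
    }
}

final class ExternalArenaDemo {
    // Reads the second result element after run() has returned.
    static float secondOutput() {
        try (Arena arena = Arena.ofConfined()) {
            SketchSession session = new SketchSession(arena);
            MemorySegment result = session.run(new float[] {1f, 2f, 3f});
            return result.getAtIndex(ValueLayout.JAVA_FLOAT, 1);
        } // everything allocated from the arena is released here
    }

    // Reports whether a result segment is still alive once its arena is closed.
    static boolean aliveAfterClose() {
        MemorySegment result;
        try (Arena arena = Arena.ofConfined()) {
            result = new SketchSession(arena).run(new float[] {1f});
        }
        return result.scope().isAlive();
    }
}
```

In this sketch the client decides how long results live by choosing where the `try` block ends, which is the property the PR description is after.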

This is an example of a possible higher-level API (built on top of the changes in this PR):

https://github.com/mcimadamore/babylon/compare/onnx_session_lifetime...mcimadamore:babylon:high-level-API?expand=1

For now, it is added as an inner class of OnnxRuntime, since it reuses so much of that class (e.g. `runtimeAddress` and other related helper functions). But the idea is that users will do:


try (OnnxRuntime.Environment env = OnnxRuntime.newEnv()) {
    env.execute(...)
    ...
} // close resources
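
To make the shape of that usage pattern concrete, here is a minimal, self-contained sketch of an auto-closeable environment that owns its own arena. It is an assumption-laden illustration, not the code in the linked branch: `SketchEnvironment`, its `execute(float[])` signature, and `isOpen()` are hypothetical, and the "execution" simply sums the input into a one-element result.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical sketch; the real OnnxRuntime.Environment is in the linked branch.
final class SketchEnvironment implements AutoCloseable {
    // All native allocations made through this environment share one arena.
    private final Arena arena = Arena.ofConfined();

    static SketchEnvironment newEnv() {
        return new SketchEnvironment();
    }

    // Stand-in for execute(...): sums the input into a one-element result
    // allocated from the environment's arena.
    MemorySegment execute(float[] input) {
        float sum = 0f;
        for (float v : input) {
            sum += v;
        }
        MemorySegment out = arena.allocate(Float.BYTES, Float.BYTES);
        out.set(ValueLayout.JAVA_FLOAT, 0, sum);
        return out;
    }

    boolean isOpen() {
        return arena.scope().isAlive();
    }

    @Override
    public void close() {
        arena.close(); // frees every result handed out by execute(...)
    }
}

final class EnvironmentDemo {
    // Results are usable anywhere inside the try block.
    static float sumInsideScope() {
        try (SketchEnvironment env = SketchEnvironment.newEnv()) {
            return env.execute(new float[] {1f, 2f, 3f}).get(ValueLayout.JAVA_FLOAT, 0);
        }
    }

    // After the try block the environment (and its memory) is gone.
    static boolean openAfterTry() {
        SketchEnvironment env = SketchEnvironment.newEnv();
        try (env) {
            env.execute(new float[] {1f});
        }
        return env.isOpen();
    }
}
```

Tying the arena to the environment means a single `close()` releases every tensor produced inside the block, at the cost of results not being usable once the block exits.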

-------------

PR Comment: https://git.openjdk.org/babylon/pull/332#issuecomment-2694179389

