[code-reflection] RFR: Model lifetimes of onnx session-related objects more explicitly [v2]
Paul Sandoz
psandoz at openjdk.org
Mon Mar 3 20:52:02 UTC 2025
On Mon, 3 Mar 2025 12:21:38 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:
>> The class representing an onnx session is auto closeable. But, in the current code, a session is closed immediately after its `run` method is called. This is problematic because a session returns some ORTValues (tensors) which also need to be freed, but which cannot be freed immediately after calling `run` (as clients still need to use them).
>>
>> To address this problem, I tweaked the session code to accept an external arena. All the allocation of session-related data structures now happens using that external arena. This means that the client can now be in charge of managing the lifetime of a session (see changes to MNIST demo).
>>
>> To test, I tweaked the MNIST code to do 10K iterations on each button press. Predictably, a single button press resulted in over 3 GB of memory being leaked. With these changes the memory usage settles at ~400K (there is still some minor leak, but I am not sure it is worth pushing further).
>>
>> If the changes to the demo are not deemed good, I can withdraw this PR -- I mostly wanted to capture the result of my exploration somewhere.
>
> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:
>
> Rename local arena variables
I think it will take us more time to work out the API layering and names that don't conflict with ONNX concepts. Focusing on memory management for now, with an Arena, is I think the right thing to do, so that it behaves correctly, does not leak, and can deallocate in a timely manner if the client chooses to.
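To make the arena point concrete, here is a minimal sketch of a client-owned lifetime, assuming Java 22+ where `java.lang.foreign` is final. The `Session` class and its `run` method are hypothetical stand-ins, not the Babylon ONNX classes; the "inference" just doubles each input value. The only thing it is meant to show is that results allocated in the caller's arena stay usable after `run` returns and are released when the caller closes the arena.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

final class ArenaLifetimeSketch {

    // Hypothetical stand-in for a session; NOT the Babylon ONNX API.
    static final class Session {
        private final Arena arena; // owned by the client, not by the session

        Session(Arena arena) {
            this.arena = arena;
        }

        // Allocates the result in the client's arena, so it outlives this call.
        MemorySegment run(float[] input) {
            MemorySegment out = arena.allocate(ValueLayout.JAVA_FLOAT, input.length);
            for (int i = 0; i < input.length; i++) {
                out.setAtIndex(ValueLayout.JAVA_FLOAT, i, input[i] * 2f); // placeholder computation
            }
            return out;
        }
    }

    // The client decides when session-related memory is released.
    static float firstResult(float[] input) {
        try (Arena arena = Arena.ofConfined()) {
            Session session = new Session(arena);
            MemorySegment result = session.run(input);
            // The result is still valid here because the arena is still open.
            return result.getAtIndex(ValueLayout.JAVA_FLOAT, 0);
        } // everything allocated through the arena is freed at this point
    }

    // Reports whether a result segment is still alive after its arena was closed.
    static boolean resultOutlivesArena() {
        MemorySegment escaped;
        try (Arena arena = Arena.ofConfined()) {
            escaped = new Session(arena).run(new float[] {1f});
        }
        return escaped.scope().isAlive();
    }

    public static void main(String[] args) {
        System.out.println(firstResult(new float[] {1.5f, 2f}));
        System.out.println(resultOutlivesArena());
    }
}
```

A confined arena is used here only to keep the sketch small; a shared arena would be the choice if results are consumed on another thread.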
Later we can then improve on the API layering. There is 1) the binding to the C API, creating a Java API that is idiomatically like C, 2) something that wraps the binding in idiomatic Java, and 3) something higher that encapsulates the scripting, whose implementation is composed from 2. This means we should eventually be able to support the deployment of existing ONNX models and/or those generated from Java code.
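As a rough illustration of the three layers, here is a sketch assuming Java 22+. Every name in it (`RawBinding`, `Tensor`, `Session`, `evaluate`) is made up for the example and none of it is the actual Babylon code; the layer 1 methods fake the native calls by adding 1 to each element. It only shows how each layer could be composed from the one below it while the arena stays under the control of the outermost caller.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

final class LayeringSketch {

    // Layer 1 (hypothetical): C-shaped binding, raw segments and an explicit arena.
    static final class RawBinding {
        static MemorySegment createTensor(Arena arena, float[] data) {
            return arena.allocateFrom(ValueLayout.JAVA_FLOAT, data);
        }

        static MemorySegment run(Arena arena, MemorySegment input) {
            long n = input.byteSize() / ValueLayout.JAVA_FLOAT.byteSize();
            MemorySegment out = arena.allocate(ValueLayout.JAVA_FLOAT, n);
            for (long i = 0; i < n; i++) {
                // Placeholder for a native call: add 1 to each element.
                out.setAtIndex(ValueLayout.JAVA_FLOAT, i,
                        input.getAtIndex(ValueLayout.JAVA_FLOAT, i) + 1f);
            }
            return out;
        }
    }

    // Layer 2 (hypothetical): idiomatic Java types wrapping the binding.
    record Tensor(MemorySegment segment) {
        float[] toArray() {
            return segment.toArray(ValueLayout.JAVA_FLOAT);
        }
    }

    static final class Session {
        private final Arena arena;

        Session(Arena arena) {
            this.arena = arena;
        }

        Tensor tensor(float... data) {
            return new Tensor(RawBinding.createTensor(arena, data));
        }

        Tensor run(Tensor input) {
            return new Tensor(RawBinding.run(arena, input.segment()));
        }
    }

    // Layer 3 (hypothetical): higher-level entry point composed from layer 2.
    static float[] evaluate(float[] input) {
        try (Arena arena = Arena.ofConfined()) {
            Session session = new Session(arena);
            // Copy the result to the Java heap before the arena is closed.
            return session.run(session.tensor(input)).toArray();
        }
    }
}
```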
-------------
PR Comment: https://git.openjdk.org/babylon/pull/332#issuecomment-2695503565