[code-reflection] RFR: Model lifetimes of onnx session-related objects more explicitly [v2]

Adam Sotona asotona at openjdk.org
Mon Mar 3 15:24:14 UTC 2025


On Mon, 3 Mar 2025 12:21:38 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>> The class representing an onnx session is auto closeable. But, in the current code, a session is closed immediately after its `run` method is called. This is problematic because a session returns some ORTValues (tensors) which also need to be freed, but that cannot be freed immediately after calling `run` (as they need to be used by clients).
>> 
>> To address this problem, I tweaked the session code to accept an external arena. All the allocation of session-related data structures now happens using that external arena. This means that the client can now be in charge of managing the lifetime of a session (see changes to MNIST demo).
>> 
>> To test, I tweaked the MNIST code to do 10K iterations on each button pressed. Predictably, a single button pressed resulted in over 3g of memory being leaked. With these changes the memory arrives at ~400K (there is still some minor leak, but not sure worth pushing more).
>> 
>> If the changes to the demo are not deemed good, I can withdraw this PR -- I mostly wanted to capture the result of my exploration somewhere.
>
> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Rename local arena variables

I would suggest to merge this PR first as a fix.

Current OnnxRuntime is a simple prototype focused on a single use case and needs to be refactored (maybe multiple times) to physically support all potential usecases (which we didn't even collected yet). A nice API box might be designed around it later.

>From all the discussion here I see the instantiation of the environment should be explicit (even doing it twice may cause a crash) and it might theoretically serve as a wrapper for default arena/allocator. However an option to shorten lifetime of the execution results should remain.  

I propose an alternative solution - create a new Arena.ofAuto() for each model/session execution and associate the results with it, so they are GCed and released when no more needed.

-------------

PR Comment: https://git.openjdk.org/babylon/pull/332#issuecomment-2694749938


More information about the babylon-dev mailing list