[code-reflection] RFR: Model lifetimes of onnx session-related objects more explicitly [v2]

Mon Mar 3 14:47:06 UTC 2025

On Mon, 3 Mar 2025 14:28:55 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

> > Releasing the ONNX environment will cause it to crash the next time one is created when certain kinds of sessions are created. It needs to be created at most once per JVM instantiation.
> 
> OK - this is not documented anywhere though. Is it a bug in the C impl? I've tested my toy API with an autocloseable environment and I have no problem creating a new environment and disposing it after each new model execution in the MNIST demo (I repeated 10K times). Is there some high-level documentation which explains how these various abstractions are meant to be used?
> 
> If environment really has to be global -- that leaves us again with the problem of needing to have a lifetime per session (because a session can create a lot of resources that you don't want to just keep accumulating) -- which is fine, but then the API would look more similar to the Python one (e.g. if there's only one environment, then having it implicit seems like a good 80/20 compromise to me).

I documented it in the [Java API](https://github.com/microsoft/onnxruntime/blob/main/java/src/main/java/ai/onnxruntime/OrtEnvironment.java#L20). It's a consequence of how CUDA and other execution providers are loaded. I don't think it breaks when everything is CPU-only (though it is allowed to), and it's not considered a bug in the C API, although the lack of documentation is a problem. The description of how I worked around it is [here](https://github.com/microsoft/onnxruntime/pull/10670). They added support for custom environment creation to [C# in 2023](https://github.com/microsoft/onnxruntime/pull/14723), and there are requests for it in Python. I'm not sure why it's not documented on the C API itself, but the C API has a number of undocumented interactions that you only find when they bite you. For example, the last one I hit was that the CUDA session options update method overwrites all the options rather than adding to them, so I added a note about that to the C API docs.
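
To make the "at most once per JVM" constraint concrete, here is a minimal sketch using only the JDK. `ToyEnvironment` and `ToySession` are hypothetical stand-ins invented for illustration; they are not ONNX Runtime classes and do not touch native code. The environment is created lazily, exactly once, and is never released, while each session has its own scoped lifetime.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a native environment handle (NOT the ONNX Runtime API).
final class ToyEnvironment {
    // Counts constructions so the "created at most once" property can be checked.
    static final AtomicInteger CREATED = new AtomicInteger();

    // Initialization-on-demand holder: the JVM runs this initializer once, thread-safely.
    private static final class Holder {
        static final ToyEnvironment INSTANCE = new ToyEnvironment();
    }

    private final AtomicInteger openSessions = new AtomicInteger();

    private ToyEnvironment() {
        CREATED.incrementAndGet();
    }

    // Always returns the same instance; deliberately, there is no way to release it.
    static ToyEnvironment get() {
        return Holder.INSTANCE;
    }

    ToySession createSession(String modelName) {
        openSessions.incrementAndGet();
        return new ToySession(this, modelName);
    }

    int openSessions() {
        return openSessions.get();
    }

    // Called by a session when it is closed.
    void sessionClosed() {
        openSessions.decrementAndGet();
    }
}

// Hypothetical session: its resources are scoped, unlike the environment's.
final class ToySession implements AutoCloseable {
    private final ToyEnvironment env;
    private final String modelName;
    private boolean closed;

    ToySession(ToyEnvironment env, String modelName) {
        this.env = env;
        this.modelName = modelName;
    }

    String run(String input) {
        if (closed) {
            throw new IllegalStateException("session is closed");
        }
        return modelName + "(" + input + ")";
    }

    @Override
    public void close() {
        if (!closed) {
            closed = true;
            env.sessionClosed();
        }
    }
}
```

With this shape, a caller writes `try (ToySession s = ToyEnvironment.get().createSession("mnist")) { ... }`: the session is released at the end of the block, and the environment outlives it.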

If you want to replicate the functionality of the existing Java API, then environment creation needs to be controlled by the user so that they can specify a global thread pool (which is then shared across all sessions created from that environment), as there are people who use that functionality. But it depends on what the scope is here: the environment can be implicit if you're not trying to provide a production-ready API where people need that level of resource control.
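
The trade-off between explicit and implicit environment creation can be sketched as follows, again using only the JDK. `PooledEnvironment` and `PooledSession` are hypothetical names, and rejecting a second explicit `create` call is just one plausible policy chosen for this sketch, not a description of what any real binding does. The point is that the user may configure the shared pool once, every session borrows it, and closing a session leaves the pool running.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical environment that owns one thread pool for the whole process.
final class PooledEnvironment {
    private static PooledEnvironment instance;

    private final ExecutorService pool;
    private final int threads;

    private PooledEnvironment(int threads) {
        this.threads = threads;
        // Daemon workers: the pool is never shut down, so it must not keep the JVM alive.
        this.pool = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r, "toy-env-worker");
            t.setDaemon(true);
            return t;
        });
    }

    // Explicit creation: the user picks the pool size, but only before any environment exists.
    static synchronized PooledEnvironment create(int threads) {
        if (instance != null) {
            throw new IllegalStateException("environment already exists");
        }
        instance = new PooledEnvironment(threads);
        return instance;
    }

    // Implicit creation: falls back to a default single-threaded pool on first use.
    static synchronized PooledEnvironment get() {
        if (instance == null) {
            instance = new PooledEnvironment(1);
        }
        return instance;
    }

    int threads() {
        return threads;
    }

    PooledSession createSession(String name) {
        return new PooledSession(name, pool);
    }
}

// Hypothetical session that borrows the environment's pool but does not own it.
final class PooledSession implements AutoCloseable {
    private final String name;
    private final ExecutorService sharedPool;
    private boolean closed;

    PooledSession(String name, ExecutorService sharedPool) {
        this.name = name;
        this.sharedPool = sharedPool;
    }

    // Stand-in for model execution: doubles the input on a shared worker thread.
    int run(int x) {
        if (closed) {
            throw new IllegalStateException("session " + name + " is closed");
        }
        Callable<Integer> task = () -> x * 2;
        try {
            return sharedPool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    @Override
    public void close() {
        closed = true; // the shared pool is intentionally left running
    }
}
```

An API that only exposes the equivalent of `get()` gives the implicit, Python-like shape; exposing `create(...)` as well is what gives users the resource control described above.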

-------------

PR Comment: https://git.openjdk.org/babylon/pull/332#issuecomment-2694638609