Python driver for torch model #3093

chentong319 · 2025-03-11T02:01:57Z

This PR provides a Python package for running a torch model with onnx-mlir. The existing onnxmlir package provides the basic functionality for compiling and executing an ONNX model.
This package focuses on providing a hook into a torch model, exporting the torch model to ONNX, and caching the history to reduce redundant exporting, compiling, and .so loading.
This PR supports two ways of using a torch model:

Wrap the torch model directly with this package. E.g.
```
 mod = ONNXMLIRTorch(mod)
 result = mod(inputs)
```
When a torch model is compiled with torch.compile, this package can be used as a custom backend. E.g.
```
mod = torch.compile(mod, backend=onnxmlirtorch.onnxmlir_backend)
result = mod(inputs)
```

I also added support for debugging: the forward call can be intercepted, with or without torch.compile, to just print out the parameters while the original forward() function is still used as the inference engine. In another PR, I will expand this functionality with torch.onnx.export or torch.dynamo.export to generate an ONNX model for models used in complicated environments (such as the Hugging Face framework).
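To illustrate the debugging idea, here is a minimal sketch of intercepting a forward call so that it prints its parameters and then delegates to the original forward(). It is not the code in this PR: `intercept_forward` and `TinyModel` are hypothetical names, and `TinyModel` is a plain-Python stand-in for a torch module so that the sketch runs without torch installed.

```python
import functools


def intercept_forward(module, log=print):
    """Wrap module.forward so each call logs its arguments, then runs the original.

    Hypothetical helper for illustration; not part of onnxmlirtorch.
    """
    original_forward = module.forward

    @functools.wraps(original_forward)
    def logged_forward(*args, **kwargs):
        for i, arg in enumerate(args):
            # Tensors expose .shape; anything else is reported with shape=None.
            shape = getattr(arg, "shape", None)
            log(f"arg[{i}]: type={type(arg).__name__}, shape={shape}")
        for name, value in kwargs.items():
            shape = getattr(value, "shape", None)
            log(f"kwarg[{name}]: type={type(value).__name__}, shape={shape}")
        # The original forward() remains the inference engine.
        return original_forward(*args, **kwargs)

    # Shadow the class-level forward with the logging wrapper on this instance.
    module.forward = logged_forward
    return module


class TinyModel:
    """Stand-in for a torch module: calling the object dispatches to forward()."""

    def forward(self, x, scale=1):
        return [v * scale for v in x]

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)


if __name__ == "__main__":
    model = intercept_forward(TinyModel())
    model([1, 2], scale=3)
```

Because the wrapper only replaces the instance's forward attribute, the model's own computation is untouched, which is what makes this kind of hook useful for inspecting inputs inside a larger framework.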

Details can be found in the comment at the beginning of onnxmlirtorch.py.

The test cases in onnxmlirtorch/tests pass on an x86 machine with a locally built compiler.

Known issues:

The cache is not used for torch.compile. torch may optimize or change the model according to the inputs, and checking whether the model is still the same is not implemented yet. With the first approach, using the model directly, the model is assumed to remain the same.
The inputs are assumed to be tensors, although they could be other types.
The cache works on the constant shapes of each inference. There is no effort yet to detect dynamic dimensions.
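The shape limitation in the last item can be sketched as a cache keyed by the exact input shapes. This is an assumption about the general technique, not the implementation in onnxmlirtorch.py: `ShapeKeyedCache`, `compile_fn`, `max_entries`, and `FakeTensor` are hypothetical names, and the least-recently-used eviction is added only for illustration.

```python
from collections import OrderedDict


class FakeTensor:
    """Stand-in for a tensor: only the .shape attribute matters here."""

    def __init__(self, *shape):
        self.shape = shape


class ShapeKeyedCache:
    """Cache compiled sessions keyed by the exact shapes of the inputs."""

    def __init__(self, compile_fn, max_entries=4):
        # compile_fn is hypothetical: it stands for export + compile + .so load.
        self._compile_fn = compile_fn
        self._max_entries = max_entries
        self._entries = OrderedDict()

    @staticmethod
    def signature(inputs):
        # Every dimension is part of the key, so a new batch size is a cache miss.
        return tuple(tuple(x.shape) for x in inputs)

    def get(self, inputs):
        key = self.signature(inputs)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        session = self._compile_fn(key)
        self._entries[key] = session
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return session
```

With a key like this, inputs of shape (1, 3) and (2, 3) produce two separate compilations, even though a single model with a dynamic first dimension could serve both. Detecting that case is the missing piece the known issue describes.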

I know TRL has done lots of work in this area. Comments and suggestions are welcome!

Signed-off-by: Chen Tong <[email protected]>

tungld

LGTM!

Really appreciate the work here to make onnx-mlir work seamlessly with pytorch! Thank you very much!

chentong319 · 2025-03-24T02:37:29Z

@tungld I will add some changes to support calling the compiler container from the torch container. I just found that I need to add a tag to the shared library even though the session is not reused.

Signed-off-by: Chen Tong <[email protected]>

Sunny-Anand

Some minor comments; overall looks good. Good work making the PyTorch to ONNX model conversion simpler. Thanks for the PR.

Sunny-Anand · 2025-03-24T18:19:43Z

src/Runtime/python/onnxmlirtorch/pyproject.toml

+]
+description = "Python driver to compile/run torch model with onnx-mlir"
+readme = "README.md"
+requires-python = ">=3.8"


With Python 3.8 having reached end of life (https://devguide.python.org/versions/), should we require 3.9 or above from the start?

Sunny-Anand · 2025-03-24T18:21:47Z

src/Runtime/python/onnxmlirtorch/README.md

+At top of onnx-mlir: `pip3 install -e src/Runtime/python/onnxmlirtorch`
+
+### Install from repo
+After the package is uploaded to pip server, you can install with 'pip3 install onnxmlirtorch`


Are we planning to provide this as part of the pyruntime pip package support?

Yes, we will do that later when the package becomes stable. Currently, I test with a local development install: pip install -e mypath/onnxmlirtorch

Sunny-Anand · 2025-03-24T18:27:58Z