Description
In the ONNX Runtime backend of Triton Server, attempting to load an ONNX model with OpenVINO as the CPU execution accelerator fails: the server reports that no device named "NPU" is registered in the OpenVINO Runtime. This happens even though the model configuration explicitly sets the `device_type` parameter to "CPU", as documented here.
Triton Information
Triton Server Version: 2.55.0 (Docker image: nvcr.io/nvidia/tritonserver:25.02-py3)
Backend: ONNX Runtime with OpenVINO execution accelerator
To Reproduce
Set up Triton Server using the Docker image nvcr.io/nvidia/tritonserver:25.02-py3.
Use an ONNX model and add the following OpenVINO CPU optimization settings to its config.pbtxt file:
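The exact snippet from the original report did not survive extraction; a configuration along these lines (reconstructed from the Triton optimization documentation for CPU execution accelerators, with the `device_type` value the report describes) reproduces the failure:

```protobuf
optimization {
  execution_accelerators {
    cpu_execution_accelerator : [
      {
        name : "openvino"
        # Explicitly request the CPU device, per the report;
        # the error still references an "NPU" device.
        parameters { key: "device_type" value: "CPU" }
      }
    ]
  }
}
```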
Launch Triton Server. The model fails to load, and the server logs the following error:
I0319 23:51:53.570075 144 model_lifecycle.cc:473] "loading: onnx_model:1"
I0319 23:51:53.572942 144 onnxruntime.cc:2900] "TRITONBACKEND_Initialize: onnxruntime"
I0319 23:51:53.572971 144 onnxruntime.cc:2910] "Triton TRITONBACKEND API version: 1.19"
I0319 23:51:53.572984 144 onnxruntime.cc:2916] "'onnxruntime' TRITONBACKEND API version: 1.19"
I0319 23:51:53.572997 144 onnxruntime.cc:2946] "backend configuration:\n{\"cmdline\":{\"auto-complete-config\":\"false\",\"backend-directory\":\"/opt/tritonserver/backends\",\"device_type\":\"CPU\",\"min-compute-capability\":\"6.000000\",\"default-max-batch-size\":\"4\"}}"
I0319 23:51:53.594733 144 onnxruntime.cc:3011] "TRITONBACKEND_ModelInitialize: onnx_model (version 1)"
I0319 23:51:53.595868 144 onnxruntime.cc:3076] "TRITONBACKEND_ModelInstanceInitialize: onnx_model_0_0 (CPU device 0)"
2025-03-19 23:51:53.610138949 [W:onnxruntime:log, openvino_provider_factory.cc:209 operator()] Empty OV Config Map passed. Skipping load_config option parsing.
2025-03-19 23:51:53.746170745 [E:onnxruntime:, inference_session.cc:2117 operator()] Exception during initialization: Exception from src/inference/src/cpp/core.cpp:266:
Exception from src/inference/src/dev/core_impl.cpp:609:
Device with "NPU" name is not registered in the OpenVINO Runtime
I0319 23:51:53.756724 144 onnxruntime.cc:3128] "TRITONBACKEND_ModelInstanceFinalize: delete instance state"
E0319 23:51:53.756980 144 backend_model.cc:692] "ERROR: Failed to create instance: onnx runtime error 6: Exception during initialization: Exception from src/inference/src/cpp/core.cpp:266:\nException from src/inference/src/dev/core_impl.cpp:609:\nDevice with \"NPU\" name is not registered in the OpenVINO Runtime\n\n"
I0319 23:51:53.757025 144 onnxruntime.cc:3052] "TRITONBACKEND_ModelFinalize: delete model state"
E0319 23:51:53.757088 144 model_lifecycle.cc:654] "failed to load 'onnx_model' version 1: Internal: onnx runtime error 6: Exception during initialization: Exception from src/inference/src/cpp/core.cpp:266:\nException from src/inference/src/dev/core_impl.cpp:609:\nDevice with \"NPU\" name is not registered in the OpenVINO Runtime\n\n"
I0319 23:51:53.757130 144 model_lifecycle.cc:789] "failed to load 'onnx_model'"
I0319 23:51:53.757248 144 server.cc:604]
Despite the machine having no NPU and the configuration explicitly specifying "CPU" as the device type, the runtime still tries to use an "NPU" device, causing this error.
Expected behavior
The ONNX model should load successfully, with OpenVINO accelerating inference on the CPU. No errors related to an "NPU" device should occur.