Description
Hi, I’m deploying a policy trained in Isaac Lab using RSL-RL, and I have a question regarding the ONNX policy output.
In my task configuration, I used an action_scale (e.g., action_scale = 2.0) when defining the action space.
After training, I exported the policy to ONNX format and tested it with observation inputs. I found that the ONNX model's outputs exceed the [-1, 1] range, sometimes reaching values like 1.5 or -2.0.
So I’d like to confirm:
Does the ONNX policy output already include the action_scale from the Isaac Lab task config? Or is the output still the raw (unscaled) action, meaning I need to multiply it by action_scale myself in the Sim2Real deployment code?
This clarification would help me understand whether I need to post-process the output before sending commands to the real robot.
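To make the question concrete, this is the kind of post-processing I would add on the real-robot side if the ONNX output turns out to be unscaled. It's only a sketch of my assumption: that the action term computes `target = default_joint_pos + action_scale * raw_action`, and `ACTION_SCALE`, `DEFAULT_JOINT_POS`, and `postprocess` are placeholder names, not Isaac Lab APIs.

```python
import numpy as np

# Placeholder values -- substitute the ones from your task config.
ACTION_SCALE = 2.0
DEFAULT_JOINT_POS = np.zeros(12)  # nominal joint positions of the robot

def postprocess(raw_action: np.ndarray) -> np.ndarray:
    """Map a raw policy output to joint position targets,
    assuming the convention target = default_pos + scale * action.
    Only needed if the exported ONNX model does NOT already
    bake in the action scaling."""
    return DEFAULT_JOINT_POS + ACTION_SCALE * raw_action
```

If the answer is that the ONNX export already applies the scale, I would drop this step entirely and send the model output to the robot as-is.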
Thanks for your help!