Description
System Info
lerobot version: latest main
OS version: Ubuntu 22.04
Torch version: 2.6.0
Information
- One of the scripts in the examples/ folder of LeRobot
- My own task or dataset (give details below)
Reproduction
policy = Pi0Policy.from_pretrained("lerobot/pi0")
action = policy.select_action(batch)
Expected behavior
Thanks for implementing Pi0! I gave it a shot and observed good results using my robot arm.
On closer inspection of the code, I found that the default normalization mode for input images is identity:
The policy wrapper also assumes that, under an identity transform, the input images are already in the range [0, 1], see:
This looks like an important detail that was overlooked. I believe the default visual normalization mode should be MIN_MAX, so that images are not scaled incorrectly before being passed to SigLIP.
Alternatively, an assertion could be added before the images are prepared for the backbone.
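To illustrate the assertion idea, here is a minimal sketch of a range check placed in front of the image scaling step. The helper names `check_unit_range` and `to_siglip_range` are hypothetical and are not part of LeRobot, and the `[0, 1]` to `[-1, 1]` rescaling is my assumption about what the backbone preprocessing does:

```python
import torch


def check_unit_range(images: torch.Tensor, eps: float = 1e-6) -> None:
    """Hypothetical guard: raise if a float image batch is not within [0, 1]."""
    if not torch.is_floating_point(images):
        raise TypeError(f"expected a float tensor, got {images.dtype}")
    lo, hi = images.min().item(), images.max().item()
    if lo < -eps or hi > 1.0 + eps:
        raise ValueError(
            "images must be in [0, 1] when the visual normalization mode is "
            f"identity, got values in [{lo:.3f}, {hi:.3f}]"
        )


def to_siglip_range(images: torch.Tensor) -> torch.Tensor:
    """Map [0, 1] images to [-1, 1] (assumed input range for the vision backbone)."""
    check_unit_range(images)
    return images * 2.0 - 1.0


# A batch already in [0, 1] passes; a batch in [0, 255] would raise ValueError.
batch_images = torch.rand(1, 3, 224, 224)
scaled = to_siglip_range(batch_images)
```

With a check like this, passing uint8-style images in `[0, 255]` would fail loudly instead of silently producing out-of-range inputs.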