Pi0 visual normalization mode assumes images to be in the range [0,1] #1065

### System Info

```Shell
lerobot version: latest main
OS version: Ubuntu 22.04
Torch version: 2.6.0
```

### Information

- [x] One of the scripts in the examples/ folder of LeRobot
- [ ] My own task or dataset (give details below)

### Reproduction

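```python
from lerobot.common.policies.pi0.modeling_pi0 import PI0Policy

policy = PI0Policy.from_pretrained("lerobot/pi0")
# `batch` is the observation dict (camera images, state, task) built by the caller
action = policy.select_action(batch)
```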

### Expected behavior

Thanks for implementing Pi0! I gave it a shot and got good results on my robot arm.

While reading the code, I noticed that the default normalization mode for input images is identity: https://github.com/huggingface/lerobot/blob/ee5525fea1926a848e0f590a293722b230c15337/lerobot/common/policies/pi0/configuration_pi0.py#L35

With the identity transform, the policy wrapper assumes that the input images are already in the range [0, 1], see: https://github.com/huggingface/lerobot/blob/ee5525fea1926a848e0f590a293722b230c15337/lerobot/common/policies/pi0/modeling_pi0.py#L360

This looks like an important detail that was overlooked: images outside [0, 1] (for example raw values in [0, 255]) would be scaled incorrectly before they reach SigLIP. I believe the default visual normalization mode should be `MIN_MAX` to avoid this. Alternatively, an assertion on the value range could be added before the images are prepared for the backbone.
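To make the assertion idea concrete, here is a minimal sketch of such a range check. The helper name `to_siglip_range` is hypothetical and is not part of LeRobot. The sketch also assumes the image preparation step maps [0, 1] to [-1, 1] with the affine transform `img * 2.0 - 1.0`. It is duck-typed, so it should work with a `torch.Tensor` or any array type that provides `min()`, `max()`, and scalar arithmetic.

```python
def to_siglip_range(img, name="image"):
    """Hypothetical helper (not LeRobot API): validate the range, then rescale.

    Assumes `img` holds pixel values that are supposed to lie in [0, 1] and
    that the vision backbone expects inputs in [-1, 1].
    """
    lo, hi = float(img.min()), float(img.max())
    if lo < 0.0 or hi > 1.0:
        # Fail loudly instead of silently feeding badly scaled pixels downstream.
        raise ValueError(
            f"{name}: expected pixel values in [0, 1], got [{lo}, {hi}]. "
            "Rescale the image (e.g. divide uint8 data by 255) before calling the policy."
        )
    # Affine map from [0, 1] to [-1, 1] (assumed form of the rescaling step).
    return img * 2.0 - 1.0
```

With a float tensor already in [0, 1] the call returns the rescaled tensor, while an image left in [0, 255] raises a `ValueError` rather than producing out-of-range inputs. Scanning the minimum and maximum on every call has a cost, so a check like this may be better suited to a debug flag or a one-time validation of the first batch.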