Pi0 visual normalization mode assumes images to be in the range [0,1] #1065

Open
@atharva-18

Description

System Info

lerobot version: latest main
OS version: Ubuntu 22.04
Torch version: 2.6.0

Information

  • One of the scripts in the examples/ folder of LeRobot
  • My own task or dataset (give details below)

Reproduction

policy = Pi0Policy.from_pretrained("lerobot/pi0")
action = policy.select_action(batch)

Expected behavior

Thanks for implementing Pi0! I gave it a shot and observed good results using my robot arm.

Upon further inspection of your code, I found out that the default normalization mode for input images is identity:

"VISUAL": NormalizationMode.IDENTITY,

The policy wrapper then assumes that, under the identity transform, the input images are already in the range [0, 1]; see:

# Normalize from range [0,1] to [-1,1] as expected by siglip

This looks like an important detail that was overlooked: images supplied in any other range (for example [0, 255]) would be scaled incorrectly before reaching SigLIP. I believe the default visual normalization mode should be MIN_MAX to avoid this. Alternatively, we could add an assertion on the input range before preparing images for the backbone.
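To make the assertion idea concrete, here is a minimal sketch of what such a guard could look like. It is not LeRobot code: `scale_to_siglip_range` is a hypothetical helper, and a plain Python list of floats stands in for an image tensor so the example has no dependencies. It only assumes the [0, 1] to [-1, 1] mapping described in the quoted comment.

```python
def scale_to_siglip_range(pixels):
    """Map pixel values from [0, 1] to [-1, 1], rejecting out-of-range input.

    Hypothetical helper for illustration; `pixels` is a flat list of floats
    standing in for an image tensor.
    """
    lo, hi = min(pixels), max(pixels)
    # Guard: the x * 2 - 1 mapping is only valid for inputs already in [0, 1].
    assert 0.0 <= lo and hi <= 1.0, (
        f"expected image values in [0, 1], got [{lo}, {hi}]"
    )
    return [x * 2.0 - 1.0 for x in pixels]


# Correctly scaled input lands in [-1, 1].
print(scale_to_siglip_range([0.0, 0.5, 1.0]))  # [-1.0, 0.0, 1.0]

# Unscaled uint8-style input would silently map to [-1, 509] without the guard.
try:
    scale_to_siglip_range([0.0, 128.0, 255.0])
except AssertionError as err:
    print(err)
```

With a check like this, a caller passing images in [0, 255] would fail loudly instead of feeding badly scaled values to the vision backbone. In a real policy the same check would presumably run on the batched image tensors.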

Metadata

    Labels

    bug: Something isn’t working correctly
    policies: Items related to robot policies
