pi0-fintune-performance #427


Open
yanghb1 opened this issue Apr 9, 2025 · 6 comments
@yanghb1
yanghb1 commented Apr 9, 2025

I have been fine-tuning the provided pi0-base model on my dataset using LeRobot. After training for 100,000 steps, I found that the model performs well on tasks that appeared in my dataset, but its performance on unseen tasks is very poor. It seems to lack the generalization ability of a VLA model. Is this phenomenon normal? Are there any strategies to improve this situation?

@uzhilinsky uzhilinsky self-assigned this Apr 10, 2025
@uzhilinsky
Collaborator

We've certainly been able to fine-tune pi0-base successfully on novel tasks. It's hard to say what's going on in your case without more context.

Just in case, are you using the right norm stats?
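One quick sanity check for the norm-stats question is to recompute per-dimension statistics from your own dataset and compare them with the stats bundled with the checkpoint. A minimal sketch, assuming hypothetical names (`compute_norm_stats`, the toy `trajs` data) rather than LeRobot's actual API:

```python
# Hypothetical sanity-check sketch (not LeRobot's actual API): recompute
# per-dimension action statistics from your own dataset and compare them
# with the norm stats the checkpoint was trained with. A large mismatch
# means the policy sees out-of-distribution inputs after normalization.
import numpy as np

def compute_norm_stats(trajectories):
    """trajectories: list of (T_i, action_dim) arrays -> per-dim mean/std."""
    actions = np.concatenate(trajectories, axis=0)
    return {"mean": actions.mean(axis=0), "std": actions.std(axis=0)}

# Toy stand-in for a real dataset: two trajectories of 7-DoF actions.
rng = np.random.default_rng(0)
trajs = [rng.normal(size=(50, 7)), rng.normal(size=(80, 7))]
stats = compute_norm_stats(trajs)
```

If these values differ wildly from the stats file shipped with the pretrained checkpoint, that alone can explain degraded behavior at inference time.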

@yanghb1
Author

yanghb1 commented Apr 10, 2025

Thank you for your patient reply! I am fine-tuning pi0-base in the LeRobot project; the base model is from https://huggingface.co/lerobot/pi0. My training dataset contains 700 collected trajectories for a single task: opening a cabinet door. After training for 100,000 steps, the loss converged to 0.03. At inference, the model can open the cabinet door as in the dataset, but it performs poorly on other instructions, such as grabbing a ballpoint pen or moving the yellow tape to the left of the blue tape, and still tends to open the cabinet door. Is this normal? Are there other ways to improve it? From the paper, I understood that fine-tuning the action expert and state projector on local data should be sufficient for executing various instructions on a local robot arm.
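For reference, the "fine-tune only the expert and state projector" recipe mentioned above amounts to freezing the pretrained VLM backbone. A minimal PyTorch sketch, with hypothetical module names (`vlm_backbone`, `action_expert`) that do not match pi0's real attribute names:

```python
# Sketch of the "fine-tune only the expert / projector" recipe: freeze the
# pretrained VLM backbone and leave the action-side modules trainable.
# Module names here (vlm_backbone, action_expert) are hypothetical stand-ins,
# not pi0's real attribute names; adapt them to the policy class you load.
import torch.nn as nn

class ToyVLA(nn.Module):
    def __init__(self):
        super().__init__()
        self.vlm_backbone = nn.Linear(16, 16)  # stands in for the VLM
        self.action_expert = nn.Linear(16, 7)  # stays trainable

def freeze_backbone(model, frozen_prefixes=("vlm_backbone",)):
    """Disable gradients for frozen modules; return trainable param names."""
    for name, param in model.named_parameters():
        if name.startswith(frozen_prefixes):
            param.requires_grad = False
    return [n for n, p in model.named_parameters() if p.requires_grad]

model = ToyVLA()
trainable = freeze_backbone(model)
```

Keeping the backbone frozen limits how far fine-tuning can pull the model away from its pretrained language grounding, which is one common way to reduce forgetting on a small single-task dataset.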

@uzhilinsky
Collaborator

100,000 steps is a lot for such a small dataset. It's very likely that the model has overfit to your data.

Note that we use 20K or 30K steps in our fine-tuning examples, with much larger datasets. Have you tried training with fewer steps?
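One practical way to act on this advice is to save checkpoints at regular intervals and select by held-out rollout performance rather than training to a fixed step count. A small sketch (the numbers are made up for illustration):

```python
# Sketch of checkpoint selection: rather than training to a fixed 100k steps,
# save checkpoints at intervals, run rollout evaluations on held-out tasks,
# and keep the checkpoint with the best success rate. Training loss alone
# (e.g. converging to 0.03) can keep improving while generalization degrades.

def pick_best_checkpoint(success_rates):
    """success_rates: dict mapping training step -> held-out success rate."""
    return max(success_rates, key=success_rates.get)

# Hypothetical eval results from checkpoints saved every 10k steps.
success_rates = {10_000: 0.40, 20_000: 0.70, 30_000: 0.65, 100_000: 0.20}
best_step = pick_best_checkpoint(success_rates)  # more steps != better
```

The mechanics are trivial; the point is that the selection signal should be task success on instructions outside the fine-tuning set, not the training loss.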

@yanghb1
Author

yanghb1 commented Apr 10, 2025

Thank you! We also tested checkpoints at 20k–30k steps, and the loss converged to 0.03 there too. Performance was still relatively poor: the model could only execute the task from the fine-tuning dataset. In the normal case, when fine-tuning π₀ on a small single-task dataset, should the trained model be able not only to perform the task from the local dataset but also to execute tasks from your pre-training stage? Does the model retain that generalization capability?

@oxFFFF-Q

Same issue here. Did you solve it?

@YanJiaHuan

Has anyone tried what they suggested, i.e. a larger dataset with fewer training steps?
I tried 200 demos (10–15 s each, 10 Hz) with over 300k training steps on a simple pick-and-place task, and the model performs terribly.
