📖PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability
This is the official repository for PhysVLM. The goal of PhysVLM is to enable Vision-Language Models (VLMs) to understand robotic physical reachability, with the longer-term aim of making the action decisions generated by the model more reliable.
- 2025.03.18: Release the Phys100k-physqa dataset and the model at 🤗HuggingFace.
- 2025.03.12: 🔥 Paper release: 📕Arxiv.
- 2025.03.12: 🔥 Release the benchmark EQA-phys-val-sim.
- 2025.02.27: 🔥 PhysVLM has been accepted to CVPR 2025.
- 🔥 Release the code of PhysVLM.
PhysVLM demonstrates strong performance across reachability understanding, EmbodiedQA, and VQA tasks.
git clone git@github.com:unira-zwj/PhysVLM.git
cd PhysVLM/physvlm-main
pip install -e .
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
| Model | Links |
|---|---|
| PhysVLM-3B | 🤗HuggingFace |
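If you prefer to fetch the checkpoint programmatically, a minimal sketch using huggingface_hub is shown below; the repo id used here is an assumption and should be taken from the 🤗HuggingFace link in the table above.

```python
# Minimal sketch: download the PhysVLM-3B checkpoint to a local folder.
# The repo id "unira-zwj/PhysVLM-3B" is a hypothetical placeholder; replace it
# with the actual id behind the 🤗HuggingFace link above.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="unira-zwj/PhysVLM-3B")  # hypothetical repo id
print(f"Checkpoint downloaded to: {local_dir}")
```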
python start_physvlm_server.py
Then you can send requests to the server, which listens on host 0.0.0.0, port 8001 (started with `app, host="0.0.0.0", port=8001`). For example, run `python inference.py` for easy inference, or `cd eval && python eval_phys_bench_sim.py` for the EQA-phys benchmark evaluation.
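For reference, a minimal client sketch is given below. The route name and payload fields are assumptions, not the repository's documented API; check `start_physvlm_server.py` for the actual route and request schema, or use the bundled `inference.py`, which demonstrates the intended usage.

```python
# Minimal sketch of a client request to the local PhysVLM server.
# The endpoint path ("/generate") and the JSON payload fields are assumptions;
# verify them against start_physvlm_server.py before use.
import base64

import requests

# Encode an input image as base64 so it can be sent in a JSON body.
with open("example.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    "http://0.0.0.0:8001/generate",  # hypothetical route; server runs on port 8001
    json={
        "prompt": "Can the robot arm reach the red cup?",  # example reachability query
        "image": image_b64,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```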
If you find PhysVLM useful for your research and applications, please cite using this BibTeX:
@misc{zhou2025physvlmenablingvisuallanguage,
title={PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability},
author={Weijie Zhou and Manli Tao and Chaoyang Zhao and Haiyun Guo and Honghui Dong and Ming Tang and Jinqiao Wang},
year={2025},
eprint={2503.08481},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2503.08481},
}