Parallel running #107

Boltzmachine · 2025-06-09T18:15:00Z

Hi, I have tried running multiple instances in parallel on one machine (on one or more GPUs), and I ran into what look like issues. Do you or anyone else have experience running multiple instances on the same machine or the same GPU?

(I haven't dug into debugging it yet. I just want to collect some general information rather than ask for help with a specific bug.)

xuanlinli17 · 2025-06-09T18:20:38Z

@StoneT2000

StoneT2000 · 2025-06-09T18:22:26Z

Try this: https://maniskill.readthedocs.io/en/latest/user_guide/getting_started/quickstart.html#additional-gpu-simulation-rendering-customization

You need to set which devices are visible for the simulation process.

Boltzmachine · 2025-06-09T18:27:17Z

Thanks. So is it supposed to work for ManiSkill2? I encountered some bugs that occur randomly when not running on GPU 0.

xuanlinli17 · 2025-06-09T18:40:25Z

This repo (non-parallelized) uses ManiSkill2. Try: DISPLAY="" CUDA_VISIBLE_DEVICES=xx python {cmd}
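For running several instances side by side, the same idea can be scripted. The following is only a minimal sketch, assuming each instance is an independent process and that setting the two environment variables from the command above is enough to pin it. The helper names build_env and launch_on_gpu are hypothetical and are not part of SimplerEnv or ManiSkill2.

```python
import os
import subprocess


def build_env(gpu_id, base_env=None):
    """Return a copy of the environment that exposes only one GPU to CUDA."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)  # restrict CUDA to this device
    env["DISPLAY"] = ""  # run headless, as in the command above
    return env


def launch_on_gpu(cmd, gpu_id):
    """Start cmd (a list of arguments) as a child process pinned to gpu_id."""
    return subprocess.Popen(cmd, env=build_env(gpu_id))


# Example (not run here): one process per GPU, where cmd is your own
# inference command split into a list of arguments.
# procs = [launch_on_gpu(cmd, gpu_id) for gpu_id in (0, 1)]
# for proc in procs:
#     proc.wait()
```

Building the environment per child, rather than exporting the variables in the parent shell, keeps each instance's device setting separate from the others.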

Boltzmachine · 2025-06-09T20:31:29Z

When I try

DISPLAY="" CUDA_VISIBLE_DEVICES=N  python simpler_env/main_inference.py --policy-model spatialvla --ckpt-path IPEC-COMMUNITY/spatialvla-4b-224-pt --action-ensemble-temp -0.8 --logging-dir results/spatialvla-4b-224-pt-0.8 --robot google_robot_static --control-freq 3 --sim-freq 513 --max-episode-steps 113 --env-name OpenTopDrawerCustomInScene-v0 --scene-name frl_apartment_stage_simple --robot-init-x 0.65 0.85 3 --robot-init-y -0.2 0.2 3 --robot-init-rot-quat-center 0 0 0 1 --robot-init-rot-rpy-range 0 0 1 0 0 1 0.0 0.0 1 --obj-init-x-range 0 0 1 --obj-init-y-range 0 0 1 --enable-raytracing

I always see some GPU memory occupied on GPUs 0, N-1, and N.
For example, if I set CUDA_VISIBLE_DEVICES=7, I will get this

Also, sorry, I am actually using SimplerEnv-OpenVLA.

Boltzmachine · 2025-06-09T20:47:10Z

I found a more precise cause:

CUDA_VISIBLE_DEVICES=7 ipython
import sapien
sapien.core.SapienRenderer()

You will see GPU memory occupied on GPUs 0 and 6.

I use sapien==2.2.2
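To make the observation above easier to reproduce, one option is to compare per-GPU memory before and after creating the renderer. This is a rough sketch, assuming nvidia-smi is on the PATH and reports memory as plain integers. The function names and the 50 MiB threshold are hypothetical choices for illustration.

```python
import subprocess

# Standard nvidia-smi query flags: one "index, used MiB" row per GPU.
QUERY = ["nvidia-smi", "--query-gpu=index,memory.used",
         "--format=csv,noheader,nounits"]


def parse_memory_used(csv_text):
    """Parse 'index, memory.used' CSV rows into {gpu_index: MiB used}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        index, used = (field.strip() for field in line.split(","))
        usage[int(index)] = int(used)
    return usage


def gpus_that_grew(before, after, threshold_mib=50):
    """Return GPU indices whose memory use rose by more than threshold_mib."""
    return sorted(i for i in after if after[i] - before.get(i, 0) > threshold_mib)


def snapshot():
    """Query nvidia-smi once (requires an NVIDIA driver on the machine)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_memory_used(out)
```

Calling snapshot() once before and once after sapien.core.SapienRenderer() and passing both results to gpus_that_grew() would list the physical GPUs that gained memory. Note that nvidia-smi reports physical indices regardless of CUDA_VISIBLE_DEVICES, and other jobs on the machine can affect the numbers.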
