optimization when lack of CPU #97

Open
LukeLIN-web opened this issue May 19, 2025 · 1 comment

@LukeLIN-web

top - 15:41:43 up 2 days,  6:18,  8 users,  load average: 148.76, 159.89, 162.12
Tasks: 1130 total,   2 running, 1128 sleeping,   0 stopped,   0 zombie
%Cpu(s): 98.3 us,  1.6 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 257602.6 total,  95593.6 free,  27249.9 used, 134759.2 buff/cache
MiB Swap:   4096.0 total,   4096.0 free,      0.0 used. 228307.0 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
3439911           20   0   39.7g   3.9g   1.0g R  1302   1.6  23:52.89 pt_main_thread
3412216           20   0   40.0g   4.0g   1.0g S  1072   1.6 131:49.12 pt_main_thread
3372667           20   0   40.6g   4.7g   1.0g S 162.7   1.9 290:17.24 pt_main_thread
3444674           20   0 9892.2m 134088  37632 S 121.9   0.1   0:03.73 ffmpeg

I launched 4 Google robot evaluations using run_openvla.sh to evaluate 4 checkpoints simultaneously.
However, my 96-core machine is running out of CPU resources.
Do you have any suggestions for improving performance? Thank you!

@xuanlinli17
Collaborator

xuanlinli17 commented May 20, 2025

It looks like your model is running on the CPU instead of the GPU. When that happens, inference consumes a lot of CPU resources (and the env itself is slow). Pay attention to the TensorFlow / JAX info printed to the command line when the evaluation starts.
