Issues: vllm-project/vllm

[Roadmap] vLLM Roadmap Q2 2025
#15735 opened Mar 29, 2025 by simon-mo
Open 1
[V1] Feedback Thread
#12568 opened Jan 30, 2025 by simon-mo
Open 83

Issues list

[Bug]: invalid responses when generating yaml format (bug)
#16269 opened Apr 8, 2025 by Glebbot

[RFC]: TPU V1 Sampler planning (RFC)
#16268 opened Apr 8, 2025 by NickLucche

[Bug]: Not supporting CUDA12.8 (bug)
#16267 opened Apr 8, 2025 by liurui416

[Performance]: H100 Optimisation Configuration For Offline Inferencing (performance)
#16265 opened Apr 8, 2025 by mohanajuhi166

[Feature]: ray logs too large (feature request)
#16262 opened Apr 8, 2025 by ErykCh

[Bug]: vLLM still runs after Ray workers crash (bug)
#16259 opened Apr 8, 2025 by ccdumitrascu

[Usage]: The performance of ngram speculative decoding (usage)
#16258 opened Apr 8, 2025 by dtransposed

[Bug]: Problem Load llama3.2-11B-Vision-Instruct-INT4-GPTQ (bug)
#16254 opened Apr 8, 2025 by fahadh4ilyas

[Usage]: How to use xPyD disaggregated prefilling (usage)
#16253 opened Apr 8, 2025 by leoyuppieqnew

[Usage]: Async generate with offline LLM interface (usage)
#16251 opened Apr 8, 2025 by SparkJiao

[Performance]: qwen2.5vl preprocess videos very slow after several batches (performance)
#16249 opened Apr 8, 2025 by Zooy138

[New Model]: efficient-speech/lite-whisper-large-v3
#16244 opened Apr 8, 2025 by JakubCerven

[Usage]: Failed to get global TPU topology. (usage)
#16243 opened Apr 8, 2025 by adityarajsahu

[Usage]: ERROR:root:Compiled DAG task exited with exception (usage)
#16242 opened Apr 8, 2025 by vrascal

[Bug]: LLM.beam_search Doesn't Pass Multimodal Data (bug)
#16240 opened Apr 8, 2025 by alex-jw-brooks

[Bug]: how to use tests/distributed/test_custom_all_reduce.py (bug)
#16238 opened Apr 8, 2025 by zhink

[Usage]: Multiple Models on Same Port (usage)
#16232 opened Apr 8, 2025 by dipta007

[Bug]: failed to load deepseek-r1 AWQ quantization on CPU (bug)
#16230 opened Apr 8, 2025 by spaceater