Issues: vllm-project/vllm
[Bug]: vLLM should prevent setting max_model_len < local attention size for Llama-4 models
bug · #16274 · opened Apr 8, 2025 by eldarkurtic

[Bug]: invalid responses when generating yaml format
bug · #16269 · opened Apr 8, 2025 by Glebbot

[Bug]: Not supporting CUDA12.8
bug · #16267 · opened Apr 8, 2025 by liurui416

[Performance]: H100 Optimisation Configuration For Offline Inferencing
performance · #16265 · opened Apr 8, 2025 by mohanajuhi166

[Feature]: ray logs too large
feature request · #16262 · opened Apr 8, 2025 by ErykCh

[Performance]: FP8 does not demonstrate an inference speed superior to that of FP16
performance · #16261 · opened Apr 8, 2025 by Shuai-Xie

[Feature]: Will you add padding for intermediate_size just like lmdeploy?
feature request · #16260 · opened Apr 8, 2025 by Einsturing

[Bug]: vLLM still runs after Ray workers crash
bug · #16259 · opened Apr 8, 2025 by ccdumitrascu

[Usage]: The performance of ngram speculative decoding
usage · #16258 · opened Apr 8, 2025 by dtransposed

[Bug]: Problem Load llama3.2-11B-Vision-Instruct-INT4-GPTQ
bug · #16254 · opened Apr 8, 2025 by fahadh4ilyas

[Usage]: How to use xPyD disaggregated prefilling
usage · #16253 · opened Apr 8, 2025 by leoyuppieqnew

[Usage]: Async generate with offline LLM interface
usage · #16251 · opened Apr 8, 2025 by SparkJiao

[Usage]: how to set vLLM message queue communication handle's connect_ip to 127.0.0.1
usage · #16250 · opened Apr 8, 2025 by FanYaning

[Performance]: qwen2.5vl preprocess videos very slow after several batches
performance · #16249 · opened Apr 8, 2025 by Zooy138

[Bug]: OPEA/Mistral-Small-3.1-24B-Instruct-2503-int4-AutoRound-awq-sym, VLLM Chat error :- can only concatenate str (not "list") to str
bug · #16245 · opened Apr 8, 2025 by Karan-i3

[New Model]: efficient-speech/lite-whisper-large-v3
#16244 · opened Apr 8, 2025 by JakubCerven

[Usage]: Failed to get global TPU topology.
usage · #16243 · opened Apr 8, 2025 by adityarajsahu

[Usage]: ERROR:root:Compiled DAG task exited with exception
usage · #16242 · opened Apr 8, 2025 by vrascal

[Bug]: LLM.beam_search Doesn't Pass Multimodal Data
bug · #16240 · opened Apr 8, 2025 by alex-jw-brooks

[Bug]: how to use tests/distributed/test_custom_all_reduce.py
bug · #16238 · opened Apr 8, 2025 by zhink

[Bug]: Calling /wake_up after /sleep and then sending a request leads to improper LLM response
bug · #16234 · opened Apr 8, 2025 by akshayqylis

[Usage]: Multiple Models on Same Port
usage · #16232 · opened Apr 8, 2025 by dipta007

[Feature]: Support Pipeline Parallelism on Llama-4-Maverick-17B-128E
feature request · #16231 · opened Apr 8, 2025 by Edwinhr716

[Bug]: failed to load deepseek-r1 AWQ quantization on CPU
bug · #16230 · opened Apr 8, 2025 by spaceater