[Roadmap] vLLM Roadmap Q1 2025 #11862
Comments
Will vLLM consider optimizing communication operations such as all-gather/all-reduce through 4-bit or 8-bit quantization?
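For context, the idea being asked about is shrinking collective-communication payloads by quantizing them before the transfer. The sketch below is not vLLM code; it illustrates a per-tensor int8 all-gather with plain torch.distributed, and `quantized_all_gather` is a hypothetical helper name. (All-reduce is harder to quantize this way, because the reduction itself should still happen in higher precision.)

```python
import torch
import torch.distributed as dist

def quantized_all_gather(x: torch.Tensor, world_size: int) -> torch.Tensor:
    """All-gather a float tensor as int8 payloads plus per-tensor scales (sketch only)."""
    # Symmetric per-tensor quantization to int8.
    scale = (x.abs().max().clamp(min=1e-8) / 127.0).reshape(1)
    q = torch.clamp((x / scale).round(), -127, 127).to(torch.int8)

    # Gather the int8 payloads: 2x less traffic than fp16/bf16, 4x less than fp32.
    q_list = [torch.empty_like(q) for _ in range(world_size)]
    dist.all_gather(q_list, q)

    # Gather the scales so every rank can dequantize every shard.
    s_list = [torch.empty_like(scale) for _ in range(world_size)]
    dist.all_gather(s_list, scale)

    # Dequantize and concatenate along the last (sharded) dimension.
    return torch.cat([qi.to(x.dtype) * si for qi, si in zip(q_list, s_list)], dim=-1)
```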
When will V1 support FP8 KV cache?
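As a point of reference, an FP8 KV cache can already be requested through the existing engine argument in current vLLM releases; the open question here is coverage under the V1 engine. A hedged usage sketch (the model name is just an example, and whether this path is honored by V1 depends on the version):

```python
from vllm import LLM, SamplingParams

# Request an 8-bit floating-point KV cache via the existing engine argument.
llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example model; substitute your own
    kv_cache_dtype="fp8",
)
outputs = llm.generate(["The capital of France is"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```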
Will vLLM consider supporting sparse attention methods such as StreamingLLM and H2O?
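For readers unfamiliar with those methods: StreamingLLM keeps a few initial "attention sink" tokens plus a sliding window of recent tokens, while H2O evicts tokens with low accumulated attention scores. The helper below is a hypothetical illustration of the StreamingLLM retention policy, not vLLM code.

```python
def streaming_llm_keep_positions(seq_len: int, num_sink: int = 4, window: int = 1024) -> list[int]:
    """KV-cache positions a StreamingLLM-style policy would retain (illustration only)."""
    if seq_len <= num_sink + window:
        return list(range(seq_len))                      # nothing to evict yet
    sinks = list(range(num_sink))                        # initial "attention sink" tokens
    recent = list(range(seq_len - window, seq_len))      # sliding window of recent tokens
    return sinks + recent                                # everything in between is evicted
```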
Does vLLM have plans to optimize host operations? Currently, both scheduling and sampling are handled on the host, which reduces GPU utilization. Is it possible to pipeline the scheduling, model execution, and post-processing operations to improve efficiency?
[Figure: Three-Stage Pipeline Timeline]
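To make the pipelining suggestion concrete, here is a hypothetical sketch of overlapping host-side scheduling and post-processing with GPU execution; `schedule`, `run_model`, and `postprocess` are placeholders, not vLLM APIs.

```python
from concurrent.futures import ThreadPoolExecutor

def engine_loop(schedule, run_model, postprocess, num_steps: int):
    """Overlap host work with GPU execution instead of running the phases serially:
    while step N runs on the GPU, schedule step N+1 and post-process step N-1."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        pending_post = None                    # post-processing future from the previous step
        batch = schedule()                     # prepare the first batch on the host
        for _ in range(num_steps):
            next_batch = pool.submit(schedule)   # host: schedule the next step
            outputs = run_model(batch)           # GPU: execute the current step
            if pending_post is not None:
                pending_post.result()            # drain the previous step's post-processing
            pending_post = pool.submit(postprocess, outputs)
            batch = next_batch.result()
        if pending_post is not None:
            pending_post.result()
```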
This page is accessible via roadmap.vllm.ai
This is a living document! For each item here, we intend to link the RFC as well as the discussion channel in the vLLM Slack.
vLLM Core
These projects will deliver performance enhancements to the majority of workloads running on vLLM, and the core team has assigned priorities to signal what must get done. Help is also wanted here, especially from people who want to get more involved in the core of vLLM.
Ship a performant and modular V1 architecture (#8779, #sig-v1)
Support large and long context models
Improved performance in batch mode
Others
Model Support
transformers backend support (#11330)
Hardware Support
Optimizations
CI and Developer Productivity
Ecosystem Projects
These are independent projects that we would love to have native collaboration and integration with!
If any item you want is not on the roadmap, your suggestions and contributions are strongly welcomed! Please feel free to comment in this thread, open a feature request, or create an RFC.
Historical Roadmap: #9006, #5805, #3861, #2681, #244