March 08, 2026

vLLM 0.17 Ships FlashAttention 4 and Live MoE Scaling

6 minutes

vLLM v0.17.0 adds FlashAttention 4, elastic expert parallelism for live MoE rescaling, full Qwen3.5 support, and a performance-mode flag, all in 699 commits from 272 contributors.

By Awesome Agents
