Pull requests: vllm-project/vllm

Pull requests list

Add test for deep gemm matmul
#13932 opened Feb 26, 2025 by bnellnm Draft
[V1] EP + DP Attention WIP
#13931 opened Feb 26, 2025 by tlrmchlsmth Draft
[misc] Rename Ray ADAG to Compiled Graph
#13928 opened Feb 26, 2025 by ruisearch42 (labels: ready)
Add RELEASE.md
#13926 opened Feb 26, 2025 by atalman (labels: documentation)
[V1] AsyncLLM data parallel WIP v1
#13923 opened Feb 26, 2025 by njhill Draft
[ROCm][V1] Update reshape_and_cache to properly work with CUDA graph padding
#13922 opened Feb 26, 2025 by SageMoore (labels: ready)
Fix test_block_fp8.py test for MoE
#13915 opened Feb 26, 2025 by mgoin
Upgrade transformers to v4.49.0
#13905 opened Feb 26, 2025 by hmellor (labels: ci/build, documentation, ready)
Fix TPU CI
#13898 opened Feb 26, 2025 by mgoin (labels: ci/build)
Fix mla prefill context performance
#13897 opened Feb 26, 2025 by ZhongYingMatrix
XGRAMMAR now support aarch64
#13894 opened Feb 26, 2025 by johnnynunez (labels: structured-output)
Use smaller embedding model when not testing model specifically
#13891 opened Feb 26, 2025 by hmellor (labels: ready)
[PP] Correct cache size check
#13873 opened Feb 26, 2025 by zhengy001