[V1][Pixtral-HF] Add custom `slice_encoder_output` for Pixtral #13080

lk-chen · 2025-02-11T09:01:15Z

Prepare for #11409

This PR allows a model to override the `_gather_encoder_outputs` logic. Models like Pixtral need to take special tokens (break/end tokens) into account when gathering soft tokens.

See this comment for more context.
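To illustrate why plain slicing is not enough here, below is a minimal sketch rather than the code in this PR. It assumes a Pixtral-style placeholder layout in which every row of image patches is followed by one special position (a break token, or the end token after the last row), and that the encoder returns one row per patch only. The function names, the signature of `slice_encoder_output`, and the `is_patch` mask are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def build_is_patch_mask(num_rows: int, num_cols: int) -> List[bool]:
    """Mark placeholder positions: True for image patches, False for break/end tokens.

    Assumed layout: each row of `num_cols` patches is followed by one special token.
    """
    mask: List[bool] = []
    for _ in range(num_rows):
        mask.extend([True] * num_cols)
        mask.append(False)  # row break, or image end after the last row
    return mask


def slice_encoder_output(
    encoder_output: Sequence,
    is_patch: Sequence[bool],
    start: int,
    end: int,
) -> Sequence:
    """Return the encoder rows that belong to placeholder positions [start, end).

    `start` and `end` index placeholder positions (patches plus special tokens),
    while `encoder_output` only has rows for patches, so the range is translated
    by counting patch positions.
    """
    # Fail loudly if the mask and the encoder output disagree.
    assert sum(is_patch) == len(encoder_output), "mask/encoder output mismatch"
    assert 0 <= start <= end <= len(is_patch), "range outside the placeholder"

    first = sum(is_patch[:start])     # patches before the requested range
    count = sum(is_patch[start:end])  # patches inside the requested range
    return encoder_output[first:first + count]
```

For a 2x3 image the mask has 8 positions but the encoder output has only 6 rows, so a naive `encoder_output[2:6]` would return the wrong rows, whereas `slice_encoder_output(rows, mask, 2, 6)` skips the break position and returns rows 2 to 4.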

Test

Tested by running

VLLM_USE_V1=1 python examples/offline_inference/vision_language.py --model pixtral_hf --num-prompts=2

with #11409 patched. The results are the same as on V0.

cc @WoosukKwon @comaniac @ywang96

github-actions · 2025-02-11T09:01:28Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

Prepare for vllm-project#11409. For the Pixtral model, placeholders need to be inserted in the middle of the encoder output so that it fits into the whole soft embedding, which makes the slicing operation tricky. This PR raises an assertion if something is off. Signed-off-by: Linkun Chen <[email protected]>

DarkLight1337 · 2025-02-12T09:02:11Z

For your reference, I have added a mapping from encoder outputs to embeddings in the outputs of the Molmo multi-modal processor (#12966, see `feat_is_patch` and `embed_is_patch`) so that there is no need to define a custom hook inside the model. Do you think it's feasible to do the same for this model?
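As a rough sketch of the mask-based alternative described in this comment (not the Molmo implementation from #12966), the processor could emit a boolean mask alongside the placeholder tokens, and generic code could then merge encoder outputs into the embeddings without a model-specific hook. Only the name `embed_is_patch` comes from the comment; the `ProcessedImage` container, both functions, and the token ids are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ProcessedImage:
    """Hypothetical processor output for one image."""
    placeholder_token_ids: List[int]
    embed_is_patch: List[bool]  # True where an encoder output row belongs


def make_processed_image(num_rows: int, num_cols: int,
                         img_id: int, break_id: int, end_id: int) -> ProcessedImage:
    """Build placeholder tokens and the matching mask (assumed row-wise layout)."""
    ids: List[int] = []
    for row in range(num_rows):
        ids.extend([img_id] * num_cols)
        ids.append(end_id if row == num_rows - 1 else break_id)
    return ProcessedImage(ids, [tok == img_id for tok in ids])


def merge_embeddings(text_embeds: Sequence, encoder_output: Sequence,
                     embed_is_patch: Sequence[bool]) -> list:
    """Replace patch positions with encoder rows; keep special-token embeddings."""
    if len(text_embeds) != len(embed_is_patch):
        raise ValueError("mask length must match the number of placeholder positions")
    if sum(embed_is_patch) != len(encoder_output):
        raise ValueError("mask must select exactly one position per encoder row")
    patches = iter(encoder_output)
    return [next(patches) if is_patch else emb
            for emb, is_patch in zip(text_embeds, embed_is_patch)]
```

Because the mask travels with the processor output, the merging code stays model-agnostic: any model whose placeholders interleave special tokens only has to produce the right mask.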

Linkun Chen and others added 30 commits November 18, 2024 05:52

Patch multi_modal_placeholders to RequestOutput

678c291

* confirm that `offline_inference_vision_language.py` and `benchmark_throughput.py` runs
* FIXME: the placeholders in output, however, is empty - will fix in next commit

Signed-off-by: Linkun Chen <[email protected]>

pipe multi_modal_placeholders from intput to final output

a1cdcb3

* add test for pixtral
* fix a typo

Signed-off-by: Linkun Chen <[email protected]>

[V1] Add code owners for V1 (vllm-project#10397)

f60964a

Signed-off-by: Woosuk Kwon <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

[2/N][torch.compile] make compilation cfg part of vllm cfg (vllm-proj…

578e482

…ect#10383) Signed-off-by: youkaichao <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

[V1] Refactor model executable interface for all text-only language m…

7fa97cf

…odels (vllm-project#10374) Signed-off-by: Roger Wang <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

[CI/Build] Fix IDC hpu [Device not found] issue (vllm-project#10384)

629f512

Signed-off-by: Chendi Xue <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

[Bugfix][CPU] Fix CPU embedding runner with tensor parallel (vllm-pro…

7539ab8

…ject#10394) Signed-off-by: Isotr0py <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

[platforms] refactor cpu code (vllm-project#10402)

bce660d

Signed-off-by: youkaichao <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

[Hardware] [HPU]add mark_step for hpu (vllm-project#10239)

305708b

Signed-off-by: Kunshang Ji <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

[Bugfix] Fix mrope_position_delta in non-last prefill chunk (vllm-pro…

871a773

…ject#10403) Signed-off-by: imkero <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

[Misc] Enhance offline_inference to support user-configurable paramet… (

242bb53

vllm-project#10392) Signed-off-by: wchen61 <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

Fix initialization

f5312d3

Signed-off-by: Cyrus Leung <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

Run isort

439e324

Signed-off-by: Linkun Chen <[email protected]>

isort

60815f2

Signed-off-by: Linkun Chen <[email protected]>

isort

ec46755

Signed-off-by: Linkun Chen <[email protected]>

[Misc] Add uninitialized params tracking for AutoWeightsLoader (vll…

ce3ae6f

…m-project#10327) Signed-off-by: Isotr0py <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

[Bugfix] Ignore ray reinit error when current platform is ROCm or XPU (…

466b2cf

…vllm-project#10375) Signed-off-by: Hollow Man <[email protected]> Signed-off-by: Linkun Chen <[email protected]>

update RequestOutput.__init__() to take multi_modal_placeholders as…

3f092ce

… optional argument also require it to be passed as kwargs, to avoid breaking existing code. Signed-off-by: Linkun Chen <[email protected]>

update RequestOutput.__init__() to take multi_modal_placeholders as…

dd8427e

… optional argument also require it to be passed as kwargs, to avoid breaking existing code. Signed-off-by: Linkun Chen <[email protected]>

update RequestOutput.__init__() to take multi_modal_placeholders as…

76ac8b0

… optional argument also require it to be passed as kwargs, to avoid breaking existing code. Signed-off-by: Linkun Chen <[email protected]>

Merge branch 'vllm-project:main' into main

904e925

disable mypy type check

c963a25

mypy is not smart enough to validate kwargs Signed-off-by: Linkun Chen <[email protected]>

disable mypy type check

470fbd3

mypy is not smart enough to validate kwargs Signed-off-by: Linkun Chen <[email protected]>

remove unnecessary debug code

550be23

Signed-off-by: Linkun Chen <[email protected]>

Merge branch 'vllm-project:main' into main

1eb4d96

Merge branch 'vllm-project:main' into main

9b002b0

Merge branch 'vllm-project:main' into main

436beb2

Merge branch 'vllm-project:main' into main

bbc6420

Merge branch 'vllm-project:main' into main

5254415

Merge branch 'vllm-project:main' into main

6077919

lk-chen requested review from WoosukKwon, robertgshaw2-redhat, njhill, ywang96, comaniac and alexm-redhat as code owners February 11, 2025 09:01

lk-chen mentioned this pull request Feb 11, 2025

[V1] Support Pixtral-HF on V1 #11409

Draft

mergify bot added the v1 label Feb 11, 2025

lk-chen added 2 commits February 11, 2025 09:03

[V1][Pixtral-HF] Add custom slice_encoder_output for Pixtral

89f243b

Signed-off-by: Linkun Chen <[email protected]>

lk-chen force-pushed the pixtral_hf branch from 8a64858 to 89f243b Compare February 11, 2025 09:04

lk-chen added 2 commits February 11, 2025 01:04

Merge branch 'vllm-project:main' into main

1d37090

Merge branch 'main' into pixtral_hf

098b444

Signed-off-by: Linkun Chen <[email protected]>

lk-chen force-pushed the pixtral_hf branch from b9f0b87 to 098b444 Compare February 11, 2025 17:12

ywang96 self-assigned this Feb 12, 2025
