Temporary fix for chat template issue with multimodal inference w/ in-process vLLM engine #1486
Conversation
I just checked out your branch on a VM with an A100 and ran this command:
oumi infer -c oumi://configs/recipes/vision/llava_7b/inference/vllm_infer.yaml -i --image="https://oumi.ai/_next/image?url=%2F_next%2Fstatic%2Fmedia%2Fmatthew.3389a3a6.png&w=640&q=75"
I still get an error:
Enter your input prompt: What is this photo of?
INFO 02-26 20:52:24 chat_utils.py:332] Detected the chat template content format to be 'openai'. You can set `--chat-template-content-format` to override this.
[rank0]: Traceback (most recent call last):
[rank0]: File "/opt/conda/envs/oumi/bin/oumi", line 8, in <module>
[rank0]: sys.exit(run())
[rank0]: ^^^^^
[rank0]: File "/home/matthew/.local/lib/python3.11/site-packages/oumi/cli/main.py", line 123, in run
[rank0]: return app()
[rank0]: ^^^^^
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/typer/main.py", line 340, in __call__
[rank0]: raise e
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/typer/main.py", line 323, in __call__
[rank0]: return get_command(self)(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/click/core.py", line 1161, in __call__
[rank0]: return self.main(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/typer/core.py", line 743, in main
[rank0]: return _main(
[rank0]: ^^^^^^
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/typer/core.py", line 198, in _main
[rank0]: rv = self.invoke(ctx)
[rank0]: ^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/click/core.py", line 1697, in invoke
[rank0]: return _process_result(sub_ctx.command.invoke(sub_ctx))
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/click/core.py", line 1443, in invoke
[rank0]: return ctx.invoke(self.callback, **ctx.params)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/click/core.py", line 788, in invoke
[rank0]: return __callback(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/typer/main.py", line 698, in wrapper
[rank0]: return callback(**use_params)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/matthew/.local/lib/python3.11/site-packages/oumi/cli/infer.py", line 143, in infer
[rank0]: return oumi_infer_interactive(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/matthew/.local/lib/python3.11/site-packages/oumi/__init__.py", line 135, in infer_interactive
[rank0]: return oumi.infer.infer_interactive(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/matthew/.local/lib/python3.11/site-packages/oumi/infer.py", line 58, in infer_interactive
[rank0]: model_response = infer(
[rank0]: ^^^^^^
[rank0]: File "/home/matthew/.local/lib/python3.11/site-packages/oumi/infer.py", line 138, in infer
[rank0]: generations = inference_engine.infer(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/matthew/.local/lib/python3.11/site-packages/oumi/core/inference/base_inference_engine.py", line 89, in infer
[rank0]: return self.infer_online(input, inference_config)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/matthew/.local/lib/python3.11/site-packages/oumi/inference/vllm_inference_engine.py", line 366, in infer_online
[rank0]: return self._infer(input, inference_config)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/matthew/.local/lib/python3.11/site-packages/oumi/inference/vllm_inference_engine.py", line 314, in _infer
[rank0]: chat_responses = self._llm.chat(
[rank0]: ^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/vllm/entrypoints/llm.py", line 725, in chat
[rank0]: prompt_data = apply_hf_chat_template(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/vllm/entrypoints/chat_utils.py", line 978, in apply_hf_chat_template
[rank0]: return tokenizer.apply_chat_template(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1687, in apply_chat_template
[rank0]: rendered_chat = compiled_template.render(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/jinja2/environment.py", line 1295, in render
[rank0]: self.environment.handle_exception()
[rank0]: File "/opt/conda/envs/oumi/lib/python3.11/site-packages/jinja2/environment.py", line 942, in handle_exception
[rank0]: raise rewrite_traceback_stack(source=source)
[rank0]: File "<template>", line 1, in top-level template code
[rank0]: jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'content'
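For context, this jinja2 error is the generic failure you get when a chat template dereferences a `content` key that the rendered message dict doesn't carry in the expected shape. A minimal sketch of the failure mode, independent of vLLM and transformers:

```python
import jinja2

# A text-only chat template that assumes message['content'] is a plain string.
env = jinja2.Environment()
template = env.from_string("USER: {{ message['content'] + ' ' }}")

# A message dict without a 'content' key yields a jinja2 Undefined value;
# using that value in an expression raises:
#   jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'content'
template.render(message={"role": "user"})
```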
LLAVA has no chat template published on HF, so the change doesn't fix the problem for LLAVA. It should work for Qwen and Llama models. This isn't perfect, but it's better than the completely broken state we have now.
Got it. Should we remove it?
It's referenced in misc docs, so it'd be a lot of busy work to delete it and then re-add it. Let me do a proper fix later this week (hopefully), once I'm done with GRPO.
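To check the claim above that a given checkpoint publishes no chat template on HF, one can inspect the tokenizer directly. A quick sketch (the model IDs are illustrative, not necessarily the ones this PR targets):

```python
from transformers import AutoTokenizer

# Illustrative model IDs; substitute the checkpoints you actually care about.
for model_id in ("llava-hf/llava-1.5-7b-hf", "Qwen/Qwen2-VL-2B-Instruct"):
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    status = "has a chat template" if tokenizer.chat_template else "no chat template"
    print(f"{model_id}: {status}")
```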
Description
- Uses HF-supplied chat templates for VLMs in the vLLM in-process inference engine. The change will be reverted once we have a more general solution (which requires more effort).
- Solves the error for Llama and Qwen. Not effective for LLAVA (no chat template) or Phi3 (text-only template).
- Adds a new helper function, `get_hf_chat_template()` (see the sketch below).
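For illustration, here is a rough sketch of what such a helper could look like; the function name comes from the PR, but the body and fallback order are assumptions, not the actual implementation:

```python
from typing import Optional

from transformers import AutoProcessor, AutoTokenizer


def get_hf_chat_template(model_name: str, trust_remote_code: bool = False) -> Optional[str]:
    """Return the chat template published on HF for `model_name`, or None.

    Sketch only: the real helper added in this PR may resolve the template
    differently.
    """
    # VLM processors often carry the multimodal-aware template; prefer it.
    try:
        processor = AutoProcessor.from_pretrained(
            model_name, trust_remote_code=trust_remote_code
        )
        if getattr(processor, "chat_template", None):
            return processor.chat_template
    except Exception:
        pass  # Not every model ships a processor config.

    # Fall back to the template attached to the tokenizer, if any.
    tokenizer = AutoTokenizer.from_pretrained(
        model_name, trust_remote_code=trust_remote_code
    )
    return getattr(tokenizer, "chat_template", None)
```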
Related issues
Towards OPE-1090
Reviewers
At least one review from a member of oumi-ai/oumi-staff is required.