
[BugFix] Throw a runtime error for quantized models & inference=VLLM #1212

Merged
merged 6 commits into main from kostas/runtime_error_vllm_quantized_model on Jan 23, 2025

Conversation

kaisopos
Contributor

Description

Throw a runtime error when instantiating a {judge, evaluation, inference} config that includes a "BitsAndBytes"-quantized model together with the VLLM inference engine. I am aware this does NOT cover all possible use cases (users may use an inference engine directly, without a config), but my hope is that it covers most cases. A sketch of such a guard is shown below.
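A minimal sketch of what this kind of validation could look like, assuming a hypothetical Oumi-style config with a `ModelParams.quantization_method` field and an `InferenceEngineType` enum (the names are illustrative, not necessarily the actual Oumi API). The check runs in `__post_init__`, so the unsupported combination fails at config-construction time rather than mid-inference:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InferenceEngineType(str, Enum):
    NATIVE = "NATIVE"
    VLLM = "VLLM"


@dataclass
class ModelParams:
    model_name: str
    # Hypothetical field: the quantization method used when loading the model.
    quantization_method: Optional[str] = None


@dataclass
class InferenceConfig:
    model: ModelParams
    engine: InferenceEngineType = InferenceEngineType.NATIVE

    def __post_init__(self):
        # Fail fast: BitsAndBytes quantization is not supported by the
        # VLLM engine, so reject the combination when the config is built.
        if (
            self.engine == InferenceEngineType.VLLM
            and self.model.quantization_method == "bitsandbytes"
        ):
            raise RuntimeError(
                "BitsAndBytes-quantized models are not supported with the "
                "VLLM inference engine. Use a different engine or an "
                "unquantized model."
            )
```

With this in place, constructing `InferenceConfig(ModelParams("my-model", quantization_method="bitsandbytes"), engine=InferenceEngineType.VLLM)` raises immediately, which matches the PR's goal of surfacing the error for judge, evaluation, and inference configs before any inference work starts.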

Related issues

Fixes # (issue)

Before submitting

  • This PR only changes documentation. (You can ignore the following checks in that case)
  • Did you read the Pull Request guidelines in the contributor guide?
  • Did you link the issue(s) related to this PR in the section above?
  • Did you add / update tests where needed?

Reviewers

At least one review from a member of oumi-ai/oumi-staff is required.

@kaisopos kaisopos merged commit 7e36dd5 into main Jan 23, 2025
2 checks passed
@kaisopos kaisopos deleted the kostas/runtime_error_vllm_quantized_model branch January 23, 2025 04:46