Add QwQ eval/infer configs #1515

Merged
merged 7 commits into from
Mar 5, 2025

wizeng23 commented Mar 5, 2025

Description

Copies the config from configs/recipes/deepseek_r1/sft/distill_qwen_32b. That recipe targets a reasoning model built on the same architecture as QwQ (Qwen 2.5 32B).

Tested on GCP.

Related issues

Towards #1408

Before submitting

  • This PR only changes documentation. (You can ignore the following checks in that case)
  • Did you read the Pull Request guidelines in the contributor guide?
  • Did you link the issue(s) related to this PR in the section above?
  • Did you add / update tests where needed?

wizeng23 merged commit b6925e1 into main Mar 5, 2025
1 check passed
wizeng23 deleted the wizeng/qwq branch March 5, 2025 23:39