
Tune a few training params for LLAVA and blip2 models #642

Merged
nikg4 merged 4 commits into main from xrdaukar/llava-num-proc on Oct 10, 2024

Conversation

nikg4 (Collaborator) commented Oct 10, 2024

-- Tune batch size, steps, and a few other params
-- For LLAVA, use bs=8, gas=8 to mimic this example (see the sketch below): https://github.com/huggingface/trl/blob/b3f93f0bad85c808ea76ceb2c07706794a31e74f/examples/scripts/sft_vlm.py#L23

Towards OPE-467, OPE-551
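
For context, here is a minimal sketch of what bs=8, gas=8 means, assuming the standard Hugging Face `TrainingArguments` parameter names used by the linked TRL `sft_vlm.py` example; the exact config keys in this repo may differ, and the output path below is hypothetical:

```python
# Minimal sketch, assuming Hugging Face TrainingArguments names
# (per_device_train_batch_size, gradient_accumulation_steps), as in the
# linked TRL sft_vlm.py example. This repo's own config keys may differ.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llava-sft-out",       # hypothetical output path
    per_device_train_batch_size=8,    # bs=8
    gradient_accumulation_steps=8,    # gas=8
)

# Effective batch size per optimizer step (per device):
# 8 micro-batches of 8 samples are accumulated before each update.
effective_bs = (training_args.per_device_train_batch_size
                * training_args.gradient_accumulation_steps)
assert effective_bs == 64
```

The point of accumulating gradients here is to reach an effective batch of 64 without needing the memory for 64 samples in a single forward/backward pass.
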


linear bot commented Oct 10, 2024

OPE-467

OPE-551

nikg4 marked this pull request as ready for review October 10, 2024 19:45
nikg4 requested review from optas, oelachqar and wizeng23 October 10, 2024 19:45
nikg4 merged commit d2f8138 into main Oct 10, 2024
1 check passed
nikg4 deleted the xrdaukar/llava-num-proc branch October 10, 2024 20:07