
Add support for the Launch command suite in the new CLI #612

Merged

taenin merged 8 commits into main from taenin/launch on Oct 8, 2024

Conversation

taenin (Collaborator) commented Oct 8, 2024

Example commands:

oumi launch status
oumi launch which
oumi launch up --config foo
oumi launch run --config foo
oumi launch down --cluster a --cloud b
oumi launch stop --cluster a --cloud b --id c

Towards OPE-500 and OPE-473
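
For reference, the linked issue indicates the new CLI is built on Typer, and a subcommand suite like the one above is typically mounted under a single entry point as a nested Typer app. The sketch below is illustrative only; the option names and function bodies are placeholder assumptions, not the code from this PR:

import typer

# Hypothetical `launch` sub-app; the subcommand names mirror the examples
# above, but the bodies and options are placeholders.
launch_app = typer.Typer(help="Launch and manage jobs.")

@launch_app.command()
def status() -> None:
    """Print the status of launched jobs."""
    typer.echo("...")

@launch_app.command()
def up(config: str = typer.Option(..., "--config", help="Path to a job config.")) -> None:
    """Spin up a cluster and run the configured job."""
    typer.echo(f"Launching job from {config}")

@launch_app.command()
def down(
    cluster: str = typer.Option(..., "--cluster"),
    cloud: str = typer.Option(..., "--cloud"),
) -> None:
    """Tear down the given cluster."""
    typer.echo(f"Tearing down {cluster} on {cloud}")

# Top-level CLI, invoked as `oumi launch <subcommand>`.
app = typer.Typer()
app.add_typer(launch_app, name="launch")

if __name__ == "__main__":
    app()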


linear bot commented Oct 8, 2024

OPE-500

OPE-473

@taenin taenin marked this pull request as ready for review October 8, 2024 22:30
@taenin taenin requested review from oelachqar, wizeng23 and nikg4 October 8, 2024 22:32
wizeng23 (Contributor) commented Oct 8, 2024

Besides changing oumi-launch to oumi launch, are there any other differences in how the command is specified? E.g., for oumi train in the other PR, the omegaconf CLI overrides had to be specified differently.

I assume there'll be documentation checked in at some point?

wizeng23 (Contributor) left a comment


For ease of review, is there any significant difference in logic between this PR and oumi/src/oumi/launch.py?

taenin (Collaborator, Author) commented Oct 8, 2024

Besides changing oumi-launch to oumi launch, are there any other differences in how the command is specified? E.g., for oumi train in the other PR, the omegaconf CLI overrides had to be specified differently.

I assume there'll be documentation checked in at some point?

You're correct; the main change is how additional params are passed. We'll update this documentation soon. For now, here's a concrete example using oumi train:

What we have now:

oumi-train \
    "data.train.dataset.0.dataset_name=yahma/alpaca-cleaned" \
    "data.train.dataset.0.preprocessing_function_name=alpaca" \
    "data.train.dataset.target_col=prompt" \
    "model.model_name=microsoft/Phi-3-mini-4k-instruct" \
    "model.trust_remote_code=true" \
    "training.trainer_type=TRL_SFT/" \
    "training.output_dir=train/"

After my changes:

oumi train \
    --data.train.dataset.0.dataset_name yahma/alpaca-cleaned \
    --data.train.dataset.0.preprocessing_function_name alpaca \
    --data.train.dataset.target_col prompt \
    --model.model_name microsoft/Phi-3-mini-4k-instruct \
    --model.trust_remote_code true \
    --training.trainer_type TRL_SFT \
    --training.output_dir train/
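
A note for reviewers on how such pass-through params can be handled: one common Typer pattern is to let a command accept unknown options and convert the leftover --key value pairs back into omegaconf dotlist overrides. The snippet below is only a sketch of that pattern, under the assumption that extra args always arrive as --key value pairs; it is not the parsing code from this PR:

from omegaconf import OmegaConf
import typer

app = typer.Typer()

@app.command(
    context_settings={"allow_extra_args": True, "ignore_unknown_options": True}
)
def train(ctx: typer.Context, config: str = typer.Option(None, "--config")) -> None:
    # Leftover args arrive as a flat list, e.g.
    # ["--model.model_name", "microsoft/Phi-3-mini-4k-instruct", ...].
    extra = ctx.args
    # Pair them up and rebuild omegaconf-style "key=value" overrides.
    dotlist = [f"{key.lstrip('-')}={value}" for key, value in zip(extra[::2], extra[1::2])]
    overrides = OmegaConf.from_dotlist(dotlist)
    typer.echo(OmegaConf.to_yaml(overrides))

if __name__ == "__main__":
    app()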

taenin (Collaborator, Author) commented Oct 8, 2024

For ease of review, is there any significant difference in logic between this PR and oumi/src/oumi/launch.py?

Nope, it's almost entirely the same. The only minor shift is from using a thread pool to a process pool for polling jobs.
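
In other words, only the executor backing the polling loop changes; the submit/result flow stays the same. A minimal sketch with a hypothetical poll function (not the PR's real job API):

import time
from concurrent.futures import ProcessPoolExecutor

def _poll_job(job_id: str) -> str:
    """Hypothetical stand-in for querying one job's status."""
    time.sleep(1)  # pretend to hit the cloud provider's API
    return f"{job_id}: RUNNING"

def poll_all(job_ids: list[str]) -> list[str]:
    # Swapping ThreadPoolExecutor for ProcessPoolExecutor is the whole change:
    # each poll now runs in its own process, but the interface is identical.
    with ProcessPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(_poll_job, job_id) for job_id in job_ids]
        return [future.result() for future in futures]

if __name__ == "__main__":
    print(poll_all(["job-1", "job-2", "job-3"]))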

@taenin taenin merged commit 7d81911 into main Oct 8, 2024
1 check passed
@taenin taenin deleted the taenin/launch branch October 8, 2024 23:44
@taenin taenin linked an issue Oct 10, 2024 that may be closed by this pull request: Integrate Typer for better CLI performance