
Releases: oumi-ai/oumi

v0.1.8

10 Mar 18:25
3720d47

What's Changed

  • GRPO trainer: Minimal initial integration by @nikg4 in #1482 (see the GRPO sketch after this list)
  • Update oumi infer to fall back to interactive mode if no input path is specified. by @taenin in #1483
  • Add sample DDP/GCP config for GRPO trainer by @nikg4 in #1485
  • Temporary fix for chat template issue with multimodal inference w/ in-process vLLM engine by @nikg4 in #1486
  • [tiny] Update async_eval.yaml comments to reference correct class by @wizeng23 in #1488
  • Fix a bug where overriding remote_params fails via the CLI (oumi infer) by @taenin in #1487
  • Define GrpoParams under configs by @nikg4 in #1490
  • Support more GRPO params by @nikg4 in #1491
  • Minor updates to oumi env by @nikg4 in #1492
  • Warn instead of error when device not found for MFU calculation by @wizeng23 in #1489
  • Updated all CLI endpoints to support oumi:// prefix by @Spaarsh in #1468
  • Fix chat template issue for nested content parts used for VLMs by @nikg4 in #1493
  • Ctseng777/judge by @ctseng777 in #1474
  • [Evaluation] Modularization & enabling custom evaluations by @kaisopos in #1484
  • Update documentation formatting for BaseModel by @taenin in #1494
  • Fix log_samples not propagating from eval_kwargs by @jgreer013 in #1496
  • [Evaluation] Adding support for logging model samples for all backends by @kaisopos in #1499
  • Support for deprecated input param (evaluation_platform) by @kaisopos in #1500
  • Limiting the AlpacaEval number of samples for quickstart by @kaisopos in #1501
  • Add recurring tests to keep our test badges updated. by @taenin in #1498
  • Add a schedule for our GPU, CPU, and doc tests by @taenin in #1503
  • Update the GPU Tests badge to use results from main by @taenin in #1504
  • vLLM version increment by @nikg4 in #1502
  • Minor logging improvements by @nikg4 in #1505
  • [Evaluation] Save Utils: Moving, fixes, and unit tests by @kaisopos in #1506
  • Update sample GRPO script to validate num_generations by @nikg4 in #1509
  • Resolve warning about deprecated --dispatch_batches param by @nikg4 in #1510
  • [Evaluation] Re-enabling evaluations with Math Hard (leaderboard_math_hard) by @kaisopos in #1511
  • Update docker image and build script by @oelachqar in #1508
  • Add Qwen QwQ Lora config by @wizeng23 in #1514
  • Add QwQ eval/infer configs by @wizeng23 in #1515
  • [Evaluation] Instantiating an inference engine (if needed) when running custom evaluations by @kaisopos in #1513
  • Switch eval yaml configs to use evaluation_platform by @wizeng23 in #1516
  • Mark BaseMapDataset as typing.Sized by @nikg4 in #1517
  • VLM collator refactor by @nikg4 in #1512
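
The GRPO entries above (#1482, #1490, #1491, #1509) wire up a minimal wrapper around TRL's GRPOTrainer. As a rough sketch of the underlying TRL API, not oumi's wrapper itself (the model, dataset, and reward function below are illustrative placeholders, assuming trl >= 0.14):

```python
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Toy prompt-only dataset; GRPO samples `num_generations` completions per prompt.
train_dataset = Dataset.from_dict({"prompt": ["2+2=", "3+5="]})

def reward_len(completions, **kwargs):
    # Placeholder reward: prefer shorter completions.
    return [-float(len(c)) for c in completions]

args = GRPOConfig(
    output_dir="grpo_out",
    num_generations=2,  # the sample script in #1509 validates this value
    max_steps=10,
)
trainer = GRPOTrainer(
    model="Qwen/Qwen2-0.5B-Instruct",
    reward_funcs=reward_len,
    args=args,
    train_dataset=train_dataset,
)
trainer.train()
```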

Full Changelog: v0.1.7...v0.1.8

v0.1.7

25 Feb 23:48
eb902e3

What's Changed

  • Update the RemoteInferenceEngine to appropriately handle OpenAI-format batch prediction endpoints by @taenin in #1472 (batch file sketch after this list)
  • Fix local models to not break the registry. by @taenin in #1476
  • Create an inference config for Claude Sonnet 3.7 by @taenin in #1479
  • Add notebook for fine-tuning MiniMath-R1-1.5B by @jgreer013 in #1480
  • [Evaluation] Migrate LM Harness integration point from simple_evaluate to evaluate by @kaisopos in #1455
  • [tiny] Update trl to 0.14 by @wizeng23 in #1478
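
For context on #1472: the OpenAI batch prediction API consumes a .jsonl file with one request object per line. A minimal sketch of that file format (IDs and model name are placeholders; the field names follow the public OpenAI Batch API schema, not oumi-specific code):

```python
import json

# One OpenAI-format batch request per line of the input file.
requests = [
    {
        "custom_id": "request-1",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": "Hello!"}],
        },
    },
]
with open("batch_input.jsonl", "w") as f:
    for r in requests:
        f.write(json.dumps(r) + "\n")
```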

Full Changelog: v0.1.6...v0.1.7

v0.1.6

22 Feb 02:25
cc3510d

What's Changed

  • Update RemoteParams to no longer require an API URL. by @taenin in #1452
  • [Tiny] Update default training params for Qwen2-VL-2B-Instruct by @optas in #1454
  • [Tiny] Add more warnings for "special" requirements of Qwen2.5-VL by @optas in #1453
  • Minor cleanup of oumi fetch by @taenin in #1463
  • Support for multi-image VLM training by @nikg4 in #1448
  • Remove a temp workaround in pad_sequences on the left side by @nikg4 in #1464 (left-padding sketch after this list)
  • [tiny] Add warning that Oumi doesn't support Intel Macs by @wizeng23 in #1467
  • VLM-related logging improvements by @nikg4 in #1469
  • Fix Oumi launcher to be able to run on RunPod and Lambda by @wizeng23 in #1470
  • Enable pre-release install for uv in pyproject.toml by @wizeng23 in #1466
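
On #1464: left-side padding keeps the end of every sequence aligned, which is what decoder-only generation expects. A minimal sketch of the idea (an illustrative helper, not oumi's actual pad_sequences):

```python
import torch

def pad_sequences_left(seqs: list[torch.Tensor], pad_value: int = 0) -> torch.Tensor:
    """Left-pad 1-D tensors to a common length and stack them into a batch."""
    max_len = max(s.numel() for s in seqs)
    rows = []
    for s in seqs:
        pad = s.new_full((max_len - s.numel(),), pad_value)
        rows.append(torch.cat([pad, s]))  # padding goes on the left
    return torch.stack(rows)

batch = pad_sequences_left([torch.tensor([1, 2, 3]), torch.tensor([4])])
# tensor([[1, 2, 3],
#         [0, 0, 4]])
```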

Full Changelog: v0.1.5...v0.1.6

v0.1.5

20 Feb 00:05
f494e34

What's Changed

  • Fix the remainder of our configs by @wizeng23 in #1356
  • Adopt new Llama 3.1 HF names by @wizeng23 in #1357
  • Define OUMI_USE_SPOT_VM env var and start using it to override use_spot param by @xrdaukar in #1359
  • Support HuggingFaceM4/Docmatix dataset by @vishwamartur in #1342
  • [nit] update default issue names by @oelachqar in #1367
  • Update sft_datasets.md by @penfever in #1349
  • Have GitHub Trending image hyperlink to GitHub Trending page by @wizeng23 in #1370
  • Update the link for the trending banner. by @taenin in #1371
  • Move code to disable caching in model.config to a helper function by @xrdaukar in #1378
  • Update transformers version to 4.48 by @wizeng23 in #1372
  • Update notebooks to improve their Colab experience by @wizeng23 in #1380
  • Add proper labels and types to new Bugs and Feature Requests. by @taenin in #1383
  • Upgrade omegaconf to 2.4.0dev3 by @wizeng23 in #1384
  • Support HuggingFaceM4/the_cauldron dataset by @vishwamartur in #1366
  • Update our FAQ for tips about installing oumi on Windows by @taenin in #1385
  • Cleanup HuggingFaceM4/Docmatix and HuggingFaceM4/the_cauldron multimodal datasets by @xrdaukar in #1387
  • Remove unneeded env vars from job configs by @wizeng23 in #1390
  • Remove transformer version override for HuggingFaceTB/SmolVLM-Instruct in launcher script by @xrdaukar in #1388
  • [Small Refactor] Moving the inference engine def outside the inference config by @kaisopos in #1395
  • Evaluation - LM Harness: Adding vLLM support by @kaisopos in #1379
  • Remove Docmatix dataset references from docstrings VLM config examples by @xrdaukar in #1397
  • Fixed broken link in Oumi - A Tour.ipynb notebook by @ciaralema in #1398
  • Fix broken links in notebooks. by @taenin in #1402
  • Create a client for communicating with a Slurm node via SSH. by @taenin in #1389
  • [tiny] Remove references to missing job configs in README by @wizeng23 in #1404
  • Train+Inference with Qwen 2.5 VL (3B) by @optas in #1396
  • Add a Slurm cluster and cloud to the oumi launcher. by @taenin in #1406
  • Move pretokenize script from scripts/pretokenize/ to scripts/datasets/pretokenize/ by @xrdaukar in #1412
  • Create a script to save Conversations from SFT datasets into a .jsonl file by @xrdaukar in #1413
  • [Evaluation] LM Harness refactor by @kaisopos in #1410
  • Update save_conversations tool by @xrdaukar in #1421
  • [SambaNova] Integrate SambaNova Systems to oumi inference by @ctseng777 in #1415
  • [Minor] Equating Qwen 2.5's chat template to version 2.0's by @optas in #1419
  • Add requirements header to configs and clean them up by @wizeng23 in #1411
  • Updated oumi infer to support CLI argument for system prompt by @Spaarsh in #1422
  • [Evaluation] LM Harness remote server support by @kaisopos in #1414
  • [Feature] Add Tulu3 SFT Mixture Dataset Support by @bwalshe in #1381
  • Support multimodal inference with multiple images and PDFs in NATIVE engine by @xrdaukar in #1424
  • Update notebooks to run on Colab by @wizeng23 in #1423
  • Add calm recipe. by @taenin in #1425
  • Update VLM sample oumi infer -i commands by @xrdaukar in #1428
  • Provide an example showing how to start an SGLang server using Docker by @xrdaukar in #1429
  • Multi-image support in SGLang inference engine by @xrdaukar in #1426
  • Calm readme by @emrecanacikgoz in #1432
  • WildChat-50M Reproduction by @penfever in #1433
  • Add WildChat support by @penfever in #1348
  • Create pad_to_max_dim_and_stack() function in torch_utils by @xrdaukar in #1435 (sketch after this list)
  • use deterministic by @penfever in #1434
  • Additional HF trainer parameters for config by @penfever in #1436
  • Set a better default for vllm inference GPU usage. by @taenin in #1437
  • Added fetch command and modified infer command to resolve oumi:// by @Spaarsh in #1439
  • Require an inference config for oumi infer. by @taenin in #1443
  • Make the tulu3 unit tests hermetic. by @taenin in #1446
  • Add 2 more sample PDFs with 1 and 2 pages under testdata/pdfs by @xrdaukar in #1427
  • Enable ability to override list values in config via CLI by @wizeng23 in #1430
  • Renamed CALM to CoALM by @jgreer013 in #1450
  • Add support for Docmatix dataset to multimodal training script by @xrdaukar in #1449
  • Update oumi launch status to show clusters with no running jobs. by @taenin in #1451
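
On #1435: multimodal batches often mix tensors whose dimensions differ (e.g., variable image counts per example), so each tensor is padded up to the elementwise maximum shape before stacking. An illustrative version of such a helper (oumi's actual pad_to_max_dim_and_stack() in torch_utils may differ):

```python
import torch
import torch.nn.functional as F

def pad_to_max_dim_and_stack(tensors: list[torch.Tensor], pad_value: float = 0.0) -> torch.Tensor:
    """Right-pad each tensor to the elementwise max shape, then stack."""
    max_shape = [max(t.shape[d] for t in tensors) for d in range(tensors[0].dim())]
    padded = []
    for t in tensors:
        # F.pad takes (left, right) pairs for dims in reverse order.
        pad_spec = []
        for d in reversed(range(t.dim())):
            pad_spec.extend([0, max_shape[d] - t.shape[d]])
        padded.append(F.pad(t, pad_spec, value=pad_value))
    return torch.stack(padded)

out = pad_to_max_dim_and_stack([torch.ones(2, 3), torch.ones(1, 5)])
print(out.shape)  # torch.Size([2, 2, 5])
```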

Full Changelog: v0.1.4...v0.1.5

v0.1.4

03 Feb 21:06
fc3d45e

What's Changed

  • Add memory cleanup calls in e2e integration tests by @xrdaukar in #1277
  • Set up versioning for our documentation by @taenin in #1275
  • Make qwen2-VL evaluation job pass by @xrdaukar in #1278
  • Add multi-modal (vlm) notebook with Llama 11B by @optas in #1258
  • Documentation: Inference -> List supported models by @kaisopos in #1279
  • [tiny] update website link by @oelachqar in #1280
  • Update all documentation links to the new doc URL by @taenin in #1281
  • Update Oumi - A Tour.ipynb by @brragorn in #1282
  • Documentation: Judge (minor edits) by @kaisopos in #1283
  • Fix citation by @oelachqar in #1285
  • Add Deepseek R1 1.5B/32B configs by @wizeng23 in #1276
  • Misc eval configs cleanup by @xrdaukar in #1286
  • [docs] Describe parallel evaluation by @xrdaukar in #1284
  • Update microsoft/Phi-3-vision-128k-instruct training config by @xrdaukar in #1287
  • Add Together Deepseek R1 inference config by @wizeng23 in #1289
  • [minor] vlm notebook minor updates (doc referencing, freeze visual backbone) by @optas in #1288
  • Add missing -m oumi evaluate argument in eval config by @xrdaukar in #1291
  • [docs] Add more references to VL-SFT and SFT notebooks by @xrdaukar in #1293
  • Eval config change for deepseek-ai/DeepSeek-R1-Distill-Llama-70B by @xrdaukar in #1292
  • [notebooks] Update intro & installation instruction by @oelachqar in #1294
  • Update notebook intros by @oelachqar in #1296
  • [notebooks] Update installation instructions for colab by @oelachqar in #1297
  • Add Apache license header to src/oumi/**/*.py by @wizeng23 in #1290
  • Minor updates to VLM Multimodal notebook by @xrdaukar in #1299
  • [docs] Add latest notebooks and update references by @oelachqar in #1300
  • [tiny] Add docs auto-generated .rst files to gitignore by @wizeng23 in #1298
  • [tiny] use GitHub link for header by @oelachqar in #1301
  • [docs][tiny] update inference engines reference by @oelachqar in #1302
  • Update README/docs to add new DeepSeek models by @wizeng23 in #1304
  • [docs] Use pip install oumi over pip install . by @wizeng23 in #1305
  • Tune VLM SFT configs by @xrdaukar in #1306
  • Tune VLM configs for SmolVLM and Qwen2-VL by @xrdaukar in #1307
  • Update config/notebook pip installs to use PyPI by @wizeng23 in #1308
  • [tiny] upgrade torch version by @oelachqar in #1295
  • Update logging and unit tests related to chat templates by @xrdaukar in #1311
  • fix(docs): "interested by joining" to "interested in joining" by @CharlesCNorton in #1312
  • Add HF_TOKEN instructions to Oumi Multimodal notebook by @xrdaukar in #1313
  • Update configuration.md by @penfever in #1314
  • remove duplicate keys in config example by @lucyknada in #1315
  • [Notebooks] Update VLM notebook by @xrdaukar in #1317
  • Update parasail_inference_engine.py by @jgreer013 in #1320
  • Fix typo and update warning message for OUMI trainer by @xrdaukar in #1319
  • [Notebooks] Add a note that a notebook kernel restart may be needed after pip install oumi by @xrdaukar in #1318
  • Update Phi3 to support multiple images by @xrdaukar in #1321
  • Add more detailed comment headers to YAML configs by @wizeng23 in #1310
  • [Notebooks] Add a note to Tour notebook to restart kernel after the first pip install by @xrdaukar in #1327
  • Tweak --mem-fraction-static param in sample SGLang configs by @xrdaukar in #1328
  • Disallow using DatasetParams field names as keys in DatasetParams.dataset_kwargs by @xrdaukar in #1324 (guard sketch after this list)
  • Support dataset_name_override dataset_kwarg by @xrdaukar in #1188
  • Add an util and a test marker for HF token by @xrdaukar in #1329
  • Update llama3-instruct chat template to align with the original model's template by @xrdaukar in #1326
  • chore: update launcher.sh by @eltociear in #1333
  • [Notebooks] Minor improvements in VLM and CNN notebooks by @xrdaukar in #1335
  • Update VLM cluster names in sample commands by @xrdaukar in #1336
  • Update our README and docs with the github trending badge. by @taenin in #1340
  • Update README.md - Add DeepSeek to supported models by @mkoukoumidis in #1343
  • Update index.md - Add DeepSeek to supported models by @mkoukoumidis in #1344
  • Update "GPU Tests" status badge in README page by @xrdaukar in #1345

Full Changelog: v0.1.3...v0.1.4

v0.1.3

28 Jan 00:44
86124a9

What's Changed

  • Documentation: Judge | Custom Model page by @kaisopos in #1195
  • [WIP] Add a notebook for using CNN with custom dataset by @xrdaukar in #1196
  • [Cherrypick for launch] Evaluate: return dict of results by @kaisopos in #1197
  • Configs Train/Infer/Eval and Llama 3.3v (70b) by @optas in #1200
  • Adding an integration test for evaluation fn's output (see PR-1197) by @kaisopos in #1199
  • [docs] Add more details and cross-references related to customization by @xrdaukar in #1198
  • Define single_gpu test marker by @xrdaukar in #1201
  • Native inference: Don't set min_p, temperature in GenerationConfig if sampling is disabled by @xrdaukar in #1202
  • Update tests to make them runnable on GCP by @xrdaukar in #1203
  • Add newline before pformat(train_config) by @xrdaukar in #1204
  • GCP tests launcher script changes by @xrdaukar in #1205
  • [Evaluation] Bug: serialization by @kaisopos in #1207
  • [docs] Add inference snippet for together.ai and DeepSeek APIs by @oelachqar in #1208
  • Exclude multi_gpu tests from GitHub GPU tests by @xrdaukar in #1210
  • Update e2e tests to support multi-GPU machines by @xrdaukar in #1206
  • Add wrappers for remote inference engines by @oelachqar in #1209
  • Vision-Lang & Inference (including LoRA) by @optas in #1174
  • [BugFix] Throw a runtime error for quantized models & inference=VLLM by @kaisopos in #1212
  • Fix most job configs by @wizeng23 in #1213
  • e2e tests update by @xrdaukar in #1216
  • [Notebook] Evaluation with Oumi by @kaisopos in #1218
  • gpt2: move include_performance_metrics param from script to yaml by @xrdaukar in #1217
  • Simplify inference engine API by @oelachqar in #1214
  • Move configs to experimental by @wizeng23 in #1215
  • [docs] Update index page by @oelachqar in #1220
  • Update ConsoleLogger to write to STDOUT by @xrdaukar in #1221
  • Set use_spot to False in our JobConfigs by @wizeng23 in #1222
  • Delete oumi[optional] install target by @wizeng23 in #1224
  • Scaffolding and the first testcase for e2e evaluation tests by @xrdaukar in #1225
  • [docs] Update inference engines doc page by @oelachqar in #1227
  • Clean-up inference engine builder by @oelachqar in #1226
  • [VLLM Engine] Enabling BitsAndBytes quantization by @kaisopos in #1223 (vLLM sketch after this list)
  • Add example distillation notebook by @jgreer013 in #1228
  • Add a script to pre-download models for gpu_tests by @xrdaukar in #1231
  • Fix multi-GPU inference integration test by @xrdaukar in #1229
  • [tiny][docs] Update PEFT/LoRA content by @optas in #1233
  • [BugFix] GGUF does not work with VLLM by @kaisopos in #1232
  • Re-enable parallel evaluation for VLMs by @xrdaukar in #1235
  • Add multimodal exemplar dataset in our provided mini-datasets by @optas in #1234
  • [Tiny] renaming a field name (init_lora_weights) by @optas in #1236
  • Add more e2e evaluation tests by @xrdaukar in #1237
  • Fix pyright breakage when vllm and llama_cpp are not installed by @taenin in #1240
  • Update our oumi launch documentation. by @taenin in #1239
  • Update index.md title for "Join the Community!" by @mkoukoumidis in #1242
  • Update quickstart.md - nit for Oumi support request by @mkoukoumidis in #1241
  • [VLLM Engine] Improve support for GGUF models (incl. auto-download) by @kaisopos in #1238
  • Update README.md title to "Join the Community!" by @mkoukoumidis in #1243
  • Update quickstart.md by @brragorn in #1251
  • Update quickstart.md by @brragorn in #1253
  • Update quickstart.md by @brragorn in #1252
  • Update quickstart.md by @brragorn in #1250
  • [Minor refactor] Moving model caching to oumi.utils by @kaisopos in #1246
  • Add more details to troubleshooting FAQ by @wizeng23 in #1249
  • Update training_methods.md - Change compute requirement suggestions by @mkoukoumidis in #1245
  • Update train.md - nit description change by @mkoukoumidis in #1244
  • [docs] misc docs feedback by @oelachqar in #1248
  • [tiny] Qwen2-VL activate experimental datapipes by @optas in #1247
  • Update Oumi - A Tour.ipynb by @brragorn in #1254
  • [docs] more docs feedback by @oelachqar in #1255
  • Update supported_models.md by @penfever in #1256
  • Rename experimental_use_torch_datapipes data param by @xrdaukar in #1257
  • Add pypi release workflow using testpypi by @oelachqar in #1259
  • Update workflow names by @oelachqar in #1262
  • Update default idle_minutes_to_autostop to 1 hour. by @taenin in #1264
  • update pypi release workflow to use trusted env by @oelachqar in #1265
  • Add padding_side param to internal model config by @xrdaukar in #1260
  • Documentation: Updates on Evaluation/Judge (based on Manos' feedback) by @kaisopos in #1261
  • [tiny] less strict requirements by @oelachqar in #1266
  • Add Deepseek R1 Distill Llama 8B/70B configs by @wizeng23 in #1263
  • Update index.md to highlight beta stage by @mkoukoumidis in #1268
  • Update README.md to highlight beta stage by @mkoukoumidis in #1267
  • Disable pre-release packages by @oelachqar in #1270
  • Update common_workflows.md - Clarify OpenAI is just an example by @mkoukoumidis in #1271
  • Documentation: Evaluation page (update to highlight multi-modal) by @kaisopos in #1269
  • Update launch.md by @taenin in #1272
  • Add pypi release workflow by @oelachqar in #1273
  • Documentation: Judge | minor edit (bold) by @kaisopos in #1274
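
On #1223: vLLM can load BitsAndBytes-quantized weights when both the quantization scheme and load format are set accordingly. A rough sketch of the underlying vLLM call (the model name is a placeholder, and the exact flags oumi passes may differ):

```python
from vllm import LLM, SamplingParams

# BitsAndBytes-quantized load; both flags were required by vLLM at the time.
llm = LLM(
    model="unsloth/tinyllama-bnb-4bit",
    quantization="bitsandbytes",
    load_format="bitsandbytes",
)
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```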

Full Changelog: v0.1.2...v0.1.3

v0.1.2.3

27 Jan 22:30
5a9564a

What's Changed

Full Changelog: v0.1.2.2...v0.1.2.3

v0.1.2.0-alpha

17 Jan 17:31
0016ff8
Pre-release

What's Changed

  • Update README.md - Better highlight features & nits by @mkoukoumidis in #995
  • [tiny] update docstring and cleanup by @oelachqar in #1006
  • Qwen2-VL: minor updates by @xrdaukar in #1000
  • Update README.md - Describe Oumi's most common capabilities by @mkoukoumidis in #996
  • Fix readme. by @taenin in #1009
  • Updated our ascii logo by @taenin in #1008
  • [docs] Update readme by @oelachqar in #1010
  • Cleanup scripts by @oelachqar in #1011
  • Cleanup experimental folder by @oelachqar in #1012
  • Update lists of supported VLMs in README and docs by @xrdaukar in #1014
  • Freeze Python package versions by @xrdaukar in #1007
  • Update blip2's chat template to use the "default" one by @xrdaukar in #1015
  • Add docstrings on how to start vLLM and SGLang servers for Llama-3.2-11B-Vision-Instruct by @xrdaukar in #1016
  • Evaluation: bugfixing, corner case, unit tests by @kaisopos in #1003
  • Configure asyncio_default_fixture_loop_scope to reduce pytest warnings by @xrdaukar in #1013
  • Update the registry to load registered core values upon use. by @taenin in #1017
  • Update default installation instructions to pypi by @taenin in #1018
  • [tiny] Update debug datasets by @oelachqar in #1020
  • [docs] Address misc docs feedback by @oelachqar in #1019
  • [tiny] update pre-defined judges and docs by @oelachqar in #1021
  • Parameterize e2e training test, and add config for Qwen2-VL by @xrdaukar in #1023
  • Remove our docs password from the readme. by @taenin in #1024
  • VLM docs update by @xrdaukar in #1025
  • Fix loading registered pretrain datasets by @wizeng23 in #1005
  • Update @requires_gpus test decorator to optionally specify min GPU memory requirement by @xrdaukar in #1029 (decorator sketch after this list)
  • [tiny] Update GitHub workflows by @oelachqar in #1034
  • Update BaseConfig.from_yaml to also support Path by @xrdaukar in #1026
  • [tiny] Cleanup judge engine builder & fix circular dep by @oelachqar in #1035
  • Create GPU GitHub Actions workflow by @oelachqar in #1004
  • Add structured outputs support to gemini/vertex engines by @oelachqar in #1022
  • [docs] Fix feedback on training and inference user guides by @oelachqar in #1037
  • [docs][tiny] fix examples in inference guide by @oelachqar in #1038
  • Add a sanity test for circular imports. by @taenin in #1030
  • Resolve circular dependencies in Oumi by @taenin in #1039
  • Move our circular dependency test to e2e to speed up GPU CI tests. by @taenin in #1040
  • Add custom inference engine for gemini API by @oelachqar in #1036
  • Define CLI in our quickstart. by @taenin in #1042
  • Skip running GPU tests on low-risk code paths by @oelachqar in #1043
  • Define more terms in our training docs. by @taenin in #1044
  • Fix the broken python text snippet on the train page. by @taenin in #1045
  • Fix the second python snippet in the train page. by @taenin in #1046
  • [docs] Add Gemini to the list of supported inference APIs, and sort them by @xrdaukar in #1048
  • Fix issues in most notebooks by @wizeng23 in #1047
  • [docs][tiny] remove termynal from sphinx conf by @oelachqar in #1041
  • Fix a typo in the VS Code environment page. by @taenin in #1049
  • Define WSL in our vscode docs. by @taenin in #1052
  • [tiny] disable unit tests on safe paths by @oelachqar in #1051
  • [docs] Fix contributing and open issue links by @oelachqar in #1050
  • [evaluations/generative_benchmark] Broken link by @kaisopos in #1054
  • Remove dangling reference to jupyter in Makefile help by @xrdaukar in #1053
  • [evaluations/generative_benchmark] Removing notebook link by @kaisopos in #1055
  • Support constrained decoding in SGLang inference engine by @xrdaukar in #1032
  • [tiny] Update tutorials page by @wizeng23 in #1056
  • Minor updates to Launch.md by @taenin in #1059
  • [docs] Update docs/user_guides/infer/infer.md by @xrdaukar in #1058
  • Nits for common_workflows.md by @mkoukoumidis in #1061
  • Nit fixes for acknowledgements.md by @mkoukoumidis in #1057
  • Add sample troubleshooting for remote jobs by @taenin in #1062
  • Add a Github Issues selector for questions and have it redirect to Discord. by @taenin in #1064
  • Package checking: Adding functionality for checking package versioning and failing fast by @kaisopos in #1031
  • Fix various typos in contributing.md by @taenin in #1066
  • SGLang inference documentation by @xrdaukar in #1065
  • Replace assert in NativeInferenceEngine with RuntimeError by @xrdaukar in #1068
  • Update dev set up instructions to use a Fork. by @taenin in #1067
  • Define inference configs for more models by @xrdaukar in #1069
  • [Evaluation] HF Leaderboards yaml files by @kaisopos in #1071
  • Specify engine: NATIVE in inference configs by @xrdaukar in #1075
  • Improve handling of image path and URLs by @xrdaukar in #1074
  • [Doc > Quickstart] Should we add links to guides for better discoverability? by @kaisopos in #1076
  • Add e2e tests for running tutorial notebooks by @oelachqar in #1079
  • Ignore all experimental files when running our circular dependency test. by @taenin in #1081
  • [Super Nit Doc Update] environments.md by @kaisopos in #1082
  • Add an env var for loading user registered values (dataset, models, clouds) when initializing the Oumi Registry by @taenin in #1077
  • Update internal model configs to support default tokenizer_pad_token and chat_template by model type by @xrdaukar in #1078
  • [Minor] Notebook typo by @kaisopos in #1085
  • Upgrade transformers to 4.47 by @wizeng23 in #1033
  • [tiny][docs] Update recipes page by @wizeng23 in #1072
  • Configure e2e integration test for Llama 3.2 Vision 11B by @xrdaukar in #1086
  • Nits for cli_reference.md by @mkoukoumidis in #1063
  • [Documentation] Evaluate | Leaderboards Page by @kaisopos in #1084
  • [Documentation] Evaluate | Main Page (revision) by @kaisopos in #1089
  • [tiny] Fix precommit by @oelachqar in #1092
  • Add timeout for unit & integration tests by @oelachqar in #1091
  • Add GitHub Actions workflow for doctests by @oelachqar in #1093
  • [docs] remove unused page, fix links by @oelachqar in #1094
  • [Documentation] Evaluate | Main Page (small refactor) by @kaisopos in #1095
  • Rewrite of the main Oumi Launch page. by @taenin in #1087
  • Remove pytest.mark.skip() for basic e2e tests by @xrdaukar in #1088
  • [tiny] Upgrade minimum numpy version to unblock python3.12 installation by @oelachqar in #1099
  • Update our Readme with a new header image. by @taenin in #1098
  • [docs] Minor refresh to dataset resource pages by @oelachqar in #1097
  • [docs] Add docs guide page by @oelachqar in #1096
  • Add a quick unit test to ensure new dependencies are not added to the top-level CLI by @taenin in https://github.com/o...
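
On #1029: gating GPU tests on a minimum amount of device memory avoids flaky failures on small cards. An illustrative pytest-style decorator in the spirit of @requires_gpus (oumi's real helper lives in its test utils and may differ):

```python
import pytest
import torch

def requires_gpus(count: int = 1, min_gb: float = 0.0):
    """Skip a test unless enough GPUs with enough memory are present."""
    ok = torch.cuda.is_available() and torch.cuda.device_count() >= count
    if ok and min_gb:
        total_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
        ok = total_gb >= min_gb
    return pytest.mark.skipif(not ok, reason=f"needs {count} GPU(s) with {min_gb}GB")

@requires_gpus(count=1, min_gb=24.0)
def test_large_model_inference():
    assert torch.cuda.is_available()
```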

v0.1.1.0-alpha.1

08 Jan 04:27
9ff5132
Pre-release

What's Changed

  • Minimal SkyPilot config for blip2 and llava models for GCP with TRL_SFT by @xrdaukar in #573
  • Inference Engine async writes by @taenin in #574
  • Misc cleanups in JsonlinesDataset by @xrdaukar in #576
  • Split out cloud dependencies by @taenin in #575
  • Disable sdpa for blip2 by @xrdaukar in #579
  • Set accelerate version to fix FSDP model saving by @wizeng23 in #580
  • Remove AWS as a default dep by @taenin in #582
  • Update ProfilerParams docstrings to follow the new style by @xrdaukar in #583
  • Minor updates in scripts/benchmarks/minimal_multimodal_training.py by @xrdaukar in #585
  • Add @override annotations to methods of few Dataset subclasses by @xrdaukar in #584
  • Add dataset class for dolly dataset by @oelachqar in #586
  • Refactor debugging/device utils, and add new GPU stats measurement functions by @xrdaukar in #587
  • Add text jsonlines dataset class by @oelachqar in #589
  • Define DataCollationParams by @xrdaukar in #581
  • Misc updates to Polaris launcher scripts by @xrdaukar in #591
  • Set up a new version of the Oumi CLI using Typer by @taenin in #588
  • Update handling of GPU fan speed info by @xrdaukar in #595
  • Add support for magpie dataset variants by @oelachqar in #594
  • Rename GenerationConfig to GenerationParams by @wizeng23 in #592
  • Fix cli infer test by @wizeng23 in #598
  • Judge Notebook 1: default judge by @kaisopos in #593
  • [Tiny] update missing dataset import by @oelachqar in #599
  • Update training script to support data collators by @xrdaukar in #590
  • Update accelerate version to 1.0.0 by @wizeng23 in #601
  • Remove deprecated dataset code paths by @oelachqar in #596
  • Refactor Aya & Ultrachat to use oumi dataset sft classes by @oelachqar in #597
  • Add Llama train/eval/infer E2E integration test by @wizeng23 in #602
  • Set docstring for NVidiaGpuRuntimeInfo struct by @xrdaukar in #603
  • Add generation params to inference engines by @oelachqar in #600
  • [bug] Fix issue loading jsonl datasets from file by @oelachqar in #604
  • Add Llama 3B configs by @wizeng23 in #605
  • Align pyright checks with latest Pylance version by @oelachqar in #611
  • Fix apply_chat_template issue in VisionLanguageSftDataset by @xrdaukar in #609
  • More robust make setup by @oelachqar in #610
  • Fix a bug where the new CLI was improperly importing functions from top-level modules. by @taenin in #613
  • Add support for the Launch command suite in the new CLI by @taenin in #612
  • Support HuggingFaceH4/llava-instruct-mix-vsft dataset by @xrdaukar in #608
  • [tiny] Fix .gitignore by @wizeng23 in #616
  • [tiny] add gpt2 chat template, and update tests to use it by @oelachqar in #617
  • Turn off pretty-printing exceptions in our CLI by @taenin in #618
  • Cleanup dependencies by @oelachqar in #615
  • Upgrade oumi dependencies by @oelachqar in #606
  • Update makefile to use uv, add Jupyter target by @oelachqar in #614
  • Add miniconda installation target, cleanup unused make commands by @oelachqar in #620
  • Update several notebooks with the new EvaluationConfig format. by @taenin in #621
  • Make sure conda env is registered by @oelachqar in #622
  • Add Llama 3b sft/lora/qlora configs for Polaris by @wizeng23 in #626
  • Add check if installation is successful by @oelachqar in #625
  • Initial Cambrian integration by @xrdaukar in #557
  • [tiny] alpaca - minor reproducibility boost by @optas in #619
  • explicitly specify the model's dtype in LMH by @optas in #607
  • [tiny] Add flops for T4 GPU by @wizeng23 in #628
  • Use a timestamp for job directories on Polaris by @taenin in #627
  • [tiny] Fix bug with Polaris job num by @wizeng23 in #629
  • Update two VLLM configs. by @xrdaukar in #624
  • Add pip install -U uv; to make setup for existing envs by @xrdaukar in #630
  • Disable MFU logging for non-packed datasets by @wizeng23 in #632
  • Add config example for long context fine-tuning by @oelachqar in #631
  • Add distribution mode flag to llama_tune by @wizeng23 in #635
  • Judge Notebook 2: Custom Judge by @kaisopos in #623
  • Bugfixes for LLAVA by @xrdaukar in #634
  • Update sphinx config and docs to fix misc errors and warnings by @oelachqar in #639
  • Factor out OUMI_TOTAL_NUM_GPUS env var by @wizeng23 in #636
  • Remove bitsandbytes from train dependencies by @oelachqar in #643
  • Enable intersphinx to allow linking to external documentation pages by @oelachqar in #640
  • Tune few training params for LLAVA and blip2 models by @xrdaukar in #642
  • Added support for specifying the inference engine via the InferenceConfig by @taenin in #638
  • Add popular pre-training dataset classes by @oelachqar in #641
  • Remove openai dependency by @oelachqar in #644
  • Update our documentation to point to the new CLI. by @taenin in #645
  • Enable dataloaders for VLMs (llava and blip2) by @xrdaukar in #646
  • Allow gradient clipping to be optional by @optas in #649
  • Add support for add_generation_prompt in LLAVA chat template by @xrdaukar in #648
  • Add a description to the Launch CLI by @taenin in #651
  • Add all Llama FSDP GCP configs by @wizeng23 in #637
  • Coerce model params to correct dtype for QLoRA FSDP by @wizeng23 in #652
  • Use uv for pip install commands by @wizeng23 in #653
  • Update sphinx docs by @oelachqar in #654
  • [Docs] Refactor docs pipeline by @oelachqar in #655
  • [docs] swap and configure sphinx theme by @oelachqar in #656
  • [Docs] Add documentation placeholders by @oelachqar in #658
  • [Docs] Add sphinx-bibtex by @oelachqar in #659
  • [Docs] fix rendering issues by @oelachqar in #660
  • [docs] fix broken links by @oelachqar in #661
  • Fix broken link in readme (dev_setup) by @kaisopos in #662
  • [docs][tiny] fix minor doc typos by @oelachqar in #666
  • [docs] add autodoc2 template by @oelachqar in #665
  • [docs] Add content links and references by @oelachqar in #668
  • [docs] switch to myst-nb for rendering notebooks by @oelachqar in #669
  • [docs] Add script to generate module summaries by @oelachqar in #670
  • [docs] Include cli reference by @oelachqar in #671
  • Add dataset submodules by @oelachqar in #667
  • Update notebooks to include a descriptive title by @oelachqar in #664
  • Update tests/utils/test_device_utils.py by @xrdaukar in #672
  • [Inference] Bug in generation config stop tokens by @kaisopos in #663
  • Support rewriting special label values to -100 (ignore_index) to exclude them from the loss by @xrdaukar in #657 (sketch after this list)
  • Rename emails and website url to Oumi by @wizeng23 in #675
  • Update scri...
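
On #657: PyTorch's cross-entropy loss skips positions labeled -100 (its default ignore_index), so rewriting special label values to -100 excludes them from the loss. A minimal sketch:

```python
import torch
import torch.nn.functional as F

PAD_TOKEN_ID = 0
labels = torch.tensor([[5, 7, PAD_TOKEN_ID, PAD_TOKEN_ID]])
# Rewrite pad positions to -100 so they are excluded from the loss.
labels = labels.masked_fill(labels == PAD_TOKEN_ID, -100)

logits = torch.randn(1, 4, 10)  # (batch, seq, vocab)
loss = F.cross_entropy(logits.view(-1, 10), labels.view(-1), ignore_index=-100)
print(loss)  # pad positions contribute nothing
```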

Initial release

02 Oct 22:03
d14a4db
Pre-release

What's Changed

  • Add python project configs by @oelachqar in #1
  • Add repo skeleton by @oelachqar in #2
  • Export lema entrypoint scripts by @oelachqar in #3
  • Update static type checking config by @oelachqar in #5
  • Add example jupyter / colab notebook by @oelachqar in #4
  • Refactor config parsing to use omegaconf by @oelachqar in #6
  • Updating documentation (Dev Environment Setup) by @kaisopos in #7
  • Add tests and vscode config by @oelachqar in #8
  • Added DPOTrainer example to repo, as well as cuda device cleanup to training loop by @jgreer013 in #9
  • Adding torch as top-level module dependency by @optas in #10
  • Add configs for specific hardware requirements by @jgreer013 in #11
  • Sort pre-commit hooks lexicographically by @xrdaukar in #12
  • Add logging config by @oelachqar in #13
  • Lema inference by @xrdaukar in #14
  • Panos dev by @optas in #16
  • Add job launcher by @oelachqar in #15
  • Making split of data a flexible variable by @optas in #17
  • Configure max file size in precommit hooks by @xrdaukar in #18
  • Minor bugfix and documentation update by @oelachqar in #19
  • adding pynvml to train env by @kaisopos in #20
  • Panos dev by @optas in #22
  • Augmenting Types for training hyperparams by @optas in #23
  • Train refactoring (config file visibility) + a few minor changes by @kaisopos in #21
  • Minimal test for train function by @xrdaukar in #25
  • Fix leftover '_torch_dtype' in 'ModelParams' by @xrdaukar in #26
  • Update GPU types list in the default SkyPilot config by @xrdaukar in #27
  • Add a missing lema-infer command under [project.scripts] by @xrdaukar in #28
  • add basic pytests for evaluate and infer by @xrdaukar in #29
  • Update README and pyproject.toml by @wizeng23 in #30
  • A helper function to print info about available CUDA devices by @xrdaukar in #31
  • Update SkyPilot config to start using torchrun by @xrdaukar in #32
  • Support basic single-node, multi-gpu training by @xrdaukar in #33
  • Run all precommit hooks on the repo by @xrdaukar in #35
  • Add experimental code for llama cpp inference by @jgreer013 in #37
  • Create skeleton of STYLE_GUIDE.md by @xrdaukar in #36
  • Adding support for training custom models (for now just a dummy model). by @kaisopos in #38
  • Fix custom model name in test_train.py by @xrdaukar in #39
  • Configure pyright (static type checker) and resolve existing type errors to make it pass by @xrdaukar in #41
  • fix trailing whitespace warning in STYLE_GUIDE.md by @xrdaukar in #43
  • Configure initial GitHub Actions workflow to run pre-commits and tests by @xrdaukar in #44
  • A variety of proposed extensions to finetune a chat-based model (starting with Zephyr) by @optas in #34
  • Fix syntax error in ultrachat by @xrdaukar in #48
  • Create initial version of CONTRIBUTING.md by @xrdaukar in #46
  • Reduce the number of training steps from 5 to 3 to make test_train.py faster by @xrdaukar in #49
  • Adding registry for custom models. by @kaisopos in #42
  • Add config and streaming args to DataParams by @wizeng23 in #47
  • Update Pre-review Tests to only run on pull_request by @xrdaukar in #50
  • Add training flags to compute token-based stats by @xrdaukar in #51
  • reduce test training steps in another test which I missed before by @xrdaukar in #53
  • Rename var names of *Params classes by @wizeng23 in #52
  • Make some NVIDIA-specific dependencies optional by @xrdaukar in #54
  • fix trl version as 0.8.6 by @xrdaukar in #56
  • Remove reference to torch.cuda.clock_rate by @xrdaukar in #57
  • Update inference to support non-interactive batch mode. by @kaisopos in #58
  • Update README.md to include Linux/WSL specific instructions by @xrdaukar in #59
  • Minor formatting improvements in README.md by @xrdaukar in #60
  • Minor: Updating Lora Params by @optas in #55
  • Support dataset packing by @wizeng23 in #63
  • Disallow relative imports in LeMa by @xrdaukar in #65
  • Add text_col param that's required for SFTTrainer by @wizeng23 in #66
  • Refactor common config parsing logic (YAML, arg_list) into a common util by @xrdaukar in #68 (omegaconf sketch after this list)
  • Standardize test naming convention by @wizeng23 in #69
  • Adding support for a hardcoded evaluation with MMLU. by @kaisopos in #67
  • Minor changes to the default configs/skypilot/sky.yaml config by @xrdaukar in #71
  • Prototype to pass config.model.model_max_length to Trainers by @xrdaukar in #70
  • [Inference] Remove the prepended prompts from model responses. by @kaisopos in #73
  • Add a util to print versioning info by @xrdaukar in #74
  • Switch to tempfile.TemporaryDirectory() in test_train.py by @xrdaukar in #75
  • Update docstring verbs to descriptive form by @wizeng23 in #76
  • Add sample accelerate and fsdp configs by @xrdaukar in #77
  • Refactor code to get device rank and world size into a helper function by @xrdaukar in #79
  • Add a simple util to print model summary e.g., layer names, architecture summary by @xrdaukar in #80
  • Freeze numpy to pre 2.0 version by @xrdaukar in #81
  • Adding inference support for next logit probability. by @kaisopos in #78
  • Create FSDP configs for Phi3 by @xrdaukar in #82
  • Auto-format pyproject.toml with "Even Better TOML" by @xrdaukar in #83
  • Minor cleanup updates to SkyPilot configs by @xrdaukar in #84
  • Mixed Precision Training, Flash-Attention-2, Print-trainable-params by @optas in #85
  • Update README.md to include basic instructions for multi-GPU training (DDP, FSDP) by @xrdaukar in #86
  • Start using $SKYPILOT_NUM_GPUS_PER_NODE in SkyPilot config by @xrdaukar in #90
  • Add configs for FineWeb Llama2 pretraining by @wizeng23 in #89
  • Quantization by @optas in #87
  • Update the default SkyPilot config to print more debug/context info by @xrdaukar in #92
  • Add license by @oelachqar in #93
  • Initial version of SkyPilot config for multi-node training (num_nodes: N) by @xrdaukar in #94
  • MMLU eval refactor. by @kaisopos in #88
  • Remove comparison between LOCAL_RANK and RANK by @xrdaukar in #96
  • Handling the loading of peft adapters and other minor issues (e.g., adding more logging parameters) by @optas in #91
  • Update configs/skypilot/sky_llama2b.yaml to start using sky_init.sh by @xrdaukar in #97
  • Add bool param to resume training from the last known checkpoint (if exists) by @xrdaukar in #99
  • Inference: save/restore probabilities to/from file. by @kaisopos in #98
  • Add support for dataset mixtures during training by @taenin in #95
  • Add train, test, and validation splits to the LeMa config. by @taenin in #101
  • nanoGPT (GPT2) pretraining recipe by @wizeng23 in #103
  • Minor: Updates on Zephyr-Config by @optas in https://githu...
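
On #6 and #68: omegaconf makes the YAML-plus-CLI-overrides pattern straightforward, since a dotlist of overrides can be merged onto a base config. A sketch of the pattern (keys are illustrative, not LeMa's actual schema):

```python
from omegaconf import OmegaConf

# Base config (in practice loaded from YAML via OmegaConf.load).
base = OmegaConf.create({"model": {"name": "gpt2"}, "training": {"lr": 1e-4}})
# CLI-style "key=value" overrides, merged on top of the base.
overrides = OmegaConf.from_dotlist(["training.lr=5e-5", "model.name=gpt2-medium"])
config = OmegaConf.merge(base, overrides)
print(config.training.lr)  # 5e-05
```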