What's Changed
- GRPO trainer: Minimal initial integration by @nikg4 in #1482
- Update oumi infer to fall back to interactive mode if no input path is specified. by @taenin in #1483
- Add sample DDP/GCP config for GRPO trainer by @nikg4 in #1485
- Temporary fix for chat template issue with multimodal inference w/ in-process vLLM engine by @nikg4 in #1486
- [tiny] Update async_eval.yaml comments to reference correct class by @wizeng23 in #1488
- Fix a bug where overriding remote_params fails via the CLI (oumi infer) by @taenin in #1487
- Define `GrpoParams` under configs by @nikg4 in #1490
- Support more GRPO params by @nikg4 in #1491
- Minor updates to `oumi env` by @nikg4 in #1492
- Warn instead of error when device not found for MFU calculation by @wizeng23 in #1489
- Updated all CLI endpoints to support oumi:// prefix by @Spaarsh in #1468
- Fix chat template issue for nested content parts used for VLMs by @nikg4 in #1493
- Ctseng777/judge by @ctseng777 in #1474
- [Evaluation] Modularization & enabling custom evaluations by @kaisopos in #1484
- Update documentation formatting for BaseModel by @taenin in #1494
- Fix `log_samples` not propagating from `eval_kwargs` by @jgreer013 in #1496
- [Evaluation] Adding support for logging model samples for all backends by @kaisopos in #1499
- Support for deprecated input param (`evaluation_platform`) by @kaisopos in #1500
- Limiting the AlpacaEval number of samples for quickstart by @kaisopos in #1501
- Add recurring tests to keep our test badges updated. by @taenin in #1498
- Add a schedule for our GPU, CPU, and doc tests by @taenin in #1503
- Update the GPU Tests badge to use results from main by @taenin in #1504
- vLLM version increment by @nikg4 in #1502
- Minor logging improvements by @nikg4 in #1505
- [Evaluation] Save Utils: Moving, fixes, and unit tests by @kaisopos in #1506
- Update sample GRPO script to validate num_generations by @nikg4 in #1509
- Resolve warning about `--dispatch batches` deprecated param by @nikg4 in #1510
- [Evaluation] Re-enabling evaluations with Math Hard (`leaderboard_math_hard`) by @kaisopos in #1511
- Update docker image and build script by @oelachqar in #1508
- Add Qwen QwQ Lora config by @wizeng23 in #1514
- Add QwQ eval/infer configs by @wizeng23 in #1515
- [Evaluation] Instantiating an inference engine (if needed) when running custom evaluations by @kaisopos in #1513
- Switch eval yaml configs to use evaluation_platform by @wizeng23 in #1516
- Mark `BaseMapDataset` as `typing.Sized` by @nikg4 in #1517
- VLM collator refactor by @nikg4 in #1512
Full Changelog: v0.1.7...v0.1.8