Add generation params to inference engines #600

oelachqar · 2024-10-07T20:31:33Z

Add Generation Parameters to Inference Engines

This PR adds a set of missing generation parameters and updates all current inference engines to support them. Where an engine does not support a parameter, it logs a warning that the parameter will be ignored.

The following parameters are added: temperature, top_p, frequency_penalty, presence_penalty, stop sequences, logit_bias, and min_p.

Towards OPE-328

Changes

Updated GenerationParams class with new parameters
Implemented parameter support in AnthropicInferenceEngine, LlamaCppInferenceEngine, NativeTextInferenceEngine, RemoteInferenceEngine, and VLLMInferenceEngine
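
As a rough illustration of the "support or warn" behavior described above, the sketch below maps a parameter object onto an OpenAI-style chat completion request body and logs a warning for a parameter that the target API is assumed not to accept. SketchGenerationParams and build_request_body are hypothetical names used only for this example, not the classes or helpers changed in this PR, and the default values are assumptions.

```python
import logging
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

logger = logging.getLogger(__name__)


@dataclass
class SketchGenerationParams:
    """Hypothetical stand-in for GenerationParams; defaults are assumptions."""

    max_new_tokens: int = 256
    temperature: float = 1.0
    top_p: float = 1.0
    frequency_penalty: float = 0.0
    presence_penalty: float = 0.0
    stop: Optional[List[str]] = None
    logit_bias: Dict[int, float] = field(default_factory=dict)
    min_p: float = 0.0


def build_request_body(params: SketchGenerationParams) -> Dict[str, Any]:
    """Translate params into an OpenAI-style request body (illustrative only)."""
    body: Dict[str, Any] = {
        "max_tokens": params.max_new_tokens,
        "temperature": params.temperature,
        "top_p": params.top_p,
        "frequency_penalty": params.frequency_penalty,
        "presence_penalty": params.presence_penalty,
    }
    if params.stop:
        body["stop"] = list(params.stop)
    if params.logit_bias:
        # JSON object keys are strings, so token ids are stringified.
        body["logit_bias"] = {str(k): v for k, v in params.logit_bias.items()}
    if params.min_p > 0.0:
        # Assumption: the target API has no min_p field, so it is dropped.
        logger.warning("min_p is not supported by this engine and will be ignored.")
    return body
```

Only parameters that differ from their "off" value trigger the warning here, so default configurations stay quiet.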

Usage

Setting Generation Parameters

from oumi.core.configs.params.generation_params import GenerationParams

params = GenerationParams(
    max_new_tokens=100,
    temperature=0.7,
    top_p=0.9,
    frequency_penalty=0.1,
    presence_penalty=0.1,
    stop=["END"],
    logit_bias={50256: -100},  # Decrease likelihood of EOS token
    min_p=0.05
)

Using Parameters with an Inference Engine

from oumi.inference.llama_cpp_inference_engine import LlamaCppInferenceEngine

# model_params and conversation are assumed to be defined elsewhere.
engine = LlamaCppInferenceEngine(model_params, generation_params=params)
response = engine.generate(conversation)

linear · 2024-10-07T20:31:35Z

OPE-328

taenin

To simplify the warning logic, we could implement a get_supported_params method in BaseInferenceEngine that returns an empty set by default. infer() would call this method and log warnings where relevant. Each derived class could then simply override the method and get alerting as needed. Food for thought, though; it's not strictly necessary since we only have a few engines.

Additionally, we should update our unit tests to verify that the new parameters are passed through appropriately.
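
A minimal sketch of how that suggestion might look, assuming the params object is a dataclass whose fields all have defaults. SketchParams, SketchBaseEngine, SketchRemoteEngine, warn_unsupported, and _infer are hypothetical names for illustration; only get_supported_params and infer come from the comment above.

```python
import logging
from dataclasses import dataclass, fields
from typing import List, Set

logger = logging.getLogger(__name__)


@dataclass
class SketchParams:
    """Hypothetical params object; field defaults are assumptions."""

    temperature: float = 1.0
    top_p: float = 1.0
    min_p: float = 0.0


class SketchBaseEngine:
    def get_supported_params(self) -> Set[str]:
        # Default: nothing is supported, so every non-default value warns.
        return set()

    def warn_unsupported(self, params: SketchParams) -> List[str]:
        """Log a warning for each non-default parameter the engine ignores."""
        supported = self.get_supported_params()
        defaults = type(params)()
        ignored = []
        for f in fields(params):
            if f.name in supported:
                continue
            if getattr(params, f.name) != getattr(defaults, f.name):
                logger.warning(
                    "%s does not support '%s'; it will be ignored.",
                    type(self).__name__,
                    f.name,
                )
                ignored.append(f.name)
        return ignored

    def infer(self, prompt: str, params: SketchParams) -> str:
        self.warn_unsupported(params)
        return self._infer(prompt, params)

    def _infer(self, prompt: str, params: SketchParams) -> str:
        raise NotImplementedError


class SketchRemoteEngine(SketchBaseEngine):
    def get_supported_params(self) -> Set[str]:
        return {"temperature", "top_p"}

    def _infer(self, prompt: str, params: SketchParams) -> str:
        return f"echo: {prompt}"  # Placeholder for a real backend call.
```

With this shape, a derived engine only declares what it supports, and the base class owns the comparison against defaults and the log message.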

oelachqar · 2024-10-08T04:22:06Z

That's a great suggestion; logged OPE-546 to address it as a follow-up. Updated this PR to include unit tests for the generation params.
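
For readers curious what such a test could look like, here is a rough sketch using unittest.mock. It is not the test added in this PR: SketchParams, SketchEngine, and the backend's complete method are hypothetical, and the only point is the pattern of asserting that each parameter reaches the backend call.

```python
from dataclasses import dataclass
from unittest.mock import MagicMock


@dataclass
class SketchParams:
    """Hypothetical params object; defaults are assumptions."""

    max_new_tokens: int = 256
    temperature: float = 1.0
    top_p: float = 1.0


class SketchEngine:
    """Hypothetical engine that forwards params to an injected backend."""

    def __init__(self, backend):
        self._backend = backend

    def infer(self, prompt: str, params: SketchParams) -> str:
        return self._backend.complete(
            prompt,
            max_tokens=params.max_new_tokens,
            temperature=params.temperature,
            top_p=params.top_p,
        )


def test_generation_params_are_forwarded():
    backend = MagicMock()
    backend.complete.return_value = "ok"
    engine = SketchEngine(backend)

    result = engine.infer(
        "Hello", SketchParams(max_new_tokens=100, temperature=0.7, top_p=0.9)
    )

    assert result == "ok"
    # Fails if any parameter is dropped, renamed, or given a different value.
    backend.complete.assert_called_once_with(
        "Hello", max_tokens=100, temperature=0.7, top_p=0.9
    )
```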

oelachqar added 9 commits October 7, 2024 12:21

move params

f10ef57

update

ef31666

update

884b22a

update

cc9bda1

update

93ca627

add min_p

5831ce6

cleanup

4921413

update type

fa8f772

Merge branch 'main' into oelachqar/add_generation_params

5fb05cd

oelachqar requested review from taenin, wizeng23, jgreer013 and nikg4 October 7, 2024 20:31

fix omegaconf type

7870f0e

nikg4 reviewed Oct 7, 2024

src/oumi/inference/remote_inference_engine.py (resolved)

nikg4 reviewed Oct 7, 2024

src/oumi/core/configs/params/generation_params.py (resolved)

nikg4 reviewed Oct 7, 2024

src/oumi/core/configs/params/generation_params.py (outdated, resolved)

taenin reviewed Oct 7, 2024

pr feedback

e2e0036

nikg4 approved these changes Oct 7, 2024

nikg4 reviewed Oct 7, 2024

src/oumi/inference/llama_cpp_inference_engine.py (outdated, resolved)

oelachqar added 6 commits October 7, 2024 17:56

fix tests

0af3668

fix test

20bd103

add tests

02e9cab

pr feedback

2186fb5

Merge branch 'main' into oelachqar/add_generation_params

eb220d5

pr feedback

aa269de

taenin approved these changes Oct 8, 2024

oelachqar merged commit 5f3780e into main Oct 8, 2024
1 check passed

oelachqar deleted the oelachqar/add_generation_params branch October 8, 2024 16:52
