[Feature] KL Transform for RLHF #1196

Merged: 31 commits into main from kl_transform, May 30, 2023

vmoens (Contributor) commented May 26, 2023

@facebook-github-bot added the CLA Signed label May 26, 2023
@vmoens added the enhancement label May 26, 2023
tcbegley (Contributor) left a comment

LGTM!

actor, keep_params=False, funs_to_decorate=["forward", "get_dist"]
)
self.functional_actor = deepcopy(actor)
repopulate_module(actor, params)
Contributor

Why do we repopulate here when, below, we only seem to call with `params=self.frozen_params`? Is this so that the caller can continue to use `actor` after supplying it as an argument to the constructor?

Contributor Author

We don't want users to be left with an actor without parameters, that's all.
There wouldn't be much to train :p
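
For readers following the exchange above, here is a minimal, library-free sketch of the pattern being discussed: strip the parameters out of the actor, keep a parameter-free copy inside the transform, then put the parameters back so the caller's actor is still usable. `ToyActor`, `strip_params`, `restore_params` and `ToyKLTransform` are hypothetical stand-ins invented for illustration, not the functions used in this PR, and taking `frozen_params` as a deep copy of the extracted parameters is an assumption.

```python
from copy import deepcopy


class ToyActor:
    """Hypothetical stand-in for an actor module; parameters live in a dict."""

    def __init__(self):
        self.params = {"weight": 2.0, "bias": 1.0}

    def forward(self, x, params=None):
        # Use explicitly supplied params if given, otherwise the module's own.
        p = self.params if params is None else params
        return p["weight"] * x + p["bias"]


def strip_params(module):
    """Hypothetical analogue of functionalizing: remove and return the params."""
    params, module.params = module.params, {}
    return params


def restore_params(module, params):
    """Hypothetical analogue of repopulating: put the params back in place."""
    module.params = params


class ToyKLTransform:
    def __init__(self, actor):
        params = strip_params(actor)             # actor is now parameter-free
        self.functional_actor = deepcopy(actor)  # stateless copy kept by the transform
        self.frozen_params = deepcopy(params)    # assumed: frozen snapshot for reference calls
        restore_params(actor, params)            # the caller's actor is usable again

    def reference_output(self, x):
        # The stateless copy is always called with the frozen parameters.
        return self.functional_actor.forward(x, params=self.frozen_params)


actor = ToyActor()
transform = ToyKLTransform(actor)
actor.params["weight"] = 5.0  # simulate a training update on the caller's actor
```

In this sketch the caller's `actor` keeps training with its own parameters, while the transform's copy holds none and only ever sees the frozen snapshot. Without the restore step, the actor passed to the constructor would come back empty.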

vmoens added 2 commits May 30, 2023 15:47
# Conflicts:
#	torchrl/csrc/utils.h
#	torchrl/modules/distributions/continuous.py
@vmoens merged commit e8f5efe into main May 30, 2023
@vmoens deleted the kl_transform branch May 30, 2023 15:53
4 participants