[Performance] Accelerate TD lambda return estimate #1158

Merged (10 commits) on May 18, 2023

@Blonck (Contributor) commented May 15, 2023

Description

Implement a version of vec_td_lambda_return_estimate optimized for the case where gamma is a scalar.

Motivation and Context

When gamma is a scalar, the same optimization as in #1141 applies. Instead of constructing a large gamma tensor of shape [B, T, T], one can split the consecutive traces at each done flag and zero-pad them to a common length:

reward = [r00, r01, r02, r03, r10, r11]
done = [False, False, False, True, False, False]

into

r_transformed = [
    [r00, r01, r02, r03],
    [r10, r11, 0, 0]
]

and then apply a gamma filter to this transformed input.
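
The transformation above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from this PR: the helper names `split_and_pad` and `gamma_filter` are hypothetical, the real implementation operates on batched tensors, and the filter here computes a plain discounted sum rather than the full TD lambda return.

```python
def split_and_pad(reward, done):
    """Split a flat reward sequence at each done flag and zero-pad the pieces.

    Hypothetical helper for illustration only.
    """
    trajectories, current = [], []
    for r, d in zip(reward, done):
        current.append(r)
        if d:  # a True flag closes the current trajectory
            trajectories.append(current)
            current = []
    if current:  # keep a trailing trajectory that has not terminated yet
        trajectories.append(current)
    width = max(len(t) for t in trajectories)
    return [t + [0.0] * (width - len(t)) for t in trajectories]


def gamma_filter(rows, gamma):
    """Per-row discounted suffix sum: out[t] = sum_k gamma**k * row[t + k]."""
    out = []
    for row in rows:
        acc, filtered = 0.0, []
        for r in reversed(row):  # accumulate from the last step backwards
            acc = r + gamma * acc
            filtered.append(acc)
        out.append(filtered[::-1])
    return out


reward = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
done = [False, False, False, True, False, False]

padded = split_and_pad(reward, done)
# padded == [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 0.0, 0.0]]

returns = gamma_filter(padded, gamma=0.5)
# returns == [[3.25, 4.5, 5.0, 4.0], [8.0, 6.0, 0.0, 0.0]]
```

Zero padding is safe here because a padded zero contributes nothing to the discounted sum of the steps before it, and each row is filtered independently, so no reward leaks across a done boundary. The padded layout needs storage proportional to the number of trajectories times the longest trajectory, rather than a [B, T, T] tensor.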

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

  • [x] Bug fix (non-breaking change which fixes an issue)

Closes #1148

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • I have read the CONTRIBUTION guide (required)
  • I have updated the tests accordingly (required for a bug fix or a new feature).

…bda are scalars

Patch is not ready for merge.
@facebook-github-bot added the CLA Signed label May 15, 2023
@Blonck (Contributor Author) commented May 16, 2023

Performance comparison for inputs of shape (16, T, 1) on GPU. The plot shows the ratio runtime(algorithm)/runtime(td_lambda_return_estimate). A ratio of 1 means the algorithm has the same runtime as the unvectorized version of TD lambda; values below 1 mean it is faster.

[Figure: newplot (1)]
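
A ratio of this kind could be measured along the following lines. This is a minimal sketch, not the benchmark script used for the plot: `runtime_ratio` is a hypothetical helper, and the two workloads are CPU stand-ins. For GPU code, each timed callable would additionally need to wait for the device to finish (for example by calling `torch.cuda.synchronize()`), otherwise only the kernel launch is timed.

```python
import timeit


def runtime_ratio(candidate, baseline, repeat=5, number=10):
    """Return runtime(candidate) / runtime(baseline).

    Hypothetical helper: takes the best of `repeat` timings for each
    callable to reduce noise from other processes.
    """
    t_candidate = min(timeit.repeat(candidate, repeat=repeat, number=number))
    t_baseline = min(timeit.repeat(baseline, repeat=repeat, number=number))
    return t_candidate / t_baseline


def baseline_sum():
    # Stand-in for the unvectorized reference: an explicit Python loop.
    total = 0
    for i in range(10_000):
        total += i
    return total


def candidate_sum():
    # Stand-in for the optimized variant: the built-in sum.
    return sum(range(10_000))


ratio = runtime_ratio(candidate_sum, baseline_sum)
print(f"runtime ratio: {ratio:.2f}")  # below 1 means the candidate is faster
```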

@Blonck requested a review from vmoens May 16, 2023 06:16
@Blonck self-assigned this May 16, 2023
@Blonck added the performance label May 16, 2023
@vmoens (Contributor) left a comment

LGTM thanks for this!

@vmoens vmoens merged commit e8a43b9 into pytorch:main May 18, 2023