[Feature] Single call to value network in advantages #1256

vmoens · 2023-06-12T08:02:29Z

This PR allows advantage modules to call the value model once and only once.
If adv.shifted is set to True and the params at t and t+1 match, the value net is called only once.
In all other cases, vmap is used to batch the calls to the value net.
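
For readers unfamiliar with the idea, here is a minimal, framework-free sketch of the single-call case. It is not the code in this PR: `CountingValueNet`, `values_two_calls` and `values_single_call` are hypothetical names, and the concatenate-then-slice approach is an assumption about how a "shifted" evaluation can work. It also assumes a single trajectory with no episode boundary, so that the next observation at step t is the observation at step t+1.

```python
# Hypothetical illustration only; this does not use or reproduce TorchRL code.


class CountingValueNet:
    """Toy value function that records how many times it is called."""

    def __init__(self, weight):
        self.weight = weight
        self.calls = 0

    def __call__(self, observations):
        self.calls += 1
        return [self.weight * obs for obs in observations]


def values_two_calls(value_net, obs, next_obs):
    """Baseline: one call for the states at t, a second call for the states at t+1."""
    return value_net(obs), value_net(next_obs)


def values_single_call(value_net, obs, next_obs):
    """Shifted variant (assumed mechanism): only valid when the same parameters
    are used at t and t+1 and next_obs[t] == obs[t + 1] along the trajectory."""
    # Evaluate obs[0..T-1] plus the final next observation in a single call.
    all_values = value_net(obs + [next_obs[-1]])
    # Slice the result into the value at t and the value at t+1.
    return all_values[:-1], all_values[1:]


obs = [0.0, 1.0, 2.0]
next_obs = [1.0, 2.0, 3.0]

net_a = CountingValueNet(weight=0.5)
net_b = CountingValueNet(weight=0.5)
baseline = values_two_calls(net_a, obs, next_obs)
shifted = values_single_call(net_b, obs, next_obs)
```

Both functions return the same pair of value lists, but the shifted variant calls the toy network once instead of twice. When the parameters at t and t+1 differ (for example with target params), this shortcut no longer applies, which is the case the PR handles with vmap.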

cc @tcbegley @apbard

skandermoalla · 2023-06-13T10:23:24Z

torchrl/objectives/value/advantages.py

+        # kwargs = {}
+        # if self.is_stateless and params is None:
+        #     raise RuntimeError(
+        #         "Expected params to be passed to advantage module but got none."
+        #     )
+        # if params is not None:
+        #     kwargs["params"] = params
+        #
+        # if self.value_network is not None:
+        #     with hold_out_net(self.value_network):
+        #         # we may still need to pass gradient, but we don't want to assign grads to
+        #         # value net params
+        #         self.value_network(tensordict, **kwargs)
+        #
+        # value = tensordict.get(self.tensor_keys.value)
+        #
+        # step_td = step_mdp(tensordict)
+        # if target_params is not None:
+        #     # we assume that target parameters are not differentiable
+        #     kwargs["params"] = target_params
+        # elif "params" in kwargs:
+        #     kwargs["params"] = kwargs["params"].detach()
+        # if self.value_network is not None:
+        #     with hold_out_net(self.value_network):
+        #         # we may still need to pass gradient, but we don't want to assign grads to
+        #         # value net params
+        #         self.value_network(step_td, **kwargs)
+        # next_value = step_td.get(self.tensor_keys.value)


Is the commented-out code left on purpose?


init

b70a00f

facebook-github-bot added the CLA Signed label Jun 12, 2023

vmoens added 2 commits June 12, 2023 12:59

amend

ed55ee0

amend

44dffb5

vmoens added the enhancement label Jun 12, 2023

vmoens merged commit fccad08 into main Jun 13, 2023

vmoens deleted the single_call_adv branch June 13, 2023 09:40

skandermoalla reviewed Jun 13, 2023


vmoens restored the single_call_adv branch June 13, 2023 10:27

vmoens added a commit that referenced this pull request Jun 13, 2023

Revert "[Feature] Single call to value network in advantages (#1256)"

af72c56

This reverts commit fccad08.

vmoens mentioned this pull request Jun 13, 2023

Revert "[Feature] Single call to value network in advantages" #1262

Merged
