[Feature] Add Choice spec #2713

Merged (4 commits) on Feb 3, 2025


@kurtamohler (Collaborator) commented Jan 22, 2025

Stack from ghstack (oldest at bottom):

Close #2712

[ghstack-poisoned]

pytorch-bot bot commented Jan 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2713

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

kurtamohler added a commit that referenced this pull request Jan 22, 2025
ghstack-source-id: e0092dfb35c160986610b6acefc6d925f44da7f4
Pull Request resolved: #2713
@facebook-github-bot facebook-github-bot added the CLA Signed label Jan 22, 2025
@kurtamohler kurtamohler requested a review from vmoens January 23, 2025 00:00
Contributor

@vmoens vmoens left a comment

Thanks, I left a few comments. Feel free to disagree if you think it won't work!

@vmoens vmoens added the enhancement label Jan 23, 2025
kurtamohler added a commit to kurtamohler/torchrl that referenced this pull request Jan 24, 2025
ghstack-source-id: e0092dfb35c160986610b6acefc6d925f44da7f4
Pull Request resolved: pytorch#2713
kurtamohler added a commit to kurtamohler/torchrl that referenced this pull request Jan 25, 2025
ghstack-source-id: e0092dfb35c160986610b6acefc6d925f44da7f4
Pull Request resolved: pytorch#2713
[ghstack-poisoned]
kurtamohler added a commit that referenced this pull request Jan 25, 2025
ghstack-source-id: 7b0c0f4f3b548c010cf0701251486c74fc4dff6e
Pull Request resolved: #2713
@kurtamohler
Collaborator Author

I've switched over to using a list instead of a stack, and I've enabled expanding and stacking.

[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Feb 3, 2025
ghstack-source-id: 6776395454489a9996fb8c0488df4fbb6dd6d5f8
Pull Request resolved: #2713
[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Feb 3, 2025
ghstack-source-id: afa315a311845ab39ade3e75046f32757f9d94f1
Pull Request resolved: #2713
@vmoens vmoens merged commit 98654dc into gh/kurtamohler/1/base Feb 3, 2025
33 of 45 checks passed
@vmoens vmoens deleted the gh/kurtamohler/1/head branch February 3, 2025 18:21

github-actions bot commented Feb 3, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}40$. Worsened: $\large\color{#d91a1a}3$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.5788s 0.4677s 2.1381 Ops/s 2.1440 Ops/s $\color{#d91a1a}-0.28\%$
test_transformed 1.0534s 0.9528s 1.0495 Ops/s 1.0575 Ops/s $\color{#d91a1a}-0.75\%$
test_serial 1.4957s 1.4007s 0.7139 Ops/s 0.7091 Ops/s $\color{#35bf28}+0.68\%$
test_parallel 1.3698s 1.2654s 0.7903 Ops/s 0.8160 Ops/s $\color{#d91a1a}-3.15\%$
test_step_mdp_speed[True-True-True-True-True] 0.6553ms 29.6251μs 33.7552 KOps/s 33.1363 KOps/s $\color{#35bf28}+1.87\%$
test_step_mdp_speed[True-True-True-True-False] 62.2970μs 17.6794μs 56.5629 KOps/s 55.9648 KOps/s $\color{#35bf28}+1.07\%$
test_step_mdp_speed[True-True-True-False-True] 70.9030μs 17.0180μs 58.7612 KOps/s 59.2775 KOps/s $\color{#d91a1a}-0.87\%$
test_step_mdp_speed[True-True-True-False-False] 51.1160μs 10.1712μs 98.3166 KOps/s 99.9599 KOps/s $\color{#d91a1a}-1.64\%$
test_step_mdp_speed[True-True-False-True-True] 0.1078ms 31.8066μs 31.4400 KOps/s 31.1753 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[True-True-False-True-False] 53.9320μs 19.7001μs 50.7611 KOps/s 50.8843 KOps/s $\color{#d91a1a}-0.24\%$
test_step_mdp_speed[True-True-False-False-True] 83.1360μs 18.7605μs 53.3036 KOps/s 52.8223 KOps/s $\color{#35bf28}+0.91\%$
test_step_mdp_speed[True-True-False-False-False] 69.8710μs 11.8734μs 84.2218 KOps/s 84.1447 KOps/s $\color{#35bf28}+0.09\%$
test_step_mdp_speed[True-False-True-True-True] 76.3630μs 33.6477μs 29.7197 KOps/s 29.6859 KOps/s $\color{#35bf28}+0.11\%$
test_step_mdp_speed[True-False-True-True-False] 64.9410μs 21.6005μs 46.2953 KOps/s 46.1489 KOps/s $\color{#35bf28}+0.32\%$
test_step_mdp_speed[True-False-True-False-True] 0.1007ms 18.8501μs 53.0501 KOps/s 53.0982 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[True-False-True-False-False] 60.2130μs 11.8214μs 84.5924 KOps/s 83.8256 KOps/s $\color{#35bf28}+0.91\%$
test_step_mdp_speed[True-False-False-True-True] 84.3180μs 35.3186μs 28.3137 KOps/s 28.1703 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[True-False-False-True-False] 73.9590μs 23.3420μs 42.8412 KOps/s 42.8448 KOps/s $-0.01\%$
test_step_mdp_speed[True-False-False-False-True] 69.4390μs 20.6768μs 48.3634 KOps/s 48.6344 KOps/s $\color{#d91a1a}-0.56\%$
test_step_mdp_speed[True-False-False-False-False] 59.8220μs 13.6328μs 73.3523 KOps/s 73.5073 KOps/s $\color{#d91a1a}-0.21\%$
test_step_mdp_speed[False-True-True-True-True] 93.3440μs 34.1600μs 29.2740 KOps/s 29.4119 KOps/s $\color{#d91a1a}-0.47\%$
test_step_mdp_speed[False-True-True-True-False] 53.5600μs 21.7280μs 46.0236 KOps/s 46.3119 KOps/s $\color{#d91a1a}-0.62\%$
test_step_mdp_speed[False-True-True-False-True] 0.6216ms 21.6305μs 46.2311 KOps/s 46.4620 KOps/s $\color{#d91a1a}-0.50\%$
test_step_mdp_speed[False-True-True-False-False] 46.8680μs 13.3799μs 74.7392 KOps/s 75.2801 KOps/s $\color{#d91a1a}-0.72\%$
test_step_mdp_speed[False-True-False-True-True] 0.1028ms 35.7493μs 27.9726 KOps/s 27.9497 KOps/s $\color{#35bf28}+0.08\%$
test_step_mdp_speed[False-True-False-True-False] 81.3150μs 23.3431μs 42.8392 KOps/s 42.8784 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[False-True-False-False-True] 2.7424ms 23.0732μs 43.3403 KOps/s 43.1414 KOps/s $\color{#35bf28}+0.46\%$
test_step_mdp_speed[False-True-False-False-False] 74.8500μs 14.9664μs 66.8165 KOps/s 66.9115 KOps/s $\color{#d91a1a}-0.14\%$
test_step_mdp_speed[False-False-True-True-True] 89.0370μs 37.2439μs 26.8500 KOps/s 26.7749 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[False-False-True-True-False] 77.7660μs 25.1457μs 39.7682 KOps/s 39.6755 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[False-False-True-False-True] 62.6070μs 23.1106μs 43.2702 KOps/s 42.9412 KOps/s $\color{#35bf28}+0.77\%$
test_step_mdp_speed[False-False-True-False-False] 65.8230μs 15.0609μs 66.3971 KOps/s 66.1603 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[False-False-False-True-True] 0.1108ms 38.9363μs 25.6830 KOps/s 25.6066 KOps/s $\color{#35bf28}+0.30\%$
test_step_mdp_speed[False-False-False-True-False] 63.9290μs 26.5972μs 37.5980 KOps/s 37.0262 KOps/s $\color{#35bf28}+1.54\%$
test_step_mdp_speed[False-False-False-False-True] 77.9350μs 24.4958μs 40.8233 KOps/s 40.4858 KOps/s $\color{#35bf28}+0.83\%$
test_step_mdp_speed[False-False-False-False-False] 56.8860μs 16.7412μs 59.7329 KOps/s 59.9934 KOps/s $\color{#d91a1a}-0.43\%$
test_values[generalized_advantage_estimate-True-True] 10.1550ms 9.7474ms 102.5910 Ops/s 99.4523 Ops/s $\color{#35bf28}+3.16\%$
test_values[vec_generalized_advantage_estimate-True-True] 29.7258ms 26.8878ms 37.1915 Ops/s 38.2191 Ops/s $\color{#d91a1a}-2.69\%$
test_values[td0_return_estimate-False-False] 0.2845ms 0.1899ms 5.2671 KOps/s 4.9310 KOps/s $\textbf{\color{#35bf28}+6.82\%}$
test_values[td1_return_estimate-False-False] 27.5958ms 24.9473ms 40.0845 Ops/s 39.7896 Ops/s $\color{#35bf28}+0.74\%$
test_values[vec_td1_return_estimate-False-False] 29.5230ms 27.1565ms 36.8236 Ops/s 38.0039 Ops/s $\color{#d91a1a}-3.11\%$
test_values[td_lambda_return_estimate-True-False] 38.6047ms 35.6970ms 28.0135 Ops/s 27.1852 Ops/s $\color{#35bf28}+3.05\%$
test_values[vec_td_lambda_return_estimate-True-False] 28.5890ms 27.0670ms 36.9454 Ops/s 35.8617 Ops/s $\color{#35bf28}+3.02\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.8339ms 8.5020ms 117.6196 Ops/s 115.4424 Ops/s $\color{#35bf28}+1.89\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.9740ms 2.0034ms 499.1591 Ops/s 496.7009 Ops/s $\color{#35bf28}+0.49\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5064ms 0.3679ms 2.7182 KOps/s 2.6768 KOps/s $\color{#35bf28}+1.55\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 40.2711ms 38.4139ms 26.0322 Ops/s 22.4002 Ops/s $\textbf{\color{#35bf28}+16.21\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.9284ms 3.4846ms 286.9775 Ops/s 285.4730 Ops/s $\color{#35bf28}+0.53\%$
test_dqn_speed[False-None] 6.4259ms 1.4308ms 698.9021 Ops/s 688.5614 Ops/s $\color{#35bf28}+1.50\%$
test_dqn_speed[False-backward] 2.4799ms 1.9653ms 508.8178 Ops/s 516.0100 Ops/s $\color{#d91a1a}-1.39\%$
test_dqn_speed[True-None] 0.7481ms 0.4830ms 2.0703 KOps/s 2.0192 KOps/s $\color{#35bf28}+2.53\%$
test_dqn_speed[True-backward] 1.1288ms 0.9364ms 1.0680 KOps/s 1.0826 KOps/s $\color{#d91a1a}-1.35\%$
test_dqn_speed[reduce-overhead-None] 0.8102ms 0.4821ms 2.0742 KOps/s 1.9994 KOps/s $\color{#35bf28}+3.74\%$
test_dqn_speed[reduce-overhead-backward] 0.9430ms 0.8954ms 1.1168 KOps/s 1.0515 KOps/s $\textbf{\color{#35bf28}+6.21\%}$
test_ddpg_speed[False-None] 4.2037ms 2.9496ms 339.0247 Ops/s 339.8312 Ops/s $\color{#d91a1a}-0.24\%$
test_ddpg_speed[False-backward] 5.2178ms 4.1457ms 241.2136 Ops/s 246.0519 Ops/s $\color{#d91a1a}-1.97\%$
test_ddpg_speed[True-None] 2.4701ms 1.2392ms 806.9784 Ops/s 789.7528 Ops/s $\color{#35bf28}+2.18\%$
test_ddpg_speed[True-backward] 2.9083ms 2.1724ms 460.3117 Ops/s 460.5103 Ops/s $\color{#d91a1a}-0.04\%$
test_ddpg_speed[reduce-overhead-None] 2.0187ms 1.2600ms 793.6808 Ops/s 798.7932 Ops/s $\color{#d91a1a}-0.64\%$
test_ddpg_speed[reduce-overhead-backward] 2.1654ms 2.1124ms 473.3989 Ops/s 430.7945 Ops/s $\textbf{\color{#35bf28}+9.89\%}$
test_sac_speed[False-None] 11.1387ms 8.3805ms 119.3253 Ops/s 122.2343 Ops/s $\color{#d91a1a}-2.38\%$
test_sac_speed[False-backward] 13.9144ms 11.4442ms 87.3806 Ops/s 90.0316 Ops/s $\color{#d91a1a}-2.94\%$
test_sac_speed[True-None] 3.8121ms 2.1646ms 461.9748 Ops/s 466.1114 Ops/s $\color{#d91a1a}-0.89\%$
test_sac_speed[True-backward] 5.2540ms 4.0109ms 249.3178 Ops/s 257.2707 Ops/s $\color{#d91a1a}-3.09\%$
test_sac_speed[reduce-overhead-None] 2.7240ms 2.0827ms 480.1358 Ops/s 470.6067 Ops/s $\color{#35bf28}+2.02\%$
test_sac_speed[reduce-overhead-backward] 3.9130ms 3.7590ms 266.0274 Ops/s 255.0079 Ops/s $\color{#35bf28}+4.32\%$
test_redq_speed[False-None] 13.8477ms 13.1272ms 76.1778 Ops/s 70.4339 Ops/s $\textbf{\color{#35bf28}+8.16\%}$
test_redq_speed[False-backward] 31.8380ms 22.9958ms 43.4861 Ops/s 42.0095 Ops/s $\color{#35bf28}+3.52\%$
test_redq_speed[True-None] 6.3749ms 5.3015ms 188.6272 Ops/s 176.0601 Ops/s $\textbf{\color{#35bf28}+7.14\%}$
test_redq_speed[True-backward] 13.4856ms 12.6710ms 78.9205 Ops/s 74.1290 Ops/s $\textbf{\color{#35bf28}+6.46\%}$
test_redq_speed[reduce-overhead-None] 6.6949ms 5.2360ms 190.9850 Ops/s 173.8590 Ops/s $\textbf{\color{#35bf28}+9.85\%}$
test_redq_speed[reduce-overhead-backward] 14.0807ms 12.9396ms 77.2820 Ops/s 74.5902 Ops/s $\color{#35bf28}+3.61\%$
test_redq_deprec_speed[False-None] 14.4928ms 13.3702ms 74.7932 Ops/s 73.0438 Ops/s $\color{#35bf28}+2.40\%$
test_redq_deprec_speed[False-backward] 25.0557ms 20.0901ms 49.7757 Ops/s 50.0238 Ops/s $\color{#d91a1a}-0.50\%$
test_redq_deprec_speed[True-None] 4.3429ms 3.8635ms 258.8358 Ops/s 226.9180 Ops/s $\textbf{\color{#35bf28}+14.07\%}$
test_redq_deprec_speed[True-backward] 9.5420ms 8.8090ms 113.5207 Ops/s 107.5513 Ops/s $\textbf{\color{#35bf28}+5.55\%}$
test_redq_deprec_speed[reduce-overhead-None] 5.5935ms 4.2757ms 233.8812 Ops/s 222.2866 Ops/s $\textbf{\color{#35bf28}+5.22\%}$
test_redq_deprec_speed[reduce-overhead-backward] 10.2807ms 8.7250ms 114.6137 Ops/s 105.6645 Ops/s $\textbf{\color{#35bf28}+8.47\%}$
test_td3_speed[False-None] 8.8249ms 8.1417ms 122.8252 Ops/s 115.2949 Ops/s $\textbf{\color{#35bf28}+6.53\%}$
test_td3_speed[False-backward] 12.5273ms 10.6594ms 93.8141 Ops/s 88.0043 Ops/s $\textbf{\color{#35bf28}+6.60\%}$
test_td3_speed[True-None] 1.9061ms 1.7793ms 562.0268 Ops/s 514.7225 Ops/s $\textbf{\color{#35bf28}+9.19\%}$
test_td3_speed[True-backward] 3.8831ms 3.3929ms 294.7337 Ops/s 277.9080 Ops/s $\textbf{\color{#35bf28}+6.05\%}$
test_td3_speed[reduce-overhead-None] 1.8852ms 1.7669ms 565.9691 Ops/s 510.1035 Ops/s $\textbf{\color{#35bf28}+10.95\%}$
test_td3_speed[reduce-overhead-backward] 3.4750ms 3.3531ms 298.2334 Ops/s 270.0951 Ops/s $\textbf{\color{#35bf28}+10.42\%}$
test_cql_speed[False-None] 38.8759ms 36.6041ms 27.3193 Ops/s 25.9421 Ops/s $\textbf{\color{#35bf28}+5.31\%}$
test_cql_speed[False-backward] 48.7043ms 46.8957ms 21.3239 Ops/s 20.4794 Ops/s $\color{#35bf28}+4.12\%$
test_cql_speed[True-None] 17.7653ms 16.2647ms 61.4828 Ops/s 58.7766 Ops/s $\color{#35bf28}+4.60\%$
test_cql_speed[True-backward] 30.2116ms 23.4017ms 42.7319 Ops/s 41.7455 Ops/s $\color{#35bf28}+2.36\%$
test_cql_speed[reduce-overhead-None] 16.8025ms 16.2669ms 61.4747 Ops/s 59.4496 Ops/s $\color{#35bf28}+3.41\%$
test_cql_speed[reduce-overhead-backward] 24.4430ms 23.0514ms 43.3813 Ops/s 40.7612 Ops/s $\textbf{\color{#35bf28}+6.43\%}$
test_a2c_speed[False-None] 8.2967ms 7.2893ms 137.1883 Ops/s 127.6993 Ops/s $\textbf{\color{#35bf28}+7.43\%}$
test_a2c_speed[False-backward] 15.4072ms 14.5807ms 68.5836 Ops/s 64.2354 Ops/s $\textbf{\color{#35bf28}+6.77\%}$
test_a2c_speed[True-None] 3.9840ms 3.6927ms 270.8012 Ops/s 248.3873 Ops/s $\textbf{\color{#35bf28}+9.02\%}$
test_a2c_speed[True-backward] 11.4001ms 10.7847ms 92.7239 Ops/s 90.7620 Ops/s $\color{#35bf28}+2.16\%$
test_a2c_speed[reduce-overhead-None] 4.6485ms 3.7045ms 269.9383 Ops/s 253.9782 Ops/s $\textbf{\color{#35bf28}+6.28\%}$
test_a2c_speed[reduce-overhead-backward] 10.9328ms 10.6618ms 93.7928 Ops/s 90.1404 Ops/s $\color{#35bf28}+4.05\%$
test_ppo_speed[False-None] 11.1815ms 7.9354ms 126.0178 Ops/s 122.2671 Ops/s $\color{#35bf28}+3.07\%$
test_ppo_speed[False-backward] 16.8967ms 15.5363ms 64.3654 Ops/s 62.2259 Ops/s $\color{#35bf28}+3.44\%$
test_ppo_speed[True-None] 4.7325ms 4.0735ms 245.4904 Ops/s 233.6592 Ops/s $\textbf{\color{#35bf28}+5.06\%}$
test_ppo_speed[True-backward] 11.2141ms 10.5044ms 95.1981 Ops/s 91.4118 Ops/s $\color{#35bf28}+4.14\%$
test_ppo_speed[reduce-overhead-None] 4.5009ms 4.1106ms 243.2727 Ops/s 235.8577 Ops/s $\color{#35bf28}+3.14\%$
test_ppo_speed[reduce-overhead-backward] 10.9954ms 10.6313ms 94.0617 Ops/s 90.6558 Ops/s $\color{#35bf28}+3.76\%$
test_reinforce_speed[False-None] 8.0385ms 6.7674ms 147.7671 Ops/s 146.0864 Ops/s $\color{#35bf28}+1.15\%$
test_reinforce_speed[False-backward] 10.7828ms 10.2257ms 97.7925 Ops/s 95.9783 Ops/s $\color{#35bf28}+1.89\%$
test_reinforce_speed[True-None] 3.4901ms 3.1617ms 316.2872 Ops/s 301.7596 Ops/s $\color{#35bf28}+4.81\%$
test_reinforce_speed[True-backward] 10.0184ms 9.2899ms 107.6441 Ops/s 101.4495 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_reinforce_speed[reduce-overhead-None] 3.7353ms 3.0883ms 323.8065 Ops/s 306.2325 Ops/s $\textbf{\color{#35bf28}+5.74\%}$
test_reinforce_speed[reduce-overhead-backward] 9.8037ms 9.2540ms 108.0609 Ops/s 102.1532 Ops/s $\textbf{\color{#35bf28}+5.78\%}$
test_iql_speed[False-None] 0.3074s 41.6167ms 24.0288 Ops/s 30.0340 Ops/s $\textbf{\color{#d91a1a}-19.99\%}$
test_iql_speed[False-backward] 47.5053ms 46.1045ms 21.6899 Ops/s 21.3762 Ops/s $\color{#35bf28}+1.47\%$
test_iql_speed[True-None] 12.6722ms 11.4767ms 87.1330 Ops/s 82.7481 Ops/s $\textbf{\color{#35bf28}+5.30\%}$
test_iql_speed[True-backward] 23.7867ms 22.6901ms 44.0721 Ops/s 41.8955 Ops/s $\textbf{\color{#35bf28}+5.20\%}$
test_iql_speed[reduce-overhead-None] 12.1326ms 11.6121ms 86.1171 Ops/s 83.4031 Ops/s $\color{#35bf28}+3.25\%$
test_iql_speed[reduce-overhead-backward] 23.6112ms 22.6801ms 44.0916 Ops/s 42.4969 Ops/s $\color{#35bf28}+3.75\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3810ms 5.0006ms 199.9771 Ops/s 186.7314 Ops/s $\textbf{\color{#35bf28}+7.09\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8510ms 0.5271ms 1.8972 KOps/s 1.8660 KOps/s $\color{#35bf28}+1.67\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8638ms 0.4993ms 2.0029 KOps/s 1.9534 KOps/s $\color{#35bf28}+2.54\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.2548ms 4.8098ms 207.9089 Ops/s 200.2159 Ops/s $\color{#35bf28}+3.84\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1349ms 0.5090ms 1.9648 KOps/s 1.9214 KOps/s $\color{#35bf28}+2.26\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8961ms 0.4942ms 2.0233 KOps/s 1.9829 KOps/s $\color{#35bf28}+2.03\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.7370ms 1.6713ms 598.3194 Ops/s 587.1956 Ops/s $\color{#35bf28}+1.89\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.3033ms 1.5735ms 635.5417 Ops/s 622.9106 Ops/s $\color{#35bf28}+2.03\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 8.1199ms 4.9522ms 201.9305 Ops/s 191.8969 Ops/s $\textbf{\color{#35bf28}+5.23\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1733ms 0.6589ms 1.5176 KOps/s 1.4900 KOps/s $\color{#35bf28}+1.85\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8884ms 0.6302ms 1.5868 KOps/s 1.5371 KOps/s $\color{#35bf28}+3.23\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.4202ms 4.8714ms 205.2796 Ops/s 195.9854 Ops/s $\color{#35bf28}+4.74\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8945ms 0.5256ms 1.9027 KOps/s 1.8458 KOps/s $\color{#35bf28}+3.08\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7810ms 0.5065ms 1.9743 KOps/s 1.9607 KOps/s $\color{#35bf28}+0.69\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.6383ms 4.8684ms 205.4081 Ops/s 201.0601 Ops/s $\color{#35bf28}+2.16\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8670ms 0.5140ms 1.9457 KOps/s 1.9318 KOps/s $\color{#35bf28}+0.72\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7694ms 0.4945ms 2.0222 KOps/s 1.9598 KOps/s $\color{#35bf28}+3.19\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.1793ms 4.8931ms 204.3689 Ops/s 195.2723 Ops/s $\color{#35bf28}+4.66\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0487ms 0.6603ms 1.5146 KOps/s 1.4902 KOps/s $\color{#35bf28}+1.63\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0985ms 0.6365ms 1.5711 KOps/s 1.5571 KOps/s $\color{#35bf28}+0.90\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.5597ms 4.2125ms 237.3865 Ops/s 232.1847 Ops/s $\color{#35bf28}+2.24\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.7707ms 2.4670ms 405.3536 Ops/s 401.7495 Ops/s $\color{#35bf28}+0.90\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 3.9774ms 1.3006ms 768.8859 Ops/s 728.2068 Ops/s $\textbf{\color{#35bf28}+5.59\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.7648ms 4.2440ms 235.6245 Ops/s 31.5338 Ops/s $\textbf{\color{#35bf28}+647.21\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 5.0029ms 2.3011ms 434.5722 Ops/s 365.8723 Ops/s $\textbf{\color{#35bf28}+18.78\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.2605ms 1.3493ms 741.1226 Ops/s 780.6881 Ops/s $\textbf{\color{#d91a1a}-5.07\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4508s 13.3944ms 74.6583 Ops/s 215.3515 Ops/s $\textbf{\color{#d91a1a}-65.33\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.8679ms 2.4579ms 406.8547 Ops/s 389.8400 Ops/s $\color{#35bf28}+4.36\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.8796ms 1.3383ms 747.1943 Ops/s 651.4306 Ops/s $\textbf{\color{#35bf28}+14.70\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 15.1525ms 11.5049ms 86.9195 Ops/s 77.8493 Ops/s $\textbf{\color{#35bf28}+11.65\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.2373ms 14.2131ms 70.3576 Ops/s 67.7563 Ops/s $\color{#35bf28}+3.84\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.6251ms 20.2796ms 49.3106 Ops/s 46.1524 Ops/s $\textbf{\color{#35bf28}+6.84\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.2048ms 14.4589ms 69.1617 Ops/s 66.6854 Ops/s $\color{#35bf28}+3.71\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.8930ms 20.3660ms 49.1015 Ops/s 46.3903 Ops/s $\textbf{\color{#35bf28}+5.84\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 18.3653ms 15.6540ms 63.8815 Ops/s 60.0083 Ops/s $\textbf{\color{#35bf28}+6.45\%}$
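
For readers checking the table by hand: the figures above appear consistent with Ops being the reciprocal of the Mean time, and Change being the relative difference between the Ops column and the Ops on Repo HEAD column. The following is a minimal sketch under that assumption; the helper names `ops_per_second` and `percent_change` are hypothetical and are not taken from the benchmark tooling.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean run time (assumed: Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of this PR's throughput against repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


# Values copied from the CPU table rows above.
print(f"{percent_change(2.1381, 2.1440):+.2f}%")      # test_simple
print(f"{percent_change(0.7903, 0.8160):+.2f}%")      # test_parallel
print(f"{percent_change(235.6245, 31.5338):+.2f}%")   # test_rb_populate[...ListStorage-SamplerWithoutReplacement-400]
print(f"{ops_per_second(0.4677):.4f} Ops/s")          # test_simple, Mean 0.4677s
```

Because the Mean column is rounded, the recomputed Ops value can differ from the table in the last digit. Large swings such as the `test_rb_populate` rows are computed the same way as the small ones, so they are worth reading alongside the Max column, which shows whether a single slow run skewed the mean.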


github-actions bot commented Feb 3, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}27$. Worsened: $\large\color{#d91a1a}6$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.8354s 0.7533s 1.3275 Ops/s 1.3670 Ops/s $\color{#d91a1a}-2.89\%$
test_transformed 1.4236s 1.3376s 0.7476 Ops/s 0.7312 Ops/s $\color{#35bf28}+2.24\%$
test_serial 2.1703s 2.1677s 0.4613 Ops/s 0.4523 Ops/s $\color{#35bf28}+2.00\%$
test_parallel 1.8210s 1.8100s 0.5525 Ops/s 0.5385 Ops/s $\color{#35bf28}+2.59\%$
test_step_mdp_speed[True-True-True-True-True] 0.1657ms 41.0266μs 24.3744 KOps/s 23.9349 KOps/s $\color{#35bf28}+1.84\%$
test_step_mdp_speed[True-True-True-True-False] 0.1299ms 23.4247μs 42.6900 KOps/s 42.0200 KOps/s $\color{#35bf28}+1.59\%$
test_step_mdp_speed[True-True-True-False-True] 58.7010μs 22.0987μs 45.2516 KOps/s 43.1245 KOps/s $\color{#35bf28}+4.93\%$
test_step_mdp_speed[True-True-True-False-False] 51.3010μs 12.8683μs 77.7102 KOps/s 75.2876 KOps/s $\color{#35bf28}+3.22\%$
test_step_mdp_speed[True-True-False-True-True] 0.1049ms 43.4543μs 23.0127 KOps/s 22.9206 KOps/s $\color{#35bf28}+0.40\%$
test_step_mdp_speed[True-True-False-True-False] 58.2010μs 26.0372μs 38.4066 KOps/s 38.0460 KOps/s $\color{#35bf28}+0.95\%$
test_step_mdp_speed[True-True-False-False-True] 0.1698ms 24.7401μs 40.4203 KOps/s 38.9921 KOps/s $\color{#35bf28}+3.66\%$
test_step_mdp_speed[True-True-False-False-False] 0.1546ms 15.4191μs 64.8545 KOps/s 63.7777 KOps/s $\color{#35bf28}+1.69\%$
test_step_mdp_speed[True-False-True-True-True] 83.9710μs 45.5655μs 21.9464 KOps/s 21.6739 KOps/s $\color{#35bf28}+1.26\%$
test_step_mdp_speed[True-False-True-True-False] 56.5110μs 28.0525μs 35.6475 KOps/s 35.5087 KOps/s $\color{#35bf28}+0.39\%$
test_step_mdp_speed[True-False-True-False-True] 58.2010μs 24.7908μs 40.3375 KOps/s 39.5278 KOps/s $\color{#35bf28}+2.05\%$
test_step_mdp_speed[True-False-True-False-False] 0.1884ms 15.4811μs 64.5950 KOps/s 63.8862 KOps/s $\color{#35bf28}+1.11\%$
test_step_mdp_speed[True-False-False-True-True] 0.2214ms 47.0376μs 21.2596 KOps/s 20.5510 KOps/s $\color{#35bf28}+3.45\%$
test_step_mdp_speed[True-False-False-True-False] 62.3610μs 30.4468μs 32.8442 KOps/s 32.7686 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[True-False-False-False-True] 0.2196ms 26.4453μs 37.8139 KOps/s 35.6166 KOps/s $\textbf{\color{#35bf28}+6.17\%}$
test_step_mdp_speed[True-False-False-False-False] 44.2410μs 17.7024μs 56.4894 KOps/s 54.8167 KOps/s $\color{#35bf28}+3.05\%$
test_step_mdp_speed[False-True-True-True-True] 80.0610μs 45.1607μs 22.1431 KOps/s 21.4409 KOps/s $\color{#35bf28}+3.28\%$
test_step_mdp_speed[False-True-True-True-False] 63.5710μs 27.7843μs 35.9916 KOps/s 35.1050 KOps/s $\color{#35bf28}+2.53\%$
test_step_mdp_speed[False-True-True-False-True] 2.6098ms 28.9831μs 34.5029 KOps/s 33.9443 KOps/s $\color{#35bf28}+1.65\%$
test_step_mdp_speed[False-True-True-False-False] 43.5210μs 17.1879μs 58.1803 KOps/s 57.2317 KOps/s $\color{#35bf28}+1.66\%$
test_step_mdp_speed[False-True-False-True-True] 80.4910μs 47.7833μs 20.9278 KOps/s 20.3652 KOps/s $\color{#35bf28}+2.76\%$
test_step_mdp_speed[False-True-False-True-False] 92.5110μs 31.2213μs 32.0294 KOps/s 32.9238 KOps/s $\color{#d91a1a}-2.72\%$
test_step_mdp_speed[False-True-False-False-True] 89.2510μs 30.4175μs 32.8758 KOps/s 30.7569 KOps/s $\textbf{\color{#35bf28}+6.89\%}$
test_step_mdp_speed[False-True-False-False-False] 48.3310μs 19.3489μs 51.6825 KOps/s 49.7063 KOps/s $\color{#35bf28}+3.98\%$
test_step_mdp_speed[False-False-True-True-True] 73.0610μs 49.5186μs 20.1944 KOps/s 19.9613 KOps/s $\color{#35bf28}+1.17\%$
test_step_mdp_speed[False-False-True-True-False] 69.8910μs 32.4111μs 30.8536 KOps/s 30.6044 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[False-False-True-False-True] 92.4320μs 30.5057μs 32.7808 KOps/s 31.7139 KOps/s $\color{#35bf28}+3.36\%$
test_step_mdp_speed[False-False-True-False-False] 48.8700μs 19.0633μs 52.4568 KOps/s 52.7615 KOps/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[False-False-False-True-True] 87.1210μs 51.2371μs 19.5171 KOps/s 19.2110 KOps/s $\color{#35bf28}+1.59\%$
test_step_mdp_speed[False-False-False-True-False] 0.1281ms 34.6086μs 28.8946 KOps/s 28.0079 KOps/s $\color{#35bf28}+3.17\%$
test_step_mdp_speed[False-False-False-False-True] 0.1012ms 31.9435μs 31.3053 KOps/s 29.9377 KOps/s $\color{#35bf28}+4.57\%$
test_step_mdp_speed[False-False-False-False-False] 45.2000μs 21.4128μs 46.7010 KOps/s 46.1821 KOps/s $\color{#35bf28}+1.12\%$
test_values[generalized_advantage_estimate-True-True] 25.2997ms 24.7861ms 40.3451 Ops/s 39.6461 Ops/s $\color{#35bf28}+1.76\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1049s 2.9964ms 333.7393 Ops/s 342.2462 Ops/s $\color{#d91a1a}-2.49\%$
test_values[td0_return_estimate-False-False] 0.1102ms 79.8345μs 12.5259 KOps/s 12.0971 KOps/s $\color{#35bf28}+3.55\%$
test_values[td1_return_estimate-False-False] 55.7703ms 55.4124ms 18.0465 Ops/s 17.5316 Ops/s $\color{#35bf28}+2.94\%$
test_values[vec_td1_return_estimate-False-False] 1.3445ms 1.0870ms 919.9264 Ops/s 923.0268 Ops/s $\color{#d91a1a}-0.34\%$
test_values[td_lambda_return_estimate-True-False] 87.9393ms 87.3941ms 11.4424 Ops/s 10.9402 Ops/s $\color{#35bf28}+4.59\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3560ms 1.0846ms 921.9565 Ops/s 925.2920 Ops/s $\color{#d91a1a}-0.36\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.5822ms 24.9105ms 40.1437 Ops/s 39.5198 Ops/s $\color{#35bf28}+1.58\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.1384ms 0.7626ms 1.3113 KOps/s 1.3269 KOps/s $\color{#d91a1a}-1.18\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8719ms 0.6753ms 1.4808 KOps/s 1.4885 KOps/s $\color{#d91a1a}-0.51\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.7290ms 1.4892ms 671.5173 Ops/s 671.0768 Ops/s $\color{#35bf28}+0.07\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.9042ms 0.7013ms 1.4259 KOps/s 1.4624 KOps/s $\color{#d91a1a}-2.50\%$
test_dqn_speed[False-None] 6.8428ms 1.5296ms 653.7867 Ops/s 639.3329 Ops/s $\color{#35bf28}+2.26\%$
test_dqn_speed[False-backward] 2.4912ms 2.1408ms 467.1061 Ops/s 460.6077 Ops/s $\color{#35bf28}+1.41\%$
test_dqn_speed[True-None] 0.7265ms 0.5586ms 1.7903 KOps/s 1.7183 KOps/s $\color{#35bf28}+4.19\%$
test_dqn_speed[True-backward] 1.3131ms 1.1388ms 878.0933 Ops/s 801.1503 Ops/s $\textbf{\color{#35bf28}+9.60\%}$
test_dqn_speed[reduce-overhead-None] 0.9620ms 0.5763ms 1.7351 KOps/s 1.7199 KOps/s $\color{#35bf28}+0.89\%$
test_dqn_speed[reduce-overhead-backward] 0.9998ms 0.9631ms 1.0383 KOps/s 914.4634 Ops/s $\textbf{\color{#35bf28}+13.54\%}$
test_ddpg_speed[False-None] 3.2746ms 2.8958ms 345.3223 Ops/s 339.4729 Ops/s $\color{#35bf28}+1.72\%$
test_ddpg_speed[False-backward] 4.6329ms 4.1631ms 240.2053 Ops/s 230.7126 Ops/s $\color{#35bf28}+4.11\%$
test_ddpg_speed[True-None] 1.7496ms 1.3492ms 741.1704 Ops/s 736.9506 Ops/s $\color{#35bf28}+0.57\%$
test_ddpg_speed[True-backward] 2.6145ms 2.4390ms 409.9993 Ops/s 380.2232 Ops/s $\textbf{\color{#35bf28}+7.83\%}$
test_ddpg_speed[reduce-overhead-None] 1.7848ms 1.3635ms 733.4255 Ops/s 725.8819 Ops/s $\color{#35bf28}+1.04\%$
test_ddpg_speed[reduce-overhead-backward] 2.0743ms 1.8895ms 529.2493 Ops/s 483.4932 Ops/s $\textbf{\color{#35bf28}+9.46\%}$
test_sac_speed[False-None] 8.4549ms 8.0655ms 123.9848 Ops/s 119.3480 Ops/s $\color{#35bf28}+3.89\%$
test_sac_speed[False-backward] 11.5143ms 10.9839ms 91.0421 Ops/s 87.4740 Ops/s $\color{#35bf28}+4.08\%$
test_sac_speed[True-None] 2.2352ms 1.8508ms 540.3032 Ops/s 526.1189 Ops/s $\color{#35bf28}+2.70\%$
test_sac_speed[True-backward] 4.0701ms 3.5720ms 279.9561 Ops/s 263.2318 Ops/s $\textbf{\color{#35bf28}+6.35\%}$
test_sac_speed[reduce-overhead-None] 22.4408ms 12.0993ms 82.6497 Ops/s 80.2723 Ops/s $\color{#35bf28}+2.96\%$
test_sac_speed[reduce-overhead-backward] 1.7496ms 1.6118ms 620.4090 Ops/s 545.7832 Ops/s $\textbf{\color{#35bf28}+13.67\%}$
test_redq_speed[False-None] 8.1966ms 7.5269ms 132.8563 Ops/s 129.5800 Ops/s $\color{#35bf28}+2.53\%$
test_redq_speed[False-backward] 12.0987ms 11.3421ms 88.1675 Ops/s 84.1866 Ops/s $\color{#35bf28}+4.73\%$
test_redq_speed[True-None] 2.6054ms 2.3016ms 434.4884 Ops/s 426.0295 Ops/s $\color{#35bf28}+1.99\%$
test_redq_speed[True-backward] 4.1528ms 3.9861ms 250.8701 Ops/s 245.6392 Ops/s $\color{#35bf28}+2.13\%$
test_redq_speed[reduce-overhead-None] 3.4010ms 2.3197ms 431.0874 Ops/s 419.2837 Ops/s $\color{#35bf28}+2.82\%$
test_redq_speed[reduce-overhead-backward] 4.8999ms 4.1845ms 238.9759 Ops/s 232.8699 Ops/s $\color{#35bf28}+2.62\%$
test_redq_deprec_speed[False-None] 11.4684ms 9.1328ms 109.4960 Ops/s 106.8611 Ops/s $\color{#35bf28}+2.47\%$
test_redq_deprec_speed[False-backward] 12.4570ms 11.9837ms 83.4464 Ops/s 79.5085 Ops/s $\color{#35bf28}+4.95\%$
test_redq_deprec_speed[True-None] 3.0919ms 2.6527ms 376.9713 Ops/s 360.0893 Ops/s $\color{#35bf28}+4.69\%$
test_redq_deprec_speed[True-backward] 4.9916ms 4.5045ms 222.0026 Ops/s 216.7521 Ops/s $\color{#35bf28}+2.42\%$
test_redq_deprec_speed[reduce-overhead-None] 2.9530ms 2.6392ms 378.9083 Ops/s 371.5935 Ops/s $\color{#35bf28}+1.97\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.6576ms 4.4721ms 223.6099 Ops/s 219.0384 Ops/s $\color{#35bf28}+2.09\%$
test_td3_speed[False-None] 8.0902ms 7.9884ms 125.1820 Ops/s 121.9109 Ops/s $\color{#35bf28}+2.68\%$
test_td3_speed[False-backward] 11.3448ms 10.5523ms 94.7657 Ops/s 93.0105 Ops/s $\color{#35bf28}+1.89\%$
test_td3_speed[True-None] 1.7377ms 1.6540ms 604.6105 Ops/s 556.4624 Ops/s $\textbf{\color{#35bf28}+8.65\%}$
test_td3_speed[True-backward] 3.6313ms 3.3578ms 297.8107 Ops/s 307.5765 Ops/s $\color{#d91a1a}-3.18\%$
test_td3_speed[reduce-overhead-None] 55.0701ms 26.6284ms 37.5539 Ops/s 35.3894 Ops/s $\textbf{\color{#35bf28}+6.12\%}$
test_td3_speed[reduce-overhead-backward] 1.6612ms 1.4998ms 666.7659 Ops/s 713.4820 Ops/s $\textbf{\color{#d91a1a}-6.55\%}$
test_cql_speed[False-None] 17.1992ms 16.7936ms 59.5464 Ops/s 58.3851 Ops/s $\color{#35bf28}+1.99\%$
test_cql_speed[False-backward] 22.9511ms 22.3313ms 44.7802 Ops/s 44.9537 Ops/s $\color{#d91a1a}-0.39\%$
test_cql_speed[True-None] 3.5465ms 3.2761ms 305.2437 Ops/s 300.4343 Ops/s $\color{#35bf28}+1.60\%$
test_cql_speed[True-backward] 5.9351ms 5.5177ms 181.2335 Ops/s 178.2107 Ops/s $\color{#35bf28}+1.70\%$
test_cql_speed[reduce-overhead-None] 21.4647ms 13.2702ms 75.3567 Ops/s 56.3680 Ops/s $\textbf{\color{#35bf28}+33.69\%}$
test_cql_speed[reduce-overhead-backward] 1.9614ms 1.8234ms 548.4112 Ops/s 528.7542 Ops/s $\color{#35bf28}+3.72\%$
test_a2c_speed[False-None] 3.5006ms 3.1881ms 313.6654 Ops/s 305.7737 Ops/s $\color{#35bf28}+2.58\%$
test_a2c_speed[False-backward] 6.9279ms 6.0700ms 164.7456 Ops/s 160.0357 Ops/s $\color{#35bf28}+2.94\%$
test_a2c_speed[True-None] 1.6350ms 1.3574ms 736.6876 Ops/s 730.8886 Ops/s $\color{#35bf28}+0.79\%$
test_a2c_speed[True-backward] 3.0440ms 2.8868ms 346.4089 Ops/s 334.3572 Ops/s $\color{#35bf28}+3.60\%$
test_a2c_speed[reduce-overhead-None] 16.4766ms 9.2727ms 107.8439 Ops/s 108.2116 Ops/s $\color{#d91a1a}-0.34\%$
test_a2c_speed[reduce-overhead-backward] 1.7343ms 1.4704ms 680.0977 Ops/s 670.6581 Ops/s $\color{#35bf28}+1.41\%$
test_ppo_speed[False-None] 4.1047ms 3.6935ms 270.7493 Ops/s 263.2834 Ops/s $\color{#35bf28}+2.84\%$
test_ppo_speed[False-backward] 7.2373ms 6.8197ms 146.6338 Ops/s 145.0831 Ops/s $\color{#35bf28}+1.07\%$
test_ppo_speed[True-None] 1.9464ms 1.4213ms 703.5698 Ops/s 694.5792 Ops/s $\color{#35bf28}+1.29\%$
test_ppo_speed[True-backward] 3.3899ms 3.0581ms 326.9980 Ops/s 303.1367 Ops/s $\textbf{\color{#35bf28}+7.87\%}$
test_ppo_speed[reduce-overhead-None] 1.1725ms 0.9791ms 1.0214 KOps/s 1.0145 KOps/s $\color{#35bf28}+0.68\%$
test_ppo_speed[reduce-overhead-backward] 1.5006ms 1.4004ms 714.0898 Ops/s 615.3107 Ops/s $\textbf{\color{#35bf28}+16.05\%}$
test_reinforce_speed[False-None] 2.6774ms 2.2895ms 436.7755 Ops/s 422.3416 Ops/s $\color{#35bf28}+3.42\%$
test_reinforce_speed[False-backward] 3.7385ms 3.3125ms 301.8854 Ops/s 285.4956 Ops/s $\textbf{\color{#35bf28}+5.74\%}$
test_reinforce_speed[True-None] 1.5382ms 1.3088ms 764.0845 Ops/s 749.5477 Ops/s $\color{#35bf28}+1.94\%$
test_reinforce_speed[True-backward] 3.1038ms 2.9548ms 338.4336 Ops/s 323.7790 Ops/s $\color{#35bf28}+4.53\%$
test_reinforce_speed[reduce-overhead-None] 18.3222ms 10.1513ms 98.5094 Ops/s 102.0627 Ops/s $\color{#d91a1a}-3.48\%$
test_reinforce_speed[reduce-overhead-backward] 1.5976ms 1.4764ms 677.3236 Ops/s 594.6651 Ops/s $\textbf{\color{#35bf28}+13.90\%}$
test_iql_speed[False-None] 9.6684ms 9.2062ms 108.6228 Ops/s 104.6048 Ops/s $\color{#35bf28}+3.84\%$
test_iql_speed[False-backward] 13.3214ms 12.7883ms 78.1965 Ops/s 73.8296 Ops/s $\textbf{\color{#35bf28}+5.91\%}$
test_iql_speed[True-None] 2.5587ms 2.2621ms 442.0764 Ops/s 431.7090 Ops/s $\color{#35bf28}+2.40\%$
test_iql_speed[True-backward] 5.1870ms 4.9192ms 203.2836 Ops/s 198.2051 Ops/s $\color{#35bf28}+2.56\%$
test_iql_speed[reduce-overhead-None] 19.1796ms 11.2612ms 88.8008 Ops/s 87.2557 Ops/s $\color{#35bf28}+1.77\%$
test_iql_speed[reduce-overhead-backward] 2.1779ms 2.0500ms 487.8133 Ops/s 459.1247 Ops/s $\textbf{\color{#35bf28}+6.25\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 8.0516ms 6.5083ms 153.6491 Ops/s 152.5835 Ops/s $\color{#35bf28}+0.70\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5376ms 0.2809ms 3.5601 KOps/s 3.0512 KOps/s $\textbf{\color{#35bf28}+16.68\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5254ms 0.2513ms 3.9796 KOps/s 3.3557 KOps/s $\textbf{\color{#35bf28}+18.59\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6422ms 6.2014ms 161.2545 Ops/s 161.6871 Ops/s $\color{#d91a1a}-0.27\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.9955ms 0.2602ms 3.8426 KOps/s 3.8250 KOps/s $\color{#35bf28}+0.46\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6571ms 0.2780ms 3.5977 KOps/s 4.1424 KOps/s $\textbf{\color{#d91a1a}-13.15\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6968ms 1.4040ms 712.2485 Ops/s 784.5965 Ops/s $\textbf{\color{#d91a1a}-9.22\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.7373ms 1.3080ms 764.5526 Ops/s 744.7499 Ops/s $\color{#35bf28}+2.66\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6096ms 6.3386ms 157.7633 Ops/s 155.6492 Ops/s $\color{#35bf28}+1.36\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0732ms 0.4115ms 2.4298 KOps/s 2.2894 KOps/s $\textbf{\color{#35bf28}+6.13\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7685ms 0.3868ms 2.5854 KOps/s 2.5444 KOps/s $\color{#35bf28}+1.61\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.6082ms 6.2788ms 159.2667 Ops/s 160.2057 Ops/s $\color{#d91a1a}-0.59\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9137ms 0.3378ms 2.9607 KOps/s 2.8803 KOps/s $\color{#35bf28}+2.79\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6535ms 0.3686ms 2.7126 KOps/s 3.3181 KOps/s $\textbf{\color{#d91a1a}-18.25\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 9.4776ms 6.1817ms 161.7680 Ops/s 160.5622 Ops/s $\color{#35bf28}+0.75\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.6830ms 0.3173ms 3.1513 KOps/s 3.2590 KOps/s $\color{#d91a1a}-3.30\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5069ms 0.3113ms 3.2122 KOps/s 3.5242 KOps/s $\textbf{\color{#d91a1a}-8.85\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.7698ms 6.4252ms 155.6373 Ops/s 154.6451 Ops/s $\color{#35bf28}+0.64\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7447ms 0.4069ms 2.4575 KOps/s 2.0743 KOps/s $\textbf{\color{#35bf28}+18.48\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6251ms 0.3856ms 2.5931 KOps/s 2.4281 KOps/s $\textbf{\color{#35bf28}+6.80\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.1813ms 5.5283ms 180.8867 Ops/s 176.5052 Ops/s $\color{#35bf28}+2.48\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.8823ms 1.9168ms 521.7038 Ops/s 428.3314 Ops/s $\textbf{\color{#35bf28}+21.80\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.6767ms 1.2208ms 819.1367 Ops/s 846.6376 Ops/s $\color{#d91a1a}-3.25\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.8108ms 5.6786ms 176.0995 Ops/s 180.6224 Ops/s $\color{#d91a1a}-2.50\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.0170ms 2.0293ms 492.7908 Ops/s 456.0613 Ops/s $\textbf{\color{#35bf28}+8.05\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.4820ms 1.2515ms 799.0310 Ops/s 743.3681 Ops/s $\textbf{\color{#35bf28}+7.49\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5078s 15.8602ms 63.0509 Ops/s 30.4844 Ops/s $\textbf{\color{#35bf28}+106.83\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.0293ms 2.2469ms 445.0559 Ops/s 463.6759 Ops/s $\color{#d91a1a}-4.02\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.0824ms 1.3783ms 725.5085 Ops/s 844.0839 Ops/s $\textbf{\color{#d91a1a}-14.05\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.6261ms 12.9855ms 77.0089 Ops/s 71.6801 Ops/s $\textbf{\color{#35bf28}+7.43\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.6525ms 16.8016ms 59.5180 Ops/s 58.5444 Ops/s $\color{#35bf28}+1.66\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.2631ms 17.7191ms 56.4363 Ops/s 54.1905 Ops/s $\color{#35bf28}+4.14\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.6285ms 16.9933ms 58.8469 Ops/s 57.5216 Ops/s $\color{#35bf28}+2.30\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.0792ms 17.6256ms 56.7358 Ops/s 54.4123 Ops/s $\color{#35bf28}+4.27\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.4155ms 17.9816ms 55.6125 Ops/s 53.4908 Ops/s $\color{#35bf28}+3.97\%$

@kurtamohler kurtamohler mentioned this pull request Feb 5, 2025
1 task
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. enhancement New feature or request
3 participants