[Feature] MultiAction transform #2779

Merged (15 commits) on Feb 13, 2025

[ghstack-poisoned]

pytorch-bot bot commented Feb 10, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2779

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 2 Pending, 1 Unrelated Failure

As of commit 2db0291 with merge base f1c42e0:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: 98854f662662b8f6e919ad8888610003d2f727f4
Pull Request resolved: #2779
facebook-github-bot added the "CLA Signed" label on Feb 10, 2025
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: 75c0d63669dceda1e5324565d636d7ae4f98ac67
Pull Request resolved: #2779

github-actions bot commented Feb 10, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}36$. Worsened: $\large\color{#d91a1a}9$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.6383s 0.5316s 1.8812 Ops/s 1.8289 Ops/s $\color{#35bf28}+2.86\%$
test_transformed 1.1310s 1.0281s 0.9727 Ops/s 0.9643 Ops/s $\color{#35bf28}+0.87\%$
test_serial 1.6681s 1.5480s 0.6460 Ops/s 0.6254 Ops/s $\color{#35bf28}+3.29\%$
test_parallel 1.4155s 1.3078s 0.7647 Ops/s 0.7649 Ops/s $\color{#d91a1a}-0.03\%$
test_step_mdp_speed[True-True-True-True-True] 0.6443ms 31.0005μs 32.2575 KOps/s 33.1222 KOps/s $\color{#d91a1a}-2.61\%$
test_step_mdp_speed[True-True-True-True-False] 47.8290μs 18.2032μs 54.9355 KOps/s 55.6163 KOps/s $\color{#d91a1a}-1.22\%$
test_step_mdp_speed[True-True-True-False-True] 76.7190μs 17.6191μs 56.7566 KOps/s 56.0672 KOps/s $\color{#35bf28}+1.23\%$
test_step_mdp_speed[True-True-True-False-False] 42.9900μs 10.2922μs 97.1611 KOps/s 98.8223 KOps/s $\color{#d91a1a}-1.68\%$
test_step_mdp_speed[True-True-False-True-True] 77.9040μs 33.6287μs 29.7365 KOps/s 30.8784 KOps/s $\color{#d91a1a}-3.70\%$
test_step_mdp_speed[True-True-False-True-False] 46.0660μs 20.0568μs 49.8584 KOps/s 50.4727 KOps/s $\color{#d91a1a}-1.22\%$
test_step_mdp_speed[True-True-False-False-True] 86.9540μs 19.4809μs 51.3324 KOps/s 52.1735 KOps/s $\color{#d91a1a}-1.61\%$
test_step_mdp_speed[True-True-False-False-False] 39.9740μs 12.2122μs 81.8851 KOps/s 84.0759 KOps/s $\color{#d91a1a}-2.61\%$
test_step_mdp_speed[True-False-True-True-True] 77.3040μs 35.3575μs 28.2826 KOps/s 29.0753 KOps/s $\color{#d91a1a}-2.73\%$
test_step_mdp_speed[True-False-True-True-False] 57.0150μs 22.0205μs 45.4122 KOps/s 45.8511 KOps/s $\color{#d91a1a}-0.96\%$
test_step_mdp_speed[True-False-True-False-True] 86.5970μs 19.5527μs 51.1439 KOps/s 52.4954 KOps/s $\color{#d91a1a}-2.57\%$
test_step_mdp_speed[True-False-True-False-False] 43.6810μs 12.0714μs 82.8404 KOps/s 83.6009 KOps/s $\color{#d91a1a}-0.91\%$
test_step_mdp_speed[True-False-False-True-True] 81.8620μs 36.6596μs 27.2780 KOps/s 27.8139 KOps/s $\color{#d91a1a}-1.93\%$
test_step_mdp_speed[True-False-False-True-False] 56.5850μs 24.2585μs 41.2226 KOps/s 42.2281 KOps/s $\color{#d91a1a}-2.38\%$
test_step_mdp_speed[True-False-False-False-True] 58.5390μs 21.2127μs 47.1415 KOps/s 48.1001 KOps/s $\color{#d91a1a}-1.99\%$
test_step_mdp_speed[True-False-False-False-False] 46.1060μs 13.8853μs 72.0188 KOps/s 72.7119 KOps/s $\color{#d91a1a}-0.95\%$
test_step_mdp_speed[False-True-True-True-True] 0.1204ms 35.1964μs 28.4120 KOps/s 28.7548 KOps/s $\color{#d91a1a}-1.19\%$
test_step_mdp_speed[False-True-True-True-False] 64.1590μs 21.9994μs 45.4557 KOps/s 45.6715 KOps/s $\color{#d91a1a}-0.47\%$
test_step_mdp_speed[False-True-True-False-True] 0.6120ms 22.5362μs 44.3731 KOps/s 45.2651 KOps/s $\color{#d91a1a}-1.97\%$
test_step_mdp_speed[False-True-True-False-False] 45.6650μs 13.6652μs 73.1784 KOps/s 74.3210 KOps/s $\color{#d91a1a}-1.54\%$
test_step_mdp_speed[False-True-False-True-True] 89.5370μs 37.1376μs 26.9269 KOps/s 27.6904 KOps/s $\color{#d91a1a}-2.76\%$
test_step_mdp_speed[False-True-False-True-False] 85.1020μs 23.8871μs 41.8636 KOps/s 42.3631 KOps/s $\color{#d91a1a}-1.18\%$
test_step_mdp_speed[False-True-False-False-True] 2.4753ms 24.3116μs 41.1326 KOps/s 41.2493 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[False-True-False-False-False] 44.1320μs 15.4563μs 64.6985 KOps/s 65.0680 KOps/s $\color{#d91a1a}-0.57\%$
test_step_mdp_speed[False-False-True-True-True] 77.0330μs 38.9095μs 25.7007 KOps/s 26.0689 KOps/s $\color{#d91a1a}-1.41\%$
test_step_mdp_speed[False-False-True-True-False] 56.4450μs 25.9481μs 38.5384 KOps/s 39.4670 KOps/s $\color{#d91a1a}-2.35\%$
test_step_mdp_speed[False-False-True-False-True] 54.0300μs 24.1869μs 41.3448 KOps/s 41.7898 KOps/s $\color{#d91a1a}-1.07\%$
test_step_mdp_speed[False-False-True-False-False] 68.8080μs 15.4654μs 64.6606 KOps/s 65.4376 KOps/s $\color{#d91a1a}-1.19\%$
test_step_mdp_speed[False-False-False-True-True] 84.6370μs 40.4938μs 24.6952 KOps/s 25.0035 KOps/s $\color{#d91a1a}-1.23\%$
test_step_mdp_speed[False-False-False-True-False] 73.1760μs 27.7180μs 36.0777 KOps/s 36.4746 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[False-False-False-False-True] 61.3640μs 25.8949μs 38.6177 KOps/s 38.4468 KOps/s $\color{#35bf28}+0.44\%$
test_step_mdp_speed[False-False-False-False-False] 49.6620μs 17.1193μs 58.4137 KOps/s 59.2115 KOps/s $\color{#d91a1a}-1.35\%$
test_values[generalized_advantage_estimate-True-True] 12.1484ms 10.2379ms 97.6761 Ops/s 96.5186 Ops/s $\color{#35bf28}+1.20\%$
test_values[vec_generalized_advantage_estimate-True-True] 31.3129ms 25.0191ms 39.9695 Ops/s 38.1207 Ops/s $\color{#35bf28}+4.85\%$
test_values[td0_return_estimate-False-False] 0.2312ms 0.2065ms 4.8421 KOps/s 5.1340 KOps/s $\textbf{\color{#d91a1a}-5.69\%}$
test_values[td1_return_estimate-False-False] 28.9124ms 25.3751ms 39.4087 Ops/s 40.0366 Ops/s $\color{#d91a1a}-1.57\%$
test_values[vec_td1_return_estimate-False-False] 26.0434ms 24.6101ms 40.6337 Ops/s 37.5549 Ops/s $\textbf{\color{#35bf28}+8.20\%}$
test_values[td_lambda_return_estimate-True-False] 39.1703ms 35.9450ms 27.8202 Ops/s 27.7836 Ops/s $\color{#35bf28}+0.13\%$
test_values[vec_td_lambda_return_estimate-True-False] 31.0659ms 25.2380ms 39.6228 Ops/s 37.6993 Ops/s $\textbf{\color{#35bf28}+5.10\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 12.3778ms 8.7632ms 114.1134 Ops/s 116.7816 Ops/s $\color{#d91a1a}-2.28\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.4490ms 1.9654ms 508.7897 Ops/s 535.0940 Ops/s $\color{#d91a1a}-4.92\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4713ms 0.3724ms 2.6856 KOps/s 2.6444 KOps/s $\color{#35bf28}+1.56\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 45.9754ms 41.6355ms 24.0180 Ops/s 21.5954 Ops/s $\textbf{\color{#35bf28}+11.22\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 6.5044ms 3.5516ms 281.5635 Ops/s 284.4993 Ops/s $\color{#d91a1a}-1.03\%$
test_dqn_speed[False-None] 5.9821ms 1.4298ms 699.4171 Ops/s 686.7380 Ops/s $\color{#35bf28}+1.85\%$
test_dqn_speed[False-backward] 2.0690ms 1.9124ms 522.8986 Ops/s 502.1997 Ops/s $\color{#35bf28}+4.12\%$
test_dqn_speed[True-None] 0.7602ms 0.5010ms 1.9958 KOps/s 1.9799 KOps/s $\color{#35bf28}+0.80\%$
test_dqn_speed[True-backward] 1.0175ms 0.9472ms 1.0557 KOps/s 1.0588 KOps/s $\color{#d91a1a}-0.29\%$
test_dqn_speed[reduce-overhead-None] 0.8115ms 0.5073ms 1.9712 KOps/s 1.9664 KOps/s $\color{#35bf28}+0.24\%$
test_dqn_speed[reduce-overhead-backward] 1.7261ms 1.0236ms 976.9308 Ops/s 1.0627 KOps/s $\textbf{\color{#d91a1a}-8.07\%}$
test_ddpg_speed[False-None] 5.0344ms 3.0465ms 328.2500 Ops/s 332.7846 Ops/s $\color{#d91a1a}-1.36\%$
test_ddpg_speed[False-backward] 4.7614ms 4.2232ms 236.7875 Ops/s 241.2655 Ops/s $\color{#d91a1a}-1.86\%$
test_ddpg_speed[True-None] 2.0834ms 1.2859ms 777.6370 Ops/s 759.2989 Ops/s $\color{#35bf28}+2.42\%$
test_ddpg_speed[True-backward] 2.2958ms 2.2259ms 449.2648 Ops/s 445.4376 Ops/s $\color{#35bf28}+0.86\%$
test_ddpg_speed[reduce-overhead-None] 1.5817ms 1.3221ms 756.3735 Ops/s 713.3517 Ops/s $\textbf{\color{#35bf28}+6.03\%}$
test_ddpg_speed[reduce-overhead-backward] 2.6304ms 2.1665ms 461.5687 Ops/s 347.5960 Ops/s $\textbf{\color{#35bf28}+32.79\%}$
test_sac_speed[False-None] 8.6090ms 8.0426ms 124.3383 Ops/s 103.0094 Ops/s $\textbf{\color{#35bf28}+20.71\%}$
test_sac_speed[False-backward] 11.0514ms 10.7922ms 92.6597 Ops/s 72.9366 Ops/s $\textbf{\color{#35bf28}+27.04\%}$
test_sac_speed[True-None] 3.3627ms 2.2968ms 435.3878 Ops/s 368.7173 Ops/s $\textbf{\color{#35bf28}+18.08\%}$
test_sac_speed[True-backward] 3.8498ms 3.7882ms 263.9765 Ops/s 256.1985 Ops/s $\color{#35bf28}+3.04\%$
test_sac_speed[reduce-overhead-None] 2.8436ms 2.2809ms 438.4260 Ops/s 403.4728 Ops/s $\textbf{\color{#35bf28}+8.66\%}$
test_sac_speed[reduce-overhead-backward] 4.0815ms 3.9179ms 255.2398 Ops/s 255.0181 Ops/s $\color{#35bf28}+0.09\%$
test_redq_speed[False-None] 19.6335ms 13.4914ms 74.1212 Ops/s 48.0928 Ops/s $\textbf{\color{#35bf28}+54.12\%}$
test_redq_speed[False-backward] 34.1786ms 23.2413ms 43.0269 Ops/s 41.3888 Ops/s $\color{#35bf28}+3.96\%$
test_redq_speed[True-None] 8.8667ms 5.2754ms 189.5592 Ops/s 155.6748 Ops/s $\textbf{\color{#35bf28}+21.77\%}$
test_redq_speed[True-backward] 14.9102ms 12.6789ms 78.8711 Ops/s 69.4272 Ops/s $\textbf{\color{#35bf28}+13.60\%}$
test_redq_speed[reduce-overhead-None] 6.1429ms 5.3215ms 187.9165 Ops/s 149.7641 Ops/s $\textbf{\color{#35bf28}+25.47\%}$
test_redq_speed[reduce-overhead-backward] 15.1434ms 12.9413ms 77.2723 Ops/s 68.4369 Ops/s $\textbf{\color{#35bf28}+12.91\%}$
test_redq_deprec_speed[False-None] 14.4140ms 13.1792ms 75.8772 Ops/s 67.7378 Ops/s $\textbf{\color{#35bf28}+12.02\%}$
test_redq_deprec_speed[False-backward] 21.9731ms 19.4486ms 51.4176 Ops/s 49.4150 Ops/s $\color{#35bf28}+4.05\%$
test_redq_deprec_speed[True-None] 5.8006ms 4.8763ms 205.0752 Ops/s 217.7692 Ops/s $\textbf{\color{#d91a1a}-5.83\%}$
test_redq_deprec_speed[True-backward] 9.8200ms 9.4301ms 106.0439 Ops/s 96.6079 Ops/s $\textbf{\color{#35bf28}+9.77\%}$
test_redq_deprec_speed[reduce-overhead-None] 5.5698ms 4.3906ms 227.7599 Ops/s 231.7327 Ops/s $\color{#d91a1a}-1.71\%$
test_redq_deprec_speed[reduce-overhead-backward] 10.1545ms 9.6030ms 104.1337 Ops/s 104.6314 Ops/s $\color{#d91a1a}-0.48\%$
test_td3_speed[False-None] 8.8554ms 8.3694ms 119.4832 Ops/s 115.0610 Ops/s $\color{#35bf28}+3.84\%$
test_td3_speed[False-backward] 13.5754ms 11.2539ms 88.8584 Ops/s 90.2785 Ops/s $\color{#d91a1a}-1.57\%$
test_td3_speed[True-None] 1.9739ms 1.8492ms 540.7886 Ops/s 441.9997 Ops/s $\textbf{\color{#35bf28}+22.35\%}$
test_td3_speed[True-backward] 4.1851ms 3.6744ms 272.1545 Ops/s 243.0063 Ops/s $\textbf{\color{#35bf28}+11.99\%}$
test_td3_speed[reduce-overhead-None] 2.1369ms 1.8758ms 533.1000 Ops/s 513.7728 Ops/s $\color{#35bf28}+3.76\%$
test_td3_speed[reduce-overhead-backward] 4.0358ms 3.7304ms 268.0709 Ops/s 264.6712 Ops/s $\color{#35bf28}+1.28\%$
test_cql_speed[False-None] 39.1075ms 37.3884ms 26.7463 Ops/s 25.8802 Ops/s $\color{#35bf28}+3.35\%$
test_cql_speed[False-backward] 51.5260ms 48.0639ms 20.8056 Ops/s 19.9956 Ops/s $\color{#35bf28}+4.05\%$
test_cql_speed[True-None] 17.5372ms 16.3025ms 61.3402 Ops/s 58.9199 Ops/s $\color{#35bf28}+4.11\%$
test_cql_speed[True-backward] 24.9470ms 24.3364ms 41.0908 Ops/s 40.6789 Ops/s $\color{#35bf28}+1.01\%$
test_cql_speed[reduce-overhead-None] 17.4463ms 16.3089ms 61.3162 Ops/s 59.3281 Ops/s $\color{#35bf28}+3.35\%$
test_cql_speed[reduce-overhead-backward] 27.4377ms 24.0307ms 41.6134 Ops/s 40.9401 Ops/s $\color{#35bf28}+1.64\%$
test_a2c_speed[False-None] 8.9105ms 7.5535ms 132.3892 Ops/s 125.9130 Ops/s $\textbf{\color{#35bf28}+5.14\%}$
test_a2c_speed[False-backward] 15.6297ms 14.6089ms 68.4513 Ops/s 64.1092 Ops/s $\textbf{\color{#35bf28}+6.77\%}$
test_a2c_speed[True-None] 4.4087ms 4.0956ms 244.1637 Ops/s 264.1070 Ops/s $\textbf{\color{#d91a1a}-7.55\%}$
test_a2c_speed[True-backward] 12.3544ms 11.1916ms 89.3523 Ops/s 85.7989 Ops/s $\color{#35bf28}+4.14\%$
test_a2c_speed[reduce-overhead-None] 4.3882ms 3.8174ms 261.9582 Ops/s 265.6485 Ops/s $\color{#d91a1a}-1.39\%$
test_a2c_speed[reduce-overhead-backward] 11.5141ms 10.6215ms 94.1484 Ops/s 86.9357 Ops/s $\textbf{\color{#35bf28}+8.30\%}$
test_ppo_speed[False-None] 8.5727ms 7.7009ms 129.8545 Ops/s 121.4774 Ops/s $\textbf{\color{#35bf28}+6.90\%}$
test_ppo_speed[False-backward] 15.8227ms 14.8618ms 67.2866 Ops/s 62.2556 Ops/s $\textbf{\color{#35bf28}+8.08\%}$
test_ppo_speed[True-None] 4.7067ms 4.3736ms 228.6466 Ops/s 219.5565 Ops/s $\color{#35bf28}+4.14\%$
test_ppo_speed[True-backward] 10.9964ms 10.7590ms 92.9453 Ops/s 94.6799 Ops/s $\color{#d91a1a}-1.83\%$
test_ppo_speed[reduce-overhead-None] 4.9768ms 4.0814ms 245.0142 Ops/s 211.4417 Ops/s $\textbf{\color{#35bf28}+15.88\%}$
test_ppo_speed[reduce-overhead-backward] 11.1072ms 10.5231ms 95.0291 Ops/s 95.5861 Ops/s $\color{#d91a1a}-0.58\%$
test_reinforce_speed[False-None] 8.0887ms 6.8367ms 146.2701 Ops/s 150.1905 Ops/s $\color{#d91a1a}-2.61\%$
test_reinforce_speed[False-backward] 11.0913ms 10.6230ms 94.1354 Ops/s 101.2385 Ops/s $\textbf{\color{#d91a1a}-7.02\%}$
test_reinforce_speed[True-None] 3.9172ms 3.5044ms 285.3565 Ops/s 282.9276 Ops/s $\color{#35bf28}+0.86\%$
test_reinforce_speed[True-backward] 10.5596ms 9.5317ms 104.9133 Ops/s 105.8363 Ops/s $\color{#d91a1a}-0.87\%$
test_reinforce_speed[reduce-overhead-None] 3.4934ms 3.1107ms 321.4732 Ops/s 298.8440 Ops/s $\textbf{\color{#35bf28}+7.57\%}$
test_reinforce_speed[reduce-overhead-backward] 10.6607ms 9.9245ms 100.7610 Ops/s 110.4124 Ops/s $\textbf{\color{#d91a1a}-8.74\%}$
test_iql_speed[False-None] 34.5123ms 33.6434ms 29.7235 Ops/s 30.1736 Ops/s $\color{#d91a1a}-1.49\%$
test_iql_speed[False-backward] 48.3294ms 47.0896ms 21.2361 Ops/s 20.5622 Ops/s $\color{#35bf28}+3.28\%$
test_iql_speed[True-None] 12.9343ms 11.8768ms 84.1977 Ops/s 82.0379 Ops/s $\color{#35bf28}+2.63\%$
test_iql_speed[True-backward] 23.9061ms 22.7494ms 43.9572 Ops/s 43.4058 Ops/s $\color{#35bf28}+1.27\%$
test_iql_speed[reduce-overhead-None] 14.0439ms 12.1197ms 82.5105 Ops/s 81.0311 Ops/s $\color{#35bf28}+1.83\%$
test_iql_speed[reduce-overhead-backward] 24.4296ms 23.3539ms 42.8193 Ops/s 40.5009 Ops/s $\textbf{\color{#35bf28}+5.72\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.5509ms 4.7945ms 208.5711 Ops/s 181.9923 Ops/s $\textbf{\color{#35bf28}+14.60\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7185ms 0.5083ms 1.9674 KOps/s 1.8331 KOps/s $\textbf{\color{#35bf28}+7.33\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7039ms 0.4886ms 2.0466 KOps/s 1.8657 KOps/s $\textbf{\color{#35bf28}+9.69\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.3557ms 4.6749ms 213.9070 Ops/s 191.9135 Ops/s $\textbf{\color{#35bf28}+11.46\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.4304ms 0.5208ms 1.9200 KOps/s 1.8585 KOps/s $\color{#35bf28}+3.31\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7372ms 0.5018ms 1.9929 KOps/s 1.9330 KOps/s $\color{#35bf28}+3.10\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.9220ms 1.6683ms 599.4210 Ops/s 590.9535 Ops/s $\color{#35bf28}+1.43\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.0972ms 1.5800ms 632.9159 Ops/s 625.9996 Ops/s $\color{#35bf28}+1.10\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.6578ms 5.1709ms 193.3907 Ops/s 196.2115 Ops/s $\color{#d91a1a}-1.44\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.4288ms 0.6771ms 1.4769 KOps/s 1.4722 KOps/s $\color{#35bf28}+0.32\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9102ms 0.6547ms 1.5273 KOps/s 1.5417 KOps/s $\color{#d91a1a}-0.93\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 8.3140ms 5.1394ms 194.5755 Ops/s 209.4823 Ops/s $\textbf{\color{#d91a1a}-7.12\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8329ms 0.5335ms 1.8745 KOps/s 1.8887 KOps/s $\color{#d91a1a}-0.75\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7597ms 0.5152ms 1.9410 KOps/s 1.9688 KOps/s $\color{#d91a1a}-1.41\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.5216ms 5.0397ms 198.4233 Ops/s 215.7679 Ops/s $\textbf{\color{#d91a1a}-8.04\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8428ms 0.5226ms 1.9133 KOps/s 1.9314 KOps/s $\color{#d91a1a}-0.93\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7648ms 0.5118ms 1.9540 KOps/s 2.0003 KOps/s $\color{#d91a1a}-2.31\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.5904ms 5.1710ms 193.3851 Ops/s 188.1254 Ops/s $\color{#35bf28}+2.80\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.5290ms 0.6678ms 1.4975 KOps/s 1.4576 KOps/s $\color{#35bf28}+2.74\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9032ms 0.6448ms 1.5508 KOps/s 1.5006 KOps/s $\color{#35bf28}+3.35\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.8666ms 4.3241ms 231.2625 Ops/s 226.7618 Ops/s $\color{#35bf28}+1.98\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 5.4761ms 2.3528ms 425.0242 Ops/s 419.2975 Ops/s $\color{#35bf28}+1.37\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.3051ms 1.5166ms 659.3740 Ops/s 662.8686 Ops/s $\color{#d91a1a}-0.53\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.0649ms 4.4087ms 226.8220 Ops/s 226.9115 Ops/s $\color{#d91a1a}-0.04\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.5373s 13.0801ms 76.4518 Ops/s 412.7716 Ops/s $\textbf{\color{#d91a1a}-81.48\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.6329ms 1.4785ms 676.3725 Ops/s 673.6461 Ops/s $\color{#35bf28}+0.40\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.6856ms 4.4246ms 226.0070 Ops/s 29.2388 Ops/s $\textbf{\color{#35bf28}+672.97\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 6.9775ms 2.5877ms 386.4508 Ops/s 376.3785 Ops/s $\color{#35bf28}+2.68\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.8129ms 1.6061ms 622.6266 Ops/s 616.2244 Ops/s $\color{#35bf28}+1.04\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.6812ms 11.6099ms 86.1331 Ops/s 76.6476 Ops/s $\textbf{\color{#35bf28}+12.38\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.9066ms 14.5770ms 68.6011 Ops/s 65.8780 Ops/s $\color{#35bf28}+4.13\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.7422ms 20.3862ms 49.0527 Ops/s 46.1365 Ops/s $\textbf{\color{#35bf28}+6.32\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.6883ms 14.4597ms 69.1577 Ops/s 64.6963 Ops/s $\textbf{\color{#35bf28}+6.90\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.5526ms 20.5386ms 48.6888 Ops/s 46.2854 Ops/s $\textbf{\color{#35bf28}+5.19\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 16.4235ms 15.8127ms 63.2401 Ops/s 59.8068 Ops/s $\textbf{\color{#35bf28}+5.74\%}$
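
The Change column above appears to be the relative difference between this PR's throughput (Ops) and the throughput on repo HEAD. A minimal Python sketch of that arithmetic, checked against three rows of the table, follows. The function names and the 5% cutoff are assumptions for illustration (the cutoff is inferred from which rows the report renders in bold); this is not the benchmark action's actual code.

```python
# Sketch only: reproduces the "Change" column from the two Ops columns.
# `relative_change`, `classify`, and the 5% cutoff are hypothetical names and
# values chosen for this example, not taken from the benchmark tooling.


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of PR throughput relative to repo HEAD (both in Ops/s)."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, cutoff_pct: float = 5.0) -> str:
    """Label a row; the 5% cutoff is an assumption inferred from the bold rows."""
    if change_pct > cutoff_pct:
        return "improved"
    if change_pct < -cutoff_pct:
        return "worsened"
    return "within cutoff"


# (Ops on this PR, Ops on repo HEAD), copied from the CPU table, in Ops/s.
rows = {
    "test_simple": (1.8812, 1.8289),
    "test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-"
    "SamplerWithoutReplacement-400]": (76.4518, 412.7716),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-"
    "None-400]": (226.0070, 29.2388),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Rows reported in KOps/s need converting to Ops/s before comparison (for example, 1.0627 KOps/s is 1062.7 Ops/s).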


github-actions bot commented Feb 10, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}18$. Worsened: $\large\color{#d91a1a}13$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.8899s 0.8060s 1.2407 Ops/s 1.2325 Ops/s $\color{#35bf28}+0.67\%$
test_transformed 1.4673s 1.3838s 0.7226 Ops/s 0.7074 Ops/s $\color{#35bf28}+2.15\%$
test_serial 2.3720s 2.2902s 0.4366 Ops/s 0.4308 Ops/s $\color{#35bf28}+1.35\%$
test_parallel 1.9659s 1.8465s 0.5416 Ops/s 0.5385 Ops/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[True-True-True-True-True] 0.1905ms 40.9253μs 24.4348 KOps/s 25.4674 KOps/s $\color{#d91a1a}-4.05\%$
test_step_mdp_speed[True-True-True-True-False] 56.7310μs 23.8094μs 42.0002 KOps/s 42.7179 KOps/s $\color{#d91a1a}-1.68\%$
test_step_mdp_speed[True-True-True-False-True] 55.3010μs 22.6538μs 44.1427 KOps/s 43.9135 KOps/s $\color{#35bf28}+0.52\%$
test_step_mdp_speed[True-True-True-False-False] 41.2510μs 13.1484μs 76.0551 KOps/s 77.2170 KOps/s $\color{#d91a1a}-1.50\%$
test_step_mdp_speed[True-True-False-True-True] 80.7320μs 43.4082μs 23.0371 KOps/s 23.4725 KOps/s $\color{#d91a1a}-1.85\%$
test_step_mdp_speed[True-True-False-True-False] 56.6520μs 26.0277μs 38.4206 KOps/s 39.4504 KOps/s $\color{#d91a1a}-2.61\%$
test_step_mdp_speed[True-True-False-False-True] 61.4310μs 25.0025μs 39.9961 KOps/s 40.7665 KOps/s $\color{#d91a1a}-1.89\%$
test_step_mdp_speed[True-True-False-False-False] 42.7010μs 15.5390μs 64.3540 KOps/s 66.7976 KOps/s $\color{#d91a1a}-3.66\%$
test_step_mdp_speed[True-False-True-True-True] 72.0620μs 45.6518μs 21.9050 KOps/s 22.0616 KOps/s $\color{#d91a1a}-0.71\%$
test_step_mdp_speed[True-False-True-True-False] 77.5920μs 28.0004μs 35.7137 KOps/s 35.5879 KOps/s $\color{#35bf28}+0.35\%$
test_step_mdp_speed[True-False-True-False-True] 53.9010μs 24.7971μs 40.3273 KOps/s 40.2167 KOps/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[True-False-True-False-False] 51.3810μs 15.5052μs 64.4946 KOps/s 66.8845 KOps/s $\color{#d91a1a}-3.57\%$
test_step_mdp_speed[True-False-False-True-True] 81.7420μs 47.4870μs 21.0584 KOps/s 21.3134 KOps/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[True-False-False-True-False] 57.5710μs 30.9492μs 32.3111 KOps/s 33.2813 KOps/s $\color{#d91a1a}-2.92\%$
test_step_mdp_speed[True-False-False-False-True] 55.8210μs 26.9119μs 37.1583 KOps/s 37.8398 KOps/s $\color{#d91a1a}-1.80\%$
test_step_mdp_speed[True-False-False-False-False] 43.2310μs 17.7824μs 56.2353 KOps/s 57.3776 KOps/s $\color{#d91a1a}-1.99\%$
test_step_mdp_speed[False-True-True-True-True] 79.0220μs 45.2437μs 22.1025 KOps/s 22.5515 KOps/s $\color{#d91a1a}-1.99\%$
test_step_mdp_speed[False-True-True-True-False] 62.3710μs 28.3205μs 35.3101 KOps/s 35.6277 KOps/s $\color{#d91a1a}-0.89\%$
test_step_mdp_speed[False-True-True-False-True] 64.8320μs 28.5819μs 34.9872 KOps/s 35.0951 KOps/s $\color{#d91a1a}-0.31\%$
test_step_mdp_speed[False-True-True-False-False] 47.9310μs 17.2594μs 57.9396 KOps/s 58.7703 KOps/s $\color{#d91a1a}-1.41\%$
test_step_mdp_speed[False-True-False-True-True] 80.0120μs 47.5717μs 21.0209 KOps/s 21.2316 KOps/s $\color{#d91a1a}-0.99\%$
test_step_mdp_speed[False-True-False-True-False] 63.2910μs 30.4851μs 32.8029 KOps/s 33.1162 KOps/s $\color{#d91a1a}-0.95\%$
test_step_mdp_speed[False-True-False-False-True] 3.1612ms 31.4942μs 31.7519 KOps/s 32.3881 KOps/s $\color{#d91a1a}-1.96\%$
test_step_mdp_speed[False-True-False-False-False] 52.0410μs 19.5646μs 51.1126 KOps/s 51.4155 KOps/s $\color{#d91a1a}-0.59\%$
test_step_mdp_speed[False-False-True-True-True] 0.1019ms 50.0876μs 19.9650 KOps/s 20.1375 KOps/s $\color{#d91a1a}-0.86\%$
test_step_mdp_speed[False-False-True-True-False] 64.9110μs 32.8439μs 30.4471 KOps/s 30.3677 KOps/s $\color{#35bf28}+0.26\%$
test_step_mdp_speed[False-False-True-False-True] 79.6420μs 31.2627μs 31.9870 KOps/s 32.9218 KOps/s $\color{#d91a1a}-2.84\%$
test_step_mdp_speed[False-False-True-False-False] 50.8510μs 19.4354μs 51.4525 KOps/s 52.1917 KOps/s $\color{#d91a1a}-1.42\%$
test_step_mdp_speed[False-False-False-True-True] 0.1617ms 51.7878μs 19.3096 KOps/s 19.5597 KOps/s $\color{#d91a1a}-1.28\%$
test_step_mdp_speed[False-False-False-True-False] 67.1210μs 35.1815μs 28.4240 KOps/s 28.8030 KOps/s $\color{#d91a1a}-1.32\%$
test_step_mdp_speed[False-False-False-False-True] 68.7920μs 32.5732μs 30.7001 KOps/s 31.1602 KOps/s $\color{#d91a1a}-1.48\%$
test_step_mdp_speed[False-False-False-False-False] 51.3010μs 21.5776μs 46.3444 KOps/s 46.6354 KOps/s $\color{#d91a1a}-0.62\%$
test_values[generalized_advantage_estimate-True-True] 26.1695ms 25.8569ms 38.6744 Ops/s 38.8237 Ops/s $\color{#d91a1a}-0.38\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1054s 3.0124ms 331.9616 Ops/s 322.4460 Ops/s $\color{#35bf28}+2.95\%$
test_values[td0_return_estimate-False-False] 0.1055ms 80.3761μs 12.4415 KOps/s 11.5649 KOps/s $\textbf{\color{#35bf28}+7.58\%}$
test_values[td1_return_estimate-False-False] 57.6934ms 57.3424ms 17.4391 Ops/s 17.4195 Ops/s $\color{#35bf28}+0.11\%$
test_values[vec_td1_return_estimate-False-False] 1.2720ms 1.0961ms 912.3481 Ops/s 915.5297 Ops/s $\color{#d91a1a}-0.35\%$
test_values[td_lambda_return_estimate-True-False] 91.2850ms 90.7615ms 11.0179 Ops/s 11.0288 Ops/s $\color{#d91a1a}-0.10\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2325ms 1.0906ms 916.9492 Ops/s 917.2036 Ops/s $\color{#d91a1a}-0.03\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.5583ms 25.3968ms 39.3750 Ops/s 38.6522 Ops/s $\color{#35bf28}+1.87\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0308ms 0.7615ms 1.3131 KOps/s 1.3156 KOps/s $\color{#d91a1a}-0.18\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8311ms 0.6806ms 1.4693 KOps/s 1.4675 KOps/s $\color{#35bf28}+0.13\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5417ms 1.4960ms 668.4628 Ops/s 670.2349 Ops/s $\color{#d91a1a}-0.26\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7701ms 0.6954ms 1.4380 KOps/s 1.4393 KOps/s $\color{#d91a1a}-0.09\%$
test_dqn_speed[False-None] 6.9303ms 1.5403ms 649.2040 Ops/s 645.5117 Ops/s $\color{#35bf28}+0.57\%$
test_dqn_speed[False-backward] 2.2364ms 2.1609ms 462.7635 Ops/s 462.6910 Ops/s $\color{#35bf28}+0.02\%$
test_dqn_speed[True-None] 0.7006ms 0.5795ms 1.7257 KOps/s 1.6997 KOps/s $\color{#35bf28}+1.53\%$
test_dqn_speed[True-backward] 1.2989ms 1.2612ms 792.8892 Ops/s 849.6483 Ops/s $\textbf{\color{#d91a1a}-6.68\%}$
test_dqn_speed[reduce-overhead-None] 0.6658ms 0.5973ms 1.6743 KOps/s 1.6642 KOps/s $\color{#35bf28}+0.61\%$
test_dqn_speed[reduce-overhead-backward] 1.1966ms 1.0973ms 911.3215 Ops/s 1.0090 KOps/s $\textbf{\color{#d91a1a}-9.68\%}$
test_ddpg_speed[False-None] 3.1680ms 2.8625ms 349.3396 Ops/s 343.4449 Ops/s $\color{#35bf28}+1.72\%$
test_ddpg_speed[False-backward] 4.6990ms 4.2742ms 233.9626 Ops/s 236.9239 Ops/s $\color{#d91a1a}-1.25\%$
test_ddpg_speed[True-None] 1.5084ms 1.3931ms 717.8192 Ops/s 721.1305 Ops/s $\color{#d91a1a}-0.46\%$
test_ddpg_speed[True-backward] 2.7235ms 2.6374ms 379.1647 Ops/s 397.2010 Ops/s $\color{#d91a1a}-4.54\%$
test_ddpg_speed[reduce-overhead-None] 1.5128ms 1.4068ms 710.8581 Ops/s 714.2662 Ops/s $\color{#d91a1a}-0.48\%$
test_ddpg_speed[reduce-overhead-backward] 2.1316ms 2.0912ms 478.1897 Ops/s 505.4944 Ops/s $\textbf{\color{#d91a1a}-5.40\%}$
test_sac_speed[False-None] 8.4347ms 8.0600ms 124.0691 Ops/s 122.2696 Ops/s $\color{#35bf28}+1.47\%$
test_sac_speed[False-backward] 11.9549ms 11.3257ms 88.2948 Ops/s 89.8401 Ops/s $\color{#d91a1a}-1.72\%$
test_sac_speed[True-None] 2.0141ms 1.9091ms 523.8062 Ops/s 520.8420 Ops/s $\color{#35bf28}+0.57\%$
test_sac_speed[True-backward] 3.9107ms 3.8428ms 260.2293 Ops/s 258.2377 Ops/s $\color{#35bf28}+0.77\%$
test_sac_speed[reduce-overhead-None] 17.4611ms 10.8193ms 92.4271 Ops/s 89.6967 Ops/s $\color{#35bf28}+3.04\%$
test_sac_speed[reduce-overhead-backward] 1.8794ms 1.8344ms 545.1461 Ops/s 533.0955 Ops/s $\color{#35bf28}+2.26\%$
test_redq_speed[False-None] 7.9104ms 7.4655ms 133.9490 Ops/s 129.9803 Ops/s $\color{#35bf28}+3.05\%$
test_redq_speed[False-backward] 12.2127ms 11.7128ms 85.3768 Ops/s 83.7789 Ops/s $\color{#35bf28}+1.91\%$
test_redq_speed[True-None] 2.5159ms 2.3956ms 417.4287 Ops/s 414.6844 Ops/s $\color{#35bf28}+0.66\%$
test_redq_speed[True-backward] 4.2818ms 4.1560ms 240.6138 Ops/s 226.7566 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_redq_speed[reduce-overhead-None] 2.5187ms 2.4240ms 412.5334 Ops/s 410.4317 Ops/s $\color{#35bf28}+0.51\%$
test_redq_speed[reduce-overhead-backward] 4.2613ms 4.1656ms 240.0639 Ops/s 226.5850 Ops/s $\textbf{\color{#35bf28}+5.95\%}$
test_redq_deprec_speed[False-None] 9.4357ms 9.0963ms 109.9345 Ops/s 108.2140 Ops/s $\color{#35bf28}+1.59\%$
test_redq_deprec_speed[False-backward] 12.6380ms 12.1173ms 82.5268 Ops/s 80.3834 Ops/s $\color{#35bf28}+2.67\%$
test_redq_deprec_speed[True-None] 3.0508ms 2.7405ms 364.8916 Ops/s 365.0491 Ops/s $\color{#d91a1a}-0.04\%$
test_redq_deprec_speed[True-backward] 4.5293ms 4.4430ms 225.0733 Ops/s 213.7113 Ops/s $\textbf{\color{#35bf28}+5.32\%}$
test_redq_deprec_speed[reduce-overhead-None] 2.9896ms 2.7216ms 367.4294 Ops/s 363.5394 Ops/s $\color{#35bf28}+1.07\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.9282ms 4.4630ms 224.0635 Ops/s 213.7870 Ops/s $\color{#35bf28}+4.81\%$
test_td3_speed[False-None] 8.0658ms 8.0073ms 124.8863 Ops/s 123.9684 Ops/s $\color{#35bf28}+0.74\%$
test_td3_speed[False-backward] 10.9320ms 10.3532ms 96.5885 Ops/s 94.2216 Ops/s $\color{#35bf28}+2.51\%$
test_td3_speed[True-None] 1.8129ms 1.7443ms 573.2861 Ops/s 572.7987 Ops/s $\color{#35bf28}+0.09\%$
test_td3_speed[True-backward] 3.5384ms 3.3105ms 302.0672 Ops/s 285.3549 Ops/s $\textbf{\color{#35bf28}+5.86\%}$
test_td3_speed[reduce-overhead-None] 70.6143ms 27.4683ms 36.4055 Ops/s 35.7384 Ops/s $\color{#35bf28}+1.87\%$
test_td3_speed[reduce-overhead-backward] 1.4901ms 1.4120ms 708.2266 Ops/s 637.5945 Ops/s $\textbf{\color{#35bf28}+11.08\%}$
test_cql_speed[False-None] 17.3724ms 16.8290ms 59.4212 Ops/s 58.6389 Ops/s $\color{#35bf28}+1.33\%$
test_cql_speed[False-backward] 22.5627ms 22.0323ms 45.3880 Ops/s 44.3120 Ops/s $\color{#35bf28}+2.43\%$
test_cql_speed[True-None] 3.5195ms 3.3757ms 296.2385 Ops/s 294.1553 Ops/s $\color{#35bf28}+0.71\%$
test_cql_speed[True-backward] 6.5321ms 5.8509ms 170.9128 Ops/s 170.6526 Ops/s $\color{#35bf28}+0.15\%$
test_cql_speed[reduce-overhead-None] 19.0256ms 13.1065ms 76.2980 Ops/s 74.2721 Ops/s $\color{#35bf28}+2.73\%$
test_cql_speed[reduce-overhead-backward] 2.2468ms 2.0469ms 488.5406 Ops/s 482.4370 Ops/s $\color{#35bf28}+1.27\%$
test_a2c_speed[False-None] 3.3661ms 3.1963ms 312.8574 Ops/s 304.5407 Ops/s $\color{#35bf28}+2.73\%$
test_a2c_speed[False-backward] 7.0820ms 6.4858ms 154.1839 Ops/s 154.5215 Ops/s $\color{#d91a1a}-0.22\%$
test_a2c_speed[True-None] 1.4478ms 1.3840ms 722.5569 Ops/s 717.7817 Ops/s $\color{#35bf28}+0.67\%$
test_a2c_speed[True-backward] 3.4761ms 3.1477ms 317.6910 Ops/s 316.1316 Ops/s $\color{#35bf28}+0.49\%$
test_a2c_speed[reduce-overhead-None] 14.4764ms 8.5320ms 117.2061 Ops/s 116.7982 Ops/s $\color{#35bf28}+0.35\%$
test_a2c_speed[reduce-overhead-backward] 1.7593ms 1.6287ms 614.0048 Ops/s 610.0167 Ops/s $\color{#35bf28}+0.65\%$
test_ppo_speed[False-None] 3.8037ms 3.7119ms 269.4039 Ops/s 263.5677 Ops/s $\color{#35bf28}+2.21\%$
test_ppo_speed[False-backward] 7.5454ms 7.1123ms 140.6011 Ops/s 138.4036 Ops/s $\color{#35bf28}+1.59\%$
test_ppo_speed[True-None] 1.5272ms 1.4452ms 691.9519 Ops/s 683.7388 Ops/s $\color{#35bf28}+1.20\%$
test_ppo_speed[True-backward] 3.2335ms 3.1162ms 320.8995 Ops/s 309.6472 Ops/s $\color{#35bf28}+3.63\%$
test_ppo_speed[reduce-overhead-None] 1.0634ms 0.9870ms 1.0132 KOps/s 1.0179 KOps/s $\color{#d91a1a}-0.46\%$
test_ppo_speed[reduce-overhead-backward] 1.5782ms 1.4388ms 695.0060 Ops/s 671.1139 Ops/s $\color{#35bf28}+3.56\%$
test_reinforce_speed[False-None] 2.4471ms 2.2857ms 437.5090 Ops/s 427.1795 Ops/s $\color{#35bf28}+2.42\%$
test_reinforce_speed[False-backward] 3.7599ms 3.3125ms 301.8893 Ops/s 296.6538 Ops/s $\color{#35bf28}+1.76\%$
test_reinforce_speed[True-None] 1.4585ms 1.3338ms 749.7609 Ops/s 739.6989 Ops/s $\color{#35bf28}+1.36\%$
test_reinforce_speed[True-backward] 3.1409ms 3.0199ms 331.1330 Ops/s 316.5577 Ops/s $\color{#35bf28}+4.60\%$
test_reinforce_speed[reduce-overhead-None] 16.5885ms 9.3617ms 106.8180 Ops/s 106.2464 Ops/s $\color{#35bf28}+0.54\%$
test_reinforce_speed[reduce-overhead-backward] 1.6101ms 1.5255ms 655.5370 Ops/s 588.9980 Ops/s $\textbf{\color{#35bf28}+11.30\%}$
test_iql_speed[False-None] 9.6967ms 9.2499ms 108.1091 Ops/s 105.6541 Ops/s $\color{#35bf28}+2.32\%$
test_iql_speed[False-backward] 13.4592ms 12.9489ms 77.2267 Ops/s 73.9443 Ops/s $\color{#35bf28}+4.44\%$
test_iql_speed[True-None] 2.4816ms 2.3073ms 433.3997 Ops/s 422.0237 Ops/s $\color{#35bf28}+2.70\%$
test_iql_speed[True-backward] 5.3896ms 4.9726ms 201.1024 Ops/s 191.7963 Ops/s $\color{#35bf28}+4.85\%$
test_iql_speed[reduce-overhead-None] 0.4772s 12.7628ms 78.3529 Ops/s 93.7798 Ops/s $\textbf{\color{#d91a1a}-16.45\%}$
test_iql_speed[reduce-overhead-backward] 2.0357ms 1.9686ms 507.9775 Ops/s 458.1026 Ops/s $\textbf{\color{#35bf28}+10.89\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9722ms 6.3281ms 158.0242 Ops/s 153.5382 Ops/s $\color{#35bf28}+2.92\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5628ms 0.3453ms 2.8964 KOps/s 3.1612 KOps/s $\textbf{\color{#d91a1a}-8.38\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5318ms 0.3286ms 3.0429 KOps/s 3.7072 KOps/s $\textbf{\color{#d91a1a}-17.92\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3570ms 6.0696ms 164.7565 Ops/s 161.7176 Ops/s $\color{#35bf28}+1.88\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6427ms 0.2562ms 3.9037 KOps/s 3.8164 KOps/s $\color{#35bf28}+2.29\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4488ms 0.2368ms 4.2228 KOps/s 3.5545 KOps/s $\textbf{\color{#35bf28}+18.80\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6015ms 1.3546ms 738.2122 Ops/s 781.9973 Ops/s $\textbf{\color{#d91a1a}-5.60\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4642ms 1.2356ms 809.3388 Ops/s 847.1380 Ops/s $\color{#d91a1a}-4.46\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3477ms 6.2245ms 160.6559 Ops/s 156.8291 Ops/s $\color{#35bf28}+2.44\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8549ms 0.4344ms 2.3019 KOps/s 2.2371 KOps/s $\color{#35bf28}+2.90\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6119ms 0.3992ms 2.5051 KOps/s 2.3804 KOps/s $\textbf{\color{#35bf28}+5.24\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2114ms 6.1020ms 163.8805 Ops/s 160.9690 Ops/s $\color{#35bf28}+1.81\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.0321ms 0.3119ms 3.2063 KOps/s 2.8976 KOps/s $\textbf{\color{#35bf28}+10.65\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5807ms 0.3172ms 3.1526 KOps/s 2.9610 KOps/s $\textbf{\color{#35bf28}+6.47\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3019ms 6.0167ms 166.2050 Ops/s 160.5502 Ops/s $\color{#35bf28}+3.52\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.9338ms 0.2594ms 3.8553 KOps/s 3.1074 KOps/s $\textbf{\color{#35bf28}+24.07\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5034ms 0.3018ms 3.3136 KOps/s 3.8872 KOps/s $\textbf{\color{#d91a1a}-14.76\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5086ms 6.2420ms 160.2038 Ops/s 156.6797 Ops/s $\color{#35bf28}+2.25\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2540ms 0.4847ms 2.0631 KOps/s 2.3300 KOps/s $\textbf{\color{#d91a1a}-11.46\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7011ms 0.4472ms 2.2363 KOps/s 2.5240 KOps/s $\textbf{\color{#d91a1a}-11.40\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.0177ms 5.4256ms 184.3127 Ops/s 179.5706 Ops/s $\color{#35bf28}+2.64\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.4764ms 2.0508ms 487.6201 Ops/s 420.1944 Ops/s $\textbf{\color{#35bf28}+16.05\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.6164ms 1.2071ms 828.4407 Ops/s 811.3189 Ops/s $\color{#35bf28}+2.11\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4601s 14.6113ms 68.4404 Ops/s 180.4647 Ops/s $\textbf{\color{#d91a1a}-62.08\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 5.9332ms 2.0013ms 499.6753 Ops/s 417.3252 Ops/s $\textbf{\color{#35bf28}+19.73\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 10.1278ms 1.2651ms 790.4530 Ops/s 847.2325 Ops/s $\textbf{\color{#d91a1a}-6.70\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 9.0401ms 5.7079ms 175.1968 Ops/s 31.3995 Ops/s $\textbf{\color{#35bf28}+457.96\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.4679ms 2.1763ms 459.4921 Ops/s 426.5339 Ops/s $\textbf{\color{#35bf28}+7.73\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 8.9615ms 1.4537ms 687.8828 Ops/s 778.9825 Ops/s $\textbf{\color{#d91a1a}-11.69\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.2308ms 13.0474ms 76.6437 Ops/s 72.4492 Ops/s $\textbf{\color{#35bf28}+5.79\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.1945ms 16.9731ms 58.9167 Ops/s 59.1023 Ops/s $\color{#d91a1a}-0.31\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.4478ms 17.9055ms 55.8486 Ops/s 53.8918 Ops/s $\color{#35bf28}+3.63\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.7609ms 17.1765ms 58.2192 Ops/s 58.7031 Ops/s $\color{#d91a1a}-0.82\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.3313ms 17.7685ms 56.2795 Ops/s 53.9583 Ops/s $\color{#35bf28}+4.30\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.3819ms 18.8509ms 53.0478 Ops/s 53.7865 Ops/s $\color{#d91a1a}-1.37\%$

vmoens added the enhancement (New feature or request) label Feb 10, 2025
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: 4a9eed2b52be6616c1f27f843b04e31ddc0a7947
Pull Request resolved: #2779
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: 6785a40b3b9d8f90ba2aa81c23071dacbb9452ed
Pull Request resolved: #2779
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 11, 2025
ghstack-source-id: 436be56fa8e527e2b04406e6b8382103f3fb3b67
Pull Request resolved: #2779
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 11, 2025
ghstack-source-id: 48533a99706bc6d29e335d8e9efac576c47359c9
Pull Request resolved: #2779
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 12, 2025
ghstack-source-id: 5a3a2015ef90a0f9a13ec6d10274e2410b3e9a83
Pull Request resolved: #2779
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 12, 2025
ghstack-source-id: c67e625ccb1dc755c3391f7daa363ccc0fb005bc
Pull Request resolved: #2779
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 12, 2025
ghstack-source-id: feb4f05bea1e2489664d1cab8c1213d8a3944458
Pull Request resolved: #2779
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
vmoens merged commit 2db0291 into gh/vmoens/88/base Feb 13, 2025
71 of 74 checks passed
vmoens added a commit that referenced this pull request Feb 13, 2025
ghstack-source-id: 0a6f7f916ee6f9c6d450c511385bdfdb1d911da0
Pull Request resolved: #2779
vmoens deleted the gh/vmoens/88/head branch February 13, 2025 17:44