[BugFix] Extend RB with lazy stack #2453

Merged 1 commit on Sep 25, 2024


vmoens (Contributor) commented Sep 25, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]

pytorch-bot bot commented Sep 25, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2453

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 19 Unrelated Failures

As of commit 2df6393 with merge base e294c68:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Sep 25, 2024
ghstack-source-id: a0be9a2840ab6f090605a3e1d2f47a4f00ac5183
Pull Request resolved: #2453
facebook-github-bot added the CLA Signed label Sep 25, 2024. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
vmoens merged commit 2df6393 into gh/vmoens/30/base Sep 25, 2024
18 of 31 checks passed
vmoens deleted the gh/vmoens/30/head branch September 25, 2024 05:42

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 146. Improved: $\large\color{#35bf28}20$. Worsened: $\large\color{#d91a1a}7$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_single 61.5088ms 60.4461ms 16.5437 Ops/s 16.0026 Ops/s $\color{#35bf28}+3.38\%$
test_sync 36.8657ms 33.0874ms 30.2230 Ops/s 28.6134 Ops/s $\textbf{\color{#35bf28}+5.63\%}$
test_async 62.7537ms 32.3991ms 30.8651 Ops/s 30.0197 Ops/s $\color{#35bf28}+2.82\%$
test_simple 0.5239s 0.4388s 2.2789 Ops/s 2.3221 Ops/s $\color{#d91a1a}-1.86\%$
test_transformed 0.5873s 0.5818s 1.7189 Ops/s 1.6160 Ops/s $\textbf{\color{#35bf28}+6.36\%}$
test_serial 1.2798s 1.2720s 0.7862 Ops/s 0.7471 Ops/s $\textbf{\color{#35bf28}+5.23\%}$
test_parallel 1.2387s 1.1464s 0.8723 Ops/s 0.8620 Ops/s $\color{#35bf28}+1.20\%$
test_step_mdp_speed[True-True-True-True-True] 0.1851ms 27.2386μs 36.7125 KOps/s 36.5099 KOps/s $\color{#35bf28}+0.55\%$
test_step_mdp_speed[True-True-True-True-False] 80.7710μs 15.8785μs 62.9781 KOps/s 61.8424 KOps/s $\color{#35bf28}+1.84\%$
test_step_mdp_speed[True-True-True-False-True] 55.2240μs 15.7437μs 63.5176 KOps/s 63.0999 KOps/s $\color{#35bf28}+0.66\%$
test_step_mdp_speed[True-True-True-False-False] 45.8560μs 9.1235μs 109.6068 KOps/s 108.9058 KOps/s $\color{#35bf28}+0.64\%$
test_step_mdp_speed[True-True-False-True-True] 65.3620μs 29.3174μs 34.1095 KOps/s 34.2529 KOps/s $\color{#d91a1a}-0.42\%$
test_step_mdp_speed[True-True-False-True-False] 70.5230μs 17.8611μs 55.9877 KOps/s 56.3932 KOps/s $\color{#d91a1a}-0.72\%$
test_step_mdp_speed[True-True-False-False-True] 45.1450μs 17.3837μs 57.5251 KOps/s 57.4883 KOps/s $\color{#35bf28}+0.06\%$
test_step_mdp_speed[True-True-False-False-False] 39.9840μs 10.9276μs 91.5117 KOps/s 92.7220 KOps/s $\color{#d91a1a}-1.31\%$
test_step_mdp_speed[True-False-True-True-True] 64.4210μs 31.1240μs 32.1295 KOps/s 32.3534 KOps/s $\color{#d91a1a}-0.69\%$
test_step_mdp_speed[True-False-True-True-False] 60.0830μs 19.4965μs 51.2914 KOps/s 51.4379 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[True-False-True-False-True] 65.6700μs 17.0496μs 58.6526 KOps/s 57.7846 KOps/s $\color{#35bf28}+1.50\%$
test_step_mdp_speed[True-False-True-False-False] 89.4820μs 10.5888μs 94.4391 KOps/s 92.5334 KOps/s $\color{#35bf28}+2.06\%$
test_step_mdp_speed[True-False-False-True-True] 0.1155ms 31.7954μs 31.4510 KOps/s 30.6228 KOps/s $\color{#35bf28}+2.70\%$
test_step_mdp_speed[True-False-False-True-False] 59.6420μs 21.0462μs 47.5146 KOps/s 46.9445 KOps/s $\color{#35bf28}+1.21\%$
test_step_mdp_speed[True-False-False-False-True] 74.7000μs 18.5173μs 54.0036 KOps/s 52.9793 KOps/s $\color{#35bf28}+1.93\%$
test_step_mdp_speed[True-False-False-False-False] 43.9020μs 12.3422μs 81.0229 KOps/s 80.3797 KOps/s $\color{#35bf28}+0.80\%$
test_step_mdp_speed[False-True-True-True-True] 72.5060μs 30.4657μs 32.8237 KOps/s 32.3866 KOps/s $\color{#35bf28}+1.35\%$
test_step_mdp_speed[False-True-True-True-False] 66.5150μs 19.2958μs 51.8247 KOps/s 50.5634 KOps/s $\color{#35bf28}+2.49\%$
test_step_mdp_speed[False-True-True-False-True] 44.9140μs 19.5746μs 51.0866 KOps/s 49.9277 KOps/s $\color{#35bf28}+2.32\%$
test_step_mdp_speed[False-True-True-False-False] 56.8970μs 11.9853μs 83.4354 KOps/s 82.8367 KOps/s $\color{#35bf28}+0.72\%$
test_step_mdp_speed[False-True-False-True-True] 94.2830μs 32.1256μs 31.1278 KOps/s 31.0424 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[False-True-False-True-False] 68.2380μs 20.5405μs 48.6844 KOps/s 47.6369 KOps/s $\color{#35bf28}+2.20\%$
test_step_mdp_speed[False-True-False-False-True] 2.9663ms 21.2183μs 47.1290 KOps/s 47.1220 KOps/s $\color{#35bf28}+0.01\%$
test_step_mdp_speed[False-True-False-False-False] 37.2900μs 13.4609μs 74.2894 KOps/s 73.9978 KOps/s $\color{#35bf28}+0.39\%$
test_step_mdp_speed[False-False-True-True-True] 0.1089ms 33.4921μs 29.8578 KOps/s 29.1282 KOps/s $\color{#35bf28}+2.50\%$
test_step_mdp_speed[False-False-True-True-False] 51.7670μs 22.2224μs 44.9997 KOps/s 44.1199 KOps/s $\color{#35bf28}+1.99\%$
test_step_mdp_speed[False-False-True-False-True] 70.5220μs 21.1792μs 47.2162 KOps/s 47.1813 KOps/s $\color{#35bf28}+0.07\%$
test_step_mdp_speed[False-False-True-False-False] 55.6730μs 13.7357μs 72.8032 KOps/s 73.2321 KOps/s $\color{#d91a1a}-0.59\%$
test_step_mdp_speed[False-False-False-True-True] 73.4270μs 35.0648μs 28.5186 KOps/s 28.3526 KOps/s $\color{#35bf28}+0.59\%$
test_step_mdp_speed[False-False-False-True-False] 61.7050μs 23.9112μs 41.8214 KOps/s 41.3929 KOps/s $\color{#35bf28}+1.03\%$
test_step_mdp_speed[False-False-False-False-True] 71.7630μs 22.6421μs 44.1655 KOps/s 44.2676 KOps/s $\color{#d91a1a}-0.23\%$
test_step_mdp_speed[False-False-False-False-False] 56.1550μs 14.9460μs 66.9073 KOps/s 66.4315 KOps/s $\color{#35bf28}+0.72\%$
test_values[generalized_advantage_estimate-True-True] 9.8607ms 9.6096ms 104.0623 Ops/s 105.7369 Ops/s $\color{#d91a1a}-1.58\%$
test_values[vec_generalized_advantage_estimate-True-True] 40.8833ms 33.7372ms 29.6409 Ops/s 29.7264 Ops/s $\color{#d91a1a}-0.29\%$
test_values[td0_return_estimate-False-False] 0.2974ms 0.1966ms 5.0869 KOps/s 5.4021 KOps/s $\textbf{\color{#d91a1a}-5.83\%}$
test_values[td1_return_estimate-False-False] 28.2774ms 23.8430ms 41.9411 Ops/s 42.2612 Ops/s $\color{#d91a1a}-0.76\%$
test_values[vec_td1_return_estimate-False-False] 36.0662ms 33.6919ms 29.6807 Ops/s 29.6962 Ops/s $\color{#d91a1a}-0.05\%$
test_values[td_lambda_return_estimate-True-False] 38.5779ms 34.2574ms 29.1908 Ops/s 29.1735 Ops/s $\color{#35bf28}+0.06\%$
test_values[vec_td_lambda_return_estimate-True-False] 36.2733ms 33.6457ms 29.7214 Ops/s 29.7576 Ops/s $\color{#d91a1a}-0.12\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 11.2534ms 8.2122ms 121.7707 Ops/s 123.9265 Ops/s $\color{#d91a1a}-1.74\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.9119ms 1.9984ms 500.3962 Ops/s 493.3868 Ops/s $\color{#35bf28}+1.42\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.6056ms 0.3528ms 2.8345 KOps/s 2.7475 KOps/s $\color{#35bf28}+3.17\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 49.9880ms 47.2338ms 21.1713 Ops/s 21.9530 Ops/s $\color{#d91a1a}-3.56\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.1643ms 3.0466ms 328.2368 Ops/s 323.6527 Ops/s $\color{#35bf28}+1.42\%$
test_dqn_speed[False-None] 6.6414ms 1.3337ms 749.8172 Ops/s 737.5501 Ops/s $\color{#35bf28}+1.66\%$
test_dqn_speed[False-backward] 2.0865ms 1.8261ms 547.6239 Ops/s 544.1133 Ops/s $\color{#35bf28}+0.65\%$
test_dqn_speed[True-None] 0.7301ms 0.4655ms 2.1482 KOps/s 2.1102 KOps/s $\color{#35bf28}+1.80\%$
test_dqn_speed[True-backward] 0.9275ms 0.8795ms 1.1370 KOps/s 1.0667 KOps/s $\textbf{\color{#35bf28}+6.59\%}$
test_dqn_speed[reduce-overhead-None] 0.7965ms 0.4724ms 2.1168 KOps/s 2.1263 KOps/s $\color{#d91a1a}-0.45\%$
test_dqn_speed[reduce-overhead-backward] 0.9906ms 0.8942ms 1.1183 KOps/s 1.1013 KOps/s $\color{#35bf28}+1.55\%$
test_ddpg_speed[False-None] 3.8792ms 2.7893ms 358.5189 Ops/s 355.2518 Ops/s $\color{#35bf28}+0.92\%$
test_ddpg_speed[False-backward] 4.2918ms 3.9373ms 253.9781 Ops/s 249.7734 Ops/s $\color{#35bf28}+1.68\%$
test_ddpg_speed[True-None] 1.4169ms 1.0183ms 982.0388 Ops/s 964.6577 Ops/s $\color{#35bf28}+1.80\%$
test_ddpg_speed[True-backward] 2.1015ms 1.9272ms 518.8794 Ops/s 506.6928 Ops/s $\color{#35bf28}+2.41\%$
test_ddpg_speed[reduce-overhead-None] 1.3767ms 1.0128ms 987.3852 Ops/s 960.8961 Ops/s $\color{#35bf28}+2.76\%$
test_ddpg_speed[reduce-overhead-backward] 2.6972ms 2.0175ms 495.6652 Ops/s 522.5030 Ops/s $\textbf{\color{#d91a1a}-5.14\%}$
test_sac_speed[False-None] 9.2296ms 7.9866ms 125.2096 Ops/s 126.8872 Ops/s $\color{#d91a1a}-1.32\%$
test_sac_speed[False-backward] 11.4705ms 10.8247ms 92.3817 Ops/s 60.9145 Ops/s $\textbf{\color{#35bf28}+51.66\%}$
test_sac_speed[True-None] 2.5026ms 1.8581ms 538.1777 Ops/s 530.8996 Ops/s $\color{#35bf28}+1.37\%$
test_sac_speed[True-backward] 3.6386ms 3.5450ms 282.0846 Ops/s 260.5192 Ops/s $\textbf{\color{#35bf28}+8.28\%}$
test_sac_speed[reduce-overhead-None] 2.1387ms 1.8619ms 537.0973 Ops/s 496.2213 Ops/s $\textbf{\color{#35bf28}+8.24\%}$
test_sac_speed[reduce-overhead-backward] 3.8586ms 3.6413ms 274.6269 Ops/s 267.4162 Ops/s $\color{#35bf28}+2.70\%$
test_redq_speed[False-None] 14.6753ms 12.9433ms 77.2600 Ops/s 74.3978 Ops/s $\color{#35bf28}+3.85\%$
test_redq_speed[False-backward] 23.0396ms 22.2698ms 44.9039 Ops/s 43.1314 Ops/s $\color{#35bf28}+4.11\%$
test_redq_speed[True-None] 5.7403ms 5.2246ms 191.4033 Ops/s 175.4704 Ops/s $\textbf{\color{#35bf28}+9.08\%}$
test_redq_speed[True-backward] 13.1201ms 12.5243ms 79.8447 Ops/s 75.9470 Ops/s $\textbf{\color{#35bf28}+5.13\%}$
test_redq_speed[reduce-overhead-None] 5.9871ms 5.2797ms 189.4059 Ops/s 190.0225 Ops/s $\color{#d91a1a}-0.32\%$
test_redq_speed[reduce-overhead-backward] 14.2640ms 12.8968ms 77.5386 Ops/s 77.2719 Ops/s $\color{#35bf28}+0.35\%$
test_redq_deprec_speed[False-None] 14.3173ms 13.2788ms 75.3080 Ops/s 45.5774 Ops/s $\textbf{\color{#35bf28}+65.23\%}$
test_redq_deprec_speed[False-backward] 22.0303ms 19.3323ms 51.7268 Ops/s 50.8949 Ops/s $\color{#35bf28}+1.63\%$
test_redq_deprec_speed[True-None] 5.4187ms 3.9845ms 250.9743 Ops/s 240.7709 Ops/s $\color{#35bf28}+4.24\%$
test_redq_deprec_speed[True-backward] 9.8494ms 8.9676ms 111.5120 Ops/s 120.6749 Ops/s $\textbf{\color{#d91a1a}-7.59\%}$
test_redq_deprec_speed[reduce-overhead-None] 4.7553ms 3.8716ms 258.2939 Ops/s 255.0873 Ops/s $\color{#35bf28}+1.26\%$
test_redq_deprec_speed[reduce-overhead-backward] 10.0386ms 8.6996ms 114.9475 Ops/s 111.2667 Ops/s $\color{#35bf28}+3.31\%$
test_td3_speed[False-None] 8.6027ms 7.9179ms 126.2956 Ops/s 121.3602 Ops/s $\color{#35bf28}+4.07\%$
test_td3_speed[False-backward] 11.0774ms 10.5071ms 95.1734 Ops/s 95.1130 Ops/s $\color{#35bf28}+0.06\%$
test_td3_speed[True-None] 2.0864ms 1.9503ms 512.7535 Ops/s 497.8973 Ops/s $\color{#35bf28}+2.98\%$
test_td3_speed[True-backward] 3.8818ms 3.7453ms 266.9997 Ops/s 246.1958 Ops/s $\textbf{\color{#35bf28}+8.45\%}$
test_td3_speed[reduce-overhead-None] 2.2374ms 1.9680ms 508.1257 Ops/s 468.3960 Ops/s $\textbf{\color{#35bf28}+8.48\%}$
test_td3_speed[reduce-overhead-backward] 3.8592ms 3.7279ms 268.2491 Ops/s 262.5343 Ops/s $\color{#35bf28}+2.18\%$
test_cql_speed[False-None] 37.5283ms 35.6363ms 28.0613 Ops/s 26.8793 Ops/s $\color{#35bf28}+4.40\%$
test_cql_speed[False-backward] 52.0678ms 46.4329ms 21.5365 Ops/s 21.0817 Ops/s $\color{#35bf28}+2.16\%$
test_cql_speed[True-None] 17.7983ms 16.0697ms 62.2289 Ops/s 60.3446 Ops/s $\color{#35bf28}+3.12\%$
test_cql_speed[True-backward] 24.4053ms 23.0678ms 43.3505 Ops/s 41.6735 Ops/s $\color{#35bf28}+4.02\%$
test_cql_speed[reduce-overhead-None] 17.7140ms 16.1128ms 62.0623 Ops/s 62.0395 Ops/s $\color{#35bf28}+0.04\%$
test_cql_speed[reduce-overhead-backward] 24.0092ms 23.1428ms 43.2100 Ops/s 42.8968 Ops/s $\color{#35bf28}+0.73\%$
test_a2c_speed[False-None] 7.8593ms 7.4754ms 133.7717 Ops/s 139.5581 Ops/s $\color{#d91a1a}-4.15\%$
test_a2c_speed[False-backward] 17.4056ms 15.3298ms 65.2322 Ops/s 69.7285 Ops/s $\textbf{\color{#d91a1a}-6.45\%}$
test_a2c_speed[True-None] 3.8113ms 3.4055ms 293.6464 Ops/s 290.6806 Ops/s $\color{#35bf28}+1.02\%$
test_a2c_speed[True-backward] 10.5353ms 10.0541ms 99.4623 Ops/s 94.5424 Ops/s $\textbf{\color{#35bf28}+5.20\%}$
test_a2c_speed[reduce-overhead-None] 3.7175ms 3.2980ms 303.2102 Ops/s 292.0218 Ops/s $\color{#35bf28}+3.83\%$
test_a2c_speed[reduce-overhead-backward] 11.1062ms 10.5919ms 94.4119 Ops/s 100.1214 Ops/s $\textbf{\color{#d91a1a}-5.70\%}$
test_ppo_speed[False-None] 8.4333ms 7.7350ms 129.2828 Ops/s 130.6955 Ops/s $\color{#d91a1a}-1.08\%$
test_ppo_speed[False-backward] 17.6791ms 15.5176ms 64.4431 Ops/s 66.6170 Ops/s $\color{#d91a1a}-3.26\%$
test_ppo_speed[True-None] 4.5069ms 3.8091ms 262.5312 Ops/s 252.1929 Ops/s $\color{#35bf28}+4.10\%$
test_ppo_speed[True-backward] 11.0610ms 10.3180ms 96.9182 Ops/s 94.3545 Ops/s $\color{#35bf28}+2.72\%$
test_ppo_speed[reduce-overhead-None] 4.2127ms 3.7531ms 266.4489 Ops/s 263.3438 Ops/s $\color{#35bf28}+1.18\%$
test_ppo_speed[reduce-overhead-backward] 10.5408ms 10.0017ms 99.9828 Ops/s 101.1717 Ops/s $\color{#d91a1a}-1.18\%$
test_reinforce_speed[False-None] 7.9061ms 6.4809ms 154.3004 Ops/s 151.8357 Ops/s $\color{#35bf28}+1.62\%$
test_reinforce_speed[False-backward] 11.5482ms 9.8377ms 101.6501 Ops/s 97.0406 Ops/s $\color{#35bf28}+4.75\%$
test_reinforce_speed[True-None] 3.3848ms 2.7439ms 364.4501 Ops/s 372.8133 Ops/s $\color{#d91a1a}-2.24\%$
test_reinforce_speed[True-backward] 10.3145ms 9.1783ms 108.9526 Ops/s 109.6341 Ops/s $\color{#d91a1a}-0.62\%$
test_reinforce_speed[reduce-overhead-None] 2.9133ms 2.7075ms 369.3463 Ops/s 353.5810 Ops/s $\color{#35bf28}+4.46\%$
test_reinforce_speed[reduce-overhead-backward] 10.3584ms 9.2158ms 108.5098 Ops/s 108.7958 Ops/s $\color{#d91a1a}-0.26\%$
test_iql_speed[False-None] 33.5876ms 32.6098ms 30.6656 Ops/s 30.3469 Ops/s $\color{#35bf28}+1.05\%$
test_iql_speed[False-backward] 47.9152ms 45.4281ms 22.0128 Ops/s 21.8255 Ops/s $\color{#35bf28}+0.86\%$
test_iql_speed[True-None] 15.0918ms 13.7061ms 72.9600 Ops/s 72.4474 Ops/s $\color{#35bf28}+0.71\%$
test_iql_speed[True-backward] 26.2527ms 25.0797ms 39.8729 Ops/s 38.8687 Ops/s $\color{#35bf28}+2.58\%$
test_iql_speed[reduce-overhead-None] 14.5707ms 13.7318ms 72.8239 Ops/s 72.7618 Ops/s $\color{#35bf28}+0.09\%$
test_iql_speed[reduce-overhead-backward] 26.5737ms 25.3092ms 39.5114 Ops/s 39.0459 Ops/s $\color{#35bf28}+1.19\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.6188ms 5.2804ms 189.3809 Ops/s 183.2150 Ops/s $\color{#35bf28}+3.37\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.2594ms 0.4888ms 2.0457 KOps/s 1.9917 KOps/s $\color{#35bf28}+2.71\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7180ms 0.4656ms 2.1477 KOps/s 2.0948 KOps/s $\color{#35bf28}+2.53\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 8.3695ms 5.2538ms 190.3401 Ops/s 185.2046 Ops/s $\color{#35bf28}+2.77\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.6768ms 0.4894ms 2.0434 KOps/s 2.0433 KOps/s $+0.00\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7006ms 0.4642ms 2.1541 KOps/s 2.1047 KOps/s $\color{#35bf28}+2.35\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.2913ms 1.6130ms 619.9446 Ops/s 607.1086 Ops/s $\color{#35bf28}+2.11\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.7418ms 1.5071ms 663.5443 Ops/s 647.5774 Ops/s $\color{#35bf28}+2.47\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.6485ms 5.3993ms 185.2075 Ops/s 181.1917 Ops/s $\color{#35bf28}+2.22\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2836ms 0.6253ms 1.5993 KOps/s 1.5800 KOps/s $\color{#35bf28}+1.22\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0520ms 0.6045ms 1.6543 KOps/s 1.6510 KOps/s $\color{#35bf28}+0.20\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2248ms 5.3996ms 185.1995 Ops/s 187.2273 Ops/s $\color{#d91a1a}-1.08\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.1467ms 0.4931ms 2.0281 KOps/s 508.1899 Ops/s $\textbf{\color{#35bf28}+299.08\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6995ms 0.4722ms 2.1177 KOps/s 2.1324 KOps/s $\color{#d91a1a}-0.69\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6742ms 5.2765ms 189.5210 Ops/s 181.3533 Ops/s $\color{#35bf28}+4.50\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.6779ms 0.4906ms 2.0382 KOps/s 2.0434 KOps/s $\color{#d91a1a}-0.26\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6692ms 0.4592ms 2.1775 KOps/s 2.1414 KOps/s $\color{#35bf28}+1.69\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2743ms 5.4341ms 184.0222 Ops/s 174.4546 Ops/s $\textbf{\color{#35bf28}+5.48\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8259ms 0.6199ms 1.6131 KOps/s 1.5786 KOps/s $\color{#35bf28}+2.19\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 8.1879ms 0.6041ms 1.6554 KOps/s 1.6165 KOps/s $\color{#35bf28}+2.41\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.6940ms 4.2338ms 236.1954 Ops/s 222.2298 Ops/s $\textbf{\color{#35bf28}+6.28\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 6.7830ms 2.2307ms 448.2893 Ops/s 73.2442 Ops/s $\textbf{\color{#35bf28}+512.05\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.0843ms 1.4486ms 690.3336 Ops/s 746.4565 Ops/s $\textbf{\color{#d91a1a}-7.52\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4192s 12.5593ms 79.6222 Ops/s 218.0941 Ops/s $\textbf{\color{#d91a1a}-63.49\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.6599ms 1.9970ms 500.7516 Ops/s 72.4558 Ops/s $\textbf{\color{#35bf28}+591.11\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.6342ms 1.4303ms 699.1760 Ops/s 691.8840 Ops/s $\color{#35bf28}+1.05\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.0054ms 4.4316ms 225.6509 Ops/s 214.7158 Ops/s $\textbf{\color{#35bf28}+5.09\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.6810ms 2.5706ms 389.0147 Ops/s 73.0467 Ops/s $\textbf{\color{#35bf28}+432.56\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2236ms 1.5080ms 663.1317 Ops/s 651.1291 Ops/s $\color{#35bf28}+1.84\%$
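
The columns in the table above appear to be related by simple arithmetic: Ops matches the reciprocal of Mean, and Change matches the relative difference between Ops and Ops on Repo HEAD. The sketch below reproduces a few rows under that reading. The function names are hypothetical and are not part of the benchmark tooling. The 5% threshold is an assumption, inferred from the fact that the bolded rows match the "Improved: 20" and "Worsened: 7" totals.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of this run's throughput against repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a row. The 5% threshold is an assumption inferred from the bolded rows."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "unchanged"


# (mean in seconds, Ops/s, Ops/s on repo HEAD), copied from the CPU table.
rows = {
    "test_single": (60.4461e-3, 16.5437, 16.0026),
    "test_sync": (33.0874e-3, 30.2230, 28.6134),
    "test_values[td0_return_estimate-False-False]": (0.1966e-3, 5086.9, 5402.1),
}

for name, (mean_s, ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(
        f"{name}: {ops_per_second(mean_s):.2f} Ops/s from mean, "
        f"{change:+.2f}% ({classify(change)})"
    )
```

Because Mean is rounded in the table, the throughput recomputed from it can differ from the reported Ops in the last digits.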


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}14$. Worsened: $\large\color{#d91a1a}3$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_single 0.1058s 0.1057s 9.4643 Ops/s 9.4589 Ops/s $\color{#35bf28}+0.06\%$
test_sync 94.7205ms 93.3160ms 10.7163 Ops/s 10.9585 Ops/s $\color{#d91a1a}-2.21\%$
test_async 0.1768s 88.7124ms 11.2724 Ops/s 11.2597 Ops/s $\color{#35bf28}+0.11\%$
test_single_pixels 0.1120s 0.1118s 8.9425 Ops/s 8.9288 Ops/s $\color{#35bf28}+0.15\%$
test_sync_pixels 73.4647ms 72.3289ms 13.8257 Ops/s 13.8134 Ops/s $\color{#35bf28}+0.09\%$
test_async_pixels 0.1398s 68.6603ms 14.5645 Ops/s 14.7354 Ops/s $\color{#d91a1a}-1.16\%$
test_simple 0.7792s 0.7713s 1.2965 Ops/s 1.2938 Ops/s $\color{#35bf28}+0.21\%$
test_transformed 1.0200s 1.0070s 0.9931 Ops/s 1.0184 Ops/s $\color{#d91a1a}-2.49\%$
test_serial 2.2260s 2.2057s 0.4534 Ops/s 0.4662 Ops/s $\color{#d91a1a}-2.74\%$
test_parallel 1.9417s 1.9040s 0.5252 Ops/s 0.5292 Ops/s $\color{#d91a1a}-0.75\%$
test_step_mdp_speed[True-True-True-True-True] 0.2349ms 38.5588μs 25.9344 KOps/s 26.4081 KOps/s $\color{#d91a1a}-1.79\%$
test_step_mdp_speed[True-True-True-True-False] 53.1610μs 22.0044μs 45.4455 KOps/s 45.4815 KOps/s $\color{#d91a1a}-0.08\%$
test_step_mdp_speed[True-True-True-False-True] 51.4810μs 21.9097μs 45.6420 KOps/s 45.9843 KOps/s $\color{#d91a1a}-0.74\%$
test_step_mdp_speed[True-True-True-False-False] 39.1410μs 12.6451μs 79.0819 KOps/s 79.7514 KOps/s $\color{#d91a1a}-0.84\%$
test_step_mdp_speed[True-True-False-True-True] 74.4610μs 41.1747μs 24.2868 KOps/s 24.2377 KOps/s $\color{#35bf28}+0.20\%$
test_step_mdp_speed[True-True-False-True-False] 52.0710μs 24.2325μs 41.2669 KOps/s 41.0840 KOps/s $\color{#35bf28}+0.45\%$
test_step_mdp_speed[True-True-False-False-True] 58.4910μs 23.7750μs 42.0610 KOps/s 41.5240 KOps/s $\color{#35bf28}+1.29\%$
test_step_mdp_speed[True-True-False-False-False] 46.7710μs 14.8049μs 67.5452 KOps/s 68.0427 KOps/s $\color{#d91a1a}-0.73\%$
test_step_mdp_speed[True-False-True-True-True] 0.1093ms 42.1491μs 23.7253 KOps/s 23.6347 KOps/s $\color{#35bf28}+0.38\%$
test_step_mdp_speed[True-False-True-True-False] 57.1710μs 26.2225μs 38.1352 KOps/s 37.8557 KOps/s $\color{#35bf28}+0.74\%$
test_step_mdp_speed[True-False-True-False-True] 52.0310μs 23.6995μs 42.1949 KOps/s 40.1140 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_step_mdp_speed[True-False-True-False-False] 46.3310μs 14.7024μs 68.0161 KOps/s 67.3845 KOps/s $\color{#35bf28}+0.94\%$
test_step_mdp_speed[True-False-False-True-True] 79.9620μs 45.2043μs 22.1218 KOps/s 22.1382 KOps/s $\color{#d91a1a}-0.07\%$
test_step_mdp_speed[True-False-False-True-False] 59.4010μs 28.4823μs 35.1096 KOps/s 35.1614 KOps/s $\color{#d91a1a}-0.15\%$
test_step_mdp_speed[True-False-False-False-True] 73.7210μs 25.6937μs 38.9201 KOps/s 38.1872 KOps/s $\color{#35bf28}+1.92\%$
test_step_mdp_speed[True-False-False-False-False] 63.1910μs 16.8645μs 59.2962 KOps/s 59.8320 KOps/s $\color{#d91a1a}-0.90\%$
test_step_mdp_speed[False-True-True-True-True] 76.5310μs 43.3487μs 23.0687 KOps/s 23.0283 KOps/s $\color{#35bf28}+0.18\%$
test_step_mdp_speed[False-True-True-True-False] 61.9510μs 26.6054μs 37.5864 KOps/s 38.0140 KOps/s $\color{#d91a1a}-1.12\%$
test_step_mdp_speed[False-True-True-False-True] 53.7410μs 27.6203μs 36.2052 KOps/s 37.0137 KOps/s $\color{#d91a1a}-2.18\%$
test_step_mdp_speed[False-True-True-False-False] 49.1410μs 16.3495μs 61.1641 KOps/s 61.7200 KOps/s $\color{#d91a1a}-0.90\%$
test_step_mdp_speed[False-True-False-True-True] 79.3010μs 45.1251μs 22.1606 KOps/s 22.2509 KOps/s $\color{#d91a1a}-0.41\%$
test_step_mdp_speed[False-True-False-True-False] 58.5910μs 28.8766μs 34.6302 KOps/s 35.2202 KOps/s $\color{#d91a1a}-1.68\%$
test_step_mdp_speed[False-True-False-False-True] 3.2989ms 29.7023μs 33.6674 KOps/s 33.7925 KOps/s $\color{#d91a1a}-0.37\%$
test_step_mdp_speed[False-True-False-False-False] 54.1710μs 18.5379μs 53.9435 KOps/s 54.4435 KOps/s $\color{#d91a1a}-0.92\%$
test_step_mdp_speed[False-False-True-True-True] 81.6110μs 47.5984μs 21.0091 KOps/s 21.3861 KOps/s $\color{#d91a1a}-1.76\%$
test_step_mdp_speed[False-False-True-True-False] 60.6410μs 30.6471μs 32.6295 KOps/s 32.2521 KOps/s $\color{#35bf28}+1.17\%$
test_step_mdp_speed[False-False-True-False-True] 62.0710μs 29.7171μs 33.6506 KOps/s 33.7224 KOps/s $\color{#d91a1a}-0.21\%$
test_step_mdp_speed[False-False-True-False-False] 53.8810μs 18.4740μs 54.1301 KOps/s 54.3580 KOps/s $\color{#d91a1a}-0.42\%$
test_step_mdp_speed[False-False-False-True-True] 87.2120μs 48.4482μs 20.6406 KOps/s 20.4544 KOps/s $\color{#35bf28}+0.91\%$
test_step_mdp_speed[False-False-False-True-False] 65.7510μs 32.7504μs 30.5340 KOps/s 31.8348 KOps/s $\color{#d91a1a}-4.09\%$
test_step_mdp_speed[False-False-False-False-True] 60.6010μs 30.6142μs 32.6646 KOps/s 32.5917 KOps/s $\color{#35bf28}+0.22\%$
test_step_mdp_speed[False-False-False-False-False] 60.4410μs 20.3379μs 49.1694 KOps/s 49.3954 KOps/s $\color{#d91a1a}-0.46\%$
test_values[generalized_advantage_estimate-True-True] 26.3038ms 25.4501ms 39.2926 Ops/s 40.3322 Ops/s $\color{#d91a1a}-2.58\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1046s 2.9840ms 335.1216 Ops/s 337.7771 Ops/s $\color{#d91a1a}-0.79\%$
test_values[td0_return_estimate-False-False] 95.7810μs 67.3388μs 14.8503 KOps/s 14.9872 KOps/s $\color{#d91a1a}-0.91\%$
test_values[td1_return_estimate-False-False] 58.6351ms 57.2963ms 17.4531 Ops/s 17.8871 Ops/s $\color{#d91a1a}-2.43\%$
test_values[vec_td1_return_estimate-False-False] 1.4168ms 1.0863ms 920.5808 Ops/s 926.0453 Ops/s $\color{#d91a1a}-0.59\%$
test_values[td_lambda_return_estimate-True-False] 93.9047ms 91.8914ms 10.8824 Ops/s 11.3744 Ops/s $\color{#d91a1a}-4.32\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3990ms 1.0770ms 928.4691 Ops/s 928.3836 Ops/s $+0.01\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 26.5390ms 25.9842ms 38.4849 Ops/s 38.0773 Ops/s $\color{#35bf28}+1.07\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.9888ms 0.7214ms 1.3862 KOps/s 1.3900 KOps/s $\color{#d91a1a}-0.27\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7424ms 0.6729ms 1.4861 KOps/s 1.5107 KOps/s $\color{#d91a1a}-1.63\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5168ms 1.4684ms 681.0077 Ops/s 681.0648 Ops/s $-0.01\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7634ms 0.7044ms 1.4196 KOps/s 1.4748 KOps/s $\color{#d91a1a}-3.75\%$
test_dqn_speed[False-None] 6.7898ms 1.3604ms 735.0605 Ops/s 736.6318 Ops/s $\color{#d91a1a}-0.21\%$
test_dqn_speed[False-backward] 1.9483ms 1.9000ms 526.3134 Ops/s 540.8930 Ops/s $\color{#d91a1a}-2.70\%$
test_dqn_speed[True-None] 0.9134ms 0.5467ms 1.8293 KOps/s 1.7516 KOps/s $\color{#35bf28}+4.43\%$
test_dqn_speed[True-backward] 1.0400ms 0.9920ms 1.0081 KOps/s 883.1417 Ops/s $\textbf{\color{#35bf28}+14.15\%}$
test_dqn_speed[reduce-overhead-None] 0.9340ms 0.5534ms 1.8071 KOps/s 1.7002 KOps/s $\textbf{\color{#35bf28}+6.29\%}$
test_dqn_speed[reduce-overhead-backward] 1.0380ms 1.0007ms 999.3137 Ops/s 952.2256 Ops/s $\color{#35bf28}+4.95\%$
test_ddpg_speed[False-None] 3.3939ms 2.7688ms 361.1722 Ops/s 359.8548 Ops/s $\color{#35bf28}+0.37\%$
test_ddpg_speed[False-backward] 4.2301ms 3.9801ms 251.2496 Ops/s 249.1090 Ops/s $\color{#35bf28}+0.86\%$
test_ddpg_speed[True-None] 1.3727ms 1.2339ms 810.4092 Ops/s 775.5148 Ops/s $\color{#35bf28}+4.50\%$
test_ddpg_speed[True-backward] 2.2506ms 2.2019ms 454.1475 Ops/s 373.9830 Ops/s $\textbf{\color{#35bf28}+21.44\%}$
test_ddpg_speed[reduce-overhead-None] 1.6172ms 1.2348ms 809.8593 Ops/s 764.4344 Ops/s $\textbf{\color{#35bf28}+5.94\%}$
test_ddpg_speed[reduce-overhead-backward] 2.3008ms 2.2378ms 446.8757 Ops/s 437.6552 Ops/s $\color{#35bf28}+2.11\%$
test_sac_speed[False-None] 8.6939ms 7.7062ms 129.7662 Ops/s 130.7771 Ops/s $\color{#d91a1a}-0.77\%$
test_sac_speed[False-backward] 11.3707ms 10.9227ms 91.5523 Ops/s 92.7594 Ops/s $\color{#d91a1a}-1.30\%$
test_sac_speed[True-None] 2.4097ms 2.0064ms 498.4011 Ops/s 484.1535 Ops/s $\color{#35bf28}+2.94\%$
test_sac_speed[True-backward] 4.0456ms 3.9468ms 253.3719 Ops/s 246.0425 Ops/s $\color{#35bf28}+2.98\%$
test_sac_speed[reduce-overhead-None] 2.1262ms 2.0156ms 496.1255 Ops/s 482.3424 Ops/s $\color{#35bf28}+2.86\%$
test_sac_speed[reduce-overhead-backward] 4.1103ms 3.9694ms 251.9288 Ops/s 245.9852 Ops/s $\color{#35bf28}+2.42\%$
test_redq_speed[False-None] 11.7566ms 10.2121ms 97.9232 Ops/s 97.4080 Ops/s $\color{#35bf28}+0.53\%$
test_redq_speed[False-backward] 18.2784ms 17.3532ms 57.6263 Ops/s 56.5113 Ops/s $\color{#35bf28}+1.97\%$
test_redq_speed[True-None] 3.7659ms 3.3986ms 294.2393 Ops/s 297.5266 Ops/s $\color{#d91a1a}-1.10\%$
test_redq_speed[True-backward] 8.5819ms 8.3045ms 120.4172 Ops/s 124.1915 Ops/s $\color{#d91a1a}-3.04\%$
test_redq_speed[reduce-overhead-None] 3.5808ms 3.3941ms 294.6289 Ops/s 295.5496 Ops/s $\color{#d91a1a}-0.31\%$
test_redq_speed[reduce-overhead-backward] 8.5484ms 8.1975ms 121.9885 Ops/s 123.0827 Ops/s $\color{#d91a1a}-0.89\%$
test_redq_deprec_speed[False-None] 11.0013ms 10.5061ms 95.1829 Ops/s 96.5726 Ops/s $\color{#d91a1a}-1.44\%$
test_redq_deprec_speed[False-backward] 15.8084ms 15.1826ms 65.8649 Ops/s 66.7453 Ops/s $\color{#d91a1a}-1.32\%$
test_redq_deprec_speed[True-None] 3.4524ms 3.1205ms 320.4564 Ops/s 318.3941 Ops/s $\color{#35bf28}+0.65\%$
test_redq_deprec_speed[True-backward] 6.9278ms 6.7051ms 149.1397 Ops/s 150.5809 Ops/s $\color{#d91a1a}-0.96\%$
test_redq_deprec_speed[reduce-overhead-None] 3.5762ms 3.1260ms 319.9021 Ops/s 313.8118 Ops/s $\color{#35bf28}+1.94\%$
test_redq_deprec_speed[reduce-overhead-backward] 7.2176ms 6.8488ms 146.0108 Ops/s 151.9092 Ops/s $\color{#d91a1a}-3.88\%$
test_td3_speed[False-None] 32.9821ms 7.8647ms 127.1497 Ops/s 130.4560 Ops/s $\color{#d91a1a}-2.53\%$
test_td3_speed[False-backward] 10.8160ms 10.4730ms 95.4832 Ops/s 95.8739 Ops/s $\color{#d91a1a}-0.41\%$
test_td3_speed[True-None] 2.0672ms 2.0117ms 497.1003 Ops/s 465.9698 Ops/s $\textbf{\color{#35bf28}+6.68\%}$
test_td3_speed[True-backward] 4.0018ms 3.8938ms 256.8190 Ops/s 252.2077 Ops/s $\color{#35bf28}+1.83\%$
test_td3_speed[reduce-overhead-None] 2.3564ms 2.0628ms 484.7802 Ops/s 483.1277 Ops/s $\color{#35bf28}+0.34\%$
test_td3_speed[reduce-overhead-backward] 3.8984ms 3.8161ms 262.0470 Ops/s 254.2534 Ops/s $\color{#35bf28}+3.07\%$
test_cql_speed[False-None] 31.4290ms 25.0487ms 39.9223 Ops/s 40.8370 Ops/s $\color{#d91a1a}-2.24\%$
test_cql_speed[False-backward] 37.6981ms 34.3053ms 29.1500 Ops/s 29.6627 Ops/s $\color{#d91a1a}-1.73\%$
test_cql_speed[True-None] 11.3668ms 10.6845ms 93.5934 Ops/s 93.9304 Ops/s $\color{#d91a1a}-0.36\%$
test_cql_speed[True-backward] 16.7445ms 16.2876ms 61.3965 Ops/s 61.0747 Ops/s $\color{#35bf28}+0.53\%$
test_cql_speed[reduce-overhead-None] 11.1712ms 10.8089ms 92.5161 Ops/s 94.8164 Ops/s $\color{#d91a1a}-2.43\%$
test_cql_speed[reduce-overhead-backward] 16.6293ms 16.3480ms 61.1696 Ops/s 61.2213 Ops/s $\color{#d91a1a}-0.08\%$
test_a2c_speed[False-None] 5.4838ms 5.1486ms 194.2292 Ops/s 186.9068 Ops/s $\color{#35bf28}+3.92\%$
test_a2c_speed[False-backward] 11.8212ms 11.4168ms 87.5905 Ops/s 86.8246 Ops/s $\color{#35bf28}+0.88\%$
test_a2c_speed[True-None] 3.5290ms 3.0874ms 323.8975 Ops/s 325.5773 Ops/s $\color{#d91a1a}-0.52\%$
test_a2c_speed[True-backward] 8.5918ms 8.4262ms 118.6775 Ops/s 118.2984 Ops/s $\color{#35bf28}+0.32\%$
test_a2c_speed[reduce-overhead-None] 3.2534ms 3.0146ms 331.7163 Ops/s 328.3840 Ops/s $\color{#35bf28}+1.01\%$
test_a2c_speed[reduce-overhead-backward] 8.5499ms 8.3793ms 119.3415 Ops/s 118.4857 Ops/s $\color{#35bf28}+0.72\%$
test_ppo_speed[False-None] 5.9445ms 5.5520ms 180.1138 Ops/s 174.2839 Ops/s $\color{#35bf28}+3.35\%$
test_ppo_speed[False-backward] 15.0181ms 12.3482ms 80.9835 Ops/s 83.1308 Ops/s $\color{#d91a1a}-2.58\%$
test_ppo_speed[True-None] 3.6673ms 3.4462ms 290.1734 Ops/s 285.9335 Ops/s $\color{#35bf28}+1.48\%$
test_ppo_speed[True-backward] 8.2584ms 8.0819ms 123.7340 Ops/s 117.9080 Ops/s $\color{#35bf28}+4.94\%$
test_ppo_speed[reduce-overhead-None] 3.6061ms 3.4430ms 290.4433 Ops/s 293.4028 Ops/s $\color{#d91a1a}-1.01\%$
test_ppo_speed[reduce-overhead-backward] 8.4266ms 8.1638ms 122.4926 Ops/s 122.3499 Ops/s $\color{#35bf28}+0.12\%$
test_reinforce_speed[False-None] 4.7369ms 4.4527ms 224.5807 Ops/s 223.3770 Ops/s $\color{#35bf28}+0.54\%$
test_reinforce_speed[False-backward] 7.3273ms 7.1309ms 140.2352 Ops/s 140.4395 Ops/s $\color{#d91a1a}-0.15\%$
test_reinforce_speed[True-None] 2.5992ms 2.2185ms 450.7522 Ops/s 444.0628 Ops/s $\color{#35bf28}+1.51\%$
test_reinforce_speed[True-backward] 7.7026ms 7.0598ms 141.6470 Ops/s 130.0507 Ops/s $\textbf{\color{#35bf28}+8.92\%}$
test_reinforce_speed[reduce-overhead-None] 2.5985ms 2.2339ms 447.6450 Ops/s 446.2447 Ops/s $\color{#35bf28}+0.31\%$
test_reinforce_speed[reduce-overhead-backward] 7.4771ms 6.9520ms 143.8445 Ops/s 143.0351 Ops/s $\color{#35bf28}+0.57\%$
test_iql_speed[False-None] 21.0070ms 18.8405ms 53.0770 Ops/s 50.9260 Ops/s $\color{#35bf28}+4.22\%$
test_iql_speed[False-backward] 30.7228ms 28.8049ms 34.7163 Ops/s 33.8924 Ops/s $\color{#35bf28}+2.43\%$
test_iql_speed[True-None] 8.0483ms 7.7844ms 128.4624 Ops/s 127.0183 Ops/s $\color{#35bf28}+1.14\%$
test_iql_speed[True-backward] 16.8498ms 16.2124ms 61.6811 Ops/s 60.9704 Ops/s $\color{#35bf28}+1.17\%$
test_iql_speed[reduce-overhead-None] 8.1341ms 7.8139ms 127.9774 Ops/s 128.5177 Ops/s $\color{#d91a1a}-0.42\%$
test_iql_speed[reduce-overhead-backward] 16.6274ms 16.2864ms 61.4010 Ops/s 61.6671 Ops/s $\color{#d91a1a}-0.43\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.2444ms 6.9658ms 143.5595 Ops/s 144.1507 Ops/s $\color{#d91a1a}-0.41\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.0521ms 0.2419ms 4.1334 KOps/s 2.8827 KOps/s $\textbf{\color{#35bf28}+43.39\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6262ms 0.2191ms 4.5634 KOps/s 3.0793 KOps/s $\textbf{\color{#35bf28}+48.19\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.1305ms 6.8469ms 146.0506 Ops/s 147.0603 Ops/s $\color{#d91a1a}-0.69\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0266ms 0.2370ms 4.2198 KOps/s 4.2799 KOps/s $\color{#d91a1a}-1.40\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4749ms 0.2136ms 4.6823 KOps/s 4.7616 KOps/s $\color{#d91a1a}-1.67\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6081ms 1.2496ms 800.2534 Ops/s 811.5581 Ops/s $\color{#d91a1a}-1.39\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.3539ms 1.1527ms 867.5523 Ops/s 880.0374 Ops/s $\color{#d91a1a}-1.42\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.1265ms 7.0024ms 142.8087 Ops/s 142.8874 Ops/s $\color{#d91a1a}-0.06\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2794ms 0.4655ms 2.1484 KOps/s 2.0532 KOps/s $\color{#35bf28}+4.64\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6036ms 0.4452ms 2.2463 KOps/s 2.1862 KOps/s $\color{#35bf28}+2.75\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.1544ms 6.9086ms 144.7466 Ops/s 145.9234 Ops/s $\color{#d91a1a}-0.81\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.4471s 0.7389ms 1.3534 KOps/s 3.2217 KOps/s $\textbf{\color{#d91a1a}-57.99\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4103ms 0.2166ms 4.6170 KOps/s 3.3485 KOps/s $\textbf{\color{#35bf28}+37.88\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.1877ms 6.9090ms 144.7393 Ops/s 148.1181 Ops/s $\color{#d91a1a}-2.28\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6678ms 0.2345ms 4.2649 KOps/s 4.0724 KOps/s $\color{#35bf28}+4.72\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4245ms 0.2117ms 4.7238 KOps/s 3.9948 KOps/s $\textbf{\color{#35bf28}+18.25\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.2575ms 7.1115ms 140.6170 Ops/s 140.1217 Ops/s $\color{#35bf28}+0.35\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.3364ms 0.3935ms 2.5416 KOps/s 2.6371 KOps/s $\color{#d91a1a}-3.62\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6313ms 0.3674ms 2.7217 KOps/s 2.8246 KOps/s $\color{#d91a1a}-3.64\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.0172ms 5.4956ms 181.9650 Ops/s 180.0292 Ops/s $\color{#35bf28}+1.08\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 7.4508ms 1.6577ms 603.2388 Ops/s 59.7952 Ops/s $\textbf{\color{#35bf28}+908.84\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 12.4340ms 1.3493ms 741.1191 Ops/s 833.2600 Ops/s $\textbf{\color{#d91a1a}-11.06\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.9602ms 5.4786ms 182.5281 Ops/s 178.2295 Ops/s $\color{#35bf28}+2.41\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.8853ms 1.7729ms 564.0360 Ops/s 59.7147 Ops/s $\textbf{\color{#35bf28}+844.55\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.8401ms 1.2311ms 812.3107 Ops/s 825.6310 Ops/s $\color{#d91a1a}-1.61\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.3527s 12.6519ms 79.0396 Ops/s 174.3988 Ops/s $\textbf{\color{#d91a1a}-54.68\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 1.6920ms 1.5253ms 655.5990 Ops/s 58.8905 Ops/s $\textbf{\color{#35bf28}+1013.25\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 3.4062ms 1.3244ms 755.0446 Ops/s 776.7933 Ops/s $\color{#d91a1a}-2.80\%$
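
The last column of each row appears to be the relative change between the two throughput columns. Below is a minimal sketch of that calculation, assuming the first Ops/s value is this PR's result and the second is the baseline (the column headers are not repeated in this part of the table). The helper names `parse_ops` and `percent_change` are hypothetical and not part of the benchmark tooling.

```python
# Hypothetical helpers for re-deriving the "change" column from a table row.
# Assumption: throughput strings use only the units seen above (Ops/s, KOps/s).
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(text: str) -> float:
    """Convert a string such as '4.1334 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def percent_change(current: str, baseline: str) -> float:
    """Relative throughput change of `current` over `baseline`, in percent."""
    return (parse_ops(current) / parse_ops(baseline) - 1.0) * 100.0


# Values copied from the LazyMemmapStorage-RandomSampler-10000 sample row.
change = percent_change("4.1334 KOps/s", "2.8827 KOps/s")
print(f"{change:+.2f}%")
```

Under this reading, a positive value means the PR is faster than the baseline. The very large swings in the `test_rb_populate` LazyMemmapStorage rows arise because the baseline throughput is roughly ten times lower than the PR's.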
