[Quality,BE] Better doc for step_mdp #2639

Merged (4 commits) on Dec 12, 2024

vmoens (Contributor) commented Dec 6, 2024

[ghstack-poisoned]

pytorch-bot bot commented Dec 6, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2639

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 19 Unrelated Failures

As of commit 5258b8f with merge base 4bc40a8:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions bot commented Dec 6, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}9$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_simple 0.4355s 0.4317s 2.3165 Ops/s 2.2219 Ops/s $\color{#35bf28}+4.26\%$
test_transformed 0.7052s 0.6310s 1.5849 Ops/s 1.5823 Ops/s $\color{#35bf28}+0.16\%$
test_serial 1.3568s 1.3519s 0.7397 Ops/s 0.7267 Ops/s $\color{#35bf28}+1.79\%$
test_parallel 1.2904s 1.2850s 0.7782 Ops/s 0.7579 Ops/s $\color{#35bf28}+2.68\%$
test_step_mdp_speed[True-True-True-True-True] 0.2142ms 29.5410μs 33.8512 KOps/s 33.4186 KOps/s $\color{#35bf28}+1.29\%$
test_step_mdp_speed[True-True-True-True-False] 48.6500μs 17.5296μs 57.0462 KOps/s 57.0325 KOps/s $\color{#35bf28}+0.02\%$
test_step_mdp_speed[True-True-True-False-True] 46.8580μs 16.9540μs 58.9830 KOps/s 58.7146 KOps/s $\color{#35bf28}+0.46\%$
test_step_mdp_speed[True-True-True-False-False] 40.3450μs 9.8976μs 101.0345 KOps/s 98.2797 KOps/s $\color{#35bf28}+2.80\%$
test_step_mdp_speed[True-True-False-True-True] 69.4090μs 31.9916μs 31.2582 KOps/s 31.0232 KOps/s $\color{#35bf28}+0.76\%$
test_step_mdp_speed[True-True-False-True-False] 54.4520μs 19.2587μs 51.9245 KOps/s 51.6408 KOps/s $\color{#35bf28}+0.55\%$
test_step_mdp_speed[True-True-False-False-True] 46.8570μs 18.9301μs 52.8260 KOps/s 53.4090 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[True-True-False-False-False] 59.4810μs 11.7542μs 85.0760 KOps/s 84.1943 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[True-False-True-True-True] 74.9690μs 33.6819μs 29.6895 KOps/s 29.4010 KOps/s $\color{#35bf28}+0.98\%$
test_step_mdp_speed[True-False-True-True-False] 94.9770μs 21.0738μs 47.4522 KOps/s 46.6486 KOps/s $\color{#35bf28}+1.72\%$
test_step_mdp_speed[True-False-True-False-True] 53.4400μs 18.6624μs 53.5838 KOps/s 52.4942 KOps/s $\color{#35bf28}+2.08\%$
test_step_mdp_speed[True-False-True-False-False] 40.8260μs 11.6728μs 85.6690 KOps/s 84.3020 KOps/s $\color{#35bf28}+1.62\%$
test_step_mdp_speed[True-False-False-True-True] 83.1950μs 35.3513μs 28.2875 KOps/s 28.1299 KOps/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[True-False-False-True-False] 53.0590μs 22.7891μs 43.8806 KOps/s 43.5508 KOps/s $\color{#35bf28}+0.76\%$
test_step_mdp_speed[True-False-False-False-True] 50.8150μs 20.4333μs 48.9397 KOps/s 48.4294 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[True-False-False-False-False] 40.6950μs 13.4150μs 74.5436 KOps/s 73.9701 KOps/s $\color{#35bf28}+0.78\%$
test_step_mdp_speed[False-True-True-True-True] 67.6260μs 33.8883μs 29.5087 KOps/s 29.4120 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[False-True-True-True-False] 45.1440μs 21.1185μs 47.3519 KOps/s 46.4346 KOps/s $\color{#35bf28}+1.98\%$
test_step_mdp_speed[False-True-True-False-True] 53.1190μs 21.3187μs 46.9072 KOps/s 46.2037 KOps/s $\color{#35bf28}+1.52\%$
test_step_mdp_speed[False-True-True-False-False] 73.7580μs 13.1402μs 76.1024 KOps/s 75.7653 KOps/s $\color{#35bf28}+0.44\%$
test_step_mdp_speed[False-True-False-True-True] 76.2910μs 35.1940μs 28.4140 KOps/s 28.2322 KOps/s $\color{#35bf28}+0.64\%$
test_step_mdp_speed[False-True-False-True-False] 50.1230μs 22.7068μs 44.0397 KOps/s 43.4366 KOps/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[False-True-False-False-True] 98.1903ms 26.3484μs 37.9529 KOps/s 43.1403 KOps/s $\textbf{\color{#d91a1a}-12.02\%}$
test_step_mdp_speed[False-True-False-False-False] 37.4700μs 14.4821μs 69.0509 KOps/s 66.9689 KOps/s $\color{#35bf28}+3.11\%$
test_step_mdp_speed[False-False-True-True-True] 68.6980μs 36.9096μs 27.0933 KOps/s 26.8819 KOps/s $\color{#35bf28}+0.79\%$
test_step_mdp_speed[False-False-True-True-False] 59.5300μs 24.5006μs 40.8153 KOps/s 40.0306 KOps/s $\color{#35bf28}+1.96\%$
test_step_mdp_speed[False-False-True-False-True] 59.0200μs 22.2495μs 44.9449 KOps/s 43.3840 KOps/s $\color{#35bf28}+3.60\%$
test_step_mdp_speed[False-False-True-False-False] 42.0480μs 14.6707μs 68.1633 KOps/s 66.9091 KOps/s $\color{#35bf28}+1.87\%$
test_step_mdp_speed[False-False-False-True-True] 72.6350μs 37.8996μs 26.3855 KOps/s 25.9233 KOps/s $\color{#35bf28}+1.78\%$
test_step_mdp_speed[False-False-False-True-False] 79.8380μs 25.8654μs 38.6617 KOps/s 37.7614 KOps/s $\color{#35bf28}+2.38\%$
test_step_mdp_speed[False-False-False-False-True] 65.8620μs 23.9630μs 41.7310 KOps/s 40.4536 KOps/s $\color{#35bf28}+3.16\%$
test_step_mdp_speed[False-False-False-False-False] 39.6130μs 16.0741μs 62.2121 KOps/s 60.8515 KOps/s $\color{#35bf28}+2.24\%$
test_values[generalized_advantage_estimate-True-True] 9.6490ms 9.3547ms 106.8978 Ops/s 104.5171 Ops/s $\color{#35bf28}+2.28\%$
test_values[vec_generalized_advantage_estimate-True-True] 35.9263ms 33.5355ms 29.8191 Ops/s 27.9639 Ops/s $\textbf{\color{#35bf28}+6.63\%}$
test_values[td0_return_estimate-False-False] 0.2317ms 0.1777ms 5.6280 KOps/s 5.5966 KOps/s $\color{#35bf28}+0.56\%$
test_values[td1_return_estimate-False-False] 23.7351ms 23.3744ms 42.7818 Ops/s 41.9647 Ops/s $\color{#35bf28}+1.95\%$
test_values[vec_td1_return_estimate-False-False] 35.7755ms 33.6147ms 29.7489 Ops/s 27.9348 Ops/s $\textbf{\color{#35bf28}+6.49\%}$
test_values[td_lambda_return_estimate-True-False] 45.4087ms 34.2524ms 29.1950 Ops/s 29.1662 Ops/s $\color{#35bf28}+0.10\%$
test_values[vec_td_lambda_return_estimate-True-False] 34.9477ms 33.6023ms 29.7599 Ops/s 27.9209 Ops/s $\textbf{\color{#35bf28}+6.59\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 11.4460ms 8.1277ms 123.0360 Ops/s 121.2859 Ops/s $\color{#35bf28}+1.44\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.4792ms 2.0498ms 487.8527 Ops/s 491.2474 Ops/s $\color{#d91a1a}-0.69\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 1.0632ms 0.3606ms 2.7735 KOps/s 2.7507 KOps/s $\color{#35bf28}+0.83\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 45.5264ms 42.7323ms 23.4015 Ops/s 21.2855 Ops/s $\textbf{\color{#35bf28}+9.94\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.7826ms 3.1215ms 320.3552 Ops/s 323.7429 Ops/s $\color{#d91a1a}-1.05\%$
test_dqn_speed[False-None] 2.8116ms 1.3887ms 720.0737 Ops/s 711.4210 Ops/s $\color{#35bf28}+1.22\%$
test_dqn_speed[False-backward] 1.9609ms 1.8581ms 538.1914 Ops/s 530.3023 Ops/s $\color{#35bf28}+1.49\%$
test_dqn_speed[True-None] 0.6095ms 0.4730ms 2.1144 KOps/s 2.0725 KOps/s $\color{#35bf28}+2.02\%$
test_dqn_speed[True-backward] 1.0864ms 0.9272ms 1.0785 KOps/s 1.0876 KOps/s $\color{#d91a1a}-0.84\%$
test_dqn_speed[reduce-overhead-None] 0.5520ms 0.4735ms 2.1120 KOps/s 2.1185 KOps/s $\color{#d91a1a}-0.31\%$
test_dqn_speed[reduce-overhead-backward] 0.9411ms 0.9011ms 1.1097 KOps/s 1.0808 KOps/s $\color{#35bf28}+2.68\%$
test_ddpg_speed[False-None] 5.3011ms 2.8839ms 346.7506 Ops/s 348.2955 Ops/s $\color{#d91a1a}-0.44\%$
test_ddpg_speed[False-backward] 4.1584ms 3.9880ms 250.7539 Ops/s 245.0312 Ops/s $\color{#35bf28}+2.34\%$
test_ddpg_speed[True-None] 1.3835ms 1.0046ms 995.4148 Ops/s 937.6289 Ops/s $\textbf{\color{#35bf28}+6.16\%}$
test_ddpg_speed[True-backward] 1.9702ms 1.9038ms 525.2599 Ops/s 444.4842 Ops/s $\textbf{\color{#35bf28}+18.17\%}$
test_ddpg_speed[reduce-overhead-None] 1.4789ms 1.0128ms 987.3305 Ops/s 983.8472 Ops/s $\color{#35bf28}+0.35\%$
test_ddpg_speed[reduce-overhead-backward] 1.9648ms 1.9041ms 525.1930 Ops/s 505.9977 Ops/s $\color{#35bf28}+3.79\%$
test_sac_speed[False-None] 9.4843ms 8.0506ms 124.2143 Ops/s 122.1687 Ops/s $\color{#35bf28}+1.67\%$
test_sac_speed[False-backward] 12.4742ms 10.9119ms 91.6433 Ops/s 87.7413 Ops/s $\color{#35bf28}+4.45\%$
test_sac_speed[True-None] 2.0508ms 1.8332ms 545.4848 Ops/s 533.5887 Ops/s $\color{#35bf28}+2.23\%$
test_sac_speed[True-backward] 3.5914ms 3.5266ms 283.5606 Ops/s 272.3032 Ops/s $\color{#35bf28}+4.13\%$
test_sac_speed[reduce-overhead-None] 2.9647ms 1.8435ms 542.4479 Ops/s 534.5979 Ops/s $\color{#35bf28}+1.47\%$
test_sac_speed[reduce-overhead-backward] 3.8474ms 3.5448ms 282.1019 Ops/s 275.6783 Ops/s $\color{#35bf28}+2.33\%$
test_redq_speed[False-None] 0.2360s 15.6926ms 63.7244 Ops/s 67.1220 Ops/s $\textbf{\color{#d91a1a}-5.06\%}$
test_redq_speed[False-backward] 24.5365ms 22.2516ms 44.9406 Ops/s 42.8808 Ops/s $\color{#35bf28}+4.80\%$
test_redq_speed[True-None] 5.5577ms 4.5722ms 218.7127 Ops/s 192.7133 Ops/s $\textbf{\color{#35bf28}+13.49\%}$
test_redq_speed[True-backward] 13.6350ms 12.1377ms 82.3882 Ops/s 76.8341 Ops/s $\textbf{\color{#35bf28}+7.23\%}$
test_redq_speed[reduce-overhead-None] 6.0056ms 4.7377ms 211.0713 Ops/s 190.8172 Ops/s $\textbf{\color{#35bf28}+10.61\%}$
test_redq_speed[reduce-overhead-backward] 15.9226ms 12.9370ms 77.2978 Ops/s 76.4293 Ops/s $\color{#35bf28}+1.14\%$
test_redq_deprec_speed[False-None] 14.5635ms 13.0338ms 76.7236 Ops/s 71.2361 Ops/s $\textbf{\color{#35bf28}+7.70\%}$
test_redq_deprec_speed[False-backward] 0.2755s 24.6279ms 40.6044 Ops/s 50.0975 Ops/s $\textbf{\color{#d91a1a}-18.95\%}$
test_redq_deprec_speed[True-None] 4.6184ms 3.6949ms 270.6466 Ops/s 268.6583 Ops/s $\color{#35bf28}+0.74\%$
test_redq_deprec_speed[True-backward] 8.9956ms 8.6469ms 115.6477 Ops/s 111.6417 Ops/s $\color{#35bf28}+3.59\%$
test_redq_deprec_speed[reduce-overhead-None] 4.3537ms 3.6988ms 270.3605 Ops/s 271.1847 Ops/s $\color{#d91a1a}-0.30\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.0572ms 8.7061ms 114.8621 Ops/s 114.6880 Ops/s $\color{#35bf28}+0.15\%$
test_td3_speed[False-None] 8.4482ms 8.0526ms 124.1829 Ops/s 114.9876 Ops/s $\textbf{\color{#35bf28}+8.00\%}$
test_td3_speed[False-backward] 12.0122ms 10.5498ms 94.7881 Ops/s 91.0488 Ops/s $\color{#35bf28}+4.11\%$
test_td3_speed[True-None] 1.9508ms 1.7128ms 583.8525 Ops/s 549.7834 Ops/s $\textbf{\color{#35bf28}+6.20\%}$
test_td3_speed[True-backward] 3.7943ms 3.5555ms 281.2505 Ops/s 273.0675 Ops/s $\color{#35bf28}+3.00\%$
test_td3_speed[reduce-overhead-None] 1.9047ms 1.7306ms 577.8381 Ops/s 536.5794 Ops/s $\textbf{\color{#35bf28}+7.69\%}$
test_td3_speed[reduce-overhead-backward] 3.8671ms 3.5638ms 280.6010 Ops/s 277.1018 Ops/s $\color{#35bf28}+1.26\%$
test_cql_speed[False-None] 38.3083ms 36.1100ms 27.6932 Ops/s 26.8145 Ops/s $\color{#35bf28}+3.28\%$
test_cql_speed[False-backward] 49.1035ms 46.8767ms 21.3326 Ops/s 20.5733 Ops/s $\color{#35bf28}+3.69\%$
test_cql_speed[True-None] 16.5032ms 15.8018ms 63.2839 Ops/s 60.8704 Ops/s $\color{#35bf28}+3.97\%$
test_cql_speed[True-backward] 24.7281ms 22.8804ms 43.7055 Ops/s 41.5163 Ops/s $\textbf{\color{#35bf28}+5.27\%}$
test_cql_speed[reduce-overhead-None] 17.2981ms 16.0148ms 62.4423 Ops/s 63.2529 Ops/s $\color{#d91a1a}-1.28\%$
test_cql_speed[reduce-overhead-backward] 24.2945ms 23.1099ms 43.2714 Ops/s 42.3411 Ops/s $\color{#35bf28}+2.20\%$
test_a2c_speed[False-None] 9.1640ms 7.6673ms 130.4236 Ops/s 130.4670 Ops/s $\color{#d91a1a}-0.03\%$
test_a2c_speed[False-backward] 15.7464ms 15.2651ms 65.5091 Ops/s 66.7420 Ops/s $\color{#d91a1a}-1.85\%$
test_a2c_speed[True-None] 5.8794ms 4.4389ms 225.2834 Ops/s 234.4531 Ops/s $\color{#d91a1a}-3.91\%$
test_a2c_speed[True-backward] 11.8761ms 11.4545ms 87.3016 Ops/s 92.4091 Ops/s $\textbf{\color{#d91a1a}-5.53\%}$
test_a2c_speed[reduce-overhead-None] 5.2247ms 4.3632ms 229.1874 Ops/s 235.3031 Ops/s $\color{#d91a1a}-2.60\%$
test_a2c_speed[reduce-overhead-backward] 12.0213ms 11.2598ms 88.8114 Ops/s 88.7073 Ops/s $\color{#35bf28}+0.12\%$
test_ppo_speed[False-None] 10.2960ms 7.7555ms 128.9409 Ops/s 132.0690 Ops/s $\color{#d91a1a}-2.37\%$
test_ppo_speed[False-backward] 16.1520ms 15.3512ms 65.1415 Ops/s 66.7526 Ops/s $\color{#d91a1a}-2.41\%$
test_ppo_speed[True-None] 5.0835ms 3.7715ms 265.1472 Ops/s 270.0424 Ops/s $\color{#d91a1a}-1.81\%$
test_ppo_speed[True-backward] 10.3379ms 10.0611ms 99.3928 Ops/s 102.2845 Ops/s $\color{#d91a1a}-2.83\%$
test_ppo_speed[reduce-overhead-None] 4.2814ms 3.8384ms 260.5218 Ops/s 266.8548 Ops/s $\color{#d91a1a}-2.37\%$
test_ppo_speed[reduce-overhead-backward] 10.9571ms 10.2029ms 98.0118 Ops/s 102.1696 Ops/s $\color{#d91a1a}-4.07\%$
test_reinforce_speed[False-None] 8.5343ms 6.7400ms 148.3686 Ops/s 151.3095 Ops/s $\color{#d91a1a}-1.94\%$
test_reinforce_speed[False-backward] 10.7036ms 10.4576ms 95.6240 Ops/s 100.3655 Ops/s $\color{#d91a1a}-4.72\%$
test_reinforce_speed[True-None] 3.2746ms 2.7559ms 362.8528 Ops/s 374.0633 Ops/s $\color{#d91a1a}-3.00\%$
test_reinforce_speed[True-backward] 9.9087ms 9.0583ms 110.3956 Ops/s 113.8842 Ops/s $\color{#d91a1a}-3.06\%$
test_reinforce_speed[reduce-overhead-None] 3.4268ms 2.6962ms 370.8885 Ops/s 376.4413 Ops/s $\color{#d91a1a}-1.48\%$
test_reinforce_speed[reduce-overhead-backward] 9.5897ms 9.1795ms 108.9384 Ops/s 116.2323 Ops/s $\textbf{\color{#d91a1a}-6.28\%}$
test_iql_speed[False-None] 34.6752ms 32.9669ms 30.3334 Ops/s 31.3853 Ops/s $\color{#d91a1a}-3.35\%$
test_iql_speed[False-backward] 48.2498ms 46.5616ms 21.4769 Ops/s 22.1975 Ops/s $\color{#d91a1a}-3.25\%$
test_iql_speed[True-None] 11.6482ms 10.9245ms 91.5370 Ops/s 90.3693 Ops/s $\color{#35bf28}+1.29\%$
test_iql_speed[True-backward] 23.1534ms 22.2356ms 44.9729 Ops/s 46.0584 Ops/s $\color{#d91a1a}-2.36\%$
test_iql_speed[reduce-overhead-None] 12.0657ms 10.9421ms 91.3902 Ops/s 93.6118 Ops/s $\color{#d91a1a}-2.37\%$
test_iql_speed[reduce-overhead-backward] 23.7026ms 22.3433ms 44.7561 Ops/s 45.5524 Ops/s $\color{#d91a1a}-1.75\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.1389ms 5.0887ms 196.5147 Ops/s 195.8457 Ops/s $\color{#35bf28}+0.34\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7414ms 0.5163ms 1.9368 KOps/s 1.9143 KOps/s $\color{#35bf28}+1.17\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7400ms 0.4947ms 2.0214 KOps/s 1.9580 KOps/s $\color{#35bf28}+3.24\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.2075ms 4.9318ms 202.7656 Ops/s 199.9371 Ops/s $\color{#35bf28}+1.41\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1597ms 0.5050ms 1.9801 KOps/s 1.9457 KOps/s $\color{#35bf28}+1.77\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7015ms 0.4827ms 2.0718 KOps/s 2.0385 KOps/s $\color{#35bf28}+1.63\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.9395ms 1.6353ms 611.5009 Ops/s 605.0207 Ops/s $\color{#35bf28}+1.07\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.0697ms 1.5805ms 632.6985 Ops/s 619.2515 Ops/s $\color{#35bf28}+2.17\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.3912ms 5.1456ms 194.3391 Ops/s 192.6583 Ops/s $\color{#35bf28}+0.87\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.4350ms 0.6590ms 1.5175 KOps/s 1.5115 KOps/s $\color{#35bf28}+0.40\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8719ms 0.6247ms 1.6008 KOps/s 1.5666 KOps/s $\color{#35bf28}+2.18\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.1687ms 4.9892ms 200.4330 Ops/s 199.1762 Ops/s $\color{#35bf28}+0.63\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3.3672ms 0.5224ms 1.9142 KOps/s 1.8937 KOps/s $\color{#35bf28}+1.08\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7457ms 0.4942ms 2.0233 KOps/s 1.9667 KOps/s $\color{#35bf28}+2.88\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.6307ms 4.9281ms 202.9166 Ops/s 194.4662 Ops/s $\color{#35bf28}+4.35\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6856ms 0.5048ms 1.9809 KOps/s 1.9917 KOps/s $\color{#d91a1a}-0.54\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 8.4428ms 0.4941ms 2.0239 KOps/s 2.0673 KOps/s $\color{#d91a1a}-2.10\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.9623ms 5.0959ms 196.2370 Ops/s 194.9117 Ops/s $\color{#35bf28}+0.68\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8831ms 0.6615ms 1.5117 KOps/s 1.5120 KOps/s $\color{#d91a1a}-0.02\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 8.3812ms 0.6464ms 1.5469 KOps/s 1.6028 KOps/s $\color{#d91a1a}-3.49\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4622s 13.4324ms 74.4471 Ops/s 210.3535 Ops/s $\textbf{\color{#d91a1a}-64.61\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.6544ms 2.3591ms 423.8817 Ops/s 447.3587 Ops/s $\textbf{\color{#d91a1a}-5.25\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 4.9520ms 1.2686ms 788.2754 Ops/s 753.3054 Ops/s $\color{#35bf28}+4.64\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4020s 12.1354ms 82.4035 Ops/s 243.3865 Ops/s $\textbf{\color{#d91a1a}-66.14\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.4688ms 2.2608ms 442.3169 Ops/s 425.5354 Ops/s $\color{#35bf28}+3.94\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 4.8042ms 1.3372ms 747.8312 Ops/s 762.9226 Ops/s $\color{#d91a1a}-1.98\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.7396ms 4.3761ms 228.5119 Ops/s 230.3201 Ops/s $\color{#d91a1a}-0.79\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.6553ms 2.5448ms 392.9633 Ops/s 413.9793 Ops/s $\textbf{\color{#d91a1a}-5.08\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 3.0158ms 1.4559ms 686.8818 Ops/s 678.7504 Ops/s $\color{#35bf28}+1.20\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.0090ms 11.2510ms 88.8811 Ops/s 83.2923 Ops/s $\textbf{\color{#35bf28}+6.71\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.0572ms 14.4817ms 69.0527 Ops/s 67.8602 Ops/s $\color{#35bf28}+1.76\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.3595ms 20.0596ms 49.8515 Ops/s 49.3926 Ops/s $\color{#35bf28}+0.93\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 14.9370ms 14.5445ms 68.7544 Ops/s 67.0774 Ops/s $\color{#35bf28}+2.50\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.6127ms 20.0269ms 49.9329 Ops/s 49.6116 Ops/s $\color{#35bf28}+0.65\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 16.4088ms 15.6640ms 63.8405 Ops/s 62.6642 Ops/s $\color{#35bf28}+1.88\%$
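
For readers checking the table, the Change column is consistent with the relative difference between the Ops value and the Ops on Repo HEAD value. The sketch below recomputes it for three rows copied from the table. The function names and the 5% cutoff are assumptions made for illustration: the cutoff is inferred from the rows shown in bold, whose counts match the "Improved" and "Worsened" totals, and is not taken from the benchmark tooling itself.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "unflagged"


# (name, Ops on this PR, Ops on repo HEAD), values copied from the CPU table.
rows = [
    ("test_simple", 2.3165, 2.2219),
    ("test_ddpg_speed[True-backward]", 525.2599, 444.4842),
    ("test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]",
     74.4471, 210.3535),
]

for name, ops_pr, ops_head in rows:
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions the three rows reproduce the reported +4.26%, +18.17% and -64.61%. Large swings on rows whose Max is far above the Mean (for example the `ListStorage` populate benchmarks) may reflect a single slow run rather than a real regression.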

github-actions bot commented Dec 6, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}19$. Worsened: $\large\color{#d91a1a}11$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_simple 0.7472s 0.7464s 1.3397 Ops/s 1.3046 Ops/s $\color{#35bf28}+2.69\%$
test_transformed 1.1002s 1.0209s 0.9795 Ops/s 1.0041 Ops/s $\color{#d91a1a}-2.45\%$
test_serial 2.2459s 2.1660s 0.4617 Ops/s 0.4673 Ops/s $\color{#d91a1a}-1.20\%$
test_parallel 2.1055s 2.0063s 0.4984 Ops/s 0.5043 Ops/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[True-True-True-True-True] 0.1894ms 40.5188μs 24.6799 KOps/s 24.8104 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[True-True-True-True-False] 53.1510μs 23.1439μs 43.2079 KOps/s 43.5795 KOps/s $\color{#d91a1a}-0.85\%$
test_step_mdp_speed[True-True-True-False-True] 59.7110μs 22.2408μs 44.9623 KOps/s 44.7330 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[True-True-True-False-False] 50.7000μs 12.6584μs 78.9991 KOps/s 78.7913 KOps/s $\color{#35bf28}+0.26\%$
test_step_mdp_speed[True-True-False-True-True] 69.9300μs 41.2756μs 24.2274 KOps/s 23.2325 KOps/s $\color{#35bf28}+4.28\%$
test_step_mdp_speed[True-True-False-True-False] 56.9410μs 24.7000μs 40.4858 KOps/s 40.2505 KOps/s $\color{#35bf28}+0.58\%$
test_step_mdp_speed[True-True-False-False-True] 67.9510μs 24.0574μs 41.5672 KOps/s 41.4201 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[True-True-False-False-False] 39.1700μs 14.8249μs 67.4541 KOps/s 66.9772 KOps/s $\color{#35bf28}+0.71\%$
test_step_mdp_speed[True-False-True-True-True] 81.1310μs 43.9324μs 22.7623 KOps/s 22.8607 KOps/s $\color{#d91a1a}-0.43\%$
test_step_mdp_speed[True-False-True-True-False] 57.7110μs 27.1097μs 36.8871 KOps/s 36.8753 KOps/s $\color{#35bf28}+0.03\%$
test_step_mdp_speed[True-False-True-False-True] 56.6400μs 24.6442μs 40.5776 KOps/s 41.5614 KOps/s $\color{#d91a1a}-2.37\%$
test_step_mdp_speed[True-False-True-False-False] 44.8310μs 14.9068μs 67.0836 KOps/s 67.4882 KOps/s $\color{#d91a1a}-0.60\%$
test_step_mdp_speed[True-False-False-True-True] 85.7410μs 46.3539μs 21.5732 KOps/s 21.5424 KOps/s $\color{#35bf28}+0.14\%$
test_step_mdp_speed[True-False-False-True-False] 59.0600μs 28.8199μs 34.6982 KOps/s 34.7837 KOps/s $\color{#d91a1a}-0.25\%$
test_step_mdp_speed[True-False-False-False-True] 55.1410μs 26.1375μs 38.2592 KOps/s 38.2954 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[True-False-False-False-False] 49.6110μs 16.7293μs 59.7753 KOps/s 59.2176 KOps/s $\color{#35bf28}+0.94\%$
test_step_mdp_speed[False-True-True-True-True] 70.8310μs 43.0160μs 23.2471 KOps/s 22.8075 KOps/s $\color{#35bf28}+1.93\%$
test_step_mdp_speed[False-True-True-True-False] 59.9200μs 26.9809μs 37.0632 KOps/s 36.5834 KOps/s $\color{#35bf28}+1.31\%$
test_step_mdp_speed[False-True-True-False-True] 61.9900μs 27.4799μs 36.3903 KOps/s 36.0628 KOps/s $\color{#35bf28}+0.91\%$
test_step_mdp_speed[False-True-True-False-False] 50.9610μs 16.3566μs 61.1372 KOps/s 60.9966 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[False-True-False-True-True] 0.1065ms 44.8087μs 22.3171 KOps/s 21.8881 KOps/s $\color{#35bf28}+1.96\%$
test_step_mdp_speed[False-True-False-True-False] 67.2210μs 28.7284μs 34.8087 KOps/s 34.3127 KOps/s $\color{#35bf28}+1.45\%$
test_step_mdp_speed[False-True-False-False-True] 3.2582ms 29.8359μs 33.5167 KOps/s 33.0915 KOps/s $\color{#35bf28}+1.28\%$
test_step_mdp_speed[False-True-False-False-False] 53.3710μs 18.4528μs 54.1924 KOps/s 53.3628 KOps/s $\color{#35bf28}+1.55\%$
test_step_mdp_speed[False-False-True-True-True] 80.4610μs 48.6322μs 20.5625 KOps/s 20.4113 KOps/s $\color{#35bf28}+0.74\%$
test_step_mdp_speed[False-False-True-True-False] 63.5200μs 31.3876μs 31.8597 KOps/s 31.6205 KOps/s $\color{#35bf28}+0.76\%$
test_step_mdp_speed[False-False-True-False-True] 58.6510μs 29.2607μs 34.1756 KOps/s 33.0529 KOps/s $\color{#35bf28}+3.40\%$
test_step_mdp_speed[False-False-True-False-False] 44.5310μs 18.4473μs 54.2084 KOps/s 53.6038 KOps/s $\color{#35bf28}+1.13\%$
test_step_mdp_speed[False-False-False-True-True] 84.5010μs 49.7967μs 20.0816 KOps/s 20.2068 KOps/s $\color{#d91a1a}-0.62\%$
test_step_mdp_speed[False-False-False-True-False] 1.7335ms 32.8149μs 30.4740 KOps/s 30.0999 KOps/s $\color{#35bf28}+1.24\%$
test_step_mdp_speed[False-False-False-False-True] 60.1800μs 31.1509μs 32.1018 KOps/s 32.2586 KOps/s $\color{#d91a1a}-0.49\%$
test_step_mdp_speed[False-False-False-False-False] 50.2610μs 20.3295μs 49.1895 KOps/s 47.9538 KOps/s $\color{#35bf28}+2.58\%$
test_values[generalized_advantage_estimate-True-True] 26.0008ms 24.5852ms 40.6749 Ops/s 41.0239 Ops/s $\color{#d91a1a}-0.85\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1144s 3.1830ms 314.1731 Ops/s 326.5293 Ops/s $\color{#d91a1a}-3.78\%$
test_values[td0_return_estimate-False-False] 0.1029ms 79.0021μs 12.6579 KOps/s 12.7188 KOps/s $\color{#d91a1a}-0.48\%$
test_values[td1_return_estimate-False-False] 54.3337ms 53.8160ms 18.5818 Ops/s 17.9418 Ops/s $\color{#35bf28}+3.57\%$
test_values[vec_td1_return_estimate-False-False] 1.3397ms 1.0760ms 929.3940 Ops/s 928.0311 Ops/s $\color{#35bf28}+0.15\%$
test_values[td_lambda_return_estimate-True-False] 90.7954ms 88.8514ms 11.2547 Ops/s 11.5085 Ops/s $\color{#d91a1a}-2.20\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.1970ms 1.0829ms 923.4057 Ops/s 931.4310 Ops/s $\color{#d91a1a}-0.86\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.7230ms 25.4227ms 39.3350 Ops/s 39.9831 Ops/s $\color{#d91a1a}-1.62\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0201ms 0.7401ms 1.3512 KOps/s 1.3383 KOps/s $\color{#35bf28}+0.96\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7497ms 0.6634ms 1.5074 KOps/s 1.5046 KOps/s $\color{#35bf28}+0.19\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5191ms 1.4717ms 679.4864 Ops/s 679.6596 Ops/s $\color{#d91a1a}-0.03\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7411ms 0.6748ms 1.4819 KOps/s 1.4628 KOps/s $\color{#35bf28}+1.31\%$
test_dqn_speed[False-None] 6.8544ms 1.5051ms 664.4152 Ops/s 674.6165 Ops/s $\color{#d91a1a}-1.51\%$
test_dqn_speed[False-backward] 2.1331ms 2.0899ms 478.4855 Ops/s 479.3308 Ops/s $\color{#d91a1a}-0.18\%$
test_dqn_speed[True-None] 1.2093ms 0.5394ms 1.8539 KOps/s 1.8670 KOps/s $\color{#d91a1a}-0.70\%$
test_dqn_speed[True-backward] 1.1503ms 1.0858ms 921.0098 Ops/s 828.7505 Ops/s $\textbf{\color{#35bf28}+11.13\%}$
test_dqn_speed[reduce-overhead-None] 0.6691ms 0.5536ms 1.8064 KOps/s 1.8073 KOps/s $\color{#d91a1a}-0.05\%$
test_dqn_speed[reduce-overhead-backward] 1.0008ms 0.9581ms 1.0438 KOps/s 934.9817 Ops/s $\textbf{\color{#35bf28}+11.63\%}$
test_ddpg_speed[False-None] 3.1340ms 2.8566ms 350.0677 Ops/s 351.2055 Ops/s $\color{#d91a1a}-0.32\%$
test_ddpg_speed[False-backward] 4.5413ms 4.0857ms 244.7569 Ops/s 238.0796 Ops/s $\color{#35bf28}+2.80\%$
test_ddpg_speed[True-None] 1.1605ms 1.0735ms 931.5531 Ops/s 889.4098 Ops/s $\color{#35bf28}+4.74\%$
test_ddpg_speed[True-backward] 2.3058ms 2.1483ms 465.4788 Ops/s 431.4904 Ops/s $\textbf{\color{#35bf28}+7.88\%}$
test_ddpg_speed[reduce-overhead-None] 1.1837ms 1.0822ms 924.0682 Ops/s 918.7344 Ops/s $\color{#35bf28}+0.58\%$
test_ddpg_speed[reduce-overhead-backward] 1.7034ms 1.6316ms 612.8986 Ops/s 563.4039 Ops/s $\textbf{\color{#35bf28}+8.78\%}$
test_sac_speed[False-None] 8.7848ms 8.0007ms 124.9898 Ops/s 125.4798 Ops/s $\color{#d91a1a}-0.39\%$
test_sac_speed[False-backward] 11.5871ms 11.0899ms 90.1719 Ops/s 90.1198 Ops/s $\color{#35bf28}+0.06\%$
test_sac_speed[True-None] 1.7236ms 1.5945ms 627.1557 Ops/s 648.6126 Ops/s $\color{#d91a1a}-3.31\%$
test_sac_speed[True-backward] 4.4272ms 3.2597ms 306.7787 Ops/s 293.5034 Ops/s $\color{#35bf28}+4.52\%$
test_sac_speed[reduce-overhead-None] 22.9719ms 12.4862ms 80.0886 Ops/s 79.7379 Ops/s $\color{#35bf28}+0.44\%$
test_sac_speed[reduce-overhead-backward] 1.3997ms 1.3294ms 752.1996 Ops/s 670.1455 Ops/s $\textbf{\color{#35bf28}+12.24\%}$
test_redq_speed[False-None] 8.3566ms 7.4228ms 134.7209 Ops/s 133.1984 Ops/s $\color{#35bf28}+1.14\%$
test_redq_speed[False-backward] 11.8260ms 11.1135ms 89.9805 Ops/s 85.7752 Ops/s $\color{#35bf28}+4.90\%$
test_redq_speed[True-None] 2.1807ms 1.9791ms 505.2787 Ops/s 475.6637 Ops/s $\textbf{\color{#35bf28}+6.23\%}$
test_redq_speed[True-backward] 3.7825ms 3.6037ms 277.4938 Ops/s 273.0253 Ops/s $\color{#35bf28}+1.64\%$
test_redq_speed[reduce-overhead-None] 2.0829ms 1.9747ms 506.3998 Ops/s 498.0848 Ops/s $\color{#35bf28}+1.67\%$
test_redq_speed[reduce-overhead-backward] 3.6915ms 3.6074ms 277.2072 Ops/s 259.3792 Ops/s $\textbf{\color{#35bf28}+6.87\%}$
test_redq_deprec_speed[False-None] 9.4640ms 8.9303ms 111.9780 Ops/s 110.8914 Ops/s $\color{#35bf28}+0.98\%$
test_redq_deprec_speed[False-backward] 12.2367ms 11.8358ms 84.4891 Ops/s 81.5462 Ops/s $\color{#35bf28}+3.61\%$
test_redq_deprec_speed[True-None] 2.3770ms 2.2938ms 435.9622 Ops/s 415.9300 Ops/s $\color{#35bf28}+4.82\%$
test_redq_deprec_speed[True-backward] 4.0722ms 3.9337ms 254.2138 Ops/s 251.4716 Ops/s $\color{#35bf28}+1.09\%$
test_redq_deprec_speed[reduce-overhead-None] 2.4529ms 2.2799ms 438.6230 Ops/s 431.9806 Ops/s $\color{#35bf28}+1.54\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.3144ms 3.9329ms 254.2679 Ops/s 250.3114 Ops/s $\color{#35bf28}+1.58\%$
test_td3_speed[False-None] 8.1529ms 7.8395ms 127.5584 Ops/s 127.3445 Ops/s $\color{#35bf28}+0.17\%$
test_td3_speed[False-backward] 10.8083ms 10.0441ms 99.5605 Ops/s 99.1569 Ops/s $\color{#35bf28}+0.41\%$
test_td3_speed[True-None] 1.5797ms 1.5624ms 640.0471 Ops/s 636.9203 Ops/s $\color{#35bf28}+0.49\%$
test_td3_speed[True-backward] 3.2965ms 3.2441ms 308.2560 Ops/s 321.8372 Ops/s $\color{#d91a1a}-4.22\%$
test_td3_speed[reduce-overhead-None] 50.1588ms 25.6389ms 39.0033 Ops/s 37.7622 Ops/s $\color{#35bf28}+3.29\%$
test_td3_speed[reduce-overhead-backward] 1.8211ms 1.4388ms 695.0180 Ops/s 780.9906 Ops/s $\textbf{\color{#d91a1a}-11.01\%}$
test_cql_speed[False-None] 16.7266ms 16.1821ms 61.7968 Ops/s 62.2136 Ops/s $\color{#d91a1a}-0.67\%$
test_cql_speed[False-backward] 22.1959ms 21.4915ms 46.5299 Ops/s 47.1892 Ops/s $\color{#d91a1a}-1.40\%$
test_cql_speed[True-None] 3.0185ms 2.9060ms 344.1176 Ops/s 337.2549 Ops/s $\color{#35bf28}+2.03\%$
test_cql_speed[True-backward] 5.3814ms 5.0017ms 199.9333 Ops/s 186.9360 Ops/s $\textbf{\color{#35bf28}+6.95\%}$
test_cql_speed[reduce-overhead-None] 21.5650ms 12.9761ms 77.0647 Ops/s 75.6914 Ops/s $\color{#35bf28}+1.81\%$
test_cql_speed[reduce-overhead-backward] 1.5573ms 1.4871ms 672.4356 Ops/s 669.3777 Ops/s $\color{#35bf28}+0.46\%$
test_a2c_speed[False-None] 3.2553ms 3.1445ms 318.0147 Ops/s 314.1968 Ops/s $\color{#35bf28}+1.22\%$
test_a2c_speed[False-backward] 6.5227ms 5.9490ms 168.0942 Ops/s 165.2894 Ops/s $\color{#35bf28}+1.70\%$
test_a2c_speed[True-None] 1.1428ms 0.9979ms 1.0021 KOps/s 946.7124 Ops/s $\textbf{\color{#35bf28}+5.85\%}$
test_a2c_speed[True-backward] 2.6893ms 2.6145ms 382.4842 Ops/s 383.1279 Ops/s $\color{#d91a1a}-0.17\%$
test_a2c_speed[reduce-overhead-None] 0.3960s 12.2695ms 81.5032 Ops/s 87.1660 Ops/s $\textbf{\color{#d91a1a}-6.50\%}$
test_a2c_speed[reduce-overhead-backward] 1.0264ms 0.9918ms 1.0083 KOps/s 1.0193 KOps/s $\color{#d91a1a}-1.08\%$
test_ppo_speed[False-None] 3.8778ms 3.6747ms 272.1326 Ops/s 265.8827 Ops/s $\color{#35bf28}+2.35\%$
test_ppo_speed[False-backward] 7.0901ms 6.6407ms 150.5856 Ops/s 149.4650 Ops/s $\color{#35bf28}+0.75\%$
test_ppo_speed[True-None] 1.0028ms 0.9379ms 1.0663 KOps/s 1.0438 KOps/s $\color{#35bf28}+2.16\%$
test_ppo_speed[True-backward] 2.5906ms 2.5487ms 392.3547 Ops/s 367.1145 Ops/s $\textbf{\color{#35bf28}+6.88\%}$
test_ppo_speed[reduce-overhead-None] 0.5788ms 0.4951ms 2.0199 KOps/s 1.8241 KOps/s $\textbf{\color{#35bf28}+10.73\%}$
test_ppo_speed[reduce-overhead-backward] 1.0200ms 0.9672ms 1.0339 KOps/s 888.5144 Ops/s $\textbf{\color{#35bf28}+16.36\%}$
test_reinforce_speed[False-None] 2.3124ms 2.2258ms 449.2821 Ops/s 441.4176 Ops/s $\color{#35bf28}+1.78\%$
test_reinforce_speed[False-backward] 3.2626ms 3.1905ms 313.4300 Ops/s 298.0691 Ops/s $\textbf{\color{#35bf28}+5.15\%}$
test_reinforce_speed[True-None] 0.9138ms 0.8309ms 1.2035 KOps/s 1.1907 KOps/s $\color{#35bf28}+1.07\%$
test_reinforce_speed[True-backward] 2.4794ms 2.4095ms 415.0320 Ops/s 387.3159 Ops/s $\textbf{\color{#35bf28}+7.16\%}$
test_reinforce_speed[reduce-overhead-None] 22.6037ms 11.7708ms 84.9563 Ops/s 89.0753 Ops/s $\color{#d91a1a}-4.62\%$
test_reinforce_speed[reduce-overhead-backward] 1.1138ms 1.0557ms 947.2752 Ops/s 838.2455 Ops/s $\textbf{\color{#35bf28}+13.01\%}$
test_iql_speed[False-None] 9.6420ms 9.1104ms 109.7643 Ops/s 109.7155 Ops/s $\color{#35bf28}+0.04\%$
test_iql_speed[False-backward] 13.5131ms 12.7457ms 78.4576 Ops/s 77.0797 Ops/s $\color{#35bf28}+1.79\%$
test_iql_speed[True-None] 1.8457ms 1.7632ms 567.1397 Ops/s 543.8380 Ops/s $\color{#35bf28}+4.28\%$
test_iql_speed[True-backward] 4.2752ms 4.1972ms 238.2561 Ops/s 235.4462 Ops/s $\color{#35bf28}+1.19\%$
test_iql_speed[reduce-overhead-None] 20.0124ms 11.3862ms 87.8258 Ops/s 111.8286 Ops/s $\textbf{\color{#d91a1a}-21.46\%}$
test_iql_speed[reduce-overhead-backward] 1.4934ms 1.4234ms 702.5321 Ops/s 710.1307 Ops/s $\color{#d91a1a}-1.07\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9785ms 6.5001ms 153.8440 Ops/s 154.6835 Ops/s $\color{#d91a1a}-0.54\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5757ms 0.3060ms 3.2677 KOps/s 3.7434 KOps/s $\textbf{\color{#d91a1a}-12.71\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6437ms 0.2876ms 3.4768 KOps/s 3.3937 KOps/s $\color{#35bf28}+2.45\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5141ms 6.2188ms 160.8033 Ops/s 159.5947 Ops/s $\color{#35bf28}+0.76\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.9389ms 0.3363ms 2.9734 KOps/s 3.0476 KOps/s $\color{#d91a1a}-2.43\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5737ms 0.2749ms 3.6380 KOps/s 3.6731 KOps/s $\color{#d91a1a}-0.95\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5821ms 1.3415ms 745.4547 Ops/s 762.8892 Ops/s $\color{#d91a1a}-2.29\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5634ms 1.2834ms 779.2024 Ops/s 853.0678 Ops/s $\textbf{\color{#d91a1a}-8.66\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6048ms 6.3994ms 156.2650 Ops/s 154.9430 Ops/s $\color{#35bf28}+0.85\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8032ms 0.4468ms 2.2383 KOps/s 2.3121 KOps/s $\color{#d91a1a}-3.19\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6990ms 0.4782ms 2.0911 KOps/s 2.3133 KOps/s $\textbf{\color{#d91a1a}-9.61\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4132ms 6.2565ms 159.8335 Ops/s 159.0521 Ops/s $\color{#35bf28}+0.49\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7915ms 0.3305ms 3.0256 KOps/s 2.8554 KOps/s $\textbf{\color{#35bf28}+5.96\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5201ms 0.3060ms 3.2683 KOps/s 3.0595 KOps/s $\textbf{\color{#35bf28}+6.82\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4844ms 6.2133ms 160.9442 Ops/s 160.8696 Ops/s $\color{#35bf28}+0.05\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.5787ms 0.3349ms 2.9863 KOps/s 3.2950 KOps/s $\textbf{\color{#d91a1a}-9.37\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6487ms 0.3174ms 3.1506 KOps/s 3.1153 KOps/s $\color{#35bf28}+1.13\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5831ms 6.4369ms 155.3537 Ops/s 156.8856 Ops/s $\color{#d91a1a}-0.98\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2559ms 0.4876ms 2.0509 KOps/s 2.2330 KOps/s $\textbf{\color{#d91a1a}-8.15\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6286ms 0.3885ms 2.5738 KOps/s 2.5741 KOps/s $\color{#d91a1a}-0.01\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.8439ms 5.2712ms 189.7100 Ops/s 189.3704 Ops/s $\color{#35bf28}+0.18\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.2757ms 2.0527ms 487.1670 Ops/s 509.2815 Ops/s $\color{#d91a1a}-4.34\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.1366ms 1.2321ms 811.6030 Ops/s 823.0071 Ops/s $\color{#d91a1a}-1.39\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4885s 14.9698ms 66.8013 Ops/s 192.0648 Ops/s $\textbf{\color{#d91a1a}-65.22\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.9578ms 2.0297ms 492.6914 Ops/s 438.3493 Ops/s $\textbf{\color{#35bf28}+12.40\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.1402ms 1.1934ms 837.9558 Ops/s 834.1389 Ops/s $\color{#35bf28}+0.46\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.4945ms 5.5137ms 181.3661 Ops/s 33.4979 Ops/s $\textbf{\color{#35bf28}+441.43\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 5.9862ms 2.1724ms 460.3190 Ops/s 501.9472 Ops/s $\textbf{\color{#d91a1a}-8.29\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 10.3249ms 1.4544ms 687.5574 Ops/s 807.7927 Ops/s $\textbf{\color{#d91a1a}-14.88\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 14.1851ms 13.1297ms 76.1631 Ops/s 75.4537 Ops/s $\color{#35bf28}+0.94\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.2834ms 17.0243ms 58.7397 Ops/s 60.1716 Ops/s $\color{#d91a1a}-2.38\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.7079ms 17.7887ms 56.2154 Ops/s 54.9777 Ops/s $\color{#35bf28}+2.25\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.4569ms 16.8943ms 59.1915 Ops/s 60.2914 Ops/s $\color{#d91a1a}-1.82\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.6854ms 18.1755ms 55.0193 Ops/s 55.8611 Ops/s $\color{#d91a1a}-1.51\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.1835ms 18.4508ms 54.1982 Ops/s 56.5516 Ops/s $\color{#d91a1a}-4.16\%$

[ghstack-poisoned]
@vmoens vmoens merged commit 5258b8f into gh/vmoens/53/base Dec 12, 2024
1 check passed
vmoens added a commit that referenced this pull request Dec 12, 2024
ghstack-source-id: 1f5aed6fb2e97ead9d379f9545ae742f7728c585
Pull Request resolved: #2639
@vmoens vmoens deleted the gh/vmoens/53/head branch December 12, 2024 21:38
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
2 participants