[Refactor] Better weight update in collectors #1723

Merged 2 commits on Nov 30, 2023

@vmoens commented Nov 30, 2023

No description provided.

pytorch-bot bot commented Nov 30, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1723

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (6 Unrelated Failures)

As of commit 4e16634 with merge base 6c27bdb:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Nov 30, 2023

github-actions bot commented Nov 30, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 89. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}4$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 65.2732ms | 64.2571ms | 15.5625 Ops/s | 14.6902 Ops/s | $\textbf{\color{#35bf28}+5.94\%}$ |
| test_sync | 39.7438ms | 34.6126ms | 28.8912 Ops/s | 24.3299 Ops/s | $\textbf{\color{#35bf28}+18.75\%}$ |
| test_async | 61.7747ms | 31.3168ms | 31.9317 Ops/s | 29.2498 Ops/s | $\textbf{\color{#35bf28}+9.17\%}$ |
| test_simple | 0.5034s | 0.4385s | 2.2806 Ops/s | 2.2418 Ops/s | $\color{#35bf28}+1.73\%$ |
| test_transformed | 0.6433s | 0.5960s | 1.6778 Ops/s | 1.6231 Ops/s | $\color{#35bf28}+3.37\%$ |
| test_serial | 1.3745s | 1.3223s | 0.7562 Ops/s | 0.7322 Ops/s | $\color{#35bf28}+3.28\%$ |
| test_parallel | 1.3317s | 1.2782s | 0.7823 Ops/s | 0.7773 Ops/s | $\color{#35bf28}+0.65\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.1714ms | 22.4990μs | 44.4464 KOps/s | 43.4823 KOps/s | $\color{#35bf28}+2.22\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 66.9850μs | 13.6637μs | 73.1866 KOps/s | 70.5907 KOps/s | $\color{#35bf28}+3.68\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 39.9750μs | 13.8527μs | 72.1881 KOps/s | 71.9677 KOps/s | $\color{#35bf28}+0.31\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 50.7850μs | 8.3922μs | 119.1583 KOps/s | 116.9103 KOps/s | $\color{#35bf28}+1.92\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 53.0290μs | 24.1775μs | 41.3607 KOps/s | 41.0690 KOps/s | $\color{#35bf28}+0.71\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 66.1240μs | 15.2004μs | 65.7877 KOps/s | 64.6042 KOps/s | $\color{#35bf28}+1.83\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 60.9940μs | 15.3116μs | 65.3099 KOps/s | 65.2769 KOps/s | $\color{#35bf28}+0.05\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 62.9580μs | 9.6532μs | 103.5930 KOps/s | 100.9461 KOps/s | $\color{#35bf28}+2.62\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 68.6480μs | 26.1021μs | 38.3112 KOps/s | 38.5796 KOps/s | $\color{#d91a1a}-0.70\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 64.1200μs | 16.5405μs | 60.4578 KOps/s | 58.6897 KOps/s | $\color{#35bf28}+3.01\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 41.2270μs | 15.3135μs | 65.3020 KOps/s | 64.8583 KOps/s | $\color{#35bf28}+0.68\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 58.8690μs | 9.7967μs | 102.0750 KOps/s | 98.9411 KOps/s | $\color{#35bf28}+3.17\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 58.8100μs | 26.6274μs | 37.5553 KOps/s | 36.9769 KOps/s | $\color{#35bf28}+1.56\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 68.1880μs | 17.6365μs | 56.7006 KOps/s | 55.0019 KOps/s | $\color{#35bf28}+3.09\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 53.7510μs | 16.3617μs | 61.1183 KOps/s | 60.7904 KOps/s | $\color{#35bf28}+0.54\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 49.7330μs | 10.7109μs | 93.3625 KOps/s | 92.7714 KOps/s | $\color{#35bf28}+0.64\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 64.5000μs | 25.5810μs | 39.0915 KOps/s | 38.5851 KOps/s | $\color{#35bf28}+1.31\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 81.5920μs | 16.3560μs | 61.1397 KOps/s | 60.0323 KOps/s | $\color{#35bf28}+1.84\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 71.1830μs | 17.2054μs | 58.1214 KOps/s | 56.4215 KOps/s | $\color{#35bf28}+3.01\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 35.2260μs | 10.7601μs | 92.9359 KOps/s | 90.4662 KOps/s | $\color{#35bf28}+2.73\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 81.2720μs | 26.4869μs | 37.7545 KOps/s | 36.9504 KOps/s | $\color{#35bf28}+2.18\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 47.1580μs | 17.4840μs | 57.1951 KOps/s | 55.4478 KOps/s | $\color{#35bf28}+3.15\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 40.2950μs | 18.6239μs | 53.6945 KOps/s | 53.3970 KOps/s | $\color{#35bf28}+0.56\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 55.4640μs | 12.0229μs | 83.1746 KOps/s | 81.2986 KOps/s | $\color{#35bf28}+2.31\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 75.3810μs | 27.9238μs | 35.8118 KOps/s | 35.2123 KOps/s | $\color{#35bf28}+1.70\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 68.2670μs | 18.9590μs | 52.7453 KOps/s | 51.1068 KOps/s | $\color{#35bf28}+3.21\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 68.8590μs | 18.5909μs | 53.7899 KOps/s | 52.0685 KOps/s | $\color{#35bf28}+3.31\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 37.4800μs | 12.0047μs | 83.3005 KOps/s | 80.4449 KOps/s | $\color{#35bf28}+3.55\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 82.9350μs | 28.9356μs | 34.5595 KOps/s | 33.9398 KOps/s | $\color{#35bf28}+1.83\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 76.2620μs | 19.8780μs | 50.3070 KOps/s | 48.2215 KOps/s | $\color{#35bf28}+4.32\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 60.1630μs | 19.6894μs | 50.7886 KOps/s | 49.7849 KOps/s | $\color{#35bf28}+2.02\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 63.2190μs | 13.0302μs | 76.7449 KOps/s | 73.9774 KOps/s | $\color{#35bf28}+3.74\%$ |
| test_values[generalized_advantage_estimate-True-True] | 17.5495ms | 12.1175ms | 82.5255 Ops/s | 82.0251 Ops/s | $\color{#35bf28}+0.61\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 42.3380ms | 27.1385ms | 36.8480 Ops/s | 36.8454 Ops/s | $+0.01\%$ |
| test_values[td0_return_estimate-False-False] | 0.2589ms | 0.1773ms | 5.6402 KOps/s | 4.8627 KOps/s | $\textbf{\color{#35bf28}+15.99\%}$ |
| test_values[td1_return_estimate-False-False] | 28.6600ms | 25.5027ms | 39.2116 Ops/s | 38.4249 Ops/s | $\color{#35bf28}+2.05\%$ |
| test_values[vec_td1_return_estimate-False-False] | 34.8004ms | 26.8496ms | 37.2444 Ops/s | 36.6985 Ops/s | $\color{#35bf28}+1.49\%$ |
| test_values[td_lambda_return_estimate-True-False] | 36.6868ms | 35.2488ms | 28.3698 Ops/s | 26.9911 Ops/s | $\textbf{\color{#35bf28}+5.11\%}$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 34.2293ms | 27.0133ms | 37.0188 Ops/s | 36.6422 Ops/s | $\color{#35bf28}+1.03\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 7.9370ms | 7.8252ms | 127.7922 Ops/s | 122.1511 Ops/s | $\color{#35bf28}+4.62\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.3251ms | 1.8147ms | 551.0441 Ops/s | 503.7687 Ops/s | $\textbf{\color{#35bf28}+9.38\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 8.6523ms | 0.4363ms | 2.2922 KOps/s | 2.3150 KOps/s | $\color{#d91a1a}-0.99\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 47.9546ms | 39.2830ms | 25.4563 Ops/s | 23.9495 Ops/s | $\textbf{\color{#35bf28}+6.29\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 14.4058ms | 2.8884ms | 346.2097 Ops/s | 392.7746 Ops/s | $\textbf{\color{#d91a1a}-11.86\%}$ |
| test_dqn_speed | 9.5724ms | 1.6363ms | 611.1419 Ops/s | 578.5774 Ops/s | $\textbf{\color{#35bf28}+5.63\%}$ |
| test_ddpg_speed | 11.4370ms | 3.6645ms | 272.8870 Ops/s | 268.0043 Ops/s | $\color{#35bf28}+1.82\%$ |
| test_sac_speed | 17.8651ms | 10.2951ms | 97.1335 Ops/s | 94.6980 Ops/s | $\color{#35bf28}+2.57\%$ |
| test_redq_speed | 26.6256ms | 19.5310ms | 51.2006 Ops/s | 49.4985 Ops/s | $\color{#35bf28}+3.44\%$ |
| test_redq_deprec_speed | 25.9691ms | 15.5369ms | 64.3630 Ops/s | 61.8405 Ops/s | $\color{#35bf28}+4.08\%$ |
| test_td3_speed | 18.1401ms | 10.6414ms | 93.9722 Ops/s | 89.8654 Ops/s | $\color{#35bf28}+4.57\%$ |
| test_cql_speed | 46.9946ms | 39.1615ms | 25.5353 Ops/s | 24.7240 Ops/s | $\color{#35bf28}+3.28\%$ |
| test_a2c_speed | 97.4704ms | 9.2718ms | 107.8540 Ops/s | 116.7137 Ops/s | $\textbf{\color{#d91a1a}-7.59\%}$ |
| test_ppo_speed | 17.6125ms | 8.8804ms | 112.6081 Ops/s | 111.1204 Ops/s | $\color{#35bf28}+1.34\%$ |
| test_reinforce_speed | 15.6752ms | 7.5206ms | 132.9676 Ops/s | 133.0296 Ops/s | $\color{#d91a1a}-0.05\%$ |
| test_iql_speed | 42.9843ms | 34.9292ms | 28.6293 Ops/s | 27.8852 Ops/s | $\color{#35bf28}+2.67\%$ |
| test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.5447ms | 1.9802ms | 504.9941 Ops/s | 510.2647 Ops/s | $\color{#d91a1a}-1.03\%$ |
| test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.1116ms | 2.1019ms | 475.7622 Ops/s | 476.6251 Ops/s | $\color{#d91a1a}-0.18\%$ |
| test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.3832ms | 2.0889ms | 478.7256 Ops/s | 470.8548 Ops/s | $\color{#35bf28}+1.67\%$ |
| test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.0568ms | 1.9782ms | 505.5012 Ops/s | 487.7395 Ops/s | $\color{#35bf28}+3.64\%$ |
| test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3.1399ms | 2.0974ms | 476.7715 Ops/s | 464.6499 Ops/s | $\color{#35bf28}+2.61\%$ |
| test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.3744ms | 2.1450ms | 466.1997 Ops/s | 457.5165 Ops/s | $\color{#35bf28}+1.90\%$ |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.8012ms | 1.9406ms | 515.3124 Ops/s | 508.2580 Ops/s | $\color{#35bf28}+1.39\%$ |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 3.7141ms | 2.1368ms | 467.9940 Ops/s | 454.5236 Ops/s | $\color{#35bf28}+2.96\%$ |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 3.5777ms | 2.1271ms | 470.1327 Ops/s | 452.2379 Ops/s | $\color{#35bf28}+3.96\%$ |
| test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.7407ms | 1.9502ms | 512.7556 Ops/s | 488.5881 Ops/s | $\color{#35bf28}+4.95\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 4.8920ms | 2.2081ms | 452.8877 Ops/s | 454.7347 Ops/s | $\color{#d91a1a}-0.41\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 4.5914ms | 2.2974ms | 435.2732 Ops/s | 469.3231 Ops/s | $\textbf{\color{#d91a1a}-7.26\%}$ |
| test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.4604ms | 2.0140ms | 496.5283 Ops/s | 498.2669 Ops/s | $\color{#d91a1a}-0.35\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3.8345ms | 2.1833ms | 458.0325 Ops/s | 471.6804 Ops/s | $\color{#d91a1a}-2.89\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 2.9927ms | 2.0963ms | 477.0298 Ops/s | 479.0487 Ops/s | $\color{#d91a1a}-0.42\%$ |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.4080ms | 1.9929ms | 501.7913 Ops/s | 508.4000 Ops/s | $\color{#d91a1a}-1.30\%$ |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 4.0537ms | 2.1462ms | 465.9480 Ops/s | 459.7463 Ops/s | $\color{#35bf28}+1.35\%$ |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 3.5296ms | 2.1761ms | 459.5316 Ops/s | 449.6726 Ops/s | $\color{#35bf28}+2.19\%$ |
| test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1992s | 18.7715ms | 53.2722 Ops/s | 51.4267 Ops/s | $\color{#35bf28}+3.59\%$ |
| test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1199s | 17.1382ms | 58.3491 Ops/s | 56.2906 Ops/s | $\color{#35bf28}+3.66\%$ |
| test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 0.1201s | 17.2135ms | 58.0939 Ops/s | 57.0112 Ops/s | $\color{#35bf28}+1.90\%$ |
| test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1176s | 17.1682ms | 58.2473 Ops/s | 56.9493 Ops/s | $\color{#35bf28}+2.28\%$ |
| test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1125s | 16.7919ms | 59.5524 Ops/s | 55.2825 Ops/s | $\textbf{\color{#35bf28}+7.72\%}$ |
| test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 0.1163s | 14.8673ms | 67.2618 Ops/s | 48.0778 Ops/s | $\textbf{\color{#35bf28}+39.90\%}$ |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1138s | 16.7898ms | 59.5600 Ops/s | 54.3931 Ops/s | $\textbf{\color{#35bf28}+9.50\%}$ |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1183s | 16.9964ms | 58.8360 Ops/s | 63.0653 Ops/s | $\textbf{\color{#d91a1a}-6.71\%}$ |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 0.1174s | 16.8264ms | 59.4304 Ops/s | 55.9602 Ops/s | $\textbf{\color{#35bf28}+6.20\%}$ |
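
For readers checking the numbers, the Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns. The bold rows, which match the Improved and Worsened counts in the summary line, look like changes beyond roughly ±5%. The sketch below reproduces a few rows under those two assumptions. The function names and the 5% threshold are hypothetical, inferred from the table and not taken from the benchmark tooling.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the bold rows."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "within noise"


# (name, Ops, Ops on Repo HEAD) copied from the CPU table above
rows = [
    ("test_single", 15.5625, 14.6902),
    ("test_sync", 28.8912, 24.3299),
    ("test_a2c_speed", 107.8540, 116.7137),
    ("test_ppo_speed", 112.6081, 111.1204),
]

for name, ops, ops_head in rows:
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Single-run differences of a few percent on shared CI machines are usually hard to distinguish from noise, which is presumably why only the larger changes are counted.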

@vmoens added the Refactoring label Nov 30, 2023
@vmoens marked this pull request as ready for review November 30, 2023 14:47

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 92. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}4$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1219s | 0.1216s | 8.2229 Ops/s | 8.3609 Ops/s | $\color{#d91a1a}-1.65\%$ |
| test_sync | 0.1030s | 0.1026s | 9.7431 Ops/s | 9.7287 Ops/s | $\color{#35bf28}+0.15\%$ |
| test_async | 0.2740s | 0.1003s | 9.9750 Ops/s | 9.9318 Ops/s | $\color{#35bf28}+0.43\%$ |
| test_single_pixels | 0.1315s | 0.1312s | 7.6199 Ops/s | 7.7195 Ops/s | $\color{#d91a1a}-1.29\%$ |
| test_sync_pixels | 96.0831ms | 94.6988ms | 10.5598 Ops/s | 10.5091 Ops/s | $\color{#35bf28}+0.48\%$ |
| test_async_pixels | 0.2463s | 91.5030ms | 10.9286 Ops/s | 11.0056 Ops/s | $\color{#d91a1a}-0.70\%$ |
| test_simple | 0.9783s | 0.9127s | 1.0957 Ops/s | 1.1517 Ops/s | $\color{#d91a1a}-4.87\%$ |
| test_transformed | 1.2179s | 1.1570s | 0.8643 Ops/s | 0.8978 Ops/s | $\color{#d91a1a}-3.73\%$ |
| test_serial | 2.5168s | 2.4539s | 0.4075 Ops/s | 0.4173 Ops/s | $\color{#d91a1a}-2.34\%$ |
| test_parallel | 2.5496s | 2.4739s | 0.4042 Ops/s | 0.4029 Ops/s | $\color{#35bf28}+0.32\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.1043ms | 35.3497μs | 28.2888 KOps/s | 29.0591 KOps/s | $\color{#d91a1a}-2.65\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 45.9110μs | 21.0867μs | 47.4232 KOps/s | 49.1811 KOps/s | $\color{#d91a1a}-3.57\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 48.2100μs | 21.3308μs | 46.8806 KOps/s | 48.9567 KOps/s | $\color{#d91a1a}-4.24\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 38.4600μs | 12.4397μs | 80.3876 KOps/s | 82.7043 KOps/s | $\color{#d91a1a}-2.80\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 81.7010μs | 37.0439μs | 26.9950 KOps/s | 27.7159 KOps/s | $\color{#d91a1a}-2.60\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 56.3010μs | 22.5519μs | 44.3421 KOps/s | 45.4726 KOps/s | $\color{#d91a1a}-2.49\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 53.3200μs | 22.8767μs | 43.7126 KOps/s | 46.0193 KOps/s | $\textbf{\color{#d91a1a}-5.01\%}$ |
| test_step_mdp_speed[True-True-False-False-False] | 44.1010μs | 14.2100μs | 70.3729 KOps/s | 72.4551 KOps/s | $\color{#d91a1a}-2.87\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 76.8610μs | 39.3662μs | 25.4025 KOps/s | 25.8157 KOps/s | $\color{#d91a1a}-1.60\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 47.8510μs | 24.9754μs | 40.0393 KOps/s | 40.8888 KOps/s | $\color{#d91a1a}-2.08\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 48.0510μs | 23.0284μs | 43.4246 KOps/s | 44.8082 KOps/s | $\color{#d91a1a}-3.09\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 30.3810μs | 14.2750μs | 70.0523 KOps/s | 71.7157 KOps/s | $\color{#d91a1a}-2.32\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 0.1050ms | 41.1603μs | 24.2953 KOps/s | 25.3286 KOps/s | $\color{#d91a1a}-4.08\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 65.9610μs | 26.9788μs | 37.0661 KOps/s | 38.8760 KOps/s | $\color{#d91a1a}-4.66\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 48.2610μs | 24.6871μs | 40.5071 KOps/s | 42.2599 KOps/s | $\color{#d91a1a}-4.15\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 52.6510μs | 16.2770μs | 61.4363 KOps/s | 63.8188 KOps/s | $\color{#d91a1a}-3.73\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 64.8810μs | 39.6625μs | 25.2127 KOps/s | 25.9102 KOps/s | $\color{#d91a1a}-2.69\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 48.4510μs | 24.9367μs | 40.1015 KOps/s | 41.5924 KOps/s | $\color{#d91a1a}-3.58\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 52.7610μs | 27.1315μs | 36.8575 KOps/s | 37.4692 KOps/s | $\color{#d91a1a}-1.63\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 32.5610μs | 16.1306μs | 61.9940 KOps/s | 63.2445 KOps/s | $\color{#d91a1a}-1.98\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 0.1086ms | 41.8892μs | 23.8725 KOps/s | 24.7213 KOps/s | $\color{#d91a1a}-3.43\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 51.6300μs | 26.9498μs | 37.1061 KOps/s | 38.1209 KOps/s | $\color{#d91a1a}-2.66\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 67.6710μs | 29.0576μs | 34.4144 KOps/s | 35.3770 KOps/s | $\color{#d91a1a}-2.72\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 44.5910μs | 18.4270μs | 54.2682 KOps/s | 57.3901 KOps/s | $\textbf{\color{#d91a1a}-5.44\%}$ |
| test_step_mdp_speed[False-False-True-True-True] | 70.5710μs | 43.3554μs | 23.0652 KOps/s | 24.0066 KOps/s | $\color{#d91a1a}-3.92\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 53.3890μs | 28.9102μs | 34.5899 KOps/s | 35.7801 KOps/s | $\color{#d91a1a}-3.33\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 49.3000μs | 29.0184μs | 34.4609 KOps/s | 35.4837 KOps/s | $\color{#d91a1a}-2.88\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 41.3310μs | 18.3243μs | 54.5722 KOps/s | 56.9386 KOps/s | $\color{#d91a1a}-4.16\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 80.3920μs | 44.6874μs | 22.3777 KOps/s | 23.1897 KOps/s | $\color{#d91a1a}-3.50\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 53.4410μs | 30.7112μs | 32.5614 KOps/s | 33.8326 KOps/s | $\color{#d91a1a}-3.76\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 62.3510μs | 30.3789μs | 32.9176 KOps/s | 33.9386 KOps/s | $\color{#d91a1a}-3.01\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 40.6500μs | 19.9565μs | 50.1091 KOps/s | 51.4888 KOps/s | $\color{#d91a1a}-2.68\%$ |
| test_values[generalized_advantage_estimate-True-True] | 26.4248ms | 25.6126ms | 39.0433 Ops/s | 40.2701 Ops/s | $\color{#d91a1a}-3.05\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 86.2543ms | 3.2870ms | 304.2290 Ops/s | 305.3325 Ops/s | $\color{#d91a1a}-0.36\%$ |
| test_values[td0_return_estimate-False-False] | 97.0010μs | 66.0315μs | 15.1443 KOps/s | 15.6219 KOps/s | $\color{#d91a1a}-3.06\%$ |
| test_values[td1_return_estimate-False-False] | 55.6069ms | 55.0812ms | 18.1550 Ops/s | 18.4806 Ops/s | $\color{#d91a1a}-1.76\%$ |
| test_values[vec_td1_return_estimate-False-False] | 2.0339ms | 1.7334ms | 576.9150 Ops/s | 581.5786 Ops/s | $\color{#d91a1a}-0.80\%$ |
| test_values[td_lambda_return_estimate-True-False] | 89.9114ms | 87.8175ms | 11.3872 Ops/s | 11.5711 Ops/s | $\color{#d91a1a}-1.59\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.9892ms | 1.7284ms | 578.5675 Ops/s | 581.0423 Ops/s | $\color{#d91a1a}-0.43\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 24.5154ms | 24.2480ms | 41.2405 Ops/s | 41.9386 Ops/s | $\color{#d91a1a}-1.66\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.8830ms | 0.7228ms | 1.3836 KOps/s | 1.3978 KOps/s | $\color{#d91a1a}-1.02\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7508ms | 0.6839ms | 1.4622 KOps/s | 1.4686 KOps/s | $\color{#d91a1a}-0.43\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5089ms | 1.4720ms | 679.3511 Ops/s | 681.2021 Ops/s | $\color{#d91a1a}-0.27\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.9556ms | 0.7121ms | 1.4043 KOps/s | 1.4214 KOps/s | $\color{#d91a1a}-1.20\%$ |
| test_dqn_speed | 7.8513ms | 1.4590ms | 685.4017 Ops/s | 659.9321 Ops/s | $\color{#35bf28}+3.86\%$ |
| test_ddpg_speed | 4.6449ms | 3.2841ms | 304.4977 Ops/s | 281.6572 Ops/s | $\textbf{\color{#35bf28}+8.11\%}$ |
| test_sac_speed | 95.2161ms | 9.9386ms | 100.6180 Ops/s | 109.1345 Ops/s | $\textbf{\color{#d91a1a}-7.80\%}$ |
| test_redq_speed | 17.2784ms | 16.5259ms | 60.5111 Ops/s | 60.3300 Ops/s | $\color{#35bf28}+0.30\%$ |
| test_redq_deprec_speed | 14.0905ms | 12.7118ms | 78.6671 Ops/s | 78.3776 Ops/s | $\color{#35bf28}+0.37\%$ |
| test_td3_speed | 19.2589ms | 9.4354ms | 105.9839 Ops/s | 106.2658 Ops/s | $\color{#d91a1a}-0.27\%$ |
| test_cql_speed | 31.8750ms | 30.8967ms | 32.3659 Ops/s | 30.8408 Ops/s | $\color{#35bf28}+4.95\%$ |
| test_a2c_speed | 8.3659ms | 6.9925ms | 143.0097 Ops/s | 142.9396 Ops/s | $\color{#35bf28}+0.05\%$ |
| test_ppo_speed | 8.7674ms | 7.2751ms | 137.4545 Ops/s | 138.0205 Ops/s | $\color{#d91a1a}-0.41\%$ |
| test_reinforce_speed | 7.2451ms | 6.0105ms | 166.3756 Ops/s | 165.7644 Ops/s | $\color{#35bf28}+0.37\%$ |
| test_iql_speed | 28.1981ms | 26.7731ms | 37.3509 Ops/s | 37.4527 Ops/s | $\color{#d91a1a}-0.27\%$ |
| test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.1866ms | 2.4891ms | 401.7546 Ops/s | 405.9560 Ops/s | $\color{#d91a1a}-1.03\%$ |
| test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.7073ms | 2.6784ms | 373.3506 Ops/s | 337.9843 Ops/s | $\textbf{\color{#35bf28}+10.46\%}$ |
| test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.8938ms | 2.6870ms | 372.1615 Ops/s | 377.3090 Ops/s | $\color{#d91a1a}-1.36\%$ |
| test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.6966ms | 2.4933ms | 401.0746 Ops/s | 406.3772 Ops/s | $\color{#d91a1a}-1.30\%$ |
| test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3.7455ms | 2.6770ms | 373.5550 Ops/s | 381.8653 Ops/s | $\color{#d91a1a}-2.18\%$ |
| test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.6580ms | 2.6825ms | 372.7900 Ops/s | 381.2554 Ops/s | $\color{#d91a1a}-2.22\%$ |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.7133ms | 2.4917ms | 401.3353 Ops/s | 407.2810 Ops/s | $\color{#d91a1a}-1.46\%$ |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 4.0498ms | 2.6867ms | 372.1995 Ops/s | 378.6013 Ops/s | $\color{#d91a1a}-1.69\%$ |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 4.2653ms | 2.6692ms | 374.6382 Ops/s | 378.4661 Ops/s | $\color{#d91a1a}-1.01\%$ |
| test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.7463ms | 2.4797ms | 403.2765 Ops/s | 404.9211 Ops/s | $\color{#d91a1a}-0.41\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 4.3363ms | 2.6884ms | 371.9687 Ops/s | 377.6322 Ops/s | $\color{#d91a1a}-1.50\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.7868ms | 2.6853ms | 372.3926 Ops/s | 375.5161 Ops/s | $\color{#d91a1a}-0.83\%$ |
| test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.0524ms | 2.5110ms | 398.2545 Ops/s | 405.9775 Ops/s | $\color{#d91a1a}-1.90\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 4.0958ms | 2.6858ms | 372.3236 Ops/s | 376.3358 Ops/s | $\color{#d91a1a}-1.07\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 4.1082ms | 2.6913ms | 371.5640 Ops/s | 377.3009 Ops/s | $\color{#d91a1a}-1.52\%$ |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.1212ms | 2.5082ms | 398.6939 Ops/s | 405.4963 Ops/s | $\color{#d91a1a}-1.68\%$ |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 3.4763ms | 2.6720ms | 374.2580 Ops/s | 377.2671 Ops/s | $\color{#d91a1a}-0.80\%$ |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 4.3456ms | 2.6866ms | 372.2242 Ops/s | 377.9100 Ops/s | $\color{#d91a1a}-1.50\%$ |
| test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.2118s | 19.3070ms | 51.7947 Ops/s | 51.4316 Ops/s | $\color{#35bf28}+0.71\%$ |
| test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1266s | 17.5779ms | 56.8896 Ops/s | 56.6519 Ops/s | $\color{#35bf28}+0.42\%$ |
| test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 0.1268s | 15.2207ms | 65.7000 Ops/s | 65.4279 Ops/s | $\color{#35bf28}+0.42\%$ |
| test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1279s | 17.5211ms | 57.0741 Ops/s | 56.7561 Ops/s | $\color{#35bf28}+0.56\%$ |
| test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1263s | 17.4445ms | 57.3246 Ops/s | 56.5880 Ops/s | $\color{#35bf28}+1.30\%$ |
| test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 0.1270s | 17.6142ms | 56.7725 Ops/s | 56.5782 Ops/s | $\color{#35bf28}+0.34\%$ |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1282s | 17.6187ms | 56.7578 Ops/s | 56.8660 Ops/s | $\color{#d91a1a}-0.19\%$ |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1267s | 17.4792ms | 57.2107 Ops/s | 65.1859 Ops/s | $\textbf{\color{#d91a1a}-12.23\%}$ |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 0.1266s | 15.2659ms | 65.5056 Ops/s | 56.5796 Ops/s | $\textbf{\color{#35bf28}+15.78\%}$ |

@vmoens merged commit d545364 into main Nov 30, 2023
@vmoens deleted the fix-collector-weights branch November 30, 2023 18:12