[BE] Make better logits in cost tests #2775

Merged (2 commits) on Feb 10, 2025

[ghstack-poisoned]

pytorch-bot bot commented Feb 10, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2775

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the "CLA Signed" label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Feb 10, 2025
vmoens added a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: 2a4257aa39d39dcc9fede4813824e9956138d24c
Pull Request resolved: #2775
vmoens added the "Suitable for minor" (suitable to be integrated in minor release, no new feature) and "BE" (better errors, logs, docs or test utils) labels on Feb 10, 2025

github-actions bot commented Feb 10, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}9$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.5887s 0.4957s 2.0173 Ops/s 2.1123 Ops/s $\color{#d91a1a}-4.50\%$
test_transformed 1.0762s 0.9697s 1.0313 Ops/s 1.0283 Ops/s $\color{#35bf28}+0.29\%$
test_serial 1.4721s 1.4697s 0.6804 Ops/s 0.6727 Ops/s $\color{#35bf28}+1.15\%$
test_parallel 1.4202s 1.3012s 0.7685 Ops/s 0.7814 Ops/s $\color{#d91a1a}-1.65\%$
test_step_mdp_speed[True-True-True-True-True] 0.2157ms 30.1045μs 33.2176 KOps/s 32.7552 KOps/s $\color{#35bf28}+1.41\%$
test_step_mdp_speed[True-True-True-True-False] 68.7690μs 17.9382μs 55.7469 KOps/s 55.5735 KOps/s $\color{#35bf28}+0.31\%$
test_step_mdp_speed[True-True-True-False-True] 68.3670μs 17.3839μs 57.5246 KOps/s 58.0316 KOps/s $\color{#d91a1a}-0.87\%$
test_step_mdp_speed[True-True-True-False-False] 30.3270μs 10.1290μs 98.7267 KOps/s 99.9974 KOps/s $\color{#d91a1a}-1.27\%$
test_step_mdp_speed[True-True-False-True-True] 87.0230μs 32.5355μs 30.7356 KOps/s 30.4391 KOps/s $\color{#35bf28}+0.97\%$
test_step_mdp_speed[True-True-False-True-False] 60.8940μs 19.8143μs 50.4685 KOps/s 50.7124 KOps/s $\color{#d91a1a}-0.48\%$
test_step_mdp_speed[True-True-False-False-True] 73.7880μs 19.0472μs 52.5011 KOps/s 52.4387 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[True-True-False-False-False] 37.3700μs 11.9116μs 83.9517 KOps/s 83.9525 KOps/s $-0.00\%$
test_step_mdp_speed[True-False-True-True-True] 0.1083ms 33.9922μs 29.4185 KOps/s 28.9857 KOps/s $\color{#35bf28}+1.49\%$
test_step_mdp_speed[True-False-True-True-False] 83.2860μs 21.6902μs 46.1037 KOps/s 46.2127 KOps/s $\color{#d91a1a}-0.24\%$
test_step_mdp_speed[True-False-True-False-True] 44.5130μs 19.0762μs 52.4212 KOps/s 52.8393 KOps/s $\color{#d91a1a}-0.79\%$
test_step_mdp_speed[True-False-True-False-False] 61.8560μs 11.9220μs 83.8788 KOps/s 83.3118 KOps/s $\color{#35bf28}+0.68\%$
test_step_mdp_speed[True-False-False-True-True] 89.7970μs 36.0679μs 27.7255 KOps/s 27.8698 KOps/s $\color{#d91a1a}-0.52\%$
test_step_mdp_speed[True-False-False-True-False] 83.8360μs 23.6078μs 42.3588 KOps/s 42.5846 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[True-False-False-False-True] 74.9400μs 21.2610μs 47.0344 KOps/s 48.5781 KOps/s $\color{#d91a1a}-3.18\%$
test_step_mdp_speed[True-False-False-False-False] 52.2670μs 13.8699μs 72.0985 KOps/s 72.5483 KOps/s $\color{#d91a1a}-0.62\%$
test_step_mdp_speed[False-True-True-True-True] 98.2440μs 34.5158μs 28.9722 KOps/s 29.2160 KOps/s $\color{#d91a1a}-0.83\%$
test_step_mdp_speed[False-True-True-True-False] 64.1700μs 21.8101μs 45.8504 KOps/s 45.7323 KOps/s $\color{#35bf28}+0.26\%$
test_step_mdp_speed[False-True-True-False-True] 50.3350μs 21.6721μs 46.1423 KOps/s 45.4996 KOps/s $\color{#35bf28}+1.41\%$
test_step_mdp_speed[False-True-True-False-False] 59.9120μs 13.4791μs 74.1890 KOps/s 74.1877 KOps/s $+0.00\%$
test_step_mdp_speed[False-True-False-True-True] 74.1390μs 35.8157μs 27.9207 KOps/s 27.8885 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[False-True-False-True-False] 84.6090μs 23.4296μs 42.6811 KOps/s 42.6872 KOps/s $\color{#d91a1a}-0.01\%$
test_step_mdp_speed[False-True-False-False-True] 2.6131ms 23.6789μs 42.2317 KOps/s 42.7650 KOps/s $\color{#d91a1a}-1.25\%$
test_step_mdp_speed[False-True-False-False-False] 46.8480μs 15.0922μs 66.2594 KOps/s 65.4303 KOps/s $\color{#35bf28}+1.27\%$
test_step_mdp_speed[False-False-True-True-True] 84.0470μs 37.6286μs 26.5755 KOps/s 26.1339 KOps/s $\color{#35bf28}+1.69\%$
test_step_mdp_speed[False-False-True-True-False] 69.6300μs 25.3933μs 39.3804 KOps/s 39.2378 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[False-False-True-False-True] 59.3310μs 23.7487μs 42.1076 KOps/s 42.9653 KOps/s $\color{#d91a1a}-2.00\%$
test_step_mdp_speed[False-False-True-False-False] 59.4910μs 15.3239μs 65.2576 KOps/s 65.0241 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[False-False-False-True-True] 79.5190μs 39.2405μs 25.4839 KOps/s 24.6018 KOps/s $\color{#35bf28}+3.59\%$
test_step_mdp_speed[False-False-False-True-False] 76.0730μs 26.8239μs 37.2802 KOps/s 37.0852 KOps/s $\color{#35bf28}+0.53\%$
test_step_mdp_speed[False-False-False-False-True] 78.4160μs 24.8709μs 40.2076 KOps/s 39.6108 KOps/s $\color{#35bf28}+1.51\%$
test_step_mdp_speed[False-False-False-False-False] 54.9130μs 16.7530μs 59.6908 KOps/s 59.2870 KOps/s $\color{#35bf28}+0.68\%$
test_values[generalized_advantage_estimate-True-True] 10.2617ms 9.9739ms 100.2612 Ops/s 101.4814 Ops/s $\color{#d91a1a}-1.20\%$
test_values[vec_generalized_advantage_estimate-True-True] 26.7073ms 24.2384ms 41.2569 Ops/s 37.8601 Ops/s $\textbf{\color{#35bf28}+8.97\%}$
test_values[td0_return_estimate-False-False] 0.2632ms 0.2090ms 4.7842 KOps/s 5.2445 KOps/s $\textbf{\color{#d91a1a}-8.78\%}$
test_values[td1_return_estimate-False-False] 28.6953ms 24.8365ms 40.2633 Ops/s 41.2503 Ops/s $\color{#d91a1a}-2.39\%$
test_values[vec_td1_return_estimate-False-False] 26.6256ms 24.3667ms 41.0396 Ops/s 37.7314 Ops/s $\textbf{\color{#35bf28}+8.77\%}$
test_values[td_lambda_return_estimate-True-False] 39.0231ms 35.3425ms 28.2945 Ops/s 28.9644 Ops/s $\color{#d91a1a}-2.31\%$
test_values[vec_td_lambda_return_estimate-True-False] 25.3329ms 24.3194ms 41.1195 Ops/s 36.9131 Ops/s $\textbf{\color{#35bf28}+11.40\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 12.1170ms 8.6011ms 116.2636 Ops/s 119.3765 Ops/s $\color{#d91a1a}-2.61\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.5152ms 1.9677ms 508.2053 Ops/s 511.2475 Ops/s $\color{#d91a1a}-0.60\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5404ms 0.3692ms 2.7086 KOps/s 2.7644 KOps/s $\color{#d91a1a}-2.02\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 46.6153ms 43.0364ms 23.2362 Ops/s 21.5751 Ops/s $\textbf{\color{#35bf28}+7.70\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.2706ms 3.4790ms 287.4359 Ops/s 290.2865 Ops/s $\color{#d91a1a}-0.98\%$
test_dqn_speed[False-None] 2.0809ms 1.4326ms 698.0170 Ops/s 710.9160 Ops/s $\color{#d91a1a}-1.81\%$
test_dqn_speed[False-backward] 2.0348ms 1.9464ms 513.7647 Ops/s 524.1638 Ops/s $\color{#d91a1a}-1.98\%$
test_dqn_speed[True-None] 0.1729s 0.5830ms 1.7152 KOps/s 2.0453 KOps/s $\textbf{\color{#d91a1a}-16.14\%}$
test_dqn_speed[True-backward] 1.0222ms 0.9493ms 1.0534 KOps/s 1.0978 KOps/s $\color{#d91a1a}-4.04\%$
test_dqn_speed[reduce-overhead-None] 0.8198ms 0.4893ms 2.0437 KOps/s 2.0517 KOps/s $\color{#d91a1a}-0.39\%$
test_dqn_speed[reduce-overhead-backward] 0.9749ms 0.9220ms 1.0846 KOps/s 1.0783 KOps/s $\color{#35bf28}+0.58\%$
test_ddpg_speed[False-None] 3.3097ms 2.9341ms 340.8180 Ops/s 345.5964 Ops/s $\color{#d91a1a}-1.38\%$
test_ddpg_speed[False-backward] 5.3081ms 4.1770ms 239.4083 Ops/s 248.4516 Ops/s $\color{#d91a1a}-3.64\%$
test_ddpg_speed[True-None] 1.6189ms 1.2426ms 804.7835 Ops/s 800.3981 Ops/s $\color{#35bf28}+0.55\%$
test_ddpg_speed[True-backward] 2.3162ms 2.2153ms 451.4046 Ops/s 469.8998 Ops/s $\color{#d91a1a}-3.94\%$
test_ddpg_speed[reduce-overhead-None] 1.8396ms 1.2545ms 797.1592 Ops/s 745.1748 Ops/s $\textbf{\color{#35bf28}+6.98\%}$
test_ddpg_speed[reduce-overhead-backward] 2.4314ms 2.2607ms 442.3372 Ops/s 458.6135 Ops/s $\color{#d91a1a}-3.55\%$
test_sac_speed[False-None] 10.5105ms 8.2885ms 120.6496 Ops/s 120.2981 Ops/s $\color{#35bf28}+0.29\%$
test_sac_speed[False-backward] 11.9236ms 11.2087ms 89.2165 Ops/s 90.8432 Ops/s $\color{#d91a1a}-1.79\%$
test_sac_speed[True-None] 2.6008ms 2.2302ms 448.3818 Ops/s 468.7559 Ops/s $\color{#d91a1a}-4.35\%$
test_sac_speed[True-backward] 4.3932ms 4.1625ms 240.2395 Ops/s 263.9815 Ops/s $\textbf{\color{#d91a1a}-8.99\%}$
test_sac_speed[reduce-overhead-None] 2.8196ms 2.2569ms 443.0858 Ops/s 459.0410 Ops/s $\color{#d91a1a}-3.48\%$
test_sac_speed[reduce-overhead-backward] 4.1297ms 3.9824ms 251.1042 Ops/s 251.3612 Ops/s $\color{#d91a1a}-0.10\%$
test_redq_speed[False-None] 14.1659ms 13.3294ms 75.0224 Ops/s 70.2220 Ops/s $\textbf{\color{#35bf28}+6.84\%}$
test_redq_speed[False-backward] 24.5541ms 22.8959ms 43.6759 Ops/s 41.7619 Ops/s $\color{#35bf28}+4.58\%$
test_redq_speed[True-None] 6.2997ms 5.6996ms 175.4500 Ops/s 184.4734 Ops/s $\color{#d91a1a}-4.89\%$
test_redq_speed[True-backward] 13.8735ms 13.2893ms 75.2483 Ops/s 75.3082 Ops/s $\color{#d91a1a}-0.08\%$
test_redq_speed[reduce-overhead-None] 6.8676ms 6.1111ms 163.6369 Ops/s 176.7420 Ops/s $\textbf{\color{#d91a1a}-7.41\%}$
test_redq_speed[reduce-overhead-backward] 14.4253ms 13.4851ms 74.1559 Ops/s 76.1026 Ops/s $\color{#d91a1a}-2.56\%$
test_redq_deprec_speed[False-None] 18.6478ms 14.0646ms 71.1006 Ops/s 73.2789 Ops/s $\color{#d91a1a}-2.97\%$
test_redq_deprec_speed[False-backward] 21.0512ms 19.8453ms 50.3897 Ops/s 51.0715 Ops/s $\color{#d91a1a}-1.34\%$
test_redq_deprec_speed[True-None] 5.6594ms 4.3662ms 229.0341 Ops/s 240.8155 Ops/s $\color{#d91a1a}-4.89\%$
test_redq_deprec_speed[True-backward] 10.9696ms 9.5977ms 104.1917 Ops/s 112.4713 Ops/s $\textbf{\color{#d91a1a}-7.36\%}$
test_redq_deprec_speed[reduce-overhead-None] 7.1245ms 4.3690ms 228.8854 Ops/s 240.7011 Ops/s $\color{#d91a1a}-4.91\%$
test_redq_deprec_speed[reduce-overhead-backward] 10.5114ms 9.4905ms 105.3687 Ops/s 110.2403 Ops/s $\color{#d91a1a}-4.42\%$
test_td3_speed[False-None] 8.8142ms 8.3702ms 119.4709 Ops/s 117.7296 Ops/s $\color{#35bf28}+1.48\%$
test_td3_speed[False-backward] 13.0272ms 11.1081ms 90.0246 Ops/s 93.1296 Ops/s $\color{#d91a1a}-3.33\%$
test_td3_speed[True-None] 2.2301ms 1.8822ms 531.2987 Ops/s 530.9638 Ops/s $\color{#35bf28}+0.06\%$
test_td3_speed[True-backward] 3.9769ms 3.6997ms 270.2903 Ops/s 285.6793 Ops/s $\textbf{\color{#d91a1a}-5.39\%}$
test_td3_speed[reduce-overhead-None] 2.1630ms 1.8571ms 538.4783 Ops/s 535.2336 Ops/s $\color{#35bf28}+0.61\%$
test_td3_speed[reduce-overhead-backward] 3.9743ms 3.6442ms 274.4089 Ops/s 280.8444 Ops/s $\color{#d91a1a}-2.29\%$
test_cql_speed[False-None] 39.7232ms 37.0299ms 27.0052 Ops/s 26.9295 Ops/s $\color{#35bf28}+0.28\%$
test_cql_speed[False-backward] 56.2510ms 47.0350ms 21.2608 Ops/s 20.7312 Ops/s $\color{#35bf28}+2.55\%$
test_cql_speed[True-None] 18.0646ms 16.6593ms 60.0264 Ops/s 59.8415 Ops/s $\color{#35bf28}+0.31\%$
test_cql_speed[True-backward] 25.8990ms 23.6931ms 42.2064 Ops/s 42.5812 Ops/s $\color{#d91a1a}-0.88\%$
test_cql_speed[reduce-overhead-None] 18.1204ms 16.4099ms 60.9389 Ops/s 60.8372 Ops/s $\color{#35bf28}+0.17\%$
test_cql_speed[reduce-overhead-backward] 25.4616ms 23.3784ms 42.7744 Ops/s 41.9521 Ops/s $\color{#35bf28}+1.96\%$
test_a2c_speed[False-None] 8.9270ms 7.5803ms 131.9217 Ops/s 132.6708 Ops/s $\color{#d91a1a}-0.56\%$
test_a2c_speed[False-backward] 16.1544ms 14.6526ms 68.2471 Ops/s 65.5302 Ops/s $\color{#35bf28}+4.15\%$
test_a2c_speed[True-None] 4.6983ms 3.7842ms 264.2533 Ops/s 238.7064 Ops/s $\textbf{\color{#35bf28}+10.70\%}$
test_a2c_speed[True-backward] 11.6795ms 10.3799ms 96.3404 Ops/s 96.6488 Ops/s $\color{#d91a1a}-0.32\%$
test_a2c_speed[reduce-overhead-None] 4.3092ms 3.7535ms 266.4205 Ops/s 266.5078 Ops/s $\color{#d91a1a}-0.03\%$
test_a2c_speed[reduce-overhead-backward] 11.5336ms 10.2616ms 97.4507 Ops/s 96.2381 Ops/s $\color{#35bf28}+1.26\%$
test_ppo_speed[False-None] 8.7122ms 7.6606ms 130.5388 Ops/s 131.7571 Ops/s $\color{#d91a1a}-0.92\%$
test_ppo_speed[False-backward] 17.8185ms 15.2975ms 65.3702 Ops/s 66.3094 Ops/s $\color{#d91a1a}-1.42\%$
test_ppo_speed[True-None] 5.5611ms 4.2028ms 237.9381 Ops/s 238.6727 Ops/s $\color{#d91a1a}-0.31\%$
test_ppo_speed[True-backward] 11.6253ms 10.2734ms 97.3391 Ops/s 98.9925 Ops/s $\color{#d91a1a}-1.67\%$
test_ppo_speed[reduce-overhead-None] 4.4779ms 4.1380ms 241.6651 Ops/s 240.7188 Ops/s $\color{#35bf28}+0.39\%$
test_ppo_speed[reduce-overhead-backward] 10.7869ms 10.0203ms 99.7971 Ops/s 96.0501 Ops/s $\color{#35bf28}+3.90\%$
test_reinforce_speed[False-None] 6.9200ms 6.6069ms 151.3565 Ops/s 149.2070 Ops/s $\color{#35bf28}+1.44\%$
test_reinforce_speed[False-backward] 11.4797ms 10.0025ms 99.9749 Ops/s 97.6784 Ops/s $\color{#35bf28}+2.35\%$
test_reinforce_speed[True-None] 3.6677ms 3.2212ms 310.4435 Ops/s 321.0802 Ops/s $\color{#d91a1a}-3.31\%$
test_reinforce_speed[True-backward] 10.2220ms 9.5005ms 105.2577 Ops/s 106.7825 Ops/s $\color{#d91a1a}-1.43\%$
test_reinforce_speed[reduce-overhead-None] 3.7158ms 3.1539ms 317.0686 Ops/s 317.8952 Ops/s $\color{#d91a1a}-0.26\%$
test_reinforce_speed[reduce-overhead-backward] 10.4902ms 9.4134ms 106.2320 Ops/s 108.2662 Ops/s $\color{#d91a1a}-1.88\%$
test_iql_speed[False-None] 40.4428ms 33.7097ms 29.6650 Ops/s 30.1194 Ops/s $\color{#d91a1a}-1.51\%$
test_iql_speed[False-backward] 53.1848ms 46.3474ms 21.5762 Ops/s 21.5398 Ops/s $\color{#35bf28}+0.17\%$
test_iql_speed[True-None] 12.5062ms 11.5322ms 86.7136 Ops/s 86.7830 Ops/s $\color{#d91a1a}-0.08\%$
test_iql_speed[True-backward] 24.2097ms 22.7694ms 43.9186 Ops/s 43.2489 Ops/s $\color{#35bf28}+1.55\%$
test_iql_speed[reduce-overhead-None] 12.7342ms 11.4994ms 86.9613 Ops/s 85.1827 Ops/s $\color{#35bf28}+2.09\%$
test_iql_speed[reduce-overhead-backward] 37.1061ms 23.3862ms 42.7602 Ops/s 41.0750 Ops/s $\color{#35bf28}+4.10\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.8840ms 5.1170ms 195.4277 Ops/s 192.6624 Ops/s $\color{#35bf28}+1.44\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7792ms 0.5257ms 1.9023 KOps/s 1.8933 KOps/s $\color{#35bf28}+0.48\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8774ms 0.4991ms 2.0036 KOps/s 1.9737 KOps/s $\color{#35bf28}+1.51\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1580ms 4.7829ms 209.0780 Ops/s 198.9001 Ops/s $\textbf{\color{#35bf28}+5.12\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.9229ms 0.5072ms 1.9715 KOps/s 1.9541 KOps/s $\color{#35bf28}+0.89\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8248ms 0.4840ms 2.0663 KOps/s 1.9827 KOps/s $\color{#35bf28}+4.22\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.3884ms 1.6608ms 602.1327 Ops/s 599.4440 Ops/s $\color{#35bf28}+0.45\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.7424ms 1.6036ms 623.6040 Ops/s 624.5704 Ops/s $\color{#d91a1a}-0.15\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.4735ms 4.9495ms 202.0396 Ops/s 184.2884 Ops/s $\textbf{\color{#35bf28}+9.63\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.4795ms 0.6679ms 1.4973 KOps/s 1.4805 KOps/s $\color{#35bf28}+1.14\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8404ms 0.6257ms 1.5982 KOps/s 1.5322 KOps/s $\color{#35bf28}+4.31\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.1998ms 4.7676ms 209.7507 Ops/s 204.9228 Ops/s $\color{#35bf28}+2.36\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.6320ms 0.5188ms 1.9277 KOps/s 1.8772 KOps/s $\color{#35bf28}+2.69\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8040ms 0.5005ms 1.9982 KOps/s 1.9717 KOps/s $\color{#35bf28}+1.34\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1986ms 4.8397ms 206.6239 Ops/s 201.4203 Ops/s $\color{#35bf28}+2.58\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8954ms 0.5056ms 1.9778 KOps/s 1.8695 KOps/s $\textbf{\color{#35bf28}+5.79\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8297ms 0.4950ms 2.0204 KOps/s 2.0114 KOps/s $\color{#35bf28}+0.44\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.6281ms 4.9769ms 200.9297 Ops/s 197.7569 Ops/s $\color{#35bf28}+1.60\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2622ms 0.6792ms 1.4722 KOps/s 1.5384 KOps/s $\color{#d91a1a}-4.30\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9222ms 0.6425ms 1.5564 KOps/s 1.5752 KOps/s $\color{#d91a1a}-1.20\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 10.9420ms 4.7564ms 210.2414 Ops/s 236.8571 Ops/s $\textbf{\color{#d91a1a}-11.24\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.8184ms 2.3994ms 416.7771 Ops/s 428.2550 Ops/s $\color{#d91a1a}-2.68\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.8601ms 1.2738ms 785.0251 Ops/s 786.4722 Ops/s $\color{#d91a1a}-0.18\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5045s 14.7375ms 67.8542 Ops/s 228.4315 Ops/s $\textbf{\color{#d91a1a}-70.30\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.5456ms 2.2173ms 450.9938 Ops/s 429.9718 Ops/s $\color{#35bf28}+4.89\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.2740ms 1.4400ms 694.4463 Ops/s 740.6738 Ops/s $\textbf{\color{#d91a1a}-6.24\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.6437ms 4.6044ms 217.1857 Ops/s 226.9508 Ops/s $\color{#d91a1a}-4.30\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.2342ms 2.4598ms 406.5390 Ops/s 401.6822 Ops/s $\color{#35bf28}+1.21\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.3735ms 1.4997ms 666.8034 Ops/s 655.6100 Ops/s $\color{#35bf28}+1.71\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.5104ms 12.1849ms 82.0686 Ops/s 83.1215 Ops/s $\color{#d91a1a}-1.27\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 17.5585ms 14.5664ms 68.6509 Ops/s 70.0887 Ops/s $\color{#d91a1a}-2.05\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.2807ms 21.0054ms 47.6067 Ops/s 46.9596 Ops/s $\color{#35bf28}+1.38\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 15.5647ms 14.4510ms 69.1992 Ops/s 68.4857 Ops/s $\color{#35bf28}+1.04\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 22.8950ms 21.0453ms 47.5166 Ops/s 48.3434 Ops/s $\color{#d91a1a}-1.71\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.8032ms 15.8707ms 63.0090 Ops/s 63.0061 Ops/s $+0.00\%$
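
The Change column above looks consistent with a relative throughput difference against the repo HEAD baseline, and the Improved/Worsened totals match the rows printed in bold. The sketch below reproduces that arithmetic for three rows of the CPU table. The function names and the 5% threshold are assumptions inferred from the reported numbers, not taken from the benchmark workflow itself.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput (ops/s) relative to the repo HEAD value."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the bold rows."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) copied from the CPU table, converted to Ops/s.
rows = {
    "test_simple": (2.0173, 2.1123),
    "test_values[vec_generalized_advantage_estimate-True-True]": (41.2569, 37.8601),
    "test_dqn_speed[True-None]": (1715.2, 2045.3),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, a row such as `test_redq_deprec_speed[reduce-overhead-None]` at -4.91% stays outside the Worsened count, which agrees with it not being bold in the table.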

github-actions bot commented Feb 10, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}19$. Worsened: $\large\color{#d91a1a}10$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.8858s 0.7985s 1.2524 Ops/s 1.2460 Ops/s $\color{#35bf28}+0.51\%$
test_transformed 1.4909s 1.3998s 0.7144 Ops/s 0.7149 Ops/s $\color{#d91a1a}-0.08\%$
test_serial 2.4020s 2.3135s 0.4322 Ops/s 0.4346 Ops/s $\color{#d91a1a}-0.55\%$
test_parallel 1.9804s 1.8419s 0.5429 Ops/s 0.5414 Ops/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[True-True-True-True-True] 0.2055ms 40.4270μs 24.7360 KOps/s 25.6716 KOps/s $\color{#d91a1a}-3.64\%$
test_step_mdp_speed[True-True-True-True-False] 49.0110μs 23.6429μs 42.2960 KOps/s 42.8212 KOps/s $\color{#d91a1a}-1.23\%$
test_step_mdp_speed[True-True-True-False-True] 53.7510μs 21.9963μs 45.4623 KOps/s 44.8935 KOps/s $\color{#35bf28}+1.27\%$
test_step_mdp_speed[True-True-True-False-False] 38.6410μs 12.9372μs 77.2967 KOps/s 76.8143 KOps/s $\color{#35bf28}+0.63\%$
test_step_mdp_speed[True-True-False-True-True] 75.4010μs 41.9049μs 23.8636 KOps/s 23.6138 KOps/s $\color{#35bf28}+1.06\%$
test_step_mdp_speed[True-True-False-True-False] 55.2100μs 25.5080μs 39.2034 KOps/s 39.0757 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[True-True-False-False-True] 60.0410μs 24.7429μs 40.4156 KOps/s 41.0013 KOps/s $\color{#d91a1a}-1.43\%$
test_step_mdp_speed[True-True-False-False-False] 46.5910μs 15.2828μs 65.4328 KOps/s 64.8651 KOps/s $\color{#35bf28}+0.88\%$
test_step_mdp_speed[True-False-True-True-True] 86.3820μs 44.5134μs 22.4651 KOps/s 22.3469 KOps/s $\color{#35bf28}+0.53\%$
test_step_mdp_speed[True-False-True-True-False] 56.6910μs 27.6958μs 36.1066 KOps/s 35.3479 KOps/s $\color{#35bf28}+2.15\%$
test_step_mdp_speed[True-False-True-False-True] 52.8710μs 24.3873μs 41.0050 KOps/s 40.7160 KOps/s $\color{#35bf28}+0.71\%$
test_step_mdp_speed[True-False-True-False-False] 50.2710μs 15.0970μs 66.2383 KOps/s 64.7315 KOps/s $\color{#35bf28}+2.33\%$
test_step_mdp_speed[True-False-False-True-True] 77.0310μs 46.4874μs 21.5112 KOps/s 21.2006 KOps/s $\color{#35bf28}+1.47\%$
test_step_mdp_speed[True-False-False-True-False] 61.9910μs 29.9987μs 33.3348 KOps/s 33.3189 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[True-False-False-False-True] 65.9410μs 26.3513μs 37.9488 KOps/s 37.4959 KOps/s $\color{#35bf28}+1.21\%$
test_step_mdp_speed[True-False-False-False-False] 48.3510μs 17.6940μs 56.5165 KOps/s 57.7606 KOps/s $\color{#d91a1a}-2.15\%$
test_step_mdp_speed[False-True-True-True-True] 76.6510μs 43.9637μs 22.7460 KOps/s 22.6487 KOps/s $\color{#35bf28}+0.43\%$
test_step_mdp_speed[False-True-True-True-False] 60.8810μs 27.7710μs 36.0087 KOps/s 35.7001 KOps/s $\color{#35bf28}+0.86\%$
test_step_mdp_speed[False-True-True-False-True] 57.1010μs 28.1028μs 35.5836 KOps/s 35.2279 KOps/s $\color{#35bf28}+1.01\%$
test_step_mdp_speed[False-True-True-False-False] 53.5210μs 16.9413μs 59.0273 KOps/s 58.9402 KOps/s $\color{#35bf28}+0.15\%$
test_step_mdp_speed[False-True-False-True-True] 80.2610μs 46.1455μs 21.6706 KOps/s 21.3578 KOps/s $\color{#35bf28}+1.46\%$
test_step_mdp_speed[False-True-False-True-False] 57.4810μs 29.7775μs 33.5824 KOps/s 33.6206 KOps/s $\color{#d91a1a}-0.11\%$
test_step_mdp_speed[False-True-False-False-True] 3.1964ms 31.1395μs 32.1135 KOps/s 32.1853 KOps/s $\color{#d91a1a}-0.22\%$
test_step_mdp_speed[False-True-False-False-False] 57.8110μs 19.3196μs 51.7610 KOps/s 51.8445 KOps/s $\color{#d91a1a}-0.16\%$
test_step_mdp_speed[False-False-True-True-True] 0.1246ms 49.0017μs 20.4074 KOps/s 20.2622 KOps/s $\color{#35bf28}+0.72\%$
test_step_mdp_speed[False-False-True-True-False] 60.2610μs 32.3729μs 30.8901 KOps/s 30.8417 KOps/s $\color{#35bf28}+0.16\%$
test_step_mdp_speed[False-False-True-False-True] 67.1110μs 30.1124μs 33.2089 KOps/s 32.6997 KOps/s $\color{#35bf28}+1.56\%$
test_step_mdp_speed[False-False-True-False-False] 44.9510μs 18.9960μs 52.6426 KOps/s 51.9613 KOps/s $\color{#35bf28}+1.31\%$
test_step_mdp_speed[False-False-False-True-True] 86.4710μs 50.3760μs 19.8507 KOps/s 19.5965 KOps/s $\color{#35bf28}+1.30\%$
test_step_mdp_speed[False-False-False-True-False] 68.2210μs 34.4158μs 29.0564 KOps/s 28.6851 KOps/s $\color{#35bf28}+1.29\%$
test_step_mdp_speed[False-False-False-False-True] 68.4910μs 32.1675μs 31.0873 KOps/s 31.2036 KOps/s $\color{#d91a1a}-0.37\%$
test_step_mdp_speed[False-False-False-False-False] 50.5910μs 21.0785μs 47.4418 KOps/s 47.0044 KOps/s $\color{#35bf28}+0.93\%$
test_values[generalized_advantage_estimate-True-True] 25.4100ms 25.0755ms 39.8795 Ops/s 39.5407 Ops/s $\color{#35bf28}+0.86\%$
test_values[vec_generalized_advantage_estimate-True-True] 98.1657ms 2.8730ms 348.0729 Ops/s 315.5820 Ops/s $\textbf{\color{#35bf28}+10.30\%}$
test_values[td0_return_estimate-False-False] 0.1062ms 81.3084μs 12.2988 KOps/s 11.7593 KOps/s $\color{#35bf28}+4.59\%$
test_values[td1_return_estimate-False-False] 57.7089ms 55.6492ms 17.9697 Ops/s 17.7041 Ops/s $\color{#35bf28}+1.50\%$
test_values[vec_td1_return_estimate-False-False] 1.2931ms 1.0866ms 920.2755 Ops/s 914.7595 Ops/s $\color{#35bf28}+0.60\%$
test_values[td_lambda_return_estimate-True-False] 92.8850ms 88.2804ms 11.3275 Ops/s 11.1698 Ops/s $\color{#35bf28}+1.41\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2286ms 1.0800ms 925.9455 Ops/s 917.2054 Ops/s $\color{#35bf28}+0.95\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 26.9137ms 24.9797ms 40.0325 Ops/s 39.9588 Ops/s $\color{#35bf28}+0.18\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0200ms 0.7524ms 1.3291 KOps/s 1.3191 KOps/s $\color{#35bf28}+0.75\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7606ms 0.6714ms 1.4894 KOps/s 1.4748 KOps/s $\color{#35bf28}+0.99\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5450ms 1.4853ms 673.2676 Ops/s 672.7956 Ops/s $\color{#35bf28}+0.07\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7229ms 0.6873ms 1.4550 KOps/s 1.4464 KOps/s $\color{#35bf28}+0.60\%$
test_dqn_speed[False-None] 7.0033ms 1.5378ms 650.2957 Ops/s 641.7215 Ops/s $\color{#35bf28}+1.34\%$
test_dqn_speed[False-backward] 2.1916ms 2.1395ms 467.3937 Ops/s 462.1061 Ops/s $\color{#35bf28}+1.14\%$
test_dqn_speed[True-None] 0.7220ms 0.5726ms 1.7464 KOps/s 1.7802 KOps/s $\color{#d91a1a}-1.90\%$
test_dqn_speed[True-backward] 1.1446ms 1.1021ms 907.3932 Ops/s 810.9935 Ops/s $\textbf{\color{#35bf28}+11.89\%}$
test_dqn_speed[reduce-overhead-None] 0.9711ms 0.5661ms 1.7665 KOps/s 1.7658 KOps/s $\color{#35bf28}+0.04\%$
test_dqn_speed[reduce-overhead-backward] 0.9782ms 0.9529ms 1.0495 KOps/s 927.3594 Ops/s $\textbf{\color{#35bf28}+13.17\%}$
test_ddpg_speed[False-None] 3.1752ms 2.8819ms 346.9967 Ops/s 345.1757 Ops/s $\color{#35bf28}+0.53\%$
test_ddpg_speed[False-backward] 4.5785ms 4.1612ms 240.3178 Ops/s 233.2173 Ops/s $\color{#35bf28}+3.04\%$
test_ddpg_speed[True-None] 1.4663ms 1.3362ms 748.4165 Ops/s 744.1175 Ops/s $\color{#35bf28}+0.58\%$
test_ddpg_speed[True-backward] 2.4601ms 2.4086ms 415.1771 Ops/s 385.1346 Ops/s $\textbf{\color{#35bf28}+7.80\%}$
test_ddpg_speed[reduce-overhead-None] 1.4800ms 1.3460ms 742.9311 Ops/s 737.7995 Ops/s $\color{#35bf28}+0.70\%$
test_ddpg_speed[reduce-overhead-backward] 1.9377ms 1.8779ms 532.5229 Ops/s 486.5549 Ops/s $\textbf{\color{#35bf28}+9.45\%}$
test_sac_speed[False-None] 8.5349ms 8.1229ms 123.1093 Ops/s 121.7323 Ops/s $\color{#35bf28}+1.13\%$
test_sac_speed[False-backward] 11.6556ms 11.0762ms 90.2838 Ops/s 87.2122 Ops/s $\color{#35bf28}+3.52\%$
test_sac_speed[True-None] 2.1142ms 1.8233ms 548.4547 Ops/s 537.6845 Ops/s $\color{#35bf28}+2.00\%$
test_sac_speed[True-backward] 4.0312ms 3.5481ms 281.8376 Ops/s 264.6340 Ops/s $\textbf{\color{#35bf28}+6.50\%}$
test_sac_speed[reduce-overhead-None] 21.5074ms 12.0558ms 82.9474 Ops/s 84.5176 Ops/s $\color{#d91a1a}-1.86\%$
test_sac_speed[reduce-overhead-backward] 1.7188ms 1.6311ms 613.0713 Ops/s 548.6275 Ops/s $\textbf{\color{#35bf28}+11.75\%}$
test_redq_speed[False-None] 8.0261ms 7.5435ms 132.5639 Ops/s 130.5600 Ops/s $\color{#35bf28}+1.53\%$
test_redq_speed[False-backward] 11.7981ms 11.4181ms 87.5801 Ops/s 83.3573 Ops/s $\textbf{\color{#35bf28}+5.07\%}$
test_redq_speed[True-None] 2.4188ms 2.3000ms 434.7806 Ops/s 426.1126 Ops/s $\color{#35bf28}+2.03\%$
test_redq_speed[True-backward] 4.4358ms 3.9911ms 250.5552 Ops/s 242.1950 Ops/s $\color{#35bf28}+3.45\%$
test_redq_speed[reduce-overhead-None] 2.4816ms 2.3302ms 429.1510 Ops/s 421.9054 Ops/s $\color{#35bf28}+1.72\%$
test_redq_speed[reduce-overhead-backward] 4.4861ms 4.0210ms 248.6960 Ops/s 239.9055 Ops/s $\color{#35bf28}+3.66\%$
test_redq_deprec_speed[False-None] 9.5613ms 9.1186ms 109.6659 Ops/s 108.4483 Ops/s $\color{#35bf28}+1.12\%$
test_redq_deprec_speed[False-backward] 12.4271ms 12.0890ms 82.7198 Ops/s 80.9533 Ops/s $\color{#35bf28}+2.18\%$
test_redq_deprec_speed[True-None] 2.7323ms 2.6221ms 381.3797 Ops/s 376.5205 Ops/s $\color{#35bf28}+1.29\%$
test_redq_deprec_speed[True-backward] 4.3590ms 4.2721ms 234.0785 Ops/s 225.5003 Ops/s $\color{#35bf28}+3.80\%$
test_redq_deprec_speed[reduce-overhead-None] 2.7797ms 2.6219ms 381.4079 Ops/s 372.1782 Ops/s $\color{#35bf28}+2.48\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.7766ms 4.3063ms 232.2190 Ops/s 227.0812 Ops/s $\color{#35bf28}+2.26\%$
test_td3_speed[False-None] 8.2406ms 7.9887ms 125.1767 Ops/s 123.9402 Ops/s $\color{#35bf28}+1.00\%$
test_td3_speed[False-backward] 10.8480ms 10.3152ms 96.9445 Ops/s 95.4875 Ops/s $\color{#35bf28}+1.53\%$
test_td3_speed[True-None] 1.6539ms 1.6160ms 618.7998 Ops/s 598.2599 Ops/s $\color{#35bf28}+3.43\%$
test_td3_speed[True-backward] 3.2561ms 3.1457ms 317.8990 Ops/s 304.5210 Ops/s $\color{#35bf28}+4.39\%$
test_td3_speed[reduce-overhead-None] 51.5013ms 26.2746ms 38.0596 Ops/s 38.6581 Ops/s $\color{#d91a1a}-1.55\%$
test_td3_speed[reduce-overhead-backward] 1.4330ms 1.3519ms 739.7236 Ops/s 728.5256 Ops/s $\color{#35bf28}+1.54\%$
test_cql_speed[False-None] 17.5303ms 16.8708ms 59.2742 Ops/s 58.7575 Ops/s $\color{#35bf28}+0.88\%$
test_cql_speed[False-backward] 22.6050ms 22.0369ms 45.3784 Ops/s 44.8550 Ops/s $\color{#35bf28}+1.17\%$
test_cql_speed[True-None] 3.4310ms 3.2496ms 307.7290 Ops/s 294.3796 Ops/s $\color{#35bf28}+4.53\%$
test_cql_speed[True-backward] 6.0671ms 5.6500ms 176.9920 Ops/s 175.5180 Ops/s $\color{#35bf28}+0.84\%$
test_cql_speed[reduce-overhead-None] 21.7364ms 13.1060ms 76.3008 Ops/s 75.9918 Ops/s $\color{#35bf28}+0.41\%$
test_cql_speed[reduce-overhead-backward] 2.1324ms 1.9807ms 504.8606 Ops/s 527.9890 Ops/s $\color{#d91a1a}-4.38\%$
test_a2c_speed[False-None] 3.2845ms 3.1981ms 312.6900 Ops/s 300.0318 Ops/s $\color{#35bf28}+4.22\%$
test_a2c_speed[False-backward] 7.0964ms 6.4335ms 155.4355 Ops/s 157.6428 Ops/s $\color{#d91a1a}-1.40\%$
test_a2c_speed[True-None] 1.4277ms 1.3367ms 748.1200 Ops/s 732.3028 Ops/s $\color{#35bf28}+2.16\%$
test_a2c_speed[True-backward] 3.0584ms 3.0024ms 333.0645 Ops/s 334.9039 Ops/s $\color{#d91a1a}-0.55\%$
test_a2c_speed[reduce-overhead-None] 15.8083ms 8.9705ms 111.4768 Ops/s 112.6434 Ops/s $\color{#d91a1a}-1.04\%$
test_a2c_speed[reduce-overhead-backward] 1.7237ms 1.5998ms 625.0932 Ops/s 679.5081 Ops/s $\textbf{\color{#d91a1a}-8.01\%}$
test_ppo_speed[False-None] 3.8203ms 3.7222ms 268.6615 Ops/s 263.8622 Ops/s $\color{#35bf28}+1.82\%$
test_ppo_speed[False-backward] 7.6867ms 7.1756ms 139.3614 Ops/s 143.2442 Ops/s $\color{#d91a1a}-2.71\%$
test_ppo_speed[True-None] 1.5084ms 1.4022ms 713.1755 Ops/s 706.4975 Ops/s $\color{#35bf28}+0.95\%$
test_ppo_speed[True-backward] 3.5635ms 3.1923ms 313.2577 Ops/s 320.0816 Ops/s $\color{#d91a1a}-2.13\%$
test_ppo_speed[reduce-overhead-None] 1.0346ms 0.9540ms 1.0482 KOps/s 1.0352 KOps/s $\color{#35bf28}+1.26\%$
test_ppo_speed[reduce-overhead-backward] 1.7047ms 1.5580ms 641.8533 Ops/s 682.7168 Ops/s $\textbf{\color{#d91a1a}-5.99\%}$
test_reinforce_speed[False-None] 2.3881ms 2.2847ms 437.6986 Ops/s 431.4438 Ops/s $\color{#35bf28}+1.45\%$
test_reinforce_speed[False-backward] 3.8440ms 3.4380ms 290.8645 Ops/s 295.4741 Ops/s $\color{#d91a1a}-1.56\%$
test_reinforce_speed[True-None] 1.8124ms 1.2830ms 779.4430 Ops/s 756.2552 Ops/s $\color{#35bf28}+3.07\%$
test_reinforce_speed[True-backward] 3.1126ms 3.0381ms 329.1497 Ops/s 341.9800 Ops/s $\color{#d91a1a}-3.75\%$
test_reinforce_speed[reduce-overhead-None] 18.8463ms 10.0832ms 99.1750 Ops/s 99.8951 Ops/s $\color{#d91a1a}-0.72\%$
test_reinforce_speed[reduce-overhead-backward] 1.7231ms 1.6213ms 616.7888 Ops/s 652.4461 Ops/s $\textbf{\color{#d91a1a}-5.47\%}$
test_iql_speed[False-None] 9.6932ms 9.2380ms 108.2482 Ops/s 106.4771 Ops/s $\color{#35bf28}+1.66\%$
test_iql_speed[False-backward] 13.6895ms 13.1557ms 76.0128 Ops/s 76.4603 Ops/s $\color{#d91a1a}-0.59\%$
test_iql_speed[True-None] 2.2958ms 2.1876ms 457.1218 Ops/s 438.3494 Ops/s $\color{#35bf28}+4.28\%$
test_iql_speed[True-backward] 5.2169ms 4.8279ms 207.1290 Ops/s 202.1216 Ops/s $\color{#35bf28}+2.48\%$
test_iql_speed[reduce-overhead-None] 18.9153ms 11.1686ms 89.5371 Ops/s 90.8417 Ops/s $\color{#d91a1a}-1.44\%$
test_iql_speed[reduce-overhead-backward] 2.1060ms 2.0575ms 486.0213 Ops/s 501.1323 Ops/s $\color{#d91a1a}-3.02\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.7601ms 6.4149ms 155.8859 Ops/s 154.5876 Ops/s $\color{#35bf28}+0.84\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6579ms 0.2805ms 3.5648 KOps/s 3.2735 KOps/s $\textbf{\color{#35bf28}+8.90\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5185ms 0.2587ms 3.8653 KOps/s 3.4410 KOps/s $\textbf{\color{#35bf28}+12.33\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4156ms 6.0655ms 164.8674 Ops/s 164.1156 Ops/s $\color{#35bf28}+0.46\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0940ms 0.3311ms 3.0200 KOps/s 3.6993 KOps/s $\textbf{\color{#d91a1a}-18.36\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5543ms 0.2953ms 3.3868 KOps/s 3.9785 KOps/s $\textbf{\color{#d91a1a}-14.87\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5003ms 1.2647ms 790.7046 Ops/s 780.9608 Ops/s $\color{#35bf28}+1.25\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4592ms 1.2524ms 798.4437 Ops/s 839.1486 Ops/s $\color{#d91a1a}-4.85\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4718ms 6.2808ms 159.2164 Ops/s 158.8548 Ops/s $\color{#35bf28}+0.23\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0562ms 0.4200ms 2.3808 KOps/s 1.9840 KOps/s $\textbf{\color{#35bf28}+20.00\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6752ms 0.4256ms 2.3498 KOps/s 2.0815 KOps/s $\textbf{\color{#35bf28}+12.89\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2334ms 6.1277ms 163.1932 Ops/s 161.2413 Ops/s $\color{#35bf28}+1.21\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.1429ms 0.3078ms 3.2484 KOps/s 3.5938 KOps/s $\textbf{\color{#d91a1a}-9.61\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6834ms 0.3059ms 3.2693 KOps/s 3.7808 KOps/s $\textbf{\color{#d91a1a}-13.53\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5196ms 6.0572ms 165.0917 Ops/s 163.5854 Ops/s $\color{#35bf28}+0.92\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.8036ms 0.2950ms 3.3894 KOps/s 3.7391 KOps/s $\textbf{\color{#d91a1a}-9.35\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5575ms 0.2999ms 3.3343 KOps/s 4.1400 KOps/s $\textbf{\color{#d91a1a}-19.46\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.1429ms 6.2658ms 159.5958 Ops/s 158.9523 Ops/s $\color{#35bf28}+0.40\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7995ms 0.4129ms 2.4218 KOps/s 2.2879 KOps/s $\textbf{\color{#35bf28}+5.85\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5931ms 0.3914ms 2.5549 KOps/s 2.3532 KOps/s $\textbf{\color{#35bf28}+8.57\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.0629ms 5.4283ms 184.2189 Ops/s 179.8330 Ops/s $\color{#35bf28}+2.44\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.3047ms 2.1597ms 463.0362 Ops/s 434.4476 Ops/s $\textbf{\color{#35bf28}+6.58\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.9980ms 1.0494ms 952.9207 Ops/s 846.4366 Ops/s $\textbf{\color{#35bf28}+12.58\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 8.1009ms 5.5998ms 178.5787 Ops/s 182.1518 Ops/s $\color{#d91a1a}-1.96\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 7.3613ms 2.0035ms 499.1220 Ops/s 425.6562 Ops/s $\textbf{\color{#35bf28}+17.26\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 8.2558ms 1.2388ms 807.2204 Ops/s 863.3533 Ops/s $\textbf{\color{#d91a1a}-6.50\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5043s 15.7092ms 63.6571 Ops/s 31.2143 Ops/s $\textbf{\color{#35bf28}+103.94\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.7988ms 2.2167ms 451.1245 Ops/s 467.1710 Ops/s $\color{#d91a1a}-3.43\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.4339ms 1.3566ms 737.1141 Ops/s 749.2113 Ops/s $\color{#d91a1a}-1.61\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.4452ms 12.9761ms 77.0647 Ops/s 73.7872 Ops/s $\color{#35bf28}+4.44\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.2815ms 16.3486ms 61.1671 Ops/s 58.2762 Ops/s $\color{#35bf28}+4.96\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 17.9085ms 17.6264ms 56.7329 Ops/s 55.4474 Ops/s $\color{#35bf28}+2.32\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.3374ms 16.7167ms 59.8203 Ops/s 58.1170 Ops/s $\color{#35bf28}+2.93\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.8941ms 17.5622ms 56.9406 Ops/s 54.0740 Ops/s $\textbf{\color{#35bf28}+5.30\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.1312ms 18.7495ms 53.3349 Ops/s 53.5635 Ops/s $\color{#d91a1a}-0.43\%$
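
For readers checking the table by hand: the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below reproduces three of the reported values under that assumption. The helper name `percent_change` is hypothetical and is not taken from the benchmark tooling.

```python
def percent_change(ops: float, ops_on_head: float) -> float:
    """Relative throughput change versus the repo HEAD baseline, in percent.

    Assumption: Change = (Ops / Ops on Repo HEAD - 1) * 100, inferred from
    the table values rather than from the benchmark tool's source.
    """
    return (ops / ops_on_head - 1.0) * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from rows of the table above.
rows = {
    "test_cql_speed[False-None]": (59.2742, 58.7575),
    "test_cql_speed[reduce-overhead-backward]": (504.8606, 527.9890),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (63.6571, 31.2143),
}

for name, (ops, head) in rows.items():
    # Prints +0.88%, -4.38% and +103.94%, matching the Change column.
    print(f"{name}: {percent_change(ops, head):+.2f}%")
```

Rows reported in KOps/s are rounded more coarsely, so recomputing them from the displayed figures may differ from the reported percentage in the last digit.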

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: be9ea92b3f3d2592e426eaeaff7b81e50472cf16
Pull Request resolved: #2775
@vmoens vmoens merged commit 75d52fc into gh/vmoens/90/base Feb 10, 2025
8 of 15 checks passed
@vmoens vmoens deleted the gh/vmoens/90/head branch February 10, 2025 12:27
vmoens added a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: be9ea92b3f3d2592e426eaeaff7b81e50472cf16
Pull Request resolved: #2775

(cherry picked from commit 42ed42c)
Labels
BE: Better errors, logs, docs or test utils
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Suitable for minor: Suitable to be integrated in minor release (no new feature)