Keep iterations below 2**30 #143

Merged: 1 commit merged into evanphx:master on Feb 2, 2025

@eregon eregon commented Feb 2, 2025

  • The logic already intended to keep the number of iterations below 2**30, but failed to do so when computing the cycles per 100ms for a very short benchmark.
  • This could result in a very costly deoptimization during the second half of warmup, followed by a recompilation using either 64-bit integers or Bignum, which is significantly slower.
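
To illustrate the kind of clamp described above, here is a minimal sketch. The constant and method names are hypothetical and this is not the actual benchmark-ips code; it only assumes that the cycles per 100ms are extrapolated from a measured warmup run and then capped strictly below 2**30.

```ruby
# Hypothetical names for illustration; not the benchmark-ips implementation.
MAX_ITERATIONS = (1 << 30) - 1 # largest value strictly below 2**30

# Extrapolate how many iterations fit in 100ms from a measured run,
# then clamp the estimate so it always stays below 2**30.
def cycles_per_100ms(elapsed_us, iterations)
  elapsed_us = 1 if elapsed_us < 1 # guard against a zero time measurement
  cycles = (iterations * 100_000.0 / elapsed_us).to_i
  cycles = 1 if cycles < 1         # always run at least one iteration
  [cycles, MAX_ITERATIONS].min
end

puts cycles_per_100ms(100_000, 500)     # ordinary benchmark, estimate kept as-is
puts cycles_per_100ms(1, 1_000_000_000) # very short benchmark, estimate is clamped
```

Without the final `min`, the second call would extrapolate to an iteration count far beyond 32 bits, which is the situation this change avoids.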

eregon commented Feb 2, 2025

For example, when running https://gist.github.com/byroot/780c1fdee3585611f3bca1c49779617d on truffleruby 24.1.1, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux], before this change it would hang for a very long time after printing:

truffleruby 24.1.1, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Warming up --------------------------------------
           say_hello

This happened because it would try to run the loop with a number of iterations that does not fit in 32 bits, maybe not even in 64 bits. That would cause a deoptimization, and then the loop would have to run in the interpreter or with OSR compilation (significantly less optimized).

After:

truffleruby 24.1.1, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Warming up --------------------------------------
           say_hello     1.074B i/100ms
         public_send     1.074B i/100ms
                send     1.074B i/100ms
Calculating -------------------------------------
           say_hello     36.960Q (±15.2%) i/s    (0.00 ns/i) -     47.716Q
         public_send     36.258Q (±13.4%) i/s    (0.00 ns/i) -     49.885Q
                send     36.800Q (±16.1%) i/s    (0.00 ns/i) -     49.247Q

Comparison:
  say_hello: 36959903144559840.0 i/s
       send: 36800300321482968.0 i/s - same-ish: difference falls within error
public_send: 36258372060672192.0 i/s - same-ish: difference falls within error

It's clear the benchmark is optimized away :)
Probably the first time a quadrillion number of iterations by second is reported by benchmark-ips!
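
As a quick sanity check on the warmup numbers, 1.074B i/100ms is what an iteration count of about 2**30 looks like when rounded to billions. The sketch below only verifies that arithmetic; the `billions` helper is made up for illustration and is not how benchmark-ips formats its output.

```ruby
# Hypothetical helper: express a count in billions, rounded to 3 decimals.
def billions(count)
  (count / 1e9).round(3)
end

cap = 2**30
puts cap           # 1073741824
puts billions(cap) # 1.074
```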

@eregon eregon force-pushed the keep_iterations_32bit branch from d5ea1ea to 14374d3 Compare February 2, 2025 21:11
@nateberkopec (Collaborator) commented

quadrillion number of iterations by second

[image]

@nateberkopec nateberkopec merged commit 0816c90 into evanphx:master Feb 2, 2025
13 checks passed