Fix manual_div_ceil clippy lints #902
base: main
Gloria Estefan and Miami Sound Machine were wrong - it's rustfmt that's gunna get ya.
It was obviously naive to expect that
I think this makes the code worse at opt-level 0: https://rust.godbolt.org/z/cGdPhs1Kj With a manual add and divide, the compiler can optimise because it can see the divide is by a power of two. Using div_ceil means there's a function call and that optimisation no longer applies. At opt-level=1 it's not as bad, but still different: https://rust.godbolt.org/z/ceb4xKz8z Edit: I suppose it's trying to avoid integer overflow in the general case.
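For reference, a minimal sketch of the two forms being compared, assuming a `u32` length and a block size of 64. The function names are hypothetical and not taken from the HAL. It also illustrates the overflow point: the manual form adds 63 before dividing, so it cannot be used safely near `u32::MAX`, while `div_ceil` has no such limit.

```rust
/// Manual ceiling division by 64 (hypothetical helper, not the HAL's code).
/// The addition overflows when `len > u32::MAX - 63`.
fn blocks_manual(len: u32) -> u32 {
    (len + 63) / 64
}

/// The same computation using the standard library's `u32::div_ceil`.
fn blocks_div_ceil(len: u32) -> u32 {
    len.div_ceil(64)
}

/// Returns true when the manual form's `len + 63` would overflow a `u32`.
fn manual_would_overflow(len: u32) -> bool {
    len.checked_add(63).is_none()
}
```

For every input where the addition does not overflow, both functions return the same value; they differ only in generated code and in behaviour at the top of the range.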
And div_ceil also works for non-powers of two. IMHO, in functions like ep_allocate, which is expected to be used only a few times at system start, the loss in performance may not be too bad. But in the UART function I think we can locally allow the manual impl, because we may be on a critical path (e.g. changing the baud rate between data frames).
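Locally allowing the manual implementation could look roughly like this. This is only a sketch: `ceil_div_manual` is a hypothetical stand-in, not the actual UART function, and the attribute is placed on the one function so the lint stays active elsewhere.

```rust
/// Hypothetical helper standing in for the hot-path computation.
/// Equivalent to `n.div_ceil(d)` as long as `n + d - 1` does not overflow.
/// `d` must be non-zero.
#[allow(clippy::manual_div_ceil)]
fn ceil_div_manual(n: u32, d: u32) -> u32 {
    // Kept as add-and-divide on purpose; see the opt-level discussion above.
    (n + d - 1) / d
}
```

A short comment next to the `allow` explaining why the manual form is kept would save the next reader from re-applying the lint fix.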
@@ -256,7 +256,7 @@ impl Inner {
// size in 64bytes units.
// NOTE: the compiler is smart enough to recognize /64 as a 6bit right shift so let's
// keep the division here for the sake of clarity
This is not true any more.
@@ -161,7 +161,7 @@ impl Inner {
// size in 64bytes units.
// NOTE: the compiler is smart enough to recognize /64 as a 6bit right shift so let's
// keep the division here for the sake of clarity
This is not true any more.
That's a good idea
I looked at the UART code, as it seems to be more performance-critical. If I take the whole function (and not only the div_ceil), the two variants compare differently. In the optimized case, they are nearly identical: I'm not sure if we need to care about the opt-level=0 case as long as it's not extraordinarily slow. (I.e. you can still use it when debugging non-performance-critical code.)
Ok, I'm convinced. Does this mean the comments about division still hold true?
I'll have a closer look. It's worth checking the rp2350 version as well; after all, it's a different target, so the assembly output may be different. But ignoring the generated code: Do you think the I think it's clearer, but only by a small margin. So if the generated code turns out to be worse, we should keep the old code. Perhaps with a comment that it's equivalent to div_ceil, but faster.
I think it's the same, or perhaps slightly less clear. But then I'm a grizzled old programmer so I'm familiar with the idea of
No description provided.