-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parallel formatting to increase speed #6095
Open
MarcusGrass
wants to merge
23
commits into
rust-lang:master
Choose a base branch
from
MarcusGrass:concurrent
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Thanks for the PR. Going to put this on hold until the team has had a chance to discuss this change #6091 (comment). |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Multithreaded parallelism
Closes #6091
This is non-trivial because when internals can run threaded, eagerly printing cannot happen, messages have
to be passed back through buffers. Even that would be non-trivial if there wasn't color-term writing happening
making things more complicated.
The change became pretty big, even though the only substantive change
is in
print.rs
andbin/main.rs
, because the printer has to be propagateddeep into the library, about half of the changes are adding a function argument, and reformatting
caused by those lines becoming too long. I'm sorry about that.
Changes
Output facilities
In practice there are four printing facilities used:
stdout
, pretty easy, replaceprintln
with printing into a buffer.stderr
, same as above in all ways that matter.diff
-printing part of the code.rustc_error
and the most complex to integrate.Additionally, these four facilities can't be separated, since they have to preserve order between each other.
Rename StdoutEmitter
Confusing naming, it doesn't output into
stdout
, but into theprovided buffer.
Pros
This change brings a substantial speedup, especially for large projects, the speedup
becoming less and less but not none or negative for smaller projects.
If instant measures as the average reaction time of 250ms, this brings
down tokio from the not-instant ~260ms, to the instant 86ms.
Giving space for projects almost 3x tokio's size to have an "instant" formatting experience.
Results locally on
tokio
with diferent parallelism:1: Around 260ms, within noise of the master branch.
2: Around 160ms, 39% less time spent than the master branch.
4: Around 110ms, 58% less time spent than the master branch.
32: Around 85ms, 67% less time spent than the master branch.
The dramatic effect with just 2 cores means that this change doesn't just have an effect if you've got a threadripper, I don't have the data, but I would imagine that most users have 2 or more cores and would benefit.
Drawbacks
There are some drawbacks to consider
Late printing memory increase
All messages that would have gone into
stdout
orstderr
and been flushed eagerly now gointo an in-mem buffer. This causes a memory-overhead which scales with project size
(number of files). For a large project (went from 1700ms to about 500ms with this change in
cargo fmt --all
-time), this is a difference of117M
on the master version, and170M
on this version (~+45%).Rustc termcolor vendoring
Another problem is the way that
rustc
vendors termcolor. It vendors some types but no printing facilities,and the vendored code isn't interoperable with the published
termcolor
library, which means thateither the printing facilities need to be reinvented (no), or some compatibility-code needs to be introduced (yes).
Hopefully there's some better solution to this, but it's liveable at the moment see the
rustc_compat*
functions inprint.rs
.Accidentally messing with outputs will cause an output jumble
This change would make any direct interaction with
stdout
andstderr
in theformatting codepath a bug. Printing deep inside a library is usually inadvisable, but now that would be upgraded to a bug.
Some testing
Apart from the usual tests, I wanted to make sure that the


rustc_errors
printing is the same, pretty easy, just remove a semicolon:And diffing:
Although there are tests for those already.
Lastly I did a test modifying a bunch of different files in a project and then ran
cargo fmt --all --check
,the order within-file is preserved, but the order in which diffs are printed is "random".
It's intended, and the reason for that is because output is printed by file in completion-order. It could await the join handles in order, and get the same order as the singe-threaded implementation, but then if there are more files than available parallelism, full thread utilization won't happen, which in practice measurably impacts performance on large projects.
It's possible to do a half-measure and wait in handle-order for the remaining files after all tasks have been scheduled, this would preserve ordering in projects with fewer files than available parallelism, but that doesn't seem very impactful to me, since available parallelism likely differs a lot.