This repository was archived by the owner on May 28, 2025. It is now read-only.

Commit 6bca9f2

Auto merge of rust-lang#14859 - lunacookies:qos, r=lunacookies
Specify thread types using Quality of Service API

<details>
<summary>Some background (in case you haven’t heard of QoS before)</summary>

Heterogeneous multi-core CPUs are increasingly found in laptops and desktops (e.g. Alder Lake, Snapdragon 8cx Gen 3, M1). To maximize efficiency on this kind of hardware, it is important to provide the operating system with more information so threads can be scheduled on different core types appropriately.

The approach that XNU (the kernel of macOS, iOS, etc.) and Windows have taken is to provide a high-level semantic API – quality of service, or QoS – which informs the OS of the program’s intent. For instance, you might specify that a thread is running a render loop for a game. This makes the OS provide this thread with as large a share of the system’s resources as possible. Specifying that a thread is running an unimportant background task, on the other hand, causes it to be scheduled exclusively on high-efficiency cores instead of high-performance cores.

QoS APIs allow for easy configuration of many different parameters at once; for instance, setting QoS on XNU affects scheduling, timer latency, I/O priorities, and of course what core type the thread in question should run on. I don’t know any details on how QoS works on Windows, but I would guess it’s similar.

Hypothetically, taking advantage of these APIs would improve power consumption, thermals, battery life if applicable, etc.

</details>

# Relevance to rust-analyzer

From what I can tell the philosophy behind both the XNU and Windows QoS APIs is that _user interfaces should never stutter under any circumstances._ You can see this in the array of QoS classes which are available: the highest QoS class in both APIs is one intended explicitly for UI render loops.

Imagine rust-analyzer is performing CPU-intensive background work – maybe you just invoked Find Usages on `usize` or opened a large project – in this scenario the editor’s render loop should absolutely get higher priority than rust-analyzer, no matter what. You could view it in terms of “realtime-ness”: flight control software is hard realtime, audio software is soft realtime, GUIs are softer realtime, and rust-analyzer is not realtime at all. Of course, maximizing responsiveness is important, but respecting the rest of the system is more important.

# Implementation

I’ve tried my best to unify thread creation in `stdx`, where the new API I’ve introduced _requires_ specifying a QoS class. Different points along the performance/efficiency curve can make a great difference; the M1’s e-cores use around three times less power than the p-cores, so putting in this effort is worthwhile IMO.

It’s worth mentioning that Linux does not [yet](https://youtu.be/RfgPWpTwTQo) have a QoS API. Maybe translating QoS into regular thread priorities would be acceptable? From what I can tell the only scheduling-related code in rust-analyzer is Windows-specific, so ignoring QoS entirely on Linux shouldn’t cause any new issues. Also, I haven’t implemented support for the Windows QoS APIs because I don’t have a Windows machine to test on, and because I’m completely unfamiliar with Windows APIs :)

I noticed that rust-analyzer handles some requests on the main thread (using `.on_sync()`) and others on a threadpool (using `.on()`). I think it would make sense to run the main thread at the User Initiated QoS and the threadpool at Utility, but only if all requests that are caused by typing use `.on_sync()` and all that don’t use `.on()`. I don’t understand how the `.on_sync()`/`.on()` split that’s currently present was chosen, so I’ve let this code be for the moment. Let me know if changing this to what I proposed makes any sense.
To avoid having to change everything back in case I’ve misunderstood something, I’ve left all threads at the Utility QoS for now. Of course, this isn’t what I hope the code will look like in the end, but I figured I have to start somewhere :P

# References

<ul>
<li><a href="https://developer.apple.com/library/archive/documentation/Performance/Conceptual/power_efficiency_guidelines_osx/PrioritizeWorkAtTheTaskLevel.html">Apple documentation related to QoS</a></li>
<li><a href="https://github.com/apple-oss-distributions/libpthread/blob/67e155c94093be9a204b69637d198eceff2c7c46/include/pthread/qos.h">pthread API for setting QoS on XNU</a></li>
<li><a href="https://learn.microsoft.com/en-us/windows/win32/procthread/quality-of-service">Windows’s QoS classes</a></li>
<li>
<details>
<summary>Full documentation of XNU QoS classes. This documentation is only available as a huge not-very-readable comment in a header file, so I’ve reformatted it and put it here for reference.</summary>
<ul>
<li><p><strong><code>QOS_CLASS_USER_INTERACTIVE</code>: A QOS class which indicates work performed by this thread is interactive with the user.</strong></p><p>Such work is requested to run at high priority relative to other work on the system. Specifying this QOS class is a request to run with nearly all available system CPU and I/O bandwidth even under contention. This is not an energy-efficient QOS class to use for large tasks. The use of this QOS class should be limited to critical interaction with the user such as handling events on the main event loop, view drawing, animation, etc.</p></li>
<li><p><strong><code>QOS_CLASS_USER_INITIATED</code>: A QOS class which indicates work performed by this thread was initiated by the user and that the user is likely waiting for the results.</strong></p><p>Such work is requested to run at a priority below critical user-interactive work, but relatively higher than other work on the system. This is not an energy-efficient QOS class to use for large tasks. Its use should be limited to operations of short enough duration that the user is unlikely to switch tasks while waiting for the results. Typical user-initiated work will have progress indicated by the display of placeholder content or modal user interface.</p></li>
<li><p><strong><code>QOS_CLASS_DEFAULT</code>: A default QOS class used by the system in cases where more specific QOS class information is not available.</strong></p><p>Such work is requested to run at a priority below critical user-interactive and user-initiated work, but relatively higher than utility and background tasks. Threads created by <code>pthread_create()</code> without an attribute specifying a QOS class will default to <code>QOS_CLASS_DEFAULT</code>. This QOS class value is not intended to be used as a work classification, it should only be set when propagating or restoring QOS class values provided by the system.</p></li>
<li><p><strong><code>QOS_CLASS_UTILITY</code>: A QOS class which indicates work performed by this thread may or may not be initiated by the user and that the user is unlikely to be immediately waiting for the results.</strong></p><p>Such work is requested to run at a priority below critical user-interactive and user-initiated work, but relatively higher than low-level system maintenance tasks. The use of this QOS class indicates the work should be run in an energy and thermally-efficient manner. The progress of utility work may or may not be indicated to the user, but the effect of such work is user-visible.</p></li>
<li><p><strong><code>QOS_CLASS_BACKGROUND</code>: A QOS class which indicates work performed by this thread was not initiated by the user and that the user may be unaware of the results.</strong></p><p>Such work is requested to run at a priority below other work. The use of this QOS class indicates the work should be run in the most energy and thermally-efficient manner.</p></li>
<li><p><strong><code>QOS_CLASS_UNSPECIFIED</code>: A QOS class value which indicates the absence or removal of QOS class information.</strong></p><p>As an API return value, may indicate that threads or pthread attributes were configured with legacy API incompatible or in conflict with the QOS class system.</p></li>
</ul>
</details>
</li>
</ul>
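
To make the QoS classes above concrete, here is a minimal sketch of how a thread might set its own class on XNU. It is not the code added by this commit: the `QoSClass` enum, `raw`, and the function and constant below are written for illustration and only borrow names that appear in the diff. The raw values and the `pthread_set_qos_class_self_np` signature are taken from the `pthread/qos.h` header linked above, and on every other platform the sketch assumes a no-op.

```rust
/// Illustrative mirror of the XNU QoS classes a program would normally pick from.
/// (Hypothetical type; the real `stdx::thread::QoSClass` may differ.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QoSClass {
    Background,
    Utility,
    UserInitiated,
    UserInteractive,
}

impl QoSClass {
    /// Raw `qos_class_t` value as defined in XNU's `pthread/qos.h`.
    pub fn raw(self) -> u32 {
        match self {
            QoSClass::UserInteractive => 0x21,
            QoSClass::UserInitiated => 0x19,
            QoSClass::Utility => 0x11,
            QoSClass::Background => 0x09,
        }
    }
}

/// Whether this sketch can actually talk to a QoS API on the current target.
pub const IS_QOS_AVAILABLE: bool = cfg!(target_os = "macos");

#[cfg(target_os = "macos")]
extern "C" {
    // int pthread_set_qos_class_self_np(qos_class_t, int relative_priority);
    fn pthread_set_qos_class_self_np(qos_class: u32, relative_priority: i32) -> i32;
}

/// Sets the QoS class of the calling thread; returns `true` on success.
#[cfg(target_os = "macos")]
pub fn set_current_thread_qos_class(class: QoSClass) -> bool {
    // SAFETY: a plain C call that only affects the calling thread.
    unsafe { pthread_set_qos_class_self_np(class.raw(), 0) == 0 }
}

/// Assumed fallback: without a QoS API there is nothing to set.
#[cfg(not(target_os = "macos"))]
pub fn set_current_thread_qos_class(_class: QoSClass) -> bool {
    false
}
```

A worker thread would call `set_current_thread_qos_class(QoSClass::Utility)` as its first action. Returning a `bool` rather than panicking is a choice made for this sketch only.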
2 parents 3713c4b + 430bdd3 commit 6bca9f2

15 files changed: +393 -24 lines changed

Cargo.lock

Lines changed: 3 additions & 3 deletions
Some generated files are not rendered by default.

crates/flycheck/Cargo.toml

Lines changed: 0 additions & 1 deletion
@@ -18,7 +18,6 @@ cargo_metadata = "0.15.0"
 rustc-hash = "1.1.0"
 serde_json.workspace = true
 serde.workspace = true
-jod-thread = "0.1.2"
 command-group = "2.0.1"
 
 # local deps

crates/flycheck/src/lib.rs

Lines changed: 4 additions & 4 deletions
@@ -77,7 +77,7 @@ impl fmt::Display for FlycheckConfig {
 pub struct FlycheckHandle {
     // XXX: drop order is significant
     sender: Sender<StateChange>,
-    _thread: jod_thread::JoinHandle,
+    _thread: stdx::thread::JoinHandle,
     id: usize,
 }
 
@@ -90,7 +90,7 @@ impl FlycheckHandle {
     ) -> FlycheckHandle {
         let actor = FlycheckActor::new(id, sender, config, workspace_root);
         let (sender, receiver) = unbounded::<StateChange>();
-        let thread = jod_thread::Builder::new()
+        let thread = stdx::thread::Builder::new(stdx::thread::QoSClass::Utility)
             .name("Flycheck".to_owned())
             .spawn(move || actor.run(receiver))
             .expect("failed to spawn thread");
@@ -395,7 +395,7 @@ struct CargoHandle {
     /// The handle to the actual cargo process. As we cannot cancel directly from with
     /// a read syscall dropping and therefore terminating the process is our best option.
     child: JodGroupChild,
-    thread: jod_thread::JoinHandle<io::Result<(bool, String)>>,
+    thread: stdx::thread::JoinHandle<io::Result<(bool, String)>>,
     receiver: Receiver<CargoMessage>,
 }
 
@@ -409,7 +409,7 @@ impl CargoHandle {
 
         let (sender, receiver) = unbounded();
         let actor = CargoActor::new(sender, stdout, stderr);
-        let thread = jod_thread::Builder::new()
+        let thread = stdx::thread::Builder::new(stdx::thread::QoSClass::Utility)
             .name("CargoHandle".to_owned())
             .spawn(move || actor.run())
             .expect("failed to spawn thread");

crates/ide/src/prime_caches.rs

Lines changed: 5 additions & 1 deletion
@@ -80,7 +80,11 @@ pub(crate) fn parallel_prime_caches(
         for _ in 0..num_worker_threads {
             let worker = prime_caches_worker.clone();
             let db = db.snapshot();
-            std::thread::spawn(move || Cancelled::catch(|| worker(db)));
+
+            stdx::thread::Builder::new(stdx::thread::QoSClass::Utility)
+                .allow_leak(true)
+                .spawn(move || Cancelled::catch(|| worker(db)))
+                .expect("failed to spawn thread");
         }
 
         (work_sender, progress_receiver)
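
The hunk above replaces a detached `std::thread::spawn` with a builder call that opts in to `.allow_leak(true)`. This suggests the new handle joins its thread on drop unless told otherwise, as `jod-thread` handles do. The following is a rough sketch of that idea using only the standard library. `JoinOnDrop` and its methods are hypothetical names for illustration, not the actual `stdx::thread` implementation.

```rust
use std::thread;

/// Hypothetical join-on-drop handle with an opt-out, for illustration only.
pub struct JoinOnDrop<T> {
    inner: Option<thread::JoinHandle<T>>,
    allow_leak: bool,
}

impl<T: Send + 'static> JoinOnDrop<T> {
    /// Spawns `f`; `allow_leak` decides what happens when the handle is dropped.
    pub fn spawn(allow_leak: bool, f: impl FnOnce() -> T + Send + 'static) -> Self {
        JoinOnDrop { inner: Some(thread::spawn(f)), allow_leak }
    }

    /// Waits for the thread and returns its result, panicking if it panicked.
    pub fn join(mut self) -> T {
        let handle = self.inner.take().expect("handle already taken");
        handle.join().expect("thread panicked")
    }
}

impl<T> Drop for JoinOnDrop<T> {
    fn drop(&mut self) {
        if let Some(handle) = self.inner.take() {
            if self.allow_leak {
                // Dropping a std JoinHandle detaches the thread.
                return;
            }
            let result = handle.join();
            // Avoid a double panic if we are already unwinding.
            if !thread::panicking() {
                result.expect("thread panicked");
            }
        }
    }
}
```

With `allow_leak` set to `false`, dropping the handle blocks until the thread finishes. That blocking is presumably why the fire-and-forget cache-priming workers in the hunk opt out.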

crates/proc-macro-srv/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -22,6 +22,7 @@ object = { version = "0.30.2", default-features = false, features = [
 libloading = "0.7.3"
 memmap2 = "0.5.4"
 
+stdx.workspace = true
 tt.workspace = true
 mbe.workspace = true
 paths.workspace = true

crates/rust-analyzer/Cargo.toml

Lines changed: 0 additions & 1 deletion
@@ -86,7 +86,6 @@ jemallocator = { version = "0.5.0", package = "tikv-jemallocator", optional = tr
 
 [dev-dependencies]
 expect-test = "1.4.0"
-jod-thread = "0.1.2"
 xshell = "0.2.2"
 
 test-utils.workspace = true

crates/rust-analyzer/src/bin/main.rs

Lines changed: 17 additions & 7 deletions
@@ -78,7 +78,14 @@ fn try_main(flags: flags::RustAnalyzer) -> Result<()> {
                 println!("rust-analyzer {}", rust_analyzer::version());
                 return Ok(());
             }
-            with_extra_thread("LspServer", run_server)?;
+
+            // rust-analyzer’s “main thread” is actually a secondary thread
+            // with an increased stack size at the User Initiated QoS class.
+            // We use this QoS class because any delay in the main loop
+            // will make actions like hitting enter in the editor slow.
+            // rust-analyzer does not block the editor’s render loop,
+            // so we don’t use User Interactive.
+            with_extra_thread("LspServer", stdx::thread::QoSClass::UserInitiated, run_server)?;
         }
         flags::RustAnalyzerCmd::Parse(cmd) => cmd.run()?,
         flags::RustAnalyzerCmd::Symbols(cmd) => cmd.run()?,
@@ -136,14 +143,17 @@ const STACK_SIZE: usize = 1024 * 1024 * 8;
 /// space.
 fn with_extra_thread(
     thread_name: impl Into<String>,
+    qos_class: stdx::thread::QoSClass,
     f: impl FnOnce() -> Result<()> + Send + 'static,
 ) -> Result<()> {
-    let handle =
-        std::thread::Builder::new().name(thread_name.into()).stack_size(STACK_SIZE).spawn(f)?;
-    match handle.join() {
-        Ok(res) => res,
-        Err(panic) => std::panic::resume_unwind(panic),
-    }
+    let handle = stdx::thread::Builder::new(qos_class)
+        .name(thread_name.into())
+        .stack_size(STACK_SIZE)
+        .spawn(f)?;
+
+    handle.join()?;
+
+    Ok(())
 }
 
 fn run_server() -> Result<()> {

crates/rust-analyzer/src/main_loop.rs

Lines changed: 6 additions & 1 deletion
@@ -665,14 +665,20 @@ impl GlobalState {
         use crate::handlers::request as handlers;
 
         dispatcher
+            // Request handlers that must run on the main thread
+            // because they mutate GlobalState:
             .on_sync_mut::<lsp_ext::ReloadWorkspace>(handlers::handle_workspace_reload)
             .on_sync_mut::<lsp_ext::RebuildProcMacros>(handlers::handle_proc_macros_rebuild)
             .on_sync_mut::<lsp_ext::MemoryUsage>(handlers::handle_memory_usage)
             .on_sync_mut::<lsp_ext::ShuffleCrateGraph>(handlers::handle_shuffle_crate_graph)
+            // Request handlers which are related to the user typing
+            // are run on the main thread to reduce latency:
             .on_sync::<lsp_ext::JoinLines>(handlers::handle_join_lines)
             .on_sync::<lsp_ext::OnEnter>(handlers::handle_on_enter)
             .on_sync::<lsp_types::request::SelectionRangeRequest>(handlers::handle_selection_range)
             .on_sync::<lsp_ext::MatchingBrace>(handlers::handle_matching_brace)
+            .on_sync::<lsp_ext::OnTypeFormatting>(handlers::handle_on_type_formatting)
+            // All other request handlers:
             .on::<lsp_ext::FetchDependencyList>(handlers::fetch_dependency_list)
             .on::<lsp_ext::AnalyzerStatus>(handlers::handle_analyzer_status)
             .on::<lsp_ext::SyntaxTree>(handlers::handle_syntax_tree)
@@ -693,7 +699,6 @@ impl GlobalState {
             .on::<lsp_ext::OpenCargoToml>(handlers::handle_open_cargo_toml)
             .on::<lsp_ext::MoveItem>(handlers::handle_move_item)
             .on::<lsp_ext::WorkspaceSymbol>(handlers::handle_workspace_symbol)
-            .on::<lsp_ext::OnTypeFormatting>(handlers::handle_on_type_formatting)
             .on::<lsp_types::request::DocumentSymbolRequest>(handlers::handle_document_symbol)
             .on::<lsp_types::request::GotoDefinition>(handlers::handle_goto_definition)
             .on::<lsp_types::request::GotoDeclaration>(handlers::handle_goto_declaration)

crates/rust-analyzer/src/task_pool.rs

Lines changed: 24 additions & 1 deletion
@@ -1,5 +1,7 @@
 //! A thin wrapper around `ThreadPool` to make sure that we join all things
 //! properly.
+use std::sync::{Arc, Barrier};
+
 use crossbeam_channel::Sender;
 
 pub(crate) struct TaskPool<T> {
@@ -16,6 +18,18 @@ impl<T> TaskPool<T> {
             .thread_stack_size(STACK_SIZE)
             .num_threads(threads)
             .build();
+
+        // Set QoS of all threads in threadpool.
+        let barrier = Arc::new(Barrier::new(threads + 1));
+        for _ in 0..threads {
+            let barrier = barrier.clone();
+            inner.execute(move || {
+                stdx::thread::set_current_thread_qos_class(stdx::thread::QoSClass::Utility);
+                barrier.wait();
+            });
+        }
+        barrier.wait();
+
         TaskPool { sender, inner }
     }
 
@@ -26,7 +40,16 @@ impl<T> TaskPool<T> {
     {
         self.inner.execute({
             let sender = self.sender.clone();
-            move || sender.send(task()).unwrap()
+            move || {
+                if stdx::thread::IS_QOS_AVAILABLE {
+                    debug_assert_eq!(
+                        stdx::thread::get_current_thread_qos_class(),
+                        Some(stdx::thread::QoSClass::Utility)
+                    );
+                }
+
+                sender.send(task()).unwrap()
+            }
         })
     }
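
The barrier in the hunk above is what guarantees that every pool thread runs the QoS-setting closure exactly once. A worker that has picked up one of the jobs blocks in `barrier.wait()` and so cannot pick up a second one until all `threads + 1` participants have arrived. Below is a self-contained sketch of that technique. `MiniPool`, `run_on_every_thread`, and `distinct_threads_reached` are hypothetical stand-ins built on the standard library, not the `threadpool` crate used by rust-analyzer.

```rust
use std::collections::HashSet;
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Barrier, Mutex};
use std::thread::{self, ThreadId};

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical minimal pool: `n` workers pulling jobs from one shared queue.
struct MiniPool {
    sender: Option<Sender<Job>>,
    workers: Vec<thread::JoinHandle<()>>,
}

impl MiniPool {
    fn new(n: usize) -> MiniPool {
        let (sender, receiver) = channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let workers = (0..n)
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // The lock is released at the end of this statement,
                    // so other workers can dequeue while this job runs.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(job) => job(),
                        Err(_) => break, // queue closed
                    }
                })
            })
            .collect();
        MiniPool { sender: Some(sender), workers }
    }

    fn execute(&self, job: impl FnOnce() + Send + 'static) {
        self.sender.as_ref().unwrap().send(Box::new(job)).unwrap();
    }

    fn join(mut self) {
        drop(self.sender.take()); // closing the queue lets the workers exit
        for worker in self.workers.drain(..) {
            worker.join().unwrap();
        }
    }
}

/// Runs `f` once on each of the pool's `n` threads, then returns.
fn run_on_every_thread(pool: &MiniPool, n: usize, f: impl Fn() + Send + Sync + 'static) {
    let f = Arc::new(f);
    let barrier = Arc::new(Barrier::new(n + 1));
    for _ in 0..n {
        let (f, barrier) = (Arc::clone(&f), Arc::clone(&barrier));
        pool.execute(move || {
            f();
            barrier.wait(); // park this worker until all others have run `f`
        });
    }
    barrier.wait(); // the caller is participant number n + 1
}

/// Demo helper: how many distinct worker threads ran the closure.
fn distinct_threads_reached(n: usize) -> usize {
    let pool = MiniPool::new(n);
    let seen: Arc<Mutex<HashSet<ThreadId>>> = Arc::default();
    let seen_in_worker = Arc::clone(&seen);
    run_on_every_thread(&pool, n, move || {
        seen_in_worker.lock().unwrap().insert(thread::current().id());
    });
    pool.join();
    let count = seen.lock().unwrap().len();
    count
}
```

Calling `distinct_threads_reached(4)` should report four distinct threads. Without the barrier, one fast worker could dequeue several of the jobs and leave other threads untouched.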

crates/rust-analyzer/tests/slow-tests/support.rs

Lines changed: 2 additions & 2 deletions
@@ -155,7 +155,7 @@ pub(crate) fn project(fixture: &str) -> Server {
 pub(crate) struct Server {
     req_id: Cell<i32>,
     messages: RefCell<Vec<Message>>,
-    _thread: jod_thread::JoinHandle<()>,
+    _thread: stdx::thread::JoinHandle,
     client: Connection,
     /// XXX: remove the tempdir last
     dir: TestDir,
@@ -165,7 +165,7 @@ impl Server {
     fn new(dir: TestDir, config: Config) -> Server {
         let (connection, client) = Connection::memory();
 
-        let _thread = jod_thread::Builder::new()
+        let _thread = stdx::thread::Builder::new(stdx::thread::QoSClass::Utility)
             .name("test server".to_string())
             .spawn(move || main_loop(config, connection).unwrap())
             .expect("failed to spawn a thread");

crates/stdx/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ doctest = false
 libc = "0.2.135"
 backtrace = { version = "0.3.65", optional = true }
 always-assert = { version = "0.1.2", features = ["log"] }
+jod-thread = "0.1.2"
 # Think twice before adding anything here
 
 [target.'cfg(windows)'.dependencies]

crates/stdx/src/lib.rs

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ pub mod process;
 pub mod panic_context;
 pub mod non_empty_vec;
 pub mod rand;
+pub mod thread;
 
 pub use always_assert::{always, never};
