Add large-workspace stress test benchmark #2143

Merged: 2 commits, May 29, 2025

@Kobzol (Contributor) commented May 28, 2025

At RustWeek, I was talking to Zed developers (@osiewicz) about having something related to Zed in our benchmark suite. Zed is apparently bottlenecked on loading the incremental cache from disk and on having a large number of workspace crates, which is not something that we have been benchmarking much in our suite so far.

I tried putting Zed directly into the suite, but that doesn't really work: it has more than a thousand dependencies, and the preparation phase alone took almost an hour on an 8-core laptop (the current collector has 6 cores).

But that doesn't mean that we can't create stress tests that simulate what Zed (or other similar projects) do. I thought that one interesting case could be a large workspace, for which rustc has to locate and load all dependencies and then instantiate some code from them. This is something that rust-lang/rust#132910 was trying to optimize.

@nnethercote (Contributor) commented:

I did a quick profile with Cachegrind. Here is the main part of a check full run:

--------------------------------------------------------------------------------
-- Function:file summary
--------------------------------------------------------------------------------
  Ir______________________  function:file

> 21,820,508 (7.8%,  7.8%)  <all-jemalloc-functions>:<all-jemalloc-files>

> 10,022,043 (3.6%, 11.4%)  <rustc_metadata::creader::CrateLoader>::maybe_resolve_crate:
   2,050,882 (0.7%)           /home/njn/dev/rust0/compiler/rustc_span/src/def_id.rs
   1,902,386 (0.7%)           /home/njn/dev/rust0/compiler/rustc_metadata/src/creader.rs
   1,431,003 (0.5%)           /home/njn/dev/rust0/library/core/src/iter/traits/iterator.rs
     735,761 (0.3%)           /home/njn/dev/rust0/library/core/src/option.rs
     612,511 (0.2%)           /home/njn/dev/rust0/library/core/src/ptr/non_null.rs
     546,225 (0.2%)           /home/njn/dev/rust0/library/core/src/slice/iter/macros.rs
     500,780 (0.2%)           /home/njn/dev/rust0/library/core/src/iter/adapters/enumerate.rs
     499,183 (0.2%)           /home/njn/dev/rust0/compiler/rustc_span/src/symbol.rs
     479,402 (0.2%)           /home/njn/dev/rust0/compiler/rustc_metadata/src/rmeta/decoder.rs
     458,432 (0.2%)           /home/njn/dev/rust0/library/core/src/slice/ascii.rs

>  6,726,069 (2.4%, 13.8%)  __memcpy_avx_unaligned_erms:./string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S

>  6,229,774 (2.2%, 16.1%)  __memcmp_avx2_movbe:./string/../sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S

>  5,936,121 (2.1%, 18.2%)  <rustc_session::config::Externs>::get:
   2,107,465 (0.8%)           /home/njn/dev/rust0/library/alloc/src/collections/btree/search.rs
   1,730,032 (0.6%)           /home/njn/dev/rust0/library/core/src/slice/cmp.rs
     865,016 (0.3%)           /home/njn/dev/rust0/library/core/src/cmp.rs
     452,869 (0.2%)           /home/njn/dev/rust0/library/core/src/ptr/non_null.rs

>  5,553,568 (2.0%, 20.2%)  SetImpliedBits(llvm::FeatureBitset&, llvm::FeatureBitset const&, llvm::ArrayRef<llvm::SubtargetFeatureKV>):???

>  4,492,623 (1.6%, 21.8%)  _dl_relocate_object:
   2,564,384 (0.9%)           ./elf/../sysdeps/x86_64/dl-machine.h
   1,660,829 (0.6%)           ./elf/./elf/do-rel.h

>  2,389,742 (0.9%, 22.7%)  <rustc_metadata::rmeta::LazyTables as rustc_serialize::serialize::Decodable<rustc_metadata::rmeta::decoder::DecodeContext>>::decode:
   1,208,874 (0.4%)           /home/njn/dev/rust0/compiler/rustc_serialize/src/opaque.rs
     496,164 (0.2%)           /home/njn/dev/rust0/compiler/rustc_metadata/src/rmeta/mod.rs

>  2,349,095 (0.8%, 23.5%)  <hashbrown::map::HashMap<rustc_query_system::dep_graph::dep_node::DepNode, rustc_span::def_id::DefId, rustc_hash::FxBuildHasher>>::insert:
     552,018 (0.2%)           /home/njn/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/hashbrown-0.15.3/src/map.rs
     376,522 (0.1%)           /home/njn/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/hashbrown-0.15.3/src/raw/mod.rs

>  1,795,788 (0.6%, 24.2%)  <rustc_metadata::creader::CStore>::from_tcx:
     756,168 (0.3%)           /home/njn/dev/rust0/compiler/rustc_metadata/src/creader.rs
     441,098 (0.2%)           /home/njn/dev/rust0/library/core/src/any.rs
     359,202 (0.1%)           /home/njn/dev/rust0/compiler/rustc_data_structures/src/sync/freeze.rs

>  1,760,416 (0.6%, 24.8%)  <rustc_query_system::query::plumbing::JobOwner<rustc_span::def_id::DefId, rustc_query_system::query::QueryStackDeferred>>::complete::<rustc_query_system::query::caches::DefIdCache<rustc_middle::query::erase::Erased<[u8; 8]>>>:
     368,966 (0.1%)           /home/njn/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/hashbrown-0.15.3/src/raw/mod.rs

>  1,746,668 (0.6%, 25.4%)  <indexmap::map::IndexMap<&str, (), core::hash::BuildHasherDefault<rustc_hash::FxHasher>>>::get_index_of::<str>:
     370,524 (0.1%)           /home/njn/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/indexmap-2.9.0/src/map.rs
     283,337 (0.1%)           /home/njn/dev/rust0/library/core/src/slice/cmp.rs

It's definitely very different to other benchmarks, with lots of metadata stuff in there. But it's also pretty short running (0.13s, not much slower than helloworld at 0.09s). And the profile is still pretty flat, with only a few functions above 1%.

Probably worth it for the variety, but I don't see that much scope for improving it.

@osiewicz commented:

Quoting @nnethercote: "Probably worth it for the variety, but I don't see that much scope for improving it."

Perhaps introducing edges in a dep graph between dep- crates could stress it a bit more? E.g. have dep-X depend on dep-0..dep-(X-1).
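The scheme suggested above can be sketched in a few lines of Python. This is only an illustration: the names `chain_deps` and `edge_count` are hypothetical and are not taken from the benchmark's generator script.

```python
def chain_deps(num_crates: int) -> dict[str, list[str]]:
    """Map each crate to its direct dependencies: dep-X depends on dep-0..dep-(X-1)."""
    return {
        f"dep-{x}": [f"dep-{i}" for i in range(x)]
        for x in range(num_crates)
    }


def edge_count(deps: dict[str, list[str]]) -> int:
    """Total number of dependency edges in the graph."""
    return sum(len(direct) for direct in deps.values())


if __name__ == "__main__":
    deps = chain_deps(5)
    print(deps["dep-3"])     # direct dependencies of dep-3
    print(edge_count(deps))  # number of edges for 5 crates
```

With this scheme the number of edges grows quadratically, as n(n-1)/2, so 500 crates would give 124,750 edges.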

@nnethercote (Contributor) commented:

Good suggestion. I had thought about increasing the number of crates (e.g. 500 to 1000) or the size of each crate (more than just three functions) but increasing dependencies might be more interesting.

@Kobzol (Contributor, Author) commented May 29, 2025

Right, so if some dependency is not direct in the final crate, rustc uses a different scheme to load the .rlibs, IIRC, so that could indeed be interesting. I'll modify the Python script to introduce some edges.

Kobzol force-pushed the synthetic-large-workspace branch from df48f19 to d23ce95 on May 29, 2025 07:31
@Kobzol (Contributor, Author) commented May 29, 2025

Changed the generation to work in several hierarchical levels, where crates depend on crates in the previous level.
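A layered generation of this kind might look roughly like the Python sketch below. It is not the script from this PR: the function names, the `dep-<level>-<index>` naming, and the choice that every crate depends on all crates of the previous level are assumptions made for illustration.

```python
def layered_deps(levels: int, crates_per_level: int) -> dict[str, list[str]]:
    """Assumed scheme: each crate in level L depends on every crate in level L-1."""
    deps: dict[str, list[str]] = {}
    for level in range(levels):
        for i in range(crates_per_level):
            name = f"dep-{level}-{i}"
            if level == 0:
                deps[name] = []  # the first level has no dependencies
            else:
                deps[name] = [
                    f"dep-{level - 1}-{j}" for j in range(crates_per_level)
                ]
    return deps


def cargo_toml(name: str, dependencies: list[str]) -> str:
    """Render a minimal Cargo.toml with path dependencies on sibling directories."""
    lines = [
        "[package]",
        f'name = "{name}"',
        'version = "0.1.0"',
        'edition = "2021"',
        "",
        "[dependencies]",
    ]
    lines += [f'{dep} = {{ path = "../{dep}" }}' for dep in dependencies]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    deps = layered_deps(levels=3, crates_per_level=2)
    print(cargo_toml("dep-2-0", deps["dep-2-0"]))
```

With levels, the final crate only needs to name the last level directly, while everything below it is reached transitively.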

@nnethercote (Contributor) commented:

That definitely bumped it up some:

--------------------------------------------------------------------------------
-- Function:file summary
--------------------------------------------------------------------------------
  Ir_______________________  function:file

> 58,563,986 (14.1%, 14.1%)  <rustc_metadata::creader::CrateLoader>::maybe_resolve_crate:
  12,542,750  (3.0%)           /home/njn/dev/rust0/compiler/rustc_span/src/def_id.rs
   9,224,789  (2.2%)           /home/njn/dev/rust0/compiler/rustc_metadata/src/creader.rs
   8,357,410  (2.0%)           /home/njn/dev/rust0/library/core/src/iter/traits/iterator.rs
   6,397,503  (1.5%)           /home/njn/dev/rust0/library/core/src/ptr/non_null.rs
   4,612,535  (1.1%)           /home/njn/dev/rust0/library/core/src/option.rs
   3,837,230  (0.9%)           /home/njn/dev/rust0/library/core/src/slice/iter/macros.rs
   3,652,109  (0.9%)           /home/njn/dev/rust0/library/core/src/iter/adapters/enumerate.rs
   3,649,395  (0.9%)           /home/njn/dev/rust0/compiler/rustc_span/src/symbol.rs
   1,867,059  (0.4%)           /home/njn/dev/rust0/library/core/src/slice/ascii.rs
   1,275,604  (0.3%)           /home/njn/dev/rust0/compiler/rustc_metadata/src/rmeta/decoder.rs

> 29,366,496  (7.1%, 21.1%)  <all-jemalloc-functions>:<all-jemalloc-files>
   
> 19,835,427  (4.8%, 25.9%)  <rustc_metadata::creader::CStore>::push_dependencies_in_postorder:
   9,370,430  (2.3%)           /home/njn/dev/rust0/library/core/src/slice/iter/macros.rs
   6,141,709  (1.5%)           /home/njn/dev/rust0/compiler/rustc_span/src/def_id.rs
   3,139,748  (0.8%)           /home/njn/dev/rust0/library/core/src/ptr/non_null.rs
   1,034,426  (0.2%)           /home/njn/dev/rust0/compiler/rustc_metadata/src/creader.rs
   
> 16,405,634  (3.9%, 29.8%)  __memcmp_avx2_movbe:./string/../sysdeps/x86_64/multiarch/memcmp-avx2-movbe.S

> 16,378,368  (3.9%, 33.8%)  <rustc_session::config::Externs>::get:
   5,958,237  (1.4%)           /home/njn/dev/rust0/library/alloc/src/collections/btree/search.rs
   4,412,856  (1.1%)           /home/njn/dev/rust0/library/core/src/slice/cmp.rs
   2,206,428  (0.5%)           /home/njn/dev/rust0/library/core/src/cmp.rs
   1,147,167  (0.3%)           /home/njn/dev/rust0/library/core/src/ptr/non_null.rs
     977,640  (0.2%)           /home/njn/dev/rust0/library/alloc/src/collections/btree/node.rs
     845,964  (0.2%)           /home/njn/dev/rust0/compiler/rustc_session/src/config.rs
     595,560  (0.1%)           /home/njn/dev/rust0/library/core/src/slice/iter/macros.rs

>  8,996,506  (2.2%, 35.9%)  __memcpy_avx_unaligned_erms:./string/../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S

>  5,553,568  (1.3%, 37.3%)  SetImpliedBits(llvm::FeatureBitset&, llvm::FeatureBitset const&, llvm::ArrayRef<llvm::SubtargetFeatureKV>):???

>  5,257,806  (1.3%, 38.5%)  <rustc_metadata::creader::CrateLoader>::load:
     908,059  (0.2%)           /home/njn/dev/rust0/compiler/rustc_metadata/src/creader.rs
     863,037  (0.2%)           /home/njn/dev/rust0/compiler/rustc_span/src/def_id.rs
     860,256  (0.2%)           /home/njn/dev/rust0/library/core/src/ptr/non_null.rs
     860,256  (0.2%)           /home/njn/dev/rust0/library/core/src/iter/traits/iterator.rs
     436,617  (0.1%)           /home/njn/dev/rust0/library/core/src/slice/iter/macros.rs
     430,128  (0.1%)           /home/njn/dev/rust0/library/core/src/option.rs
     429,201  (0.1%)           /home/njn/dev/rust0/library/core/src/iter/adapters/enumerate.rs
     428,537  (0.1%)           /home/njn/dev/rust0/compiler/rustc_span/src/symbol.rs
 
>  5,034,310  (1.2%, 39.8%)  <indexmap::map::IndexMap<&str, (), core::hash::BuildHasherDefault<rustc_hash::FxHasher>>>::get_index_of::<str>:
   1,077,909  (0.3%)           /home/njn/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/indexmap-2.9.0/src/map.rs
     853,850  (0.2%)           /home/njn/dev/rust0/library/core/src/slice/cmp.rs
     743,412  (0.2%)           /home/njn/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/rustc-hash-2.1.1/src/lib.rs
     739,438  (0.2%)           /home/njn/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/hashbrown-0.15.3/src/raw/mod.rs
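For a rough comparison of the two profiles, the instruction counts (Ir) of functions that appear in both summaries can be divided directly. The sketch below copies the totals from the two listings; the shortened function names and the `growth` helper are only for illustration.

```python
# Instruction counts (Ir) copied from the first Cachegrind summary (flat workspace).
before = {
    "CrateLoader::maybe_resolve_crate": 10_022_043,
    "Externs::get": 5_936_121,
    "__memcmp_avx2_movbe": 6_229_774,
    "IndexMap::get_index_of": 1_746_668,
    "SetImpliedBits": 5_553_568,
}

# Instruction counts (Ir) copied from the second summary (hierarchical levels).
after = {
    "CrateLoader::maybe_resolve_crate": 58_563_986,
    "Externs::get": 16_378_368,
    "__memcmp_avx2_movbe": 16_405_634,
    "IndexMap::get_index_of": 5_034_310,
    "SetImpliedBits": 5_553_568,
}


def growth(before: dict[str, int], after: dict[str, int]) -> dict[str, float]:
    """Ratio of instruction counts after/before for each function."""
    return {name: after[name] / before[name] for name in before}


if __name__ == "__main__":
    ratios = growth(before, after)
    for name, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
        print(f"{name:35s} {ratio:.2f}x")
```

The count for `SetImpliedBits` is identical in both runs, which suggests it is a fixed cost that does not depend on the shape of the workspace.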

@Kobzol (Contributor, Author) commented May 29, 2025

Might be interesting to take a look at this after the next LLVM bump, where (hopefully) SetImpliedBits should disappear from the profile, and other bottlenecks might be more visible.

Kobzol merged commit 6a70166 into rust-lang:master on May 29, 2025
11 checks passed
Kobzol deleted the synthetic-large-workspace branch on May 29, 2025 08:53
osiewicz added a commit to osiewicz/rust that referenced this pull request May 29, 2025
<rustc_metadata::creader::CStore>::push_dependencies_in_postorder showed up in new benchmarks from rust-lang/rustc-perf#2143, hence I gave it a shot to remove an obvious O(n) there.
bors added a commit to rust-lang/rust that referenced this pull request May 29, 2025
…exset, r=

cstore: Use IndexSet as backing store for postorder dependencies

<rustc_metadata::creader::CStore>::push_dependencies_in_postorder showed up in new benchmarks from rust-lang/rustc-perf#2143, hence I gave it a shot to remove an obvious O(n) there.

r? nnethercote
bors added a commit to rust-lang/rust that referenced this pull request Jun 1, 2025
…exset, r=nnethercote

cstore: Use IndexSet as backing store for postorder dependencies

`<rustc_metadata::creader::CStore>::push_dependencies_in_postorder` showed up in new benchmarks from rust-lang/rustc-perf#2143, hence I gave it a shot to remove an obvious O(n) there.

r? nnethercote
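
The commit message only says that an obvious O(n) was removed and that an IndexSet became the backing store. The general technique can be sketched in Python, under the assumption that the O(n) step was a linear membership scan during the postorder walk. This is not rustc's code: the function names are hypothetical, and a Python `dict` (which preserves insertion order) stands in for an IndexSet.

```python
def postorder_list(graph: dict[str, list[str]], root: str) -> list[str]:
    """Postorder walk that deduplicates with a linear scan of the result list."""
    order: list[str] = []

    def visit(node: str) -> None:
        if node in order:  # O(n) scan on every visit
            return
        for dep in graph[node]:
            visit(dep)
        order.append(node)

    visit(root)
    return order


def postorder_indexed(graph: dict[str, list[str]], root: str) -> list[str]:
    """Same walk, backed by an insertion-ordered set with O(1) average lookups."""
    order: dict[str, None] = {}  # dict used as an ordered set

    def visit(node: str) -> None:
        if node in order:  # hash lookup instead of a scan
            return
        for dep in graph[node]:
            visit(dep)
        order[node] = None

    visit(root)
    return list(order)


if __name__ == "__main__":
    # Small diamond-shaped dependency graph (acyclic, as crate graphs are).
    graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    print(postorder_list(graph, "a"))
    print(postorder_indexed(graph, "a"))
```

Both functions return the same order; only the cost of the membership check differs, which matters when many crates share many dependencies.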