@@ -7,7 +7,7 @@ Summarize is a tool to produce a human readable summary of `measureme` profiling
To use this tool you will first want to install it:

```bash
- $ cargo install --git https://github.com/rust-lang/measureme --branch stable summarize
+ cargo install --git https://github.com/rust-lang/measureme --branch stable summarize
```

## Profiling the nightly compiler
@@ -23,9 +23,9 @@ profile the [regex][regex-crate] crate.
[regex-crate]: https://github.com/rust-lang/regex

```bash
- $ git clone https://github.com/rust-lang/regex.git
- $ cd regex
- $ cargo +nightly rustc -- -Z self-profile
+ git clone https://github.com/rust-lang/regex.git
+ cd regex
+ cargo +nightly rustc -- -Z self-profile
```

The commands above will run `rustc` with the flag that enables profiling. You should now
@@ -38,36 +38,36 @@ You can now use the `summarize` tool we installed in the previous section to vie
contents of these files:

```bash
- $ summarize summarize regex-{pid}.mm_profdata
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | LLVM_emit_obj | 4.51s | 41.432 | 141 | 0 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | LLVM_module_passes | 1.05s | 9.626 | 140 | 0 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | LLVM_make_bitcode | 712.94ms | 6.543 | 140 | 0 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | typeck_tables_of | 542.23ms | 4.976 | 17470 | 16520 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | codegen | 366.82ms | 3.366 | 141 | 0 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | optimized_mir | 188.22ms | 1.727 | 11668 | 9114 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | mir_built | 156.30ms | 1.434 | 2040 | 1020 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | evaluate_obligation | 151.95ms | 1.394 | 33134 | 23817 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | LLVM_compress_bitcode | 126.55ms | 1.161 | 140 | 0 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | codegen crate | 119.08ms | 1.093 | 1 | 0 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | mir_const | 117.82ms | 1.081 | 1050 | 30 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
-
- (rows elided)
-
- Total cpu time: 10.896488447s
+ summarize summarize regex-{pid}.mm_profdata
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | LLVM_emit_obj | 4.51s | 41.432 | 141 | 0 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | LLVM_module_passes | 1.05s | 9.626 | 140 | 0 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | LLVM_make_bitcode | 712.94ms | 6.543 | 140 | 0 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | typeck_tables_of | 542.23ms | 4.976 | 17470 | 16520 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | codegen | 366.82ms | 3.366 | 141 | 0 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | optimized_mir | 188.22ms | 1.727 | 11668 | 9114 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | mir_built | 156.30ms | 1.434 | 2040 | 1020 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | evaluate_obligation | 151.95ms | 1.394 | 33134 | 23817 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | LLVM_compress_bitcode | 126.55ms | 1.161 | 140 | 0 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | codegen crate | 119.08ms | 1.093 | 1 | 0 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | mir_const | 117.82ms | 1.081 | 1050 | 30 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ #
+ # (rows elided)
+ #
+ # Total cpu time: 10.896488447s
```
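
The `{pid}` in the file name above is a placeholder, so the exact name changes from one build to the next. As a convenience, here is a minimal sketch of a shell helper that picks the most recently modified profile. `latest_profile` is a hypothetical name, not something `measureme` provides, and the sketch assumes the profile files sit in the current directory and follow the `<crate>-<pid>.mm_profdata` naming shown above.

```bash
# Hypothetical helper (not part of measureme): print the most recently
# modified profile in the current directory for the given crate name.
latest_profile() {
    # Expand the glob into the positional parameters; "$1" is the crate name.
    set -- "$1"-*.mm_profdata
    # If nothing matched, the pattern stays literal and names no file.
    [ -e "$1" ] || return 0
    # ls -t sorts by modification time, newest first.
    ls -t "$@" | head -n 1
}

profile="$(latest_profile regex)"
if [ -n "$profile" ]; then
    summarize summarize "$profile"
else
    echo "no regex-*.mm_profdata file found; run the self-profile build first" >&2
fi
```

This only saves you from copying the process ID by hand; passing the file name directly works just as well.
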
## Profiling your own build of rustc
@@ -80,41 +80,41 @@ You can also profile your own custom build of rustc. First you'll have to clone
[compiling-rust]: https://rustc-dev-guide.rust-lang.org/building/how-to-build-and-run.html

```bash
- $ git clone https://github.com/rust-lang/rust.git
- $ ./x.py build
+ git clone https://github.com/rust-lang/rust.git
+ ./x.py build
# This will take a while...
- $ rustup toolchain link mytoolchain build/x86_64-unknown-linux-gnu/stage1
+ rustup toolchain link mytoolchain build/x86_64-unknown-linux-gnu/stage1
```

Here, `mytoolchain` is the name of your custom toolchain. Now we do more or less the same
as before, using regex as the example:

```bash
- $ git clone https://github.com/rust-lang/regex.git
- $ cd regex
- $ cargo +mytoolchain rustc -- -Z self-profile
- $ summarize summarize regex-{pid}.mm_profdata
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | LLVM_emit_obj | 4.51s | 41.432 | 141 | 0 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | LLVM_module_passes | 1.05s | 9.626 | 140 | 0 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | LLVM_make_bitcode | 712.94ms | 6.543 | 140 | 0 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | typeck_tables_of | 542.23ms | 4.976 | 17470 | 16520 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | codegen | 366.82ms | 3.366 | 141 | 0 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | optimized_mir | 188.22ms | 1.727 | 11668 | 9114 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
- | mir_built | 156.30ms | 1.434 | 2040 | 1020 | 0.00ns | 0.00ns |
- +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
-
- (rows elided)
-
- Total cpu time: 10.896488447s
+ git clone https://github.com/rust-lang/regex.git
+ cd regex
+ cargo +mytoolchain rustc -- -Z self-profile
+ summarize summarize regex-{pid}.mm_profdata
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | Item | Self time | % of total time | Item count | Cache hits | Blocked time | Incremental load time |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | LLVM_emit_obj | 4.51s | 41.432 | 141 | 0 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | LLVM_module_passes | 1.05s | 9.626 | 140 | 0 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | LLVM_make_bitcode | 712.94ms | 6.543 | 140 | 0 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | typeck_tables_of | 542.23ms | 4.976 | 17470 | 16520 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | codegen | 366.82ms | 3.366 | 141 | 0 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | optimized_mir | 188.22ms | 1.727 | 11668 | 9114 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ # | mir_built | 156.30ms | 1.434 | 2040 | 1020 | 0.00ns | 0.00ns |
+ # +------------------------+-----------+-----------------+------------+------------+--------------+-----------------------+
+ #
+ # (rows elided)
+ #
+ # Total cpu time: 10.896488447s
```
Note that your custom build of the compiler must not use a newer version of the
@@ -147,22 +147,22 @@ The `diff` sub command allows you to compare the performance of two different pr
The output is a table like that of the `summarize` sub command but it instead shows the differences in each metric.

```bash
- $ summarize diff base-profile.mm_profdata changed-profile.mm_profdata
- +---------------------------+--------------+------------+------------+--------------+-----------------------+
- | Item | Self Time | Item count | Cache hits | Blocked time | Incremental load time |
- +---------------------------+--------------+------------+------------+--------------+-----------------------+
- | LLVM_module_passes | -66.626471ms | +0 | +0 | +0ns | +0ns |
- +---------------------------+--------------+------------+------------+--------------+-----------------------+
- | LLVM_emit_obj | -38.700719ms | +0 | +0 | +0ns | +0ns |
- +---------------------------+--------------+------------+------------+--------------+-----------------------+
- | LLVM_make_bitcode | +32.006706ms | +0 | +0 | +0ns | +0ns |
- +---------------------------+--------------+------------+------------+--------------+-----------------------+
- | mir_borrowck | -12.808322ms | +0 | +0 | +0ns | +0ns |
- +---------------------------+--------------+------------+------------+--------------+-----------------------+
- | typeck_tables_of | -10.325247ms | +0 | +0 | +0ns | +0ns |
- +---------------------------+--------------+------------+------------+--------------+-----------------------+
- (rows elided)
- Total cpu time: -155.177548ms
+ summarize diff base-profile.mm_profdata changed-profile.mm_profdata
+ # +---------------------------+--------------+------------+------------+--------------+-----------------------+
+ # | Item | Self Time | Item count | Cache hits | Blocked time | Incremental load time |
+ # +---------------------------+--------------+------------+------------+--------------+-----------------------+
+ # | LLVM_module_passes | -66.626471ms | +0 | +0 | +0ns | +0ns |
+ # +---------------------------+--------------+------------+------------+--------------+-----------------------+
+ # | LLVM_emit_obj | -38.700719ms | +0 | +0 | +0ns | +0ns |
+ # +---------------------------+--------------+------------+------------+--------------+-----------------------+
+ # | LLVM_make_bitcode | +32.006706ms | +0 | +0 | +0ns | +0ns |
+ # +---------------------------+--------------+------------+------------+--------------+-----------------------+
+ # | mir_borrowck | -12.808322ms | +0 | +0 | +0ns | +0ns |
+ # +---------------------------+--------------+------------+------------+--------------+-----------------------+
+ # | typeck_tables_of | -10.325247ms | +0 | +0 | +0ns | +0ns |
+ # +---------------------------+--------------+------------+------------+--------------+-----------------------+
+ # (rows elided)
+ # Total cpu time: -155.177548ms
```

The table is sorted by the absolute value of `Self time` descending.
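
To make that ordering concrete, here is a small sketch that reorders a few of the deltas from the table above by absolute value. It is only an illustration, not how `summarize` itself is implemented: `sort_by_abs_delta` is a hypothetical helper, and the `<item> <signed delta in ms>` input format is an assumption made for the example.

```bash
# Hypothetical helper: read "<item> <signed delta in ms>" lines on stdin and
# print them ordered by the absolute value of the delta, largest first.
sort_by_abs_delta() {
    # Prefix each line with |delta| and a tab, sort numerically on that
    # prefix in reverse order, then strip the prefix again.
    LC_ALL=C awk '{ v = $2 + 0; if (v < 0) v = -v; printf "%.6f\t%s\n", v, $0 }' |
        LC_ALL=C sort -t "$(printf '\t')" -k1,1nr |
        cut -f2-
}

printf '%s\n' \
    'LLVM_make_bitcode +32.006706' \
    'mir_borrowck -12.808322' \
    'LLVM_module_passes -66.626471' |
    sort_by_abs_delta
# LLVM_module_passes -66.626471
# LLVM_make_bitcode +32.006706
# mir_borrowck -12.808322
```

Note how the `+32.006706ms` regression lands between two improvements: the sign is ignored for ordering, so the largest changes in either direction appear first.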