Add missing _excess methods to Alloc and use them in RawVec #50738
Conversation
r? @dtolnay (rust_highfive has picked a reviewer for you, use r? to override)

The ... Also, subsequent PRs should move ...

cc @alexcrichton I'll update the ...
The job failed. Click to expand the log.

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact ...
Could you show some benchmark comparisons between the old and new behavior?

@sfackler sure, what benchmarks would you like to see?

The existing vec benchmarks would be a good choice: https://github.com/rust-lang/rust/blob/master/src/liballoc/benches/vec.rs

Yeah I can do those!

While I think something like this is necessary, I also think that's way too much churn to add to the Alloc API. Background: https://internals.rust-lang.org/t/pre-rfc-changing-the-alloc-trait/7487

When this is about ready to land I think we'll also want to do at least a perf.rust-lang.org run; these tend to be very performance sensitive, so just want to try to weed out possible regressions if we can!
After the PR:

name vec_bench_master ns/iter vec_bench_exc ns/iter diff ns/iter diff % speedup
slice::ends_with_same_vector 0 0 0 NaN% x NaN
slice::starts_with_same_vector 0 0 0 NaN% x NaN
vec::bench_clone_0000 14 13 -1 -7.14% x 1.08
vec::bench_clone_0010 51 (196 MB/s) 51 (196 MB/s) 0 0.00% x 1.00
vec::bench_clone_0100 149 (671 MB/s) 139 (719 MB/s) -10 -6.71% x 1.07
vec::bench_clone_1000 1,085 (921 MB/s) 1,057 (946 MB/s) -28 -2.58% x 1.03
vec::bench_clone_from_01_0000_0000 25 24 -1 -4.00% x 1.04
vec::bench_clone_from_01_0000_0010 80 (125 MB/s) 77 (129 MB/s) -3 -3.75% x 1.04
vec::bench_clone_from_01_0000_0100 212 (471 MB/s) 204 (490 MB/s) -8 -3.77% x 1.04
vec::bench_clone_from_01_0000_1000 1,631 (613 MB/s) 1,616 (618 MB/s) -15 -0.92% x 1.01
vec::bench_clone_from_01_0010_0000 24 27 3 12.50% x 0.89
vec::bench_clone_from_01_0010_0010 81 (123 MB/s) 72 (138 MB/s) -9 -11.11% x 1.12
vec::bench_clone_from_01_0010_0100 236 (423 MB/s) 204 (490 MB/s) -32 -13.56% x 1.16
vec::bench_clone_from_01_0100_0010 81 (123 MB/s) 70 (142 MB/s) -11 -13.58% x 1.16
vec::bench_clone_from_01_0100_0100 213 (469 MB/s) 210 (476 MB/s) -3 -1.41% x 1.01
vec::bench_clone_from_01_0100_1000 1,715 (583 MB/s) 1,660 (602 MB/s) -55 -3.21% x 1.03
vec::bench_clone_from_01_1000_0100 230 (434 MB/s) 226 (442 MB/s) -4 -1.74% x 1.02
vec::bench_clone_from_01_1000_1000 1,745 (573 MB/s) 1,658 (603 MB/s) -87 -4.99% x 1.05
vec::bench_clone_from_10_0000_0000 152 146 -6 -3.95% x 1.04
vec::bench_clone_from_10_0000_0010 347 (288 MB/s) 321 (311 MB/s) -26 -7.49% x 1.08
vec::bench_clone_from_10_0000_0100 1,619 (617 MB/s) 1,513 (660 MB/s) -106 -6.55% x 1.07
vec::bench_clone_from_10_0000_1000 14,540 (687 MB/s) 14,150 (706 MB/s) -390 -2.68% x 1.03
vec::bench_clone_from_10_0010_0000 148 143 -5 -3.38% x 1.03
vec::bench_clone_from_10_0010_0010 373 (268 MB/s) 321 (311 MB/s) -52 -13.94% x 1.16
vec::bench_clone_from_10_0010_0100 1,711 (584 MB/s) 1,513 (660 MB/s) -198 -11.57% x 1.13
vec::bench_clone_from_10_0100_0010 356 (280 MB/s) 323 (309 MB/s) -33 -9.27% x 1.10
vec::bench_clone_from_10_0100_0100 1,719 (581 MB/s) 1,510 (662 MB/s) -209 -12.16% x 1.14
vec::bench_clone_from_10_0100_1000 14,240 (702 MB/s) 14,201 (704 MB/s) -39 -0.27% x 1.00
vec::bench_clone_from_10_1000_0100 1,732 (577 MB/s) 1,511 (661 MB/s) -221 -12.76% x 1.15
vec::bench_clone_from_10_1000_1000 15,541 (643 MB/s) 15,025 (665 MB/s) -516 -3.32% x 1.03
vec::bench_extend_0000_0000 40 34 -6 -15.00% x 1.18
vec::bench_extend_0000_0010 119 (84 MB/s) 110 (90 MB/s) -9 -7.56% x 1.08
vec::bench_extend_0000_0100 227 (440 MB/s) 220 (454 MB/s) -7 -3.08% x 1.03
vec::bench_extend_0000_1000 1,373 (728 MB/s) 1,332 (750 MB/s) -41 -2.99% x 1.03
vec::bench_extend_0010_0010 177 (56 MB/s) 180 (55 MB/s) 3 1.69% x 0.98
vec::bench_extend_0100_0100 410 (243 MB/s) 396 (252 MB/s) -14 -3.41% x 1.04
vec::bench_extend_1000_1000 3,133 (319 MB/s) 3,002 (333 MB/s) -131 -4.18% x 1.04
vec::bench_from_elem_0000 10 10 0 0.00% x 1.00
vec::bench_from_elem_0010 52 (192 MB/s) 51 (196 MB/s) -1 -1.92% x 1.02
vec::bench_from_elem_0100 168 (595 MB/s) 192 (520 MB/s) 24 14.29% x 0.88
vec::bench_from_elem_1000 1,284 (778 MB/s) 1,264 (791 MB/s) -20 -1.56% x 1.02
vec::bench_from_fn_0000 11 11 0 0.00% x 1.00
vec::bench_from_fn_0010 54 (185 MB/s) 60 (166 MB/s) 6 11.11% x 0.90
vec::bench_from_fn_0100 207 (483 MB/s) 228 (438 MB/s) 21 10.14% x 0.91
vec::bench_from_fn_1000 1,627 (614 MB/s) 1,600 (625 MB/s) -27 -1.66% x 1.02
vec::bench_from_iter_0000 15 15 0 0.00% x 1.00
vec::bench_from_iter_0010 55 (181 MB/s) 51 (196 MB/s) -4 -7.27% x 1.08
vec::bench_from_iter_0100 142 (704 MB/s) 142 (704 MB/s) 0 0.00% x 1.00
vec::bench_from_iter_1000 1,110 (900 MB/s) 1,057 (946 MB/s) -53 -4.77% x 1.05
vec::bench_from_slice_0000 23 23 0 0.00% x 1.00
vec::bench_from_slice_0010 94 (106 MB/s) 110 (90 MB/s) 16 17.02% x 0.85
vec::bench_from_slice_0100 202 (495 MB/s) 228 (438 MB/s) 26 12.87% x 0.89
vec::bench_from_slice_1000 1,357 (736 MB/s) 1,320 (757 MB/s) -37 -2.73% x 1.03
vec::bench_new 0 0 0 NaN% x NaN
vec::bench_push_all_0000_0000 21 21 0 0.00% x 1.00
vec::bench_push_all_0000_0010 61 (163 MB/s) 58 (172 MB/s) -3 -4.92% x 1.05
vec::bench_push_all_0000_0100 164 (609 MB/s) 147 (680 MB/s) -17 -10.37% x 1.12
vec::bench_push_all_0000_1000 1,089 (918 MB/s) 1,072 (932 MB/s) -17 -1.56% x 1.02
vec::bench_push_all_0010_0010 139 (71 MB/s) 127 (78 MB/s) -12 -8.63% x 1.09
vec::bench_push_all_0100_0100 338 (295 MB/s) 334 (299 MB/s) -4 -1.18% x 1.01
vec::bench_push_all_1000_1000 2,979 (335 MB/s) 3,086 (324 MB/s) 107 3.59% x 0.97
vec::bench_push_all_move_0000_0000 36 37 1 2.78% x 0.97
vec::bench_push_all_move_0000_0010 118 (84 MB/s) 116 (86 MB/s) -2 -1.69% x 1.02
vec::bench_push_all_move_0000_0100 224 (446 MB/s) 234 (427 MB/s) 10 4.46% x 0.96
vec::bench_push_all_move_0000_1000 1,400 (714 MB/s) 1,352 (739 MB/s) -48 -3.43% x 1.04
vec::bench_push_all_move_0010_0010 178 (56 MB/s) 196 (51 MB/s) 18 10.11% x 0.91
vec::bench_push_all_move_0100_0100 399 (250 MB/s) 399 (250 MB/s) 0 0.00% x 1.00
vec::bench_push_all_move_1000_1000 3,218 (310 MB/s) 3,334 (299 MB/s) 116 3.60% x 0.97
vec::bench_with_capacity_0000 2 2 0 0.00% x 1.00
vec::bench_with_capacity_0010 31 (322 MB/s) 31 (322 MB/s) 0 0.00% x 1.00
vec::bench_with_capacity_0100 31 (3225 MB/s) 30 (3333 MB/s) -1 -3.23% x 1.03
vec::bench_with_capacity_1000 31 (32258 MB/s) 34 (29411 MB/s) 3 9.68% x 0.91
vec_deque::bench_grow_1025 5,754 5,546 -208 -3.61% x 1.04
vec_deque::bench_iter_1000 1,002 910 -92 -9.18% x 1.10
vec_deque::bench_mut_iter_1000 983 924 -59 -6.00% x 1.06
vec_deque::bench_new 37 36 -1 -2.70% x 1.03

@alexcrichton For the run-time parts of the benchmarks, whether that's relevant or not will depend on how much rustc uses ...

@Mark-Simulacrum could you launch a perf run? Does ...
I am of the opinion that all `Alloc` methods should return the excess. The only situation in which returning the excess ...
@bors: try

FWIW a perf run isn't a great way to measure this; it's mostly a "hey look, there's no obvious regressions by accident" sort of signal.
Add missing _excess methods to Alloc and use them in RawVec

Adds the following methods to the `Alloc` trait:

* `alloc_zeroed_excess`
* `shrink_in_place_excess`
* `grow_in_place_excess`

Uses them in the appropriate places within the `RawVec` implementation, with the exception of `shrink_in_place_excess`, which should be used in `RawVec::shrink_to_fit` but will be done in a different PR.
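To illustrate what the `_excess` pattern buys a `RawVec`-style growth path: an allocation call that reports its actual usable size lets the vector record that as its capacity, so the size-class slack is usable for free. The `size_class` rounding and the `alloc_excess` helper below are hypothetical stand-ins (mimicking the unstable `Excess(ptr, usable_size)` convention), not real jemalloc behavior:

```rust
// Hypothetical sketch of an excess-reporting allocation path, built on the
// stable std::alloc API. A real allocator's size classes are more involved.
use std::alloc::{alloc, dealloc, Layout};

/// Round a request up to a made-up size class (powers of two here),
/// standing in for an allocator's real size-class table.
fn size_class(requested: usize) -> usize {
    requested.next_power_of_two()
}

/// Allocate `requested` bytes, returning the pointer and the usable size.
fn alloc_excess(requested: usize) -> (*mut u8, usize) {
    let usable = size_class(requested);
    let layout = Layout::from_size_align(usable, 1).unwrap();
    let ptr = unsafe { alloc(layout) };
    assert!(!ptr.is_null(), "allocation failure");
    (ptr, usable)
}

fn main() {
    // A RawVec-style growth path: ask for 100 bytes, but record the usable
    // size as the capacity so later pushes can use the slack without
    // reallocating.
    let requested = 100;
    let (ptr, capacity) = alloc_excess(requested);
    assert!(capacity >= requested);
    assert_eq!(capacity, 128); // 100 rounds up to the next power of two
    unsafe { dealloc(ptr, Layout::from_size_align(capacity, 1).unwrap()) };
    println!("requested {requested}, capacity {capacity}");
}
```

Without the excess variant, the vector would record a capacity of 100 and reallocate on the 101st byte even though the block can already hold 128.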
cc @SimonSapin this only touches ...

☀️ Test successful - status-travis

I've scheduled a perf run, data will show up here when ready.

Awesome, thanks @alexcrichton!

I think the discussion about ...
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0901r0.html suggests the same. FWIW I implemented that same API on jemalloc (jemalloc/jemalloc#1074 (comment)), and in some cases it is much faster than doing ...
Maybe what's missing is for ...

I also wondered whether Excess should just return a ...

I'll try to chime in there tomorrow.

Ping from triage @sfackler! What's the status of this PR?

Great! That shows there aren't any "obvious" regressions.
```diff
@@ -574,6 +574,10 @@ impl<T> Vec<T> {
     /// It will drop down as close as possible to the length but the allocator
     /// may still inform the vector that there is space for a few more elements.
     ///
+    /// Note: `shrink_to_fit` is a non-binding request. Whether the `Vec` size
```
In what way will this help someone reading the documentation?
```rust
if l <= new_size {
    return Ok(Excess(ptr, u));
} else {
    return Err(CannotReallocInPlace);
}
```
How can we end up in this situation? The allocation can always fit less data than it already contains, right?
There are two issues that interact here:

- the user is not allowed to call `dealloc` with any size; the size used must be in the range `(l, u)` returned by `usable_size`. Sizes in that range must not invoke undefined behavior when passed to `dealloc`.
- the only requirement on the sizes `(l, u)` returned by `usable_size` is that `l <= u`. Therefore, a valid `usable_size` implementation is `usable_size() -> (u, u)`.

Therefore, we can end up here if the lower bound `l` on the allocation size returned by `usable_size -> (l, u)` is larger than `new_size`, which happens if `usable_size() -> (u, u)` and `new_size < u`.

If we were to return `Ok` in this case, the user would be allowed to call `dealloc` with `new_size`, but since that's out of the `[l, u]` range, that would trigger undefined behavior when calling `dealloc`.

Does that make sense?
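The legality check being discussed can be sketched as a pure function over `(l, u, new_size)`. The names `shrink_in_place_ok` and the struct below are illustrative; only the `usable_size`-range reasoning comes from the discussion above:

```rust
// Sketch of the shrink-in-place legality check: shrinking to `new_size` is
// only OK if `new_size` stays within the usable-size range [l, u], since the
// user may later call `dealloc` with `new_size`.

#[derive(Debug, PartialEq)]
struct CannotReallocInPlace;

/// `usable_size` returns a range (l, u); this checks whether a shrink to
/// `new_size` keeps the size valid for a later `dealloc`, and reports the
/// full usable size `u` back to the caller on success.
fn shrink_in_place_ok(
    l: usize,
    u: usize,
    new_size: usize,
) -> Result<usize, CannotReallocInPlace> {
    if l <= new_size && new_size <= u {
        Ok(u)
    } else {
        Err(CannotReallocInPlace)
    }
}

fn main() {
    // A degenerate but valid implementation may report (u, u): then any
    // `new_size < u` must be rejected, exactly the case from the discussion.
    assert_eq!(shrink_in_place_ok(4096, 4096, 100), Err(CannotReallocInPlace));
    // With a looser lower bound, the same shrink is fine.
    assert!(shrink_in_place_ok(0, 4096, 100).is_ok());
}
```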
FWIW I think that the current API of `shrink_in_place` is a bit broken. For example, `malloc` does not care about the allocation size on deallocation, so a valid implementation of `usable_size` for malloc is to return `(0, size)`. This would allow implementing `shrink_in_place` / `shrink_in_place_excess` for `malloc` by just doing nothing and returning `Ok`, but I don't think that's a useful implementation of `shrink_in_place`. Even if it were to return `Ok` only when it was able to shrink the allocation, that might not be useful either: if I have a 1 Gb allocation and want to shrink it in place to 100 Mb, and `shrink_in_place` only shrinks it by 1 byte, then that's not useful.

The problematic wording is that `shrink_in_place` is allowed to return more size than the one requested, which means that it can just return the original size and do nothing. For `grow_in_place` this wording makes sense, but it seems that it was copy-pasted for `shrink_in_place`. So either `shrink_in_place` should always return the allocation size, so that the user can check whether it shrunk the allocation enough, or it should take a requested size and an upper bound, and return an error if it can't shrink the allocation under the upper bound.
Also, while I understand the rationale behind splitting `realloc_in_place` into `{grow,shrink}_in_place`, I am starting to doubt whether that was the right call. A single `realloc_in_place` method would reduce API surface and simplify things, even if it adds a branch for allocators that have completely different execution paths for growing and shrinking.

In any case, all of this must be discussed in the RFC issue, not here, but those are my thoughts on the matter.
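The single-method alternative amounts to one extra branch at the entry point. The `Allocation` type and method names below are hypothetical, not the actual `Alloc` trait; they only model the dispatch being weighed:

```rust
// Sketch of the "single method" alternative: one `realloc_in_place` that
// branches once into separate grow/shrink paths, halving the API surface.

#[derive(Debug, PartialEq)]
enum ReallocError {
    CannotReallocInPlace,
}

/// Toy model of an allocation: a current size inside a block with a fixed
/// usable size.
struct Allocation {
    size: usize,
    usable: usize,
}

impl Allocation {
    fn grow_in_place(&mut self, new_size: usize) -> Result<(), ReallocError> {
        // Growing in place only works within the already-usable block.
        if new_size <= self.usable {
            self.size = new_size;
            Ok(())
        } else {
            Err(ReallocError::CannotReallocInPlace)
        }
    }

    fn shrink_in_place(&mut self, new_size: usize) -> Result<(), ReallocError> {
        // Shrinking within the block always succeeds in this toy model.
        self.size = new_size;
        Ok(())
    }

    /// The single-entry-point variant: one branch, then the two paths.
    fn realloc_in_place(&mut self, new_size: usize) -> Result<(), ReallocError> {
        if new_size >= self.size {
            self.grow_in_place(new_size)
        } else {
            self.shrink_in_place(new_size)
        }
    }
}

fn main() {
    let mut a = Allocation { size: 100, usable: 128 };
    assert!(a.realloc_in_place(120).is_ok()); // grows into the slack
    assert!(a.realloc_in_place(64).is_ok()); // shrinks
    assert_eq!(a.realloc_in_place(200), Err(ReallocError::CannotReallocInPlace));
}
```

Whether the saved branch in callers outweighs the doubled trait surface is exactly the trade-off the comment above questions.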
On the other hand, it also doesn't show any particular advantage to making this change.
Yes, these changes should not have any impact on perf at all, since liballoc_jemalloc does not specialize the ...
Ping from triage! What's the status of this PR?

Ping from triage @sfackler! What's the status of this PR?

I'd like to see a concrete performance improvement before adding more complexity here.
The liballoc_jemalloc specialization of the Alloc functions was removed from rustc, and the plan is to completely remove liballoc_jemalloc and move to jemallocator. Jemallocator can't implement these APIs until they are part of the Alloc trait. Std can't use these APIs until they are part of the Alloc trait. Perf runs without both std using these APIs and the allocator specializing them are meaningless beyond "this change does not introduce regressions". So can you explain how to do a meaningful perf run from this? I can try to remove liballoc_jemalloc from rustc, include jemallocator as a 3rd-party dependency, make rustc use that by default, change all of std to use the new APIs, and then report back. Is that what you mean should be done?
Performance is not the point anyways. The point is aligning the allocated sizes from types like ...

However, I still think this adds too much API surface.
I have a branch that changes the ...

Did you look at the rationale when ...
Yes, the rationale for the split was that two different functions make intent clearer and can save a branch in the implementation of ...

AFAIK there is no rationale anywhere for why the current APIs for ...

Fixing ...

FWIW nowhere was it discussed whether saving a branch in ...

The main reason I went with ...
The fact that it's annoying to determine whether a change is actually beneficial or not doesn't mean we should just YOLO-merge every exotic variant on allocation/reallocation/deallocation/whatever that could possibly exist. I am generally of the opinion that all of the _excess methods are a waste of effort and complexity budget. If this change couldn't have any benefit until it's propagated through the global allocator API, then it seems like this change should do that? You could alternatively yank the relevant code out into a separate project and point the allocation calls to something that actually forwards to the right allocator methods for the purpose of benchmarking.
What's annoying is that the work to determine the effect of this change was already done, but the specialization of Alloc for jemalloc was removed "YOLO-style": no benchmarks required for that change.
Ping from triage! @sfackler, what should we do with this?
@pietroalbini Probably close. This is now in the limbo where the only thing I can do to benchmark this is a synthetic benchmark using jemallocator and a modified version of rustc. For example, creating one vector/string of a particular size, and then pushing one element to it. Depending on whether the vector reallocates or not, the timings vary significantly. So I can always choose a size for which, if there is capacity available, the excess variant takes 1 ns for the push back, and the non-excess one takes at least 20-30 ns because of the reallocation. Depending on how hard I want to make my point, I can make the vectors huge and choose one of the known particularly bad prime numbers for jemalloc, so that I can pump those numbers into the 1000 ns range. But I don't know what any of that would prove. Before ...
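The push-one-element experiment described above can be sketched with stable `Vec` APIs: whether the push reallocates depends entirely on whether the buffer already has slack, which is the information the `_excess` variants would surface. This is only a behavioral sketch, not a timing benchmark:

```rust
// Sketch of the synthetic benchmark scenario: a push into spare capacity is a
// write plus a length bump; a push into an exactly-full Vec must reallocate
// and copy the whole buffer.

fn main() {
    let n = 1000;

    // Slack available: capacity is unchanged by the push.
    let mut with_slack: Vec<u8> = Vec::with_capacity(n + 1);
    with_slack.resize(n, 0);
    let cap_before = with_slack.capacity();
    with_slack.push(1);
    assert_eq!(with_slack.capacity(), cap_before); // no reallocation needed

    // Filled exactly to capacity: the next push must grow the buffer.
    let mut full: Vec<u8> = Vec::with_capacity(n);
    while full.len() < full.capacity() {
        full.push(0);
    }
    let cap = full.capacity();
    full.push(1); // this push reallocates and copies
    assert!(full.capacity() > cap);

    println!("slack cap: {cap_before}, grown cap: {}", full.capacity());
}
```

An excess-aware growth path widens the "slack available" case, since capacity is set to the allocator's usable size rather than the requested size.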
Off-topic, but you totally can do that in a Servo branch; it just won't pass our Android CI at the moment.

Ok, closing this.