replace the system allocator in executables #18915
This provides one piece of the puzzle but there are various ways this could be improved. It is possible to replace the platform allocator on OS X / Windows but it's not as simple as overriding weak symbols. It could also use symbol aliases instead of wrappers in the case where liballoc is being statically linked.

Future improvements

It would be nice if an alternate global allocator could be dropped in at runtime (dropping in an asserts build of jemalloc or another allocator), but it's not yet clear if there's a good way to do it beyond 2 layers of indirection (wrapper functions marked as weak). On some platforms like Windows, mixed allocator usage is outweighed by the gains of using a better allocator in Rust code. However, it's a performance / memory usage loss relative to using a single great allocator in both Rust and C due to various forms of fragmentation. On FreeBSD, the system allocator is jemalloc so Rust should avoid bundling it there in the future.
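To make the wrapper-versus-alias distinction concrete, here is a minimal sketch, not the code in this pull request. It assumes GCC or Clang targeting ELF (the `weak` and `alias` attributes are compiler extensions, and `alias` is not available on Mach-O). All `demo_*` names are hypothetical, and `demo_backend_alloc` merely stands in for a prefixed allocator entry point such as `je_malloc`.

```c
#include <stddef.h>
#include <stdlib.h>

/* Counts calls into the stand-in backend so the behaviour can be observed. */
size_t demo_backend_calls = 0;

/* Hypothetical stand-in for a prefixed allocator entry point. */
void *demo_backend_alloc(size_t size) {
    demo_backend_calls++;
    return malloc(size);
}

/* Wrapper approach: a weak definition that forwards to the backend.
 * Another object file may supply a strong definition of demo_alloc,
 * which the linker would then prefer. Costs one extra call. */
__attribute__((weak)) void *demo_alloc(size_t size) {
    return demo_backend_alloc(size);
}

/* Alias approach: a second symbol name for the very same function.
 * No forwarding call, but the target must be defined in this
 * translation unit. */
void *demo_alloc_alias(size_t size)
    __attribute__((alias("demo_backend_alloc")));
```

Both names end up in the same backend. The alias avoids the forwarding call, which is presumably the appeal when everything is statically linked.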
👍 🍰
@thestinger Thanks for following up with this patch. I can't offhand think of any complications this might cause so I'm tentatively in favor. I imagine the ultimate design here might change, along with changes to both It's worth noting that even after this patch rustc's memory usage will not be improved on OS X or Windows.
Actually, @thestinger will this result in overriding malloc on Mac via the Zone allocator API? What happens on Windows with this patch?
Oh, duh. This specifically targets Linux so has no effect on OS X and Windows.
@brson My understanding from the previous debacle is that Rust currently already overrides the allocator on OS X.
void *mallocx(size_t size, int flags) {
    return je_mallocx(size, flags);
}
How come this reexports a number of jemalloc symbols without the je_ prefix? I would expect the standard libc weak symbols to be exposed, but the jemalloc symbols aren't able to be overridden, right?
None of the public symbols defined by jemalloc are weak symbols. I'm exporting these to address the demand that mallocx be usable as it is in vanilla jemalloc with no prefix.
Can you elaborate on this "demand" a little more? This is basically one of the possible shims rustc can inject, and the purpose is to override the system malloc/free, and I am unaware of the desire to export jemalloc-specific symbols as well.
It's necessary for Rust's jemalloc to satisfy the needs of third party code calling into jemalloc. That was the primary argument against the last pull request...
If Rust doesn't do this, then third party code using jemalloc cannot be used. C libraries don't usually have versions in the symbol names, so you can't just have multiple copies living side-by-side without problems.
I was under the impression that this "third party code" was primarily code in other processes that Rust itself was linked into. Either via a staticlib, dylib, or dlopen()'d dylib. Within a Rust executable itself (which this PR is focused on), however, I don't think that this would help too much. Libraries should be written knowing that the allocator is not their decision, and should plan appropriately (not relying on an upstream definition of jemalloc). Native code linked into an executable cannot rely on the existence of these symbols as the compiler is the one choosing whether to link in jemalloc or not, not the code itself.
Note that I'm just looking at this from the perspective of having this shim be as small as possible. I'd rather stick to well-known standardized APIs like malloc than duplicate the nonstandard APIs of jemalloc. If these were to fall out of sync with the jemalloc definitions, then I imagine badness could ensue.
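As an illustration of the point that native code cannot assume the unprefixed jemalloc symbols exist, here is one defensive pattern a C library could use. It is a sketch under assumptions, not something proposed in this thread: it relies on the GCC/Clang `weak` attribute on ELF targets, and `demo_alloc_portable` and `demo_has_mallocx` are hypothetical names. Only the `mallocx(size_t, int)` signature comes from jemalloc itself.

```c
#include <stddef.h>
#include <stdlib.h>

/* Weak reference: if no object in the final link defines mallocx,
 * the symbol's address resolves to NULL instead of failing the link. */
extern void *mallocx(size_t size, int flags) __attribute__((weak));

/* Reports whether an unprefixed mallocx was linked in. */
int demo_has_mallocx(void) {
    return mallocx != NULL;
}

/* Uses mallocx when it is present, otherwise plain malloc.
 * Releasing the result with free() is only safe if the allocator that
 * provides mallocx also provides free, which is exactly the
 * mallocx/free mixing concern discussed in this thread. */
void *demo_alloc_portable(size_t size) {
    if (mallocx != NULL) {
        return mallocx(size, 0); /* flags == 0: default behaviour */
    }
    return malloc(size);
}
```

A library written this way keeps working whether or not the compiler chose to link jemalloc, at the cost of a runtime check.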
> I was under the impression that this "third party code" was primarily code in other processes that Rust itself was linked into.
If that code depends on jemalloc, then it will need to be using Rust's jemalloc.
> I was under the impression that this "third party code" was primarily code in other processes that Rust itself was linked into. Either via a staticlib, dylib, or dlopen()'d dylib. Within a Rust executable itself (which this PR is focused on), however, I don't think that this would help too much. Libraries should be written knowing that the allocator is not their decision, and should plan appropriately (not relying on an upstream definition of jemalloc). Native code linked into an executable cannot rely on the existence of these symbols as the compiler is the one choosing whether to link in jemalloc or not, not the code itself.

The only argument against the previous one was that it would break code relying on mixing mallocx and free. The previous pull request was simpler and didn't have the added overhead of these wrapper functions. I'll just reopen it in favour of this one if that dubious argument has been abandoned.
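To show why mixing an allocation call from one allocator with a free call from another is a problem, here is a toy model. It is purely illustrative and assumes nothing about jemalloc's internals: every `demo_*` name is hypothetical, and each toy allocator simply stamps an owner tag in front of the block. A real allocator handed a foreign pointer would typically corrupt its metadata or abort rather than return an error code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed before each block; the union pads it so the user
 * pointer stays suitably aligned for any type. */
typedef union {
    uint32_t owner;
    max_align_t pad;
} demo_header;

enum { DEMO_OWNER_A = 1, DEMO_OWNER_B = 2 };

/* Allocates a block and records which toy allocator owns it. */
static void *demo_tagged_alloc(uint32_t owner, size_t size) {
    demo_header *h = malloc(sizeof *h + size);
    if (h == NULL) {
        return NULL;
    }
    h->owner = owner;
    return h + 1;
}

/* Returns 0 on success, -1 if the block belongs to another allocator. */
static int demo_tagged_free(uint32_t owner, void *ptr) {
    demo_header *h = (demo_header *)ptr - 1;
    if (h->owner != owner) {
        return -1; /* foreign pointer: refuse instead of corrupting state */
    }
    free(h);
    return 0;
}

void *demo_a_alloc(size_t n) { return demo_tagged_alloc(DEMO_OWNER_A, n); }
int demo_a_free(void *p) { return demo_tagged_free(DEMO_OWNER_A, p); }
void *demo_b_alloc(size_t n) { return demo_tagged_alloc(DEMO_OWNER_B, n); }
int demo_b_free(void *p) { return demo_tagged_free(DEMO_OWNER_B, p); }
```

When one allocator supplies both the `mallocx`-style entry point and `free`, the two always agree on who owns a block, so the mismatch cannot occur.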
> I'd rather stick to well-known standardized apis like malloc than duplicate the nonstandard apis of jemalloc.
They are not "duplicated" in any way. It is manually removing the prefix because you rejected my pull request doing this the easy and low-overhead way by using the default configuration.
> If these were to fall out of sync with the jemalloc definitions, then I imagine badness could ensue.
It's a stable API. There was a long deprecation period for the old experimental API before the shift to this one.
I've looked this over and I've just got one question about the number of reexported symbols, but otherwise looks good to me.
This adds support for replacing the system allocator with jemalloc by overriding the weak symbols on Linux (including Android). It is disabled by default with #![no_std] and can be toggled via a compiler switch. It will be possible to extend this to other platforms in the future. This results in a performance improvement for memory allocation in C along with reduced fragmentation. For example, the time spent on LLVM passes in the Rust compiler on Linux is cut by 10% and peak memory usage is reduced by 15%. Closes #18896
Is third party code mixing |
I've discussed this with @nikomatsakis and I would be ok for now with adding a comment to the C file explicitly stating that the reexportation of any non-libc symbol is experimental and may change in the future. Can you add a comment to that effect?
Closing due to inactivity, but feel free to reopen with my comment addressed!