-
Notifications
You must be signed in to change notification settings - Fork 13.4k
Maybe emit phis directly instead of using alloca #11008
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
cc @pcwalton |
Emitting an SSA-form IR without using Would we have a reason for doing this other than giving LLVM an easier time? I would imagine that LLVM would be able of promoting any alloca slot that really matters, but that's just speculation on my end. Do we know of cases where LLVM did not promote the alloca and it really should have? If I take your code example you gave and compile it with rust, LLVM promotes all of our alloca instances to temporaries (merged with phi instructions) just fine, so this would right now be purely a before-LLVM optimization rather than a overall performance-win optimization. |
@alexcrichton As far as "minimum number of phi nodes", that's not really the concern ... every block needs one for every variable that has a different value depending on which edge it's following. Even if you just emit all the phi nodes, that simplification would be easier for LLVM's optimizers than getting rid of the allocas. I'm fairly certain that this would improve compile speed (when optimization is on - when optimization is off it would probably improve runtime speed closer to what optimized output does), but it's not like this bug is important to me. |
To me I don't think we'd get a lot of compile-time speedup by emitting an "even more" SSA-form IR, especially when LLVM optimizations almost always completely dominate our compile times. This may be an interesting thing to pursue, but until we know that there are concrete benefits to emitting an IR with fewer allocas, I'm going to close this for now. |
But I think that's the point: if we can spend a little bit of time emitting a bit better IR, optimizations won't take as long. |
@cmr exactly. Since you know more about the structure of the code, it should be cheaper for you to do it. |
We used to omit phis. iirc, Chris Lattner himself told us to remove it, since that makes integration with LLVM's debug info facility better. I was personally happy to do so because it makes the code in trans easier to read, which I personally still value as much or more than tiny upticks in compilation time. But in this case the debug info case is the killer argument. |
Well, actually, I think the debug info has more to do with ensuring that all local variables are in memory. But I'm not sure. Just using phis for if/match might be ok, though as I said I think the code is easier to reason about as is. |
Don't lint code from external macros for 8 lints Fixes rust-lang#11008 changelog: [`unneeded_field_pattern`], [`redundant_pattern`], [`uneeded_wildcard_pattern`], [`double_neg`], [`separated_literal_suffix`], [`unseparated_literal_suffix`], [`mixed_case_hex_literals`], [`zero_prefixed_literal`]: Don't lint code from external macros (wow, that's one hell of a changelog for such a small change...)
It might be better to generate phis instead of emitting allocas and letting LLVM do the work in the (I think) SRoA optimization pass.
An example of what could be done (in C, without the case of taking the address of the local) is at https://gist.github.com/o11c/7995342
The general approach is:
For each block, create a phi node for each live variable.
For each jump to a block, add an alternative on the phi with the value before that jump.
Unlike C++, Rust can afford to do this in the general case because moves are always free. However, it is still complicated by the fact that there may be references to the variable.
The text was updated successfully, but these errors were encountered: