[x86] Unable to fold table lookup into tail call #136848
@llvm/issue-subscribers-backend-x86 Author: Hans Wennborg (zmodem)
Reported by @haberman
Consider:
Actual output:
Desired output:
(GCC can do it: https://godbolt.org/z/73rGW9M45)
(For 32-bit x86 we get it right, but only for -fno-pic; for -fpic we get confused: https://godbolt.org/z/1e9s787Gr)
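A minimal sketch of the kind of source pattern under discussion, assuming a two-argument handler signature. The names `Handler`, `h0`, `h1`, `kTable`, and `dispatch` are hypothetical and are not taken from the report; the real reproducer is behind the godbolt links above.

```cpp
// Hypothetical handler type: two arguments, which on x86-64 SysV
// would arrive in RDI and RSI.
using Handler = int (*)(unsigned idx, int arg);

static int h0(unsigned, int arg) { return arg + 1; }
static int h1(unsigned, int arg) { return arg * 2; }

// Hypothetical dispatch table indexed by the first argument.
static const Handler kTable[] = {h0, h1};

// The indirect call is in tail position and forwards both arguments
// unchanged, so a compiler could in principle emit one indirect jump
// whose target is loaded straight from the table (a memory-operand
// tail call) instead of a separate load followed by a register jump.
int dispatch(unsigned idx, int arg) {
  return kTable[idx & 1](idx, arg);  // mask keeps the index in bounds
}
```

Calling `dispatch(0, 5)` runs `h0` and `dispatch(1, 5)` runs `h1`. Whether the load is actually folded into the jump depends on the compiler, target, and flags, which is the question this issue raises.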
The tail call instruction we want(?) is
it takes an
llvm-project/llvm/lib/Target/X86/X86InstrOperands.td, lines 136 to 142 in 208257f
But neither RSI nor RDI is a callee-preserved register, so using them should be fine? The register class is
The comment says that includes
Here:
llvm-project/llvm/lib/Target/X86/X86RegisterInfo.cpp, lines 222 to 233 in a2c1ff1
Okay, and GR64_TC looks like this:
llvm-project/llvm/lib/Target/X86/X86RegisterInfo.td, lines 637 to 638 in 4cc806f
Both RSI and RDI are there, so what's the problem? |
I think this is where we would fold a tail call to TCRETURNmi64:
llvm-project/llvm/lib/Target/X86/X86InstrCompiler.td, lines 1345 to 1349 in e58d227
That sounds a bit special. X86tcret_6regs is defined as:
llvm-project/llvm/lib/Target/X86/X86InstrFragments.td, lines 676 to 684 in e58d227
We're not hitting that
it does not do the fold. Maybe I'm looking at the wrong thing. |
The debug log indicates it is failing |
There's this piece of code in X86DAGToDAGISel::PreprocessISelDAG that is supposed to fix up the chain to make the load foldable.
The |
Thanks for the pointer! It seems possible to coax
Attaching the DAGs from
The one-argument version (where we successfully move the load next to the call and fold it):
The original test case -- two arguments, where we fail to move/fold the load: |
.. or never mind. Those inputs come from |
…two args this folds more stuff, but also finds new breakages Fixes llvm#136848