-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[RISCV] Preserve tail agnostic policy in foldVMV_V_V #105788
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-risc-v Author: Luke Lau (lukel97) ChangesThis patch helps avoid regressions in an upcoming patch by making sure we don't accidentally lose a tail agnostic policy when doing some peepholes. There are two places this happens:
Full diff: https://github.com/llvm/llvm-project/pull/105788.diff 3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp b/llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp
index 9772782ad3d6db..2b6ea66d0021e2 100644
--- a/llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp
+++ b/llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp
@@ -362,8 +362,10 @@ bool RISCVVectorPeephole::convertVMergeToVMv(MachineInstr &MI) const {
MI.removeOperand(1); // Passthru operand
MI.tieOperands(0, 1); // Tie false to dest
MI.removeOperand(3); // Mask operand
- MI.addOperand(
- MachineOperand::CreateImm(RISCVII::TAIL_UNDISTURBED_MASK_UNDISTURBED));
+ int64_t Policy = RISCVII::TAIL_UNDISTURBED_MASK_UNDISTURBED;
+ if (PassthruReg == RISCV::NoRegister)
+ Policy |= RISCVII::TAIL_AGNOSTIC;
+ MI.addOperand(MachineOperand::CreateImm(Policy));
// vmv.v.v doesn't have a mask operand, so we may be able to inflate the
// register class for the destination and passthru operands e.g. VRNoV0 -> VR
@@ -498,10 +500,12 @@ bool RISCVVectorPeephole::foldVMV_V_V(MachineInstr &MI) {
*Src->getParent()->getParent()));
}
- // Use a conservative tu,mu policy, RISCVInsertVSETVLI will relax it if
- // passthru is undef.
- Src->getOperand(RISCVII::getVecPolicyOpNum(Src->getDesc()))
- .setImm(RISCVII::TAIL_UNDISTURBED_MASK_UNDISTURBED);
+ // If MI had a ta policy and the VL didn't increase, we can preserve it.
+ int64_t Policy = RISCVII::TAIL_UNDISTURBED_MASK_UNDISTURBED;
+ if ((MI.getOperand(5).getImm() & RISCVII::TAIL_AGNOSTIC) &&
+ isVLKnownLE(MI.getOperand(3), SrcVL))
+ Policy |= RISCVII::TAIL_AGNOSTIC;
+ Src->getOperand(RISCVII::getVecPolicyOpNum(Src->getDesc())).setImm(Policy);
MRI->replaceRegWith(MI.getOperand(0).getReg(), Src->getOperand(0).getReg());
MI.eraseFromParent();
diff --git a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-to-vmv.mir b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-to-vmv.mir
index 01fff3de0aa8bd..9080fcfb0f0420 100644
--- a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-to-vmv.mir
+++ b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-to-vmv.mir
@@ -15,7 +15,7 @@ body: |
; CHECK-NEXT: %avl:gprnox0 = COPY $x1
; CHECK-NEXT: %mask:vmv0 = PseudoVMSET_M_B8 %avl, 5 /* e32 */
; CHECK-NEXT: $v0 = COPY %mask
- ; CHECK-NEXT: %x:vr = PseudoVMV_V_V_M1 %false, %true, %avl, 5 /* e32 */, 0 /* tu, mu */
+ ; CHECK-NEXT: %x:vr = PseudoVMV_V_V_M1 %false, %true, %avl, 5 /* e32 */, 1 /* ta, mu */
%false:vr = COPY $v8
%true:vr = COPY $v9
%avl:gprnox0 = COPY $x1
diff --git a/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.mir b/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.mir
new file mode 100644
index 00000000000000..a0edfa96d62485
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.mir
@@ -0,0 +1,18 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
+# RUN: llc %s -o - -mtriple=riscv64 -mattr=+v -run-pass=riscv-vector-peephole \
+# RUN: -verify-machineinstrs | FileCheck %s
+
+---
+name: tail_agnostic
+body: |
+ bb.0:
+ liveins: $v8
+ ; CHECK-LABEL: name: tail_agnostic
+ ; CHECK: liveins: $v8
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: %passthru:vr = COPY $v8
+ ; CHECK-NEXT: %x:vr = PseudoVADD_VV_M1 %passthru, $noreg, $noreg, 4, 5 /* e32 */, 1 /* ta, mu */
+ %passthru:vr = COPY $v8
+ %x:vr = PseudoVADD_VV_M1 %passthru, $noreg, $noreg, 4, 5 /* e32 */, 0 /* tu, mu */
+ %y:vr = PseudoVMV_V_V_M1 %passthru, %x, 4, 5 /* e32 */, 1 /* ta, mu */
+...
|
@@ -498,10 +500,12 @@ bool RISCVVectorPeephole::foldVMV_V_V(MachineInstr &MI) { | |||
*Src->getParent()->getParent())); | |||
} | |||
|
|||
// Use a conservative tu,mu policy, RISCVInsertVSETVLI will relax it if | |||
// passthru is undef. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this not stand for cases in the new test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It still stands, but the passthru isn't undef in the new test so after RISCVInsertVSETVLI a tu policy gets emitted, even though the original policy was ta
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this thread very confusing, so let me comment that I talked to Luke and the original comment never held. We thought it did, but we made a mistake in our reasoning. The missed optimization was simply never visible in practice without the follow up changes.
Can you say a bit about why this helps? It seems slightly odd to me that we need to preserve the policy here. How does doing so fit into your next step? |
I'm trying to move the "fold a vmerge into a masked pseudo, where the masks are the same" bit of RISCVDAGToDAGISel::performCombineVMergeAndVOps to RISCVVectorPeephole. We can do this by extending convertVMergeToVMv to detect when the masks are the same, so the vmerge is converted to a vmv.v.v, and then foldVMV_V_V will do the folding. But if the vmerge's passthru was undef, then the tail is always undef and we can use a ta policy, which is what we currently do in RISCVISelDAGToDAG. When moving this to convertVMergeToVMv, the intermediate vmv.v.v ends up taking the false operand as its passthru which may be non-undef, but the original passthru might have been undef in which case we lose this information. And then separately, we don't propagate the policy in foldVMV_V_V. This is probably easier to see in the aforementioned patch itself, so I'll post that and point out the regressions that this tries to fix |
Posted #106108 to showcase the regressions that this fixes (the code changes themselves are very scrappy, please ignore!) |
@@ -15,7 +15,7 @@ body: | | |||
; CHECK-NEXT: %avl:gprnox0 = COPY $x1 | |||
; CHECK-NEXT: %mask:vmv0 = PseudoVMSET_M_B8 %avl, 5 /* e32 */ | |||
; CHECK-NEXT: $v0 = COPY %mask | |||
; CHECK-NEXT: %x:vr = PseudoVMV_V_V_M1 %false, %true, %avl, 5 /* e32 */, 0 /* tu, mu */ | |||
; CHECK-NEXT: %x:vr = PseudoVMV_V_V_M1 %false, %true, %avl, 5 /* e32 */, 1 /* ta, mu */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to have %false as the passthru here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to, I think we can just use the passthru given that all the elements come from true or the tail (passthru). I think this is just a detail from the initial implementation in https://reviews.llvm.org/D133255
78b2287
to
7bab2f5
Compare
Rebased to add an LLVM IR test case showing the codegen diff for the vmerge.vvm -> vmv.v.v peephole |
vmerge.vvm needs to have an all ones mask, so nothing is taken from the false operand. So instead of checking that the passthru is the same as false, just use the passthru directly for the tail elements. This supersedes the convertVMergeToVMv part of llvm#105788, as noted in https://github.com/llvm/llvm-project/pull/105788/files#r1731683971
…106688) vmerge.vvm needs to have an all ones mask, so nothing is taken from the false operand. So instead of checking that the passthru is the same as false, just use the passthru directly for the tail elements. This supersedes the convertVMergeToVMv part of #105788, as noted in https://github.com/llvm/llvm-project/pull/105788/files#r1731683971
This patch helps avoid regressions in an upcoming patch by making sure we don't accidentally lose a tail agnostic policy when folding a vmv.v.v into its source. The previous comment about RISCVInsertVSETVLI relaxing the policy didn't take into account the fact that there's a policy operand on vmv.v.v, which can be tail agnostic. If the tail is agnostic (via either the policy operand or the passthru being undef) and vmv.v.v's VL <= Src's VL, then Src's tail can be made agnostic.
7bab2f5
to
d1d74fd
Compare
After #106688 I've narrowed this PR to only handle the policy in foldVMV_V_V |
If a PseudoVMV_V_V's passthru is undef then we don't need the Src to have the same passthru, nor do we need to check its VL. The tail policy in these tests is still tu, llvm#105788 should fix this separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -498,10 +500,12 @@ bool RISCVVectorPeephole::foldVMV_V_V(MachineInstr &MI) { | |||
*Src->getParent()->getParent())); | |||
} | |||
|
|||
// Use a conservative tu,mu policy, RISCVInsertVSETVLI will relax it if | |||
// passthru is undef. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this thread very confusing, so let me comment that I talked to Luke and the original comment never held. We thought it did, but we made a mistake in our reasoning. The missed optimization was simply never visible in practice without the follow up changes.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/66/builds/3452 Here is the relevant piece of the build log for the reference
|
This patch helps avoid regressions in an upcoming patch by making sure we don't accidentally lose a tail agnostic policy when folding a vmv.v.v into its source.
The previous comment about RISCVInsertVSETVLI relaxing the policy didn't take into account the fact that there's a policy operand on vmv.v.v, which can be tail agnostic.
If the tail is agnostic (via either the policy operand or the passthru being undef) and vmv.v.v's VL <= Src's VL, then Src's tail can be made agnostic.