-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[RISCV] Teach RISCVInsertVSETVLI to work without LiveIntervals #93796
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
At O0 LiveIntervals isn't usually computed, but after moving RISCVInsertVSETVLI to after phi elimination we've added it as a dependency. This removes the dependency on LiveIntervals so O0 builds can stay fast, and also avoids a crash caused by LiveIntervals being run at O0 when I'm not sure if ti was designed to be. The only parts of RISCVInsertVSETVLI that actually use the LiveIntervals analysis are computeInfoForInstr and getInfoForVSETVLI, which lookup val nos for each register AVL use. At O0 we can emulate this by conservatively returning fake but unique val nos, which will give us suboptimal but otherwise correct codegen. We need to make sure that we return the same val no given the same MachineInstr, otherwise computeInfoForInstr/getInforForVSETVLI won't be stable across MachineInstrs and the dataflow analysis will fail to converge. Fixes llvm#93587
@llvm/pr-subscribers-backend-risc-v Author: Luke Lau (lukel97) ChangesAt O0 LiveIntervals isn't usually computed, but after moving RISCVInsertVSETVLI to after phi elimination we've added it as a dependency. This removes the dependency on LiveIntervals so O0 builds can stay fast, and also avoids a crash caused by LiveIntervals being run at O0 when I'm not sure if ti was designed to be. The only parts of RISCVInsertVSETVLI that actually use the LiveIntervals analysis are computeInfoForInstr and getInfoForVSETVLI, which lookup val nos for each register AVL use. At O0 we can emulate this by conservatively returning fake but unique val nos, which will give us suboptimal but otherwise correct codegen. We need to make sure that we return the same val no given the same MachineInstr, otherwise computeInfoForInstr/getInforForVSETVLI won't be stable across MachineInstrs and the dataflow analysis will fail to converge. Fixes #93587 Patch is 20.66 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/93796.diff 4 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index 4c57eecd8465d..081ac621c17d7 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -580,6 +580,8 @@ class VSETVLIInfo {
// boundary slot.
const MachineInstr *getAVLDefMI(const LiveIntervals *LIS) const {
assert(hasAVLReg());
+ if (!LIS)
+ return nullptr;
const VNInfo *VNI =
LIS->getInterval(getAVLReg()).getValNumInfo(getAVLValNo());
auto *MI = LIS->getInstructionFromIndex(VNI->def);
@@ -871,9 +873,9 @@ class RISCVInsertVSETVLI : public MachineFunctionPass {
void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.setPreservesCFG();
- AU.addRequired<LiveIntervals>();
+ AU.addUsedIfAvailable<LiveIntervals>();
AU.addPreserved<LiveIntervals>();
- AU.addRequired<SlotIndexes>();
+ AU.addUsedIfAvailable<SlotIndexes>();
AU.addPreserved<SlotIndexes>();
AU.addPreserved<LiveDebugVariables>();
AU.addPreserved<LiveStacks>();
@@ -885,30 +887,47 @@ class RISCVInsertVSETVLI : public MachineFunctionPass {
private:
bool needVSETVLI(const DemandedFields &Used, const VSETVLIInfo &Require,
- const VSETVLIInfo &CurInfo) const;
- bool needVSETVLIPHI(const VSETVLIInfo &Require,
- const MachineBasicBlock &MBB) const;
+ const VSETVLIInfo &CurInfo);
+ bool needVSETVLIPHI(const VSETVLIInfo &Require, const MachineBasicBlock &MBB);
void insertVSETVLI(MachineBasicBlock &MBB, MachineInstr &MI,
const VSETVLIInfo &Info, const VSETVLIInfo &PrevInfo);
void insertVSETVLI(MachineBasicBlock &MBB,
MachineBasicBlock::iterator InsertPt, DebugLoc DL,
const VSETVLIInfo &Info, const VSETVLIInfo &PrevInfo);
- void transferBefore(VSETVLIInfo &Info, const MachineInstr &MI) const;
- void transferAfter(VSETVLIInfo &Info, const MachineInstr &MI) const;
- bool computeVLVTYPEChanges(const MachineBasicBlock &MBB,
- VSETVLIInfo &Info) const;
+ void transferBefore(VSETVLIInfo &Info, const MachineInstr &MI);
+ void transferAfter(VSETVLIInfo &Info, const MachineInstr &MI);
+ bool computeVLVTYPEChanges(const MachineBasicBlock &MBB, VSETVLIInfo &Info);
void computeIncomingVLVTYPE(const MachineBasicBlock &MBB);
void emitVSETVLIs(MachineBasicBlock &MBB);
void doPRE(MachineBasicBlock &MBB);
void insertReadVL(MachineBasicBlock &MBB);
bool canMutatePriorConfig(const MachineInstr &PrevMI, const MachineInstr &MI,
- const DemandedFields &Used) const;
- void coalesceVSETVLIs(MachineBasicBlock &MBB) const;
-
- VSETVLIInfo getInfoForVSETVLI(const MachineInstr &MI) const;
- VSETVLIInfo computeInfoForInstr(const MachineInstr &MI) const;
+ const DemandedFields &Used);
+ void coalesceVSETVLIs(MachineBasicBlock &MBB);
+
+ VSETVLIInfo getInfoForVSETVLI(const MachineInstr &MI);
+ VSETVLIInfo computeInfoForInstr(const MachineInstr &MI);
+
+ // At O0 LiveIntervals isn't available, but we still need to lookup a val no
+ // for each register AVL use. So generate these conservative dummy val nos
+ // that ensure each register use has a unique val no + register combination.
+ //
+ // They're tied to a MachineInstr since we need computeInfoForInstr and
+ // getInfoForVSETVLI to be stable across MachineInstrs.
+ DenseMap<const MachineInstr *, unsigned> DummyValNos;
+ unsigned NextDummyValNo = 0;
+ unsigned getDummyValNo(const MachineInstr *MI) {
+ unsigned ValNo;
+ if (DummyValNos.contains(MI))
+ ValNo = DummyValNos[MI];
+ else {
+ ValNo = NextDummyValNo++;
+ DummyValNos[MI] = ValNo;
+ }
+ return ValNo;
+ }
};
} // end anonymous namespace
@@ -921,8 +940,7 @@ INITIALIZE_PASS(RISCVInsertVSETVLI, DEBUG_TYPE, RISCV_INSERT_VSETVLI_NAME,
// Return a VSETVLIInfo representing the changes made by this VSETVLI or
// VSETIVLI instruction.
-VSETVLIInfo
-RISCVInsertVSETVLI::getInfoForVSETVLI(const MachineInstr &MI) const {
+VSETVLIInfo RISCVInsertVSETVLI::getInfoForVSETVLI(const MachineInstr &MI) {
VSETVLIInfo NewInfo;
if (MI.getOpcode() == RISCV::PseudoVSETIVLI) {
NewInfo.setAVLImm(MI.getOperand(1).getImm());
@@ -934,6 +952,8 @@ RISCVInsertVSETVLI::getInfoForVSETVLI(const MachineInstr &MI) const {
"Can't handle X0, X0 vsetvli yet");
if (AVLReg == RISCV::X0)
NewInfo.setAVLVLMAX();
+ else if (!LIS)
+ NewInfo.setAVLRegDef(getDummyValNo(&MI), AVLReg);
else if (VNInfo *VNI = getVNInfoFromReg(AVLReg, MI, LIS))
NewInfo.setAVLRegDef(VNI->id, AVLReg);
else {
@@ -956,8 +976,7 @@ static unsigned computeVLMAX(unsigned VLEN, unsigned SEW,
return VLEN/SEW;
}
-VSETVLIInfo
-RISCVInsertVSETVLI::computeInfoForInstr(const MachineInstr &MI) const {
+VSETVLIInfo RISCVInsertVSETVLI::computeInfoForInstr(const MachineInstr &MI) {
VSETVLIInfo InstrInfo;
const uint64_t TSFlags = MI.getDesc().TSFlags;
@@ -1010,6 +1029,8 @@ RISCVInsertVSETVLI::computeInfoForInstr(const MachineInstr &MI) const {
}
else
InstrInfo.setAVLImm(Imm);
+ } else if (!LIS) {
+ InstrInfo.setAVLRegDef(getDummyValNo(&MI), VLOp.getReg());
} else if (VNInfo *VNI = getVNInfoFromReg(VLOp.getReg(), MI, LIS)) {
InstrInfo.setAVLRegDef(VNI->id, VLOp.getReg());
} else {
@@ -1068,7 +1089,8 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
.addReg(RISCV::X0, RegState::Kill)
.addImm(Info.encodeVTYPE())
.addReg(RISCV::VL, RegState::Implicit);
- LIS->InsertMachineInstrInMaps(*MI);
+ if (LIS)
+ LIS->InsertMachineInstrInMaps(*MI);
return;
}
@@ -1085,7 +1107,8 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
.addReg(RISCV::X0, RegState::Kill)
.addImm(Info.encodeVTYPE())
.addReg(RISCV::VL, RegState::Implicit);
- LIS->InsertMachineInstrInMaps(*MI);
+ if (LIS)
+ LIS->InsertMachineInstrInMaps(*MI);
return;
}
}
@@ -1097,7 +1120,8 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
.addReg(RISCV::X0, RegState::Define | RegState::Dead)
.addImm(Info.getAVLImm())
.addImm(Info.encodeVTYPE());
- LIS->InsertMachineInstrInMaps(*MI);
+ if (LIS)
+ LIS->InsertMachineInstrInMaps(*MI);
return;
}
@@ -1111,7 +1135,8 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
.addReg(RISCV::X0, RegState::Kill)
.addImm(Info.encodeVTYPE())
.addReg(RISCV::VL, RegState::Implicit);
- LIS->InsertMachineInstrInMaps(*MI);
+ if (LIS)
+ LIS->InsertMachineInstrInMaps(*MI);
return;
}
// Otherwise use an AVL of 1 to avoid depending on previous vl.
@@ -1119,7 +1144,8 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
.addReg(RISCV::X0, RegState::Define | RegState::Dead)
.addImm(1)
.addImm(Info.encodeVTYPE());
- LIS->InsertMachineInstrInMaps(*MI);
+ if (LIS)
+ LIS->InsertMachineInstrInMaps(*MI);
return;
}
@@ -1129,8 +1155,10 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
.addReg(DestReg, RegState::Define | RegState::Dead)
.addReg(RISCV::X0, RegState::Kill)
.addImm(Info.encodeVTYPE());
- LIS->InsertMachineInstrInMaps(*MI);
- LIS->createAndComputeVirtRegInterval(DestReg);
+ if (LIS) {
+ LIS->InsertMachineInstrInMaps(*MI);
+ LIS->createAndComputeVirtRegInterval(DestReg);
+ }
return;
}
@@ -1140,19 +1168,21 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
.addReg(RISCV::X0, RegState::Define | RegState::Dead)
.addReg(AVLReg)
.addImm(Info.encodeVTYPE());
- LIS->InsertMachineInstrInMaps(*MI);
- // Normally the AVL's live range will already extend past the inserted vsetvli
- // because the pseudos below will already use the AVL. But this isn't always
- // the case, e.g. PseudoVMV_X_S doesn't have an AVL operand.
- LIS->getInterval(AVLReg).extendInBlock(
- LIS->getMBBStartIdx(&MBB), LIS->getInstructionIndex(*MI).getRegSlot());
+ if (LIS) {
+ LIS->InsertMachineInstrInMaps(*MI);
+ // Normally the AVL's live range will already extend past the inserted
+ // vsetvli because the pseudos below will already use the AVL. But this
+ // isn't always the case, e.g. PseudoVMV_X_S doesn't have an AVL operand.
+ LIS->getInterval(AVLReg).extendInBlock(
+ LIS->getMBBStartIdx(&MBB), LIS->getInstructionIndex(*MI).getRegSlot());
+ }
}
/// Return true if a VSETVLI is required to transition from CurInfo to Require
/// given a set of DemandedFields \p Used.
bool RISCVInsertVSETVLI::needVSETVLI(const DemandedFields &Used,
const VSETVLIInfo &Require,
- const VSETVLIInfo &CurInfo) const {
+ const VSETVLIInfo &CurInfo) {
if (!CurInfo.isValid() || CurInfo.isUnknown() || CurInfo.hasSEWLMULRatioOnly())
return true;
@@ -1197,7 +1227,7 @@ static VSETVLIInfo adjustIncoming(VSETVLIInfo PrevInfo, VSETVLIInfo NewInfo,
// is compatible with MI. The resulting state is guaranteed to be semantically
// legal for MI, but may not be the state requested by MI.
void RISCVInsertVSETVLI::transferBefore(VSETVLIInfo &Info,
- const MachineInstr &MI) const {
+ const MachineInstr &MI) {
if (!RISCVII::hasSEWOp(MI.getDesc().TSFlags))
return;
@@ -1250,7 +1280,7 @@ void RISCVInsertVSETVLI::transferBefore(VSETVLIInfo &Info,
// this might be different that the state MI requested), modify the state to
// reflect the changes MI might make.
void RISCVInsertVSETVLI::transferAfter(VSETVLIInfo &Info,
- const MachineInstr &MI) const {
+ const MachineInstr &MI) {
if (isVectorConfigInstr(MI)) {
Info = getInfoForVSETVLI(MI);
return;
@@ -1259,10 +1289,14 @@ void RISCVInsertVSETVLI::transferAfter(VSETVLIInfo &Info,
if (RISCV::isFaultFirstLoad(MI)) {
// Update AVL to vl-output of the fault first load.
assert(MI.getOperand(1).getReg().isVirtual());
- auto &LI = LIS->getInterval(MI.getOperand(1).getReg());
- SlotIndex SI = LIS->getSlotIndexes()->getInstructionIndex(MI).getRegSlot();
- VNInfo *VNI = LI.getVNInfoAt(SI);
- Info.setAVLRegDef(VNI->id, MI.getOperand(1).getReg());
+ if (LIS) {
+ auto &LI = LIS->getInterval(MI.getOperand(1).getReg());
+ SlotIndex SI =
+ LIS->getSlotIndexes()->getInstructionIndex(MI).getRegSlot();
+ VNInfo *VNI = LI.getVNInfoAt(SI);
+ Info.setAVLRegDef(VNI->id, MI.getOperand(1).getReg());
+ } else
+ Info.setUnknown();
return;
}
@@ -1275,7 +1309,7 @@ void RISCVInsertVSETVLI::transferAfter(VSETVLIInfo &Info,
}
bool RISCVInsertVSETVLI::computeVLVTYPEChanges(const MachineBasicBlock &MBB,
- VSETVLIInfo &Info) const {
+ VSETVLIInfo &Info) {
bool HadVectorOp = false;
Info = BlockInfo[MBB.getNumber()].Pred;
@@ -1349,13 +1383,16 @@ void RISCVInsertVSETVLI::computeIncomingVLVTYPE(const MachineBasicBlock &MBB) {
// be unneeded if the AVL was a phi node where all incoming values are VL
// outputs from the last VSETVLI in their respective basic blocks.
bool RISCVInsertVSETVLI::needVSETVLIPHI(const VSETVLIInfo &Require,
- const MachineBasicBlock &MBB) const {
+ const MachineBasicBlock &MBB) {
if (DisableInsertVSETVLPHIOpt)
return true;
if (!Require.hasAVLReg())
return true;
+ if (!LIS)
+ return true;
+
// We need the AVL to have been produced by a PHI node in this basic block.
const VNInfo *Valno = LIS->getInterval(Require.getAVLReg())
.getValNumInfo(Require.getAVLValNo());
@@ -1432,27 +1469,30 @@ void RISCVInsertVSETVLI::emitVSETVLIs(MachineBasicBlock &MBB) {
MachineOperand &VLOp = MI.getOperand(getVLOpNum(MI));
if (VLOp.isReg()) {
Register Reg = VLOp.getReg();
- LiveInterval &LI = LIS->getInterval(Reg);
// Erase the AVL operand from the instruction.
VLOp.setReg(RISCV::NoRegister);
VLOp.setIsKill(false);
- SmallVector<MachineInstr *> DeadMIs;
- LIS->shrinkToUses(&LI, &DeadMIs);
- // We might have separate components that need split due to
- // needVSETVLIPHI causing us to skip inserting a new VL def.
- SmallVector<LiveInterval *> SplitLIs;
- LIS->splitSeparateComponents(LI, SplitLIs);
-
- // If the AVL was an immediate > 31, then it would have been emitted
- // as an ADDI. However, the ADDI might not have been used in the
- // vsetvli, or a vsetvli might not have been emitted, so it may be
- // dead now.
- for (MachineInstr *DeadMI : DeadMIs) {
- if (!TII->isAddImmediate(*DeadMI, Reg))
- continue;
- LIS->RemoveMachineInstrFromMaps(*DeadMI);
- DeadMI->eraseFromParent();
+
+ if (LIS) {
+ LiveInterval &LI = LIS->getInterval(Reg);
+ SmallVector<MachineInstr *> DeadMIs;
+ LIS->shrinkToUses(&LI, &DeadMIs);
+ // We might have separate components that need split due to
+ // needVSETVLIPHI causing us to skip inserting a new VL def.
+ SmallVector<LiveInterval *> SplitLIs;
+ LIS->splitSeparateComponents(LI, SplitLIs);
+
+ // If the AVL was an immediate > 31, then it would have been emitted
+ // as an ADDI. However, the ADDI might not have been used in the
+ // vsetvli, or a vsetvli might not have been emitted, so it may be
+ // dead now.
+ for (MachineInstr *DeadMI : DeadMIs) {
+ if (!TII->isAddImmediate(*DeadMI, Reg))
+ continue;
+ LIS->RemoveMachineInstrFromMaps(*DeadMI);
+ DeadMI->eraseFromParent();
+ }
}
}
MI.addOperand(MachineOperand::CreateReg(RISCV::VL, /*isDef*/ false,
@@ -1588,9 +1628,9 @@ void RISCVInsertVSETVLI::doPRE(MachineBasicBlock &MBB) {
// Return true if we can mutate PrevMI to match MI without changing any the
// fields which would be observed.
-bool RISCVInsertVSETVLI::canMutatePriorConfig(
- const MachineInstr &PrevMI, const MachineInstr &MI,
- const DemandedFields &Used) const {
+bool RISCVInsertVSETVLI::canMutatePriorConfig(const MachineInstr &PrevMI,
+ const MachineInstr &MI,
+ const DemandedFields &Used) {
// If the VL values aren't equal, return false if either a) the former is
// demanded, or b) we can't rewrite the former to be the later for
// implementation reasons.
@@ -1623,7 +1663,7 @@ bool RISCVInsertVSETVLI::canMutatePriorConfig(
return areCompatibleVTYPEs(PriorVType, VType, Used);
}
-void RISCVInsertVSETVLI::coalesceVSETVLIs(MachineBasicBlock &MBB) const {
+void RISCVInsertVSETVLI::coalesceVSETVLIs(MachineBasicBlock &MBB) {
MachineInstr *NextMI = nullptr;
// We can have arbitrary code in successors, so VL and VTYPE
// must be considered demanded.
@@ -1661,7 +1701,7 @@ void RISCVInsertVSETVLI::coalesceVSETVLIs(MachineBasicBlock &MBB) const {
// The def of DefReg moved to MI, so extend the LiveInterval up to
// it.
- if (DefReg.isVirtual()) {
+ if (LIS && DefReg.isVirtual()) {
LiveInterval &DefLI = LIS->getInterval(DefReg);
SlotIndex MISlot = LIS->getInstructionIndex(MI).getRegSlot();
VNInfo *DefVNI = DefLI.getVNInfoAt(DefLI.beginIndex());
@@ -1688,7 +1728,7 @@ void RISCVInsertVSETVLI::coalesceVSETVLIs(MachineBasicBlock &MBB) const {
if (NextMI->getOperand(1).isReg())
NextMI->getOperand(1).setReg(RISCV::NoRegister);
- if (OldVLReg && OldVLReg.isVirtual()) {
+ if (LIS && OldVLReg && OldVLReg.isVirtual()) {
// NextMI no longer uses OldVLReg so shrink its LiveInterval.
LIS->shrinkToUses(&LIS->getInterval(OldVLReg));
@@ -1712,7 +1752,8 @@ void RISCVInsertVSETVLI::coalesceVSETVLIs(MachineBasicBlock &MBB) const {
NumCoalescedVSETVL += ToDelete.size();
for (auto *MI : ToDelete) {
- LIS->RemoveMachineInstrFromMaps(*MI);
+ if (LIS)
+ LIS->RemoveMachineInstrFromMaps(*MI);
MI->eraseFromParent();
}
}
@@ -1726,13 +1767,16 @@ void RISCVInsertVSETVLI::insertReadVL(MachineBasicBlock &MBB) {
if (!MI.getOperand(1).isDead()) {
auto ReadVLMI = BuildMI(MBB, I, MI.getDebugLoc(),
TII->get(RISCV::PseudoReadVL), VLOutput);
+
// Move the LiveInterval's definition down to PseudoReadVL.
- SlotIndex NewDefSI =
- LIS->InsertMachineInstrInMaps(*ReadVLMI).getRegSlot();
- LiveInterval &DefLI = LIS->getInterval(VLOutput);
- VNInfo *DefVNI = DefLI.getVNInfoAt(DefLI.beginIndex());
- DefLI.removeSegment(DefLI.beginIndex(), NewDefSI);
- DefVNI->def = NewDefSI;
+ if (LIS) {
+ SlotIndex NewDefSI =
+ LIS->InsertMachineInstrInMaps(*ReadVLMI).getRegSlot();
+ LiveInterval &DefLI = LIS->getInterval(VLOutput);
+ VNInfo *DefVNI = DefLI.getVNInfoAt(DefLI.beginIndex());
+ DefLI.removeSegment(DefLI.beginIndex(), NewDefSI);
+ DefVNI->def = NewDefSI;
+ }
}
// We don't use the vl output of the VLEFF/VLSEGFF anymore.
MI.getOperand(1).setReg(RISCV::X0);
@@ -1750,7 +1794,7 @@ bool RISCVInsertVSETVLI::runOnMachineFunction(MachineFunction &MF) {
TII = ST->getInstrInfo();
MRI = &MF.getRegInfo();
- LIS = &getAnalysis<LiveIntervals>();
+ LIS = getAnalysisIfAvailable<LiveIntervals>();
assert(BlockInfo.empty() && "Expect empty block infos");
BlockInfo.resize(MF.getNumBlockIDs());
diff --git a/llvm/test/CodeGen/RISCV/O0-pipeline.ll b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
index ef7a8f2c7bbee..a9d9ab75143ed 100644
--- a/llvm/test/CodeGen/RISCV/O0-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
@@ -46,9 +46,6 @@
; CHECK-NEXT: Eliminate PHI nodes for register allocation
; CHECK-NEXT: Two-Address instruction pass
; CHECK-NEXT: Fast Register Allocator
-; CHECK-NEXT: MachineDominator Tree Construction
-; CHECK-NEXT: Slot index numbering
-; CHECK-NEXT: Live Interval Analysis
; CHECK-NEXT: RISC-V Insert VSETVLI pass
; CHECK-NEXT: Fast Register Allocator
; CHECK-NEXT: Remove Redundant DEBUG_VALUE analysis
diff --git a/llvm/test/CodeGen/RISCV/pr93587.ll b/llvm/test/CodeGen/RISCV/pr93587.ll
new file mode 100644
index 0000000000000..1c2923a2de893
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/pr93587.ll
@@ -0,0 +1,37 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=riscv64 -O0 < %s | FileCheck %s
+
+; Make sure we don't run LiveIntervals at O0, otherwise it will crash when
+; running on this unreachable block.
+
+define i16 @f() {
+; CHECK-LABEL: f:
+; CHECK: # %bb.0: # %BB
+; CHECK-NEXT: addi sp, sp, -16
+; CHECK-NEXT: .cfi_def_cfa_offset 16
+; CHECK-NEXT: j .LBB0_1
+; CHECK-NEXT: .LBB0_1: # %BB1
+; CHECK-NEXT: # =>This Inner Loop Header: Depth=1
+; CHECK-NEXT: li a0, 0
+; CHECK-NEXT: sd a0, 8(sp) # 8-byte Folded Spill
+; CHECK-NEXT: j .LBB0_1
+; CHECK-NEXT: # %bb.2: # %BB1
+; CHECK-NEXT: li a0, 0
+; CHECK-NEXT: bnez a0, .LBB0_1
+; CHECK-NEXT: j .LBB0_3
+; CHECK-NEXT: .LBB0_3: # %BB2
+; CHECK-NEXT: ld a0, 8(sp) # 8-byte Folded Reload
+; CHECK-NEXT: addi sp, sp, 16
+; CHECK-NEXT: ret
+BB:
+ br label %BB1
+
+BB1:
+ %A = or i16 0, 0
+ %B = fcmp true flo...
[truncated]
|
Mentioning this as an alternative fix for the crash in #93587, we could also try running the remove unreachable machine block pass at O0 to keep LiveIntervalsAnalysis happy. |
I have an alternate suggestion here. I'm going to describe this with the representation before 4b4d366 as I think that makes things harder to understand. My alternative basically comes down to "let ValNo be nullptr", and update uses of this value, not the definitions with some synthetic value (this patch). getAVLDefMI is easy at is already allowed to return nullptr. We'd need to update the doc comment, but that's about it. getAVLVNInfo requires a bit of care. Here's a list per callsite:
Each of the above is a lost optimization at O0, but shouldn't influence correctness. Out of the above, the hasSameAVL is probably the biggest lost optimization, but I don't see an easy way to avoid that. You'd also need to reverse the order of checks involving getVNInfoFromReg, but the assert in the else shows that possible without issue. |
Setting ValNo to nullptr is what I had tried initially, but the snag that I ran into was that VSETVLIInfo::== was no longer reflexive and I ran into some infinite cycles, presumably because computeIncomingVLVTYPE kept on adding successors to the worklist due to the terminating condition But I do agree that it would be nicer to avoid the dummy values, I'll give this another try. Maybe we could redefine == to not use hasSameAVL. I think we may be conflating it a bit with the notion of "being compatible" which is what VSETVLIInfo::isCompatible is for. |
As noted in #93796 (comment), a better way to teach RISCVInsertVSETVLI to work without LiveIntervals is to set VNInfo to nullptr and teach the various methods to handle it. We should try that approach first, so we no longer need this pre-commit patch. This reverts commit 4b4d366.
In VSETVLIInfo we define == as structural equality, e.g. unknown == unknown and invalid == invalid. We need this to check if the information at a block's entry or exit has changed. However I think we may have been conflating it with the notion that the state of VL and VTYPE are the same, which isn't the case for two unknowns, and potentially in an upcoming patch where we don't know the value of an AVL register (see llvm#93796) This patch introduces a new method for checking that the VL and VTYPE are the same and switches it over where it would make a difference. This should be NFC for now since the two uses of == we're replacing either didn't compare two unknowns or checked for unknowns. But it's needed if we're to introduce the notion of an AVLReg with a nullptr ValNo, which will need this non-reflexive equality.
Stacked on llvm#94658. We recently moved RISCVInsertVSETVLI from before vector register allocation to after vector register allocation. When doing so, we added an unconditional dependency on LiveIntervals - even at O0 where LiveIntevals hadn't previously run. As reported in llvm#93587, this was apparently not safe to do. This change makes LiveIntervals optional, and adjusts all the update code to only run wen live intervals is present. The only real tricky part of this change is the abstract state tracking in the dataflow. We need to represent a "register w/unknown definition" state - but only when we don't have LiveIntervals. This adjust the abstract state definition so that the AVLIsReg state can represent either a register + valno, or a register + unknown definition. With LiveIntervals, wehave an exact definition for each AVL use. Without LiveIntervals, we treat the definition of a register AVL as being unknown. The key semantic change is that we now have a state in the lattice for which something is known about the AVL value, but for which two identical lattice elements do *not* neccessarily represent the same AVL value at runtime. Previously, the only case which could result in such an unknown AVL was the fully unknown state (where VTYPE is also fully unknown). This requires a small adjustment to hasSameAVL and lattice state equality to draw this important distinction. The net effect of this patch is that we remove the LiveIntervals dependency at O0, and O0 code quality will regress for cases involving register AVL values. This patch is an alternative to llvm#93796 and llvm#94340. It is very directly inspired by review conversation around them, and thus should be considered coauthored by Luke.
Superseded by #94686 |
In VSETVLIInfo we define == as structural equality, e.g. unknown == unknown and invalid == invalid. We need this to check if the information at a block's entry or exit has changed. However I think we may have been conflating it with the notion that the state of VL and VTYPE are the same, which isn't the case for two unknowns, and potentially in an upcoming patch where we don't know the value of an AVL register (see llvm#93796) This patch introduces a new method for checking that the VL and VTYPE are the same and switches it over where it would make a difference. This should be NFC for now since the two uses of == we're replacing either didn't compare two unknowns or checked for unknowns. But it's needed if we're to introduce the notion of an AVLReg with a nullptr ValNo, which will need this non-reflexive equality.
…FC (#94340) In VSETVLIInfo we define == as lattice equality, e.g. unknown == unknown and invalid == invalid. We need this to check if the information at a block's entry or exit has changed. However I think we may have been conflating it with the notion that the state of VL and VTYPE are the same, which isn't the case for two unknowns, and potentially in an upcoming patch where we don't know the value of an AVL register (see #93796) This patch switches over the use in emitVSETVLIs to use isCompatible with all fields demanded, since we need VL and VTYPE to be known equal at runtime rather than just the VSETVLIInfos to be the same. This should be NFC for now we shouldn't reach here with an unknown state. But it's needed if we're to introduce the notion of an AVLReg with a nullptr ValNo, which will need this non-reflexive equality. Specifically, hasSameAVL should return false for this case, but == should return true.
We recently moved RISCVInsertVSETVLI from before vector register allocation to after vector register allocation. When doing so, we added an unconditional dependency on LiveIntervals - even at O0 where LiveIntevals hadn't previously run. As reported in llvm#93587, this was apparently not safe to do. This change makes LiveIntervals optional, and adjusts all the update code to only run wen live intervals is present. The only real tricky part of this change is the abstract state tracking in the dataflow. We need to represent a "register w/unknown definition" state - but only when we don't have LiveIntervals. This adjust the abstract state definition so that the AVLIsReg state can represent either a register + valno, or a register + unknown definition. With LiveIntervals, wehave an exact definition for each AVL use. Without LiveIntervals, we treat the definition of a register AVL as being unknown. The key semantic change is that we now have a state in the lattice for which something is known about the AVL value, but for which two identical lattice elements do *not* neccessarily represent the same AVL value at runtime. Previously, the only case which could result in such an unknown AVL was the fully unknown state (where VTYPE is also fully unknown). This requires a small adjustment to hasSameAVL and lattice state equality to draw this important distinction. The net effect of this patch is that we remove the LiveIntervals dependency at O0, and O0 code quality will regress for cases involving register AVL values. In practice, this means we pessimize code written with intrinsics at O0. This patch is an alternative to llvm#93796 and llvm#94340. It is very directly inspired by review conversation around them, and thus should be considered coauthored by Luke.
Stacked on #94658. We recently moved RISCVInsertVSETVLI from before vector register allocation to after vector register allocation. When doing so, we added an unconditional dependency on LiveIntervals - even at O0 where LiveIntevals hadn't previously run. As reported in #93587, this was apparently not safe to do. This change makes LiveIntervals optional, and adjusts all the update code to only run wen live intervals is present. The only real tricky part of this change is the abstract state tracking in the dataflow. We need to represent a "register w/unknown definition" state - but only when we don't have LiveIntervals. This adjust the abstract state definition so that the AVLIsReg state can represent either a register + valno, or a register + unknown definition. With LiveIntervals, we have an exact definition for each AVL use. Without LiveIntervals, we treat the definition of a register AVL as being unknown. The key semantic change is that we now have a state in the lattice for which something is known about the AVL value, but for which two identical lattice elements do *not* necessarily represent the same AVL value at runtime. Previously, the only case which could result in such an unknown AVL was the fully unknown state (where VTYPE is also fully unknown). This requires a small adjustment to hasSameAVL and lattice state equality to draw this important distinction. The net effect of this patch is that we remove the LiveIntervals dependency at O0, and O0 code quality will regress for cases involving register AVL values. This patch is an alternative to #93796 and #94340. It is very directly inspired by review conversation around them, and thus should be considered coauthored by Luke.
(Reapplying with corrected commit message) We recently moved RISCVInsertVSETVLI from before vector register allocation to after vector register allocation. When doing so, we added an unconditional dependency on LiveIntervals - even at O0 where LiveIntevals hadn't previously run. As reported in #93587, this was apparently not safe to do. This change makes LiveIntervals optional, and adjusts all the update code to only run wen live intervals is present. The only real tricky part of this change is the abstract state tracking in the dataflow. We need to represent a "register w/unknown definition" state - but only when we don't have LiveIntervals. This adjust the abstract state definition so that the AVLIsReg state can represent either a register + valno, or a register + unknown definition. With LiveIntervals, we have an exact definition for each AVL use. Without LiveIntervals, we treat the definition of a register AVL as being unknown. The key semantic change is that we now have a state in the lattice for which something is known about the AVL value, but for which two identical lattice elements do *not* neccessarily represent the same AVL value at runtime. Previously, the only case which could result in such an unknown AVL was the fully unknown state (where VTYPE is also fully unknown). This requires a small adjustment to hasSameAVL and lattice state equality to draw this important distinction. The net effect of this patch is that we remove the LiveIntervals dependency at O0, and O0 code quality will regress for cases involving register AVL values. In practice, this means we pessimize code written with intrinsics at O0. This patch is an alternative to #93796 and #94340. It is very directly inspired by review conversation around them, and thus should be considered coauthored by Luke.
At O0 LiveIntervals isn't usually computed, but after moving RISCVInsertVSETVLI to after phi elimination we've added it as a dependency.
This removes the dependency on LiveIntervals so O0 builds can stay fast, and also avoids a crash caused by LiveIntervals being run at O0 when I'm not sure if ti was designed to be.
The only necessary parts of RISCVInsertVSETVLI that actually use the LiveIntervals analysis are computeInfoForInstr and getInfoForVSETVLI, which lookup val nos for each register AVL use.
At O0 we can emulate this by conservatively returning fake but unique val nos, which will give us suboptimal but otherwise correct codegen.
We need to make sure that we return the same val no given the same MachineInstr, otherwise computeInfoForInstr/getInforForVSETVLI won't be stable across MachineInstrs and the dataflow analysis will fail to converge.
Fixes #93587