Skip to content

[RISCV] Teach RISCVInsertVSETVLI to work without LiveIntervals #93796

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

lukel97
Copy link
Contributor

@lukel97 lukel97 commented May 30, 2024

At O0 LiveIntervals isn't usually computed, but after moving RISCVInsertVSETVLI to after phi elimination we've added it as a dependency.

This removes the dependency on LiveIntervals so O0 builds can stay fast, and also avoids a crash caused by LiveIntervals being run at O0 when I'm not sure if ti was designed to be.

The only necessary parts of RISCVInsertVSETVLI that actually use the LiveIntervals analysis are computeInfoForInstr and getInfoForVSETVLI, which lookup val nos for each register AVL use.

At O0 we can emulate this by conservatively returning fake but unique val nos, which will give us suboptimal but otherwise correct codegen.

We need to make sure that we return the same val no given the same MachineInstr, otherwise computeInfoForInstr/getInforForVSETVLI won't be stable across MachineInstrs and the dataflow analysis will fail to converge.

Fixes #93587

At O0 LiveIntervals isn't usually computed, but after moving RISCVInsertVSETVLI to after phi elimination we've added it as a dependency.

This removes the dependency on LiveIntervals so O0 builds can stay fast, and also avoids a crash caused by LiveIntervals being run at O0 when I'm not sure if ti was designed to be.

The only parts of RISCVInsertVSETVLI that actually use the LiveIntervals analysis are computeInfoForInstr and getInfoForVSETVLI, which lookup val nos for each register AVL use.

At O0 we can emulate this by conservatively returning fake but unique val nos, which will give us suboptimal but otherwise correct codegen.

We need to make sure that we return the same val no given the same MachineInstr, otherwise computeInfoForInstr/getInforForVSETVLI won't be stable across MachineInstrs and the dataflow analysis will fail to converge.

Fixes llvm#93587
@llvmbot
Copy link
Member

llvmbot commented May 30, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

At O0 LiveIntervals isn't usually computed, but after moving RISCVInsertVSETVLI to after phi elimination we've added it as a dependency.

This removes the dependency on LiveIntervals so O0 builds can stay fast, and also avoids a crash caused by LiveIntervals being run at O0 when I'm not sure if ti was designed to be.

The only parts of RISCVInsertVSETVLI that actually use the LiveIntervals analysis are computeInfoForInstr and getInfoForVSETVLI, which lookup val nos for each register AVL use.

At O0 we can emulate this by conservatively returning fake but unique val nos, which will give us suboptimal but otherwise correct codegen.

We need to make sure that we return the same val no given the same MachineInstr, otherwise computeInfoForInstr/getInforForVSETVLI won't be stable across MachineInstrs and the dataflow analysis will fail to converge.

Fixes #93587


Patch is 20.66 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/93796.diff

4 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp (+115-71)
  • (modified) llvm/test/CodeGen/RISCV/O0-pipeline.ll (-3)
  • (added) llvm/test/CodeGen/RISCV/pr93587.ll (+37)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.mir (+1-1)
diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index 4c57eecd8465d..081ac621c17d7 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -580,6 +580,8 @@ class VSETVLIInfo {
   // boundary slot.
   const MachineInstr *getAVLDefMI(const LiveIntervals *LIS) const {
     assert(hasAVLReg());
+    if (!LIS)
+      return nullptr;
     const VNInfo *VNI =
         LIS->getInterval(getAVLReg()).getValNumInfo(getAVLValNo());
     auto *MI = LIS->getInstructionFromIndex(VNI->def);
@@ -871,9 +873,9 @@ class RISCVInsertVSETVLI : public MachineFunctionPass {
   void getAnalysisUsage(AnalysisUsage &AU) const override {
     AU.setPreservesCFG();
 
-    AU.addRequired<LiveIntervals>();
+    AU.addUsedIfAvailable<LiveIntervals>();
     AU.addPreserved<LiveIntervals>();
-    AU.addRequired<SlotIndexes>();
+    AU.addUsedIfAvailable<SlotIndexes>();
     AU.addPreserved<SlotIndexes>();
     AU.addPreserved<LiveDebugVariables>();
     AU.addPreserved<LiveStacks>();
@@ -885,30 +887,47 @@ class RISCVInsertVSETVLI : public MachineFunctionPass {
 
 private:
   bool needVSETVLI(const DemandedFields &Used, const VSETVLIInfo &Require,
-                   const VSETVLIInfo &CurInfo) const;
-  bool needVSETVLIPHI(const VSETVLIInfo &Require,
-                      const MachineBasicBlock &MBB) const;
+                   const VSETVLIInfo &CurInfo);
+  bool needVSETVLIPHI(const VSETVLIInfo &Require, const MachineBasicBlock &MBB);
   void insertVSETVLI(MachineBasicBlock &MBB, MachineInstr &MI,
                      const VSETVLIInfo &Info, const VSETVLIInfo &PrevInfo);
   void insertVSETVLI(MachineBasicBlock &MBB,
                      MachineBasicBlock::iterator InsertPt, DebugLoc DL,
                      const VSETVLIInfo &Info, const VSETVLIInfo &PrevInfo);
 
-  void transferBefore(VSETVLIInfo &Info, const MachineInstr &MI) const;
-  void transferAfter(VSETVLIInfo &Info, const MachineInstr &MI) const;
-  bool computeVLVTYPEChanges(const MachineBasicBlock &MBB,
-                             VSETVLIInfo &Info) const;
+  void transferBefore(VSETVLIInfo &Info, const MachineInstr &MI);
+  void transferAfter(VSETVLIInfo &Info, const MachineInstr &MI);
+  bool computeVLVTYPEChanges(const MachineBasicBlock &MBB, VSETVLIInfo &Info);
   void computeIncomingVLVTYPE(const MachineBasicBlock &MBB);
   void emitVSETVLIs(MachineBasicBlock &MBB);
   void doPRE(MachineBasicBlock &MBB);
   void insertReadVL(MachineBasicBlock &MBB);
 
   bool canMutatePriorConfig(const MachineInstr &PrevMI, const MachineInstr &MI,
-                            const DemandedFields &Used) const;
-  void coalesceVSETVLIs(MachineBasicBlock &MBB) const;
-
-  VSETVLIInfo getInfoForVSETVLI(const MachineInstr &MI) const;
-  VSETVLIInfo computeInfoForInstr(const MachineInstr &MI) const;
+                            const DemandedFields &Used);
+  void coalesceVSETVLIs(MachineBasicBlock &MBB);
+
+  VSETVLIInfo getInfoForVSETVLI(const MachineInstr &MI);
+  VSETVLIInfo computeInfoForInstr(const MachineInstr &MI);
+
+  // At O0 LiveIntervals isn't available, but we still need to lookup a val no
+  // for each register AVL use. So generate these conservative dummy val nos
+  // that ensure each register use has a unique val no + register combination.
+  //
+  // They're tied to a MachineInstr since we need computeInfoForInstr and
+  // getInfoForVSETVLI to be stable across MachineInstrs.
+  DenseMap<const MachineInstr *, unsigned> DummyValNos;
+  unsigned NextDummyValNo = 0;
+  unsigned getDummyValNo(const MachineInstr *MI) {
+    unsigned ValNo;
+    if (DummyValNos.contains(MI))
+      ValNo = DummyValNos[MI];
+    else {
+      ValNo = NextDummyValNo++;
+      DummyValNos[MI] = ValNo;
+    }
+    return ValNo;
+  }
 };
 
 } // end anonymous namespace
@@ -921,8 +940,7 @@ INITIALIZE_PASS(RISCVInsertVSETVLI, DEBUG_TYPE, RISCV_INSERT_VSETVLI_NAME,
 
 // Return a VSETVLIInfo representing the changes made by this VSETVLI or
 // VSETIVLI instruction.
-VSETVLIInfo
-RISCVInsertVSETVLI::getInfoForVSETVLI(const MachineInstr &MI) const {
+VSETVLIInfo RISCVInsertVSETVLI::getInfoForVSETVLI(const MachineInstr &MI) {
   VSETVLIInfo NewInfo;
   if (MI.getOpcode() == RISCV::PseudoVSETIVLI) {
     NewInfo.setAVLImm(MI.getOperand(1).getImm());
@@ -934,6 +952,8 @@ RISCVInsertVSETVLI::getInfoForVSETVLI(const MachineInstr &MI) const {
            "Can't handle X0, X0 vsetvli yet");
     if (AVLReg == RISCV::X0)
       NewInfo.setAVLVLMAX();
+    else if (!LIS)
+      NewInfo.setAVLRegDef(getDummyValNo(&MI), AVLReg);
     else if (VNInfo *VNI = getVNInfoFromReg(AVLReg, MI, LIS))
       NewInfo.setAVLRegDef(VNI->id, AVLReg);
     else {
@@ -956,8 +976,7 @@ static unsigned computeVLMAX(unsigned VLEN, unsigned SEW,
   return VLEN/SEW;
 }
 
-VSETVLIInfo
-RISCVInsertVSETVLI::computeInfoForInstr(const MachineInstr &MI) const {
+VSETVLIInfo RISCVInsertVSETVLI::computeInfoForInstr(const MachineInstr &MI) {
   VSETVLIInfo InstrInfo;
   const uint64_t TSFlags = MI.getDesc().TSFlags;
 
@@ -1010,6 +1029,8 @@ RISCVInsertVSETVLI::computeInfoForInstr(const MachineInstr &MI) const {
       }
       else
         InstrInfo.setAVLImm(Imm);
+    } else if (!LIS) {
+      InstrInfo.setAVLRegDef(getDummyValNo(&MI), VLOp.getReg());
     } else if (VNInfo *VNI = getVNInfoFromReg(VLOp.getReg(), MI, LIS)) {
       InstrInfo.setAVLRegDef(VNI->id, VLOp.getReg());
     } else {
@@ -1068,7 +1089,8 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
                     .addReg(RISCV::X0, RegState::Kill)
                     .addImm(Info.encodeVTYPE())
                     .addReg(RISCV::VL, RegState::Implicit);
-      LIS->InsertMachineInstrInMaps(*MI);
+      if (LIS)
+        LIS->InsertMachineInstrInMaps(*MI);
       return;
     }
 
@@ -1085,7 +1107,8 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
                         .addReg(RISCV::X0, RegState::Kill)
                         .addImm(Info.encodeVTYPE())
                         .addReg(RISCV::VL, RegState::Implicit);
-          LIS->InsertMachineInstrInMaps(*MI);
+          if (LIS)
+            LIS->InsertMachineInstrInMaps(*MI);
           return;
         }
       }
@@ -1097,7 +1120,8 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
                   .addReg(RISCV::X0, RegState::Define | RegState::Dead)
                   .addImm(Info.getAVLImm())
                   .addImm(Info.encodeVTYPE());
-    LIS->InsertMachineInstrInMaps(*MI);
+    if (LIS)
+      LIS->InsertMachineInstrInMaps(*MI);
     return;
   }
 
@@ -1111,7 +1135,8 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
                     .addReg(RISCV::X0, RegState::Kill)
                     .addImm(Info.encodeVTYPE())
                     .addReg(RISCV::VL, RegState::Implicit);
-      LIS->InsertMachineInstrInMaps(*MI);
+      if (LIS)
+        LIS->InsertMachineInstrInMaps(*MI);
       return;
     }
     // Otherwise use an AVL of 1 to avoid depending on previous vl.
@@ -1119,7 +1144,8 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
                   .addReg(RISCV::X0, RegState::Define | RegState::Dead)
                   .addImm(1)
                   .addImm(Info.encodeVTYPE());
-    LIS->InsertMachineInstrInMaps(*MI);
+    if (LIS)
+      LIS->InsertMachineInstrInMaps(*MI);
     return;
   }
 
@@ -1129,8 +1155,10 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
                   .addReg(DestReg, RegState::Define | RegState::Dead)
                   .addReg(RISCV::X0, RegState::Kill)
                   .addImm(Info.encodeVTYPE());
-    LIS->InsertMachineInstrInMaps(*MI);
-    LIS->createAndComputeVirtRegInterval(DestReg);
+    if (LIS) {
+      LIS->InsertMachineInstrInMaps(*MI);
+      LIS->createAndComputeVirtRegInterval(DestReg);
+    }
     return;
   }
 
@@ -1140,19 +1168,21 @@ void RISCVInsertVSETVLI::insertVSETVLI(MachineBasicBlock &MBB,
                 .addReg(RISCV::X0, RegState::Define | RegState::Dead)
                 .addReg(AVLReg)
                 .addImm(Info.encodeVTYPE());
-  LIS->InsertMachineInstrInMaps(*MI);
-  // Normally the AVL's live range will already extend past the inserted vsetvli
-  // because the pseudos below will already use the AVL. But this isn't always
-  // the case, e.g. PseudoVMV_X_S doesn't have an AVL operand.
-  LIS->getInterval(AVLReg).extendInBlock(
-      LIS->getMBBStartIdx(&MBB), LIS->getInstructionIndex(*MI).getRegSlot());
+  if (LIS) {
+    LIS->InsertMachineInstrInMaps(*MI);
+    // Normally the AVL's live range will already extend past the inserted
+    // vsetvli because the pseudos below will already use the AVL. But this
+    // isn't always the case, e.g. PseudoVMV_X_S doesn't have an AVL operand.
+    LIS->getInterval(AVLReg).extendInBlock(
+        LIS->getMBBStartIdx(&MBB), LIS->getInstructionIndex(*MI).getRegSlot());
+  }
 }
 
 /// Return true if a VSETVLI is required to transition from CurInfo to Require
 /// given a set of DemandedFields \p Used.
 bool RISCVInsertVSETVLI::needVSETVLI(const DemandedFields &Used,
                                      const VSETVLIInfo &Require,
-                                     const VSETVLIInfo &CurInfo) const {
+                                     const VSETVLIInfo &CurInfo) {
   if (!CurInfo.isValid() || CurInfo.isUnknown() || CurInfo.hasSEWLMULRatioOnly())
     return true;
 
@@ -1197,7 +1227,7 @@ static VSETVLIInfo adjustIncoming(VSETVLIInfo PrevInfo, VSETVLIInfo NewInfo,
 // is compatible with MI. The resulting state is guaranteed to be semantically
 // legal for MI, but may not be the state requested by MI.
 void RISCVInsertVSETVLI::transferBefore(VSETVLIInfo &Info,
-                                        const MachineInstr &MI) const {
+                                        const MachineInstr &MI) {
   if (!RISCVII::hasSEWOp(MI.getDesc().TSFlags))
     return;
 
@@ -1250,7 +1280,7 @@ void RISCVInsertVSETVLI::transferBefore(VSETVLIInfo &Info,
 // this might be different that the state MI requested), modify the state to
 // reflect the changes MI might make.
 void RISCVInsertVSETVLI::transferAfter(VSETVLIInfo &Info,
-                                       const MachineInstr &MI) const {
+                                       const MachineInstr &MI) {
   if (isVectorConfigInstr(MI)) {
     Info = getInfoForVSETVLI(MI);
     return;
@@ -1259,10 +1289,14 @@ void RISCVInsertVSETVLI::transferAfter(VSETVLIInfo &Info,
   if (RISCV::isFaultFirstLoad(MI)) {
     // Update AVL to vl-output of the fault first load.
     assert(MI.getOperand(1).getReg().isVirtual());
-    auto &LI = LIS->getInterval(MI.getOperand(1).getReg());
-    SlotIndex SI = LIS->getSlotIndexes()->getInstructionIndex(MI).getRegSlot();
-    VNInfo *VNI = LI.getVNInfoAt(SI);
-    Info.setAVLRegDef(VNI->id, MI.getOperand(1).getReg());
+    if (LIS) {
+      auto &LI = LIS->getInterval(MI.getOperand(1).getReg());
+      SlotIndex SI =
+          LIS->getSlotIndexes()->getInstructionIndex(MI).getRegSlot();
+      VNInfo *VNI = LI.getVNInfoAt(SI);
+      Info.setAVLRegDef(VNI->id, MI.getOperand(1).getReg());
+    } else
+      Info.setUnknown();
     return;
   }
 
@@ -1275,7 +1309,7 @@ void RISCVInsertVSETVLI::transferAfter(VSETVLIInfo &Info,
 }
 
 bool RISCVInsertVSETVLI::computeVLVTYPEChanges(const MachineBasicBlock &MBB,
-                                               VSETVLIInfo &Info) const {
+                                               VSETVLIInfo &Info) {
   bool HadVectorOp = false;
 
   Info = BlockInfo[MBB.getNumber()].Pred;
@@ -1349,13 +1383,16 @@ void RISCVInsertVSETVLI::computeIncomingVLVTYPE(const MachineBasicBlock &MBB) {
 // be unneeded if the AVL was a phi node where all incoming values are VL
 // outputs from the last VSETVLI in their respective basic blocks.
 bool RISCVInsertVSETVLI::needVSETVLIPHI(const VSETVLIInfo &Require,
-                                        const MachineBasicBlock &MBB) const {
+                                        const MachineBasicBlock &MBB) {
   if (DisableInsertVSETVLPHIOpt)
     return true;
 
   if (!Require.hasAVLReg())
     return true;
 
+  if (!LIS)
+    return true;
+
   // We need the AVL to have been produced by a PHI node in this basic block.
   const VNInfo *Valno = LIS->getInterval(Require.getAVLReg())
                             .getValNumInfo(Require.getAVLValNo());
@@ -1432,27 +1469,30 @@ void RISCVInsertVSETVLI::emitVSETVLIs(MachineBasicBlock &MBB) {
         MachineOperand &VLOp = MI.getOperand(getVLOpNum(MI));
         if (VLOp.isReg()) {
           Register Reg = VLOp.getReg();
-          LiveInterval &LI = LIS->getInterval(Reg);
 
           // Erase the AVL operand from the instruction.
           VLOp.setReg(RISCV::NoRegister);
           VLOp.setIsKill(false);
-          SmallVector<MachineInstr *> DeadMIs;
-          LIS->shrinkToUses(&LI, &DeadMIs);
-          // We might have separate components that need split due to
-          // needVSETVLIPHI causing us to skip inserting a new VL def.
-          SmallVector<LiveInterval *> SplitLIs;
-          LIS->splitSeparateComponents(LI, SplitLIs);
-
-          // If the AVL was an immediate > 31, then it would have been emitted
-          // as an ADDI. However, the ADDI might not have been used in the
-          // vsetvli, or a vsetvli might not have been emitted, so it may be
-          // dead now.
-          for (MachineInstr *DeadMI : DeadMIs) {
-            if (!TII->isAddImmediate(*DeadMI, Reg))
-              continue;
-            LIS->RemoveMachineInstrFromMaps(*DeadMI);
-            DeadMI->eraseFromParent();
+
+          if (LIS) {
+            LiveInterval &LI = LIS->getInterval(Reg);
+            SmallVector<MachineInstr *> DeadMIs;
+            LIS->shrinkToUses(&LI, &DeadMIs);
+            // We might have separate components that need split due to
+            // needVSETVLIPHI causing us to skip inserting a new VL def.
+            SmallVector<LiveInterval *> SplitLIs;
+            LIS->splitSeparateComponents(LI, SplitLIs);
+
+            // If the AVL was an immediate > 31, then it would have been emitted
+            // as an ADDI. However, the ADDI might not have been used in the
+            // vsetvli, or a vsetvli might not have been emitted, so it may be
+            // dead now.
+            for (MachineInstr *DeadMI : DeadMIs) {
+              if (!TII->isAddImmediate(*DeadMI, Reg))
+                continue;
+              LIS->RemoveMachineInstrFromMaps(*DeadMI);
+              DeadMI->eraseFromParent();
+            }
           }
         }
         MI.addOperand(MachineOperand::CreateReg(RISCV::VL, /*isDef*/ false,
@@ -1588,9 +1628,9 @@ void RISCVInsertVSETVLI::doPRE(MachineBasicBlock &MBB) {
 
 // Return true if we can mutate PrevMI to match MI without changing any the
 // fields which would be observed.
-bool RISCVInsertVSETVLI::canMutatePriorConfig(
-    const MachineInstr &PrevMI, const MachineInstr &MI,
-    const DemandedFields &Used) const {
+bool RISCVInsertVSETVLI::canMutatePriorConfig(const MachineInstr &PrevMI,
+                                              const MachineInstr &MI,
+                                              const DemandedFields &Used) {
   // If the VL values aren't equal, return false if either a) the former is
   // demanded, or b) we can't rewrite the former to be the later for
   // implementation reasons.
@@ -1623,7 +1663,7 @@ bool RISCVInsertVSETVLI::canMutatePriorConfig(
   return areCompatibleVTYPEs(PriorVType, VType, Used);
 }
 
-void RISCVInsertVSETVLI::coalesceVSETVLIs(MachineBasicBlock &MBB) const {
+void RISCVInsertVSETVLI::coalesceVSETVLIs(MachineBasicBlock &MBB) {
   MachineInstr *NextMI = nullptr;
   // We can have arbitrary code in successors, so VL and VTYPE
   // must be considered demanded.
@@ -1661,7 +1701,7 @@ void RISCVInsertVSETVLI::coalesceVSETVLIs(MachineBasicBlock &MBB) const {
 
           // The def of DefReg moved to MI, so extend the LiveInterval up to
           // it.
-          if (DefReg.isVirtual()) {
+          if (LIS && DefReg.isVirtual()) {
             LiveInterval &DefLI = LIS->getInterval(DefReg);
             SlotIndex MISlot = LIS->getInstructionIndex(MI).getRegSlot();
             VNInfo *DefVNI = DefLI.getVNInfoAt(DefLI.beginIndex());
@@ -1688,7 +1728,7 @@ void RISCVInsertVSETVLI::coalesceVSETVLIs(MachineBasicBlock &MBB) const {
           if (NextMI->getOperand(1).isReg())
             NextMI->getOperand(1).setReg(RISCV::NoRegister);
 
-          if (OldVLReg && OldVLReg.isVirtual()) {
+          if (LIS && OldVLReg && OldVLReg.isVirtual()) {
             // NextMI no longer uses OldVLReg so shrink its LiveInterval.
             LIS->shrinkToUses(&LIS->getInterval(OldVLReg));
 
@@ -1712,7 +1752,8 @@ void RISCVInsertVSETVLI::coalesceVSETVLIs(MachineBasicBlock &MBB) const {
 
   NumCoalescedVSETVL += ToDelete.size();
   for (auto *MI : ToDelete) {
-    LIS->RemoveMachineInstrFromMaps(*MI);
+    if (LIS)
+      LIS->RemoveMachineInstrFromMaps(*MI);
     MI->eraseFromParent();
   }
 }
@@ -1726,13 +1767,16 @@ void RISCVInsertVSETVLI::insertReadVL(MachineBasicBlock &MBB) {
       if (!MI.getOperand(1).isDead()) {
         auto ReadVLMI = BuildMI(MBB, I, MI.getDebugLoc(),
                                 TII->get(RISCV::PseudoReadVL), VLOutput);
+
         // Move the LiveInterval's definition down to PseudoReadVL.
-        SlotIndex NewDefSI =
-            LIS->InsertMachineInstrInMaps(*ReadVLMI).getRegSlot();
-        LiveInterval &DefLI = LIS->getInterval(VLOutput);
-        VNInfo *DefVNI = DefLI.getVNInfoAt(DefLI.beginIndex());
-        DefLI.removeSegment(DefLI.beginIndex(), NewDefSI);
-        DefVNI->def = NewDefSI;
+        if (LIS) {
+          SlotIndex NewDefSI =
+              LIS->InsertMachineInstrInMaps(*ReadVLMI).getRegSlot();
+          LiveInterval &DefLI = LIS->getInterval(VLOutput);
+          VNInfo *DefVNI = DefLI.getVNInfoAt(DefLI.beginIndex());
+          DefLI.removeSegment(DefLI.beginIndex(), NewDefSI);
+          DefVNI->def = NewDefSI;
+        }
       }
       // We don't use the vl output of the VLEFF/VLSEGFF anymore.
       MI.getOperand(1).setReg(RISCV::X0);
@@ -1750,7 +1794,7 @@ bool RISCVInsertVSETVLI::runOnMachineFunction(MachineFunction &MF) {
 
   TII = ST->getInstrInfo();
   MRI = &MF.getRegInfo();
-  LIS = &getAnalysis<LiveIntervals>();
+  LIS = getAnalysisIfAvailable<LiveIntervals>();
 
   assert(BlockInfo.empty() && "Expect empty block infos");
   BlockInfo.resize(MF.getNumBlockIDs());
diff --git a/llvm/test/CodeGen/RISCV/O0-pipeline.ll b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
index ef7a8f2c7bbee..a9d9ab75143ed 100644
--- a/llvm/test/CodeGen/RISCV/O0-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
@@ -46,9 +46,6 @@
 ; CHECK-NEXT:       Eliminate PHI nodes for register allocation
 ; CHECK-NEXT:       Two-Address instruction pass
 ; CHECK-NEXT:       Fast Register Allocator
-; CHECK-NEXT:       MachineDominator Tree Construction
-; CHECK-NEXT:       Slot index numbering
-; CHECK-NEXT:       Live Interval Analysis
 ; CHECK-NEXT:       RISC-V Insert VSETVLI pass
 ; CHECK-NEXT:       Fast Register Allocator
 ; CHECK-NEXT:       Remove Redundant DEBUG_VALUE analysis
diff --git a/llvm/test/CodeGen/RISCV/pr93587.ll b/llvm/test/CodeGen/RISCV/pr93587.ll
new file mode 100644
index 0000000000000..1c2923a2de893
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/pr93587.ll
@@ -0,0 +1,37 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=riscv64 -O0 < %s | FileCheck %s
+
+; Make sure we don't run LiveIntervals at O0, otherwise it will crash when
+; running on this unreachable block.
+
+define i16 @f() {
+; CHECK-LABEL: f:
+; CHECK:       # %bb.0: # %BB
+; CHECK-NEXT:    addi sp, sp, -16
+; CHECK-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-NEXT:    j .LBB0_1
+; CHECK-NEXT:  .LBB0_1: # %BB1
+; CHECK-NEXT:    # =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:    li a0, 0
+; CHECK-NEXT:    sd a0, 8(sp) # 8-byte Folded Spill
+; CHECK-NEXT:    j .LBB0_1
+; CHECK-NEXT:  # %bb.2: # %BB1
+; CHECK-NEXT:    li a0, 0
+; CHECK-NEXT:    bnez a0, .LBB0_1
+; CHECK-NEXT:    j .LBB0_3
+; CHECK-NEXT:  .LBB0_3: # %BB2
+; CHECK-NEXT:    ld a0, 8(sp) # 8-byte Folded Reload
+; CHECK-NEXT:    addi sp, sp, 16
+; CHECK-NEXT:    ret
+BB:
+  br label %BB1
+
+BB1:
+  %A = or i16 0, 0
+  %B = fcmp true flo...
[truncated]

@lukel97
Copy link
Contributor Author

lukel97 commented May 30, 2024

Mentioning this as an alternative fix for the crash in #93587, we could also try running the remove unreachable machine block pass at O0 to keep LiveIntervalsAnalysis happy.

@preames
Copy link
Collaborator

preames commented May 31, 2024

I have an alternate suggestion here. I'm going to describe this with the representation before 4b4d366 as I think that makes things harder to understand.

My alternative basically comes down to "let ValNo be nullptr", and update uses of this value, not the definitions with some synthetic value (this patch).

getAVLDefMI is easy at is already allowed to return nullptr. We'd need to update the doc comment, but that's about it.

getAVLVNInfo requires a bit of care. Here's a list per callsite:

  • setAVL just propagates the nullptr
  • hasSameAVL can simply return false when ValNo is nullptr on either side.
  • needVSETVLIPHI can simply bail out if ValNo is nullptr.
  • doPRE can bailout

Each of the above is a lost optimization at O0, but shouldn't influence correctness. Out of the above, the hasSameAVL is probably the biggest lost optimization, but I don't see an easy way to avoid that.

You'd also need to reverse the order of checks involving getVNInfoFromReg, but the assert in the else shows that possible without issue.

@lukel97
Copy link
Contributor Author

lukel97 commented Jun 3, 2024

hasSameAVL can simply return false when ValNo is nullptr on either side.

Setting ValNo to nullptr is what I had tried initially, but the snag that I ran into was that VSETVLIInfo::== was no longer reflexive and I ran into some infinite cycles, presumably because computeIncomingVLVTYPE kept on adding successors to the worklist due to the terminating condition if (BBInfo.Exit == TmpStatus) never holding. The assertion at the end of RISCVInsertVSETVLI::emitVSETVLIs` also failed.

But I do agree that it would be nicer to avoid the dummy values, I'll give this another try. Maybe we could redefine == to not use hasSameAVL. I think we may be conflating it a bit with the notion of "being compatible" which is what VSETVLIInfo::isCompatible is for.

lukel97 added a commit that referenced this pull request Jun 3, 2024
As noted in
#93796 (comment),
a better way to teach RISCVInsertVSETVLI to work without LiveIntervals
is to set VNInfo to nullptr and teach the various methods to handle it.
We should try that approach first, so we no longer need this pre-commit
patch.

This reverts commit 4b4d366.
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Jun 4, 2024
In VSETVLIInfo we define == as structural equality, e.g. unknown == unknown and invalid == invalid. We need this to check if the information at a block's entry or exit has changed.

However I think we may have been conflating it with the notion that the state of VL and VTYPE are the same, which isn't the case for two unknowns, and potentially in an upcoming patch where we don't know the value of an AVL register (see llvm#93796)

This patch introduces a new method for checking that the VL and VTYPE are the same and switches it over where it would make a difference.

This should be NFC for now since the two uses of == we're replacing either didn't compare two unknowns or checked for unknowns. But it's needed if we're to introduce the notion of an AVLReg with a nullptr ValNo, which will need this non-reflexive equality.
preames added a commit to preames/llvm-project that referenced this pull request Jun 6, 2024
Stacked on llvm#94658.

We recently moved RISCVInsertVSETVLI from before vector register allocation to after vector register allocation.  When doing so, we added an unconditional dependency on LiveIntervals - even at O0 where LiveIntevals hadn't previously run.  As reported in llvm#93587, this was apparently not safe to do.

This change makes LiveIntervals optional, and adjusts all the update code to only run wen live intervals is present.  The only real tricky part of this change is the abstract state tracking in the dataflow.  We need to represent a "register w/unknown definition" state - but only when we don't have LiveIntervals.

This adjust the abstract state definition so that the AVLIsReg state can represent either a register + valno, or a register + unknown definition.  With LiveIntervals, wehave an exact definition for each AVL use.  Without LiveIntervals, we  treat the definition of a register AVL as being unknown.

The key semantic change is that we now have a state in the lattice for which something is known about the AVL value, but for which two identical lattice elements do *not* neccessarily represent the same AVL value at runtime.  Previously, the only case which could result in such an unknown AVL was the fully unknown state (where VTYPE is also fully unknown).  This requires a small adjustment to hasSameAVL and lattice state equality to draw this important distinction.

The net effect of this patch is that we remove the LiveIntervals dependency at O0, and O0 code quality will regress for cases involving register AVL values.

This patch is an alternative to llvm#93796 and llvm#94340.  It is very directly inspired by review conversation around them, and thus should be considered coauthored by Luke.
@lukel97 lukel97 closed this Jun 14, 2024
@lukel97
Copy link
Contributor Author

lukel97 commented Jun 14, 2024

Superseded by #94686

lukel97 added a commit to lukel97/llvm-project that referenced this pull request Jun 14, 2024
In VSETVLIInfo we define == as structural equality, e.g. unknown == unknown and invalid == invalid. We need this to check if the information at a block's entry or exit has changed.

However I think we may have been conflating it with the notion that the state of VL and VTYPE are the same, which isn't the case for two unknowns, and potentially in an upcoming patch where we don't know the value of an AVL register (see llvm#93796)

This patch introduces a new method for checking that the VL and VTYPE are the same and switches it over where it would make a difference.

This should be NFC for now since the two uses of == we're replacing either didn't compare two unknowns or checked for unknowns. But it's needed if we're to introduce the notion of an AVLReg with a nullptr ValNo, which will need this non-reflexive equality.
lukel97 added a commit that referenced this pull request Jun 15, 2024
…FC (#94340)

In VSETVLIInfo we define == as lattice equality, e.g. unknown == unknown
and invalid == invalid. We need this to check if the information at a
block's entry or exit has changed.

However I think we may have been conflating it with the notion that the
state of VL and VTYPE are the same, which isn't the case for two
unknowns, and potentially in an upcoming patch where we don't know the
value of an AVL register (see #93796)

This patch switches over the use in emitVSETVLIs to use isCompatible
with all fields demanded, since we need VL and VTYPE to be known equal
at runtime rather than just the VSETVLIInfos to be the same.

This should be NFC for now we shouldn't reach here with an unknown
state. But it's needed if we're to introduce the notion of an AVLReg
with a nullptr ValNo, which will need this non-reflexive equality.
Specifically, hasSameAVL should return false for this case, but ==
should return true.
preames added a commit to preames/llvm-project that referenced this pull request Jun 17, 2024
We recently moved RISCVInsertVSETVLI from before vector register allocation to after vector register allocation.  When doing so, we added an unconditional dependency on LiveIntervals - even at O0 where LiveIntevals hadn't previously run.  As reported in llvm#93587, this was apparently not safe to do.

This change makes LiveIntervals optional, and adjusts all the update code to only run wen live intervals is present.  The only real tricky part of this change is the abstract state tracking in the dataflow.  We need to represent a "register w/unknown definition" state - but only when we don't have LiveIntervals.

This adjust the abstract state definition so that the AVLIsReg state can represent either a register + valno, or a register + unknown definition.  With LiveIntervals, wehave an exact definition for each AVL use.  Without LiveIntervals, we  treat the definition of a register AVL as being unknown.

The key semantic change is that we now have a state in the lattice for which something is known about the AVL value, but for which two identical lattice elements do *not* neccessarily represent the same AVL value at runtime.  Previously, the only case which could result in such an unknown AVL was the fully unknown state (where VTYPE is also fully unknown).  This requires a small adjustment to hasSameAVL and lattice state equality to draw this important distinction.

The net effect of this patch is that we remove the LiveIntervals dependency at O0, and O0 code quality will regress for cases involving register AVL values.  In practice, this means we pessimize code written with intrinsics at O0.

This patch is an alternative to llvm#93796 and llvm#94340.  It is very directly inspired by review conversation around them, and thus should be considered coauthored by Luke.
preames added a commit that referenced this pull request Jun 17, 2024
Stacked on #94658.
    
We recently moved RISCVInsertVSETVLI from before vector register
allocation to after vector register allocation. When doing so, we added
an unconditional dependency on LiveIntervals - even at O0 where
LiveIntevals hadn't previously run. As reported in #93587, this was
apparently not safe to do.
    
This change makes LiveIntervals optional, and adjusts all the update
code to only run wen live intervals is present. The only real tricky
part of this change is the abstract state tracking in the dataflow. We
need to represent a "register w/unknown definition" state - but only
when we don't have LiveIntervals.
    
This adjust the abstract state definition so that the AVLIsReg state can
represent either a register + valno, or a register + unknown definition.
With LiveIntervals, we have an exact definition for each AVL use.
Without LiveIntervals, we treat the definition of a register AVL as
being unknown.
    
The key semantic change is that we now have a state in the lattice for
which something is known about the AVL value, but for which two
identical lattice elements do *not* necessarily represent the same AVL
value at runtime. Previously, the only case which could result in such
an unknown AVL was the fully unknown state (where VTYPE is also fully
unknown). This requires a small adjustment to hasSameAVL and lattice
state equality to draw this important distinction.
    
The net effect of this patch is that we remove the LiveIntervals
dependency at O0, and O0 code quality will regress for cases involving
register AVL values.
    
This patch is an alternative to
#93796 and
#94340. It is very directly
inspired by review conversation around them, and thus should be
considered coauthored by Luke.
preames added a commit that referenced this pull request Jun 17, 2024
(Reapplying with corrected commit message)

We recently moved RISCVInsertVSETVLI from before vector register allocation
to after vector register allocation.  When doing so, we added an unconditional
dependency on LiveIntervals - even at O0 where LiveIntevals hadn't previously
run.  As reported in #93587, this was apparently not safe to do.

This change makes LiveIntervals optional, and adjusts all the update code to
only run wen live intervals is present.  The only real tricky part of this
change is the abstract state tracking in the dataflow.  We need to represent
a "register w/unknown definition" state - but only when we don't have
LiveIntervals.

This adjust the abstract state definition so that the AVLIsReg state can
represent either a register + valno, or a register + unknown definition.
With LiveIntervals, we have an exact definition for each AVL use.  Without
LiveIntervals, we  treat the definition of a register AVL as being unknown.

The key semantic change is that we now have a state in the lattice for which
something is known about the AVL value, but for which two identical lattice
elements do *not* neccessarily represent the same AVL value at runtime.
Previously, the only case which could result in such an unknown AVL was the
fully unknown state (where VTYPE is also fully unknown).  This requires a
small adjustment to hasSameAVL and lattice state equality to draw this
important distinction.

The net effect of this patch is that we remove the LiveIntervals dependency
at O0, and O0 code quality will regress for cases involving register AVL values.
In practice, this means we pessimize code written with intrinsics at O0.

This patch is an alternative to #93796 and #94340.  It is very directly
inspired by review conversation around them, and thus should be considered
coauthored by Luke.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RISC-V] Use of %0 does not have a corresponding definition on every path with -O0
3 participants