Skip to content

[MLIR][LLVM] Use CyclicReplacerCache for recursive DIType import #98203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jul 12, 2024

Conversation

zyx-billy
Copy link
Contributor

@zyx-billy zyx-billy commented Jul 9, 2024

Use the new CyclicReplacerCache from #98202 to support importing of recursive DITypes in LLVM dialect's DebugImporter. This helps simplify the implementation, allows for separate testing of the cache infra itself, and as a result we even got more efficient translations.


Stacked PRs:

@zyx-billy zyx-billy force-pushed the users/zyx-billy/mlir/llvm/fix_cyclic_importer branch from 288234b to a0b59b2 Compare July 9, 2024 19:35
@zyx-billy zyx-billy marked this pull request as ready for review July 9, 2024 21:05
@zyx-billy zyx-billy requested review from gysit and Dinistro July 9, 2024 21:05
@llvmbot
Copy link
Member

llvmbot commented Jul 9, 2024

@llvm/pr-subscribers-mlir-llvm

@llvm/pr-subscribers-mlir

Author: Billy Zhu (zyx-billy)

Changes

Use the new CyclicReplacerCache from #98202 to support importing of recursive DITypes in LLVM dialect's DebugImporter. This helps simplify the implementation, allows for separate testing of the cache infra itself, and as a result we even got more efficient translations.


Stacked PRs:


Full diff: https://github.com/llvm/llvm-project/pull/98203.diff

3 Files Affected:

  • (modified) mlir/lib/Target/LLVMIR/DebugImporter.cpp (+26-117)
  • (modified) mlir/lib/Target/LLVMIR/DebugImporter.h (+9-92)
  • (modified) mlir/test/Target/LLVMIR/Import/debug-info.ll (+16-18)
diff --git a/mlir/lib/Target/LLVMIR/DebugImporter.cpp b/mlir/lib/Target/LLVMIR/DebugImporter.cpp
index 15737aa681c59..f104b72209c39 100644
--- a/mlir/lib/Target/LLVMIR/DebugImporter.cpp
+++ b/mlir/lib/Target/LLVMIR/DebugImporter.cpp
@@ -28,7 +28,7 @@ using namespace mlir::LLVM::detail;
 
 DebugImporter::DebugImporter(ModuleOp mlirModule,
                              bool dropDICompositeTypeElements)
-    : recursionPruner(mlirModule.getContext()),
+    : cache([&](llvm::DINode *node) { return createRecSelf(node); }),
       context(mlirModule.getContext()), mlirModule(mlirModule),
       dropDICompositeTypeElements(dropDICompositeTypeElements) {}
 
@@ -287,16 +287,9 @@ DINodeAttr DebugImporter::translate(llvm::DINode *node) {
     return nullptr;
 
   // Check for a cached instance.
-  if (DINodeAttr attr = nodeToAttr.lookup(node))
-    return attr;
-
-  // Register with the recursive translator. If it can be handled without
-  // recursing into it, return the result immediately.
-  if (DINodeAttr attr = recursionPruner.pruneOrPushTranslationStack(node))
-    return attr;
-
-  auto guard = llvm::make_scope_exit(
-      [&]() { recursionPruner.popTranslationStack(node); });
+  auto cacheEntry = cache.lookupOrInit(node);
+  if (std::optional<DINodeAttr> result = cacheEntry.get())
+    return *result;
 
   // Convert the debug metadata if possible.
   auto translateNode = [this](llvm::DINode *node) -> DINodeAttr {
@@ -335,20 +328,20 @@ DINodeAttr DebugImporter::translate(llvm::DINode *node) {
     return nullptr;
   };
   if (DINodeAttr attr = translateNode(node)) {
-    auto [result, isSelfContained] =
-        recursionPruner.finalizeTranslation(node, attr);
-    // Only cache fully self-contained nodes.
-    if (isSelfContained)
-      nodeToAttr.try_emplace(node, result);
-    return result;
+    // If this node was repeated, lookup its recursive ID and assign it to the
+    // base result.
+    if (cacheEntry.wasRepeated()) {
+      DistinctAttr recId = nodeToRecId.lookup(node);
+      auto recType = cast<DIRecursiveTypeAttrInterface>(attr);
+      attr = cast<DINodeAttr>(recType.withRecId(recId));
+    }
+    cacheEntry.resolve(attr);
+    return attr;
   }
+  cacheEntry.resolve(nullptr);
   return nullptr;
 }
 
-//===----------------------------------------------------------------------===//
-// RecursionPruner
-//===----------------------------------------------------------------------===//
-
 /// Get the `getRecSelf` constructor for the translated type of `node` if its
 /// translated DITypeAttr supports recursion. Otherwise, returns nullptr.
 static function_ref<DIRecursiveTypeAttrInterface(DistinctAttr)>
@@ -361,104 +354,20 @@ getRecSelfConstructor(llvm::DINode *node) {
       .Default(CtorType());
 }
 
-DINodeAttr DebugImporter::RecursionPruner::pruneOrPushTranslationStack(
-    llvm::DINode *node) {
-  // If the node type is capable of being recursive, check if it's seen
-  // before.
+std::optional<DINodeAttr> DebugImporter::createRecSelf(llvm::DINode *node) {
   auto recSelfCtor = getRecSelfConstructor(node);
-  if (recSelfCtor) {
-    // If a cyclic dependency is detected since the same node is being
-    // traversed twice, emit a recursive self type, and mark the duplicate
-    // node on the translationStack so it can emit a recursive decl type.
-    auto [iter, inserted] = translationStack.try_emplace(node);
-    if (!inserted) {
-      // The original node may have already been assigned a recursive ID from
-      // a different self-reference. Use that if possible.
-      DIRecursiveTypeAttrInterface recSelf = iter->second.recSelf;
-      if (!recSelf) {
-        DistinctAttr recId = nodeToRecId.lookup(node);
-        if (!recId) {
-          recId = DistinctAttr::create(UnitAttr::get(context));
-          nodeToRecId[node] = recId;
-        }
-        recSelf = recSelfCtor(recId);
-        iter->second.recSelf = recSelf;
-      }
-      // Inject the self-ref into the previous layer.
-      translationStack.back().second.unboundSelfRefs.insert(recSelf);
-      return cast<DINodeAttr>(recSelf);
-    }
+  if (!recSelfCtor)
+    return std::nullopt;
+
+  // The original node may have already been assigned a recursive ID from
+  // a different self-reference. Use that if possible.
+  DistinctAttr recId = nodeToRecId.lookup(node);
+  if (!recId) {
+    recId = DistinctAttr::create(UnitAttr::get(context));
+    nodeToRecId[node] = recId;
   }
-
-  return lookup(node);
-}
-
-std::pair<DINodeAttr, bool>
-DebugImporter::RecursionPruner::finalizeTranslation(llvm::DINode *node,
-                                                    DINodeAttr result) {
-  // If `node` is not a potentially recursive type, it will not be on the
-  // translation stack. Nothing to set in this case.
-  if (translationStack.empty())
-    return {result, true};
-  if (translationStack.back().first != node)
-    return {result, translationStack.back().second.unboundSelfRefs.empty()};
-
-  TranslationState &state = translationStack.back().second;
-
-  // If this node is actually recursive, set the recId onto `result`.
-  if (DIRecursiveTypeAttrInterface recSelf = state.recSelf) {
-    auto recType = cast<DIRecursiveTypeAttrInterface>(result);
-    result = cast<DINodeAttr>(recType.withRecId(recSelf.getRecId()));
-    // Remove this recSelf from the set of unbound selfRefs.
-    state.unboundSelfRefs.erase(recSelf);
-  }
-
-  // Insert the result into our internal cache if it's not self-contained.
-  if (!state.unboundSelfRefs.empty()) {
-    [[maybe_unused]] auto [_, inserted] = dependentCache.try_emplace(
-        node, DependentTranslation{result, state.unboundSelfRefs});
-    assert(inserted && "invalid state: caching the same DINode twice");
-    return {result, false};
-  }
-  return {result, true};
-}
-
-void DebugImporter::RecursionPruner::popTranslationStack(llvm::DINode *node) {
-  // If `node` is not a potentially recursive type, it will not be on the
-  // translation stack. Nothing to handle in this case.
-  if (translationStack.empty() || translationStack.back().first != node)
-    return;
-
-  // At the end of the stack, all unbound self-refs must be resolved already,
-  // and the entire cache should be accounted for.
-  TranslationState &currLayerState = translationStack.back().second;
-  if (translationStack.size() == 1) {
-    assert(currLayerState.unboundSelfRefs.empty() &&
-           "internal error: unbound recursive self reference at top level.");
-    translationStack.pop_back();
-    return;
-  }
-
-  // Copy unboundSelfRefs down to the previous level.
-  TranslationState &nextLayerState = (++translationStack.rbegin())->second;
-  nextLayerState.unboundSelfRefs.insert(currLayerState.unboundSelfRefs.begin(),
-                                        currLayerState.unboundSelfRefs.end());
-  translationStack.pop_back();
-}
-
-DINodeAttr DebugImporter::RecursionPruner::lookup(llvm::DINode *node) {
-  auto cacheIter = dependentCache.find(node);
-  if (cacheIter == dependentCache.end())
-    return {};
-
-  DependentTranslation &entry = cacheIter->second;
-  if (llvm::set_is_subset(entry.unboundSelfRefs,
-                          translationStack.back().second.unboundSelfRefs))
-    return entry.attr;
-
-  // Stale cache entry.
-  dependentCache.erase(cacheIter);
-  return {};
+  DIRecursiveTypeAttrInterface recSelf = recSelfCtor(recId);
+  return cast<DINodeAttr>(recSelf);
 }
 
 //===----------------------------------------------------------------------===//
diff --git a/mlir/lib/Target/LLVMIR/DebugImporter.h b/mlir/lib/Target/LLVMIR/DebugImporter.h
index fac86dbe2cdd2..4a2bf35c160e1 100644
--- a/mlir/lib/Target/LLVMIR/DebugImporter.h
+++ b/mlir/lib/Target/LLVMIR/DebugImporter.h
@@ -17,6 +17,7 @@
 #include "mlir/Dialect/LLVMIR/LLVMDialect.h"
 #include "mlir/IR/BuiltinOps.h"
 #include "mlir/IR/MLIRContext.h"
+#include "mlir/Support/CyclicReplacerCache.h"
 #include "llvm/IR/DebugInfoMetadata.h"
 
 namespace mlir {
@@ -87,102 +88,18 @@ class DebugImporter {
   /// for it, or create a new one if not.
   DistinctAttr getOrCreateDistinctID(llvm::DINode *node);
 
-  /// A mapping between LLVM debug metadata and the corresponding attribute.
-  DenseMap<llvm::DINode *, DINodeAttr> nodeToAttr;
+  std::optional<DINodeAttr> createRecSelf(llvm::DINode *node);
+
   /// A mapping between distinct LLVM debug metadata nodes and the corresponding
   /// distinct id attribute.
   DenseMap<llvm::DINode *, DistinctAttr> nodeToDistinctAttr;
 
-  /// Translation helper for recursive DINodes.
-  /// Works alongside a stack-based DINode translator (the "main translator")
-  /// for gracefully handling DINodes that are recursive.
-  ///
-  /// Usage:
-  /// - Before translating a node, call `pruneOrPushTranslationStack` to see if
-  ///   the pruner can preempt this translation. If this is a node that the
-  ///   pruner already knows how to handle, it will return the translated
-  ///   DINodeAttr.
-  /// - After a node is successfully translated by the main translator, call
-  ///   `finalizeTranslation` to save the translated result with the pruner, and
-  ///   give it a chance to further modify the result.
-  /// - Regardless of success or failure by the main translator, always call
-  ///   `popTranslationStack` at the end of translating a node. This is
-  ///   necessary to keep the internal book-keeping in sync.
-  ///
-  /// This helper maintains an internal cache so that no recursive type will
-  /// be translated more than once by the main translator.
-  /// This internal cache is different from the cache maintained by the main
-  /// translator because it may store nodes that are not self-contained (i.e.
-  /// contain unbounded recursive self-references).
-  class RecursionPruner {
-  public:
-    RecursionPruner(MLIRContext *context) : context(context) {}
-
-    /// If this node is a recursive instance that was previously seen, returns a
-    /// self-reference. If this node was previously cached, returns the cached
-    /// result. Otherwise, returns null attr, and a translation stack frame is
-    /// created for this node. Expects `finalizeTranslation` &
-    /// `popTranslationStack` to be called on this node later.
-    DINodeAttr pruneOrPushTranslationStack(llvm::DINode *node);
-
-    /// Register the translated result of `node`. Returns the finalized result
-    /// (with recId if recursive) and whether the result is self-contained
-    /// (i.e. contains no unbound self-refs).
-    std::pair<DINodeAttr, bool> finalizeTranslation(llvm::DINode *node,
-                                                    DINodeAttr result);
-
-    /// Pop off a frame from the translation stack after a node is done being
-    /// translated.
-    void popTranslationStack(llvm::DINode *node);
-
-  private:
-    /// Returns the cached result (if exists) or null.
-    /// The cache entry will be removed if not all of its dependent self-refs
-    /// exists.
-    DINodeAttr lookup(llvm::DINode *node);
-
-    MLIRContext *context;
-
-    /// A cached translation that contains the translated attribute as well
-    /// as any unbound self-references that it depends on.
-    struct DependentTranslation {
-      /// The translated attr. May contain unbound self-references for other
-      /// recursive attrs.
-      DINodeAttr attr;
-      /// The set of unbound self-refs that this cached entry refers to. All
-      /// these self-refs must exist for the cached entry to be valid.
-      DenseSet<DIRecursiveTypeAttrInterface> unboundSelfRefs;
-    };
-    /// A mapping between LLVM debug metadata and the corresponding attribute.
-    /// Only contains those with unboundSelfRefs. Fully self-contained attrs
-    /// will be cached by the outer main translator.
-    DenseMap<llvm::DINode *, DependentTranslation> dependentCache;
-
-    /// Each potentially recursive node will have a TranslationState pushed onto
-    /// the `translationStack` to keep track of whether this node is actually
-    /// recursive (i.e. has self-references inside), and other book-keeping.
-    struct TranslationState {
-      /// The rec-self if this node is indeed a recursive node (i.e. another
-      /// instance of itself is seen while translating it). Null if this node
-      /// has not been seen again deeper in the translation stack.
-      DIRecursiveTypeAttrInterface recSelf;
-      /// All the unbound recursive self references in this layer of the
-      /// translation stack.
-      DenseSet<DIRecursiveTypeAttrInterface> unboundSelfRefs;
-    };
-    /// A stack that stores the metadata nodes that are being traversed. The
-    /// stack is used to handle cyclic dependencies during metadata translation.
-    /// Each node is pushed with an empty TranslationState. If it is ever seen
-    /// later when the stack is deeper, the node is recursive, and its
-    /// TranslationState is assigned a recSelf.
-    llvm::MapVector<llvm::DINode *, TranslationState> translationStack;
-
-    /// A mapping between DINodes that are recursive, and their assigned recId.
-    /// This is kept so that repeated occurrences of the same node can reuse the
-    /// same ID and be deduplicated.
-    DenseMap<llvm::DINode *, DistinctAttr> nodeToRecId;
-  };
-  RecursionPruner recursionPruner;
+  /// A mapping between DINodes that are recursive, and their assigned recId.
+  /// This is kept so that repeated occurrences of the same node can reuse the
+  /// same ID and be deduplicated.
+  DenseMap<llvm::DINode *, DistinctAttr> nodeToRecId;
+
+  CyclicReplacerCache<llvm::DINode *, DINodeAttr> cache;
 
   MLIRContext *context;
   ModuleOp mlirModule;
diff --git a/mlir/test/Target/LLVMIR/Import/debug-info.ll b/mlir/test/Target/LLVMIR/Import/debug-info.ll
index 2173cc33e68a4..a194ecbf2eb20 100644
--- a/mlir/test/Target/LLVMIR/Import/debug-info.ll
+++ b/mlir/test/Target/LLVMIR/Import/debug-info.ll
@@ -308,16 +308,16 @@ define void @class_method() {
 }
 
 ; Verify the cyclic composite type is identified, even though conversion begins from the subprogram type.
-; CHECK: #[[COMP_SELF:.+]] = #llvm.di_composite_type<tag = DW_TAG_null, recId = [[REC_ID:.+]]>
-; CHECK: #[[COMP_PTR:.+]] = #llvm.di_derived_type<tag = DW_TAG_pointer_type, baseType = #[[COMP_SELF]], sizeInBits = 64>
-; CHECK: #[[SP_TYPE:.+]] = #llvm.di_subroutine_type<types = #{{.*}}, #[[COMP_PTR]]>
-; CHECK: #[[SP_INNER:.+]] = #llvm.di_subprogram<id = [[SP_ID:.+]], compileUnit = #{{.*}}, scope = #[[COMP_SELF]], name = "class_method", file = #{{.*}}, subprogramFlags = Definition, type = #[[SP_TYPE]]>
-; CHECK: #[[COMP:.+]] = #llvm.di_composite_type<tag = DW_TAG_class_type, recId = [[REC_ID]], name = "class_name", file = #{{.*}}, line = 42, flags = "TypePassByReference|NonTrivial", elements = #[[SP_INNER]]>
+; CHECK-DAG: #[[COMP_SELF:.+]] = #llvm.di_composite_type<tag = DW_TAG_null, recId = [[REC_ID:.+]]>
+; CHECK-DAG: #[[COMP_PTR:.+]] = #llvm.di_derived_type<tag = DW_TAG_pointer_type, baseType = #[[COMP_SELF]], sizeInBits = 64>
+; CHECK-DAG: #[[SP_TYPE:.+]] = #llvm.di_subroutine_type<types = #{{.*}}, #[[COMP_PTR]]>
+; CHECK-DAG: #[[SP_INNER:.+]] = #llvm.di_subprogram<id = [[SP_ID:.+]], compileUnit = #{{.*}}, scope = #[[COMP_SELF]], name = "class_method", file = #{{.*}}, subprogramFlags = Definition, type = #[[SP_TYPE]]>
+; CHECK-DAG: #[[COMP:.+]] = #llvm.di_composite_type<tag = DW_TAG_class_type, recId = [[REC_ID]], name = "class_name", file = #{{.*}}, line = 42, flags = "TypePassByReference|NonTrivial", elements = #[[SP_INNER]]>
 
-; CHECK: #[[COMP_PTR_OUTER:.+]] = #llvm.di_derived_type<tag = DW_TAG_pointer_type, baseType = #[[COMP]], sizeInBits = 64>
-; CHECK: #[[SP_TYPE_OUTER:.+]] = #llvm.di_subroutine_type<types = #{{.*}}, #[[COMP_PTR_OUTER]]>
-; CHECK: #[[SP_OUTER:.+]] = #llvm.di_subprogram<id = [[SP_ID]], compileUnit = #{{.*}}, scope = #[[COMP]], name = "class_method", file = #{{.*}}, subprogramFlags = Definition, type = #[[SP_TYPE_OUTER]]>
-; CHECK: #[[LOC]] = loc(fused<#[[SP_OUTER]]>
+; CHECK-DAG: #[[COMP_PTR_OUTER:.+]] = #llvm.di_derived_type<tag = DW_TAG_pointer_type, baseType = #[[COMP]], sizeInBits = 64>
+; CHECK-DAG: #[[SP_TYPE_OUTER:.+]] = #llvm.di_subroutine_type<types = #{{.*}}, #[[COMP_PTR_OUTER]]>
+; CHECK-DAG: #[[SP_OUTER:.+]] = #llvm.di_subprogram<id = [[SP_ID]], compileUnit = #{{.*}}, scope = #[[COMP]], name = "class_method", file = #{{.*}}, subprogramFlags = Definition, type = #[[SP_TYPE_OUTER]]>
+; CHECK-DAG: #[[LOC]] = loc(fused<#[[SP_OUTER]]>
 
 !llvm.dbg.cu = !{!1}
 !llvm.module.flags = !{!0}
@@ -335,12 +335,12 @@ define void @class_method() {
 ; // -----
 
 ; Verify the cyclic composite type is handled correctly.
-; CHECK: #[[COMP_SELF:.+]] = #llvm.di_composite_type<tag = DW_TAG_null, recId = [[REC_ID:.+]]>
-; CHECK: #[[COMP_PTR_INNER:.+]] = #llvm.di_derived_type<tag = DW_TAG_pointer_type, baseType = #[[COMP_SELF]]>
-; CHECK: #[[FIELD:.+]] = #llvm.di_derived_type<tag = DW_TAG_member, name = "call_field", baseType = #[[COMP_PTR_INNER]]>
-; CHECK: #[[COMP:.+]] = #llvm.di_composite_type<tag = DW_TAG_class_type, recId = [[REC_ID]], name = "class_field", file = #{{.*}}, line = 42, flags = "TypePassByReference|NonTrivial", elements = #[[FIELD]]>
-; CHECK: #[[COMP_PTR_OUTER:.+]] = #llvm.di_derived_type<tag = DW_TAG_pointer_type, baseType = #[[COMP]]>
-; CHECK: #[[VAR0:.+]] = #llvm.di_local_variable<scope = #{{.*}}, name = "class_field", file = #{{.*}}, type = #[[COMP_PTR_OUTER]]>
+; CHECK-DAG: #[[COMP_SELF:.+]] = #llvm.di_composite_type<tag = DW_TAG_null, recId = [[REC_ID:.+]]>
+; CHECK-DAG: #[[COMP_PTR_INNER:.+]] = #llvm.di_derived_type<tag = DW_TAG_pointer_type, baseType = #[[COMP_SELF]]>
+; CHECK-DAG: #[[FIELD:.+]] = #llvm.di_derived_type<tag = DW_TAG_member, name = "call_field", baseType = #[[COMP_PTR_INNER]]>
+; CHECK-DAG: #[[COMP:.+]] = #llvm.di_composite_type<tag = DW_TAG_class_type, recId = [[REC_ID]], name = "class_field", file = #{{.*}}, line = 42, flags = "TypePassByReference|NonTrivial", elements = #[[FIELD]]>
+; CHECK-DAG: #[[COMP_PTR_OUTER:.+]] = #llvm.di_derived_type<tag = DW_TAG_pointer_type, baseType = #[[COMP]]>
+; CHECK-DAG: #[[VAR0:.+]] = #llvm.di_local_variable<scope = #{{.*}}, name = "class_field", file = #{{.*}}, type = #[[COMP_PTR_OUTER]]>
 
 ; CHECK: @class_field
 ; CHECK-SAME:  %[[ARG0:[a-zA-Z0-9]+]]
@@ -727,9 +727,7 @@ define void @class_field(ptr %arg1) !dbg !18 {
 ; CHECK-DAG: #[[B_TO_C]] = #llvm.di_derived_type<{{.*}}name = "->C", {{.*}}baseType = #[[C_FROM_B:.+]]>
 ; CHECK-DAG: #[[C_FROM_B]] = #llvm.di_composite_type<{{.*}}recId = [[C_RECID:.+]], {{.*}}name = "C", {{.*}}elements = #[[TO_A_SELF:.+]], #[[TO_B_SELF:.+]], #[[TO_C_SELF:.+]]>
 
-; CHECK-DAG: #[[C_FROM_A]] = #llvm.di_composite_type<{{.*}}recId = [[C_RECID]], {{.*}}name = "C", {{.*}}elements = #[[TO_A_SELF]], #[[TO_B_INNER:.+]], #[[TO_C_SELF]]
-; CHECK-DAG: #[[TO_B_INNER]] = #llvm.di_derived_type<{{.*}}name = "->B", {{.*}}baseType = #[[B_INNER:.+]]>
-; CHECK-DAG: #[[B_INNER]] = #llvm.di_composite_type<{{.*}}name = "B", {{.*}}elements = #[[TO_C_SELF]]>
+; CHECK-DAG: #[[C_FROM_A]] = #llvm.di_composite_type<{{.*}}recId = [[C_RECID]], {{.*}}name = "C", {{.*}}elements = #[[TO_A_SELF]], #[[A_TO_B:.+]], #[[TO_C_SELF]]
 
 ; CHECK-DAG: #[[TO_A_SELF]] = #llvm.di_derived_type<{{.*}}name = "->A", {{.*}}baseType = #[[A_SELF:.+]]>
 ; CHECK-DAG: #[[TO_B_SELF]] = #llvm.di_derived_type<{{.*}}name = "->B", {{.*}}baseType = #[[B_SELF:.+]]>

Base automatically changed from users/zyx-billy/mlir/cyclic_replacer_cache to main July 12, 2024 05:08
@zyx-billy zyx-billy merged commit dd86604 into main Jul 12, 2024
7 checks passed
@zyx-billy zyx-billy deleted the users/zyx-billy/mlir/llvm/fix_cyclic_importer branch July 12, 2024 16:24
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
…m#98203)

Use the new CyclicReplacerCache from
llvm#98202 to support importing of
recursive DITypes in LLVM dialect's DebugImporter. This helps simplify
the implementation, allows for separate testing of the cache infra
itself, and as a result we even got more efficient translations.
@Dinistro
Copy link
Contributor

We have a case where a cycle involving a DISubprogram, and we hit an assertion that it is attempted to look it up for the third time.

The following input is a reproducer of said issue. It seems that the presence of 3 types transitively referring back to the subprogram break an assumption made by this.

; ModuleID = 'out.ll'
source_filename = "out.ll"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare void @llvm.dbg.declare(metadata, metadata, metadata) #0

define void @foo() {
L.entry:
  call void @llvm.dbg.declare(metadata ptr null, metadata !4, metadata !DIExpression()), !dbg !22
  ret void
}

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3}

!0 = distinct !DICompileUnit(language: DW_LANG_Fortran90, file: !1, producer: "F90 Flang - 1.5 2017-05-01", isOptimized: true, flags: "", runtimeVersion: 0, emissionKind: FullDebug, enums: !2, retainedTypes: !2, globals: !2, imports: !2, nameTableKind: None)
!1 = !DIFile(filename: "file.f90", directory: "/")
!2 = !{}
!3 = !{i32 2, !"Debug Info Version", i32 3}
!4 = !DILocalVariable(name: "x_min", arg: 1, scope: !5, file: !1, line: 27, type: !21)
!5 = distinct !DISubprogram(name: "accelerate_kernel", scope: !6, file: !1, line: 27, type: !7, scopeLine: 27, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !2)
!6 = !DIModule(scope: !0, name: "accelerate_kernel_module", file: !1)
!7 = !DISubroutineType(types: !8)
!8 = !{null, !9, !16, !20}
!9 = !DICompositeType(tag: DW_TAG_array_type, baseType: !10, size: 64, align: 64, elements: !11)
!10 = !DIBasicType(name: "double precision", size: 64, align: 64, encoding: DW_ATE_float)
!11 = !{!12}
!12 = !DISubrange(lowerBound: !13, upperBound: !15)
!13 = distinct !DILocalVariable(scope: !5, file: !1, type: !14, flags: DIFlagArtificial)
!14 = !DIBasicType(name: "integer*8", size: 64, align: 64, encoding: DW_ATE_signed)
!15 = distinct !DILocalVariable(scope: !5, file: !1, type: !14, flags: DIFlagArtificial)
!16 = distinct !DICompositeType(tag: DW_TAG_array_type, baseType: !10, size: 64, align: 64, elements: !17)
!17 = !{!18}
!18 = !DISubrange(lowerBound: !13, upperBound: !19)
!19 = distinct !DILocalVariable(scope: !5, file: !1, type: !14, flags: DIFlagArtificial)
!20 = !DICompositeType(tag: DW_TAG_array_type, baseType: !10, size: 64, align: 64, elements: !17)
!21 = !DIBasicType(name: "integer", size: 32, align: 32, encoding: DW_ATE_signed)
!22 = !DILocation(line: 0, scope: !5)

So far, I'm not sure if this could be fixed by implementing the DIRecursiveTypeAttrInterface or if something else is broken. If it's "just" implementing the interface, I'm a bit afraid that there might be other cases like this that we might encounter along the way.

@zyx-billy
Copy link
Contributor Author

Ohhh... I see. There's no guarantee from this import algo that we'll never see a node 3 times. If not all nodes can break cycles, having multiple back-edges will cause it to die.

0 -> 1
1 -> 2, 3
2 -> 0
3 -> 0

Everything goes fine when looping through the first loop until we reach the second loop 🤦

0 -> 1 -> 2 -> 0 -> 1 -> 2'
                      -> 3 -> 0

Thanks for the repro btw. Very easy to digest. It seems having a cycle-breaking node in each loop is not enough to guarantee we never see a node 3 times. We have to make sure it's impossible to walk the graph so that you can go through a node 3 times before visiting a cycle-breaking node twice (and in this case such a violating path would be 0 -> 1 -> 2 -> 0 -> 1 -> 3 -> 0). So I totally agree with your concern... can we do this without requiring all nodes to be cycle-breaking 🤔 (or is anything special with 0 that we can use to refine our rule).

@Dinistro
Copy link
Contributor

Dinistro commented Aug 2, 2024

I'm unsure if making 0 (in this case DISubprogram) a cycle breaker would solve the issue or if we would require changes to the algorithm all-together.

Assuming that cycle breaking is enough, then we only have specific nodes we find that will have back-edges when walking the debug metadata in the import. So far, we are aware of DICompositeType and DISubprogram. The question is if there are any more cases where this can happen.
I currently see three different way in how we could deal with that:

  1. Add cycle-breaking support for every node type
  2. Deeply analyse the debug metadata documentation and find all necessary cycle-breakers. Given the state of the documentation, this might be very hard, though.
  3. Iteratively add cycle breaking for issues we find along the way. While this might require some cycles (pun intended), the assertion should help us in finding these.

Thanks for the repro btw. Very easy to digest.

llvm-reduce can do magic ;P

@gysit
Copy link
Contributor

gysit commented Aug 2, 2024

can we do this without requiring all nodes to be cycle-breaking 🤔 (or is anything special with 0 that we can use to refine our rule).

What would be the implication in terms of implementation and performance if we make all nodes self recursive? I assume the implementation overheads could be hidden to some extend in a base class (but not entirely)?

I would probably go with the pragmatic solution (3.) and make DISubprogram cycle-breaking initially. I think it is fair to say that DISubprogram is special since it can is a scope and it has a type. I don't think this is the case for another scope attribute. But yes there may be other cyclic node types.

@zyx-billy
Copy link
Contributor Author

If we made all nodes cycle-breaking... We can either make each node have a dedicated self-reference mode, or use a dedicated back-reference DIAttr, but either way we have to deal with extra casting / checking everywhere 🤔. So I agree it's worth trying (3.) first and see if that solves everything in practice already.

@Dinistro
Copy link
Contributor

Dinistro commented Aug 4, 2024

If we made all nodes cycle-breaking... We can either make each node have a dedicated self-reference mode, or use a dedicated back-reference DIAttr, but either way we have to deal with extra casting / checking everywhere 🤔. So I agree it's worth trying (3.) first and see if that solves everything in practice already.

Let's give it a try then. Do you have some spare cycles, or should one of us try to allocate time to try this?

@zyx-billy
Copy link
Contributor Author

I might not be able to get to it these two weeks, so if you have time that would be great (might also be easier to test with the rest of your inputs) Happy to help if needed 🙏.

@Dinistro
Copy link
Contributor

Dinistro commented Aug 9, 2024

I might not be able to get to it these two weeks, so if you have time that would be great (might also be easier to test with the rest of your inputs) Happy to help if needed 🙏.

We are also quite under water, but I'll try to get back to this next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants