-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[MLIR][LLVM] Recursion importer handle repeated self-references #87295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MLIR][LLVM] Recursion importer handle repeated self-references #87295
Conversation
@llvm/pr-subscribers-mlir-llvm @llvm/pr-subscribers-mlir Author: Billy Zhu (zyx-billy) ChangesFollowup to this discussion: #80251 (comment). The previous debug importer was correct but inefficient. For cases with mutual recursion that contain more than one back-edge, each back-edge would result in a new translated instance. This is because the previous implementation never caches any translated result with unbounded self-references. This means all translation inside a recursive context is performed from scratch, which will incur repeated run-time cost as well as repeated attribute sub-trees in the translated IR (differing only in their This PR refactors the importer to handle caching inside a recursive context.
Patch is 20.08 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/87295.diff 3 Files Affected:
diff --git a/mlir/lib/Target/LLVMIR/DebugImporter.cpp b/mlir/lib/Target/LLVMIR/DebugImporter.cpp
index 779ad26fc847e6..3de62d960ffc44 100644
--- a/mlir/lib/Target/LLVMIR/DebugImporter.cpp
+++ b/mlir/lib/Target/LLVMIR/DebugImporter.cpp
@@ -25,6 +25,269 @@ using namespace mlir;
using namespace mlir::LLVM;
using namespace mlir::LLVM::detail;
+/// Get the `getRecSelf` constructor for the translated type of `node` if its
+/// translated DITypeAttr supports recursion. Otherwise, returns nullptr.
+static function_ref<DIRecursiveTypeAttrInterface(DistinctAttr)>
+getRecSelfConstructor(llvm::DINode *node) {
+ using CtorType = function_ref<DIRecursiveTypeAttrInterface(DistinctAttr)>;
+ return TypeSwitch<llvm::DINode *, CtorType>(node)
+ .Case([&](llvm::DICompositeType *) {
+ return CtorType(DICompositeTypeAttr::getRecSelf);
+ })
+ .Default(CtorType());
+}
+
+/// Translation helper for recursive DINodes.
+/// Works alongside a stack-based DINode translator (the "main translator") for
+/// gracefully handling DINodes that are recursive.
+///
+/// Usage:
+/// - Before translating a node, call `tryPrune` to see if the pruner can
+/// preempt this translation. If this is a node that the pruner already knows
+/// how to handle, it will return the translated DINodeAttr.
+/// - After a node is successfully translated by the main translator, call
+/// `finalizeTranslation` to save the translated result with the pruner, and
+/// give it a chance to further modify the result.
+/// - Regardless of success or failure by the main translator, always call
+/// `finally` at the end of translating a node. This is necessary to keep the
+/// internal book-keeping in sync.
+///
+/// This helper maintains an internal cache so that no recursive type will
+/// be translated more than once by the main translator.
+/// This internal cache is different from the cache maintained by the main
+/// translator because it may store nodes that are not self-contained (i.e.
+/// contain unbounded recursive self-references).
+class mlir::LLVM::detail::RecursionPruner {
+public:
+ RecursionPruner(MLIRContext *context) : context(context) {}
+
+ /// If this node was previously cached, returns the cached result.
+ /// If this node is a recursive instance that was previously seen, returns a
+ /// self-reference.
+ /// Otherwise, returns null attr.
+ DINodeAttr tryPrune(llvm::DINode *node) {
+ // Lookup the cache first.
+ auto [result, unboundSelfRefs] = lookup(node);
+ if (result) {
+ // Need to inject unbound self-refs into the previous layer.
+ if (!unboundSelfRefs.empty())
+ translationStack.back().second.unboundSelfRefs.insert(
+ unboundSelfRefs.begin(), unboundSelfRefs.end());
+ return result;
+ }
+
+ // If the node type is capable of being recursive, check if it's seen
+ // before.
+ auto recSelfCtor = getRecSelfConstructor(node);
+ if (recSelfCtor) {
+ // If a cyclic dependency is detected since the same node is being
+ // traversed twice, emit a recursive self type, and mark the duplicate
+ // node on the translationStack so it can emit a recursive decl type.
+ auto [iter, inserted] = translationStack.try_emplace(node);
+ if (!inserted) {
+ // The original node may have already been assigned a recursive ID from
+ // a different self-reference. Use that if possible.
+ DIRecursiveTypeAttrInterface recSelf = iter->second.recSelf;
+ if (!recSelf) {
+ DistinctAttr recId = DistinctAttr::create(UnitAttr::get(context));
+ recSelf = recSelfCtor(recId);
+ iter->second.recSelf = recSelf;
+ }
+ // Inject the self-ref into the previous layer.
+ translationStack.back().second.unboundSelfRefs.insert(recSelf);
+ return cast<DINodeAttr>(recSelf);
+ }
+ }
+ return nullptr;
+ }
+
+ /// Register the translated result of `node`. Returns the finalized result
+ /// (with recId if recursive) and whether the result is self-contained
+ /// (i.e. contains no unbound self-refs).
+ std::pair<DINodeAttr, bool> finalizeTranslation(llvm::DINode *node,
+ DINodeAttr result) {
+ // If `node` is not a potentially recursive type, it will not be on the
+ // translation stack. Nothing to set in this case.
+ if (translationStack.empty())
+ return {result, true};
+ if (translationStack.back().first != node)
+ return {result, translationStack.back().second.unboundSelfRefs.empty()};
+
+ TranslationState &state = translationStack.back().second;
+
+ // If this node is actually recursive, set the recId onto `result`.
+ if (DIRecursiveTypeAttrInterface recSelf = state.recSelf) {
+ auto recType = cast<DIRecursiveTypeAttrInterface>(result);
+ result = cast<DINodeAttr>(recType.withRecId(recSelf.getRecId()));
+
+ // Remove this recSelf from the set of unbound selfRefs.
+ state.unboundSelfRefs.erase(recSelf);
+
+ // Insert the newly resolved recursive type into the cache entries that
+ // rely on it.
+ // Only need to look at the caches at this level.
+ uint64_t numRemaining = state.cacheSize;
+ for (auto &cacheEntry : llvm::make_range(cache.rbegin(), cache.rend())) {
+ if (numRemaining == 0)
+ break;
+
+ if (auto refIter = cacheEntry.second.pendingReplacements.find(recSelf);
+ refIter != cacheEntry.second.pendingReplacements.end()) {
+ refIter->second = result;
+ }
+
+ --numRemaining;
+ }
+ }
+
+ // Insert the current result into the cache.
+ state.cacheSize++;
+ auto [iter, inserted] = cache.try_emplace(node);
+ assert(inserted && "invalid state: caching the same DINode twice");
+ iter->second.attr = result;
+
+ // If this node had any unbound self-refs free when it is registered into
+ // the cache, set up replacement placeholders: This result will need these
+ // unbound self-refs to be replaced before being used.
+ for (DIRecursiveTypeAttrInterface selfRef : state.unboundSelfRefs)
+ iter->second.pendingReplacements.try_emplace(selfRef, nullptr);
+
+ return {result, state.unboundSelfRefs.empty()};
+ }
+
+ /// Pop off a frame from the translation stack after a node is done being
+ /// translated.
+ void finally(llvm::DINode *node) {
+ // If `node` is not a potentially recursive type, it will not be on the
+ // translation stack. Nothing to handle in this case.
+ if (translationStack.empty() || translationStack.back().first != node)
+ return;
+
+ // At the end of the stack, all unbound self-refs must be resolved already,
+ // and the entire cache should be accounted for.
+ TranslationState &currLayerState = translationStack.back().second;
+ if (translationStack.size() == 1) {
+ assert(currLayerState.unboundSelfRefs.empty() &&
+ "internal error: unbound recursive self reference at top level.");
+ assert(currLayerState.cacheSize == cache.size() &&
+ "internal error: inconsistent cache size");
+ translationStack.pop_back();
+ cache.clear();
+ return;
+ }
+
+ // Copy unboundSelfRefs down to the previous level.
+ TranslationState &nextLayerState = (++translationStack.rbegin())->second;
+ nextLayerState.unboundSelfRefs.insert(
+ currLayerState.unboundSelfRefs.begin(),
+ currLayerState.unboundSelfRefs.end());
+
+ // The current layer cache is now considered part of the lower layer cache.
+ nextLayerState.cacheSize += currLayerState.cacheSize;
+
+ // Finally pop off this layer when all bookkeeping is done.
+ translationStack.pop_back();
+ }
+
+private:
+ /// Returns the cached result (if exists) with all known replacements applied.
+ /// Also returns the set of unbound self-refs that are unresolved in this
+ /// cached result.
+ /// The cache entry will also be updated with the replaced result and with the
+ /// applied replacements removed from the pendingReplacements map.
+ std::pair<DINodeAttr, llvm::DenseSet<DIRecursiveTypeAttrInterface>>
+ lookup(llvm::DINode *node) {
+ auto cacheIter = cache.find(node);
+ if (cacheIter == cache.end())
+ return {};
+
+ CachedTranslation &entry = cacheIter->second;
+
+ if (entry.pendingReplacements.empty())
+ return std::make_pair(entry.attr,
+ llvm::DenseSet<DIRecursiveTypeAttrInterface>{});
+
+ mlir::AttrTypeReplacer replacer;
+ replacer.addReplacement(
+ [&entry](DIRecursiveTypeAttrInterface attr)
+ -> std::optional<std::pair<Attribute, WalkResult>> {
+ if (auto replacement = entry.pendingReplacements.lookup(attr)) {
+ // A replacement may contain additional unbound self-refs.
+ return std::make_pair(replacement, mlir::WalkResult::advance());
+ }
+ return std::make_pair(attr, mlir::WalkResult::skip());
+ });
+
+ Attribute replacedAttr = replacer.replace(entry.attr);
+ DINodeAttr result = cast<DINodeAttr>(replacedAttr);
+
+ // Update cache entry to save replaced version and remove already-applied
+ // replacements.
+ entry.attr = result;
+ DenseSet<DIRecursiveTypeAttrInterface> unboundRefs;
+ DenseSet<DIRecursiveTypeAttrInterface> boundRefs;
+ for (auto [refSelf, replacement] : entry.pendingReplacements) {
+ if (replacement)
+ boundRefs.insert(refSelf);
+ else
+ unboundRefs.insert(refSelf);
+ }
+
+ for (DIRecursiveTypeAttrInterface ref : boundRefs)
+ entry.pendingReplacements.erase(ref);
+
+ return std::make_pair(result, unboundRefs);
+ }
+
+ MLIRContext *context;
+
+ /// A lazy cached translation that contains the translated attribute as well
+ /// as any unbound self-references that need to be replaced at lookup.
+ struct CachedTranslation {
+ /// The translated attr. May contain unbound self-references for other
+ /// recursive attrs.
+ DINodeAttr attr;
+ /// The pending replacements that need to be run on this `attr` before it
+ /// can be used.
+ /// Each key is a recursive self-ref, and each value is a recursive decl
+ /// that may contain additional unbound self-refs that need to be replaced.
+ /// Each replacement will be applied at most once. Once a replacement is
+ /// applied, the cached `attr` will be updated, and the replacement will be
+ /// removed from this map.
+ DenseMap<DIRecursiveTypeAttrInterface, DINodeAttr> pendingReplacements;
+ };
+ /// A mapping between LLVM debug metadata and the corresponding attribute.
+ llvm::MapVector<llvm::DINode *, CachedTranslation> cache;
+
+ /// Each potentially recursive node will have a TranslationState pushed onto
+ /// the `translationStack` to keep track of whether this node is actually
+ /// recursive (i.e. has self-references inside), and other book-keeping.
+ struct TranslationState {
+ /// The rec-self if this node is indeed a recursive node (i.e. another
+ /// instance of itself is seen while translating it). Null if this node
+ /// has not been seen again deeper in the translation stack.
+ DIRecursiveTypeAttrInterface recSelf;
+ /// The number of cache entries belonging to this layer of the translation
+ /// stack. This corresponds to the last `cacheSize` entries of `cache`.
+ uint64_t cacheSize = 0;
+ /// All the unbound recursive self references in this layer of the
+ /// translation stack.
+ DenseSet<DIRecursiveTypeAttrInterface> unboundSelfRefs;
+ };
+ /// A stack that stores the metadata nodes that are being traversed. The stack
+ /// is used to handle cyclic dependencies during metadata translation.
+ /// Each node is pushed with an empty TranslationState. If it is ever seen
+ /// later when the stack is deeper, the node is recursive, and its
+ /// TranslationState is assigned a recSelf.
+ llvm::MapVector<llvm::DINode *, TranslationState> translationStack;
+};
+
+DebugImporter::DebugImporter(ModuleOp mlirModule)
+ : recursionPruner(new RecursionPruner(mlirModule.getContext())),
+ context(mlirModule.getContext()), mlirModule(mlirModule) {}
+
+DebugImporter::~DebugImporter() {}
+
Location DebugImporter::translateFuncLocation(llvm::Function *func) {
llvm::DISubprogram *subprogram = func->getSubprogram();
if (!subprogram)
@@ -246,42 +509,12 @@ DINodeAttr DebugImporter::translate(llvm::DINode *node) {
if (DINodeAttr attr = nodeToAttr.lookup(node))
return attr;
- // If the node type is capable of being recursive, check if it's seen before.
- auto recSelfCtor = getRecSelfConstructor(node);
- if (recSelfCtor) {
- // If a cyclic dependency is detected since the same node is being traversed
- // twice, emit a recursive self type, and mark the duplicate node on the
- // translationStack so it can emit a recursive decl type.
- auto [iter, inserted] = translationStack.try_emplace(node, nullptr);
- if (!inserted) {
- // The original node may have already been assigned a recursive ID from
- // a different self-reference. Use that if possible.
- DistinctAttr recId = iter->second;
- if (!recId) {
- recId = DistinctAttr::create(UnitAttr::get(context));
- iter->second = recId;
- }
- unboundRecursiveSelfRefs.back().insert(recId);
- return cast<DINodeAttr>(recSelfCtor(recId));
- }
- }
-
- unboundRecursiveSelfRefs.emplace_back();
-
- auto guard = llvm::make_scope_exit([&]() {
- if (recSelfCtor)
- translationStack.pop_back();
+ // Register with the recursive translator. If it is seen before, return the
+ // result immediately.
+ if (DINodeAttr attr = recursionPruner->tryPrune(node))
+ return attr;
- // Copy unboundRecursiveSelfRefs down to the previous level.
- if (unboundRecursiveSelfRefs.size() == 1)
- assert(unboundRecursiveSelfRefs.back().empty() &&
- "internal error: unbound recursive self reference at top level.");
- else
- unboundRecursiveSelfRefs[unboundRecursiveSelfRefs.size() - 2].insert(
- unboundRecursiveSelfRefs.back().begin(),
- unboundRecursiveSelfRefs.back().end());
- unboundRecursiveSelfRefs.pop_back();
- });
+ auto guard = llvm::make_scope_exit([&]() { recursionPruner->finally(node); });
// Convert the debug metadata if possible.
auto translateNode = [this](llvm::DINode *node) -> DINodeAttr {
@@ -318,20 +551,12 @@ DINodeAttr DebugImporter::translate(llvm::DINode *node) {
return nullptr;
};
if (DINodeAttr attr = translateNode(node)) {
- // If this node was marked as recursive, set its recId.
- if (auto recType = dyn_cast<DIRecursiveTypeAttrInterface>(attr)) {
- if (DistinctAttr recId = translationStack.lookup(node)) {
- attr = cast<DINodeAttr>(recType.withRecId(recId));
- // Remove the unbound recursive ID from the set of unbound self
- // references in the translation stack.
- unboundRecursiveSelfRefs.back().erase(recId);
- }
- }
-
+ auto [result, isSelfContained] =
+ recursionPruner->finalizeTranslation(node, attr);
// Only cache fully self-contained nodes.
- if (unboundRecursiveSelfRefs.back().empty())
- nodeToAttr.try_emplace(node, attr);
- return attr;
+ if (isSelfContained)
+ nodeToAttr.try_emplace(node, result);
+ return result;
}
return nullptr;
}
@@ -394,13 +619,3 @@ DistinctAttr DebugImporter::getOrCreateDistinctID(llvm::DINode *node) {
id = DistinctAttr::create(UnitAttr::get(context));
return id;
}
-
-function_ref<DIRecursiveTypeAttrInterface(DistinctAttr)>
-DebugImporter::getRecSelfConstructor(llvm::DINode *node) {
- using CtorType = function_ref<DIRecursiveTypeAttrInterface(DistinctAttr)>;
- return TypeSwitch<llvm::DINode *, CtorType>(node)
- .Case([&](llvm::DICompositeType *concreteNode) {
- return CtorType(DICompositeTypeAttr::getRecSelf);
- })
- .Default(CtorType());
-}
diff --git a/mlir/lib/Target/LLVMIR/DebugImporter.h b/mlir/lib/Target/LLVMIR/DebugImporter.h
index bcf628fc4234fb..1344362ff8d749 100644
--- a/mlir/lib/Target/LLVMIR/DebugImporter.h
+++ b/mlir/lib/Target/LLVMIR/DebugImporter.h
@@ -27,10 +27,12 @@ class LLVMFuncOp;
namespace detail {
+class RecursionPruner;
+
class DebugImporter {
public:
- DebugImporter(ModuleOp mlirModule)
- : context(mlirModule.getContext()), mlirModule(mlirModule) {}
+ DebugImporter(ModuleOp mlirModule);
+ ~DebugImporter();
/// Translates the given LLVM debug location to an MLIR location.
Location translateLoc(llvm::DILocation *loc);
@@ -86,24 +88,14 @@ class DebugImporter {
/// for it, or create a new one if not.
DistinctAttr getOrCreateDistinctID(llvm::DINode *node);
- /// Get the `getRecSelf` constructor for the translated type of `node` if its
- /// translated DITypeAttr supports recursion. Otherwise, returns nullptr.
- function_ref<DIRecursiveTypeAttrInterface(DistinctAttr)>
- getRecSelfConstructor(llvm::DINode *node);
-
/// A mapping between LLVM debug metadata and the corresponding attribute.
DenseMap<llvm::DINode *, DINodeAttr> nodeToAttr;
/// A mapping between distinct LLVM debug metadata nodes and the corresponding
/// distinct id attribute.
DenseMap<llvm::DINode *, DistinctAttr> nodeToDistinctAttr;
- /// A stack that stores the metadata nodes that are being traversed. The stack
- /// is used to detect cyclic dependencies during the metadata translation.
- /// A node is pushed with a null value. If it is ever seen twice, it is given
- /// a recursive id attribute, indicating that it is a recursive node.
- llvm::MapVector<llvm::DINode *, DistinctAttr> translationStack;
- /// All the unbound recursive self references in the translation stack.
- SmallVector<DenseSet<DistinctAttr>> unboundRecursiveSelfRefs;
+ // Translation copilot for recursive types.
+ std::unique_ptr<RecursionPruner> recursionPruner;
MLIRContext *context;
ModuleOp mlirModule;
diff --git a/mlir/test/Target/LLVMIR/Import/debug-info.ll b/mlir/test/Target/LLVMIR/Import/debug-info.ll
index 959a5a1cd97176..b322840d852611 100644
--- a/mlir/test/Target/LLVMIR/Import/debug-info.ll
+++ b/mlir/test/Target/LLVMIR/Import/debug-info.ll
@@ -607,3 +607,44 @@ declare !dbg !1 void @declaration()
!0 = !{i32 2, !"Debug Info Version", i32 3}
!1 = !DISubprogram(name: "declaration", scope: !2, file: !2, flags: DIFlagPrototyped, spFlags: 0)
!2 = !DIFile(filename: "debug-info.ll", directory: "/")
+
+; // -----
+
+; Ensure that repeated occurence of recursive subtree does not result in duplicate MLIR entries.
+
+; CHECK-DAG: #[[B1_INNER:.+]] = #llvm.di_derived_type<{{.*}}name = "B:B1", baseType = #[[B_SELF:.+]]>
+; CHECK-DAG: #[[B2_INNER:.+]] = #llvm.di_derived_type<{{.*}}name = "B:B2", baseType = #[[B_SELF]]>
+; CHECK-DAG: #[[B_INNER:.+]] = #llvm.di_composite_type<{{.*}}recId = [[B_RECID:.+]], {{.*}}name = "B", {{.*}}elements = #[[B1_INNER]], #[[B2_INNER]]
+
+; CHECK-DAG: #[[B1_OUTER:.+]] = #llvm.di_derived_type<{{.*}}name = "B:B1", baseType = #[[B_INNER]]>
+; CHECK-DAG: #[[B2_OUTER:.+]] = #llvm.di_derived_type<{{.*}}name = "B:B2", baseType = #[[B_INNER]]>
+; CHECK-DAG: #[[A_OUTER:.+]] = #llvm.di_composite_type<{{.*}}recId = [[A_RECID:.+]], {{.*}}name = "A", {{.*}}elements = #[[B1_OUTER]], #[[B2_OUTER]]
+
+; CHECK-DAG: #[[A_SELF:.+]] = #llvm.di_composite_type<{{.*}}recId = [[A_RECID]]
+; CHECK-DAG: #[[B_SELF:.+]] = #llvm.di_composite_type<{{.*}}recId = [[B_RECID]]
+
+; CHECK: #llvm.di_subprogram<{{.*}}scope = #[[A_OUTER]]
+
+
+define void @class_field(ptr %arg1) !dbg !18 {
+ ret void
+}
+
+declare void @llvm.dbg.value(metadata, metadata, metadata)
+
+!llvm.dbg.cu = !{!1}
+!llvm.module.flags = !{!0}
+!0 = !{i32 2, !"Debug Info Version", i32 3}
+!1 = distinct !DICompileUnit(language: DW_LANG_C, file: !2)
+!2 = !DIFile(filename: "debug-info.ll", directory: "/")
+
+
+!3 = !DICompositeType(tag: DW_TAG_class_type, name: "A", file: !2, line: 42, flags: DIFlagTypePassByReference | DIFlagNonTrivial, elements: !4)
+!4 = !{!7, !8}
+
+!5 = !DICompositeType(tag: DW_TAG_class_type, name: "B", scope: !3, file: !2, line: 42, flags: DIFlagTypePassByReference | DIFlagNonTrivial, elements: !9)
+!7 = !DIDerivedType(tag: DW_TAG_member, name: "B:B1", file: !2, baseType: !5)
+!8 = !DIDerivedType(tag: DW_TAG_member, name: "B:B2", file: !2, baseType: !5)
+!9 = !{!7, !8}
+
+!18 = distinct !DISubp...
[truncated]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow this is quite a thing :). I believe I understand what is going on but I would not bet on it...
Added some nit comments. Feel free to apply what makes sense. I also had at least one question where I was not sure I follow (essentially why the attribute replace does not need to recurse in one place).
llvm::MapVector<llvm::DINode *, DistinctAttr> translationStack; | ||
/// All the unbound recursive self references in the translation stack. | ||
SmallVector<DenseSet<DistinctAttr>> unboundRecursiveSelfRefs; | ||
// Translation copilot for recursive types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Translation copilot for recursive types. | |
/// A translation helper for recursive types. |
I would suggest helper otherwise one may think there is some AI involved here :).
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would drop this redundant newline.
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
/// All the unbound recursive self references in the translation stack. | ||
SmallVector<DenseSet<DistinctAttr>> unboundRecursiveSelfRefs; | ||
// Translation copilot for recursive types. | ||
std::unique_ptr<RecursionPruner> recursionPruner; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be a member variable (without unique pointer)? Or was there a reason for the unique pointer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was just so I can keep the implementation hidden in the source file. If no one's actually against it I'm more than happy to move the class decl here (even inside this class) and make this a direct member.
Edit: OK I moved the class decl inside DebugImporter and made this a direct member.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah didn't get that. I prefer the nested class approach!
/// This internal cache is different from the cache maintained by the main | ||
/// translator because it may store nodes that are not self-contained (i.e. | ||
/// contain unbounded recursive self-references). | ||
class mlir::LLVM::detail::RecursionPruner { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: normally the namespaces are added as follows:
namespace mlir {
namespace LLVM {
namespace detail {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thanks for mentioning this, I wasn't quite sure about the style here and the only rule I found was this one for functions, which is not clear if it applied to classes that were forward-declared.
Though I realize this might not be an issue though if we go with including this full object inside DebugImporter (re: your other comment about using a pointer instead of just having the raw thing there), since that would require declaring the class in the header instead of here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right the inner class makes most sense here IMO since this is something that is really only used from within the Debug Importer.
return std::make_pair(entry.attr, | ||
llvm::DenseSet<DIRecursiveTypeAttrInterface>{}); | ||
|
||
mlir::AttrTypeReplacer replacer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mlir::AttrTypeReplacer replacer; | |
AttrTypeReplacer replacer; |
nit:
// A replacement may contain additional unbound self-refs. | ||
return std::make_pair(replacement, mlir::WalkResult::advance()); | ||
} | ||
return std::make_pair(attr, mlir::WalkResult::skip()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment why we can skip here could make sense? I would have expected than one has to recurse into nested attributes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely correct... This should definitely recurse. I added a test that would break unless this was recursed on, and fixed it here.
Unfortunately this means any previous runs with this branch was probably producing incorrect IR. So it might be good to check again with this fix (I don't think IR size should differ by too much, since we'll still be reusing IR when this happens). I think I'll also add a followup PR with a verifier for DI types. That way we have some way of verifying correctness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was just running and I get a strange crash during the replacement. Apparently it tries to get the value of a string attribute somewhere deep in the translation stack (and that attribute seems to be null...). It does not make sense to me at the moment. At the very least I will try to produce a reduced test case later today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the issue it is not related to your PR and requires some changes to the Attribute/Type replacer (I will follow up separately). The change did not have a significant impact on performance or file size. So it looks good from that perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you convert this to a non-recursive algorithm?
Recursions are ticking bombs (https://mlir.llvm.org/getting_started/DeveloperGuide/#style-guide )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joker-eph are you referring to AttrTypeReplacer? I think this sounds like a larger separate project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm referring to every single unbounded recursion in the compiler: this is in the dev guide "forever".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The debug import follows the same design of the corresponding export. While the usage of recursion is problematic and there should be plans to change it, doing so now is not feasible, I think. This PR resolves a substantial performance problem and we would rather not delay this fix.
Honestly, I'm not sure if anyone currently has capacity to look into turning these recursions into iterative implementations. Non-the-less, we should keep this in mind and try to allocate time to resolve these issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK turns out as part of fixing the issue @gysit raised, I had to replace this AttrTypeReplacer usage with a more manual replacer that also tracked recIds. So I made it iterative to avoid this issue (and I think something similar can be done to make AttrTypeReplacer iterative too).
As for the larger issue of the entire DebugImporter & exporter being recursion-based... I agree it's not great but also can be fixed separately. Now that AttrTypeReplacer is not used, this PR doesn't introduce any additional recursion besides what's already there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
|
||
/// Pop off a frame from the translation stack after a node is done being | ||
/// translated. | ||
void finally(llvm::DINode *node) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void finally(llvm::DINode *node) { | |
void popFromTranslationStack(llvm::DINode *node) { |
I would probably go for a slightly more verbose name here. The translation stack modification seems to be the main purpose?
// rely on it. | ||
// Only need to look at the caches at this level. | ||
uint64_t numRemaining = state.cacheSize; | ||
for (auto &cacheEntry : llvm::make_range(cache.rbegin(), cache.rend())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Would make_second_range work here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh good point, yes that would make sense 🙏
if (numRemaining == 0) | ||
break; | ||
|
||
if (auto refIter = cacheEntry.second.pendingReplacements.find(recSelf); | ||
refIter != cacheEntry.second.pendingReplacements.end()) { | ||
refIter->second = result; | ||
} | ||
|
||
--numRemaining; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (numRemaining == 0) | |
break; | |
if (auto refIter = cacheEntry.second.pendingReplacements.find(recSelf); | |
refIter != cacheEntry.second.pendingReplacements.end()) { | |
refIter->second = result; | |
} | |
--numRemaining; | |
if (numRemaining == 0) | |
break; | |
--numRemaining; | |
auto refIter = cacheEntry.second.pendingReplacements.find(recSelf); | |
if (refIter != cacheEntry.second.pendingReplacements.end()) | |
refIter->second = result; |
nit: pure personal preference feel free to ignore :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the change, LGTM!
I think we have to land #87475 (or some alternative solution) first though since things can go wrong otherwise (see explanation on that PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out there is also another issue. When I try to export I run in the following assert:
Assertion
inserted && "illegal reuse of recursive id"' failed`
I am working on a reproducer...
The following file crashes when exporting using The input llvm file is unfortunately not that nice. Try to get it into reasonable shape as well. |
Maybe contact me on discord (gysit) for a reduced.bc file? It seems I cannot really upload it here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The import now looks good to me. I like the simplification and both file size as well as performance look good on the larger example as well.
The only issue I see now is that we run into the "illegal reuse of recursive id" assert during the export of the more complex example. As discussed offline it may make sense to make the export more robust to handle this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for this effort! I see a more than a 10x IR size reduction to compared to head and the compilation times went down a lot as well. It would be nice to avoid the redundancies completely or have a better understanding of an upper bound for the redundancies we introduce. However, this can be done independent of this revision which is a significant improvement over the status quo.
#di_subprogram_inner = #llvm.di_subprogram<scope = #di_file, file = #di_file, subprogramFlags = Optimized, type = #di_subroutine_type_inner> | ||
#di_composite_type_inner = #llvm.di_composite_type<tag = DW_TAG_class_type, recId = distinct[0]<>, scope = #di_subprogram_inner> | ||
|
||
#di_subroutine_type = #llvm.di_subroutine_type<types = #di_composite_type_inner> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not relevant for this revision, but I wonder if #di_composite_type_inner
could be replaced by #di_composite_type_self
in a second traversal after the import. That should allow us to prune all redundancies and thanks the having the full translation stack seems potentially easier than during the import? It may also not be to bad performance-wise since the redundant subtrees anyways come from the cache if I understand correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes for sure. It'd be a lot easier to re-prune this whole thing afterwards. We can just adapt this function that I deleted into a separate util. I can also add a DINodeAttr verifier that can be run after import to catch things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think doing this together with verification in a post processing sounds like a good idea!
Thanks! Since it's been a while, let me rebase to rerun tests with latest main and then merge. |
dffb213
to
10cd9fc
Compare
10cd9fc
to
12c8ef2
Compare
(void)iter; | ||
(void)inserted; | ||
assert(inserted && "illegal reuse of recursive id"); | ||
recursiveTypeMap.try_emplace(recursiveId, placeholder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We found some leak sanitizer issue with this locally :(. It looks like it can happen that the temporary remains in the IR somehow. My suspicion is that we need to overwrite the entry here now that we allow nested occurrences of the same recursive type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
input.txt
Here is a reproducer of the leak (renamed to .txt
due to github restrictions). When we run this with valgrind -- mlir-translate --mlir-to-llvm input.mlir
, this shows that there is in fact a memory leak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oof 🤦 thanks for the test case. I think the emplace here is guaranteed to insert because the check above will return if the recId is found in the map. I see that this use-after-free is the temporary that was created, and then RAUW'ed, but referenced again. I'm afraid we might have just exposed an existing issue with the exporter? Let me take a closer look 👀 ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if my testing is correct, this fixes the memory leak: #88122. Apparently we were just caching temporary nodes, causing them to be reused after RAUW.
Followup to discussion #87295 (comment). The export cache should not cache temporary nodes.
Followup to discussion llvm#87295 (comment). The export cache should not cache temporary nodes.
Followup to discussion llvm#87295 (comment). The export cache should not cache temporary nodes.
Followup to this discussion: #80251 (comment).
The previous debug importer was correct but inefficient. For cases with mutual recursion that contain more than one back-edge, each back-edge would result in a new translated instance. This is because the previous implementation never caches any translated result with unbounded self-references. This means all translation inside a recursive context is performed from scratch, which will incur repeated run-time cost as well as repeated attribute sub-trees in the translated IR (differing only in their
recId
s).This PR refactors the importer to handle caching inside a recursive context.