[MLIR][LLVM] Support Recursive DITypes #80251

zyx-billy · 2024-02-01T07:02:54Z

Following the discussion from this thread, this PR adds support for recursive DITypes.

This PR adds:

DIRecursiveTypeAttrInterface: An interface that DITypeAttrs can implement to indicate that it supports recursion. See full description in code.
Importer & exporter support (The only DITypeAttr that implements the interface is DICompositeTypeAttr, so the exporter is only implemented for composites too. There will be two methods that each llvm DI type that supports mutation needs to implement since there's nothing general).

gysit

I looked only at the translation back to LLVM IR this time.

I think the translation makes sense. I added some minor comments that may or may not be comprehensible / correct given the time of the day here :).

It is definitely not super easy to follow the logic due to assumptions on the structure and the recursive single pass approach. Would it theoretically be possible to do this in two passes? A first pass that creates the place holders and a second pass that fixes all place holders? But yeah even if it works it is probably also non-trivial.

gysit · 2024-02-01T11:48:14Z

mlir/lib/Target/LLVMIR/DebugImporter.cpp

+        attr = DIRecursiveTypeAttr::get(context, id, typeAttr);
+
+        // Remove the unbound recursive attr.
+        AttrTypeReplacer replacer;


This looks like a left over?

Ok this probably a comment from last time...

oh 🤦 yes forgot to erase. Will cleanup.

gysit · 2024-02-28T18:17:05Z

mlir/test/Target/LLVMIR/llvmir-debug.mlir

+#di_subprogram_outer = #llvm.di_subprogram<
+  id = distinct[2]<>,
+  compileUnit = #di_compile_unit,
+  scope = #di_rec_struct,


Do I understand correctly that we cannot use #di_rec_self here? That would break the translation right?

yeah that's right. #di_rec_self can only occur inside #di_rec_struct. Since the rec_self is given meaning by the surrounding rec-type with the same id, it requires anyone traversing DI attrs to maintain a "context" dictionary, and it's illegal if it's possible to visit a rec_self without its id already in the context. If we ever write a legalization pass, this is what it'll check.

gysit · 2024-02-28T18:44:52Z

mlir/lib/Target/LLVMIR/DebugTranslation.cpp

+          attr.getAlignInBits(),
+          /*OffsetInBits=*/0,
+          /*Flags=*/static_cast<llvm::DINode::DIFlags>(attr.getFlags()),
+          /*Elements=*/nullptr, /*RuntimeLang=*/0, /*VTableHolder=*/nullptr);


I just fixed a problem similar to this today #83257. I wonder if this is problematic here as well and if it would make sense to use:
auto dummy = llvm::MDNode::getTemporary(ctx, std::nullopt);
instead of nullptr here. As metadata nodes are uniqued it could be possible that the later replace step actually replaces the elements of multiple composite types nodes. Although it looks like composite types usually are distinct. I guess that solves the problem... still better safe than sorry.

oh I see... so basically using a temporary avoids this from being uniqued with otherwise identical instances? Yeah sounds good, I'll do the same here 🙏

Yeah the temporary prevents uniquing essential. I don't think it is likely to ever hit that case, but if it is painful to debug. Note that the temporary's life time is probably limited to by the scope but that seems fine here.

gysit · 2024-02-28T18:59:11Z

mlir/lib/Target/LLVMIR/DebugTranslation.cpp

+  llvm::DIType *node =
+      TypeSwitch<DITypeAttr, llvm::DIType *>(attr.getBaseType())
+          .Case<DICompositeTypeAttr>([&](auto attr) {
+            return translateImpl(attr, setRecursivePlaceholderFn);


Did you consider to split the translateImpl in two function calls? E.g. one that computes the paceholder type without elements and one that adds the translated elements? The translateImpl could then use these two helper functions without the callback, while this function could call the two helper functions and in-between store the place holder on the stack. I hope this is understandable/makes sense I am starting to get tired...

It feels like this is a bit more direct than the extra setRecursivePlaceholderFn parameter.

haha yeah I was debating between these two. I don't have a strong preference (went with this one to avoid having two correlated functions, but I also don't like the callback...), we can go with the two-step approach.

zyx-billy · 2024-02-29T01:21:18Z

Thanks for taking a look at this! I've addressed all your code comments.

Re: two pass vs one pass, I think the main challenge is how to replace in the second pass. E.g. if we have a cycle A -> B -> C -> A. Then in the first pass, we create 4 nodes: A -> B -> C -> D (D is a copy of A with holes). But now if we want to replace D with A, we need C to be mutable, which makes it a requirement that all DITypes need to allow all operands to be mutable. Please feel free to correct my understanding here, but I feel like that's not the case today? Otherwise, we'd have to replace the entire B -> C -> D chain, which seems wasteful, and also runs into more complexity if we have nested recursion. So it seems simpler to just create the chain with A already at the end, and just mutate A instead.

gysit · 2024-02-29T06:39:59Z

Thanks for the changes! I will do another evening review this time I hopefully manage to look at the full revision!

Re: two pass vs one pass, I think the main challenge is how to replace in the second pass. E.g. if we have a cycle A -> B -> C -> A. Then in the first pass, we create 4 nodes: A -> B -> C -> D (D is a copy of A with holes). But now if we want to replace D with A, we need C to be mutable

I was thinking of stopping the traversal once a recursion is detected. So once we see the A attribute for the second time we would insert the placeholder node into C and stop the traversal. That means no update of C should be necessary. The second pass would then just replace the elements on all placeholder (incomplete nodes). However, the more I think about it the more I believe that this maybe more complex overall and probably also slower.

gysit

Thanks for moving this foward! I did another pass mostly adding nit comments.

After looking at the import again I wonder if we could use DIRecursiveTypeAttrOf for the composite type only to avoid some of the casting. I would probably also drop the nonTypeTranslationStack since it seems unused for now. Plus I am not really aware of a use case?

I wonder if there is another recursive type. Maybe subroutine type? If yes we could use that one to keep some of the deleted tests alive. Over time some of the checks / nullptr handling we introduced types can then be deleted. However, that will for sure be something for a follow up once we have more experience with this approach.

mlir/test/Target/LLVMIR/Import/import-failure.ll

gysit · 2024-02-29T18:14:28Z

mlir/include/mlir/Dialect/LLVMIR/LLVMAttrDefs.td

+def LLVM_DIRecursiveTypeAttr : LLVM_Attr<"DIRecursiveType", "di_recursive_type",
+                                    /*traits=*/[], "DITypeAttr"> {


ultra nit: Can you indent /traits=/ to match "DIRecursiveType".

mlir/include/mlir/Dialect/LLVMIR/LLVMAttrDefs.td

gysit · 2024-02-29T18:33:35Z

mlir/include/mlir/Dialect/LLVMIR/LLVMAttrDefs.td

+  }];
+
+  let parameters = (ins
+    "DistinctAttr":$id,


In other places, we use "DistinctAttr":$id to mark specific attributes as distinct in the LLVM sense. Here we use the id in a slightly different context. Do you know if a recursive composite type is always distinct? I guess so looking at the logic in DebugTranslation.cpp.

Ah yes the DistinctAttr here is only used as a globally unique ID with no other meaning. It should not matter if the contained type is distinct or not.

maybe it would make to name it recId or so? That way id can be clearly distinguished from the distinct id (which of course may be renamed as well).

mlir/lib/Dialect/LLVMIR/IR/LLVMAttrs.cpp

mlir/lib/Target/LLVMIR/DebugImporter.h

mlir/lib/Target/LLVMIR/DebugTranslation.h

gysit · 2024-02-29T19:53:47Z

mlir/lib/Target/LLVMIR/DebugTranslation.h

@@ -102,6 +116,14 @@ class DebugTranslation {
  /// metadata.
  DenseMap<Attribute, llvm::DINode *> attrToNode;

+  /// A mapping from DIRecursiveTypeAttr id to the translated DIType.


Suggested change

/// A mapping from DIRecursiveTypeAttr id to the translated DIType.

/// A mapping from DistinctAttr ID to the translated DIType.

gysit · 2024-02-29T19:56:11Z

mlir/lib/Target/LLVMIR/DebugTranslation.h

+  /// A mapping between distinct ID attr for DI nodes that require distinction
+  /// and the translate LLVM metadata node. This helps identify attrs that
+  /// should translate into the same LLVM debug node.


Suggested change

/// A mapping between distinct ID attr for DI nodes that require distinction

/// and the translate LLVM metadata node. This helps identify attrs that

/// should translate into the same LLVM debug node.

/// A mapping between DistinctAttr ID

/// and the translate distinct LLVM metadata node. This helps identify attrs that

/// should translate into the same LLVM debug node.

github-actions · 2024-03-01T06:05:44Z

✅ With the latest revision this PR passed the C/C++ code formatter.

zyx-billy · 2024-03-06T01:46:42Z

As I was writing recursive support for DICompositeTypeAttr, I realized it's not too difficult to just abstract out the recursive implementation (just in case we missed another DINode that needs recursive support). So with this latest refactor, instead of making DICompositeTypeAttr special, I pulled out the API needed to support recursion into an attr interface: DIRecursiveTypeAttrInterface. This means the importer & exporter can have generic code (targeting any attr that implements this interface), and we won't need to do a major rewrite if we find out about another DINode that needs recursion support.

Let me know what you think of this new design 🙏 . My hope is it removes the worst part about the previous design (which is a new DITypeAttr type that messed up the expected types everywhere and required manual casting), while also keeping the door open in case more types need to be supported.

gysit · 2024-03-06T07:02:57Z

As I was writing recursive support for DICompositeTypeAttr, I realized it's not too difficult to just abstract out the recursive implementation (just in case we missed another DINode that needs recursive support). So with this latest refactor, instead of making DICompositeTypeAttr special, I pulled out the API needed to support recursion into an attr interface: DIRecursiveTypeAttrInterface. This means the importer & exporter can have generic code (targeting any attr that implements this interface), and we won't need to do a major rewrite if we find out about another DINode that needs recursion support.

That looks like a very nice solution indeed!

FYI I have very limited time this week due to business travels. I think I will not get to do a full review before the weekend.

zyx-billy · 2024-03-07T00:59:15Z

Sounds good! Thank you for taking the time with this. I do like this revision a lot better 😂.

gysit

This change makes a lot of sense. I like the new revision a lot more!

I left a couple of nit comments please pick what makes sense and then it is time to promote the revision from draft to ready for the final review I guess :).

mlir/include/mlir/Dialect/LLVMIR/LLVMAttrDefs.td

mlir/include/mlir/Dialect/LLVMIR/LLVMInterfaces.td

mlir/lib/Target/LLVMIR/DebugImporter.cpp

mlir/lib/Target/LLVMIR/DebugImporter.h

gysit · 2024-03-10T15:12:05Z

mlir/lib/Target/LLVMIR/DebugTranslation.cpp

+void DebugTranslation::translateImplFillPlaceholder(
+    DICompositeTypeAttr attr, llvm::DICompositeType *placeholder) {
+  SmallVector<llvm::Metadata *> elements;
+  for (auto member : attr.getElements())


Suggested change

for (auto member : attr.getElements())

for (DINodeAttr member : attr.getElements())

nit: or whatever the type is...

mlir/lib/Target/LLVMIR/DebugTranslation.h

mlir/lib/Target/LLVMIR/DebugTranslation.cpp

Co-authored-by: Tobias Gysi <[email protected]>

gysit

Very nice!

LGTM modulo some optional nits.

Also note that the Windows build bot apparently does not like some of the C++ magic, which should be addressed before landing.

gysit · 2024-03-13T16:57:22Z

mlir/include/mlir/Dialect/LLVMIR/LLVMAttrs.h

@@ -52,9 +52,9 @@ class DILocalScopeAttr : public DIScopeAttr {
 };

 /// This class represents a LLVM attribute that describes a debug info type.
-class DITypeAttr : public DINodeAttr {
+class DITypeAttr : public DIScopeAttr {


This probably should either also be reverted to deriving from DINodeAttr or we decide to follow the LLVM dialect class hierarchy but then DIScopeAttr::classof needs to include all the types? I am fine with both solutions.

oh yes you're right. Let me revert this too.

mlir/include/mlir/Dialect/LLVMIR/LLVMInterfaces.td

mlir/lib/Target/LLVMIR/DebugImporter.cpp

mlir/lib/Target/LLVMIR/DebugTranslation.h

gysit · 2024-03-13T17:30:16Z

mlir/lib/Target/LLVMIR/DebugTranslation.h

+  /// DIType.
+  llvm::MapVector<DistinctAttr, llvm::DIType *> recursiveTypeMap;
+
+  /// A mapping between DistinctAttr ID and the translated LLVM metadata node.


Suggested change

/// A mapping between DistinctAttr ID and the translated LLVM metadata node.

/// A mapping between recursive ID and the translated LLVM metadata node.

ultra nit:

oh this one actually is the distinct attr itself of nodes that are distinct (this way we don't accidentally create two llvm nodes for the same one because now the same llvm node may appear both inside & outside a loop as two different mlir attrs).

Too many IDs :). Ignore my comment then! Or use distinct ID or so...

Co-authored-by: Tobias Gysi <[email protected]>

zyx-billy · 2024-03-13T21:23:18Z

Thank you @gysit for all the comments! I'll go fix the windows issue 🙏.

Dinistro · 2024-03-14T07:23:37Z

mlir/lib/Target/LLVMIR/DebugImporter.cpp

+DebugImporter::getRecSelfConstructor(llvm::DINode *node) {
+  using CtorType = function_ref<DIRecursiveTypeAttrInterface(DistinctAttr)>;
+  return TypeSwitch<llvm::DINode *, CtorType>(node)
+      .Case<llvm::DICompositeType>([&](auto *concreteNode) {


Suggested change

.Case<llvm::DICompositeType>([&](auto *concreteNode) {

.Case([&](llvm::DICompositeType *concreteNode) {

Wouldn't this make it a bit simpler? Maybe MSVC is somehow falling apart due to this. This might additionally also help in simplifying the inner template magic even further.

Yeah you're right. I was trying to see if I could get it to be generic, so we don't need duplicate cases just for each type that supports recursion, but maybe it's not worth it if each one is just one line 😂.

Fix bazel build after #80251.

gysit · 2024-03-19T13:58:30Z

@zyx-billy

We see some downstream breakage after bumping LLVM past your revision. It looks like DICompositeType can have a recursive scope parameter. Since the scope is translated as part of translatePlaceholder running the following example:

#di_composite_type = #llvm.di_composite_type<tag = DW_TAG_null, recId = distinct[0]<>>
#di_composite_type1 = #llvm.di_composite_type<tag = DW_TAG_null, recId = distinct[1]<>>
#di_file = #llvm.di_file<"test.mlir" in "/">
#di_derived_type = #llvm.di_derived_type<tag = DW_TAG_reference_type, baseType = #di_composite_type1, sizeInBits = 64>
#di_namespace = #llvm.di_namespace<name = "__1", exportSymbols = true>
#di_subroutine_type = #llvm.di_subroutine_type<types = #di_derived_type>
#di_subprogram = #llvm.di_subprogram<scope = #di_composite_type, name = "my_func", file = #di_file, line = 182, scopeLine = 182, subprogramFlags = Optimized, type = #di_subroutine_type>
#di_composite_type2 = #llvm.di_composite_type<tag = DW_TAG_class_type, recId = distinct[0]<>, name = "locale", file = #di_file, line = 124, scope = #di_namespace, flags = "TypePassByReference|NonTrivial", sizeInBits = 64, elements = #di_subprogram>
// This currently does not work since the scope tries to recursively translate
// the composite type again. We probably need to translate the scope late
// like the composite type elements.
#di_composite_type3 = #llvm.di_composite_type<tag = DW_TAG_class_type, recId = distinct[1]<>, name = "id", file = #di_file, line = 204, scope = #di_composite_type2, flags = "Public|TypePassByReference|NonTrivial", sizeInBits = 128>
#di_global_variable = #llvm.di_global_variable<scope = #di_namespace, name = "my_global", file = #di_file, line = 4641, type = #di_composite_type3, isDefined = true>
#di_global_variable_expression = #llvm.di_global_variable_expression<var = #di_global_variable, expr = <>>

llvm.mlir.global @global_variable() {dbg_expr = #di_global_variable_expression} : !llvm.struct<(struct<(i64)>, i32)>

produces an assert: Assertion iter != recursiveTypeMap.end() && "unbound DI recursive self type"`.

I believe solving that problem probably requires setting the scope to a placeholder value in translatePlaceholder and then replace it - similar to the elements - in fillPlaceholder once the recursive scope is avialable.

Any chance you can work on a forward fix for this? Or should we maybe revert until there is a fix?

zyx-billy · 2024-03-19T17:08:35Z

@gysit ah thanks for providing this test case! Let me try to come up with a fix asap today (I'm thinking it won't be too complicated), and if not by the end of my day feel free to revert first to unblock.

gysit · 2024-03-19T17:14:08Z

I'm thinking it won't be too complicated

I quickly looked and didn't find a nice mutation API on the llvm composite type... Then I went back to my actual tasks :).

if not by the end of my day feel free to revert first to unblock.

Let's hope it is quite easily fixable!

zyx-billy · 2024-03-19T19:34:30Z

OK I went back to the drawing board and found out that we can just create temporaries that support RAUW. Instead of adding more mutation to DICompositeType, perhaps this was the cleanest solution anyway (I think you might've mentioned something like this earlier?). Have the test case passing now: #85850.

…tes (#85850) Fixes this bug for the previous recursive DI type PR: #80251 (comment). Drawing inspiration from how clang uses DIBuilder to build forward decls, this PR changes how placeholders are created & updated. Instead of requiring each recursive DIType to do in-place mutation, we simply ask for a temporary node as the placeholder, and run RAUW at the end when the concrete node is translated. This has the side effect of simplifying what's needed to add recursion support for a type. Now only one additional method needs to be created for exporting. Concretely, for this PR, `translateImpl` for DICompositeType is back to the state it was before the previous PR, and the only net addition for DICompositeType is `translateTemporaryImpl`. --------- Co-authored-by: Tobias Gysi <[email protected]>

gysit · 2024-03-22T13:39:51Z

mlir/lib/Target/LLVMIR/DebugImporter.cpp

+    }
+
+    // Only cache fully self-contained nodes.
+    if (unboundRecursiveSelfRefs.back().empty())


@zyx-billy the import now works on all our tests but we found a efficiency issue. One of our test files grows by roughly factor 10x when importing and I see may versions of the same unbounded version of a composite type. I suspect this is the case nested recursive types. Then the same DICompositeType may be translated several times with different distinct identifiers since only once we left the outermost cycle we actually cache the type.

The solution to this seems not so trivial since caching a composite type is only really possible once we know there is no unbound type. Could we maybe have some sort of temporary cache during the traversal that keeps track of translated recursive types given a set of currently open unbounded references. That could possibly handle the case where in a cycle there are several entry nodes into a nested cycle...

Didn't have time to think about this in depth yet but you may have had thoughts when implementing this.

oh very interesting... Is it something like this

// LLVM +---> B ----+ | v A <-------- D | ^ +---> C ----+ // MLIR A -> B -> D -> A' +--> C -> D -> A'

So now in MLIR you have two copies of the subgraph D -> A', but they're different because we don't cache it?

I like your idea of a temporary cache that keys on the set of unbounded references. This way repeated subgraphs can be merged too. I can get started on something for this. Please let me know if this is not what you were looking for though 😂.

Oh I suppose in the above case, D also has to contain another cycle (perhaps D <-> E), this way D is not simply deduplicated by MLIR. So something like

// LLVM +---> B ----+ | v A <-------- D <--> E | ^ +---> C ----+ // MLIR A -> B -> D0 -> A' | + --> E0 --> D0' | +--> C -> D1 -> A' + --> E1 --> D1'

D0 & D1 cannot be cached because they contain an unbound A', but also are not identical because they have distinct recIds.

Yes I was thinking of the last example you posted with the nested cycles. I am not 100% sure if this is what we are seeing though our input is 1.6gb of ir...

What I tried though is the following hack:

if (DistinctAttr recId = translationStack.lookup(node)) { attr = cast<DINodeAttr>(recType.withRecId(recId)); // Remove the unbound recursive ID from the set of unbound self // references in the translation stack. unboundRecursiveSelfRefs.back().erase(recId); if (!unboundRecursiveSelfRefs.back().empty()) { DINodeAttr containedAttr = nullptr; for (auto &[unboundSelfRefs, attr] : unboundAttrCache[node]) { if (unboundSelfRefs == unboundRecursiveSelfRefs.back()) containedAttr = attr; } if (!containedAttr) unboundAttrCache[node].emplace_back(unboundRecursiveSelfRefs.back(), attr); if (containedAttr) { attr = containedAttr; } } } }

So essentially I cache previous instances of a recursive attribute declaration with the corresponding unbounded self refs and then always return an attribute from the cache... This solves the IR size problem but not the compilation time problem since the traversal is unchanged. I am not really sure the IR is correct but at least it imports and exports...

This is the minimal example I can come up with:

define void @class_field(ptr %arg1) !dbg !18 { ret void } declare void @llvm.dbg.value(metadata, metadata, metadata) !llvm.dbg.cu = !{!1} !llvm.module.flags = !{!0} !0 = !{i32 2, !"Debug Info Version", i32 3} !1 = distinct !DICompileUnit(language: DW_LANG_C, file: !2) !2 = !DIFile(filename: "debug-info.ll", directory: "/") !3 = !DICompositeType(tag: DW_TAG_class_type, name: "A", file: !2, line: 42, flags: DIFlagTypePassByReference | DIFlagNonTrivial, elements: !4) !4 = !{!7, !8} !5 = !DICompositeType(tag: DW_TAG_class_type, name: "B", scope: !3, file: !2, line: 42, flags: DIFlagTypePassByReference | DIFlagNonTrivial, elements: !9) !6 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !5, flags: DIFlagArtificial | DIFlagObjectPointer) !7 = !DIDerivedType(tag: DW_TAG_member, name: "B:B1", file: !2, baseType: !5) !8 = !DIDerivedType(tag: DW_TAG_member, name: "B:B2", file: !2, baseType: !5) !9 = !{!7, !8} !18 = distinct !DISubprogram(name: "A", scope: !3, file: !2, spFlags: DISPFlagDefinition, unit: !1)

When importing the B type declaration duplicates:

#di_composite_type3 = #llvm.di_composite_type<tag = DW_TAG_class_type, recId = distinct[1]<>, name = "B", file = #di_file, line = 42, scope = #di_composite_type, flags = "TypePassByReference|NonTrivial", elements = #di_derived_type, #di_derived_type1> #di_composite_type4 = #llvm.di_composite_type<tag = DW_TAG_class_type, recId = distinct[2]<>, name = "B", file = #di_file, line = 42, scope = #di_composite_type, flags = "TypePassByReference|NonTrivial", elements = #di_derived_type2, #di_derived_type3>

Maybe it also makes sense to key the cache on the translation stack since the unboundSelfRefs are only really known after traversing the subtree if I understand correctly. Anyways I hope the example helps to clarify the problem!

Oh thank you so much for the example. I see now that caching on the set of unbound self-refs doesn't solve the compile time problem... It probably does make sense to instead just create a new cache for each level of the translation stack like you said. Every time we put something that is potentially recursive onto the stack, we start a new cache. And one trick we can play is every time we finish a recursive type, we go into the current layer cache, replace all instances of its rec-self with the just-finished recursive type, and insert into the outer layer cache. This way translate never walks into a recursive type twice. Let me try to make something like this that's efficient 🤔

And one trick we can play is every time we finish a recursive type, we go into the current layer cache, replace all instances of its rec-self with the just-finished recursive type, and insert into the outer layer cache.

Ah right, with that we should be on the safe side.

@gysit I have a solution here that should work: #87295. For your example, it only creates one copy of the B type declaration.
In addition to caching every translated result, the unbound self-ref replacement is lazy so that it's only triggered on use. Interested to see how this works for your cases. Hoping that both compile time and result IR is smaller 🙏. I can open it up for review if it works, otherwise we can see where to further optimize. Thanks!

Very nice!

I see a massive reduction in size and it is comparable to llvm now:

-rw-r--r-- 1 tobiasg tobiasg 196553791 Apr 2 10:59 sparta-input.ll -rw-r--r-- 1 tobiasg tobiasg 162484418 Apr 2 10:47 sparta-with-fix.mlir -rw-r--r-- 1 tobiasg tobiasg 1817219645 Apr 2 10:59 sparta-without-fix.mlir

Additionally compilation time drops by factor 10x as well. I have a debug build only though. However, I expect this reproduce in release build.

So I think we should probably go ahead with your solution!

PS note that we do not import everything so the LLVM comparison is not entirely fair. However, being in the same ballpark is a very good sign. I tried to look at the IR to see if I see some inefficiencies but it is too large for easy conclusions. I am very optimistic we are fine now though!

oh awesome! Thanks for giving this a try so quickly. I'll open it up for review 🙏

…tes (llvm#85850) Fixes this bug for the previous recursive DI type PR: llvm#80251 (comment). Drawing inspiration from how clang uses DIBuilder to build forward decls, this PR changes how placeholders are created & updated. Instead of requiring each recursive DIType to do in-place mutation, we simply ask for a temporary node as the placeholder, and run RAUW at the end when the concrete node is translated. This has the side effect of simplifying what's needed to add recursion support for a type. Now only one additional method needs to be created for exporting. Concretely, for this PR, `translateImpl` for DICompositeType is back to the state it was before the previous PR, and the only net addition for DICompositeType is `translateTemporaryImpl`. --------- Co-authored-by: Tobias Gysi <[email protected]>

Followup to this discussion: #80251 (comment). The previous debug importer was correct but inefficient. For cases with mutual recursion that contain more than one back-edge, each back-edge would result in a new translated instance. This is because the previous implementation never caches any translated result with unbounded self-references. This means all translation inside a recursive context is performed from scratch, which will incur repeated run-time cost as well as repeated attribute sub-trees in the translated IR (differing only in their `recId`s). This PR refactors the importer to handle caching inside a recursive context. - In the presence of unbound self-refs, the translation result is cached in a separate cache that keeps track of the set of dependent unbound self-refs. - A dependent cache entry is valid only when all the unbound self-refs are in scope. Whenever a cached entry goes out of scope, it will be removed the next time it is looked up.

zyx-billy force-pushed the mlir/llvm/di_recursive_type branch 2 times, most recently from f2abfcd to dc53164 Compare February 1, 2024 07:48

zyx-billy force-pushed the mlir/llvm/di_recursive_type branch from 761a0dd to 0657c86 Compare February 27, 2024 23:05

gysit reviewed Feb 28, 2024

View reviewed changes

walter-erquinigo self-requested a review February 28, 2024 23:18

gysit reviewed Feb 29, 2024

View reviewed changes

zyx-billy force-pushed the mlir/llvm/di_recursive_type branch 3 times, most recently from 1481118 to ac05394 Compare March 6, 2024 01:45

zyx-billy force-pushed the mlir/llvm/di_recursive_type branch from ac05394 to 8691805 Compare March 6, 2024 01:51

zyx-billy requested a review from Mogball March 8, 2024 19:21

gysit reviewed Mar 10, 2024

View reviewed changes

zyx-billy and others added 12 commits March 12, 2024 23:09

generic recursive type with importer impl

b4de568

add exporter support for composite type

2632768

remove supported failure test

41f2064

cleanup

194236f

use two-step recursive translation hook

5a8c2ad

Apply suggestions from code review

a0841d3

Co-authored-by: Tobias Gysi <[email protected]>

addressed some local comments

87b1db1

rename id to recId

af017d0

Replace dedicated type with interface instead

1d85a02

Apply suggestions from code review

88250a7

Co-authored-by: Tobias Gysi <[email protected]>

apply more review suggestions

635a8f5

handle merge

67be64d

zyx-billy force-pushed the mlir/llvm/di_recursive_type branch from 4dcd1b9 to 67be64d Compare March 13, 2024 03:23

zyx-billy marked this pull request as ready for review March 13, 2024 03:24

gysit approved these changes Mar 13, 2024

View reviewed changes

zyx-billy and others added 4 commits March 13, 2024 11:58

Apply suggestions from code review

a9991ba

Co-authored-by: Tobias Gysi <[email protected]>

fix hierarchy

3ca8c05

nit fixes

168b4e1

attempt to fix catpure

f8309a8

Dinistro reviewed Mar 14, 2024

View reviewed changes

make concrete

e6caf31

zyx-billy mentioned this pull request Mar 15, 2024

[mlir] Add support for recursive elements in DICompositeTypeAttr. #74948

Draft

zyx-billy merged commit 1e8dad3 into llvm:main Mar 15, 2024

zyx-billy mentioned this pull request Mar 15, 2024

[MLIR][LLVM] Fix bazel build #85462

Merged

zyx-billy added a commit that referenced this pull request Mar 15, 2024

[MLIR][LLVM] Fix bazel build (#85462)

f0863a0

Fix bazel build after #80251.

zyx-billy mentioned this pull request Mar 19, 2024

[MLIR][LLVM] DI Recursive Type fix for recursion via scope of composites #85850

Merged

gysit reviewed Mar 22, 2024

View reviewed changes

zyx-billy mentioned this pull request Apr 1, 2024

[MLIR][LLVM] Recursion importer handle repeated self-references #87295

Merged

gysit mentioned this pull request Aug 14, 2024

[mlir][debug] Handle DIImportedEntity. #103055

Merged

		def LLVM_DIRecursiveTypeAttr : LLVM_Attr<"DIRecursiveType", "di_recursive_type",
		/traits=/[], "DITypeAttr"> {

	/// A mapping from DIRecursiveTypeAttr id to the translated DIType.
	/// A mapping from DistinctAttr ID to the translated DIType.

	for (auto member : attr.getElements())
	for (DINodeAttr member : attr.getElements())

	/// A mapping between DistinctAttr ID and the translated LLVM metadata node.
	/// A mapping between recursive ID and the translated LLVM metadata node.

	.Case<llvm::DICompositeType>([&](auto *concreteNode) {
	.Case([&](llvm::DICompositeType *concreteNode) {

[MLIR][LLVM] Support Recursive DITypes #80251

[MLIR][LLVM] Support Recursive DITypes #80251

Uh oh!

Conversation

zyx-billy commented Feb 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gysit left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zyx-billy commented Feb 29, 2024

Uh oh!

gysit commented Feb 29, 2024

Uh oh!

gysit left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Mar 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zyx-billy commented Mar 6, 2024

Uh oh!

gysit commented Mar 6, 2024

Uh oh!

zyx-billy commented Mar 7, 2024

Uh oh!

gysit left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

gysit left a comment

Choose a reason for hiding this comment

zyx-billy commented Feb 1, 2024 •

edited

Loading

github-actions bot commented Mar 1, 2024 •

edited

Loading