Skip to content

[BoundsSafety] Allow 'counted_by' attribute on pointers in structs in C #90786

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

delcypher
Copy link
Contributor

Previously the attribute was only allowed on flexible array members. This patch patch changes this to also allow the attribute on pointer fields in structs and also allows late parsing of the attribute in some contexts.

For example this previously wasn't allowed:

struct BufferTypeDeclAttributePosition {
  size_t count;
  char* buffer __counted_by(count); // Now allowed
}

Note the attribute is prevented on pointee types where the size isn't known at compile time. In particular pointee types that are:

  • Incomplete (e.g. void) and sizeless types
  • Function types (e.g. the pointee of a function pointer)
  • Struct types with a flexible array member

This patch also introduces late parsing of the attribute when used in the declaration attribute position. For example

struct BufferTypeDeclAttributePosition {
  char* buffer __counted_by(count); // Now allowed
  size_t count;
}

is now allowed but only when passing
-fexperimental-late-parse-attributes. The motivation for using late parsing here is to avoid breaking the data layout of structs in existing code that want to use the counted_by attribute. This patch is the first use of LateAttrParseExperimentalExt in Attr.td that was introduced in a previous patch.

Note by allowing the attribute on struct member pointers this now allows the possiblity of writing the attribute in the type attribute position. For example:

struct BufferTypeAttributePosition {
  size_t count;
  char *__counted_by(count) buffer; // Now allowed
}

However, the attribute in this position is still currently parsed immediately rather than late parsed. So this will not parse currently:

struct BufferTypeAttributePosition {
  char *__counted_by(count) buffer; // Fails to parse
  size_t count;
}

The intention is to lift this restriction in future patches. It has not been done in this patch to keep this size of this commit small.

There are also several other follow up changes that will need to be addressed in future patches:

  • Make late parsing working with anonymous structs (see on_pointer_anon_buf in attr-counted-by-late-parsed-struct-ptrs.c).
  • Allow counted_by on more subjects (e.g. parameters, returns types) when -fbounds-safety is enabled.
  • Make use of the attribute on pointer types in code gen (e.g. for _builtin_dynamic_object_size and UBSan's array-bounds checks).

This work is heavily based on a patch originally written by Yeoul Na.

rdar://125400257

@delcypher delcypher added the clang:bounds-safety Issue/PR relating to the experimental -fbounds-safety feature in Clang label May 1, 2024
@delcypher delcypher self-assigned this May 1, 2024
@delcypher delcypher requested a review from Endilll as a code owner May 1, 2024 22:01
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels May 1, 2024
@llvmbot
Copy link
Member

llvmbot commented May 1, 2024

@llvm/pr-subscribers-clang

Author: Dan Liew (delcypher)

Changes

Previously the attribute was only allowed on flexible array members. This patch patch changes this to also allow the attribute on pointer fields in structs and also allows late parsing of the attribute in some contexts.

For example this previously wasn't allowed:

struct BufferTypeDeclAttributePosition {
  size_t count;
  char* buffer __counted_by(count); // Now allowed
}

Note the attribute is prevented on pointee types where the size isn't known at compile time. In particular pointee types that are:

  • Incomplete (e.g. void) and sizeless types
  • Function types (e.g. the pointee of a function pointer)
  • Struct types with a flexible array member

This patch also introduces late parsing of the attribute when used in the declaration attribute position. For example

struct BufferTypeDeclAttributePosition {
  char* buffer __counted_by(count); // Now allowed
  size_t count;
}

is now allowed but only when passing
-fexperimental-late-parse-attributes. The motivation for using late parsing here is to avoid breaking the data layout of structs in existing code that want to use the counted_by attribute. This patch is the first use of LateAttrParseExperimentalExt in Attr.td that was introduced in a previous patch.

Note by allowing the attribute on struct member pointers this now allows the possiblity of writing the attribute in the type attribute position. For example:

struct BufferTypeAttributePosition {
  size_t count;
  char *__counted_by(count) buffer; // Now allowed
}

However, the attribute in this position is still currently parsed immediately rather than late parsed. So this will not parse currently:

struct BufferTypeAttributePosition {
  char *__counted_by(count) buffer; // Fails to parse
  size_t count;
}

The intention is to lift this restriction in future patches. It has not been done in this patch to keep this size of this commit small.

There are also several other follow up changes that will need to be addressed in future patches:

  • Make late parsing working with anonymous structs (see on_pointer_anon_buf in attr-counted-by-late-parsed-struct-ptrs.c).
  • Allow counted_by on more subjects (e.g. parameters, returns types) when -fbounds-safety is enabled.
  • Make use of the attribute on pointer types in code gen (e.g. for _builtin_dynamic_object_size and UBSan's array-bounds checks).

This work is heavily based on a patch originally written by Yeoul Na.

rdar://125400257


Patch is 38.84 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/90786.diff

18 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+18-1)
  • (modified) clang/include/clang/AST/Type.h (+1)
  • (modified) clang/include/clang/Basic/Attr.td (+2-1)
  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+13-2)
  • (modified) clang/include/clang/Parse/Parser.h (+6-1)
  • (modified) clang/include/clang/Sema/Sema.h (+2-1)
  • (modified) clang/lib/AST/Type.cpp (+12)
  • (modified) clang/lib/Parse/ParseDecl.cpp (+97-7)
  • (modified) clang/lib/Parse/ParseObjc.cpp (+6-4)
  • (modified) clang/lib/Sema/SemaDeclAttr.cpp (+56-15)
  • (modified) clang/lib/Sema/SemaType.cpp (+3-3)
  • (modified) clang/lib/Sema/TreeTransform.h (+1-1)
  • (added) clang/test/AST/attr-counted-by-late-parsed-struct-ptrs.c (+45)
  • (added) clang/test/Sema/attr-counted-by-late-parsed-off.c (+26)
  • (added) clang/test/Sema/attr-counted-by-late-parsed-struct-ptrs.c (+185)
  • (added) clang/test/Sema/attr-counted-by-struct-ptrs-sizeless-types.c (+17)
  • (added) clang/test/Sema/attr-counted-by-struct-ptrs.c (+163)
  • (modified) clang/test/Sema/attr-counted-by.c (+3-2)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 4c0fe5bcf6b122..81d0a89b294878 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -258,7 +258,8 @@ New Compiler Flags
 
 - ``-fexperimental-late-parse-attributes`` enables an experimental feature to
   allow late parsing certain attributes in specific contexts where they would
-  not normally be late parsed.
+  not normally be late parsed. Currently this allows late parsing the
+  `counted_by` attribute in C. See `Attribute Changes in Clang`_.
 
 Deprecated Compiler Flags
 -------------------------
@@ -335,6 +336,22 @@ Attribute Changes in Clang
 - Clang now warns that the ``exclude_from_explicit_instantiation`` attribute
   is ignored when applied to a local class or a member thereof.
 
+- The ``counted_by`` attribute can now be late parsed in C when
+  ``-fexperimental-late-parse-attributes`` is passed but only when attribute is
+  used in the declaration attribute position. This allows using the
+  attribute on existing code where it previously impossible to do so without
+  re-ordering struct field declarations would break ABI as shown below.
+
+  .. code-block:: c
+
+     struct BufferTy {
+       /* Refering to `count` requires late parsing */
+       char* buffer __counted_by(count);
+       /* Swapping `buffer` and `count` to avoid late parsing would break ABI */
+       size_t count;
+     };
+
+
 Improvements to Clang's diagnostics
 -----------------------------------
 - Clang now applies syntax highlighting to the code snippets it
diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h
index e6643469e0b334..e982a565413751 100644
--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -2515,6 +2515,7 @@ class alignas(TypeAlignment) Type : public ExtQualsTypeCommonBase {
   bool isRecordType() const;
   bool isClassType() const;
   bool isStructureType() const;
+  bool isStructureTypeWithFlexibleArrayMember() const;
   bool isObjCBoxableRecordType() const;
   bool isInterfaceType() const;
   bool isStructureOrClassType() const;
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index 0225598cbbe8ad..ef08d2a274a907 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -2229,7 +2229,8 @@ def TypeNullUnspecified : TypeAttr {
 def CountedBy : DeclOrTypeAttr {
   let Spellings = [Clang<"counted_by">];
   let Subjects = SubjectList<[Field], ErrorDiag>;
-  let Args = [ExprArgument<"Count">, IntArgument<"NestedLevel">];
+  let Args = [ExprArgument<"Count">, IntArgument<"NestedLevel", 1>];
+  let LateParsed = LateAttrParseExperimentalExt;
   let ParseArgumentsAsUnevaluated = 1;
   let Documentation = [CountedByDocs];
   let LangOpts = [COnly];
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index f72d5c252b863e..0ab2a96c7b464b 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6518,8 +6518,10 @@ def warn_superclass_variable_sized_type_not_at_end : Warning<
 
 def err_flexible_array_count_not_in_same_struct : Error<
   "'counted_by' field %0 isn't within the same struct as the flexible array">;
-def err_counted_by_attr_not_on_flexible_array_member : Error<
-  "'counted_by' only applies to C99 flexible array members">;
+def err_counted_by_attr_not_on_ptr_or_flexible_array_member : Error<
+  "'counted_by' only applies to pointers or C99 flexible array members">;
+def err_counted_by_attr_on_array_not_flexible_array_member : Error<
+  "'counted_by' on arrays only applies to C99 flexible array members">;
 def err_counted_by_attr_refer_to_itself : Error<
   "'counted_by' cannot refer to the flexible array member %0">;
 def err_counted_by_must_be_in_structure : Error<
@@ -6534,6 +6536,15 @@ def err_counted_by_attr_refer_to_union : Error<
   "'counted_by' argument cannot refer to a union member">;
 def note_flexible_array_counted_by_attr_field : Note<
   "field %0 declared here">;
+def err_counted_by_attr_pointee_unknown_size : Error<
+  "'counted_by' cannot be applied a pointer with pointee with unknown size "
+  "because %0 is %select{"
+    "an incomplete type|"  // CountedByInvalidPointeeTypeKind::INCOMPLETE
+    "a sizeless type|"     // CountedByInvalidPointeeTypeKind::SIZELESS
+    "a function type|"     // CountedByInvalidPointeeTypeKind::FUNCTION
+    // CountedByInvalidPointeeTypeKind::FLEXIBLE_ARRAY_MEMBER
+    "a struct type with a flexible array member"
+  "}1">;
 
 let CategoryName = "ARC Semantic Issue" in {
 
diff --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h
index 81aab8c888ab65..0f2ba17aadaf2d 100644
--- a/clang/include/clang/Parse/Parser.h
+++ b/clang/include/clang/Parse/Parser.h
@@ -1643,6 +1643,8 @@ class Parser : public CodeCompletionHandler {
                                bool EnterScope, bool OnDefinition);
   void ParseLexedAttribute(LateParsedAttribute &LA,
                            bool EnterScope, bool OnDefinition);
+  void ParseLexedCAttribute(LateParsedAttribute &LA,
+                            ParsedAttributes *OutAttrs = nullptr);
   void ParseLexedMethodDeclarations(ParsingClass &Class);
   void ParseLexedMethodDeclaration(LateParsedMethodDeclaration &LM);
   void ParseLexedMethodDefs(ParsingClass &Class);
@@ -2530,7 +2532,8 @@ class Parser : public CodeCompletionHandler {
 
   void ParseStructDeclaration(
       ParsingDeclSpec &DS,
-      llvm::function_ref<void(ParsingFieldDeclarator &)> FieldsCallback);
+      llvm::function_ref<Decl *(ParsingFieldDeclarator &)> FieldsCallback,
+      LateParsedAttrList *LateFieldAttrs = nullptr);
 
   DeclGroupPtrTy ParseTopLevelStmtDecl();
 
@@ -3107,6 +3110,8 @@ class Parser : public CodeCompletionHandler {
                                  SourceLocation ScopeLoc,
                                  ParsedAttr::Form Form);
 
+  void DistributeCLateParsedAttrs(Decl *Dcl, LateParsedAttrList *LateAttrs);
+
   void ParseBoundsAttribute(IdentifierInfo &AttrName,
                             SourceLocation AttrNameLoc, ParsedAttributes &Attrs,
                             IdentifierInfo *ScopeName, SourceLocation ScopeLoc,
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 1ca523ec88c2f9..f1e1be24f4fd2f 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -11509,7 +11509,8 @@ class Sema final : public SemaBase {
   QualType BuildMatrixType(QualType T, Expr *NumRows, Expr *NumColumns,
                            SourceLocation AttrLoc);
 
-  QualType BuildCountAttributedArrayType(QualType WrappedTy, Expr *CountExpr);
+  QualType BuildCountAttributedArrayOrPointerType(QualType WrappedTy,
+                                                  Expr *CountExpr);
 
   QualType BuildAddressSpaceAttr(QualType &T, LangAS ASIdx, Expr *AddrSpace,
                                  SourceLocation AttrLoc);
diff --git a/clang/lib/AST/Type.cpp b/clang/lib/AST/Type.cpp
index 68e81f45b4c28e..316d70e0b79ecf 100644
--- a/clang/lib/AST/Type.cpp
+++ b/clang/lib/AST/Type.cpp
@@ -631,6 +631,18 @@ bool Type::isStructureType() const {
   return false;
 }
 
+bool Type::isStructureTypeWithFlexibleArrayMember() const {
+  const auto *RT = getAs<RecordType>();
+  if (!RT)
+    return false;
+  const auto *Decl = RT->getDecl();
+  if (!Decl->isStruct())
+    return false;
+  if (!Decl->hasFlexibleArrayMember())
+    return false;
+  return true;
+}
+
 bool Type::isObjCBoxableRecordType() const {
   if (const auto *RT = getAs<RecordType>())
     return RT->getDecl()->hasAttr<ObjCBoxableAttr>();
diff --git a/clang/lib/Parse/ParseDecl.cpp b/clang/lib/Parse/ParseDecl.cpp
index 7431c256d2c135..470d26fb3d73be 100644
--- a/clang/lib/Parse/ParseDecl.cpp
+++ b/clang/lib/Parse/ParseDecl.cpp
@@ -3282,6 +3282,19 @@ void Parser::ParseAlignmentSpecifier(ParsedAttributes &Attrs,
   }
 }
 
+void Parser::DistributeCLateParsedAttrs(Decl *Dcl,
+                                        LateParsedAttrList *LateAttrs) {
+  assert(Dcl);
+
+  if (!LateAttrs)
+    return;
+
+  for (auto *LateAttr : *LateAttrs) {
+    if (LateAttr->Decls.empty())
+      LateAttr->addDecl(Dcl);
+  }
+}
+
 /// Bounds attributes (e.g., counted_by):
 ///   AttrName '(' expression ')'
 void Parser::ParseBoundsAttribute(IdentifierInfo &AttrName,
@@ -4815,13 +4828,14 @@ static void DiagnoseCountAttributedTypeInUnnamedAnon(ParsingDeclSpec &DS,
 ///
 void Parser::ParseStructDeclaration(
     ParsingDeclSpec &DS,
-    llvm::function_ref<void(ParsingFieldDeclarator &)> FieldsCallback) {
+    llvm::function_ref<Decl *(ParsingFieldDeclarator &)> FieldsCallback,
+    LateParsedAttrList *LateFieldAttrs) {
 
   if (Tok.is(tok::kw___extension__)) {
     // __extension__ silences extension warnings in the subexpression.
     ExtensionRAIIObject O(Diags);  // Use RAII to do this.
     ConsumeToken();
-    return ParseStructDeclaration(DS, FieldsCallback);
+    return ParseStructDeclaration(DS, FieldsCallback, LateFieldAttrs);
   }
 
   // Parse leading attributes.
@@ -4886,10 +4900,12 @@ void Parser::ParseStructDeclaration(
     }
 
     // If attributes exist after the declarator, parse them.
-    MaybeParseGNUAttributes(DeclaratorInfo.D);
+    MaybeParseGNUAttributes(DeclaratorInfo.D, LateFieldAttrs);
 
     // We're done with this declarator;  invoke the callback.
-    FieldsCallback(DeclaratorInfo);
+    Decl *Field = FieldsCallback(DeclaratorInfo);
+    if (Field)
+      DistributeCLateParsedAttrs(Field, LateFieldAttrs);
 
     // If we don't have a comma, it is either the end of the list (a ';')
     // or an error, bail out.
@@ -4900,6 +4916,69 @@ void Parser::ParseStructDeclaration(
   }
 }
 
+/// Finish parsing an attribute for which parsing was delayed.
+/// This will be called at the end of parsing a class declaration
+/// for each LateParsedAttribute. We consume the saved tokens and
+/// create an attribute with the arguments filled in. We add this
+/// to the Attribute list for the decl.
+void Parser::ParseLexedCAttribute(LateParsedAttribute &LA,
+                                  ParsedAttributes *OutAttrs) {
+  // Create a fake EOF so that attribute parsing won't go off the end of the
+  // attribute.
+  Token AttrEnd;
+  AttrEnd.startToken();
+  AttrEnd.setKind(tok::eof);
+  AttrEnd.setLocation(Tok.getLocation());
+  AttrEnd.setEofData(LA.Toks.data());
+  LA.Toks.push_back(AttrEnd);
+
+  // Append the current token at the end of the new token stream so that it
+  // doesn't get lost.
+  LA.Toks.push_back(Tok);
+  PP.EnterTokenStream(LA.Toks, /*DisableMacroExpansion=*/true,
+                      /*IsReinject=*/true);
+  // Drop the current token and bring the first cached one. It's the same token
+  // as when we entered this function.
+  ConsumeAnyToken(/*ConsumeCodeCompletionTok=*/true);
+
+  ParsedAttributes Attrs(AttrFactory);
+
+  assert(LA.Decls.size() <= 1 &&
+         "late field attribute expects to have at most one declaration.");
+
+  // Dispatch based on the attribute and parse it
+  const AttributeCommonInfo::Form ParsedForm = ParsedAttr::Form::GNU();
+  IdentifierInfo *ScopeName = nullptr;
+  const ParsedAttr::Kind AttrKind =
+      ParsedAttr::getParsedKind(&LA.AttrName, /*ScopeName=*/ScopeName,
+                                /*SyntaxUsed=*/ParsedForm.getSyntax());
+  switch (AttrKind) {
+  case ParsedAttr::Kind::AT_CountedBy:
+    ParseBoundsAttribute(LA.AttrName, LA.AttrNameLoc, Attrs,
+                         /*ScopeName=*/ScopeName, SourceLocation(),
+                         /*Form=*/ParsedForm);
+    break;
+  default:
+    llvm_unreachable("Unhandled late parsed attribute");
+  }
+
+  for (auto *D : LA.Decls)
+    Actions.ActOnFinishDelayedAttribute(getCurScope(), D, Attrs);
+
+  // Due to a parsing error, we either went over the cached tokens or
+  // there are still cached tokens left, so we skip the leftover tokens.
+  while (Tok.isNot(tok::eof))
+    ConsumeAnyToken();
+
+  // Consume the fake EOF token if it's there
+  if (Tok.is(tok::eof) && Tok.getEofData() == AttrEnd.getEofData())
+    ConsumeAnyToken();
+
+  if (OutAttrs) {
+    OutAttrs->takeAllFrom(Attrs);
+  }
+}
+
 /// ParseStructUnionBody
 ///       struct-contents:
 ///         struct-declaration-list
@@ -4923,6 +5002,11 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,
   ParseScope StructScope(this, Scope::ClassScope|Scope::DeclScope);
   Actions.ActOnTagStartDefinition(getCurScope(), TagDecl);
 
+  // `LateAttrParseExperimentalExtOnly=true` requests that only attributes
+  // marked with `LateAttrParseExperimentalExt` are late parsed.
+  LateParsedAttrList LateFieldAttrs(/*PSoon=*/false,
+                                    /*LateAttrParseExperimentalExtOnly=*/true);
+
   // While we still have something to read, read the declarations in the struct.
   while (!tryParseMisplacedModuleImport() && Tok.isNot(tok::r_brace) &&
          Tok.isNot(tok::eof)) {
@@ -4973,18 +5057,19 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,
     }
 
     if (!Tok.is(tok::at)) {
-      auto CFieldCallback = [&](ParsingFieldDeclarator &FD) {
+      auto CFieldCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
         // Install the declarator into the current TagDecl.
         Decl *Field =
             Actions.ActOnField(getCurScope(), TagDecl,
                                FD.D.getDeclSpec().getSourceRange().getBegin(),
                                FD.D, FD.BitfieldSize);
         FD.complete(Field);
+        return Field;
       };
 
       // Parse all the comma separated declarators.
       ParsingDeclSpec DS(*this);
-      ParseStructDeclaration(DS, CFieldCallback);
+      ParseStructDeclaration(DS, CFieldCallback, &LateFieldAttrs);
     } else { // Handle @defs
       ConsumeToken();
       if (!Tok.isObjCAtKeyword(tok::objc_defs)) {
@@ -5025,7 +5110,12 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,
 
   ParsedAttributes attrs(AttrFactory);
   // If attributes exist after struct contents, parse them.
-  MaybeParseGNUAttributes(attrs);
+  MaybeParseGNUAttributes(attrs, &LateFieldAttrs);
+
+  // Late parse field attributes if necessary.
+  assert(!getLangOpts().CPlusPlus);
+  for (auto *LateAttr : LateFieldAttrs)
+    ParseLexedCAttribute(*LateAttr);
 
   SmallVector<Decl *, 32> FieldDecls(TagDecl->fields());
 
diff --git a/clang/lib/Parse/ParseObjc.cpp b/clang/lib/Parse/ParseObjc.cpp
index 8e54fe012c55d7..2eaeff38a24a6f 100644
--- a/clang/lib/Parse/ParseObjc.cpp
+++ b/clang/lib/Parse/ParseObjc.cpp
@@ -778,16 +778,16 @@ void Parser::ParseObjCInterfaceDeclList(tok::ObjCKeywordKind contextKey,
       }
 
       bool addedToDeclSpec = false;
-      auto ObjCPropertyCallback = [&](ParsingFieldDeclarator &FD) {
+      auto ObjCPropertyCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
         if (FD.D.getIdentifier() == nullptr) {
           Diag(AtLoc, diag::err_objc_property_requires_field_name)
               << FD.D.getSourceRange();
-          return;
+          return nullptr;
         }
         if (FD.BitfieldSize) {
           Diag(AtLoc, diag::err_objc_property_bitfield)
               << FD.D.getSourceRange();
-          return;
+          return nullptr;
         }
 
         // Map a nullability property attribute to a context-sensitive keyword
@@ -816,6 +816,7 @@ void Parser::ParseObjCInterfaceDeclList(tok::ObjCKeywordKind contextKey,
             MethodImplKind);
 
         FD.complete(Property);
+        return Property;
       };
 
       // Parse all the comma separated declarators.
@@ -2024,7 +2025,7 @@ void Parser::ParseObjCClassInstanceVariables(ObjCContainerDecl *interfaceDecl,
       continue;
     }
 
-    auto ObjCIvarCallback = [&](ParsingFieldDeclarator &FD) {
+    auto ObjCIvarCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
       assert(getObjCDeclContext() == interfaceDecl &&
              "Ivar should have interfaceDecl as its decl context");
       // Install the declarator into the interface decl.
@@ -2035,6 +2036,7 @@ void Parser::ParseObjCClassInstanceVariables(ObjCContainerDecl *interfaceDecl,
       if (Field)
         AllIvarDecls.push_back(Field);
       FD.complete(Field);
+      return Field;
     };
 
     // Parse all the comma separated declarators.
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 363ae93cb62df1..7ae1e09c7a5dcd 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -8588,31 +8588,71 @@ static const RecordDecl *GetEnclosingNamedOrTopAnonRecord(const FieldDecl *FD) {
   return RD;
 }
 
-static bool
-CheckCountExpr(Sema &S, FieldDecl *FD, Expr *E,
-               llvm::SmallVectorImpl<TypeCoupledDeclRefInfo> &Decls) {
+enum class CountedByInvalidPointeeTypeKind {
+  INCOMPLETE,
+  SIZELESS,
+  FUNCTION,
+  FLEXIBLE_ARRAY_MEMBER,
+  VALID,
+};
+
+static bool CheckCountedByAttrOnField(
+    Sema &S, FieldDecl *FD, Expr *E,
+    llvm::SmallVectorImpl<TypeCoupledDeclRefInfo> &Decls) {
+  // Check the context the attribute is used in
+
   if (FD->getParent()->isUnion()) {
     S.Diag(FD->getBeginLoc(), diag::err_counted_by_attr_in_union)
         << FD->getSourceRange();
     return true;
   }
 
-  if (!E->getType()->isIntegerType() || E->getType()->isBooleanType()) {
-    S.Diag(E->getBeginLoc(), diag::err_counted_by_attr_argument_not_integer)
-        << E->getSourceRange();
+  const auto FieldTy = FD->getType();
+  if (!FieldTy->isArrayType() && !FieldTy->isPointerType()) {
+    S.Diag(FD->getBeginLoc(),
+           diag::err_counted_by_attr_not_on_ptr_or_flexible_array_member)
+        << FD->getLocation();
     return true;
   }
 
   LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
       LangOptions::StrictFlexArraysLevelKind::IncompleteOnly;
-
-  if (!Decl::isFlexibleArrayMemberLike(S.getASTContext(), FD, FD->getType(),
+  if (FieldTy->isArrayType() &&
+      !Decl::isFlexibleArrayMemberLike(S.getASTContext(), FD, FieldTy,
                                        StrictFlexArraysLevel, true)) {
-    // The "counted_by" attribute must be on a flexible array member.
-    SourceRange SR = FD->getLocation();
-    S.Diag(SR.getBegin(),
-           diag::err_counted_by_attr_not_on_flexible_array_member)
-        << SR;
+    S.Diag(FD->getBeginLoc(),
+           diag::err_counted_by_attr_on_array_not_flexible_array_member)
+        << FD->getLocation();
+    return true;
+  }
+
+  if (FieldTy->isPointerType()) {
+    CountedByInvalidPointeeTypeKind InvalidTypeKind =
+        CountedByInvalidPointeeTypeKind::VALID;
+    const auto PointeeTy = FieldTy->getPointeeType();
+    if (PointeeTy->isIncompleteType()) {
+      InvalidTypeKind = CountedByInvalidPointeeTypeKind::INCOMPLETE;
+    } else if (PointeeTy->isSizelessType()) {
+      InvalidTypeKind = CountedByInvalidPointeeTypeKind::SIZELESS;
+    } else if (PointeeTy->isFunctionType()) {
+      InvalidTypeKind = CountedByInvalidPointeeTypeKind::FUNCTION;
+    } else if (PointeeTy->isStructureTypeWithFlexibleArrayMember()) {
+      InvalidTypeKind = CountedByInvalidPointeeTypeKind::FLEXIBLE_ARRAY_MEMBER;
+    }
+
+    if (InvalidTypeKind != CountedByInvalidPointeeTypeKind::VALID) {
+      S.Diag(FD->getBeginLoc(), diag::err_counted_by_attr_pointee_unknown_size)
+          << FieldTy->getPointeeType() << (int)InvalidTypeKind
+          << FD->getSourceRange();
+      return true;
+    }
+  }
+
+  // Check the expression
+
+  if (!E->getType()->isIntegerType() || E->getType()->isBooleanType()) {
+    S.Diag(E->getBeginLoc(), diag::err_counted_by_attr_argument_not_integer)
+        << E->getSourceRange();
     return true;
   }
 
@@ -8675,10 +8715,11 @@ static void handleCountedByAttrField(Sema &S, Decl *D, const ParsedAttr &AL) {
     return;
 
   llvm::SmallVector<TypeCoupledDeclRefInfo, 1> Decls;
-  if (CheckCountExpr(S, FD, CountExpr, Decls))
+  if (CheckCountedByAttrOnField(S, FD, CountExpr, Decls))
     return;
 
-  QualType CAT = S.BuildCountAttributedArrayType(FD->getType(), CountExpr);
+  QualType CAT =
+      S.BuildCountAttributedArrayOrPointerType(FD->getType(), CountExpr);
   FD->setType(CAT);
 }
 
diff --git a/clang/lib/Sema/SemaType.cpp b/clang/lib/Sema/SemaType.cpp
index fddc3545ecb61c..29e9700959cf51 100644
--- a/clang/lib/Sema/SemaType.cpp
+++ b/clang/lib/Sema/SemaType.cpp
@@ -9790,9 +9790,9 @@ BuildTypeCoupledDecls(Expr *E,
   De...
[truncated]

Copy link
Contributor

@Endilll Endilll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sema.h changes look good to me.

return true;
}

if (FieldTy->isPointerType()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be limited to pointers? Seems to me that it could apply to arrays also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Flexible array member

struct Has_VLA {
    int count;
    char buffer[] __counted_by(count);
};

struct Test {
    int count;
    struct Has_VLA Arr[] __counted_by(count);
};

There's no diagnostic here currently. I think the code above is something we should disallow (at least for -fbounds-safety). I'll adjust the patch to disallow this.

@isanbard Is the above code pattern something you actually want to support? In -fbounds-safety we disallow it because it means computing the bounds of Arr is not constant time because the size of Has_VLA is not a compile time constant. To compute the bounds of Arr at runtime it would require walking every Has_VLA object in Arr to get its count field. That seems expensive and is also at risk of running into a race condition (i.e. the size of a VLA changes while the bounds are being computed).

Incomplete type

struct Test {
    int count;
    void Arr[] __counted_by(count);
};

We already have these diagnostics:

tmp/arr_sizeless_ty.c:5:13: error: array has incomplete element type 'void'
    5 |     void Arr[] __counted_by(count);
      |             ^
tmp/arr_sizeless_ty.c:5:5: error: 'counted_by' only applies to pointers or C99 flexible array members
    5 |     void Arr[] __counted_by(count);
      |     ^    ~~~
2 errors generated.

I don't think we need to add another.

Sizeless types

struct Test {
    int count;
    __SVInt8_t Arr[] __counted_by(count);
};

We already have these diagnostics:

tmp/arr_sizeless_ty.c:5:19: error: array has sizeless element type '__SVInt8_t'
    5 |     __SVInt8_t Arr[] __counted_by(count);
      |                   ^
tmp/arr_sizeless_ty.c:5:5: error: 'counted_by' only applies to pointers or C99 flexible array members
    5 |     __SVInt8_t Arr[] __counted_by(count);
      |     ^          ~~~
2 errors generated.

I don't think we need to add another.

Function Type

typedef void(foo_fn)(int);

struct Test {
    int count;
    foo_fn Arr[] __counted_by(count);
};

We already have these diagnostics:

tmp/arr_sizeless_ty.c:7:15: error: 'Arr' declared as array of functions of type 'foo_fn' (aka 'void (int)')
    7 |     foo_fn Arr[] __counted_by(count);
      |               ^
tmp/arr_sizeless_ty.c:7:5: error: 'counted_by' only applies to pointers or C99 flexible array members
    7 |     foo_fn Arr[] __counted_by(count);
      |     ^      ~~~
2 errors generated.

I don't think we need to add another.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isanbard Is the above code pattern something you actually want to support? In -fbounds-safety we disallow it because it means computing the bounds of Arr is not constant time because the size of Has_VLA is not a compile time constant. To compute the bounds of Arr at runtime it would require walking every Has_VLA object in Arr to get its count field. That seems expensive and is also at risk of running into a race condition (i.e. the size of a VLA changes while the bounds are being computed).

Is it even valid C? I suppose it could be valid if struct Has_VLA wasn't allocated with extra bytes, i.e. count is 0 and buffer has no size.

But that's a horrible use case and I think it's fine to flag it as an error.

@bwendling bwendling requested a review from kees May 3, 2024 18:37
@@ -335,6 +336,22 @@ Attribute Changes in Clang
- Clang now warns that the ``exclude_from_explicit_instantiation`` attribute
is ignored when applied to a local class or a member thereof.

- The ``counted_by`` attribute can now be late parsed in C when
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should mention that counted_by was extended to work with pointers in structs first. Otherwise, "last parsing" of the attribute doesn't make a lot of sense since a flexible array member is at the end of a struct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can mention it but I'm wondering if it should be in the release notes at all because this patch only makes parsing work. I've not done anything on the codegen side to make use of the attribute on pointers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it doesn't have a use yet, maybe you don't need to add it in the release notes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So to add some context here. The original reason there was a release note is because in #88596 (which this patch builds upon) @AaronBallman asked for there to be a release note about the new -fexperimental-late-parse-attributes flag. This patch means -fexperimental-late-parse-attributes actually does something now (before it did nothing) so I updated the release notes on that flag and it "felt" like it made sense to explain in the release notes the change on the attributes in relation to the flag.

I don't have strong opinions here. I'm ok to drop the changes to the release notes or keep them or something in between. @AaronBallman any opinions here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion either. :-) It's fine to keep the blurb in.

"because %0 is %select{"
"an incomplete type|" // CountedByInvalidPointeeTypeKind::INCOMPLETE
"a sizeless type|" // CountedByInvalidPointeeTypeKind::SIZELESS
"a function type|" // CountedByInvalidPointeeTypeKind::FUNCTION
Copy link
Collaborator

@bwendling bwendling May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused why a function type is excluded. Isn't it just a pointer? @kees can correct me, but I think the point of counted_by on a pointer is that it could be a list of pointers, and we don't want to allow someone to access beyond that list:

struct s {
  int count;
  int *ptr __counted_by(count);
};

struct s *alloc(size_t num_elems) {
  struct s *p = malloc(sizeof(struct s));

  p->count = num_elems;
  p->ptr = calloc(num_elems, sizeof(int));
  return p;
}

If that's the case, then any pointer should be okay.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bwendling I was also confused by this initially. A "function type" and "function pointer type" are different. A function type doesn't have a size so we have to forbid it, but a function pointer does have a size (it's the size of a pointer).

Hopefully these examples illustrate the problem:

#define __counted_by(f)  __attribute__((counted_by(f)))

typedef void(fn_ty)(int);
typedef void(*fn_ptr_ty)(int);

struct a {
    int count;
    // Buffer of functions is invalid. A "function" has no size
    fn_ty* buffer __counted_by(count);
    fn_ptr_ty buffer2 __counted_by(count);
};

struct b {
    int count;
    // Valid: A buffer of function pointers is allowed
    fn_ty** buffer __counted_by(count);
    fn_ptr_ty* buffer2 __counted_by(count);
};

A similar thing exists for arrays. If you write this

struct c {
    fn_ty Arr[];
};

this produces the following error:

error: 'Arr' declared as array of functions of type 'fn_ty' (aka 'void (int)')
   22 |     fn_ty Arr[];
      |              ^

However these two struct definitions are allowed by clang:

struct d {
    fn_ty* Arr[];
};

struct e {
    fn_ptr_ty Arr2[];
};

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A "function type" and "function pointer type" are different.

Ah! my mistake. :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note: Given the above I expected pointer arithmetic on a function pointer to be forbidden but it's not...

typedef void(fn_ty)(int);
typedef void(*fn_ptr_ty)(int);

void test(fn_ty* fn) {
    fn(5); // ok
    // WHAT!?: How can pointer arithmetic be possible when the size of
    // the function isn't a defined thing...
    //
    // Turns out clang codegens this as if `sizeof(*fn) == 1`
    _Static_assert(sizeof(*fn) == 1, "oh boy...we're in for a wild ride");
    ++fn;
    // Let's jump somewhere into the function that's 1 byte along from the start.
    // what could possibly go wrong!?
    fn(5); // CRASH
    return;
}

It's fine to do pointer arithmetic on an array of function pointers though

typedef void(fn_ty)(int);
typedef void(*fn_ptr_ty)(int);
void do_something(int i) { printf("hello %d\n", i);}
void do_something2(int i) { printf("hello2 %d\n", i);}

fn_ptr_ty Funcs[] = { do_something, do_something2};

void test2(fn_ty** fn) {
    // fn == do_something
    (*fn)(5);
    ++fn;
    // Now fn == do_something2
    (*fn)(5);
    return;
}

int main(void) {
    test2(Funcs);
    return 0;
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that as long as it's a pointer we're okay, at least w.r.t. the counted_by attribute. We know the size of pointers at least. :-)

return true;
}

if (FieldTy->isPointerType()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isanbard Is the above code pattern something you actually want to support? In -fbounds-safety we disallow it because it means computing the bounds of Arr is not constant time because the size of Has_VLA is not a compile time constant. To compute the bounds of Arr at runtime it would require walking every Has_VLA object in Arr to get its count field. That seems expensive and is also at risk of running into a race condition (i.e. the size of a VLA changes while the bounds are being computed).

Is it even valid C? I suppose it could be valid if struct Has_VLA wasn't allocated with extra bytes, i.e. count is 0 and buffer has no size.

But that's a horrible use case and I think it's fine to flag it as an error.

@delcypher delcypher force-pushed the users/delcypher/count-on-pointers-and-late-parsing branch from a89cca7 to 9ec3203 Compare May 3, 2024 19:28
@bwendling
Copy link
Collaborator

Note the attribute is prevented on pointee types where the size isn't known at compile time. In particular pointee types that are:

  • Incomplete (e.g. void) and sizeless types
  • Function types (e.g. the pointee of a function pointer)
  • Struct types with a flexible array member

I've been thinking about this restriction. Why is this necessary? My assumption was that applying counted_by to a pointer causes a bounds check on an index into the pointer rather than its underlying type. So something like:

struct foo;
struct bar {
  int a;
  int fam[] __counted_by(a);
};

struct x {
    void *p __counted_by(count);       // void * is treated like char *, I think.
    struct foo *f __counted_by(count); // sizeof(f) is the size of a general pointer.
    struct bar *b __counted_by(count); // a list of pointers to 'struct bar's should be okay.
    int count;
};

@apple-fcloutier
Copy link
Contributor

I think that there's room to allow __counted_by on incomplete types so that a TU where it's complete could use it (and we have use cases where that would be handy), but our implementation doesn't support it at this time. This can be added without disruptions at a later time. FWIW, the point of bounds-checking an index is moot at least while the type is incomplete, since it's an error to dereference a pointer to an incomplete type. Aside from indexing, operations that would cast the pointer (such as memset(x->foo, 0, some_value), assuming memset(void *__sized_by(n), int, size_t n)) would still have to be illegal when struct foo is incomplete because there's no way to know how many bytes we have at x->foo without sizeof(struct foo).

void *__counted_by could be made legal because clang defines sizeof(void) == 1 in C mode, as an extension to the C standard. I think it's still best avoided: "how many voids do I have at p?" is not a great question to ask, whereas the same with __sized_by just makes sense. It's also a little bit easier to filter out void with the rest of the incomplete types.

For structs with a flexible array member, the reason is that sizeof(struct bar) is a lie. How would we reconcile, for instance, a count of 1 (implying there's like 4 bytes at x->bar) with a x->bar->a of 10 (implying that x->bar->fam[8] is in bounds)? How do we feel that x->bar[0]->fam[0] aliases with x->bar[1]->a?

@delcypher
Copy link
Contributor Author

delcypher commented May 9, 2024

@bwendling

I've been thinking about this restriction. Why is this necessary? My assumption was that applying counted_by to a pointer causes a bounds check on an index into the pointer rather than its underlying type.

@rapidsna Please add additional points if I accidentally miss something.

I can try to explain the restrictions from the perspective of the -fbounds-safety perspective. Essentially the reason these constructs are illegal in -fbounds-safety is because at runtime we need to be able to construct a wide pointer from __counted_by() pointers. In the cases I've made illegal in this PR, this is impossible to do correctly if we don't know the size of the pointee.

What I mean specifically by wide pointer is __bidi_indexable which is -fbounds-safety's wide pointer that can be indexed positively and negatively. In -fbounds-safety's model all pointers on the stack are __bidi_indexable by default and all __counted_by() pointers are implicitly converted to wide pointers when they are used inside a function.

-fbounds-safety's __bidi_indexable is very roughly equivalent to this C++ code

template <class T> struct WidePtr {
  T *ptr; // The address that is dereferenced
  // The bounds of the memory that forms the range of valid bytes to access
  // [lower_bound, upper_bound)
  void *lower_bound;
  void *upper_bound;

  WidePtr(T * /*__counted_by(count)*/ p, size_t count) {
    ptr = p;
    lower_bound = (void *)p;
    // NEED sizeof(T) here!
    upper_bound = ((char *)p) + sizeof(T) * count;
  }

  T& operator*() {
    check_bounds();
    return *ptr;
  }

  void check_bounds() {
    if (ptr < lower_bound)
      __builtin_trap();

    uintptr_t LastByteAccessedPlusOne;
    // NEED sizeof(T) here!
    if (__builtin_uadd_overflow(ptr, sizeof(T), &LastByteAccessedPlusOne))
      __builtin_trap();

    if (LastByteAccessedPlusOne > upper_bound)
      __builtin_trap();
  }

  // Lots of other operators are overloaded (e.g. +, -[], ++, -- and ->).
};

We need to know what sizeof(T) is in two places:

  1. Everywhere you want to do a bounds check. E.g. when dereferencing the pointer.
  2. When converting a T* __counted_by() pointer to a WidePointer<T> (or T* __bidi_indexable in -fbounds-safety parlance). If you look at the constructor we need to know sizeof(T) when computing the upper bound of the wide pointer. Given that this conversion happens whenever a pointer of the type T* __counted_by() is used inside a function in the -fbounds-safety model, the model requires that T have a known size. Technically this PR is being even stricter. It's forbidding declaring a T* __counted_by() even if it isn't used. However, this is a restriction we can lift later once more of the implementation is in place.

I've tried to annotate your code below with the specific reasons they are forbidden.

// Both `foo` and `bar` are legal
struct foo;
struct bar {
  int a;
  int fam[] __counted_by(a);
};

struct x {
    // Illegal in `-fbounds-safety`. We can't compute the upper bound of this pointer
    // because `sizeof(void)*count` isn't valid. In `-fbounds-safety` the attribute you
    // would use here is `__sized_by()` which is a byte count rather than an element count.
    void *p __counted_by(count); 
    
    // Illegal in `-fbounds-safety`. We can't compute the upper bound of this pointer because
    // `sizeof(struct foo)*count` isn't valid. In particular `sizeof(struct foo)` is invalid because
    // `struct foo` is an incomplete type.
    struct foo *f __counted_by(count);
    
    // Illegal in `-fbounds-safety`. While we can technically compute `sizeof(struct bar)*count`
    // that computation is completely wrong unless the `bar::a` is `0`. To actually get the correct
    // upper bound for `b` we would need to walk every `struct bar` object in the buffer pointed to by `b` to
    //  read `b::a`. This could be expensive (and prone to race conditions) so we simply
    // don't support this right now.
    struct bar *b __counted_by(count);
    
    int count;
};

@rapidsna
Copy link
Contributor

rapidsna commented May 9, 2024

I've been thinking about this restriction. Why is this necessary? My assumption was that applying counted_by to a pointer causes a bounds check on an index into the pointer rather than its underlying type.

@bwendling It's because these types are not indexable really.

void:
void doesn't have a size and C standard doesn't allow indexing into void *. I understand void * can be indexable under a GNU extension, but I don't see not supporting it is a problem because we can use __sized_by to annotate void * to clearly indicate the byte size. We will upstream __sized_by support soon so you can use it for void *.

function types
Although, again, the GNU extension allows it, we don't really want to index into function pointers. We can still use __sized_by if we really need to.

Incomplete structs
You can't really index into a pointer to incomplete struct. Though as @apple-fcloutier mentioned, by the point when the pointer is actually indexed, you should have the complete type definition. Otherwise, indexing will be an error anyway. So we have been considering relaxing this requirement in the future, and move the error point to where the pointer is actually used in a way it requires the concrete element size (e.g, places you would insert __dynamic_builtin_object_size, you need the element size to calculate the byte size; indexing into a pointer to incomplete struct is already an error).

@bwendling
Copy link
Collaborator

bwendling commented May 10, 2024

@apple-fcloutier:

think that there's room to allow __counted_by on incomplete types so that a TU where it's complete could use it (and we have use cases where that would be handy), but our implementation doesn't support it at this time. This can be added without disruptions at a later time. FWIW, the point of bounds-checking an index is moot at least while the type is incomplete, since it's an error to dereference a pointer to an incomplete type. Aside from indexing, operations that would cast the pointer (such as memset(x->foo, 0, some_value), assuming memset(void *__sized_by(n), int, size_t n)) would still have to be illegal when struct foo is incomplete because there's no way to know how many bytes we have at x->foo without sizeof(struct foo).

What I'm thinking is something like:

a.c:
struct foo;

struct bar {
  size_t count;
  struct foo *ptr __counted_by(count);
};

struct bar *alloc_bar(size_t num_elems) {
  struct bar *p = malloc(sizeof(bar));

  p->count = num_elems;
  p->ptr = malloc(sizeof(struct foo *) * num_elems);
  return p;
}

/* later on, or in another TU */

struct foo {
  /* fields */
};

void bork(size_t num_elems) {
  struct bar *p = alloc_bar(num_elems);

  for (size_t i = 0; i < num_elems; ++i) {
    p->ptr[i] = alloc_foo();
  }

  /* some use of p */
}

I was incorrect about void *. It's valid to cast any pointer to void * and back to the original type.

For structs with a flexible array member, the reason is that sizeof(struct bar) is a lie. How would we reconcile, for instance, a count of 1 (implying there's like 4 bytes at x->bar) with a x->bar->a of 10 (implying that x->bar->fam[8] is in bounds)? How do we feel that x->bar[0]->fam[0] aliases with x->bar[1]->a?

It's not a lie, because the contents of a pointer don't contribute to the size of the struct containing that pointer.

@bwendling
Copy link
Collaborator

bwendling commented May 10, 2024

@rapidsna @delcypher @apple-fcloutier @kees:

Okay, I think I see what the complication is. Are you trying to prevent the use case of someone writing something like:

struct bar;

struct foo {
  size_t count;
  struct bar *ptr __counted_by(count);
};

where ptr is a pointer to one large space that the user splits up into count number of struct bar objects?

struct foo *alloc_foo(size_t count) {
  struct foo *p = malloc(sizeof(struct foo));

  p->count;
  p->ptr = malloc(sizeof(struct bar) * count); // struct bar can't be incomplete here
  return p;
}

@delcypher
Copy link
Contributor Author

@rapidsna @delcypher @apple-fcloutier @kees:

Okay, I think I see what the complication is. Are you trying to prevent the use case of someone writing something like:

struct bar;

struct foo {
  size_t count;
  struct bar *ptr __counted_by(count);
};

where ptr is a pointer to one large space that the user splits up into count number of struct bar objects?

struct foo *alloc_foo(size_t count) {
  struct foo *p = malloc(sizeof(struct foo));

  p->count;
  p->ptr = malloc(sizeof(struct bar) * count); // struct bar can't be incomplete here
  return p;
}

With the PR in its current form we are preventing this code but only because struct bar is an incomplete type at the point the attribute is applied. If struct bar was defined at the point struct foo was defined then this would be allowed. So this restriction has nothing to do with how the user wants to allocate their memory. It's all about the pointee type having a defined size when the __counted_by attribute is applied.

As @apple-fcloutier @rapidsna noted this is how -fbounds-safety is currently implemented (because its much simpler) but it is a restriction that could be lifted in future by only requiring struct bar to be defined at the point that foo::bar is used rather than when the __counted_by attribute is applied. Given that this PR is allowing __counted_by in a new place (pointers in structs) I think its reasonable for us to place restrictions on that until we've proved its actually necessary to have the more complicated implementation.

@kees
Copy link
Contributor

kees commented May 10, 2024

As @apple-fcloutier @rapidsna noted this is how -fbounds-safety is currently implemented (because its much simpler) but it is a restriction that could be lifted in future by only requiring struct bar to be defined at the point that foo::bar is used rather than when the __counted_by attribute is applied. Given that this PR is allowing __counted_by in a new place (pointers in structs) I think its reasonable for us to place restrictions on that until we've proved its actually necessary to have the more complicated implementation.

The main concern I have with delaying support for this is that header files could find themselves in a state where they could not be refactored without removing counted_by attributes that refer to now-incomplete structs.

For example, in Linux we've been separating structs from implementations, and that usually means using incomplete structs as much as possible to avoid lots of #includes.

So, no objection on this PR, but I think the more complete behavior needs to follow soonish. :)

@delcypher
Copy link
Contributor Author

delcypher commented May 10, 2024

It's not a lie, because the contents of a pointer don't contribute to the size of the struct containing that pointer.

Consider this example. It tries to illustrate why putting __counted_by() on a pointer to a structs containing flexible array members doesn't make sense.

struct HasFAM {
    int count;
    char buffer[] __counted_by(count); // This is OK
};

struct BufferOfFAMS {
    int count;
    struct HasFAM* fams __counted_by(count); // This is invalid
};

struct BufferOfFAMS* setup(void) {
    const int numFams = 2;
    const int famSizes[] = { 8, 16};

    struct BufferOfFAMS *f = (struct BufferOfFAMS *)malloc(
        sizeof(struct BufferOfFAMS) + (sizeof(struct HasFam *) * numFams));

    f->count = numFams;

    size_t famsBufferSize = 0;
    for (int i=0; i < numFams; ++i) {
        famsBufferSize += sizeof(struct HasFAM) + sizeof(char)* famSizes[i];
    }
    f->fams = (struct HasFAM*) malloc(famsBufferSize);
    memset(f->fams, 0, famsBufferSize);

    size_t byteOffset = 0;
    for (int i=0; i < numFams; ++i) {
        struct HasFAM* cur = (struct HasFAM*) (((char*)f->fams) + byteOffset);
        cur->count = famSizes[i];
        byteOffset = sizeof(struct HasFAM) + (sizeof(char) * famSizes[i]);
    }
    return f;
}

int main(void) {
    // How would we compute the bounds of f::fams here??
    // `sizeof(struct HasFAM) * f->count` is WRONG. It's too small but this
    // is what writing `__counted_by(count)` on `HasFAM::buffer` means.
    //
    // It's actually
    // `(sizeof(struct HasFAM) * f->count) + (sizeof(char) * (8 + 16))`
    //
    // This means we can't use the `__counted_by()` attribute on
    // `BufferOfFAMS::fams` to compute its bounds. But allowing the bounds
    // to be computed is the whole point of the attribute.
    struct BufferOfFAMS* f = setup();

    // Similary doing any kind of indexing on the pointer 
    // is extremely problematic for exactly the same reason.
    // 
    // The address is completely wrong because the offset is computed using
    // `sizeof(struct HasFAM)` which won't include the array at the end.
    //
    // So `bogus_ptr` points to the first byte of the first `HasFAM::buffer`,
    // instead of the second `struct HasFAM`.
    struct HasFAM* bogus_ptr = &f->fams[1];
    return 0;
}

@kees
Copy link
Contributor

kees commented May 11, 2024

Consider this example. It tries to illustrate why putting __counted_by() on a pointer to a structs containing flexible array members doesn't make sense.

struct HasFAM {
    int count;
    char buffer[] __counted_by(count); // This is OK
};

struct BufferOfFAMS {
    int count;
    struct HasFAM* fams __counted_by(count); // This is invalid
};

Agreed: structs with flexible array members must be considered to be singletons. This property is actually important for being able to have __builtin_dynamic_object_size() work on pointers to flexible array structs. i.e.:

size_t func(struct foo *p)
{
    return__builtin_dynamic_object_size(p, 0);
}

This must always return SIZE_MAX for fixed-size arrays since the pointer may be in the middle of a larger array of struct foos, but if it is a struct with a flexible array marked with counted_by, then we know deterministically what the size is, since it must be a single complete object.

// expected-error@+2{{'counted_by' only applies to pointers or C99 flexible array members}}
// expected-error@+1{{'buf' declared as array of functions of type 'fn_ty' (aka 'void (int)')}}
fn_ty buf[] __counted_by(count);
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a positive test for typedef void(*fn_ty_p)(int) (i.e. a function pointer).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kees Good catch. Fixed

@delcypher
Copy link
Contributor Author

@kees Thanks for approving.

I'm going to resolve the merge conflict in clang/docs/ReleaseNotes.rst and then merge. I'll start looking at supporting __counted_by() on incomplete pointee types next. @hnrklssn is going to start working on upstreaming the __sized_by attribute soon.

@delcypher delcypher force-pushed the users/delcypher/count-on-pointers-and-late-parsing branch from e6fb7a3 to 80dbab4 Compare May 17, 2024 19:01
@delcypher delcypher merged commit 0ec3b97 into llvm:main May 17, 2024
4 of 5 checks passed
@delcypher
Copy link
Contributor Author

Looks like I broke the clang/test/Misc/pragma-attribute-supported-attributes-list.test test. I'll push a follow up fix to that test once I've confirmed I've fixed it.

@delcypher
Copy link
Contributor Author

Test fixed by 112eadd55f06bee15caadff688ea0b45acbfa804.

@bwendling
Copy link
Collaborator

This seems to have broken the Linux build:

0ec3b97 breaks the build for Linux, added by https://git.kernel.org/linus/781d41fed19caf900c8405064676813dc9921d32:

https://paste.debian.net/plainh/b192bcd1

@rapidsna
Copy link
Contributor

@bwendling Thanks for reporting. We will relax the restrictions for arrays to not break the existing users.

@rapidsna
Copy link
Contributor

rapidsna commented May 17, 2024

@bwendling @kees Wait. ATOM_PPLIB_STATE_V2 is also a struct with flexible array member? This is concerning because ucNumEntries * sizeof(ATOM_PPLIB_STATE_V2) will not be the correct size anyway if the flexible array member of ATOM_PPLIB_STATE_V2 is also supposed to have trailing elements. Do you know the semantics of this structure?

@rapidsna
Copy link
Contributor

@bwendling @kees Likely, we should not put __counted_by in that case. Could we fix the source?

@delcypher
Copy link
Contributor Author

@bwendling This is unfortunate

drivers/gpu/drm/radeon/pptable.h:442:5: error: 'counted_by' cannot be applied to an array with element of unknown size because 'ATOM_PPLIB_STATE_V2' (aka 'struct _ATOM_PPLIB_STATE_V2') is a struct type with a flexible array member
  442 |     ATOM_PPLIB_STATE_V2 states[] __counted_by(ucNumEntries);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

Based on our previous discussion (#90786 (comment)) I hope we can agree using the attribute this way is problematic. I don't see how states could be have its bounds correctly computed when the compiler doesn't know how big struct _ATOM_PPLIB_STATE_V2 actually is.

To unbreak this particular use quickly I'll downgrade the struct with FAM case to a warning and from there we'll need to decide how to proceed.

@bwendling
Copy link
Collaborator

Ah! I see what you mean. I'll bring this up with the developer. (Actually, that construct makes me nervous about their code in general...)

delcypher added a commit that referenced this pull request May 17, 2024
…n on flexible array members

In 0ec3b97 an additional restriction
was added when applying the `counted_by` attribute to flexible array
members in structs. The restriction prevented the element type being
a struct that itself had a flexible array member. E.g.:

```
struct has_unannotated_VLA {
  int count;
  char buffer[];
};

struct buffer_of_structs_with_unnannotated_vla {
  int count;
  struct has_unannotated_VLA Arr[] __counted_by(count);
};
```

In this example assuming the size of `Arr` is `sizeof(struct
has_unannotated_VLA)*count` (which is what the attribute says) is wrong
because it doesn't account for the size of
`has_unannotated_VLA::buffer`. This is why this kind of code construct
was treated as an error.

However, it turns out existing Linux kernel code used the attribute
on a flexible array member in this way
(#90786 (comment)).

To unbreak the build this restriction is downgraded to a warning with
the plan to make it an error again once the errornous use of the
attribute in the Linux kernel is resolved.
@delcypher
Copy link
Contributor Author

@kees @bwendling @rapidsna The workaround to downgrade this error to a warning has landed
cef6387

@bwendling
Copy link
Collaborator

Thank you. I wrote to the author. I hope he'll be able to come up with a change on his end. Or at least an explanation that makes sense :-)

@delcypher
Copy link
Contributor Author

Hmm. Apparently there's a memory leak.

https://lab.llvm.org/buildbot/#/builders/239/builds/7043

-- Testing: 79948 of 79949 tests, 48 workers --
Testing: 
FAIL: Clang :: AST/attr-counted-by-late-parsed-struct-ptrs.c (480 of 79948)
******************** TEST 'Clang :: AST/attr-counted-by-late-parsed-struct-ptrs.c' FAILED ********************
Exit Code: 1
Command Output (stderr):
--
RUN: at line 1: /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang -cc1 -internal-isystem /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/lib/clang/19/include -nostdsysteminc -fexperimental-late-parse-attributes /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/test/AST/attr-counted-by-late-parsed-struct-ptrs.c -ast-dump | /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/test/AST/attr-counted-by-late-parsed-struct-ptrs.c
+ /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang -cc1 -internal-isystem /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/lib/clang/19/include -nostdsysteminc -fexperimental-late-parse-attributes /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/test/AST/attr-counted-by-late-parsed-struct-ptrs.c -ast-dump
+ /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/test/AST/attr-counted-by-late-parsed-struct-ptrs.c
=================================================================
==1477834==ERROR: LeakSanitizer: detected memory leaks
Indirect leak of 432 byte(s) in 2 object(s) allocated from:
    #0 0xaaaabdec8684 in malloc /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:68:3
    #1 0xaaaac36532a8 in safe_malloc /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/Support/MemAlloc.h:26:18
    #2 0xaaaac36532a8 in llvm::SmallVectorBase<unsigned int>::grow_pod(void*, unsigned long, unsigned long) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/lib/Support/SmallVector.cpp:143:15
    #3 0xaaaac82d9960 in grow_pod /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:151:11
    #4 0xaaaac82d9960 in grow /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:538:41
    #5 0xaaaac82d9960 in reserveForParamAndGetAddressImpl<llvm::SmallVectorTemplateBase<clang::Token, true> > /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:256:11
    #6 0xaaaac82d9960 in reserveForParamAndGetAddress /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:543:12
    #7 0xaaaac82d9960 in push_back /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:575:23
    #8 0xaaaac82d9960 in clang::Parser::ParseLexedCAttribute(clang::Parser::LateParsedAttribute&, clang::ParsedAttributes*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4943:11
    #9 0xaaaac82db118 in clang::Parser::ParseStructUnionBody(clang::SourceLocation, clang::TypeSpecifierType, clang::RecordDecl*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:5128:5
    #10 0xaaaac827170c in clang::Parser::ParseClassSpecifier(clang::tok::TokenKind, clang::SourceLocation, clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, bool, clang::Parser::DeclSpecContext, clang::ParsedAttributes&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDeclCXX.cpp:2275:7
    #11 0xaaaac82c4734 in clang::Parser::ParseDeclarationSpecifiers(clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, clang::Parser::DeclSpecContext, clang::Parser::LateParsedAttrList*, clang::ImplicitTypenameContext) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4598:7
    #12 0xaaaac81b99dc in ParseDeclarationSpecifiers /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/include/clang/Parse/Parser.h:2495:12
    #13 0xaaaac81b99dc in clang::Parser::ParseDeclOrFunctionDefInternal(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec&, clang::AccessSpecifier) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1153:3
    #14 0xaaaac81b8ffc in clang::Parser::ParseDeclarationOrFunctionDefinition(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*, clang::AccessSpecifier) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1271:12
    #15 0xaaaac81b63e4 in clang::Parser::ParseExternalDeclaration(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1074:14
    #16 0xaaaac81b17e0 in clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, clang::Sema::ModuleImportState&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:763:12
    #17 0xaaaac81a5874 in clang::ParseAST(clang::Sema&, bool, bool) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseAST.cpp:163:20
    #18 0xaaaac4eea774 in clang::FrontendAction::Execute() /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1078:8
    #19 0xaaaac4e105fc in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1062:33
    #20 0xaaaac50da540 in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:280:25
    #21 0xaaaabdf118e0 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/cc1_main.cpp:232:15
    #22 0xaaaabdf0ac68 in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/driver.cpp:215:12
    #23 0xaaaabdf094a0 in clang_main(int, char**, llvm::ToolContext const&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/driver.cpp:256:12
    #24 0xaaaabdf2574c in main /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/tools/clang/tools/driver/clang-driver.cpp:17:10
    #25 0xffffa5a07580  (/lib/aarch64-linux-gnu/libc.so.6+0x27580) (BuildId: 605b0f51a62b4c7b7370930d1cd7d395aac51429)
    #26 0xffffa5a07654 in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x27654) (BuildId: 605b0f51a62b4c7b7370930d1cd7d395aac51429)
    #27 0xaaaabde28e2c in _start (/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-19+0xbcc8e2c)
Indirect leak of 368 byte(s) in 2 object(s) allocated from:
    #0 0xaaaabdf057c8 in operator new(unsigned long) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:86:3
    #1 0xaaaac829d05c in clang::Parser::ParseGNUAttributes(clang::ParsedAttributes&, clang::Parser::LateParsedAttrList*, clang::Declarator*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:265:11
    #2 0xaaaac8227b28 in clang::Parser::MaybeParseGNUAttributes(clang::Declarator&, clang::Parser::LateParsedAttrList*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/include/clang/Parse/Parser.h:2917:7
    #3 0xaaaac82d8abc in clang::Parser::ParseStructDeclaration(clang::ParsingDeclSpec&, llvm::function_ref<clang::Decl* (clang::ParsingFieldDeclarator&)>, clang::Parser::LateParsedAttrList*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4913:5
    #4 0xaaaac82da84c in clang::Parser::ParseStructUnionBody(clang::SourceLocation, clang::TypeSpecifierType, clang::RecordDecl*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:5082:7
    #5 0xaaaac827170c in clang::Parser::ParseClassSpecifier(clang::tok::TokenKind, clang::SourceLocation, clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, bool, clang::Parser::DeclSpecContext, clang::ParsedAttributes&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDeclCXX.cpp:2275:7
    #6 0xaaaac82c4734 in clang::Parser::ParseDeclarationSpecifiers(clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, clang::Parser::DeclSpecContext, clang::Parser::LateParsedAttrList*, clang::ImplicitTypenameContext) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4598:7
    #7 0xaaaac81b99dc in ParseDeclarationSpecifiers /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/include/clang/Parse/Parser.h:2495:12
    #8 0xaaaac81b99dc in clang::Parser::ParseDeclOrFunctionDefInternal(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec&, clang::AccessSpecifier) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1153:3
    #9 0xaaaac81b8ffc in clang::Parser::ParseDeclarationOrFunctionDefinition(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*, clang::AccessSpecifier) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1271:12
    #10 0xaaaac81b63e4 in clang::Parser::ParseExternalDeclaration(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1074:14
    #11 0xaaaac81b17e0 in clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, clang::Sema::ModuleImportState&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:763:12
    #12 0xaaaac81a5874 in clang::ParseAST(clang::Sema&, bool, bool) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseAST.cpp:163:20
    #13 0xaaaac4eea774 in clang::FrontendAction::Execute() /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1078:8
    #14 0xaaaac4e105fc in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1062:33
    #15 0xaaaac50da540 in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:280:25
    #16 0xaaaabdf118e0 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/cc1_main.cpp:232:15
    #17 0xaaaabdf0ac68 in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/driver.cpp:215:12
    #18 0xaaaabdf094a0 in clang_main(int, char**, llvm::ToolContext const&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/driver.cpp:256:12
    #19 0xaaaabdf2574c in main /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/tools/clang/tools/driver/clang-driver.cpp:17:10
    #20 0xffffa5a07580  (/lib/aarch64-linux-gnu/libc.so.6+0x27580) (BuildId: 605b0f51a62b4c7b7370930d1cd7d395aac51429)
    #21 0xffffa5a07654 in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x27654) (BuildId: 605b0f51a62b4c7b7370930d1cd7d395aac51429)
    #22 0xaaaabde28e2c in _start (/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-19+0xbcc8e2c)
SUMMARY: AddressSanitizer: 800 byte(s) leaked in 4 allocation(s).
--
********************
Testing:  0.. 10..
FAIL: Clang :: Sema/attr-counted-by-late-parsed-struct-ptrs.c (15944 of 79948)
******************** TEST 'Clang :: Sema/attr-counted-by-late-parsed-struct-ptrs.c' FAILED ********************
Exit Code: 1
Command Output (stderr):
--
RUN: at line 1: /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang -cc1 -internal-isystem /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/lib/clang/19/include -nostdsysteminc -fexperimental-late-parse-attributes -fsyntax-only -verify /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/test/Sema/attr-counted-by-late-parsed-struct-ptrs.c
+ /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang -cc1 -internal-isystem /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/lib/clang/19/include -nostdsysteminc -fexperimental-late-parse-attributes -fsyntax-only -verify /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/test/Sema/attr-counted-by-late-parsed-struct-ptrs.c
=================================================================
==1827465==ERROR: LeakSanitizer: detected memory leaks
Indirect leak of 3024 byte(s) in 14 object(s) allocated from:
    #0 0xaaaae8c78684 in malloc /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:68:3
    #1 0xaaaaee4032a8 in safe_malloc /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/Support/MemAlloc.h:26:18
    #2 0xaaaaee4032a8 in llvm::SmallVectorBase<unsigned int>::grow_pod(void*, unsigned long, unsigned long) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/lib/Support/SmallVector.cpp:143:15
    #3 0xaaaaf3089960 in grow_pod /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:151:11
    #4 0xaaaaf3089960 in grow /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:538:41
    #5 0xaaaaf3089960 in reserveForParamAndGetAddressImpl<llvm::SmallVectorTemplateBase<clang::Token, true> > /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:256:11
    #6 0xaaaaf3089960 in reserveForParamAndGetAddress /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:543:12
    #7 0xaaaaf3089960 in push_back /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:575:23
    #8 0xaaaaf3089960 in clang::Parser::ParseLexedCAttribute(clang::Parser::LateParsedAttribute&, clang::ParsedAttributes*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4943:11
    #9 0xaaaaf308b118 in clang::Parser::ParseStructUnionBody(clang::SourceLocation, clang::TypeSpecifierType, clang::RecordDecl*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:5128:5
    #10 0xaaaaf302170c in clang::Parser::ParseClassSpecifier(clang::tok::TokenKind, clang::SourceLocation, clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, bool, clang::Parser::DeclSpecContext, clang::ParsedAttributes&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDeclCXX.cpp:2275:7
    #11 0xaaaaf3074734 in clang::Parser::ParseDeclarationSpecifiers(clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, clang::Parser::DeclSpecContext, clang::Parser::LateParsedAttrList*, clang::ImplicitTypenameContext) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4598:7
    #12 0xaaaaf2f699dc in ParseDeclarationSpecifiers /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/include/clang/Parse/Parser.h:2495:12
    #13 0xaaaaf2f699dc in clang::Parser::ParseDeclOrFunctionDefInternal(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec&, clang::AccessSpecifier) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1153:3
    #14 0xaaaaf2f68ffc in clang::Parser::ParseDeclarationOrFunctionDefinition(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*, clang::AccessSpecifier) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1271:12
    #15 0xaaaaf2f663e4 in clang::Parser::ParseExternalDeclaration(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1074:14
    #16 0xaaaaf2f617e0 in clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, clang::Sema::ModuleImportState&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:763:12
    #17 0xaaaaf2f55874 in clang::ParseAST(clang::Sema&, bool, bool) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseAST.cpp:163:20
    #18 0xaaaaefc9a774 in clang::FrontendAction::Execute() /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1078:8
    #19 0xaaaaefbc05fc in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1062:33
    #20 0xaaaaefe8a540 in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:280:25
    #21 0xaaaae8cc18e0 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/cc1_main.cpp:232:15
    #22 0xaaaae8cbac68 in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/driver.cpp:215:12
    #23 0xaaaae8cb94a0 in clang_main(int, char**, llvm::ToolContext const&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/driver.cpp:256:12
    #24 0xaaaae8cd574c in main /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/tools/clang/tools/driver/clang-driver.cpp:17:10
    #25 0xffff92d27580  (/lib/aarch64-linux-gnu/libc.so.6+0x27580) (BuildId: 605b0f51a62b4c7b7370930d1cd7d395aac51429)
    #26 0xffff92d27654 in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x27654) (BuildId: 605b0f51a62b4c7b7370930d1cd7d395aac51429)
    #27 0xaaaae8bd8e2c in _start (/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-19+0xbcc8e2c)
Indirect leak of 2576 byte(s) in 14 object(s) allocated from:
    #0 0xaaaae8cb57c8 in operator new(unsigned long) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:86:3
    #1 0xaaaaf304d05c in clang::Parser::ParseGNUAttributes(clang::ParsedAttributes&, clang::Parser::LateParsedAttrList*, clang::Declarator*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:265:11
    #2 0xaaaaf2fd7b28 in clang::Parser::MaybeParseGNUAttributes(clang::Declarator&, clang::Parser::LateParsedAttrList*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/include/clang/Parse/Parser.h:2917:7
    #3 0xaaaaf3088abc in clang::Parser::ParseStructDeclaration(clang::ParsingDeclSpec&, llvm::function_ref<clang::Decl* (clang::ParsingFieldDeclarator&)>, clang::Parser::LateParsedAttrList*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4913:5
    #4 0xaaaaf308a84c in clang::Parser::ParseStructUnionBody(clang::SourceLocation, clang::TypeSpecifierType, clang::RecordDecl*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:5082:7
    #5 0xaaaaf302170c in clang::Parser::ParseClassSpecifier(clang::tok::TokenKind, clang::SourceLocation, clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, bool, clang::Parser::DeclSpecContext, clang::ParsedAttributes&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDeclCXX.cpp:2275:7
    #6 0xaaaaf3074734 in clang::Parser::ParseDeclarationSpecifiers(clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, clang::Parser::DeclSpecContext, clang::Parser::LateParsedAttrList*, clang::ImplicitTypenameContext) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4598:7
    #7 0xaaaaf2f699dc in ParseDeclarationSpecifiers /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/include/clang/Parse/Parser.h:2495:12
    #8 0xaaaaf2f699dc in clang::Parser::ParseDeclOrFunctionDefInternal(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec&, clang::AccessSpecifier) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1153:3
    #9 0xaaaaf2f68ffc in clang::Parser::ParseDeclarationOrFunctionDefinition(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*, clang::AccessSpecifier) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1271:12
    #10 0xaaaaf2f663e4 in clang::Parser::ParseExternalDeclaration(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1074:14
    #11 0xaaaaf2f617e0 in clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, clang::Sema::ModuleImportState&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:763:12
    #12 0xaaaaf2f55874 in clang::ParseAST(clang::Sema&, bool, bool) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseAST.cpp:163:20
    #13 0xaaaaefc9a774 in clang::FrontendAction::Execute() /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1078:8
    #14 0xaaaaefbc05fc in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1062:33
    #15 0xaaaaefe8a540 in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:280:25
    #16 0xaaaae8cc18e0 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/cc1_main.cpp:232:15
    #17 0xaaaae8cbac68 in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/driver.cpp:215:12
    #18 0xaaaae8cb94a0 in clang_main(int, char**, llvm::ToolContext const&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/driver.cpp:256:12
    #19 0xaaaae8cd574c in main /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/tools/clang/tools/driver/clang-driver.cpp:17:10
    #20 0xffff92d27580  (/lib/aarch64-linux-gnu/libc.so.6+0x27580) (BuildId: 605b0f51a62b4c7b7370930d1cd7d395aac51429)
    #21 0xffff92d27654 in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x27654) (BuildId: 605b0f51a62b4c7b7370930d1cd7d395aac51429)
    #22 0xaaaae8bd8e2c in _start (/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-19+0xbcc8e2c)
Indirect leak of 216 byte(s) in 1 object(s) allocated from:
    #0 0xaaaae8c78684 in malloc /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:68:3
    #1 0xaaaaee4032a8 in safe_malloc /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/Support/MemAlloc.h:26:18
    #2 0xaaaaee4032a8 in llvm::SmallVectorBase<unsigned int>::grow_pod(void*, unsigned long, unsigned long) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/lib/Support/SmallVector.cpp:143:15
    #3 0xaaaaf3089960 in grow_pod /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:151:11
    #4 0xaaaaf3089960 in grow /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:538:41
    #5 0xaaaaf3089960 in reserveForParamAndGetAddressImpl<llvm::SmallVectorTemplateBase<clang::Token, true> > /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:256:11
    #6 0xaaaaf3089960 in reserveForParamAndGetAddress /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:543:12
    #7 0xaaaaf3089960 in push_back /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/ADT/SmallVector.h:575:23
    #8 0xaaaaf3089960 in clang::Parser::ParseLexedCAttribute(clang::Parser::LateParsedAttribute&, clang::ParsedAttributes*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4943:11
    #9 0xaaaaf308b118 in clang::Parser::ParseStructUnionBody(clang::SourceLocation, clang::TypeSpecifierType, clang::RecordDecl*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:5128:5
    #10 0xaaaaf302170c in clang::Parser::ParseClassSpecifier(clang::tok::TokenKind, clang::SourceLocation, clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, bool, clang::Parser::DeclSpecContext, clang::ParsedAttributes&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDeclCXX.cpp:2275:7
    #11 0xaaaaf3074734 in clang::Parser::ParseDeclarationSpecifiers(clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, clang::Parser::DeclSpecContext, clang::Parser::LateParsedAttrList*, clang::ImplicitTypenameContext) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4598:7
    #12 0xaaaaf3072570 in clang::Parser::ParseSpecifierQualifierList(clang::DeclSpec&, clang::ImplicitTypenameContext, clang::AccessSpecifier, clang::Parser::DeclSpecContext) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:2857:3
    #13 0xaaaaf3087eb0 in ParseSpecifierQualifierList /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/include/clang/Parse/Parser.h:2516:5
    #14 0xaaaaf3087eb0 in clang::Parser::ParseStructDeclaration(clang::ParsingDeclSpec&, llvm::function_ref<clang::Decl* (clang::ParsingFieldDeclarator&)>, clang::Parser::LateParsedAttrList*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4856:3
    #15 0xaaaaf308a84c in clang::Parser::ParseStructUnionBody(clang::SourceLocation, clang::TypeSpecifierType, clang::RecordDecl*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:5082:7
    #16 0xaaaaf302170c in clang::Parser::ParseClassSpecifier(clang::tok::TokenKind, clang::SourceLocation, clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, bool, clang::Parser::DeclSpecContext, clang::ParsedAttributes&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDeclCXX.cpp:2275:7
    #17 0xaaaaf3074734 in clang::Parser::ParseDeclarationSpecifiers(clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, clang::Parser::DeclSpecContext, clang::Parser::LateParsedAttrList*, clang::ImplicitTypenameContext) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4598:7
    #18 0xaaaaf2f699dc in ParseDeclarationSpecifiers /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/include/clang/Parse/Parser.h:2495:12
    #19 0xaaaaf2f699dc in clang::Parser::ParseDeclOrFunctionDefInternal(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec&, clang::AccessSpecifier) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1153:3
    #20 0xaaaaf2f68ffc in clang::Parser::ParseDeclarationOrFunctionDefinition(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*, clang::AccessSpecifier) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1271:12
    #21 0xaaaaf2f663e4 in clang::Parser::ParseExternalDeclaration(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1074:14
    #22 0xaaaaf2f617e0 in clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, clang::Sema::ModuleImportState&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:763:12
    #23 0xaaaaf2f55874 in clang::ParseAST(clang::Sema&, bool, bool) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseAST.cpp:163:20
    #24 0xaaaaefc9a774 in clang::FrontendAction::Execute() /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1078:8
    #25 0xaaaaefbc05fc in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1062:33
    #26 0xaaaaefe8a540 in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:280:25
    #27 0xaaaae8cc18e0 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/cc1_main.cpp:232:15
    #28 0xaaaae8cbac68 in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/driver.cpp:215:12
    #29 0xaaaae8cb94a0 in clang_main(int, char**, llvm::ToolContext const&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/driver.cpp:256:12
    #30 0xaaaae8cd574c in main /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/tools/clang/tools/driver/clang-driver.cpp:17:10
    #31 0xffff92d27580  (/lib/aarch64-linux-gnu/libc.so.6+0x27580) (BuildId: 605b0f51a62b4c7b7370930d1cd7d395aac51429)
    #32 0xffff92d27654 in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x27654) (BuildId: 605b0f51a62b4c7b7370930d1cd7d395aac51429)
    #33 0xaaaae8bd8e2c in _start (/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-19+0xbcc8e2c)
Indirect leak of 184 byte(s) in 1 object(s) allocated from:
    #0 0xaaaae8cb57c8 in operator new(unsigned long) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:86:3
    #1 0xaaaaf304d05c in clang::Parser::ParseGNUAttributes(clang::ParsedAttributes&, clang::Parser::LateParsedAttrList*, clang::Declarator*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:265:11
    #2 0xaaaaf2fd7b28 in clang::Parser::MaybeParseGNUAttributes(clang::Declarator&, clang::Parser::LateParsedAttrList*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/include/clang/Parse/Parser.h:2917:7
    #3 0xaaaaf3088abc in clang::Parser::ParseStructDeclaration(clang::ParsingDeclSpec&, llvm::function_ref<clang::Decl* (clang::ParsingFieldDeclarator&)>, clang::Parser::LateParsedAttrList*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4913:5
    #4 0xaaaaf308a84c in clang::Parser::ParseStructUnionBody(clang::SourceLocation, clang::TypeSpecifierType, clang::RecordDecl*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:5082:7
    #5 0xaaaaf302170c in clang::Parser::ParseClassSpecifier(clang::tok::TokenKind, clang::SourceLocation, clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, bool, clang::Parser::DeclSpecContext, clang::ParsedAttributes&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDeclCXX.cpp:2275:7
    #6 0xaaaaf3074734 in clang::Parser::ParseDeclarationSpecifiers(clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, clang::Parser::DeclSpecContext, clang::Parser::LateParsedAttrList*, clang::ImplicitTypenameContext) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4598:7
    #7 0xaaaaf3072570 in clang::Parser::ParseSpecifierQualifierList(clang::DeclSpec&, clang::ImplicitTypenameContext, clang::AccessSpecifier, clang::Parser::DeclSpecContext) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:2857:3
    #8 0xaaaaf3087eb0 in ParseSpecifierQualifierList /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/include/clang/Parse/Parser.h:2516:5
    #9 0xaaaaf3087eb0 in clang::Parser::ParseStructDeclaration(clang::ParsingDeclSpec&, llvm::function_ref<clang::Decl* (clang::ParsingFieldDeclarator&)>, clang::Parser::LateParsedAttrList*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4856:3
    #10 0xaaaaf308a84c in clang::Parser::ParseStructUnionBody(clang::SourceLocation, clang::TypeSpecifierType, clang::RecordDecl*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:5082:7
    #11 0xaaaaf302170c in clang::Parser::ParseClassSpecifier(clang::tok::TokenKind, clang::SourceLocation, clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, bool, clang::Parser::DeclSpecContext, clang::ParsedAttributes&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDeclCXX.cpp:2275:7
    #12 0xaaaaf3074734 in clang::Parser::ParseDeclarationSpecifiers(clang::DeclSpec&, clang::Parser::ParsedTemplateInfo&, clang::AccessSpecifier, clang::Parser::DeclSpecContext, clang::Parser::LateParsedAttrList*, clang::ImplicitTypenameContext) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:4598:7
    #13 0xaaaaf2f699dc in ParseDeclarationSpecifiers /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/include/clang/Parse/Parser.h:2495:12
    #14 0xaaaaf2f699dc in clang::Parser::ParseDeclOrFunctionDefInternal(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec&, clang::AccessSpecifier) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1153:3
    #15 0xaaaaf2f68ffc in clang::Parser::ParseDeclarationOrFunctionDefinition(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*, clang::AccessSpecifier) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1271:12
    #16 0xaaaaf2f663e4 in clang::Parser::ParseExternalDeclaration(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:1074:14
    #17 0xaaaaf2f617e0 in clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, clang::Sema::ModuleImportState&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/Parser.cpp:763:12
    #18 0xaaaaf2f55874 in clang::ParseAST(clang::Sema&, bool, bool) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Parse/ParseAST.cpp:163:20
    #19 0xaaaaefc9a774 in clang::FrontendAction::Execute() /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1078:8
    #20 0xaaaaefbc05fc in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1062:33
    #21 0xaaaaefe8a540 in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:280:25
    #22 0xaaaae8cc18e0 in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/cc1_main.cpp:232:15
    #23 0xaaaae8cbac68 in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/driver.cpp:215:12
    #24 0xaaaae8cb94a0 in clang_main(int, char**, llvm::ToolContext const&) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/driver/driver.cpp:256:12
    #25 0xaaaae8cd574c in main /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/tools/clang/tools/driver/clang-driver.cpp:17:10
    #26 0xffff92d27580  (/lib/aarch64-linux-gnu/libc.so.6+0x27580) (BuildId: 605b0f51a62b4c7b7370930d1cd7d395aac51429)
    #27 0xffff92d27654 in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x27654) (BuildId: 605b0f51a62b4c7b7370930d1cd7d395aac51429)
    #28 0xaaaae8bd8e2c in _start (/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-19+0xbcc8e2c)
SUMMARY: AddressSanitizer: 6000 byte(s) leaked in 30 allocation(s).

@delcypher
Copy link
Contributor Author

delcypher commented May 18, 2024

The leak via clang::Parser::ParseLexedCAttribute is

LA.Toks.push_back(AttrEnd);

which we just added in this PR.

and the leak via clang::Parser::ParseGNUAttributes

is

      LateParsedAttribute *LA =
          new LateParsedAttribute(this, *AttrName, AttrNameLoc);

which is really old code so I doubt this is directly responsible for the leak.

I'm a little confused because LSan is telling us that these are indirect leaks which means "reachable from other leaked blocks" but LSan isn't showing any direct leaks (not reachable from anywhere). This might suggest the leak is some sort of cycle (multiple objects leaked that point to each other).

@delcypher
Copy link
Contributor Author

Ok. Now I see what's happening.

These lines here are basically giving ownership of LateParsedAttribute to the LateParsedAttrList

      // Handle attributes with arguments that require late parsing.
      LateParsedAttribute *LA =
          new LateParsedAttribute(this, *AttrName, AttrNameLoc);
      LateAttrs->push_back(LA);

However LateParsedAttrList is basically just a

class LateParsedAttrList: public SmallVector<LateParsedAttribute *, 2> {
...
}

so it won't free the pointers when the list is destroyed. If we look at the C++ code that handles this we can see a bunch of manual memory management.

/// Parse all attributes in LAs, and attach them to Decl D.
void Parser::ParseLexedAttributeList(LateParsedAttrList &LAs, Decl *D,
                                     bool EnterScope, bool OnDefinition) {
  assert(LAs.parseSoon() &&
         "Attribute list should be marked for immediate parsing.");
  for (unsigned i = 0, ni = LAs.size(); i < ni; ++i) {
    if (D)
      LAs[i]->addDecl(D);
    ParseLexedAttribute(*LAs[i], EnterScope, OnDefinition);
    delete LAs[i];
  }
  LAs.clear();
}

This is a silly footgun. LateParsedAttrList should properly take ownership of its pointers by deleting them when the list is destroyed.

For now we should just replicate what is done for C++ but once we're confident we've fixed things we should make LateParsedAttrList properly own its pointers.

@delcypher
Copy link
Contributor Author

I'll put up a new version of this PR with the memory leak fixed soon.

delcypher added a commit to delcypher/apple-llvm-project that referenced this pull request May 23, 2024
… C (llvm#90786)

Previously the attribute was only allowed on flexible array members.
This patch patch changes this to also allow the attribute on pointer
fields in structs and also allows late parsing of the attribute in some
contexts.

For example this previously wasn't allowed:

```
struct BufferTypeDeclAttributePosition {
  size_t count;
  char* buffer __counted_by(count); // Now allowed
}
```

Note the attribute is prevented on pointee types where the size isn't
known at compile time. In particular pointee types that are:

* Incomplete (e.g. `void`) and sizeless types
* Function types (e.g. the pointee of a function pointer)
* Struct types with a flexible array member

This patch also introduces late parsing of the attribute when used in
the declaration attribute position. For example

```
struct BufferTypeDeclAttributePosition {
  char* buffer __counted_by(count); // Now allowed
  size_t count;
}
```

is now allowed but **only** when passing
`-fexperimental-late-parse-attributes`. The motivation for using late
parsing here is to avoid breaking the data layout of structs in existing
code that want to use the `counted_by` attribute. This patch is the
first use of `LateAttrParseExperimentalExt` in `Attr.td` that was
introduced in a previous patch.

Note by allowing the attribute on struct member pointers this now allows
the possiblity of writing the attribute in the type attribute position.
For example:

```
struct BufferTypeAttributePosition {
  size_t count;
  char *__counted_by(count) buffer; // Now allowed
}
```

However, the attribute in this position is still currently parsed
immediately rather than late parsed. So this will not parse currently:

```
struct BufferTypeAttributePosition {
  char *__counted_by(count) buffer; // Fails to parse
  size_t count;
}
```

The intention is to lift this restriction in future patches. It has not
been done in this patch to keep this size of this commit small.

There are also several other follow up changes that will need to be
addressed in future patches:

* Make late parsing working with anonymous structs (see
`on_pointer_anon_buf` in `attr-counted-by-late-parsed-struct-ptrs.c`).
* Allow `counted_by` on more subjects (e.g. parameters, returns types)
when `-fbounds-safety` is enabled.
* Make use of the attribute on pointer types in code gen (e.g. for
`_builtin_dynamic_object_size` and UBSan's array-bounds checks).

This work is heavily based on a patch originally written by Yeoul Na.

rdar://125400257

Co-authored-by: Dan Liew <[email protected]>
delcypher added a commit to delcypher/apple-llvm-project that referenced this pull request May 23, 2024
…n on flexible array members

In 0ec3b97 an additional restriction
was added when applying the `counted_by` attribute to flexible array
members in structs. The restriction prevented the element type being
a struct that itself had a flexible array member. E.g.:

```
struct has_unannotated_VLA {
  int count;
  char buffer[];
};

struct buffer_of_structs_with_unnannotated_vla {
  int count;
  struct has_unannotated_VLA Arr[] __counted_by(count);
};
```

In this example assuming the size of `Arr` is `sizeof(struct
has_unannotated_VLA)*count` (which is what the attribute says) is wrong
because it doesn't account for the size of
`has_unannotated_VLA::buffer`. This is why this kind of code construct
was treated as an error.

However, it turns out existing Linux kernel code used the attribute
on a flexible array member in this way
(llvm#90786 (comment)).

To unbreak the build this restriction is downgraded to a warning with
the plan to make it an error again once the errornous use of the
attribute in the Linux kernel is resolved.
delcypher added a commit that referenced this pull request May 24, 2024
…s in structs in C) (#93121)

[BoundsSafety] Reland #93121 Allow 'counted_by' attribute on pointers in structs in C (#93121)

Fixes #92687.

Previously the attribute was only allowed on flexible array members.
This patch patch changes this to also allow the attribute on pointer
fields in structs and also allows late parsing of the attribute in some
contexts.

For example this previously wasn't allowed:

```
struct BufferTypeDeclAttributePosition {
  size_t count;
  char* buffer __counted_by(count); // Now allowed
}
```

Note the attribute is prevented on pointee types where the size isn't
known at compile time. In particular pointee types that are:

* Incomplete (e.g. `void`) and sizeless types
* Function types (e.g. the pointee of a function pointer)
* Struct types with a flexible array member

This patch also introduces late parsing of the attribute when used in
the declaration attribute position. For example

```
struct BufferTypeDeclAttributePosition {
  char* buffer __counted_by(count); // Now allowed
  size_t count;
}
```

is now allowed but **only** when passing
`-fexperimental-late-parse-attributes`. The motivation for using late
parsing here is to avoid breaking the data layout of structs in existing
code that want to use the `counted_by` attribute. This patch is the
first use of `LateAttrParseExperimentalExt` in `Attr.td` that was
introduced in a previous patch.

Note by allowing the attribute on struct member pointers this now allows
the possiblity of writing the attribute in the type attribute position.
For example:

```
struct BufferTypeAttributePosition {
  size_t count;
  char *__counted_by(count) buffer; // Now allowed
}
```

However, the attribute in this position is still currently parsed
immediately rather than late parsed. So this will not parse currently:

```
struct BufferTypeAttributePosition {
  char *__counted_by(count) buffer; // Fails to parse
  size_t count;
}
```

The intention is to lift this restriction in future patches. It has not
been done in this patch to keep this size of this commit small.

There are also several other follow up changes that will need to be
addressed in future patches:

* Make late parsing working with anonymous structs (see
`on_pointer_anon_buf` in `attr-counted-by-late-parsed-struct-ptrs.c`).
* Allow `counted_by` on more subjects (e.g. parameters, returns types)
when `-fbounds-safety` is enabled.
* Make use of the attribute on pointer types in code gen (e.g. for
`_builtin_dynamic_object_size` and UBSan's array-bounds checks).

This work is heavily based on a patch originally written by Yeoul Na.

** Differences between #93121 and this patch **

* The memory leak that caused #93121 to be reverted (see #92687) should
  now be fixed. See "The Memory Leak".
* The fix to `pragma-attribute-supported-attributes-list.test`
  (originally in cef6387) has been incorporated into this patch.
* A relaxation of counted_by semantics (originally in 112eadd) has been
  incorporated into this patch.
* The assert in `Parser::DistributeCLateParsedAttrs` has been removed
  because that broke downstream code.
* The switch statement in `Parser::ParseLexedCAttribute` has been
  removed in favor of using `Parser::ParseGNUAttributeArgs` which does
  the same thing but is more feature complete.
* The `EnterScope` parameter has been plumbed through
  `Parser::ParseLexedCAttribute` and `Parser::ParseLexedCAttributeList`.
  It currently doesn't do anything but it will be needed in future
  commits.

** The Memory Leak **

The problem was that these lines parsed the attributes but then did nothing to free the memory

```
assert(!getLangOpts().CPlusPlus);
for (auto *LateAttr : LateFieldAttrs)
  ParseLexedCAttribute(*LateAttr);
```

To fix this this a new `Parser::ParseLexedCAttributeList` method has been
added (based on `Parser::ParseLexedAttributeList`) which does the
necessary memory management. The intention is to merge these two
methods together so there is just one implementation in a future patch
(#93263).

A more principled fixed here would be to fix the ownership of the
`LateParsedAttribute` objects. In principle `LateParsedAttrList` should own
its pointers exclusively and be responsible for deallocating them.
Unfortunately this is complicated by `LateParsedAttribute` objects also
being stored in another data structure (`LateParsedDeclarations`) as
can be seen below (`LA` gets stored in two places).

```
      // Handle attributes with arguments that require late parsing.
      LateParsedAttribute *LA =
          new LateParsedAttribute(this, *AttrName, AttrNameLoc);
      LateAttrs->push_back(LA);

      // Attributes in a class are parsed at the end of the class, along
      // with other late-parsed declarations.
      if (!ClassStack.empty() && !LateAttrs->parseSoon())
        getCurrentClass().LateParsedDeclarations.push_back(LA);
```

this means the ownership of LateParsedAttribute objects isn't very
clear.

rdar://125400257
delcypher added a commit to delcypher/apple-llvm-project that referenced this pull request Jul 19, 2024
…ption and use it to tweak `counted_by`'s semantics

This adds the `-fexperimental-bounds-safety` CC1 and corresponding
language option. This language option enables "-fbounds-safety" which
is a bounds-safety extension for C that is being incrementally
upstreamed.

This CC1 flag is not exposed as a driver flag yet because most of the
implementation isn't upstream yet.

The language option is used to make a small semantic change to how the
`counted_by` attribute is treated. Without
`-fexperimental-bounds-safety` the attribute is allowed (but emits a
warning) on a flexible array member where the element type is a struct
with a flexible array member. With the flag this situation is an error.

E.g.

```
struct has_unannotated_FAM {
  int count;
  char buffer[];
};

struct buffer_of_structs_with_unnannotated_FAM {
  int count;
  // Forbidden with `-fexperimental-bounds-safety`
  struct has_unannotated_FAM Arr[] __counted_by(count);
};
```

The above code **should always** be an error. However, when llvm#90786 was
originally landed (which allowed `counted_by` to be used on pointers in
structs) it exposed an issue in code in the Linux kernel that was using
the `counted_by` attribute incorrectly (see
llvm#90786 (comment))
which was now caught by a new error diagnostic in the PR. To unbreak the
build of the Linux kernel the error diagnostic was temporarily
downgraded to be a warning to give the kernel authors time to fix their
code.

This downgrading of the error diagnostic to a warning is a departure
from the intended semantics of `-fbounds-safety` so in order to have
both behaviors (error and warning) it is necessary for Clang to actually
have a notion of `-fbounds-safety` being on vs off.

rdar://125400392
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:bounds-safety Issue/PR relating to the experimental -fbounds-safety feature in Clang clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants