-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[BoundsSafety] Allow 'counted_by' attribute on pointers in structs in C #90786
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BoundsSafety] Allow 'counted_by' attribute on pointers in structs in C #90786
Conversation
@llvm/pr-subscribers-clang Author: Dan Liew (delcypher) ChangesPreviously the attribute was only allowed on flexible array members. This patch patch changes this to also allow the attribute on pointer fields in structs and also allows late parsing of the attribute in some contexts. For example this previously wasn't allowed:
Note the attribute is prevented on pointee types where the size isn't known at compile time. In particular pointee types that are:
This patch also introduces late parsing of the attribute when used in the declaration attribute position. For example
is now allowed but only when passing Note by allowing the attribute on struct member pointers this now allows the possiblity of writing the attribute in the type attribute position. For example:
However, the attribute in this position is still currently parsed immediately rather than late parsed. So this will not parse currently:
The intention is to lift this restriction in future patches. It has not been done in this patch to keep this size of this commit small. There are also several other follow up changes that will need to be addressed in future patches:
This work is heavily based on a patch originally written by Yeoul Na. rdar://125400257 Patch is 38.84 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/90786.diff 18 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 4c0fe5bcf6b122..81d0a89b294878 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -258,7 +258,8 @@ New Compiler Flags
- ``-fexperimental-late-parse-attributes`` enables an experimental feature to
allow late parsing certain attributes in specific contexts where they would
- not normally be late parsed.
+ not normally be late parsed. Currently this allows late parsing the
+ `counted_by` attribute in C. See `Attribute Changes in Clang`_.
Deprecated Compiler Flags
-------------------------
@@ -335,6 +336,22 @@ Attribute Changes in Clang
- Clang now warns that the ``exclude_from_explicit_instantiation`` attribute
is ignored when applied to a local class or a member thereof.
+- The ``counted_by`` attribute can now be late parsed in C when
+ ``-fexperimental-late-parse-attributes`` is passed but only when attribute is
+ used in the declaration attribute position. This allows using the
+ attribute on existing code where it previously impossible to do so without
+ re-ordering struct field declarations would break ABI as shown below.
+
+ .. code-block:: c
+
+ struct BufferTy {
+ /* Refering to `count` requires late parsing */
+ char* buffer __counted_by(count);
+ /* Swapping `buffer` and `count` to avoid late parsing would break ABI */
+ size_t count;
+ };
+
+
Improvements to Clang's diagnostics
-----------------------------------
- Clang now applies syntax highlighting to the code snippets it
diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h
index e6643469e0b334..e982a565413751 100644
--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -2515,6 +2515,7 @@ class alignas(TypeAlignment) Type : public ExtQualsTypeCommonBase {
bool isRecordType() const;
bool isClassType() const;
bool isStructureType() const;
+ bool isStructureTypeWithFlexibleArrayMember() const;
bool isObjCBoxableRecordType() const;
bool isInterfaceType() const;
bool isStructureOrClassType() const;
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index 0225598cbbe8ad..ef08d2a274a907 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -2229,7 +2229,8 @@ def TypeNullUnspecified : TypeAttr {
def CountedBy : DeclOrTypeAttr {
let Spellings = [Clang<"counted_by">];
let Subjects = SubjectList<[Field], ErrorDiag>;
- let Args = [ExprArgument<"Count">, IntArgument<"NestedLevel">];
+ let Args = [ExprArgument<"Count">, IntArgument<"NestedLevel", 1>];
+ let LateParsed = LateAttrParseExperimentalExt;
let ParseArgumentsAsUnevaluated = 1;
let Documentation = [CountedByDocs];
let LangOpts = [COnly];
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index f72d5c252b863e..0ab2a96c7b464b 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6518,8 +6518,10 @@ def warn_superclass_variable_sized_type_not_at_end : Warning<
def err_flexible_array_count_not_in_same_struct : Error<
"'counted_by' field %0 isn't within the same struct as the flexible array">;
-def err_counted_by_attr_not_on_flexible_array_member : Error<
- "'counted_by' only applies to C99 flexible array members">;
+def err_counted_by_attr_not_on_ptr_or_flexible_array_member : Error<
+ "'counted_by' only applies to pointers or C99 flexible array members">;
+def err_counted_by_attr_on_array_not_flexible_array_member : Error<
+ "'counted_by' on arrays only applies to C99 flexible array members">;
def err_counted_by_attr_refer_to_itself : Error<
"'counted_by' cannot refer to the flexible array member %0">;
def err_counted_by_must_be_in_structure : Error<
@@ -6534,6 +6536,15 @@ def err_counted_by_attr_refer_to_union : Error<
"'counted_by' argument cannot refer to a union member">;
def note_flexible_array_counted_by_attr_field : Note<
"field %0 declared here">;
+def err_counted_by_attr_pointee_unknown_size : Error<
+ "'counted_by' cannot be applied a pointer with pointee with unknown size "
+ "because %0 is %select{"
+ "an incomplete type|" // CountedByInvalidPointeeTypeKind::INCOMPLETE
+ "a sizeless type|" // CountedByInvalidPointeeTypeKind::SIZELESS
+ "a function type|" // CountedByInvalidPointeeTypeKind::FUNCTION
+ // CountedByInvalidPointeeTypeKind::FLEXIBLE_ARRAY_MEMBER
+ "a struct type with a flexible array member"
+ "}1">;
let CategoryName = "ARC Semantic Issue" in {
diff --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h
index 81aab8c888ab65..0f2ba17aadaf2d 100644
--- a/clang/include/clang/Parse/Parser.h
+++ b/clang/include/clang/Parse/Parser.h
@@ -1643,6 +1643,8 @@ class Parser : public CodeCompletionHandler {
bool EnterScope, bool OnDefinition);
void ParseLexedAttribute(LateParsedAttribute &LA,
bool EnterScope, bool OnDefinition);
+ void ParseLexedCAttribute(LateParsedAttribute &LA,
+ ParsedAttributes *OutAttrs = nullptr);
void ParseLexedMethodDeclarations(ParsingClass &Class);
void ParseLexedMethodDeclaration(LateParsedMethodDeclaration &LM);
void ParseLexedMethodDefs(ParsingClass &Class);
@@ -2530,7 +2532,8 @@ class Parser : public CodeCompletionHandler {
void ParseStructDeclaration(
ParsingDeclSpec &DS,
- llvm::function_ref<void(ParsingFieldDeclarator &)> FieldsCallback);
+ llvm::function_ref<Decl *(ParsingFieldDeclarator &)> FieldsCallback,
+ LateParsedAttrList *LateFieldAttrs = nullptr);
DeclGroupPtrTy ParseTopLevelStmtDecl();
@@ -3107,6 +3110,8 @@ class Parser : public CodeCompletionHandler {
SourceLocation ScopeLoc,
ParsedAttr::Form Form);
+ void DistributeCLateParsedAttrs(Decl *Dcl, LateParsedAttrList *LateAttrs);
+
void ParseBoundsAttribute(IdentifierInfo &AttrName,
SourceLocation AttrNameLoc, ParsedAttributes &Attrs,
IdentifierInfo *ScopeName, SourceLocation ScopeLoc,
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 1ca523ec88c2f9..f1e1be24f4fd2f 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -11509,7 +11509,8 @@ class Sema final : public SemaBase {
QualType BuildMatrixType(QualType T, Expr *NumRows, Expr *NumColumns,
SourceLocation AttrLoc);
- QualType BuildCountAttributedArrayType(QualType WrappedTy, Expr *CountExpr);
+ QualType BuildCountAttributedArrayOrPointerType(QualType WrappedTy,
+ Expr *CountExpr);
QualType BuildAddressSpaceAttr(QualType &T, LangAS ASIdx, Expr *AddrSpace,
SourceLocation AttrLoc);
diff --git a/clang/lib/AST/Type.cpp b/clang/lib/AST/Type.cpp
index 68e81f45b4c28e..316d70e0b79ecf 100644
--- a/clang/lib/AST/Type.cpp
+++ b/clang/lib/AST/Type.cpp
@@ -631,6 +631,18 @@ bool Type::isStructureType() const {
return false;
}
+bool Type::isStructureTypeWithFlexibleArrayMember() const {
+ const auto *RT = getAs<RecordType>();
+ if (!RT)
+ return false;
+ const auto *Decl = RT->getDecl();
+ if (!Decl->isStruct())
+ return false;
+ if (!Decl->hasFlexibleArrayMember())
+ return false;
+ return true;
+}
+
bool Type::isObjCBoxableRecordType() const {
if (const auto *RT = getAs<RecordType>())
return RT->getDecl()->hasAttr<ObjCBoxableAttr>();
diff --git a/clang/lib/Parse/ParseDecl.cpp b/clang/lib/Parse/ParseDecl.cpp
index 7431c256d2c135..470d26fb3d73be 100644
--- a/clang/lib/Parse/ParseDecl.cpp
+++ b/clang/lib/Parse/ParseDecl.cpp
@@ -3282,6 +3282,19 @@ void Parser::ParseAlignmentSpecifier(ParsedAttributes &Attrs,
}
}
+void Parser::DistributeCLateParsedAttrs(Decl *Dcl,
+ LateParsedAttrList *LateAttrs) {
+ assert(Dcl);
+
+ if (!LateAttrs)
+ return;
+
+ for (auto *LateAttr : *LateAttrs) {
+ if (LateAttr->Decls.empty())
+ LateAttr->addDecl(Dcl);
+ }
+}
+
/// Bounds attributes (e.g., counted_by):
/// AttrName '(' expression ')'
void Parser::ParseBoundsAttribute(IdentifierInfo &AttrName,
@@ -4815,13 +4828,14 @@ static void DiagnoseCountAttributedTypeInUnnamedAnon(ParsingDeclSpec &DS,
///
void Parser::ParseStructDeclaration(
ParsingDeclSpec &DS,
- llvm::function_ref<void(ParsingFieldDeclarator &)> FieldsCallback) {
+ llvm::function_ref<Decl *(ParsingFieldDeclarator &)> FieldsCallback,
+ LateParsedAttrList *LateFieldAttrs) {
if (Tok.is(tok::kw___extension__)) {
// __extension__ silences extension warnings in the subexpression.
ExtensionRAIIObject O(Diags); // Use RAII to do this.
ConsumeToken();
- return ParseStructDeclaration(DS, FieldsCallback);
+ return ParseStructDeclaration(DS, FieldsCallback, LateFieldAttrs);
}
// Parse leading attributes.
@@ -4886,10 +4900,12 @@ void Parser::ParseStructDeclaration(
}
// If attributes exist after the declarator, parse them.
- MaybeParseGNUAttributes(DeclaratorInfo.D);
+ MaybeParseGNUAttributes(DeclaratorInfo.D, LateFieldAttrs);
// We're done with this declarator; invoke the callback.
- FieldsCallback(DeclaratorInfo);
+ Decl *Field = FieldsCallback(DeclaratorInfo);
+ if (Field)
+ DistributeCLateParsedAttrs(Field, LateFieldAttrs);
// If we don't have a comma, it is either the end of the list (a ';')
// or an error, bail out.
@@ -4900,6 +4916,69 @@ void Parser::ParseStructDeclaration(
}
}
+/// Finish parsing an attribute for which parsing was delayed.
+/// This will be called at the end of parsing a class declaration
+/// for each LateParsedAttribute. We consume the saved tokens and
+/// create an attribute with the arguments filled in. We add this
+/// to the Attribute list for the decl.
+void Parser::ParseLexedCAttribute(LateParsedAttribute &LA,
+ ParsedAttributes *OutAttrs) {
+ // Create a fake EOF so that attribute parsing won't go off the end of the
+ // attribute.
+ Token AttrEnd;
+ AttrEnd.startToken();
+ AttrEnd.setKind(tok::eof);
+ AttrEnd.setLocation(Tok.getLocation());
+ AttrEnd.setEofData(LA.Toks.data());
+ LA.Toks.push_back(AttrEnd);
+
+ // Append the current token at the end of the new token stream so that it
+ // doesn't get lost.
+ LA.Toks.push_back(Tok);
+ PP.EnterTokenStream(LA.Toks, /*DisableMacroExpansion=*/true,
+ /*IsReinject=*/true);
+ // Drop the current token and bring the first cached one. It's the same token
+ // as when we entered this function.
+ ConsumeAnyToken(/*ConsumeCodeCompletionTok=*/true);
+
+ ParsedAttributes Attrs(AttrFactory);
+
+ assert(LA.Decls.size() <= 1 &&
+ "late field attribute expects to have at most one declaration.");
+
+ // Dispatch based on the attribute and parse it
+ const AttributeCommonInfo::Form ParsedForm = ParsedAttr::Form::GNU();
+ IdentifierInfo *ScopeName = nullptr;
+ const ParsedAttr::Kind AttrKind =
+ ParsedAttr::getParsedKind(&LA.AttrName, /*ScopeName=*/ScopeName,
+ /*SyntaxUsed=*/ParsedForm.getSyntax());
+ switch (AttrKind) {
+ case ParsedAttr::Kind::AT_CountedBy:
+ ParseBoundsAttribute(LA.AttrName, LA.AttrNameLoc, Attrs,
+ /*ScopeName=*/ScopeName, SourceLocation(),
+ /*Form=*/ParsedForm);
+ break;
+ default:
+ llvm_unreachable("Unhandled late parsed attribute");
+ }
+
+ for (auto *D : LA.Decls)
+ Actions.ActOnFinishDelayedAttribute(getCurScope(), D, Attrs);
+
+ // Due to a parsing error, we either went over the cached tokens or
+ // there are still cached tokens left, so we skip the leftover tokens.
+ while (Tok.isNot(tok::eof))
+ ConsumeAnyToken();
+
+ // Consume the fake EOF token if it's there
+ if (Tok.is(tok::eof) && Tok.getEofData() == AttrEnd.getEofData())
+ ConsumeAnyToken();
+
+ if (OutAttrs) {
+ OutAttrs->takeAllFrom(Attrs);
+ }
+}
+
/// ParseStructUnionBody
/// struct-contents:
/// struct-declaration-list
@@ -4923,6 +5002,11 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,
ParseScope StructScope(this, Scope::ClassScope|Scope::DeclScope);
Actions.ActOnTagStartDefinition(getCurScope(), TagDecl);
+ // `LateAttrParseExperimentalExtOnly=true` requests that only attributes
+ // marked with `LateAttrParseExperimentalExt` are late parsed.
+ LateParsedAttrList LateFieldAttrs(/*PSoon=*/false,
+ /*LateAttrParseExperimentalExtOnly=*/true);
+
// While we still have something to read, read the declarations in the struct.
while (!tryParseMisplacedModuleImport() && Tok.isNot(tok::r_brace) &&
Tok.isNot(tok::eof)) {
@@ -4973,18 +5057,19 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,
}
if (!Tok.is(tok::at)) {
- auto CFieldCallback = [&](ParsingFieldDeclarator &FD) {
+ auto CFieldCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
// Install the declarator into the current TagDecl.
Decl *Field =
Actions.ActOnField(getCurScope(), TagDecl,
FD.D.getDeclSpec().getSourceRange().getBegin(),
FD.D, FD.BitfieldSize);
FD.complete(Field);
+ return Field;
};
// Parse all the comma separated declarators.
ParsingDeclSpec DS(*this);
- ParseStructDeclaration(DS, CFieldCallback);
+ ParseStructDeclaration(DS, CFieldCallback, &LateFieldAttrs);
} else { // Handle @defs
ConsumeToken();
if (!Tok.isObjCAtKeyword(tok::objc_defs)) {
@@ -5025,7 +5110,12 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,
ParsedAttributes attrs(AttrFactory);
// If attributes exist after struct contents, parse them.
- MaybeParseGNUAttributes(attrs);
+ MaybeParseGNUAttributes(attrs, &LateFieldAttrs);
+
+ // Late parse field attributes if necessary.
+ assert(!getLangOpts().CPlusPlus);
+ for (auto *LateAttr : LateFieldAttrs)
+ ParseLexedCAttribute(*LateAttr);
SmallVector<Decl *, 32> FieldDecls(TagDecl->fields());
diff --git a/clang/lib/Parse/ParseObjc.cpp b/clang/lib/Parse/ParseObjc.cpp
index 8e54fe012c55d7..2eaeff38a24a6f 100644
--- a/clang/lib/Parse/ParseObjc.cpp
+++ b/clang/lib/Parse/ParseObjc.cpp
@@ -778,16 +778,16 @@ void Parser::ParseObjCInterfaceDeclList(tok::ObjCKeywordKind contextKey,
}
bool addedToDeclSpec = false;
- auto ObjCPropertyCallback = [&](ParsingFieldDeclarator &FD) {
+ auto ObjCPropertyCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
if (FD.D.getIdentifier() == nullptr) {
Diag(AtLoc, diag::err_objc_property_requires_field_name)
<< FD.D.getSourceRange();
- return;
+ return nullptr;
}
if (FD.BitfieldSize) {
Diag(AtLoc, diag::err_objc_property_bitfield)
<< FD.D.getSourceRange();
- return;
+ return nullptr;
}
// Map a nullability property attribute to a context-sensitive keyword
@@ -816,6 +816,7 @@ void Parser::ParseObjCInterfaceDeclList(tok::ObjCKeywordKind contextKey,
MethodImplKind);
FD.complete(Property);
+ return Property;
};
// Parse all the comma separated declarators.
@@ -2024,7 +2025,7 @@ void Parser::ParseObjCClassInstanceVariables(ObjCContainerDecl *interfaceDecl,
continue;
}
- auto ObjCIvarCallback = [&](ParsingFieldDeclarator &FD) {
+ auto ObjCIvarCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
assert(getObjCDeclContext() == interfaceDecl &&
"Ivar should have interfaceDecl as its decl context");
// Install the declarator into the interface decl.
@@ -2035,6 +2036,7 @@ void Parser::ParseObjCClassInstanceVariables(ObjCContainerDecl *interfaceDecl,
if (Field)
AllIvarDecls.push_back(Field);
FD.complete(Field);
+ return Field;
};
// Parse all the comma separated declarators.
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 363ae93cb62df1..7ae1e09c7a5dcd 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -8588,31 +8588,71 @@ static const RecordDecl *GetEnclosingNamedOrTopAnonRecord(const FieldDecl *FD) {
return RD;
}
-static bool
-CheckCountExpr(Sema &S, FieldDecl *FD, Expr *E,
- llvm::SmallVectorImpl<TypeCoupledDeclRefInfo> &Decls) {
+enum class CountedByInvalidPointeeTypeKind {
+ INCOMPLETE,
+ SIZELESS,
+ FUNCTION,
+ FLEXIBLE_ARRAY_MEMBER,
+ VALID,
+};
+
+static bool CheckCountedByAttrOnField(
+ Sema &S, FieldDecl *FD, Expr *E,
+ llvm::SmallVectorImpl<TypeCoupledDeclRefInfo> &Decls) {
+ // Check the context the attribute is used in
+
if (FD->getParent()->isUnion()) {
S.Diag(FD->getBeginLoc(), diag::err_counted_by_attr_in_union)
<< FD->getSourceRange();
return true;
}
- if (!E->getType()->isIntegerType() || E->getType()->isBooleanType()) {
- S.Diag(E->getBeginLoc(), diag::err_counted_by_attr_argument_not_integer)
- << E->getSourceRange();
+ const auto FieldTy = FD->getType();
+ if (!FieldTy->isArrayType() && !FieldTy->isPointerType()) {
+ S.Diag(FD->getBeginLoc(),
+ diag::err_counted_by_attr_not_on_ptr_or_flexible_array_member)
+ << FD->getLocation();
return true;
}
LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
LangOptions::StrictFlexArraysLevelKind::IncompleteOnly;
-
- if (!Decl::isFlexibleArrayMemberLike(S.getASTContext(), FD, FD->getType(),
+ if (FieldTy->isArrayType() &&
+ !Decl::isFlexibleArrayMemberLike(S.getASTContext(), FD, FieldTy,
StrictFlexArraysLevel, true)) {
- // The "counted_by" attribute must be on a flexible array member.
- SourceRange SR = FD->getLocation();
- S.Diag(SR.getBegin(),
- diag::err_counted_by_attr_not_on_flexible_array_member)
- << SR;
+ S.Diag(FD->getBeginLoc(),
+ diag::err_counted_by_attr_on_array_not_flexible_array_member)
+ << FD->getLocation();
+ return true;
+ }
+
+ if (FieldTy->isPointerType()) {
+ CountedByInvalidPointeeTypeKind InvalidTypeKind =
+ CountedByInvalidPointeeTypeKind::VALID;
+ const auto PointeeTy = FieldTy->getPointeeType();
+ if (PointeeTy->isIncompleteType()) {
+ InvalidTypeKind = CountedByInvalidPointeeTypeKind::INCOMPLETE;
+ } else if (PointeeTy->isSizelessType()) {
+ InvalidTypeKind = CountedByInvalidPointeeTypeKind::SIZELESS;
+ } else if (PointeeTy->isFunctionType()) {
+ InvalidTypeKind = CountedByInvalidPointeeTypeKind::FUNCTION;
+ } else if (PointeeTy->isStructureTypeWithFlexibleArrayMember()) {
+ InvalidTypeKind = CountedByInvalidPointeeTypeKind::FLEXIBLE_ARRAY_MEMBER;
+ }
+
+ if (InvalidTypeKind != CountedByInvalidPointeeTypeKind::VALID) {
+ S.Diag(FD->getBeginLoc(), diag::err_counted_by_attr_pointee_unknown_size)
+ << FieldTy->getPointeeType() << (int)InvalidTypeKind
+ << FD->getSourceRange();
+ return true;
+ }
+ }
+
+ // Check the expression
+
+ if (!E->getType()->isIntegerType() || E->getType()->isBooleanType()) {
+ S.Diag(E->getBeginLoc(), diag::err_counted_by_attr_argument_not_integer)
+ << E->getSourceRange();
return true;
}
@@ -8675,10 +8715,11 @@ static void handleCountedByAttrField(Sema &S, Decl *D, const ParsedAttr &AL) {
return;
llvm::SmallVector<TypeCoupledDeclRefInfo, 1> Decls;
- if (CheckCountExpr(S, FD, CountExpr, Decls))
+ if (CheckCountedByAttrOnField(S, FD, CountExpr, Decls))
return;
- QualType CAT = S.BuildCountAttributedArrayType(FD->getType(), CountExpr);
+ QualType CAT =
+ S.BuildCountAttributedArrayOrPointerType(FD->getType(), CountExpr);
FD->setType(CAT);
}
diff --git a/clang/lib/Sema/SemaType.cpp b/clang/lib/Sema/SemaType.cpp
index fddc3545ecb61c..29e9700959cf51 100644
--- a/clang/lib/Sema/SemaType.cpp
+++ b/clang/lib/Sema/SemaType.cpp
@@ -9790,9 +9790,9 @@ BuildTypeCoupledDecls(Expr *E,
De...
[truncated]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sema.h
changes look good to me.
return true; | ||
} | ||
|
||
if (FieldTy->isPointerType()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be limited to pointers? Seems to me that it could apply to arrays also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
Flexible array member
struct Has_VLA {
int count;
char buffer[] __counted_by(count);
};
struct Test {
int count;
struct Has_VLA Arr[] __counted_by(count);
};
There's no diagnostic here currently. I think the code above is something we should disallow (at least for -fbounds-safety
). I'll adjust the patch to disallow this.
@isanbard Is the above code pattern something you actually want to support? In -fbounds-safety
we disallow it because it means computing the bounds of Arr
is not constant time because the size of Has_VLA
is not a compile time constant. To compute the bounds of Arr
at runtime it would require walking every Has_VLA
object in Arr
to get its count
field. That seems expensive and is also at risk of running into a race condition (i.e. the size of a VLA changes while the bounds are being computed).
Incomplete type
struct Test {
int count;
void Arr[] __counted_by(count);
};
We already have these diagnostics:
tmp/arr_sizeless_ty.c:5:13: error: array has incomplete element type 'void'
5 | void Arr[] __counted_by(count);
| ^
tmp/arr_sizeless_ty.c:5:5: error: 'counted_by' only applies to pointers or C99 flexible array members
5 | void Arr[] __counted_by(count);
| ^ ~~~
2 errors generated.
I don't think we need to add another.
Sizeless types
struct Test {
int count;
__SVInt8_t Arr[] __counted_by(count);
};
We already have these diagnostics:
tmp/arr_sizeless_ty.c:5:19: error: array has sizeless element type '__SVInt8_t'
5 | __SVInt8_t Arr[] __counted_by(count);
| ^
tmp/arr_sizeless_ty.c:5:5: error: 'counted_by' only applies to pointers or C99 flexible array members
5 | __SVInt8_t Arr[] __counted_by(count);
| ^ ~~~
2 errors generated.
I don't think we need to add another.
Function Type
typedef void(foo_fn)(int);
struct Test {
int count;
foo_fn Arr[] __counted_by(count);
};
We already have these diagnostics:
tmp/arr_sizeless_ty.c:7:15: error: 'Arr' declared as array of functions of type 'foo_fn' (aka 'void (int)')
7 | foo_fn Arr[] __counted_by(count);
| ^
tmp/arr_sizeless_ty.c:7:5: error: 'counted_by' only applies to pointers or C99 flexible array members
7 | foo_fn Arr[] __counted_by(count);
| ^ ~~~
2 errors generated.
I don't think we need to add another.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@isanbard Is the above code pattern something you actually want to support? In -fbounds-safety we disallow it because it means computing the bounds of Arr is not constant time because the size of Has_VLA is not a compile time constant. To compute the bounds of Arr at runtime it would require walking every Has_VLA object in Arr to get its count field. That seems expensive and is also at risk of running into a race condition (i.e. the size of a VLA changes while the bounds are being computed).
Is it even valid C? I suppose it could be valid if struct Has_VLA
wasn't allocated with extra bytes, i.e. count
is 0 and buffer
has no size.
But that's a horrible use case and I think it's fine to flag it as an error.
@@ -335,6 +336,22 @@ Attribute Changes in Clang | |||
- Clang now warns that the ``exclude_from_explicit_instantiation`` attribute | |||
is ignored when applied to a local class or a member thereof. | |||
|
|||
- The ``counted_by`` attribute can now be late parsed in C when |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should mention that counted_by
was extended to work with pointers in structs first. Otherwise, "last parsing" of the attribute doesn't make a lot of sense since a flexible array member is at the end of a struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can mention it but I'm wondering if it should be in the release notes at all because this patch only makes parsing work. I've not done anything on the codegen side to make use of the attribute on pointers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it doesn't have a use yet, maybe you don't need to add it in the release notes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So to add some context here. The original reason there was a release note is because in #88596 (which this patch builds upon) @AaronBallman asked for there to be a release note about the new -fexperimental-late-parse-attributes
flag. This patch means -fexperimental-late-parse-attributes
actually does something now (before it did nothing) so I updated the release notes on that flag and it "felt" like it made sense to explain in the release notes the change on the attributes in relation to the flag.
I don't have strong opinions here. I'm ok to drop the changes to the release notes or keep them or something in between. @AaronBallman any opinions here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion either. :-) It's fine to keep the blurb in.
"because %0 is %select{" | ||
"an incomplete type|" // CountedByInvalidPointeeTypeKind::INCOMPLETE | ||
"a sizeless type|" // CountedByInvalidPointeeTypeKind::SIZELESS | ||
"a function type|" // CountedByInvalidPointeeTypeKind::FUNCTION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused why a function type is excluded. Isn't it just a pointer? @kees can correct me, but I think the point of counted_by
on a pointer is that it could be a list of pointers, and we don't want to allow someone to access beyond that list:
struct s {
int count;
int *ptr __counted_by(count);
};
struct s *alloc(size_t num_elems) {
struct s *p = malloc(sizeof(struct s));
p->count = num_elems;
p->ptr = calloc(num_elems, sizeof(int));
return p;
}
If that's the case, then any pointer should be okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bwendling I was also confused by this initially. A "function type" and "function pointer type" are different. A function type doesn't have a size so we have to forbid it, but a function pointer does have a size (it's the size of a pointer).
Hopefully these examples illustrate the problem:
#define __counted_by(f) __attribute__((counted_by(f)))
typedef void(fn_ty)(int);
typedef void(*fn_ptr_ty)(int);
struct a {
int count;
// Buffer of functions is invalid. A "function" has no size
fn_ty* buffer __counted_by(count);
fn_ptr_ty buffer2 __counted_by(count);
};
struct b {
int count;
// Valid: A buffer of function pointers is allowed
fn_ty** buffer __counted_by(count);
fn_ptr_ty* buffer2 __counted_by(count);
};
A similar thing exists for arrays. If you write this
struct c {
fn_ty Arr[];
};
this produces the following error:
error: 'Arr' declared as array of functions of type 'fn_ty' (aka 'void (int)')
22 | fn_ty Arr[];
| ^
However these two struct definitions are allowed by clang:
struct d {
fn_ty* Arr[];
};
struct e {
fn_ptr_ty Arr2[];
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A "function type" and "function pointer type" are different.
Ah! my mistake. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note: Given the above I expected pointer arithmetic on a function pointer to be forbidden but it's not...
typedef void(fn_ty)(int);
typedef void(*fn_ptr_ty)(int);
void test(fn_ty* fn) {
fn(5); // ok
// WHAT!?: How can pointer arithmetic be possible when the size of
// the function isn't a defined thing...
//
// Turns out clang codegens this as if `sizeof(*fn) == 1`
_Static_assert(sizeof(*fn) == 1, "oh boy...we're in for a wild ride");
++fn;
// Let's jump somewhere into the function that's 1 byte along from the start.
// what could possibly go wrong!?
fn(5); // CRASH
return;
}
It's fine to do pointer arithmetic on an array of function pointers though
typedef void(fn_ty)(int);
typedef void(*fn_ptr_ty)(int);
void do_something(int i) { printf("hello %d\n", i);}
void do_something2(int i) { printf("hello2 %d\n", i);}
fn_ptr_ty Funcs[] = { do_something, do_something2};
void test2(fn_ty** fn) {
// fn == do_something
(*fn)(5);
++fn;
// Now fn == do_something2
(*fn)(5);
return;
}
int main(void) {
test2(Funcs);
return 0;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that as long as it's a pointer we're okay, at least w.r.t. the counted_by
attribute. We know the size of pointers at least. :-)
return true; | ||
} | ||
|
||
if (FieldTy->isPointerType()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@isanbard Is the above code pattern something you actually want to support? In -fbounds-safety we disallow it because it means computing the bounds of Arr is not constant time because the size of Has_VLA is not a compile time constant. To compute the bounds of Arr at runtime it would require walking every Has_VLA object in Arr to get its count field. That seems expensive and is also at risk of running into a race condition (i.e. the size of a VLA changes while the bounds are being computed).
Is it even valid C? I suppose it could be valid if struct Has_VLA
wasn't allocated with extra bytes, i.e. count
is 0 and buffer
has no size.
But that's a horrible use case and I think it's fine to flag it as an error.
a89cca7
to
9ec3203
Compare
I've been thinking about this restriction. Why is this necessary? My assumption was that applying struct foo;
struct bar {
int a;
int fam[] __counted_by(a);
};
struct x {
void *p __counted_by(count); // void * is treated like char *, I think.
struct foo *f __counted_by(count); // sizeof(f) is the size of a general pointer.
struct bar *b __counted_by(count); // a list of pointers to 'struct bar's should be okay.
int count;
}; |
I think that there's room to allow
For structs with a flexible array member, the reason is that |
@rapidsna Please add additional points if I accidentally miss something. I can try to explain the restrictions from the perspective of the What I mean specifically by wide pointer is
template <class T> struct WidePtr {
T *ptr; // The address that is dereferenced
// The bounds of the memory that forms the range of valid bytes to access
// [lower_bound, upper_bound)
void *lower_bound;
void *upper_bound;
WidePtr(T * /*__counted_by(count)*/ p, size_t count) {
ptr = p;
lower_bound = (void *)p;
// NEED sizeof(T) here!
upper_bound = ((char *)p) + sizeof(T) * count;
}
T& operator*() {
check_bounds();
return *ptr;
}
void check_bounds() {
if (ptr < lower_bound)
__builtin_trap();
uintptr_t LastByteAccessedPlusOne;
// NEED sizeof(T) here!
if (__builtin_uadd_overflow(ptr, sizeof(T), &LastByteAccessedPlusOne))
__builtin_trap();
if (LastByteAccessedPlusOne > upper_bound)
__builtin_trap();
}
// Lots of other operators are overloaded (e.g. +, -[], ++, -- and ->).
}; We need to know what
I've tried to annotate your code below with the specific reasons they are forbidden. // Both `foo` and `bar` are legal
struct foo;
struct bar {
int a;
int fam[] __counted_by(a);
};
struct x {
// Illegal in `-fbounds-safety`. We can't compute the upper bound of this pointer
// because `sizeof(void)*count` isn't valid. In `-fbounds-safety` the attribute you
// would use here is `__sized_by()` which is a byte count rather than an element count.
void *p __counted_by(count);
// Illegal in `-fbounds-safety`. We can't compute the upper bound of this pointer because
// `sizeof(struct foo)*count` isn't valid. In particular `sizeof(struct foo)` is invalid because
// `struct foo` is an incomplete type.
struct foo *f __counted_by(count);
// Illegal in `-fbounds-safety`. While we can technically compute `sizeof(struct bar)*count`
// that computation is completely wrong unless the `bar::a` is `0`. To actually get the correct
// upper bound for `b` we would need to walk every `struct bar` object in the buffer pointed to by `b` to
// read `b::a`. This could be expensive (and prone to race conditions) so we simply
// don't support this right now.
struct bar *b __counted_by(count);
int count;
}; |
@bwendling It's because these types are not indexable really. void: function types Incomplete structs |
What I'm thinking is something like: a.c:
struct foo;
struct bar {
size_t count;
struct foo *ptr __counted_by(count);
};
struct bar *alloc_bar(size_t num_elems) {
struct bar *p = malloc(sizeof(bar));
p->count = num_elems;
p->ptr = malloc(sizeof(struct foo *) * num_elems);
return p;
}
/* later on, or in another TU */
struct foo {
/* fields */
};
void bork(size_t num_elems) {
struct bar *p = alloc_bar(num_elems);
for (size_t i = 0; i < num_elems; ++i) {
p->ptr[i] = alloc_foo();
}
/* some use of p */
} I was incorrect about
It's not a lie, because the contents of a pointer don't contribute to the size of the struct containing that pointer. |
@rapidsna @delcypher @apple-fcloutier @kees: Okay, I think I see what the complication is. Are you trying to prevent the use case of someone writing something like: struct bar;
struct foo {
size_t count;
struct bar *ptr __counted_by(count);
}; where struct foo *alloc_foo(size_t count) {
struct foo *p = malloc(sizeof(struct foo));
p->count;
p->ptr = malloc(sizeof(struct bar) * count); // struct bar can't be incomplete here
return p;
} |
With the PR in its current form we are preventing this code but only because As @apple-fcloutier @rapidsna noted this is how |
The main concern I have with delaying support for this is that header files could find themselves in a state where they could not be refactored without removing counted_by attributes that refer to now-incomplete structs. For example, in Linux we've been separating structs from implementations, and that usually means using incomplete structs as much as possible to avoid lots of #includes. So, no objection on this PR, but I think the more complete behavior needs to follow soonish. :) |
Consider this example. It tries to illustrate why putting struct HasFAM {
int count;
char buffer[] __counted_by(count); // This is OK
};
struct BufferOfFAMS {
int count;
struct HasFAM* fams __counted_by(count); // This is invalid
};
struct BufferOfFAMS* setup(void) {
const int numFams = 2;
const int famSizes[] = { 8, 16};
struct BufferOfFAMS *f = (struct BufferOfFAMS *)malloc(
sizeof(struct BufferOfFAMS) + (sizeof(struct HasFam *) * numFams));
f->count = numFams;
size_t famsBufferSize = 0;
for (int i=0; i < numFams; ++i) {
famsBufferSize += sizeof(struct HasFAM) + sizeof(char)* famSizes[i];
}
f->fams = (struct HasFAM*) malloc(famsBufferSize);
memset(f->fams, 0, famsBufferSize);
size_t byteOffset = 0;
for (int i=0; i < numFams; ++i) {
struct HasFAM* cur = (struct HasFAM*) (((char*)f->fams) + byteOffset);
cur->count = famSizes[i];
byteOffset = sizeof(struct HasFAM) + (sizeof(char) * famSizes[i]);
}
return f;
}
int main(void) {
// How would we compute the bounds of f::fams here??
// `sizeof(struct HasFAM) * f->count` is WRONG. It's too small but this
// is what writing `__counted_by(count)` on `HasFAM::buffer` means.
//
// It's actually
// `(sizeof(struct HasFAM) * f->count) + (sizeof(char) * (8 + 16))`
//
// This means we can't use the `__counted_by()` attribute on
// `BufferOfFAMS::fams` to compute its bounds. But allowing the bounds
// to be computed is the whole point of the attribute.
struct BufferOfFAMS* f = setup();
// Similary doing any kind of indexing on the pointer
// is extremely problematic for exactly the same reason.
//
// The address is completely wrong because the offset is computed using
// `sizeof(struct HasFAM)` which won't include the array at the end.
//
// So `bogus_ptr` points to the first byte of the first `HasFAM::buffer`,
// instead of the second `struct HasFAM`.
struct HasFAM* bogus_ptr = &f->fams[1];
return 0;
} |
Agreed: structs with flexible array members must be considered to be singletons. This property is actually important for being able to have
This must always return |
// expected-error@+2{{'counted_by' only applies to pointers or C99 flexible array members}} | ||
// expected-error@+1{{'buf' declared as array of functions of type 'fn_ty' (aka 'void (int)')}} | ||
fn_ty buf[] __counted_by(count); | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a positive test for typedef void(*fn_ty_p)(int)
(i.e. a function pointer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kees Good catch. Fixed
e6fb7a3
to
80dbab4
Compare
Looks like I broke the |
Test fixed by |
This seems to have broken the Linux build: 0ec3b97 breaks the build for Linux, added by https://git.kernel.org/linus/781d41fed19caf900c8405064676813dc9921d32: |
@bwendling Thanks for reporting. We will relax the restrictions for arrays to not break the existing users. |
@bwendling @kees Wait. |
@bwendling @kees Likely, we should not put |
@bwendling This is unfortunate
Based on our previous discussion (#90786 (comment)) I hope we can agree using the attribute this way is problematic. I don't see how To unbreak this particular use quickly I'll downgrade the struct with FAM case to a warning and from there we'll need to decide how to proceed. |
Ah! I see what you mean. I'll bring this up with the developer. (Actually, that construct makes me nervous about their code in general...) |
…n on flexible array members In 0ec3b97 an additional restriction was added when applying the `counted_by` attribute to flexible array members in structs. The restriction prevented the element type being a struct that itself had a flexible array member. E.g.: ``` struct has_unannotated_VLA { int count; char buffer[]; }; struct buffer_of_structs_with_unnannotated_vla { int count; struct has_unannotated_VLA Arr[] __counted_by(count); }; ``` In this example assuming the size of `Arr` is `sizeof(struct has_unannotated_VLA)*count` (which is what the attribute says) is wrong because it doesn't account for the size of `has_unannotated_VLA::buffer`. This is why this kind of code construct was treated as an error. However, it turns out existing Linux kernel code used the attribute on a flexible array member in this way (#90786 (comment)). To unbreak the build this restriction is downgraded to a warning with the plan to make it an error again once the errornous use of the attribute in the Linux kernel is resolved.
@kees @bwendling @rapidsna The workaround to downgrade this error to a warning has landed |
Thank you. I wrote to the author. I hope he'll be able to come up with a change on his end. Or at least an explanation that makes sense :-) |
Hmm. Apparently there's a memory leak. https://lab.llvm.org/buildbot/#/builders/239/builds/7043
|
The leak via LA.Toks.push_back(AttrEnd); which we just added in this PR. and the leak via is
which is really old code so I doubt this is directly responsible for the leak. I'm a little confused because LSan is telling us that these are indirect leaks which means "reachable from other leaked blocks" but LSan isn't showing any direct leaks (not reachable from anywhere). This might suggest the leak is some sort of cycle (multiple objects leaked that point to each other). |
Ok. Now I see what's happening. These lines here are basically giving ownership of
However
so it won't free the pointers when the list is destroyed. If we look at the C++ code that handles this we can see a bunch of manual memory management.
This is a silly footgun. For now we should just replicate what is done for C++ but once we're confident we've fixed things we should make |
…ructs in C (#90786)" Memory leak: https://lab.llvm.org/buildbot/#/builders/5/builds/43403 Issue #92687 This reverts commit 0ec3b97.
I'll put up a new version of this PR with the memory leak fixed soon. |
… C (llvm#90786) Previously the attribute was only allowed on flexible array members. This patch patch changes this to also allow the attribute on pointer fields in structs and also allows late parsing of the attribute in some contexts. For example this previously wasn't allowed: ``` struct BufferTypeDeclAttributePosition { size_t count; char* buffer __counted_by(count); // Now allowed } ``` Note the attribute is prevented on pointee types where the size isn't known at compile time. In particular pointee types that are: * Incomplete (e.g. `void`) and sizeless types * Function types (e.g. the pointee of a function pointer) * Struct types with a flexible array member This patch also introduces late parsing of the attribute when used in the declaration attribute position. For example ``` struct BufferTypeDeclAttributePosition { char* buffer __counted_by(count); // Now allowed size_t count; } ``` is now allowed but **only** when passing `-fexperimental-late-parse-attributes`. The motivation for using late parsing here is to avoid breaking the data layout of structs in existing code that want to use the `counted_by` attribute. This patch is the first use of `LateAttrParseExperimentalExt` in `Attr.td` that was introduced in a previous patch. Note by allowing the attribute on struct member pointers this now allows the possiblity of writing the attribute in the type attribute position. For example: ``` struct BufferTypeAttributePosition { size_t count; char *__counted_by(count) buffer; // Now allowed } ``` However, the attribute in this position is still currently parsed immediately rather than late parsed. So this will not parse currently: ``` struct BufferTypeAttributePosition { char *__counted_by(count) buffer; // Fails to parse size_t count; } ``` The intention is to lift this restriction in future patches. It has not been done in this patch to keep this size of this commit small. There are also several other follow up changes that will need to be addressed in future patches: * Make late parsing working with anonymous structs (see `on_pointer_anon_buf` in `attr-counted-by-late-parsed-struct-ptrs.c`). * Allow `counted_by` on more subjects (e.g. parameters, returns types) when `-fbounds-safety` is enabled. * Make use of the attribute on pointer types in code gen (e.g. for `_builtin_dynamic_object_size` and UBSan's array-bounds checks). This work is heavily based on a patch originally written by Yeoul Na. rdar://125400257 Co-authored-by: Dan Liew <[email protected]>
…n on flexible array members In 0ec3b97 an additional restriction was added when applying the `counted_by` attribute to flexible array members in structs. The restriction prevented the element type being a struct that itself had a flexible array member. E.g.: ``` struct has_unannotated_VLA { int count; char buffer[]; }; struct buffer_of_structs_with_unnannotated_vla { int count; struct has_unannotated_VLA Arr[] __counted_by(count); }; ``` In this example assuming the size of `Arr` is `sizeof(struct has_unannotated_VLA)*count` (which is what the attribute says) is wrong because it doesn't account for the size of `has_unannotated_VLA::buffer`. This is why this kind of code construct was treated as an error. However, it turns out existing Linux kernel code used the attribute on a flexible array member in this way (llvm#90786 (comment)). To unbreak the build this restriction is downgraded to a warning with the plan to make it an error again once the errornous use of the attribute in the Linux kernel is resolved.
…s in structs in C) (#93121) [BoundsSafety] Reland #93121 Allow 'counted_by' attribute on pointers in structs in C (#93121) Fixes #92687. Previously the attribute was only allowed on flexible array members. This patch patch changes this to also allow the attribute on pointer fields in structs and also allows late parsing of the attribute in some contexts. For example this previously wasn't allowed: ``` struct BufferTypeDeclAttributePosition { size_t count; char* buffer __counted_by(count); // Now allowed } ``` Note the attribute is prevented on pointee types where the size isn't known at compile time. In particular pointee types that are: * Incomplete (e.g. `void`) and sizeless types * Function types (e.g. the pointee of a function pointer) * Struct types with a flexible array member This patch also introduces late parsing of the attribute when used in the declaration attribute position. For example ``` struct BufferTypeDeclAttributePosition { char* buffer __counted_by(count); // Now allowed size_t count; } ``` is now allowed but **only** when passing `-fexperimental-late-parse-attributes`. The motivation for using late parsing here is to avoid breaking the data layout of structs in existing code that want to use the `counted_by` attribute. This patch is the first use of `LateAttrParseExperimentalExt` in `Attr.td` that was introduced in a previous patch. Note by allowing the attribute on struct member pointers this now allows the possiblity of writing the attribute in the type attribute position. For example: ``` struct BufferTypeAttributePosition { size_t count; char *__counted_by(count) buffer; // Now allowed } ``` However, the attribute in this position is still currently parsed immediately rather than late parsed. So this will not parse currently: ``` struct BufferTypeAttributePosition { char *__counted_by(count) buffer; // Fails to parse size_t count; } ``` The intention is to lift this restriction in future patches. It has not been done in this patch to keep this size of this commit small. There are also several other follow up changes that will need to be addressed in future patches: * Make late parsing working with anonymous structs (see `on_pointer_anon_buf` in `attr-counted-by-late-parsed-struct-ptrs.c`). * Allow `counted_by` on more subjects (e.g. parameters, returns types) when `-fbounds-safety` is enabled. * Make use of the attribute on pointer types in code gen (e.g. for `_builtin_dynamic_object_size` and UBSan's array-bounds checks). This work is heavily based on a patch originally written by Yeoul Na. ** Differences between #93121 and this patch ** * The memory leak that caused #93121 to be reverted (see #92687) should now be fixed. See "The Memory Leak". * The fix to `pragma-attribute-supported-attributes-list.test` (originally in cef6387) has been incorporated into this patch. * A relaxation of counted_by semantics (originally in 112eadd) has been incorporated into this patch. * The assert in `Parser::DistributeCLateParsedAttrs` has been removed because that broke downstream code. * The switch statement in `Parser::ParseLexedCAttribute` has been removed in favor of using `Parser::ParseGNUAttributeArgs` which does the same thing but is more feature complete. * The `EnterScope` parameter has been plumbed through `Parser::ParseLexedCAttribute` and `Parser::ParseLexedCAttributeList`. It currently doesn't do anything but it will be needed in future commits. ** The Memory Leak ** The problem was that these lines parsed the attributes but then did nothing to free the memory ``` assert(!getLangOpts().CPlusPlus); for (auto *LateAttr : LateFieldAttrs) ParseLexedCAttribute(*LateAttr); ``` To fix this this a new `Parser::ParseLexedCAttributeList` method has been added (based on `Parser::ParseLexedAttributeList`) which does the necessary memory management. The intention is to merge these two methods together so there is just one implementation in a future patch (#93263). A more principled fixed here would be to fix the ownership of the `LateParsedAttribute` objects. In principle `LateParsedAttrList` should own its pointers exclusively and be responsible for deallocating them. Unfortunately this is complicated by `LateParsedAttribute` objects also being stored in another data structure (`LateParsedDeclarations`) as can be seen below (`LA` gets stored in two places). ``` // Handle attributes with arguments that require late parsing. LateParsedAttribute *LA = new LateParsedAttribute(this, *AttrName, AttrNameLoc); LateAttrs->push_back(LA); // Attributes in a class are parsed at the end of the class, along // with other late-parsed declarations. if (!ClassStack.empty() && !LateAttrs->parseSoon()) getCurrentClass().LateParsedDeclarations.push_back(LA); ``` this means the ownership of LateParsedAttribute objects isn't very clear. rdar://125400257
…ption and use it to tweak `counted_by`'s semantics This adds the `-fexperimental-bounds-safety` CC1 and corresponding language option. This language option enables "-fbounds-safety" which is a bounds-safety extension for C that is being incrementally upstreamed. This CC1 flag is not exposed as a driver flag yet because most of the implementation isn't upstream yet. The language option is used to make a small semantic change to how the `counted_by` attribute is treated. Without `-fexperimental-bounds-safety` the attribute is allowed (but emits a warning) on a flexible array member where the element type is a struct with a flexible array member. With the flag this situation is an error. E.g. ``` struct has_unannotated_FAM { int count; char buffer[]; }; struct buffer_of_structs_with_unnannotated_FAM { int count; // Forbidden with `-fexperimental-bounds-safety` struct has_unannotated_FAM Arr[] __counted_by(count); }; ``` The above code **should always** be an error. However, when llvm#90786 was originally landed (which allowed `counted_by` to be used on pointers in structs) it exposed an issue in code in the Linux kernel that was using the `counted_by` attribute incorrectly (see llvm#90786 (comment)) which was now caught by a new error diagnostic in the PR. To unbreak the build of the Linux kernel the error diagnostic was temporarily downgraded to be a warning to give the kernel authors time to fix their code. This downgrading of the error diagnostic to a warning is a departure from the intended semantics of `-fbounds-safety` so in order to have both behaviors (error and warning) it is necessary for Clang to actually have a notion of `-fbounds-safety` being on vs off. rdar://125400392
Previously the attribute was only allowed on flexible array members. This patch patch changes this to also allow the attribute on pointer fields in structs and also allows late parsing of the attribute in some contexts.
For example this previously wasn't allowed:
Note the attribute is prevented on pointee types where the size isn't known at compile time. In particular pointee types that are:
void
) and sizeless typesThis patch also introduces late parsing of the attribute when used in the declaration attribute position. For example
is now allowed but only when passing
-fexperimental-late-parse-attributes
. The motivation for using late parsing here is to avoid breaking the data layout of structs in existing code that want to use thecounted_by
attribute. This patch is the first use ofLateAttrParseExperimentalExt
inAttr.td
that was introduced in a previous patch.Note by allowing the attribute on struct member pointers this now allows the possiblity of writing the attribute in the type attribute position. For example:
However, the attribute in this position is still currently parsed immediately rather than late parsed. So this will not parse currently:
The intention is to lift this restriction in future patches. It has not been done in this patch to keep this size of this commit small.
There are also several other follow up changes that will need to be addressed in future patches:
on_pointer_anon_buf
inattr-counted-by-late-parsed-struct-ptrs.c
).counted_by
on more subjects (e.g. parameters, returns types) when-fbounds-safety
is enabled._builtin_dynamic_object_size
and UBSan's array-bounds checks).This work is heavily based on a patch originally written by Yeoul Na.
rdar://125400257