-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[analyzer] Use AllocaRegion in MallocChecker #72402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
...to model the results of alloca() and _alloca() calls. Previously it acted as if these functions were returning memory from the heap, which led to alpha.security.ArrayBoundV2 producing incorrect messages.
@llvm/pr-subscribers-clang-static-analyzer-1 @llvm/pr-subscribers-clang Author: None (DonatNagyE) Changes...to model the results of alloca() and _alloca() calls. Previously it acted as if these functions were returning memory from the heap, which led to alpha.security.ArrayBoundV2 producing incorrect messages. Full diff: https://github.com/llvm/llvm-project/pull/72402.diff 7 Files Affected:
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
index 692c4384586569e..a64cf7ae4efcb82 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
@@ -215,6 +215,15 @@ class SValBuilder {
const LocationContext *LCtx,
QualType type, unsigned Count);
+ /// Create an SVal representing the result of an alloca()-like call, that is,
+ /// an AllocaRegion on the stack.
+ ///
+ /// After calling this function, it's a good idea to set the extent of the
+ /// returned AllocaRegion.
+ loc::MemRegionVal getAllocaRegionVal(const Expr *E,
+ const LocationContext *LCtx,
+ unsigned Count);
+
DefinedOrUnknownSVal getDerivedRegionValueSymbolVal(
SymbolRef parentSymbol, const TypedValueRegion *region);
diff --git a/clang/lib/StaticAnalyzer/Checkers/BuiltinFunctionChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/BuiltinFunctionChecker.cpp
index 4a56156de4b27fe..143326c435cf815 100644
--- a/clang/lib/StaticAnalyzer/Checkers/BuiltinFunctionChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/BuiltinFunctionChecker.cpp
@@ -81,22 +81,21 @@ bool BuiltinFunctionChecker::evalCall(const CallEvent &Call,
case Builtin::BI__builtin_alloca_with_align:
case Builtin::BI__builtin_alloca: {
- // FIXME: Refactor into StoreManager itself?
- MemRegionManager& RM = C.getStoreManager().getRegionManager();
- const AllocaRegion* R =
- RM.getAllocaRegion(CE, C.blockCount(), C.getLocationContext());
-
- // Set the extent of the region in bytes. This enables us to use the
- // SVal of the argument directly. If we save the extent in bits, we
- // cannot represent values like symbol*8.
- auto Size = Call.getArgSVal(0);
- if (Size.isUndef())
- return true; // Return true to model purity.
-
- state = setDynamicExtent(state, R, Size.castAs<DefinedOrUnknownSVal>(),
- C.getSValBuilder());
+ SValBuilder &SVB = C.getSValBuilder();
+ const loc::MemRegionVal R =
+ SVB.getAllocaRegionVal(CE, C.getLocationContext(), C.blockCount());
- C.addTransition(state->BindExpr(CE, LCtx, loc::MemRegionVal(R)));
+ // Set the extent of the region in bytes. This enables us to use the SVal
+ // of the argument directly. If we saved the extent in bits, it'd be more
+ // difficult to reason about values like symbol*8.
+ auto Size = Call.getArgSVal(0);
+ if (auto DefSize = Size.getAs<DefinedOrUnknownSVal>()) {
+ state = setDynamicExtent(state, R.getRegion(), *DefSize, SVB);
+ // FIXME: perhaps the following transition should be moved out of the
+ // 'if' to bind an AllocaRegion (with unknown/unspecified size) even in
+ // the unlikely case when the size argument is undefined.
+ C.addTransition(state->BindExpr(CE, LCtx, R));
+ }
return true;
}
diff --git a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
index d3a4020280616b0..417305e26c41b09 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
@@ -1731,10 +1731,12 @@ ProgramStateRef MallocChecker::MallocMemAux(CheckerContext &C,
// TODO: We could rewrite post visit to eval call; 'malloc' does not have
// side effects other than what we model here.
unsigned Count = C.blockCount();
- SValBuilder &svalBuilder = C.getSValBuilder();
+ SValBuilder &SVB = C.getSValBuilder();
const LocationContext *LCtx = C.getPredecessor()->getLocationContext();
- DefinedSVal RetVal = svalBuilder.getConjuredHeapSymbolVal(CE, LCtx, Count)
- .castAs<DefinedSVal>();
+ DefinedSVal RetVal =
+ ((Family == AF_Alloca) ? SVB.getAllocaRegionVal(CE, LCtx, Count)
+ : SVB.getConjuredHeapSymbolVal(CE, LCtx, Count)
+ .castAs<DefinedSVal>());
State = State->BindExpr(CE, C.getLocationContext(), RetVal);
// Fill the region with the initialization value.
@@ -1746,7 +1748,7 @@ ProgramStateRef MallocChecker::MallocMemAux(CheckerContext &C,
// Set the region's extent.
State = setDynamicExtent(State, RetVal.getAsRegion(),
- Size.castAs<DefinedOrUnknownSVal>(), svalBuilder);
+ Size.castAs<DefinedOrUnknownSVal>(), SVB);
return MallocUpdateRefState(C, CE, State, Family);
}
diff --git a/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp b/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
index e2f0fb6a6731ca0..eb9cde5c8918fc1 100644
--- a/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
+++ b/clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
@@ -231,6 +231,14 @@ SValBuilder::getConjuredHeapSymbolVal(const Expr *E,
return loc::MemRegionVal(MemMgr.getSymbolicHeapRegion(sym));
}
+loc::MemRegionVal SValBuilder::getAllocaRegionVal(const Expr *E,
+ const LocationContext *LCtx,
+ unsigned VisitCount) {
+ const AllocaRegion *R =
+ getRegionManager().getAllocaRegion(E, VisitCount, LCtx);
+ return loc::MemRegionVal(R);
+}
+
DefinedSVal SValBuilder::getMetadataSymbolVal(const void *symbolTag,
const MemRegion *region,
const Expr *expr, QualType type,
diff --git a/clang/test/Analysis/malloc.c b/clang/test/Analysis/malloc.c
index a3f7a69b8cef347..14b7555c2c58197 100644
--- a/clang/test/Analysis/malloc.c
+++ b/clang/test/Analysis/malloc.c
@@ -266,13 +266,18 @@ void CheckUseZeroAllocated1(void) {
}
char CheckUseZeroAllocated2(void) {
+ // FIXME: The return value of `alloca()` is modeled with `AllocaRegion`
+ // instead of `SymbolicRegion`, so the current implementation of
+ // `MallocChecker::checkUseZeroAllocated()` cannot handle it; and we get an
+ // unrelated, but suitable warning from core.uninitialized.UndefReturn.
char *p = alloca(0);
- return *p; // expected-warning {{Use of memory allocated with size zero}}
+ return *p; // expected-warning {{Undefined or garbage value returned to caller}}
}
char CheckUseZeroWinAllocated2(void) {
+ // FIXME: Same situation as `CheckUseZeroAllocated2()`.
char *p = _alloca(0);
- return *p; // expected-warning {{Use of memory allocated with size zero}}
+ return *p; // expected-warning {{Undefined or garbage value returned to caller}}
}
void UseZeroAllocated(int *p) {
@@ -727,6 +732,11 @@ void paramFree(int *p) {
myfoo(p); // expected-warning {{Use of memory after it is freed}}
}
+void allocaFree(void) {
+ int *p = alloca(sizeof(int));
+ free(p); // expected-warning {{Memory allocated by alloca() should not be deallocated}}
+}
+
int* mallocEscapeRet(void) {
int *p = malloc(12);
return p; // no warning
diff --git a/clang/test/Analysis/memory-model.cpp b/clang/test/Analysis/memory-model.cpp
index fbdef4fddbea93b..fd5a286acb60c06 100644
--- a/clang/test/Analysis/memory-model.cpp
+++ b/clang/test/Analysis/memory-model.cpp
@@ -97,7 +97,7 @@ void symbolic_malloc() {
void symbolic_alloca() {
int *a = (int *)alloca(12);
- clang_analyzer_dump(a); // expected-warning {{Element{HeapSymRegion{conj}}
+ clang_analyzer_dump(a); // expected-warning {{Element{alloca{}}
clang_analyzer_dumpExtent(a); // expected-warning {{12 S64b}}
clang_analyzer_dumpElementCount(a); // expected-warning {{3 S64b}}
}
diff --git a/clang/test/Analysis/out-of-bounds-diagnostics.c b/clang/test/Analysis/out-of-bounds-diagnostics.c
index 4f5d0d0bc91c54a..da1573665fa7951 100644
--- a/clang/test/Analysis/out-of-bounds-diagnostics.c
+++ b/clang/test/Analysis/out-of-bounds-diagnostics.c
@@ -107,12 +107,8 @@ void *alloca(size_t size);
int allocaRegion(void) {
int *mem = (int*)alloca(2*sizeof(int));
mem[3] = -2;
- // expected-warning@-1 {{Out of bound access to memory after the end of the heap area}}
- // expected-note@-2 {{Access of the heap area at index 3, while it holds only 2 'int' elements}}
- // FIXME: this should be
- // {{Out of bound access to memory after the end of the memory returned by 'alloca'}}
- // {{Access of the memory returned by 'alloca' at index 3, while it holds only 2 'int' elements}}
- // but apparently something models 'alloca' as if it was allocating on the heap
+ // expected-warning@-1 {{Out of bound access to memory after the end of the memory returned by 'alloca'}}
+ // expected-note@-2 {{Access of the memory returned by 'alloca' at index 3, while it holds only 2 'int' elements}}
return *mem;
}
|
Hmm, it really worries me that To the best of my knowledge, the only legal way to use |
char *p = alloca(0); | ||
return *p; // expected-warning {{Use of memory allocated with size zero}} | ||
return *p; // expected-warning {{Undefined or garbage value returned to caller}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we'd have no warning in case of
char CheckUseZeroAllocatedAndInitialized(void) {
char *p = alloca(0);
*p = 4;
return *p;
}
? Might be worth testing.
(It's probably not hard to fix it as well? It's not like AllocaRegion
is special when it comes to being able to carry dynamic extent?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll test that. (Edit: I tried and the error disappears if I add *p = 4
. However, as I write below, ArrayBoundV2 is already checking this kind of overflow, and that checker produces an useful error message even in the case when *p = 4
is added.)
Unfortunately this "allocated with size zero" report is based on the private "Symbol -> state enum" map that's maintained by MallocChecker (so it's independent of the dynamic extent). I'd guess that switching to dynamic extent wouldn't be too difficult and it could simplify the code, but I think that belongs to a separate commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also note that the report "Use of memory allocated with size zero" is redundant with ArrayBoundV2, which detects and reports that the offset (0) is not smaller than the extent (also 0). Based on this I'm not sure that it's useful to maintain this "size zero" special case. What do you think about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if it is not the real question, what we are to do with the 0-size alloca
calls, but just to highlight some practical concerns, I found these sources:
https://discourse.llvm.org/t/malloc-free-and-alloca-with-zero-size/9284/3
https://stackoverflow.com/questions/8036654/what-does-alloca0-do-and-return-on-various-platforms
So alloca(0)
sometimes has a special meaning. If we can give more specific error messages in these cases, I would prefer to handle those error messages in the more specific checker.
Even if ArrayBoundV2 has more user-friendly and mature error reporting (and would cover this case strictly speaking), making this more specific checker emit better diagnostics as well is something worth considering IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that alloca()
in general and alloca(0)
in particular can mean many things, and I don't think that it's worth to create a specific error message because I cannot say anything concrete in it. This is a nonstandard function, and while we can model its "basic" behavior, I think that we shouldn't try to deal with its corner cases.
I agree and there's already a FIXME which suggests moving this modeling step into an EvalCall callback. If you feel that this is an important improvement, I could create a separate commit that does it.
I was thinking about an alternative solution: we could turn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, added two remarks inline, but those can be separate patches as well.
auto Size = Call.getArgSVal(0); | ||
if (auto DefSize = Size.getAs<DefinedOrUnknownSVal>()) { | ||
state = setDynamicExtent(state, R.getRegion(), *DefSize, SVB); | ||
// FIXME: perhaps the following transition should be moved out of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would go with first asserting that the Size is DefinedOrUnknown anyway, and if we have a crash with a reproducer, then we can add the if and the test case for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked the situation with an assert and a straightforward testcase, and it turns out that currently the getAs()
always succeeds, because core.CallAndMessage
stops the execution with a bug report when an Undefined
argument is passed to a function. (However, if I disabled this functionality of core.CallAndMessage
, the test -- obviously -- triggered the assertion.)
Based on this, I decided to keep the current defensive check, because I didn't want to add a hidden and marginal dependency relationship between these two checkers. I know that we may assume that core
checkers are always on, but that doesn't mean that I should hide landmines like assert(core.CallAndMessage is active and works as I assume)
in unrelated checkers.
I updated the comment to explain this situation, and I also moved the addTransition()
call out of the if
because that's the more logical place for that method call and this "supporting fire" from core.CallAndMessage
guarantees that this change won't have unintended side effects (because the if
succeeds in normal operation).
char *p = alloca(0); | ||
return *p; // expected-warning {{Use of memory allocated with size zero}} | ||
return *p; // expected-warning {{Undefined or garbage value returned to caller}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if it is not the real question, what we are to do with the 0-size alloca
calls, but just to highlight some practical concerns, I found these sources:
https://discourse.llvm.org/t/malloc-free-and-alloca-with-zero-size/9284/3
https://stackoverflow.com/questions/8036654/what-does-alloca0-do-and-return-on-various-platforms
So alloca(0)
sometimes has a special meaning. If we can give more specific error messages in these cases, I would prefer to handle those error messages in the more specific checker.
Even if ArrayBoundV2 has more user-friendly and mature error reporting (and would cover this case strictly speaking), making this more specific checker emit better diagnostics as well is something worth considering IMO.
@haoNoQ What do you think about this commit? May I merge it? |
I think the original Edit: we already have a TODO, but I think that one does not really do justice to the urgency of the problem, it does not mention that doing bindings in checkers outside of |
The difficulty is that we switch from the It would be possible to reintroduce the original
I'd say that "non-portable" and "undesirable in most cases" is true for practically all applications of
I'll spice up that TODO note; and after merging this commit I could start working on fixing it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I'm in favor of this change.
On the other hand, I'd urge for not to regress on the diagnostics.
To me, alloca
is like a VLA; which is prone to misuses, thus the edge-cases count there (esp. if tainted).
I'm merging this commit in its current shape because even if I'd reimplement a warning for the "use of Also note that these messages are coming from a Anyway, I think that this is a very minor and secondary feature and the only reason why we're talking about it is that it was accidentally working when we modeled the zero-allocated alloca region as a zero-allocated heap region. |
...to model the results of alloca() and _alloca() calls. Previously it acted as if these functions were returning memory from the heap, which led to alpha.security.ArrayBoundV2 producing incorrect messages.