Skip to content

[analyzer] Let the checkers query upper and lower bounds on symbols #74141

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 4, 2023

Conversation

NagyDonat
Copy link
Contributor

This commit extends the class SValBuilder with the methods getMinValue() and getMaxValue() to that work like SValBuilder::getKnownValue() but return the minimal/maximal possible value the SVal is not perfectly constrained.

This extension of the ConstraintManager API is discussed at: https://discourse.llvm.org/t/expose-the-inferred-range-information-in-warning-messages/75192

As a simple proof-of-concept application of this new API, this commit extends a message from core.BitwiseShift with some range information that reports the assumptions of the analyzer.

My main motivation for adding these methods is that I'll also want to use them in ArrayBoundCheckerV2 to make the error messages less awkward, but I'm starting with this simpler and less important usecase because I want to avoid merge conflicts with my other commit #72107 which is currently under review.

The testcase too_large_right_operand_compound() shows a situation where querying the range information does not work (and the extra information is not added to the error message). This also affects the debug utility clang_analyzer_value(), so the problem isn't in the fresh code. I'll do some investigations to resolve this, but I think that this commit is a step forward even with this limitation.

This commit extends the class `SValBuilder` with the methods
`getMinValue()` and `getMaxValue()` to that work like
`SValBuilder::getKnownValue()` but return the minimal/maximal possible
value the `SVal` is not perfectly constrained.

This extension of the ConstraintManager API is discussed at:
https://discourse.llvm.org/t/expose-the-inferred-range-information-in-warning-messages/75192

As a simple proof-of-concept application of this new API, this commit
extends a message from `core.BitwiseShift` with some range information
that reports the assumptions of the analyzer.

My main motivation for adding these methods is that I'll also want to
use them in `ArrayBoundCheckerV2` to make the error messages less
awkward, but I'm starting with this simpler and less important usecase
because I want to avoid merge conflicts with my other commit
llvm#72107 which is currently under
review.

The testcase `too_large_right_operand_compound()` shows a situation
where querying the range information does not work (and the extra
information is not added to the error message). This also affects the
debug utility `clang_analyzer_value()`, so the problem isn't in the
fresh code. I'll do some investigations to resolve this, but I think
that this commit is a step forward even with this limitation.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Dec 1, 2023
@llvmbot
Copy link
Member

llvmbot commented Dec 1, 2023

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: None (DonatNagyE)

Changes

This commit extends the class SValBuilder with the methods getMinValue() and getMaxValue() to that work like SValBuilder::getKnownValue() but return the minimal/maximal possible value the SVal is not perfectly constrained.

This extension of the ConstraintManager API is discussed at: https://discourse.llvm.org/t/expose-the-inferred-range-information-in-warning-messages/75192

As a simple proof-of-concept application of this new API, this commit extends a message from core.BitwiseShift with some range information that reports the assumptions of the analyzer.

My main motivation for adding these methods is that I'll also want to use them in ArrayBoundCheckerV2 to make the error messages less awkward, but I'm starting with this simpler and less important usecase because I want to avoid merge conflicts with my other commit #72107 which is currently under review.

The testcase too_large_right_operand_compound() shows a situation where querying the range information does not work (and the extra information is not added to the error message). This also affects the debug utility clang_analyzer_value(), so the problem isn't in the fresh code. I'll do some investigations to resolve this, but I think that this commit is a step forward even with this limitation.


Full diff: https://github.com/llvm/llvm-project/pull/74141.diff

6 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/ConstraintManager.h (+16)
  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h (+8)
  • (modified) clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp (+11-5)
  • (modified) clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp (+22)
  • (modified) clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp (+50-6)
  • (modified) clang/test/Analysis/bitwise-shift-common.c (+11-3)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ConstraintManager.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ConstraintManager.h
index 4e94ad49a783360..4de04bc4d397ac4 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ConstraintManager.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ConstraintManager.h
@@ -110,6 +110,22 @@ class ConstraintManager {
     return nullptr;
   }
 
+  /// Attempt to return the minimal possible value for a given symbol. Note
+  /// that a ConstraintManager is not obligated to return a lower bound, it may
+  /// also return nullptr.
+  virtual const llvm::APSInt *getSymMinVal(ProgramStateRef state,
+                                           SymbolRef sym) const {
+    return nullptr;
+  }
+
+  /// Attempt to return the minimal possible value for a given symbol. Note
+  /// that a ConstraintManager is not obligated to return a lower bound, it may
+  /// also return nullptr.
+  virtual const llvm::APSInt *getSymMaxVal(ProgramStateRef state,
+                                           SymbolRef sym) const {
+    return nullptr;
+  }
+
   /// Scan all symbols referenced by the constraints. If the symbol is not
   /// alive, remove it.
   virtual ProgramStateRef removeDeadBindings(ProgramStateRef state,
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
index a64cf7ae4efcb82..d7cff49036cb810 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
@@ -110,6 +110,14 @@ class SValBuilder {
   /// that value is returned. Otherwise, returns NULL.
   virtual const llvm::APSInt *getKnownValue(ProgramStateRef state, SVal val) = 0;
 
+  /// Tries to get the minimal possible (integer) value of a given SVal. If the
+  /// constraint manager cannot provide an useful answer, this returns NULL.
+  virtual const llvm::APSInt *getMinValue(ProgramStateRef state, SVal val) = 0;
+
+  /// Tries to get the maximal possible (integer) value of a given SVal. If the
+  /// constraint manager cannot provide an useful answer, this returns NULL.
+  virtual const llvm::APSInt *getMaxValue(ProgramStateRef state, SVal val) = 0;
+
   /// Simplify symbolic expressions within a given SVal. Return an SVal
   /// that represents the same value, but is hopefully easier to work with
   /// than the original SVal.
diff --git a/clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp
index 8a933d124d7700d..d4aa9fa1339f47c 100644
--- a/clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp
@@ -172,18 +172,24 @@ BugReportPtr BitwiseShiftValidator::checkOvershift() {
 
   const SVal Right = Ctx.getSVal(operandExpr(OperandSide::Right));
 
-  std::string RightOpStr = "";
+  std::string RightOpStr = "", LowerBoundStr = "";
   if (auto ConcreteRight = Right.getAs<nonloc::ConcreteInt>())
     RightOpStr = formatv(" '{0}'", ConcreteRight->getValue());
+  else {
+    SValBuilder &SVB = Ctx.getSValBuilder();
+    if (const llvm::APSInt *MinRight = SVB.getMinValue(FoldedState, Right)) {
+      LowerBoundStr = formatv(" >= {0},", MinRight->getExtValue());
+    }
+  }
 
   std::string ShortMsg = formatv(
       "{0} shift{1}{2} overflows the capacity of '{3}'",
       isLeftShift() ? "Left" : "Right", RightOpStr.empty() ? "" : " by",
       RightOpStr, LHSTy.getAsString());
-  std::string Msg =
-      formatv("The result of {0} shift is undefined because the right "
-              "operand{1} is not smaller than {2}, the capacity of '{3}'",
-              shiftDir(), RightOpStr, LHSBitWidth, LHSTy.getAsString());
+  std::string Msg = formatv(
+      "The result of {0} shift is undefined because the right "
+      "operand{1} is{2} not smaller than {3}, the capacity of '{4}'",
+      shiftDir(), RightOpStr, LowerBoundStr, LHSBitWidth, LHSTy.getAsString());
   return createBugReport(ShortMsg, Msg);
 }
 
diff --git a/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp b/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
index 5de99384449a4c8..25d066c4652f2bf 100644
--- a/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
+++ b/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
@@ -1876,6 +1876,12 @@ class RangeConstraintManager : public RangedConstraintManager {
   const llvm::APSInt *getSymVal(ProgramStateRef State,
                                 SymbolRef Sym) const override;
 
+  const llvm::APSInt *getSymMinVal(ProgramStateRef State,
+                                   SymbolRef Sym) const override;
+
+  const llvm::APSInt *getSymMaxVal(ProgramStateRef State,
+                                   SymbolRef Sym) const override;
+
   ProgramStateRef removeDeadBindings(ProgramStateRef State,
                                      SymbolReaper &SymReaper) override;
 
@@ -2863,6 +2869,22 @@ const llvm::APSInt *RangeConstraintManager::getSymVal(ProgramStateRef St,
   return T ? T->getConcreteValue() : nullptr;
 }
 
+const llvm::APSInt *RangeConstraintManager::getSymMinVal(ProgramStateRef St,
+                                                         SymbolRef Sym) const {
+  const RangeSet *T = getConstraint(St, Sym);
+  if (!T || T->isEmpty())
+    return nullptr;
+  return &T->getMinValue();
+}
+
+const llvm::APSInt *RangeConstraintManager::getSymMaxVal(ProgramStateRef St,
+                                                         SymbolRef Sym) const {
+  const RangeSet *T = getConstraint(St, Sym);
+  if (!T || T->isEmpty())
+    return nullptr;
+  return &T->getMaxValue();
+}
+
 //===----------------------------------------------------------------------===//
 //                Remove dead symbols from existing constraints
 //===----------------------------------------------------------------------===//
diff --git a/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp b/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
index 0f2d6c8fc80bb02..45e48d435aca6a2 100644
--- a/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
+++ b/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
@@ -28,7 +28,11 @@ class SimpleSValBuilder : public SValBuilder {
   // returns NULL.
   // This is an implementation detail. Checkers should use `getKnownValue()`
   // instead.
-  const llvm::APSInt *getConstValue(ProgramStateRef state, SVal V);
+  static const llvm::APSInt *getConstValue(ProgramStateRef state, SVal V);
+
+  // Helper function that returns the value stored in a nonloc::ConcreteInt or
+  // loc::ConcreteInt.
+  static const llvm::APSInt *getConcreteValue(SVal V);
 
   // With one `simplifySValOnce` call, a compound symbols might collapse to
   // simpler symbol tree that is still possible to further simplify. Thus, we
@@ -76,6 +80,16 @@ class SimpleSValBuilder : public SValBuilder {
   /// (integer) value, that value is returned. Otherwise, returns NULL.
   const llvm::APSInt *getKnownValue(ProgramStateRef state, SVal V) override;
 
+  /// Evaluates a given SVal by recursively evaluating and simplifying the
+  /// children SVals, then returns its minimal possible (integer) value. If the
+  /// constraint manager cannot provide a meaningful answer, this returns NULL.
+  const llvm::APSInt *getMinValue(ProgramStateRef state, SVal V) override;
+
+  /// Evaluates a given SVal by recursively evaluating and simplifying the
+  /// children SVals, then returns its maximal possible (integer) value. If the
+  /// constraint manager cannot provide a meaningful answer, this returns NULL.
+  const llvm::APSInt *getMaxValue(ProgramStateRef state, SVal V) override;
+
   SVal simplifySVal(ProgramStateRef State, SVal V) override;
 
   SVal MakeSymIntVal(const SymExpr *LHS, BinaryOperator::Opcode op,
@@ -1182,18 +1196,22 @@ SVal SimpleSValBuilder::evalBinOpLN(ProgramStateRef state,
 
 const llvm::APSInt *SimpleSValBuilder::getConstValue(ProgramStateRef state,
                                                      SVal V) {
-  if (V.isUnknownOrUndef())
-    return nullptr;
+  if (const llvm::APSInt *Res = getConcreteValue(V))
+    return Res;
 
+  if (SymbolRef Sym = V.getAsSymbol())
+    return state->getConstraintManager().getSymVal(state, Sym);
+
+  return nullptr;
+}
+
+const llvm::APSInt *SimpleSValBuilder::getConcreteValue(SVal V) {
   if (std::optional<loc::ConcreteInt> X = V.getAs<loc::ConcreteInt>())
     return &X->getValue();
 
   if (std::optional<nonloc::ConcreteInt> X = V.getAs<nonloc::ConcreteInt>())
     return &X->getValue();
 
-  if (SymbolRef Sym = V.getAsSymbol())
-    return state->getConstraintManager().getSymVal(state, Sym);
-
   return nullptr;
 }
 
@@ -1202,6 +1220,32 @@ const llvm::APSInt *SimpleSValBuilder::getKnownValue(ProgramStateRef state,
   return getConstValue(state, simplifySVal(state, V));
 }
 
+const llvm::APSInt *SimpleSValBuilder::getMinValue(ProgramStateRef state,
+                                                   SVal V) {
+  V = simplifySVal(state, V);
+
+  if (const llvm::APSInt *Res = getConcreteValue(V))
+    return Res;
+
+  if (SymbolRef Sym = V.getAsSymbol())
+    return state->getConstraintManager().getSymMinVal(state, Sym);
+
+  return nullptr;
+}
+
+const llvm::APSInt *SimpleSValBuilder::getMaxValue(ProgramStateRef state,
+                                                   SVal V) {
+  V = simplifySVal(state, V);
+
+  if (const llvm::APSInt *Res = getConcreteValue(V))
+    return Res;
+
+  if (SymbolRef Sym = V.getAsSymbol())
+    return state->getConstraintManager().getSymMaxVal(state, Sym);
+
+  return nullptr;
+}
+
 SVal SimpleSValBuilder::simplifyUntilFixpoint(ProgramStateRef State, SVal Val) {
   SVal SimplifiedVal = simplifySValOnce(State, Val);
   while (SimplifiedVal != Val) {
diff --git a/clang/test/Analysis/bitwise-shift-common.c b/clang/test/Analysis/bitwise-shift-common.c
index fe49f0f992072dd..39108bc838bf277 100644
--- a/clang/test/Analysis/bitwise-shift-common.c
+++ b/clang/test/Analysis/bitwise-shift-common.c
@@ -1,4 +1,5 @@
 // RUN: %clang_analyze_cc1 -analyzer-checker=core.BitwiseShift \
+// RUN:    -analyzer-checker=debug.ExprInspection \
 // RUN:    -analyzer-output=text -verify \
 // RUN:    -triple x86_64-pc-linux-gnu -x c %s \
 // RUN:    -Wno-shift-count-negative -Wno-shift-negative-value \
@@ -6,6 +7,7 @@
 // RUN:    -Wno-shift-sign-overflow
 //
 // RUN: %clang_analyze_cc1 -analyzer-checker=core.BitwiseShift \
+// RUN:    -analyzer-checker=debug.ExprInspection \
 // RUN:    -analyzer-config core.BitwiseShift:Pedantic=true \
 // RUN:    -analyzer-output=text -verify \
 // RUN:    -triple x86_64-pc-linux-gnu -x c++ -std=c++20 %s \
@@ -85,17 +87,23 @@ int too_large_right_operand_symbolic(int left, int right) {
     return 0;
   return left >> right;
   // expected-warning@-1 {{Right shift overflows the capacity of 'int'}}
-  // expected-note@-2 {{The result of right shift is undefined because the right operand is not smaller than 32, the capacity of 'int'}}
-  // NOTE: the messages of the checker are a bit vague in this case, but the
-  // tracking of the variables reveals our knowledge about them.
+  // expected-note@-2 {{The result of right shift is undefined because the right operand is >= 32, not smaller than 32, the capacity of 'int'}}
 }
 
+void clang_analyzer_value(int);
 int too_large_right_operand_compound(unsigned short arg) {
   // Note: this would be valid code with an 'unsigned int' because
   // unsigned addition is allowed to overflow.
+  clang_analyzer_value(32+arg);
+  // expected-warning@-1 {{32s:{ [-2147483648, 2147483647] }}
+  // expected-note@-2 {{32s:{ [-2147483648, 2147483647] }}
   return 1 << (32 + arg);
   // expected-warning@-1 {{Left shift overflows the capacity of 'int'}}
   // expected-note@-2 {{The result of left shift is undefined because the right operand is not smaller than 32, the capacity of 'int'}}
+  // FIXME: this message should be
+  //     {{The result of left shift is undefined because the right operand is >= 32, not smaller than 32, the capacity of 'int'}}
+  // but for some reason neither the new logic, nor debug.ExprInspection and
+  // clang_analyzer_value reports this range information.
 }
 
 // TEST STATE UPDATES

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch makes sense to me.

@NagyDonat
Copy link
Contributor Author

I tested this patch on four open source projects (openssl, ffmpeg, postgres, contour) and the results were not affected (and there were no crashes). This is not surprising, because released open source code doesn't contain too many bugs and the message change only affects a specific subcase.

@NagyDonat NagyDonat merged commit 67f387c into llvm:main Dec 4, 2023
@NagyDonat NagyDonat deleted the expose_symbol_bounds branch December 4, 2023 16:19
@Xazax-hun
Copy link
Collaborator

Does this work with Z3 as the solver? Since Z3 is not officially supported, I think it is not a blocker, but I'd love to see some FIXMEs/tickets opened in that case.

@NagyDonat
Copy link
Contributor Author

With Z3 the new functionality does not work, but its absence is handled gracefully (the new methods return NULL and core.BitwiseShift emits the "old" message).

The Z3 solver is very "shaky" ground for me (if I understand it correctly it's too slow for stand-alone use, and so I didn't pay too much attention to it), so I don't have the confidence to create FIXMEs/tickets (my understanding of the goals is too vague and I don't understand the priorities/importance at all).

@steakhal
Copy link
Contributor

steakhal commented Dec 5, 2023

My take is that the z3-based solver is crashing all over the place. So its not just slower. We anyways don't have CI checks for it.
Given all these, I'd rather not put more burden to the issue tracker regarding this. I'd consider it if these issues wouldn't be present though, but we are really far from that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants