-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[SCEVExpander] Preserve gep nuw during expansion #102133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-llvm-transforms Author: Nikita Popov (nikic) ChangesWhen expanding SCEV adds to geps, transfer the nuw flag to the resulting gep. (Note that this doesn't apply to IV increment GEPs, which go through a different code path.) Patch is 22.34 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/102133.diff 15 Files Affected:
diff --git a/llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h b/llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h
index 62c1e15a9a60e..48b923c7c2edc 100644
--- a/llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h
+++ b/llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h
@@ -450,7 +450,7 @@ class SCEVExpander : public SCEVVisitor<SCEVExpander, Value *> {
/// Expand a SCEVAddExpr with a pointer type into a GEP instead of using
/// ptrtoint+arithmetic+inttoptr.
- Value *expandAddToGEP(const SCEV *Op, Value *V);
+ Value *expandAddToGEP(const SCEV *Op, Value *V, SCEV::NoWrapFlags Flags);
/// Find a previous Value in ExprValueMap for expand.
/// DropPoisonGeneratingInsts is populated with instructions for which
diff --git a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
index c7d758aa575e6..461a36d44695c 100644
--- a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
+++ b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
@@ -347,16 +347,19 @@ Value *SCEVExpander::InsertBinop(Instruction::BinaryOps Opcode,
/// loop-invariant portions of expressions, after considering what
/// can be folded using target addressing modes.
///
-Value *SCEVExpander::expandAddToGEP(const SCEV *Offset, Value *V) {
+Value *SCEVExpander::expandAddToGEP(const SCEV *Offset, Value *V,
+ SCEV::NoWrapFlags Flags) {
assert(!isa<Instruction>(V) ||
SE.DT.dominates(cast<Instruction>(V), &*Builder.GetInsertPoint()));
Value *Idx = expand(Offset);
+ GEPNoWrapFlags NW = (Flags & SCEV::FlagNUW) ? GEPNoWrapFlags::noUnsignedWrap()
+ : GEPNoWrapFlags::none();
// Fold a GEP with constant operands.
if (Constant *CLHS = dyn_cast<Constant>(V))
if (Constant *CRHS = dyn_cast<Constant>(Idx))
- return Builder.CreatePtrAdd(CLHS, CRHS);
+ return Builder.CreatePtrAdd(CLHS, CRHS, "", NW);
// Do a quick scan to see if we have this GEP nearby. If so, reuse it.
unsigned ScanLimit = 6;
@@ -393,7 +396,7 @@ Value *SCEVExpander::expandAddToGEP(const SCEV *Offset, Value *V) {
}
// Emit a GEP.
- return Builder.CreatePtrAdd(V, Idx, "scevgep");
+ return Builder.CreatePtrAdd(V, Idx, "scevgep", NW);
}
/// PickMostRelevantLoop - Given two loops pick the one that's most relevant for
@@ -540,7 +543,7 @@ Value *SCEVExpander::visitAddExpr(const SCEVAddExpr *S) {
X = SE.getSCEV(U->getValue());
NewOps.push_back(X);
}
- Sum = expandAddToGEP(SE.getAddExpr(NewOps), Sum);
+ Sum = expandAddToGEP(SE.getAddExpr(NewOps), Sum, S->getNoWrapFlags());
} else if (Op->isNonConstantNegative()) {
// Instead of doing a negate and add, just do a subtract.
Value *W = expand(SE.getNegativeSCEV(Op));
@@ -1242,7 +1245,8 @@ Value *SCEVExpander::visitAddRecExpr(const SCEVAddRecExpr *S) {
if (!S->getStart()->isZero()) {
if (isa<PointerType>(S->getType())) {
Value *StartV = expand(SE.getPointerBase(S));
- return expandAddToGEP(SE.removePointerBase(S), StartV);
+ return expandAddToGEP(SE.removePointerBase(S), StartV,
+ S->getNoWrapFlags(SCEV::FlagNUW));
}
SmallVector<const SCEV *, 4> NewOps(S->operands());
diff --git a/llvm/test/Analysis/ScalarEvolution/scev-expander-reuse-gep.ll b/llvm/test/Analysis/ScalarEvolution/scev-expander-reuse-gep.ll
index 32d6a958198e6..579f5149340b5 100644
--- a/llvm/test/Analysis/ScalarEvolution/scev-expander-reuse-gep.ll
+++ b/llvm/test/Analysis/ScalarEvolution/scev-expander-reuse-gep.ll
@@ -3,7 +3,7 @@
; RUN: opt -mtriple=i386-apple-macosx10.12.0 < %s -loop-reduce -S | FileCheck %s
; CHECK: %ptr4.ptr1 = select i1 %cmp.i, ptr %ptr4, ptr %ptr1
-; CHECK-NEXT: %scevgep = getelementptr i8, ptr %ptr4.ptr1, i32 1
+; CHECK-NEXT: %scevgep = getelementptr nuw i8, ptr %ptr4.ptr1, i32 1
; CHECK-NEXT: br label %while.cond.i
target datalayout = "e-m:o-p:32:32-f64:32:64-f80:128-n8:16:32-S128"
diff --git a/llvm/test/Transforms/LoopIdiom/basic-address-space.ll b/llvm/test/Transforms/LoopIdiom/basic-address-space.ll
index de55a46a164e6..9945aeb2021a9 100644
--- a/llvm/test/Transforms/LoopIdiom/basic-address-space.ll
+++ b/llvm/test/Transforms/LoopIdiom/basic-address-space.ll
@@ -13,7 +13,7 @@ define void @test10(ptr addrspace(2) %X) nounwind ssp {
; CHECK: bb.nph:
; CHECK-NEXT: [[I_04:%.*]] = phi i16 [ 0, [[ENTRY:%.*]] ], [ [[INC12:%.*]], [[FOR_INC10:%.*]] ]
; CHECK-NEXT: [[TMP0:%.*]] = mul nuw nsw i16 [[I_04]], 100
-; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr addrspace(2) [[X]], i16 [[TMP0]]
+; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr nuw i8, ptr addrspace(2) [[X]], i16 [[TMP0]]
; CHECK-NEXT: br label [[FOR_BODY5:%.*]]
; CHECK: for.body5:
; CHECK-NEXT: [[J_02:%.*]] = phi i16 [ 0, [[BB_NPH]] ], [ [[INC:%.*]], [[FOR_BODY5]] ]
diff --git a/llvm/test/Transforms/LoopIdiom/basic.ll b/llvm/test/Transforms/LoopIdiom/basic.ll
index 912b1542a267b..e6fc11625317b 100644
--- a/llvm/test/Transforms/LoopIdiom/basic.ll
+++ b/llvm/test/Transforms/LoopIdiom/basic.ll
@@ -481,7 +481,7 @@ define void @test10(ptr %X) nounwind ssp {
; CHECK-NEXT: [[INDVAR:%.*]] = phi i64 [ [[INDVAR_NEXT:%.*]], [[FOR_INC10:%.*]] ], [ 0, [[ENTRY:%.*]] ]
; CHECK-NEXT: [[I_04:%.*]] = phi i32 [ 0, [[ENTRY]] ], [ [[INC12:%.*]], [[FOR_INC10]] ]
; CHECK-NEXT: [[TMP0:%.*]] = mul nuw nsw i64 [[INDVAR]], 100
-; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[X]], i64 [[TMP0]]
+; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr nuw i8, ptr [[X]], i64 [[TMP0]]
; CHECK-NEXT: br label [[FOR_BODY5:%.*]]
; CHECK: for.body5:
; CHECK-NEXT: [[J_02:%.*]] = phi i32 [ 0, [[BB_NPH]] ], [ [[INC:%.*]], [[FOR_BODY5]] ]
@@ -682,7 +682,7 @@ define void @PR14241(ptr %s, i64 %size) {
; CHECK-NEXT: entry:
; CHECK-NEXT: [[END_IDX:%.*]] = add i64 [[SIZE:%.*]], -1
; CHECK-NEXT: [[END_PTR:%.*]] = getelementptr inbounds i32, ptr [[S:%.*]], i64 [[END_IDX]]
-; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[S]], i64 4
+; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr nuw i8, ptr [[S]], i64 4
; CHECK-NEXT: [[TMP0:%.*]] = shl i64 [[SIZE]], 2
; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[TMP0]], -8
; CHECK-NEXT: [[TMP2:%.*]] = lshr i64 [[TMP1]], 2
diff --git a/llvm/test/Transforms/LoopIdiom/memset-pr52104.ll b/llvm/test/Transforms/LoopIdiom/memset-pr52104.ll
index 0c7d1ec59b2cd..0cacb74bf01be 100644
--- a/llvm/test/Transforms/LoopIdiom/memset-pr52104.ll
+++ b/llvm/test/Transforms/LoopIdiom/memset-pr52104.ll
@@ -6,12 +6,12 @@ define void @f(ptr nocapture nonnull align 4 dereferenceable(20) %0, i32 %1) loc
; CHECK-NEXT: [[TMP3:%.*]] = trunc i32 [[TMP1:%.*]] to i2
; CHECK-NEXT: [[TMP4:%.*]] = zext i2 [[TMP3]] to i64
; CHECK-NEXT: [[TMP5:%.*]] = shl nuw nsw i64 [[TMP4]], 2
-; CHECK-NEXT: [[UGLYGEP:%.*]] = getelementptr i8, ptr [[TMP0:%.*]], i64 [[TMP5]]
+; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr nuw i8, ptr [[TMP0:%.*]], i64 [[TMP5]]
; CHECK-NEXT: [[TMP6:%.*]] = sub i2 -1, [[TMP3]]
; CHECK-NEXT: [[TMP7:%.*]] = zext i2 [[TMP6]] to i64
; CHECK-NEXT: [[TMP8:%.*]] = shl nuw nsw i64 [[TMP7]], 2
; CHECK-NEXT: [[TMP9:%.*]] = add nuw nsw i64 [[TMP8]], 4
-; CHECK-NEXT: call void @llvm.memset.p0.i64(ptr align 4 [[UGLYGEP]], i8 0, i64 [[TMP9]], i1 false)
+; CHECK-NEXT: call void @llvm.memset.p0.i64(ptr align 4 [[SCEVGEP]], i8 0, i64 [[TMP9]], i1 false)
; CHECK-NEXT: br label [[TMP10:%.*]]
; CHECK: 10:
; CHECK-NEXT: [[TMP11:%.*]] = phi i32 [ [[TMP15:%.*]], [[TMP10]] ], [ [[TMP1]], [[TMP2:%.*]] ]
diff --git a/llvm/test/Transforms/LoopStrengthReduce/X86/2012-01-13-phielim.ll b/llvm/test/Transforms/LoopStrengthReduce/X86/2012-01-13-phielim.ll
index c4aa6c7725d41..492746615d846 100644
--- a/llvm/test/Transforms/LoopStrengthReduce/X86/2012-01-13-phielim.ll
+++ b/llvm/test/Transforms/LoopStrengthReduce/X86/2012-01-13-phielim.ll
@@ -96,7 +96,7 @@ define void @test2(i32 %n) nounwind uwtable {
; CHECK-NEXT: br label [[FOR_COND468:%.*]]
; CHECK: for.cond468:
; CHECK-NEXT: [[LSR_IV1:%.*]] = phi i32 [ 1, [[FOR_COND468_PREHEADER]] ], [ [[LSR_IV_NEXT:%.*]], [[IF_THEN477:%.*]] ]
-; CHECK-NEXT: [[LSR_IV:%.*]] = phi ptr [ getelementptr inbounds (i8, ptr @tags, i64 8), [[FOR_COND468_PREHEADER]] ], [ [[SCEVGEP:%.*]], [[IF_THEN477]] ]
+; CHECK-NEXT: [[LSR_IV:%.*]] = phi ptr [ getelementptr inbounds nuw (i8, ptr @tags, i64 8), [[FOR_COND468_PREHEADER]] ], [ [[SCEVGEP:%.*]], [[IF_THEN477]] ]
; CHECK-NEXT: [[K_0:%.*]] = load i32, ptr [[LSR_IV]], align 4
; CHECK-NEXT: [[CMP469:%.*]] = icmp slt i32 [[LSR_IV1]], [[N:%.*]]
; CHECK-NEXT: br i1 [[CMP469]], label [[FOR_BODY471:%.*]], label [[FOR_INC498_PREHEADER:%.*]]
diff --git a/llvm/test/Transforms/LoopStrengthReduce/post-inc-icmpzero.ll b/llvm/test/Transforms/LoopStrengthReduce/post-inc-icmpzero.ll
index f73f92246fff2..d0a910a0758a9 100644
--- a/llvm/test/Transforms/LoopStrengthReduce/post-inc-icmpzero.ll
+++ b/llvm/test/Transforms/LoopStrengthReduce/post-inc-icmpzero.ll
@@ -18,7 +18,7 @@ define void @_Z15IntegerToStringjjR7Vector2(i32 %i, i32 %radix, ptr nocapture %r
; CHECK-NEXT: [[ADD_PTR:%.*]] = getelementptr [33 x i16], ptr [[BUFFER]], i64 0, i64 33
; CHECK-NEXT: [[SUB_PTR_LHS_CAST:%.*]] = ptrtoint ptr [[ADD_PTR]] to i64
; CHECK-NEXT: [[SUB_PTR_RHS_CAST:%.*]] = ptrtoint ptr [[ADD_PTR]] to i64
-; CHECK-NEXT: [[SCEVGEP3:%.*]] = getelementptr i8, ptr [[BUFFER]], i64 64
+; CHECK-NEXT: [[SCEVGEP3:%.*]] = getelementptr nuw i8, ptr [[BUFFER]], i64 64
; CHECK-NEXT: br label [[DO_BODY:%.*]]
; CHECK: do.body:
; CHECK-NEXT: [[LSR_IV10:%.*]] = phi i64 [ [[LSR_IV_NEXT11:%.*]], [[DO_BODY]] ], [ -1, [[ENTRY:%.*]] ]
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll b/llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll
index d6794420c403f..b40018f093bc2 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll
@@ -1461,7 +1461,7 @@ define void @PR34743(ptr %a, ptr %b, i64 %n) #1 {
; CHECK-NEXT: [[TMP5:%.*]] = and i64 [[TMP4]], -4
; CHECK-NEXT: [[TMP6:%.*]] = getelementptr i8, ptr [[B:%.*]], i64 [[TMP5]]
; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[TMP6]], i64 4
-; CHECK-NEXT: [[SCEVGEP1:%.*]] = getelementptr i8, ptr [[A]], i64 2
+; CHECK-NEXT: [[SCEVGEP1:%.*]] = getelementptr nuw i8, ptr [[A]], i64 2
; CHECK-NEXT: [[TMP7:%.*]] = getelementptr i8, ptr [[A]], i64 [[TMP5]]
; CHECK-NEXT: [[SCEVGEP2:%.*]] = getelementptr i8, ptr [[TMP7]], i64 6
; CHECK-NEXT: [[BOUND0:%.*]] = icmp ugt ptr [[SCEVGEP2]], [[B]]
@@ -1521,10 +1521,10 @@ define void @PR34743(ptr %a, ptr %b, i64 %n) #1 {
; CHECK-NEXT: [[SCALAR_RECUR_INIT:%.*]] = phi i16 [ [[VECTOR_RECUR_EXTRACT]], [[MIDDLE_BLOCK]] ], [ [[DOTPRE]], [[ENTRY]] ], [ [[DOTPRE]], [[VECTOR_MEMCHECK]] ]
; CHECK-NEXT: br label [[LOOP:%.*]]
; CHECK: loop:
-; CHECK-NEXT: [[SCALAR_RECUR:%.*]] = phi i16 [ [[SCALAR_RECUR_INIT]], [[SCALAR_PH]] ], [ [[LOAD2:%.*]], [[LOOP]] ]
+; CHECK-NEXT: [[TMP33:%.*]] = phi i16 [ [[SCALAR_RECUR_INIT]], [[SCALAR_PH]] ], [ [[LOAD2:%.*]], [[LOOP]] ]
; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV2:%.*]], [[LOOP]] ]
; CHECK-NEXT: [[I:%.*]] = phi i64 [ [[BC_RESUME_VAL3]], [[SCALAR_PH]] ], [ [[I1:%.*]], [[LOOP]] ]
-; CHECK-NEXT: [[CONV:%.*]] = sext i16 [[SCALAR_RECUR]] to i32
+; CHECK-NEXT: [[CONV:%.*]] = sext i16 [[TMP33]] to i32
; CHECK-NEXT: [[I1]] = add nuw nsw i64 [[I]], 1
; CHECK-NEXT: [[IV1:%.*]] = or disjoint i64 [[IV]], 1
; CHECK-NEXT: [[IV2]] = add nuw nsw i64 [[IV]], 2
diff --git a/llvm/test/Transforms/LoopVectorize/induction.ll b/llvm/test/Transforms/LoopVectorize/induction.ll
index 45674acaae538..fafd14c1def8b 100644
--- a/llvm/test/Transforms/LoopVectorize/induction.ll
+++ b/llvm/test/Transforms/LoopVectorize/induction.ll
@@ -1561,7 +1561,7 @@ define void @scalarize_induction_variable_04(ptr %a, ptr %p, i32 %n) {
; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP2]], 2
; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
; CHECK: vector.memcheck:
-; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[P:%.*]], i64 4
+; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr nuw i8, ptr [[P:%.*]], i64 4
; CHECK-NEXT: [[TMP3:%.*]] = add i32 [[N]], -1
; CHECK-NEXT: [[TMP4:%.*]] = zext i32 [[TMP3]] to i64
; CHECK-NEXT: [[TMP5:%.*]] = shl nuw nsw i64 [[TMP4]], 3
@@ -1626,7 +1626,7 @@ define void @scalarize_induction_variable_04(ptr %a, ptr %p, i32 %n) {
; IND-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp eq i32 [[TMP0]], 0
; IND-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
; IND: vector.memcheck:
-; IND-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[P:%.*]], i64 4
+; IND-NEXT: [[SCEVGEP:%.*]] = getelementptr nuw i8, ptr [[P:%.*]], i64 4
; IND-NEXT: [[TMP3:%.*]] = add i32 [[N]], -1
; IND-NEXT: [[TMP4:%.*]] = zext i32 [[TMP3]] to i64
; IND-NEXT: [[TMP5:%.*]] = shl nuw nsw i64 [[TMP4]], 3
@@ -1689,7 +1689,7 @@ define void @scalarize_induction_variable_04(ptr %a, ptr %p, i32 %n) {
; UNROLL-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[TMP0]], 3
; UNROLL-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
; UNROLL: vector.memcheck:
-; UNROLL-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[P:%.*]], i64 4
+; UNROLL-NEXT: [[SCEVGEP:%.*]] = getelementptr nuw i8, ptr [[P:%.*]], i64 4
; UNROLL-NEXT: [[TMP3:%.*]] = add i32 [[N]], -1
; UNROLL-NEXT: [[TMP4:%.*]] = zext i32 [[TMP3]] to i64
; UNROLL-NEXT: [[TMP5:%.*]] = shl nuw nsw i64 [[TMP4]], 3
@@ -1766,7 +1766,7 @@ define void @scalarize_induction_variable_04(ptr %a, ptr %p, i32 %n) {
; UNROLL-NO-IC-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP2]], 4
; UNROLL-NO-IC-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
; UNROLL-NO-IC: vector.memcheck:
-; UNROLL-NO-IC-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[P:%.*]], i64 4
+; UNROLL-NO-IC-NEXT: [[SCEVGEP:%.*]] = getelementptr nuw i8, ptr [[P:%.*]], i64 4
; UNROLL-NO-IC-NEXT: [[TMP3:%.*]] = add i32 [[N]], -1
; UNROLL-NO-IC-NEXT: [[TMP4:%.*]] = zext i32 [[TMP3]] to i64
; UNROLL-NO-IC-NEXT: [[TMP5:%.*]] = shl nuw nsw i64 [[TMP4]], 3
@@ -1845,7 +1845,7 @@ define void @scalarize_induction_variable_04(ptr %a, ptr %p, i32 %n) {
; INTERLEAVE-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[TMP0]], 8
; INTERLEAVE-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
; INTERLEAVE: vector.memcheck:
-; INTERLEAVE-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[P:%.*]], i64 4
+; INTERLEAVE-NEXT: [[SCEVGEP:%.*]] = getelementptr nuw i8, ptr [[P:%.*]], i64 4
; INTERLEAVE-NEXT: [[TMP3:%.*]] = add i32 [[N]], -1
; INTERLEAVE-NEXT: [[TMP4:%.*]] = zext i32 [[TMP3]] to i64
; INTERLEAVE-NEXT: [[TMP5:%.*]] = shl nuw nsw i64 [[TMP4]], 3
diff --git a/llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll b/llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll
index 61ed955ea13e4..5fb5b705f0e17 100644
--- a/llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll
+++ b/llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll
@@ -1478,7 +1478,7 @@ define void @PR34743(ptr %a, ptr %b, i64 %n) {
; CHECK-NEXT: [[TMP3:%.*]] = and i64 [[TMP2]], -4
; CHECK-NEXT: [[TMP4:%.*]] = getelementptr i8, ptr [[B:%.*]], i64 [[TMP3]]
; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[TMP4]], i64 4
-; CHECK-NEXT: [[SCEVGEP1:%.*]] = getelementptr i8, ptr [[A]], i64 2
+; CHECK-NEXT: [[SCEVGEP1:%.*]] = getelementptr nuw i8, ptr [[A]], i64 2
; CHECK-NEXT: [[TMP5:%.*]] = getelementptr i8, ptr [[A]], i64 [[TMP3]]
; CHECK-NEXT: [[SCEVGEP2:%.*]] = getelementptr i8, ptr [[TMP5]], i64 6
; CHECK-NEXT: [[BOUND0:%.*]] = icmp ugt ptr [[SCEVGEP2]], [[B]]
diff --git a/llvm/test/Transforms/LoopVectorize/multiple-strides-vectorization.ll b/llvm/test/Transforms/LoopVectorize/multiple-strides-vectorization.ll
index 66a0c5fd807f2..a0cd3c64f2d77 100644
--- a/llvm/test/Transforms/LoopVectorize/multiple-strides-vectorization.ll
+++ b/llvm/test/Transforms/LoopVectorize/multiple-strides-vectorization.ll
@@ -38,12 +38,12 @@ define void @Test(ptr nocapture %obj, i64 %z) #0 {
; CHECK-NEXT: [[I:%.*]] = phi i64 [ 0, [[TMP0:%.*]] ], [ [[I_NEXT:%.*]], [[DOTOUTER:%.*]] ]
; CHECK-NEXT: [[TMP3:%.*]] = shl nuw nsw i64 [[I]], 7
; CHECK-NEXT: [[TMP4:%.*]] = add i64 [[TMP3]], 256
-; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[OBJ]], i64 [[TMP4]]
+; CHECK-NEXT: [[SCEVGEP:%.*]] = getelementptr nuw i8, ptr [[OBJ]], i64 [[TMP4]]
; CHECK-NEXT: [[TMP5:%.*]] = add i64 [[TMP2]], [[TMP3]]
; CHECK-NEXT: [[SCEVGEP1:%.*]] = getelementptr i8, ptr [[OBJ]], i64 [[TMP5]]
; CHECK-NEXT: [[TMP6:%.*]] = shl nuw nsw i64 [[I]], 2
; CHECK-NEXT: [[TMP7:%.*]] = add i64 [[TMP6]], 128
-; CHECK-NEXT: [[SCEVGEP3:%.*]] = getelementptr i8, ptr [[OBJ]], i64 [[TMP7]]
+; CHECK-NEXT: [[SCEVGEP3:%.*]] = getelementptr nuw i8, ptr [[OBJ]], i64 [[TMP7]]
; CHECK-NEXT: [[TMP8:%.*]] = add i64 [[TMP6]], 132
; CHECK-NEXT: [[SCEVGEP4:%.*]] = getelementptr i8, ptr [[OBJ]], i64 [[TMP8]]
; CHECK-NEXT: [[TMP9:%.*]] = getelementptr inbounds [[STRUCT_S:%.*]], ptr [[OBJ]], i64 0, i32 1, i64 [[I]]
@@ -107,12 +107,12 @@ define void @Test(ptr nocapture %obj, i64 %z) #0 {
; CHECK-NEXT: br i1 [[EXITCOND_INNER]], label [[DOTOUTER]], label [[DOTINNER]], !llvm.loop [[LOOP11:![0-9]+]]
;
; CHECK-HOIST-LABEL: @Test(
-; CHECK-HOIST-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[OBJ:%.*]], i64 256
+; CHECK-HOIST-NEXT: [[SCEVGEP:%.*]] = getelementptr nuw i8, ptr [[OBJ:%.*]], i64 256
; CHECK-HOIST-NEXT: [[TMP1:%.*]] = shl i64 [[Z:%.*]], 2
; CHECK-HOIST-NEXT: [[TMP2:%.*]] = add i64 [[TMP1]], 4224
; CHECK-HOIST-NEXT: [[SCEVGEP1:%.*]] = getelementptr i8, ptr [[OBJ]], i64 [[TMP2]]
; CHECK-HOIST-NEXT: [[SCEVGEP2:%.*]] = getelementptr i8, ptr [[OBJ]], i64 [[TMP1]]
-; CHECK-HOIST-NEXT: [[SCEVGEP3:%.*]] = getelementptr i8, ptr [[OBJ]], i64 128
+; CHECK-HOIST-NEXT: [[SCEVGEP3:%.*]] = getelementptr nuw i8, ptr [[OBJ]], i64 128
; CHECK-HOIST-NEXT: br label [[DOTOUTER_PREHEADER:%.*]]
; CHECK-HOIST: .outer.preheader:
; CHECK-HOIST-NEXT: [[I:%.*]] = phi i64 [ 0, [[TMP0:%.*]] ], [ [[I_NEXT:%.*]], [[DOTOUTER:%.*]] ]
diff --git a/llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll b/llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll
index 55bbf54d1f39d..e323d583184c7 100644
--- a/llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll
+++ b/llvm/test/Transforms/LoopVectorize/runtime-checks-difference.ll
@@ -184,7 +184,7 @@ define void @nested_loop_outer_iv_addrec_invariant_in_inner1(ptr %a, ptr %b, i64
; CHECK: outer.header:
; CHECK: [[OUTER_IV_SHL_2:%.]] = shl i64 %outer.iv, 2
-; CHECK-NEXT: [[A_GEP_UPPER:%.*]] = getelementptr i8, ptr %a, i64 [[OUTER_IV_SHL_2]]
+; CHECK-NEXT: [[A_GEP_UPPER:%.*]] = getelementptr nuw i8, ptr %a, i64 [[OUTER_IV_SHL_2]]
; CHECK-NEXT: [[OUTER_IV_4:%.]] = add i64 [[OUTER_IV_SHL_2]], 4
; CHECK-NEXT: [[A_GEP_UPPER_4:%.*]] = getelementptr i8, ptr %a, i64 [[OUTER_IV_4]]
; CHECK: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[N:%.*]], 4
@@ -234,7 +234,7 @@ define void @nested_loop_outer_iv_addrec_invariant_in_inner2(ptr %a, ptr %b, i64
; CHECK: outer.header:
; CHECK: [[OUTER_IV_SHL_2:%.]] = shl i64 %outer.iv, 2
-; CHECK-NEXT: [[A_GEP_UPPER:%.*]] = getelementptr i8, ptr %a, i64 [[OUTER_IV_SHL_2]]
+; CHECK-NEXT: [[A_GEP_UPPER:%.*]] = getelementptr nuw i8, ptr %a, i64 [[OUTER_IV_SHL_2]]
; CHECK-NEXT: [[OUTER_IV_4:%.]] = add i64 [[OUTER_IV_SHL_2]], 4
; CHECK-NEXT: [[A_GEP_UPPER_4:%.*]] = getelementptr i8, ptr %a, i64 [[OUTER_IV_4]]
; CHECK: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[N:%.*]], 4
diff --git a/llvm/test/Transforms/LoopVersioning/add-phi-update-users.ll b/llvm/test/Transforms/LoopVersioning/add-phi-update-users.ll
index e326064175d18..16ad4...
[truncated]
|
ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
|
||
// Fold a GEP with constant operands. | ||
if (Constant *CLHS = dyn_cast<Constant>(V)) | ||
if (Constant *CRHS = dyn_cast<Constant>(Idx)) | ||
return Builder.CreatePtrAdd(CLHS, CRHS); | ||
return Builder.CreatePtrAdd(CLHS, CRHS, "", NW); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need to merge the flags if we find an existing GEP in the loop below? In particular, is it safe to reuse a GEP with flags when expanding one without?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. This looks like a pre-existing issue, as we could also be reusing a gep inbounds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally got around to this, the fix is here: #109293
As pointed out in the review of llvm#102133, SCEVExpander currently incorrectly reuses GEP instructions that have poison-generating flags set. Fix this by clearing the flags on the reused instruction.
As pointed out in the review of llvm#102133, SCEVExpander currently incorrectly reuses GEP instructions that have poison-generating flags set. Fix this by clearing the flags on the reused instruction.
As pointed out in the review of #102133, SCEVExpander currently incorrectly reuses GEP instructions that have poison-generating flags set. Fix this by clearing the flags on the reused instruction.
When expanding SCEV adds to geps, transfer the nuw flag to the resulting gep. (Note that this doesn't apply to IV increment GEPs, which go through a different code path.)
fd00ff3
to
fc97a6b
Compare
Rebased over #109293, this now has the correct flag intersection behavior when reusing an existing GEP. |
As pointed out in the review of llvm#102133, SCEVExpander currently incorrectly reuses GEP instructions that have poison-generating flags set. Fix this by clearing the flags on the reused instruction.
When expanding SCEV adds to geps, transfer the nuw flag to the resulting gep. (Note that this doesn't apply to IV increment GEPs, which go through a different code path.)
When expanding SCEV adds to geps, transfer the nuw flag to the resulting gep. (Note that this doesn't apply to IV increment GEPs, which go through a different code path.)