Skip to content

[GlobalIsel][AArch64] more legal icmps #78239

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jan 17, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5314,6 +5314,17 @@ LegalizerHelper::moreElementsVector(MachineInstr &MI, unsigned TypeIdx,
Observer.changedInstr(MI);
return Legalized;
}
case TargetOpcode::G_ICMP: {
// TODO: the symmetric MoreTy works for targets like, e.g. NEON.
// For targets, like e.g. MVE, the result is a predicated vector (i1).
// This will need some refactoring.
Observer.changingInstr(MI);
moreElementsVectorSrc(MI, MoreTy, 2);
moreElementsVectorSrc(MI, MoreTy, 3);
moreElementsVectorDst(MI, MoreTy, 0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how much this matters, but this looks like it is assuming that the three types are all the same, not that one is a predicate vector (i1 elements). Neon will work with full vector lane mask results, so it should work OK there, but possible not for other architectures with vector predicate register like SVE and MVE.

Copy link
Author

@tschuett tschuett Jan 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good point. I should be the SrcTy.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MoreTy comes from here:

LegalizerHelper::moreElementsVector(MachineInstr &MI, unsigned TypeIdx,

Properly supporting predicates will take more work.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might need to be similar to the trunc code:

LLT SrcTy = LLT::fixed_vector(
, but it might be better to wait until a vector target needs it, and just add a TODO at the moment?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore, SVE and moreElements is incompatible. I cannot change the length of a scalable vector.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Observer.changedInstr(MI);
return Legalized;
}
default:
return UnableToLegalize;
}
Expand Down
8 changes: 7 additions & 1 deletion llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -495,6 +495,7 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
})
.clampScalar(0, MinFPScalar, s128);

// FIXME: fix moreElementsToNextPow2
getActionDefinitionsBuilder(G_ICMP)
.legalFor({{s32, s32},
{s32, s64},
Expand Down Expand Up @@ -524,7 +525,11 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
.minScalarOrEltIf(
[=](const LegalityQuery &Query) { return Query.Types[1] == v2p0; }, 0,
s64)
.clampNumElements(0, v2s32, v4s32);
.moreElementsToNextPow2(0)
.clampNumElements(0, v8s8, v16s8)
.clampNumElements(0, v4s16, v8s16)
.clampNumElements(0, v2s32, v4s32)
.clampNumElements(0, v2s64, v2s64);

getActionDefinitionsBuilder(G_FCMP)
// If we don't have full FP16 support, then scalarize the elements of
Expand Down Expand Up @@ -863,6 +868,7 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
},
0, s8)
.minScalarOrElt(0, s8) // Worst case, we need at least s8.
.moreElementsToNextPow2(1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExtractVector should probably be in a separate pr. I was looking into some of the vector operations a little while ago, but I think that still needs cleaning up a bit. I had a patch for fcmp too which I will try and upload.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need the change for the icmp tests.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, because they extract a single lane for the tests. It sounds harmless enough to keep then.

.clampMaxNumElements(1, s64, 2)
.clampMaxNumElements(1, s32, 4)
.clampMaxNumElements(1, s16, 8)
Expand Down
205 changes: 205 additions & 0 deletions llvm/test/CodeGen/AArch64/GlobalISel/legalize-cmp.mir
Original file line number Diff line number Diff line change
Expand Up @@ -330,3 +330,208 @@ body: |
successors:
bb.3:
RET_ReallyLR
...
---
name: test_3xs32_eq_pr_78181
tracksRegLiveness: true
body: |
bb.1:
liveins: $x0
; CHECK-LABEL: name: test_3xs32_eq_pr_78181
; CHECK: liveins: $x0
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: %const:_(s32) = G_IMPLICIT_DEF
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32), %const(s32)
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32), %const(s32)
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<4 x s32>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<4 x s32>), [[BUILD_VECTOR1]]
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[ICMP]](<4 x s32>), [[C]](s64)
; CHECK-NEXT: $w0 = COPY [[EVEC]](s32)
; CHECK-NEXT: RET_ReallyLR
%const:_(s32) = G_IMPLICIT_DEF
%rhs:_(<3 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32)
%lhs:_(<3 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32)
%cmp:_(<3 x s32>) = G_ICMP intpred(eq), %lhs(<3 x s32>), %rhs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test vectors of all the touched element types?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused. The error was that it failed to legalize <3 x s32>.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But you changed clampMaxNumElements on more element types than that

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use now clampNumElements to limit lower and upper bound. There are now some clamp tests.

%1:_(s32) = G_CONSTANT i32 1
%2:_(s32) = G_EXTRACT_VECTOR_ELT %cmp(<3 x s32>), %1(s32)
$w0 = COPY %2(s32)
RET_ReallyLR
...
---
name: test_3xs16_eq_pr_78181
tracksRegLiveness: true
body: |
bb.1:
liveins: $x0
; CHECK-LABEL: name: test_3xs16_eq_pr_78181
; CHECK: liveins: $x0
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: %const:_(s16) = G_IMPLICIT_DEF
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16), %const(s16)
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16), %const(s16)
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<4 x s16>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<4 x s16>), [[BUILD_VECTOR1]]
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s16), [[UV1:%[0-9]+]]:_(s16), [[UV2:%[0-9]+]]:_(s16), [[UV3:%[0-9]+]]:_(s16) = G_UNMERGE_VALUES [[ICMP]](<4 x s16>)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[UV]](s16)
; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[UV1]](s16)
; CHECK-NEXT: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[UV2]](s16)
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF
; CHECK-NEXT: [[BUILD_VECTOR2:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32), [[DEF]](s32)
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[BUILD_VECTOR2]](<4 x s32>), [[C]](s64)
; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535
; CHECK-NEXT: %zext:_(s32) = G_AND [[EVEC]], [[C1]]
; CHECK-NEXT: $w0 = COPY %zext(s32)
; CHECK-NEXT: RET_ReallyLR
%const:_(s16) = G_IMPLICIT_DEF
%rhs:_(<3 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16)
%lhs:_(<3 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16)
%cmp:_(<3 x s16>) = G_ICMP intpred(eq), %lhs(<3 x s16>), %rhs
%1:_(s32) = G_CONSTANT i32 1
%2:_(s16) = G_EXTRACT_VECTOR_ELT %cmp(<3 x s16>), %1(s32)
%zext:_(s32) = G_ZEXT %2(s16)
$w0 = COPY %zext(s32)
RET_ReallyLR
...
---
name: test_3xs8_eq_pr_78181
tracksRegLiveness: true
body: |
bb.1:
liveins: $x0
; CHECK-LABEL: name: test_3xs8_eq_pr_78181
; CHECK: liveins: $x0
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: %const:_(s8) = G_IMPLICIT_DEF
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8)
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<8 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8)
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<8 x s8>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<8 x s8>), [[BUILD_VECTOR1]]
; CHECK-NEXT: [[UV:%[0-9]+]]:_(<4 x s8>), [[UV1:%[0-9]+]]:_(<4 x s8>) = G_UNMERGE_VALUES [[ICMP]](<8 x s8>)
; CHECK-NEXT: [[UV2:%[0-9]+]]:_(s8), [[UV3:%[0-9]+]]:_(s8), [[UV4:%[0-9]+]]:_(s8), [[UV5:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[UV]](<4 x s8>)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[UV2]](s8)
; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[UV3]](s8)
; CHECK-NEXT: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[UV4]](s8)
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF
; CHECK-NEXT: [[BUILD_VECTOR2:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32), [[DEF]](s32)
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[BUILD_VECTOR2]](<4 x s32>), [[C]](s64)
; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 255
; CHECK-NEXT: %zext:_(s32) = G_AND [[EVEC]], [[C1]]
; CHECK-NEXT: $w0 = COPY %zext(s32)
; CHECK-NEXT: RET_ReallyLR
%const:_(s8) = G_IMPLICIT_DEF
%rhs:_(<3 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8)
%lhs:_(<3 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8)
%cmp:_(<3 x s8>) = G_ICMP intpred(eq), %lhs(<3 x s8>), %rhs
%1:_(s32) = G_CONSTANT i32 1
%2:_(s8) = G_EXTRACT_VECTOR_ELT %cmp(<3 x s8>), %1(s32)
%zext:_(s32) = G_ZEXT %2(s8)
$w0 = COPY %zext(s32)
RET_ReallyLR
...
---
name: test_3xs64_eq_clamp
tracksRegLiveness: true
body: |
bb.1:
liveins: $x0
; CHECK-LABEL: name: test_3xs64_eq_clamp
; CHECK: liveins: $x0
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: %const:_(s64) = G_IMPLICIT_DEF
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR %const(s64), %const(s64)
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR %const(s64), %const(s64)
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<2 x s64>), [[BUILD_VECTOR1]]
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s64) = G_EXTRACT_VECTOR_ELT [[ICMP]](<2 x s64>), [[C]](s64)
; CHECK-NEXT: $x0 = COPY [[EVEC]](s64)
; CHECK-NEXT: RET_ReallyLR
%const:_(s64) = G_IMPLICIT_DEF
%rhs:_(<3 x s64>) = G_BUILD_VECTOR %const(s64), %const(s64), %const(s64)
%lhs:_(<3 x s64>) = G_BUILD_VECTOR %const(s64), %const(s64), %const(s64)
%cmp:_(<3 x s64>) = G_ICMP intpred(eq), %lhs(<3 x s64>), %rhs
%1:_(s32) = G_CONSTANT i32 1
%2:_(s64) = G_EXTRACT_VECTOR_ELT %cmp(<3 x s64>), %1(s32)
$x0 = COPY %2(s64)
RET_ReallyLR
...
---
name: test_5xs32_eq_clamp
tracksRegLiveness: true
body: |
bb.1:
liveins: $x0
; CHECK-LABEL: name: test_5xs32_eq_clamp
; CHECK: liveins: $x0
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: %const:_(s32) = G_IMPLICIT_DEF
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32), %const(s32)
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32), %const(s32)
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<4 x s32>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<4 x s32>), [[BUILD_VECTOR1]]
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[ICMP]](<4 x s32>), [[C]](s64)
; CHECK-NEXT: $w0 = COPY [[EVEC]](s32)
; CHECK-NEXT: RET_ReallyLR
%const:_(s32) = G_IMPLICIT_DEF
%rhs:_(<5 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32), %const(s32), %const(s32)
%lhs:_(<5 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32), %const(s32), %const(s32)
%cmp:_(<5 x s32>) = G_ICMP intpred(eq), %lhs(<5 x s32>), %rhs
%1:_(s32) = G_CONSTANT i32 1
%2:_(s32) = G_EXTRACT_VECTOR_ELT %cmp(<5 x s32>), %1(s32)
$w0 = COPY %2(s32)
RET_ReallyLR
...
---
name: test_7xs16_eq_clamp
tracksRegLiveness: true
body: |
bb.1:
liveins: $x0
; CHECK-LABEL: name: test_7xs16_eq_clamp
; CHECK: liveins: $x0
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: %const:_(s16) = G_IMPLICIT_DEF
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16)
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16)
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<8 x s16>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<8 x s16>), [[BUILD_VECTOR1]]
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s16) = G_EXTRACT_VECTOR_ELT [[ICMP]](<8 x s16>), [[C]](s64)
; CHECK-NEXT: %zext:_(s32) = G_ZEXT [[EVEC]](s16)
; CHECK-NEXT: $w0 = COPY %zext(s32)
; CHECK-NEXT: RET_ReallyLR
%const:_(s16) = G_IMPLICIT_DEF
%rhs:_(<7 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16)
%lhs:_(<7 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16)
%cmp:_(<7 x s16>) = G_ICMP intpred(eq), %lhs(<7 x s16>), %rhs
%1:_(s32) = G_CONSTANT i32 1
%2:_(s16) = G_EXTRACT_VECTOR_ELT %cmp(<7 x s16>), %1(s32)
%zext:_(s32) = G_ZEXT %2(s16)
$w0 = COPY %zext(s32)
RET_ReallyLR
...
---
name: test_9xs8_eq_clamp
tracksRegLiveness: true
body: |
bb.1:
liveins: $x0
; CHECK-LABEL: name: test_9xs8_eq_clamp
; CHECK: liveins: $x0
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: %const:_(s8) = G_IMPLICIT_DEF
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<16 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8)
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<16 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8)
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<16 x s8>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<16 x s8>), [[BUILD_VECTOR1]]
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s8) = G_EXTRACT_VECTOR_ELT [[ICMP]](<16 x s8>), [[C]](s64)
; CHECK-NEXT: %zext:_(s32) = G_ZEXT [[EVEC]](s8)
; CHECK-NEXT: $w0 = COPY %zext(s32)
; CHECK-NEXT: RET_ReallyLR
%const:_(s8) = G_IMPLICIT_DEF
%rhs:_(<9 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8)
%lhs:_(<9 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8)
%cmp:_(<9 x s8>) = G_ICMP intpred(eq), %lhs(<9 x s8>), %rhs
%1:_(s32) = G_CONSTANT i32 1
%2:_(s8) = G_EXTRACT_VECTOR_ELT %cmp(<9 x s8>), %1(s32)
%zext:_(s32) = G_ZEXT %2(s8)
$w0 = COPY %zext(s32)
RET_ReallyLR
20 changes: 4 additions & 16 deletions llvm/test/CodeGen/AArch64/GlobalISel/legalize-shuffle-vector.mir
Original file line number Diff line number Diff line change
Expand Up @@ -316,26 +316,14 @@ body: |
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY]](s32), [[COPY1]](s32), [[COPY2]](s32), [[DEF]](s32)
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY]](s32), [[COPY1]](s32), [[COPY2]](s32), [[DEF]](s32)
; CHECK-NEXT: [[SHUF:%[0-9]+]]:_(<4 x s32>) = G_SHUFFLE_VECTOR [[BUILD_VECTOR]](<4 x s32>), [[BUILD_VECTOR1]], shufflemask(0, 1, 5, 6)
; CHECK-NEXT: [[UV:%[0-9]+]]:_(<2 x s32>), [[UV1:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>)
; CHECK-NEXT: [[UV2:%[0-9]+]]:_(<2 x s32>), [[UV3:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>)
; CHECK-NEXT: [[UV4:%[0-9]+]]:_(<2 x s32>), [[UV5:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>)
; CHECK-NEXT: [[UV6:%[0-9]+]]:_(<2 x s32>), [[UV7:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>)
; CHECK-NEXT: [[UV8:%[0-9]+]]:_(<2 x s32>), [[UV9:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>)
; CHECK-NEXT: [[UV10:%[0-9]+]]:_(<2 x s32>), [[UV11:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>)
; CHECK-NEXT: [[UV12:%[0-9]+]]:_(<2 x s32>), [[UV13:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>)
; CHECK-NEXT: [[UV14:%[0-9]+]]:_(<2 x s32>), [[UV15:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>)
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
; CHECK-NEXT: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[UV]](<2 x s32>), [[UV3]](<2 x s32>)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

improved legalization of G_EXTRACT_VECTOR_ELT.

; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[CONCAT_VECTORS]](<4 x s32>), [[C]](s64)
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[SHUF]](<4 x s32>), [[C]](s64)
; CHECK-NEXT: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
; CHECK-NEXT: [[CONCAT_VECTORS1:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[UV4]](<2 x s32>), [[UV7]](<2 x s32>)
; CHECK-NEXT: [[EVEC1:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[CONCAT_VECTORS1]](<4 x s32>), [[C1]](s64)
; CHECK-NEXT: [[EVEC1:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[SHUF]](<4 x s32>), [[C1]](s64)
; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; CHECK-NEXT: [[CONCAT_VECTORS2:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[UV8]](<2 x s32>), [[UV11]](<2 x s32>)
; CHECK-NEXT: [[EVEC2:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[CONCAT_VECTORS2]](<4 x s32>), [[C2]](s64)
; CHECK-NEXT: [[EVEC2:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[SHUF]](<4 x s32>), [[C2]](s64)
; CHECK-NEXT: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
; CHECK-NEXT: [[CONCAT_VECTORS3:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[UV12]](<2 x s32>), [[UV15]](<2 x s32>)
; CHECK-NEXT: [[EVEC3:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[CONCAT_VECTORS3]](<4 x s32>), [[C3]](s64)
; CHECK-NEXT: [[EVEC3:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[SHUF]](<4 x s32>), [[C3]](s64)
; CHECK-NEXT: [[BUILD_VECTOR2:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[EVEC]](s32), [[EVEC1]](s32), [[EVEC2]](s32), [[EVEC3]](s32)
; CHECK-NEXT: $q0 = COPY [[BUILD_VECTOR2]](<4 x s32>)
; CHECK-NEXT: RET_ReallyLR implicit $q0
Expand Down
Loading