[GlobalIsel][AArch64] more legal icmps #78239
Changes from all commits
e51001f
18524de
e3021e2
da1d8e1
4d6c6a2
6955ac3
5dd7e9e
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -495,6 +495,7 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST) | |
}) | ||
.clampScalar(0, MinFPScalar, s128); | ||
|
||
// FIXME: fix moreElementsToNextPow2 | ||
getActionDefinitionsBuilder(G_ICMP) | ||
.legalFor({{s32, s32}, | ||
{s32, s64}, | ||
|
@@ -524,7 +525,11 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST) | |
.minScalarOrEltIf( | ||
[=](const LegalityQuery &Query) { return Query.Types[1] == v2p0; }, 0, | ||
s64) | ||
- .clampNumElements(0, v2s32, v4s32); | ||
+ .moreElementsToNextPow2(0) | ||
+ .clampNumElements(0, v8s8, v16s8) | ||
+ .clampNumElements(0, v4s16, v8s16) | ||
+ .clampNumElements(0, v2s32, v4s32) | ||
+ .clampNumElements(0, v2s64, v2s64); | ||
|
||
getActionDefinitionsBuilder(G_FCMP) | ||
// If we don't have full FP16 support, then scalarize the elements of | ||
|
@@ -863,6 +868,7 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST) | |
}, | ||
0, s8) | ||
.minScalarOrElt(0, s8) // Worst case, we need at least s8. | ||
.moreElementsToNextPow2(1) | ||
Review comment: ExtractVector should probably be in a separate PR. I was looking into some of the vector operations a little while ago, but I think that still needs cleaning up a bit. I had a patch for fcmp too, which I will try to upload.
Review comment: I need the change for the icmp tests.
Review comment: I see, because they extract a single lane for the tests. It sounds harmless enough to keep then.
||
.clampMaxNumElements(1, s64, 2) | ||
.clampMaxNumElements(1, s32, 4) | ||
.clampMaxNumElements(1, s16, 8) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -330,3 +330,208 @@ body: | | |
successors: | ||
bb.3: | ||
RET_ReallyLR | ||
... | ||
--- | ||
name: test_3xs32_eq_pr_78181 | ||
tracksRegLiveness: true | ||
body: | | ||
bb.1: | ||
liveins: $x0 | ||
; CHECK-LABEL: name: test_3xs32_eq_pr_78181 | ||
; CHECK: liveins: $x0 | ||
; CHECK-NEXT: {{ $}} | ||
; CHECK-NEXT: %const:_(s32) = G_IMPLICIT_DEF | ||
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32), %const(s32) | ||
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32), %const(s32) | ||
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<4 x s32>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<4 x s32>), [[BUILD_VECTOR1]] | ||
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1 | ||
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[ICMP]](<4 x s32>), [[C]](s64) | ||
; CHECK-NEXT: $w0 = COPY [[EVEC]](s32) | ||
; CHECK-NEXT: RET_ReallyLR | ||
%const:_(s32) = G_IMPLICIT_DEF | ||
%rhs:_(<3 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32) | ||
%lhs:_(<3 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32) | ||
%cmp:_(<3 x s32>) = G_ICMP intpred(eq), %lhs(<3 x s32>), %rhs | ||
Review comment: Test vectors of all the touched element types?
Review comment: I am confused. The error was that it failed to legalize
Review comment: But you changed clampMaxNumElements on more element types than that.
Review comment: I use now
||
%1:_(s32) = G_CONSTANT i32 1 | ||
%2:_(s32) = G_EXTRACT_VECTOR_ELT %cmp(<3 x s32>), %1(s32) | ||
$w0 = COPY %2(s32) | ||
RET_ReallyLR | ||
... | ||
--- | ||
name: test_3xs16_eq_pr_78181 | ||
tracksRegLiveness: true | ||
body: | | ||
bb.1: | ||
liveins: $x0 | ||
; CHECK-LABEL: name: test_3xs16_eq_pr_78181 | ||
; CHECK: liveins: $x0 | ||
; CHECK-NEXT: {{ $}} | ||
; CHECK-NEXT: %const:_(s16) = G_IMPLICIT_DEF | ||
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16), %const(s16) | ||
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<4 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16), %const(s16) | ||
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<4 x s16>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<4 x s16>), [[BUILD_VECTOR1]] | ||
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s16), [[UV1:%[0-9]+]]:_(s16), [[UV2:%[0-9]+]]:_(s16), [[UV3:%[0-9]+]]:_(s16) = G_UNMERGE_VALUES [[ICMP]](<4 x s16>) | ||
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1 | ||
; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[UV]](s16) | ||
; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[UV1]](s16) | ||
; CHECK-NEXT: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[UV2]](s16) | ||
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF | ||
; CHECK-NEXT: [[BUILD_VECTOR2:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32), [[DEF]](s32) | ||
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[BUILD_VECTOR2]](<4 x s32>), [[C]](s64) | ||
; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 65535 | ||
; CHECK-NEXT: %zext:_(s32) = G_AND [[EVEC]], [[C1]] | ||
; CHECK-NEXT: $w0 = COPY %zext(s32) | ||
; CHECK-NEXT: RET_ReallyLR | ||
%const:_(s16) = G_IMPLICIT_DEF | ||
%rhs:_(<3 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16) | ||
%lhs:_(<3 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16) | ||
%cmp:_(<3 x s16>) = G_ICMP intpred(eq), %lhs(<3 x s16>), %rhs | ||
%1:_(s32) = G_CONSTANT i32 1 | ||
%2:_(s16) = G_EXTRACT_VECTOR_ELT %cmp(<3 x s16>), %1(s32) | ||
%zext:_(s32) = G_ZEXT %2(s16) | ||
$w0 = COPY %zext(s32) | ||
RET_ReallyLR | ||
... | ||
--- | ||
name: test_3xs8_eq_pr_78181 | ||
tracksRegLiveness: true | ||
body: | | ||
bb.1: | ||
liveins: $x0 | ||
; CHECK-LABEL: name: test_3xs8_eq_pr_78181 | ||
; CHECK: liveins: $x0 | ||
; CHECK-NEXT: {{ $}} | ||
; CHECK-NEXT: %const:_(s8) = G_IMPLICIT_DEF | ||
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8) | ||
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<8 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8) | ||
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<8 x s8>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<8 x s8>), [[BUILD_VECTOR1]] | ||
; CHECK-NEXT: [[UV:%[0-9]+]]:_(<4 x s8>), [[UV1:%[0-9]+]]:_(<4 x s8>) = G_UNMERGE_VALUES [[ICMP]](<8 x s8>) | ||
; CHECK-NEXT: [[UV2:%[0-9]+]]:_(s8), [[UV3:%[0-9]+]]:_(s8), [[UV4:%[0-9]+]]:_(s8), [[UV5:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[UV]](<4 x s8>) | ||
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1 | ||
; CHECK-NEXT: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[UV2]](s8) | ||
; CHECK-NEXT: [[ANYEXT1:%[0-9]+]]:_(s32) = G_ANYEXT [[UV3]](s8) | ||
; CHECK-NEXT: [[ANYEXT2:%[0-9]+]]:_(s32) = G_ANYEXT [[UV4]](s8) | ||
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(s32) = G_IMPLICIT_DEF | ||
; CHECK-NEXT: [[BUILD_VECTOR2:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[ANYEXT]](s32), [[ANYEXT1]](s32), [[ANYEXT2]](s32), [[DEF]](s32) | ||
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[BUILD_VECTOR2]](<4 x s32>), [[C]](s64) | ||
; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 255 | ||
; CHECK-NEXT: %zext:_(s32) = G_AND [[EVEC]], [[C1]] | ||
; CHECK-NEXT: $w0 = COPY %zext(s32) | ||
; CHECK-NEXT: RET_ReallyLR | ||
%const:_(s8) = G_IMPLICIT_DEF | ||
%rhs:_(<3 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8) | ||
%lhs:_(<3 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8) | ||
%cmp:_(<3 x s8>) = G_ICMP intpred(eq), %lhs(<3 x s8>), %rhs | ||
%1:_(s32) = G_CONSTANT i32 1 | ||
%2:_(s8) = G_EXTRACT_VECTOR_ELT %cmp(<3 x s8>), %1(s32) | ||
%zext:_(s32) = G_ZEXT %2(s8) | ||
$w0 = COPY %zext(s32) | ||
RET_ReallyLR | ||
... | ||
--- | ||
name: test_3xs64_eq_clamp | ||
tracksRegLiveness: true | ||
body: | | ||
bb.1: | ||
liveins: $x0 | ||
; CHECK-LABEL: name: test_3xs64_eq_clamp | ||
; CHECK: liveins: $x0 | ||
; CHECK-NEXT: {{ $}} | ||
; CHECK-NEXT: %const:_(s64) = G_IMPLICIT_DEF | ||
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR %const(s64), %const(s64) | ||
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s64>) = G_BUILD_VECTOR %const(s64), %const(s64) | ||
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<2 x s64>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<2 x s64>), [[BUILD_VECTOR1]] | ||
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1 | ||
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s64) = G_EXTRACT_VECTOR_ELT [[ICMP]](<2 x s64>), [[C]](s64) | ||
; CHECK-NEXT: $x0 = COPY [[EVEC]](s64) | ||
; CHECK-NEXT: RET_ReallyLR | ||
%const:_(s64) = G_IMPLICIT_DEF | ||
%rhs:_(<3 x s64>) = G_BUILD_VECTOR %const(s64), %const(s64), %const(s64) | ||
%lhs:_(<3 x s64>) = G_BUILD_VECTOR %const(s64), %const(s64), %const(s64) | ||
%cmp:_(<3 x s64>) = G_ICMP intpred(eq), %lhs(<3 x s64>), %rhs | ||
%1:_(s32) = G_CONSTANT i32 1 | ||
%2:_(s64) = G_EXTRACT_VECTOR_ELT %cmp(<3 x s64>), %1(s32) | ||
$x0 = COPY %2(s64) | ||
RET_ReallyLR | ||
... | ||
--- | ||
name: test_5xs32_eq_clamp | ||
tracksRegLiveness: true | ||
body: | | ||
bb.1: | ||
liveins: $x0 | ||
; CHECK-LABEL: name: test_5xs32_eq_clamp | ||
; CHECK: liveins: $x0 | ||
; CHECK-NEXT: {{ $}} | ||
; CHECK-NEXT: %const:_(s32) = G_IMPLICIT_DEF | ||
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32), %const(s32) | ||
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32), %const(s32) | ||
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<4 x s32>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<4 x s32>), [[BUILD_VECTOR1]] | ||
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1 | ||
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[ICMP]](<4 x s32>), [[C]](s64) | ||
; CHECK-NEXT: $w0 = COPY [[EVEC]](s32) | ||
; CHECK-NEXT: RET_ReallyLR | ||
%const:_(s32) = G_IMPLICIT_DEF | ||
%rhs:_(<5 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32), %const(s32), %const(s32) | ||
%lhs:_(<5 x s32>) = G_BUILD_VECTOR %const(s32), %const(s32), %const(s32), %const(s32), %const(s32) | ||
%cmp:_(<5 x s32>) = G_ICMP intpred(eq), %lhs(<5 x s32>), %rhs | ||
%1:_(s32) = G_CONSTANT i32 1 | ||
%2:_(s32) = G_EXTRACT_VECTOR_ELT %cmp(<5 x s32>), %1(s32) | ||
$w0 = COPY %2(s32) | ||
RET_ReallyLR | ||
... | ||
--- | ||
name: test_7xs16_eq_clamp | ||
tracksRegLiveness: true | ||
body: | | ||
bb.1: | ||
liveins: $x0 | ||
; CHECK-LABEL: name: test_7xs16_eq_clamp | ||
; CHECK: liveins: $x0 | ||
; CHECK-NEXT: {{ $}} | ||
; CHECK-NEXT: %const:_(s16) = G_IMPLICIT_DEF | ||
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16) | ||
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<8 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16) | ||
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<8 x s16>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<8 x s16>), [[BUILD_VECTOR1]] | ||
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1 | ||
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s16) = G_EXTRACT_VECTOR_ELT [[ICMP]](<8 x s16>), [[C]](s64) | ||
; CHECK-NEXT: %zext:_(s32) = G_ZEXT [[EVEC]](s16) | ||
; CHECK-NEXT: $w0 = COPY %zext(s32) | ||
; CHECK-NEXT: RET_ReallyLR | ||
%const:_(s16) = G_IMPLICIT_DEF | ||
%rhs:_(<7 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16) | ||
%lhs:_(<7 x s16>) = G_BUILD_VECTOR %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16), %const(s16) | ||
%cmp:_(<7 x s16>) = G_ICMP intpred(eq), %lhs(<7 x s16>), %rhs | ||
%1:_(s32) = G_CONSTANT i32 1 | ||
%2:_(s16) = G_EXTRACT_VECTOR_ELT %cmp(<7 x s16>), %1(s32) | ||
%zext:_(s32) = G_ZEXT %2(s16) | ||
$w0 = COPY %zext(s32) | ||
RET_ReallyLR | ||
... | ||
--- | ||
name: test_9xs8_eq_clamp | ||
tracksRegLiveness: true | ||
body: | | ||
bb.1: | ||
liveins: $x0 | ||
; CHECK-LABEL: name: test_9xs8_eq_clamp | ||
; CHECK: liveins: $x0 | ||
; CHECK-NEXT: {{ $}} | ||
; CHECK-NEXT: %const:_(s8) = G_IMPLICIT_DEF | ||
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<16 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8) | ||
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<16 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8) | ||
; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(<16 x s8>) = G_ICMP intpred(eq), [[BUILD_VECTOR]](<16 x s8>), [[BUILD_VECTOR1]] | ||
; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1 | ||
; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s8) = G_EXTRACT_VECTOR_ELT [[ICMP]](<16 x s8>), [[C]](s64) | ||
; CHECK-NEXT: %zext:_(s32) = G_ZEXT [[EVEC]](s8) | ||
; CHECK-NEXT: $w0 = COPY %zext(s32) | ||
; CHECK-NEXT: RET_ReallyLR | ||
%const:_(s8) = G_IMPLICIT_DEF | ||
%rhs:_(<9 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8) | ||
%lhs:_(<9 x s8>) = G_BUILD_VECTOR %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8), %const(s8) | ||
%cmp:_(<9 x s8>) = G_ICMP intpred(eq), %lhs(<9 x s8>), %rhs | ||
%1:_(s32) = G_CONSTANT i32 1 | ||
%2:_(s8) = G_EXTRACT_VECTOR_ELT %cmp(<9 x s8>), %1(s32) | ||
%zext:_(s32) = G_ZEXT %2(s8) | ||
$w0 = COPY %zext(s32) | ||
RET_ReallyLR |
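The vector widths in the CHECK lines above can be reproduced with a small standalone model of the new G_ICMP rule chain: round the element count up to the next power of two, then clamp it to the range given for that element size. This is only a sketch. `VecTy`, `nextPow2`, `legalizeIcmpVector` and `checkAgainstTests` are hypothetical names rather than LLVM API, and the model only predicts the type of the resulting compare. It ignores the iterative nature of the real legalizer, the earlier `minScalarOrEltIf` rule, and any splitting into several instructions.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for a fixed-width vector type <NumElts x sEltBits>.
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
  bool operator==(const VecTy &O) const {
    return NumElts == O.NumElts && EltBits == O.EltBits;
  }
};

// Round N up to the next power of two (models moreElementsToNextPow2).
unsigned nextPow2(unsigned N) {
  unsigned P = 1;
  while (P < N)
    P <<= 1;
  return P;
}

// Models moreElementsToNextPow2(0) followed by the clampNumElements rule
// that matches the element size. The ranges are copied from the diff.
VecTy legalizeIcmpVector(VecTy Ty) {
  Ty.NumElts = nextPow2(Ty.NumElts);
  unsigned Min = 0, Max = 0;
  switch (Ty.EltBits) {
  case 8:  Min = 8; Max = 16; break; // v8s8  .. v16s8
  case 16: Min = 4; Max = 8;  break; // v4s16 .. v8s16
  case 32: Min = 2; Max = 4;  break; // v2s32 .. v4s32
  case 64: Min = 2; Max = 2;  break; // v2s64 .. v2s64
  default: return Ty;                // element sizes outside this sketch
  }
  Ty.NumElts = std::min(std::max(Ty.NumElts, Min), Max);
  return Ty;
}

// Compare the model with the G_ICMP types in the CHECK lines of the tests.
void checkAgainstTests() {
  assert(legalizeIcmpVector({3, 32}) == (VecTy{4, 32}));
  assert(legalizeIcmpVector({3, 16}) == (VecTy{4, 16}));
  assert(legalizeIcmpVector({3, 8}) == (VecTy{8, 8}));
  assert(legalizeIcmpVector({3, 64}) == (VecTy{2, 64}));
  assert(legalizeIcmpVector({5, 32}) == (VecTy{4, 32}));
  assert(legalizeIcmpVector({7, 16}) == (VecTy{8, 16}));
  assert(legalizeIcmpVector({9, 8}) == (VecTy{16, 8}));
}
```

Calling `checkAgainstTests()` from a `main` function exercises all seven test inputs. The `<3 x s8>` and `<3 x s64>` cases show why the per-element-size clamps are needed in addition to the power-of-two rounding.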
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -316,26 +316,14 @@ body: | | |
; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY]](s32), [[COPY1]](s32), [[COPY2]](s32), [[DEF]](s32) | ||
; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY]](s32), [[COPY1]](s32), [[COPY2]](s32), [[DEF]](s32) | ||
; CHECK-NEXT: [[SHUF:%[0-9]+]]:_(<4 x s32>) = G_SHUFFLE_VECTOR [[BUILD_VECTOR]](<4 x s32>), [[BUILD_VECTOR1]], shufflemask(0, 1, 5, 6) | ||
- ; CHECK-NEXT: [[UV:%[0-9]+]]:_(<2 x s32>), [[UV1:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>) | ||
- ; CHECK-NEXT: [[UV2:%[0-9]+]]:_(<2 x s32>), [[UV3:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>) | ||
- ; CHECK-NEXT: [[UV4:%[0-9]+]]:_(<2 x s32>), [[UV5:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>) | ||
- ; CHECK-NEXT: [[UV6:%[0-9]+]]:_(<2 x s32>), [[UV7:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>) | ||
- ; CHECK-NEXT: [[UV8:%[0-9]+]]:_(<2 x s32>), [[UV9:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>) | ||
- ; CHECK-NEXT: [[UV10:%[0-9]+]]:_(<2 x s32>), [[UV11:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>) | ||
- ; CHECK-NEXT: [[UV12:%[0-9]+]]:_(<2 x s32>), [[UV13:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>) | ||
- ; CHECK-NEXT: [[UV14:%[0-9]+]]:_(<2 x s32>), [[UV15:%[0-9]+]]:_(<2 x s32>) = G_UNMERGE_VALUES [[SHUF]](<4 x s32>) | ||
  ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0 | ||
- ; CHECK-NEXT: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[UV]](<2 x s32>), [[UV3]](<2 x s32>) | ||
Review comment: improved legalization of
- ; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[CONCAT_VECTORS]](<4 x s32>), [[C]](s64) | ||
+ ; CHECK-NEXT: [[EVEC:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[SHUF]](<4 x s32>), [[C]](s64) | ||
  ; CHECK-NEXT: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 1 | ||
- ; CHECK-NEXT: [[CONCAT_VECTORS1:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[UV4]](<2 x s32>), [[UV7]](<2 x s32>) | ||
- ; CHECK-NEXT: [[EVEC1:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[CONCAT_VECTORS1]](<4 x s32>), [[C1]](s64) | ||
+ ; CHECK-NEXT: [[EVEC1:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[SHUF]](<4 x s32>), [[C1]](s64) | ||
  ; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 | ||
- ; CHECK-NEXT: [[CONCAT_VECTORS2:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[UV8]](<2 x s32>), [[UV11]](<2 x s32>) | ||
- ; CHECK-NEXT: [[EVEC2:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[CONCAT_VECTORS2]](<4 x s32>), [[C2]](s64) | ||
+ ; CHECK-NEXT: [[EVEC2:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[SHUF]](<4 x s32>), [[C2]](s64) | ||
  ; CHECK-NEXT: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 3 | ||
- ; CHECK-NEXT: [[CONCAT_VECTORS3:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[UV12]](<2 x s32>), [[UV15]](<2 x s32>) | ||
- ; CHECK-NEXT: [[EVEC3:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[CONCAT_VECTORS3]](<4 x s32>), [[C3]](s64) | ||
+ ; CHECK-NEXT: [[EVEC3:%[0-9]+]]:_(s32) = G_EXTRACT_VECTOR_ELT [[SHUF]](<4 x s32>), [[C3]](s64) | ||
; CHECK-NEXT: [[BUILD_VECTOR2:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[EVEC]](s32), [[EVEC1]](s32), [[EVEC2]](s32), [[EVEC3]](s32) | ||
; CHECK-NEXT: $q0 = COPY [[BUILD_VECTOR2]](<4 x s32>) | ||
; CHECK-NEXT: RET_ReallyLR implicit $q0 | ||
|
I'm not sure how much this matters, but this looks like it assumes that the three types are all the same, not that one is a predicate vector (i1 elements). Neon works with full vector lane mask results, so it should work OK there, but possibly not for other architectures with vector predicate registers, like SVE and MVE.
That is a good point. It should be the SrcTy.
The MoreTy comes from here: llvm-project/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp, line 5155 in c1a4424.
Properly supporting predicates will take more work.
It might need to be similar to the trunc code: llvm-project/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp, line 5309 in c1a4424.
Furthermore, SVE and moreElements are incompatible. I cannot change the length of a scalable vector.
Done.
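
The two concerns raised in this thread can be illustrated with a standalone type model. It is a sketch under stated assumptions and not the code in LegalizerHelper.cpp: `VecTy`, `IcmpTypes`, `widenAllToSameType` and `widenIcmp` are hypothetical names. The first function mimics applying a single MoreTy to the result and the sources alike, which turns an i1 predicate result into a full-width lane mask. The second keeps each operand's own element width and refuses to touch scalable vectors.

```cpp
#include <cassert>

// Hypothetical vector type model. Scalable mimics an SVE-style
// <vscale x NumElts x sEltBits> type.
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
  bool Scalable;
};

// Types of a vector compare: the result (type index 0) and the
// source type shared by both compared operands (type index 1).
struct IcmpTypes {
  VecTy Result;
  VecTy Source;
};

// Models the reviewer's concern: one MoreTy is applied to the result
// and the sources alike, so a predicate result loses its element width.
void widenAllToSameType(IcmpTypes &Tys, VecTy MoreTy) {
  Tys.Result = MoreTy;
  Tys.Source = MoreTy;
}

// Predicate-aware variant: only the element count changes, and each
// operand keeps its own element width. Returns false, leaving Tys
// unchanged, for scalable vectors or when NewNumElts would shrink the type.
bool widenIcmp(IcmpTypes &Tys, unsigned NewNumElts) {
  if (Tys.Result.Scalable || Tys.Source.Scalable)
    return false; // the length of a scalable vector cannot be changed
  if (NewNumElts < Tys.Source.NumElts)
    return false;
  Tys.Result.NumElts = NewNumElts;
  Tys.Source.NumElts = NewNumElts;
  return true;
}
```

With a `<3 x s1>` result and `<3 x s32>` sources, `widenIcmp(Tys, 4)` yields `<4 x s1>` and `<4 x s32>`, while `widenAllToSameType` with `<4 x s32>` makes the result `<4 x s32>` as well. The latter matches what Neon produces, which is why the tests in this PR pass, but it would not describe a predicate-register target.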