[DAG] Fold (and X, (bswap/bitreverse (not Y))) -> (and X, (not (bswap/bitreverse Y))) #112547


Merged
merged 3 commits into llvm:main on Oct 28, 2024

Conversation

RKSimon
Collaborator

@RKSimon RKSimon commented Oct 16, 2024

On ANDNOT-capable targets we can always do this profitably; without ANDNOT we only attempt it if we don't introduce an additional NOT

Fixes #112425
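The fold relies on bswap/bitreverse being pure bit permutations, so they commute with bitwise NOT. This Python sketch (illustrative only, not part of the PR; the helper names are hypothetical) checks the identity on 32-bit values:

```python
MASK = 0xFFFFFFFF  # model i32 arithmetic

def bswap32(v: int) -> int:
    """Byte-swap a 32-bit value (models ISD::BSWAP)."""
    return int.from_bytes(v.to_bytes(4, "little"), "big")

def bitreverse32(v: int) -> int:
    """Reverse all 32 bits (models ISD::BITREVERSE)."""
    return int(f"{v:032b}"[::-1], 2)

def check(x: int, y: int) -> None:
    for op in (bswap32, bitreverse32):
        # (and X, (op (not Y))) == (and X, (not (op Y)))
        assert x & op(~y & MASK) == x & (~op(y) & MASK)

check(0x12345678, 0x0F0F1234)
```

Once the NOT is hoisted above the bswap/bitreverse, it sits directly on an AND operand, where ANDNOT-capable targets can absorb it for free.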

@RKSimon RKSimon requested review from dtcxzyw and goldsteinn October 16, 2024 14:01
@llvmbot llvmbot added backend:X86 llvm:SelectionDAG SelectionDAGISel as well labels Oct 16, 2024
@llvmbot
Member

llvmbot commented Oct 16, 2024

@llvm/pr-subscribers-backend-x86

Author: Simon Pilgrim (RKSimon)

Changes

Fixes #112425


Full diff: https://github.com/llvm/llvm-project/pull/112547.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+15)
  • (modified) llvm/test/CodeGen/X86/andnot-patterns.ll (+32-44)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index ca91d35573c3ec..7c2d840e48515b 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -7350,6 +7350,21 @@ SDValue DAGCombiner::visitAND(SDNode *N) {
   if (SDValue R = foldLogicOfShifts(N, N1, N0, DAG))
     return R;
 
+  // If the target supports ANDNOT, attempt to reconstruct an ANDNOT pattern
+  // that might have become separated by a bitwise-agnostic instruction.
+  if (TLI.hasAndNot(SDValue(N, 0))) {
+    SDValue X, Y;
+
+    // Fold (and X, (bswap (not Y))) -> (and X, (not (bswap Y)))
+    // Fold (and X, (bitreverse (not Y))) -> (and X, (not (bitreverse Y)))
+    for (unsigned Opc : {ISD::BSWAP, ISD::BITREVERSE})
+      if (sd_match(N, m_And(m_Value(X),
+                            m_OneUse(m_UnaryOp(Opc, m_Not(m_Value(Y)))))) &&
+          !sd_match(X, m_Not(m_Value())))
+        return DAG.getNode(ISD::AND, DL, VT, X,
+                           DAG.getNOT(DL, DAG.getNode(Opc, DL, VT, Y), VT));
+  }
+
   // Masking the negated extension of a boolean is just the zero-extended
   // boolean:
   // and (sub 0, zext(bool X)), 1 --> zext(bool X)
diff --git a/llvm/test/CodeGen/X86/andnot-patterns.ll b/llvm/test/CodeGen/X86/andnot-patterns.ll
index 46ebe6ba76567a..47d1d69f4ea357 100644
--- a/llvm/test/CodeGen/X86/andnot-patterns.ll
+++ b/llvm/test/CodeGen/X86/andnot-patterns.ll
@@ -321,22 +321,18 @@ define i8 @andnot_rotr_i8(i8 %a0, i8 %a1, i8 %a2) nounwind {
 define i64 @andnot_bswap_i64(i64 %a0, i64 %a1) nounwind {
 ; X86-LABEL: andnot_bswap_i64:
 ; X86:       # %bb.0:
-; X86-NEXT:    movl {{[0-9]+}}(%esp), %edx
+; X86-NEXT:    movl {{[0-9]+}}(%esp), %ecx
 ; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
-; X86-NEXT:    notl %eax
-; X86-NEXT:    notl %edx
-; X86-NEXT:    bswapl %edx
 ; X86-NEXT:    bswapl %eax
-; X86-NEXT:    andl {{[0-9]+}}(%esp), %eax
-; X86-NEXT:    andl {{[0-9]+}}(%esp), %edx
+; X86-NEXT:    andnl {{[0-9]+}}(%esp), %eax, %eax
+; X86-NEXT:    bswapl %ecx
+; X86-NEXT:    andnl {{[0-9]+}}(%esp), %ecx, %edx
 ; X86-NEXT:    retl
 ;
 ; X64-LABEL: andnot_bswap_i64:
 ; X64:       # %bb.0:
-; X64-NEXT:    movq %rsi, %rax
-; X64-NEXT:    notq %rax
-; X64-NEXT:    bswapq %rax
-; X64-NEXT:    andq %rdi, %rax
+; X64-NEXT:    bswapq %rsi
+; X64-NEXT:    andnq %rdi, %rsi, %rax
 ; X64-NEXT:    retq
   %not = xor i64 %a1, -1
   %bswap = tail call i64 @llvm.bswap.i64(i64 %not)
@@ -348,17 +344,14 @@ define i32 @andnot_bswap_i32(i32 %a0, i32 %a1) nounwind {
 ; X86-LABEL: andnot_bswap_i32:
 ; X86:       # %bb.0:
 ; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
-; X86-NEXT:    notl %eax
 ; X86-NEXT:    bswapl %eax
-; X86-NEXT:    andl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    andnl {{[0-9]+}}(%esp), %eax, %eax
 ; X86-NEXT:    retl
 ;
 ; X64-LABEL: andnot_bswap_i32:
 ; X64:       # %bb.0:
-; X64-NEXT:    movl %esi, %eax
-; X64-NEXT:    notl %eax
-; X64-NEXT:    bswapl %eax
-; X64-NEXT:    andl %edi, %eax
+; X64-NEXT:    bswapl %esi
+; X64-NEXT:    andnl %edi, %esi, %eax
 ; X64-NEXT:    retq
   %not = xor i32 %a1, -1
   %bswap = tail call i32 @llvm.bswap.i32(i32 %not)
@@ -399,8 +392,24 @@ define i64 @andnot_bitreverse_i64(i64 %a0, i64 %a1) nounwind {
 ; X86:       # %bb.0:
 ; X86-NEXT:    movl {{[0-9]+}}(%esp), %ecx
 ; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
-; X86-NEXT:    notl %eax
-; X86-NEXT:    notl %ecx
+; X86-NEXT:    bswapl %eax
+; X86-NEXT:    movl %eax, %edx
+; X86-NEXT:    andl $252645135, %edx # imm = 0xF0F0F0F
+; X86-NEXT:    shll $4, %edx
+; X86-NEXT:    shrl $4, %eax
+; X86-NEXT:    andl $252645135, %eax # imm = 0xF0F0F0F
+; X86-NEXT:    orl %edx, %eax
+; X86-NEXT:    movl %eax, %edx
+; X86-NEXT:    andl $858993459, %edx # imm = 0x33333333
+; X86-NEXT:    shrl $2, %eax
+; X86-NEXT:    andl $858993459, %eax # imm = 0x33333333
+; X86-NEXT:    leal (%eax,%edx,4), %eax
+; X86-NEXT:    movl %eax, %edx
+; X86-NEXT:    andl $1431655765, %edx # imm = 0x55555555
+; X86-NEXT:    shrl %eax
+; X86-NEXT:    andl $1431655765, %eax # imm = 0x55555555
+; X86-NEXT:    leal (%eax,%edx,2), %eax
+; X86-NEXT:    andnl {{[0-9]+}}(%esp), %eax, %eax
 ; X86-NEXT:    bswapl %ecx
 ; X86-NEXT:    movl %ecx, %edx
 ; X86-NEXT:    andl $252645135, %edx # imm = 0xF0F0F0F
@@ -417,31 +426,12 @@ define i64 @andnot_bitreverse_i64(i64 %a0, i64 %a1) nounwind {
 ; X86-NEXT:    andl $1431655765, %edx # imm = 0x55555555
 ; X86-NEXT:    shrl %ecx
 ; X86-NEXT:    andl $1431655765, %ecx # imm = 0x55555555
-; X86-NEXT:    leal (%ecx,%edx,2), %edx
-; X86-NEXT:    bswapl %eax
-; X86-NEXT:    movl %eax, %ecx
-; X86-NEXT:    andl $252645135, %ecx # imm = 0xF0F0F0F
-; X86-NEXT:    shll $4, %ecx
-; X86-NEXT:    shrl $4, %eax
-; X86-NEXT:    andl $252645135, %eax # imm = 0xF0F0F0F
-; X86-NEXT:    orl %ecx, %eax
-; X86-NEXT:    movl %eax, %ecx
-; X86-NEXT:    andl $858993459, %ecx # imm = 0x33333333
-; X86-NEXT:    shrl $2, %eax
-; X86-NEXT:    andl $858993459, %eax # imm = 0x33333333
-; X86-NEXT:    leal (%eax,%ecx,4), %eax
-; X86-NEXT:    movl %eax, %ecx
-; X86-NEXT:    andl $1431655765, %ecx # imm = 0x55555555
-; X86-NEXT:    shrl %eax
-; X86-NEXT:    andl $1431655765, %eax # imm = 0x55555555
-; X86-NEXT:    leal (%eax,%ecx,2), %eax
-; X86-NEXT:    andl {{[0-9]+}}(%esp), %eax
-; X86-NEXT:    andl {{[0-9]+}}(%esp), %edx
+; X86-NEXT:    leal (%ecx,%edx,2), %ecx
+; X86-NEXT:    andnl {{[0-9]+}}(%esp), %ecx, %edx
 ; X86-NEXT:    retl
 ;
 ; X64-LABEL: andnot_bitreverse_i64:
 ; X64:       # %bb.0:
-; X64-NEXT:    notq %rsi
 ; X64-NEXT:    bswapq %rsi
 ; X64-NEXT:    movq %rsi, %rax
 ; X64-NEXT:    shrq $4, %rax
@@ -462,7 +452,7 @@ define i64 @andnot_bitreverse_i64(i64 %a0, i64 %a1) nounwind {
 ; X64-NEXT:    shrq %rax
 ; X64-NEXT:    andq %rcx, %rax
 ; X64-NEXT:    leaq (%rax,%rdx,2), %rax
-; X64-NEXT:    andq %rdi, %rax
+; X64-NEXT:    andnq %rdi, %rax, %rax
 ; X64-NEXT:    retq
   %not = xor i64 %a1, -1
   %bitrev = tail call i64 @llvm.bitreverse.i64(i64 %not)
@@ -474,7 +464,6 @@ define i32 @andnot_bitreverse_i32(i32 %a0, i32 %a1) nounwind {
 ; X86-LABEL: andnot_bitreverse_i32:
 ; X86:       # %bb.0:
 ; X86-NEXT:    movl {{[0-9]+}}(%esp), %eax
-; X86-NEXT:    notl %eax
 ; X86-NEXT:    bswapl %eax
 ; X86-NEXT:    movl %eax, %ecx
 ; X86-NEXT:    andl $252645135, %ecx # imm = 0xF0F0F0F
@@ -492,13 +481,12 @@ define i32 @andnot_bitreverse_i32(i32 %a0, i32 %a1) nounwind {
 ; X86-NEXT:    shrl %eax
 ; X86-NEXT:    andl $1431655765, %eax # imm = 0x55555555
 ; X86-NEXT:    leal (%eax,%ecx,2), %eax
-; X86-NEXT:    andl {{[0-9]+}}(%esp), %eax
+; X86-NEXT:    andnl {{[0-9]+}}(%esp), %eax, %eax
 ; X86-NEXT:    retl
 ;
 ; X64-LABEL: andnot_bitreverse_i32:
 ; X64:       # %bb.0:
 ; X64-NEXT:    # kill: def $esi killed $esi def $rsi
-; X64-NEXT:    notl %esi
 ; X64-NEXT:    bswapl %esi
 ; X64-NEXT:    movl %esi, %eax
 ; X64-NEXT:    andl $252645135, %eax # imm = 0xF0F0F0F
@@ -516,7 +504,7 @@ define i32 @andnot_bitreverse_i32(i32 %a0, i32 %a1) nounwind {
 ; X64-NEXT:    shrl %eax
 ; X64-NEXT:    andl $1431655765, %eax # imm = 0x55555555
 ; X64-NEXT:    leal (%rax,%rcx,2), %eax
-; X64-NEXT:    andl %edi, %eax
+; X64-NEXT:    andnl %edi, %eax, %eax
 ; X64-NEXT:    retq
   %not = xor i32 %a1, -1
   %bitrev = tail call i32 @llvm.bitreverse.i32(i32 %not)

@llvmbot
Member

llvmbot commented Oct 16, 2024

@llvm/pr-subscribers-llvm-selectiondag


Member

@dtcxzyw dtcxzyw left a comment


LGTM.

for (unsigned Opc : {ISD::BSWAP, ISD::BITREVERSE})
if (sd_match(N, m_And(m_Value(X),
m_OneUse(m_UnaryOp(Opc, m_Not(m_Value(Y)))))) &&
!sd_match(X, m_Not(m_Value())))
Contributor


Is this necessary? Seems like it exposes ~X & ~Y -> ~(X | Y)

Member


It is ~X & bitreverse/bswap(~Y).

Collaborator Author


Yes - it's to prevent the fold when we already have an ANDNOT pattern.

Contributor


I mean, isn't ~X & bitreverse/bswap(~Y) -> ~X & ~bitreverse/bswap(Y) also profitable? Irrespective of ANDNOT, in fact.
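The profitability claim in this thread can be sanity-checked numerically: once the NOT is reassociated out of the bswap, De Morgan collapses the two NOTs into one, so the rewrite helps even without ANDNOT. An illustrative Python sketch (not from the PR; helper names are made up):

```python
MASK = 0xFFFFFFFF  # model i32 arithmetic

def bswap32(v: int) -> int:
    """Byte-swap a 32-bit value (models ISD::BSWAP)."""
    return int.from_bytes(v.to_bytes(4, "little"), "big")

def nt(v: int) -> int:
    """32-bit bitwise NOT."""
    return ~v & MASK

x, y = 0xDEADBEEF, 0x01234567
lhs = nt(x) & bswap32(nt(y))   # ~X & bswap(~Y): two NOTs plus an AND
rhs = nt(x) & nt(bswap32(y))   # after reassociating the NOT above the bswap
demorgan = nt(x | bswap32(y))  # ~(X | bswap(Y)): a single NOT plus an OR
assert lhs == rhs == demorgan
```

The De Morgan form needs only one NOT, which is why the `~X` special case is worth considering separately from the ANDNOT path.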

Contributor


I guess assuming ~Y is one-use.

Collaborator Author


Yes, for the NOT(Y) we can allow multiple uses, as with an ANDNOT we'll be getting it for free - I'd be willing to allow this fold on targets without ANDNOT if, for those targets, we require the NOT(Y) to have one use?

Contributor


Yeah, something like

if (sd_match(N, m_And(m_Value(X),
                      m_OneUse(m_UnaryOp(Opc, m_Value(NotY))))) &&
    sd_match(NotY, m_Not(m_Value(Y))) &&
    (TLI.hasAndNot(SDValue(N, 0)) || NotY->hasOneUse()))

should work.

RKSimon added a commit that referenced this pull request Oct 18, 2024
Extra test coverage for #112547 to test cases where we don't create an ANDNOT instruction
@RKSimon RKSimon force-pushed the dag-andnot-bitswap branch from e582048 to f6a7038 Compare October 18, 2024 16:59
@RKSimon
Collaborator Author

RKSimon commented Oct 18, 2024

Added non-ANDNOT handling for single use cases of the NOT

@RKSimon RKSimon requested review from goldsteinn and dtcxzyw October 18, 2024 17:13
@@ -529,8 +815,7 @@ define i32 @andnot_bitreverse_i32(i32 %a0, i32 %a1) nounwind {
define i16 @andnot_bitreverse_i16(i16 %a0, i16 %a1) nounwind {
; X86-LABEL: andnot_bitreverse_i16:
; X86: # %bb.0:
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: notl %eax
; X86-NEXT: movzwl {{[0-9]+}}(%esp), %eax
Contributor


Is this equivalent?

Collaborator Author


yes - we only care about the lower 16 bits, but arg passing allows us to dereference 32 bits
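The point about only the low 16 bits mattering can be stated as a small identity: applying a 32-bit NOT and then truncating to i16 gives the same result as truncating first and applying a 16-bit NOT. A hypothetical check (not from the PR):

```python
def check_i16(a: int) -> None:
    # 32-bit NOT then truncate to 16 bits ...
    wide = (~a & 0xFFFFFFFF) & 0xFFFF
    # ... equals truncate to 16 bits then 16-bit NOT
    narrow = (~(a & 0xFFFF)) & 0xFFFF
    assert wide == narrow

check_i16(0xABCD1234)
```

So widening the `not` to 32 bits (because argument passing lets the backend touch the full 32-bit slot) cannot change the low 16 bits of the result.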

if (sd_match(N,
m_And(m_Value(X), m_OneUse(m_UnaryOp(Opc, m_Value(NotY))))) &&
sd_match(NotY, m_Not(m_Value(Y))) &&
(TLI.hasAndNot(SDValue(N, 0)) || NotY->hasOneUse()))
Contributor


Maybe it should be || (NotY->hasOneUse() && sd_match(X, m_Not(m_Value())))

There is a lot of diff from just re-associating the NOT, which seems neutral.

No strong opinion, however.

@RKSimon RKSimon force-pushed the dag-andnot-bitswap branch from 967cf46 to 77adbb3 Compare October 25, 2024 17:12
@goldsteinn
Contributor

LGTM

Member

@dtcxzyw dtcxzyw left a comment


LG

@RKSimon RKSimon changed the title [DAG] Fold (and X, (bswap/bitreverse (not Y))) -> (and X, (not (bswap/bitreverse Y))) on ANDNOT capable targets [DAG] Fold (and X, (bswap/bitreverse (not Y))) -> (and X, (not (bswap/bitreverse Y))) Oct 27, 2024
@RKSimon RKSimon merged commit 056cf93 into llvm:main Oct 28, 2024
8 checks passed
@RKSimon RKSimon deleted the dag-andnot-bitswap branch October 28, 2024 11:52
RKSimon added a commit that referenced this pull request Oct 30, 2024
On ANDNOT-capable targets we can always do this profitably; without ANDNOT we only attempt it if we don't introduce an additional NOT

Followup to #112547
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
…/bitreverse Y))) (llvm#112547)

On ANDNOT-capable targets we can always do this profitably; without ANDNOT we only attempt it if we don't introduce an additional NOT

Fixes llvm#112425
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
On ANDNOT-capable targets we can always do this profitably; without ANDNOT we only attempt it if we don't introduce an additional NOT

Followup to llvm#112547
Labels
backend:X86 llvm:SelectionDAG SelectionDAGISel as well

Successfully merging this pull request may close these issues.

Allow AND-NOT to be emitted from NOT + <some bit/byte reordering> + AND
4 participants