[InstCombine] Failure to factorize add/sub and max/min using distributivity #92433
Similar rewrites may be possible for floating point with appropriate fast-math flags, but Alive times out.
I confirmed that
Hi! This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:
If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.
@llvm/issue-subscribers-good-first-issue Author: Karl Meakin (Kmeakin)
[alive proof](https://alive2.llvm.org/ce/z/2t4UfC)
use std::cmp::max;
use std::cmp::min;
// * `(a umin c) umax (b umin c) => (a umax b) umin c`
#[no_mangle]
pub fn src1(a: u32, b: u32, c: u32) -> u32 {
max(min(a, c), min(b, c))
}
#[no_mangle]
pub fn tgt1(a: u32, b: u32, c: u32) -> u32 {
min(max(a, b), c)
}
// * `(a umax c) umin (b umax c) => (a umin b) umax c`
#[no_mangle]
pub fn src2(a: u32, b: u32, c: u32) -> u32 {
min(max(a, c), max(b, c))
}
#[no_mangle]
pub fn tgt2(a: u32, b: u32, c: u32) -> u32 {
max(min(a, b), c)
}
// * `(a smin c) smax (b smin c) => (a smax b) smin c`
#[no_mangle]
pub fn src3(a: i32, b: i32, c: i32) -> i32 {
max(min(a, c), min(b, c))
}
#[no_mangle]
pub fn tgt3(a: i32, b: i32, c: i32) -> i32 {
min(max(a, b), c)
}
// * `(a smax c) smin (b smax c) => (a smin b) smax c`
#[no_mangle]
pub fn src4(a: i32, b: i32, c: i32) -> i32 {
min(max(a, c), max(b, c))
}
#[no_mangle]
pub fn tgt4(a: i32, b: i32, c: i32) -> i32 {
max(min(a, b), c)
}
// * `umax(a +usat b, a +usat c) => a +usat (b umax c)`
#[no_mangle]
pub fn src5(a: u32, b: u32, c: u32) -> u32 {
max(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt5(a: u32, b: u32, c: u32) -> u32 {
a.saturating_add(max(b, c))
}
// * `umax(a +nuw b, a +nuw c) => a +nuw (b umax c)`
#[no_mangle]
pub fn src6(a: u32, b: u32, c: u32) -> u32 {
unsafe { max(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt6(a: u32, b: u32, c: u32) -> u32 {
unsafe { a.unchecked_add(max(b, c)) }
}
// * `umin(a +usat b, a +usat c) => a +usat (b umin c)`
#[no_mangle]
pub fn src7(a: u32, b: u32, c: u32) -> u32 {
min(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt7(a: u32, b: u32, c: u32) -> u32 {
a.saturating_add(min(b, c))
}
// * `umin(a +nuw b, a +nuw c) => a +nuw (b umin c)`
#[no_mangle]
pub fn src8(a: u32, b: u32, c: u32) -> u32 {
unsafe { min(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt8(a: u32, b: u32, c: u32) -> u32 {
unsafe { a.unchecked_add(min(b, c)) }
}
// * `smax(a +ssat b, a +ssat c) => a +ssat (b smax c)`
#[no_mangle]
pub fn src10(a: i32, b: i32, c: i32) -> i32 {
max(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt10(a: i32, b: i32, c: i32) -> i32 {
a.saturating_add(max(b, c))
}
// * `smax(a +nsw b, a +nsw c) => a +nsw (b smax c)`
#[no_mangle]
pub fn src9(a: i32, b: i32, c: i32) -> i32 {
unsafe { max(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt9(a: i32, b: i32, c: i32) -> i32 {
unsafe { a.unchecked_add(max(b, c)) }
}
// * `smin(a +ssat b, a +ssat c) => a +ssat (b smin c)`
#[no_mangle]
pub fn src11(a: i32, b: i32, c: i32) -> i32 {
min(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt11(a: i32, b: i32, c: i32) -> i32 {
a.saturating_add(min(b, c))
}
// * `smin(a +nsw b, a +nsw c) => a +nsw (b smin c)`
#[no_mangle]
pub fn src12(a: i32, b: i32, c: i32) -> i32 {
unsafe { min(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt12(a: i32, b: i32, c: i32) -> i32 {
unsafe { a.unchecked_add(min(b, c)) }
}
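The min/max and add rewrites listed above can also be sanity-checked by brute force, independently of Alive. The following is a minimal sketch, not part of the issue or of LLVM: it assumes 8-bit integers are representative, and it models `add nuw` / `add nsw` with `checked_add`, skipping inputs where a source add would overflow (the IR result would be poison there). The function name `count_failures` and its `stride` parameter are hypothetical.

```rust
use std::cmp::{max, min};

/// Counts the (a, b, c) triples for which any rewrite changes the result.
/// `stride` (hypothetical, must be non-zero) thins out the sampled inputs;
/// `stride == 1` visits every 8-bit triple.
pub fn count_failures(stride: usize) -> u32 {
    let mut failures = 0u32;
    for a in (0..=u8::MAX).step_by(stride) {
        for b in (0..=u8::MAX).step_by(stride) {
            for c in (0..=u8::MAX).step_by(stride) {
                // Reinterpret the same bit patterns as signed values.
                let (sa, sb, sc) = (a as i8, b as i8, c as i8);

                // min and max distribute over each other (unsigned and signed).
                let lattice = max(min(a, c), min(b, c)) == min(max(a, b), c)
                    && min(max(a, c), max(b, c)) == max(min(a, b), c)
                    && max(min(sa, sc), min(sb, sc)) == min(max(sa, sb), sc)
                    && min(max(sa, sc), max(sb, sc)) == max(min(sa, sb), sc);

                // Saturating add is defined for every input.
                let saturating = max(a.saturating_add(b), a.saturating_add(c))
                    == a.saturating_add(max(b, c))
                    && min(a.saturating_add(b), a.saturating_add(c))
                        == a.saturating_add(min(b, c))
                    && max(sa.saturating_add(sb), sa.saturating_add(sc))
                        == sa.saturating_add(max(sb, sc))
                    && min(sa.saturating_add(sb), sa.saturating_add(sc))
                        == sa.saturating_add(min(sb, sc));

                // Assumption: `add nuw` is modelled by `checked_add`; triples
                // where a source add overflows are skipped.
                let no_wrap_unsigned = match (a.checked_add(b), a.checked_add(c)) {
                    (Some(x), Some(y)) => {
                        a.checked_add(max(b, c)) == Some(max(x, y))
                            && a.checked_add(min(b, c)) == Some(min(x, y))
                    }
                    _ => true,
                };

                // Same modelling for `add nsw`.
                let no_wrap_signed = match (sa.checked_add(sb), sa.checked_add(sc)) {
                    (Some(x), Some(y)) => {
                        sa.checked_add(max(sb, sc)) == Some(max(x, y))
                            && sa.checked_add(min(sb, sc)) == Some(min(x, y))
                    }
                    _ => true,
                };

                if !(lattice && saturating && no_wrap_unsigned && no_wrap_signed) {
                    failures += 1;
                }
            }
        }
    }
    failures
}
```

Calling `count_failures(1)` visits all 2^24 triples; a larger stride trades coverage for speed. A brute-force check like this only complements the Alive proof, since it covers one bit width and one way of modelling poison.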
This also works for unsigned multiplication (but not signed): https://alive2.llvm.org/ce/z/jJE_FE
use std::cmp::max;
use std::cmp::min;
// * `umin(a *nuw b, a *nuw c) => a *nuw (b umin c)`
#[no_mangle]
pub unsafe fn src1(a: u8, b: u8, c: u8) -> u8 {
min(a.unchecked_mul(b), a.unchecked_mul(c))
}
#[no_mangle]
pub unsafe fn tgt1(a: u8, b: u8, c: u8) -> u8 {
a.unchecked_mul(min(b, c))
}
// * `umin(a *usat b, a *usat c) => a *usat (b umin c)`
#[no_mangle]
pub unsafe fn src2(a: u8, b: u8, c: u8) -> u8 {
min(a.saturating_mul(b), a.saturating_mul(c))
}
#[no_mangle]
pub unsafe fn tgt2(a: u8, b: u8, c: u8) -> u8 {
a.saturating_mul(min(b, c))
}
// * `umax(a *nuw b, a *nuw c) => a *nuw (b umax c)`
#[no_mangle]
pub unsafe fn src3(a: u8, b: u8, c: u8) -> u8 {
max(a.unchecked_mul(b), a.unchecked_mul(c))
}
#[no_mangle]
pub unsafe fn tgt3(a: u8, b: u8, c: u8) -> u8 {
a.unchecked_mul(max(b, c))
}
// * `umax(a *usat b, a *usat c) => a *usat (b umax c)`
#[no_mangle]
pub unsafe fn src4(a: u8, b: u8, c: u8) -> u8 {
max(a.saturating_mul(b), a.saturating_mul(c))
}
#[no_mangle]
pub unsafe fn tgt4(a: u8, b: u8, c: u8) -> u8 {
a.saturating_mul(max(b, c))
}
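To illustrate why the multiplication rewrite is limited to unsigned values, here is a small sketch under the same modelling assumption as the snippets above (`mul nuw` / `mul nsw` approximated by `checked_mul`, with overflowing inputs skipped). The helper names are hypothetical. The intuition is that multiplying by a negative `a` reverses the ordering of `b` and `c`, so `min` no longer commutes with the multiplication.

```rust
use std::cmp::{max, min};

/// True if the unsigned rewrites hold for this triple (saturating and no-wrap).
pub fn unsigned_fold_holds(a: u8, b: u8, c: u8) -> bool {
    // Saturating multiply is defined for every input.
    let saturating = min(a.saturating_mul(b), a.saturating_mul(c))
        == a.saturating_mul(min(b, c))
        && max(a.saturating_mul(b), a.saturating_mul(c)) == a.saturating_mul(max(b, c));
    // Assumption: `mul nuw` is modelled by `checked_mul`; overflow is skipped.
    let no_wrap = match (a.checked_mul(b), a.checked_mul(c)) {
        (Some(x), Some(y)) => {
            a.checked_mul(min(b, c)) == Some(min(x, y))
                && a.checked_mul(max(b, c)) == Some(max(x, y))
        }
        _ => true,
    };
    saturating && no_wrap
}

/// True if `smin(a * b, a * c) == a * smin(b, c)` holds for this triple,
/// skipping triples where a source multiply overflows.
pub fn signed_fold_holds(a: i8, b: i8, c: i8) -> bool {
    match (a.checked_mul(b), a.checked_mul(c)) {
        (Some(x), Some(y)) => a.checked_mul(min(b, c)) == Some(min(x, y)),
        _ => true,
    }
}

/// Searches all 8-bit signed triples for an input where the signed fold fails.
pub fn first_signed_counterexample() -> Option<(i8, i8, i8)> {
    for a in i8::MIN..=i8::MAX {
        for b in i8::MIN..=i8::MAX {
            for c in i8::MIN..=i8::MAX {
                if !signed_fold_holds(a, b, c) {
                    return Some((a, b, c));
                }
            }
        }
    }
    None
}
```

For example, with `a = -1`, `b = 1`, `c = 2` the source computes `min(-1, -2) = -2`, while the folded form computes `-1 * min(1, 2) = -1`.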
I'd love to work on this, if possible.
Sure! Welcome to LLVM! Please read https://llvm.org/docs/InstCombineContributorGuide.html before submitting your patch :)
@jf-botto are you still working on this?
Yes. I just need to figure out the most complex cases before making a PR.
@jf-botto It's not necessary to handle all cases at once -- in fact, if you tried to do that, we'd likely ask for the PR to be split up...
This PR fixes part of #92433. It specifically adds the 4 cases mentioned in #92433 (comment). I've added 8 positive tests: the 4 mentioned in the comment above and their 4 commutative equivalents. Alive proof: https://alive2.llvm.org/ce/z/z6eFTb I've also added 8 negative tests to make sure we do not optimise when the relevant flags are not present, because the optimisation wouldn't be sound without them. Alive proof that the optimisation is invalid: https://alive2.llvm.org/ce/z/NvNjTD I had to make the integer types `i4` so that Alive does not time out and so that the tests all fit on one page.
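A concrete input makes the need for the negative tests easier to see. The sketch below assumes that the "relevant flags" are the no-wrap flags (`nuw`/`nsw`) used in the rewrites listed earlier in this thread; the PR text itself does not spell this out. It models an add without `nuw` using `wrapping_add`, and the function name is hypothetical.

```rust
use std::cmp::max;

/// True if `umax(a + b, a + c) == a + umax(b, c)` holds when the adds are
/// allowed to wrap (i.e. an `add` without the `nuw` flag, by assumption).
pub fn wrapping_fold_holds(a: u8, b: u8, c: u8) -> bool {
    max(a.wrapping_add(b), a.wrapping_add(c)) == a.wrapping_add(max(b, c))
}
```

With `a = 255`, `b = 1`, `c = 0`, the source computes `max(0, 255) = 255`, but the folded form computes `255 + 1`, which wraps to `0`. The fold is therefore only valid when wrapping is ruled out.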