[InstCombine] Failure to factorize `add`/`sub` and `max`/`min` using distributivity #92433

Kmeakin · 2024-05-16T17:38:35Z

use std::cmp::max;
use std::cmp::min;

// * `(a umin c) umax (b umin c) => (a umax b) umin c`
#[no_mangle]
pub fn src1(a: u32, b: u32, c: u32) -> u32 {
    max(min(a, c), min(b, c))
}
#[no_mangle]
pub fn tgt1(a: u32, b: u32, c: u32) -> u32 {
    min(max(a, b), c)
}

// * `(a umax c) umin (b umax c) => (a umin b) umax c`
#[no_mangle]
pub fn src2(a: u32, b: u32, c: u32) -> u32 {
    min(max(a, c), max(b, c))
}
#[no_mangle]
pub fn tgt2(a: u32, b: u32, c: u32) -> u32 {
    max(min(a, b), c)
}

// * `(a smin c) smax (b smin c) => (a smax b) smin c`
#[no_mangle]
pub fn src3(a: i32, b: i32, c: i32) -> i32 {
    max(min(a, c), min(b, c))
}
#[no_mangle]
pub fn tgt3(a: i32, b: i32, c: i32) -> i32 {
    min(max(a, b), c)
}

// * `(a smax c) smin (b smax c) => (a smin b) smax c`
#[no_mangle]
pub fn src4(a: i32, b: i32, c: i32) -> i32 {
    min(max(a, c), max(b, c))
}
#[no_mangle]
pub fn tgt4(a: i32, b: i32, c: i32) -> i32 {
    max(min(a, b), c)
}

// * `umax(a +usat b, a +usat c) => a +usat (b umax c)`
#[no_mangle]
pub fn src5(a: u32, b: u32, c: u32) -> u32 {
    max(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt5(a: u32, b: u32, c: u32) -> u32 {
    a.saturating_add(max(b, c))
}

// * `umax(a +nuw b, a +nuw c) => a +nuw (b umax c)`
#[no_mangle]
pub fn src6(a: u32, b: u32, c: u32) -> u32 {
    unsafe { max(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt6(a: u32, b: u32, c: u32) -> u32 {
    unsafe { a.unchecked_add(max(b, c)) }
}

// * `umin(a +usat b, a +usat c) => a +usat (b umin c)`
#[no_mangle]
pub fn src7(a: u32, b: u32, c: u32) -> u32 {
    min(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt7(a: u32, b: u32, c: u32) -> u32 {
    a.saturating_add(min(b, c))
}

// * `umin(a +nuw b, a +nuw c) => a +nuw (b umin c)`
#[no_mangle]
pub fn src8(a: u32, b: u32, c: u32) -> u32 {
    unsafe { min(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt8(a: u32, b: u32, c: u32) -> u32 {
    unsafe { a.unchecked_add(min(b, c)) }
}

// * `smax(a +ssat b, a +ssat c) => a +ssat (b smax c)`
#[no_mangle]
pub fn src10(a: i32, b: i32, c: i32) -> i32 {
    max(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt10(a: i32, b: i32, c: i32) -> i32 {
    a.saturating_add(max(b, c))
}

// * `smax(a +nsw b, a +nsw c) => a +nsw (b smax c)`
#[no_mangle]
pub fn src9(a: i32, b: i32, c: i32) -> i32 {
    unsafe { max(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt9(a: i32, b: i32, c: i32) -> i32 {
    unsafe { a.unchecked_add(max(b, c)) }
}

// * `smin(a +ssat b, a +ssat c) => a +ssat (b smin c)`
#[no_mangle]
pub fn src11(a: i32, b: i32, c: i32) -> i32 {
    min(a.saturating_add(b), a.saturating_add(c))
}
#[no_mangle]
pub fn tgt11(a: i32, b: i32, c: i32) -> i32 {
    a.saturating_add(min(b, c))
}

// * `smin(a +nsw b, a +nsw c) => a +nsw (b smin c)`
#[no_mangle]
pub fn src12(a: i32, b: i32, c: i32) -> i32 {
    unsafe { min(a.unchecked_add(b), a.unchecked_add(c)) }
}
#[no_mangle]
pub fn tgt12(a: i32, b: i32, c: i32) -> i32 {
    unsafe { a.unchecked_add(min(b, c)) }
}
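Editorial sketch, not part of the original report: the min/max and saturating-add identities above can be sanity-checked by brute force on 8-bit types. This is a spot check on a grid of values, not a proof (the Alive2 link plays that role), and the names `spot_check!`, `check_u8` and `check_i8` are hypothetical helpers introduced only for this example. The `+nuw`/`+nsw` variants are left out here because `unchecked_add` is undefined behaviour on overflow and cannot be tested by simply running it.

```rust
use std::cmp::{max, min};

// Generates a spot-check function for one 8-bit integer type.
// `spot_check!`, `check_u8` and `check_i8` are illustrative names only.
macro_rules! spot_check {
    ($name:ident, $t:ty) => {
        /// Panics on the first mismatch; returns the number of triples tested.
        fn $name() -> usize {
            // Every 5th value; for 8-bit types this grid contains both MIN and MAX.
            let vals: Vec<$t> = (<$t>::MIN..=<$t>::MAX).step_by(5).collect();
            let mut tested = 0;
            for &a in &vals {
                for &b in &vals {
                    for &c in &vals {
                        // min/max distributivity (src1 to src4).
                        assert_eq!(max(min(a, c), min(b, c)), min(max(a, b), c));
                        assert_eq!(min(max(a, c), max(b, c)), max(min(a, b), c));
                        // Saturating add factored out of max/min (src5, src7, src10, src11).
                        assert_eq!(
                            max(a.saturating_add(b), a.saturating_add(c)),
                            a.saturating_add(max(b, c))
                        );
                        assert_eq!(
                            min(a.saturating_add(b), a.saturating_add(c)),
                            a.saturating_add(min(b, c))
                        );
                        tested += 1;
                    }
                }
            }
            tested
        }
    };
}

spot_check!(check_u8, u8);
spot_check!(check_i8, i8);

fn main() {
    println!("u8 triples checked: {}", check_u8());
    println!("i8 triples checked: {}", check_i8());
}
```

The identities hold because `x -> a.saturating_add(x)` is monotonically non-decreasing in `x`, so it commutes with `min` and `max`.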

Kmeakin · 2024-05-16T17:47:33Z

Similar rewrites may be possible for floating point with appropriate fast-math flags, but Alive2 times out when I try to verify them.

dtcxzyw · 2024-05-17T08:49:53Z

I confirmed that the src6/src8/src9/src12 patterns exist in some real-world applications.

llvmbot · 2024-05-17T08:50:24Z

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

1. Check that no other contributor has already been assigned to this issue. If you believe that no one is actually working on it despite an assignment, ping the person. After one week without a response, the assignee may be changed.
2. In the comments of this issue, request for it to be assigned to you, or just create a pull request after following the steps below. Mention this issue in the description of the pull request.
3. Fix the issue locally.
4. Run the test suite locally. Remember that the subdirectories under `test/` create fine-grained testing targets, so you can e.g. use `make check-clang-ast` to only run Clang's AST tests.
5. Create a Git commit.
6. Run `git clang-format HEAD~1` to format your changes.
7. Open a pull request to the upstream repository on GitHub. Detailed instructions can be found in GitHub's documentation. Mention this issue in the description of the pull request.

If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.

llvmbot · 2024-05-17T08:50:25Z

@llvm/issue-subscribers-good-first-issue

Author: Karl Meakin (Kmeakin)

[alive proof](https://alive2.llvm.org/ce/z/2t4UfC)

Kmeakin · 2024-05-17T14:36:04Z

This also works for unsigned multiplication (but not signed): https://alive2.llvm.org/ce/z/jJE_FE

use std::cmp::max;
use std::cmp::min;

// * `umin(a *nuw b, a *nuw c) => a *nuw (b umin c)`
#[no_mangle]
pub unsafe fn src1(a: u8, b: u8, c: u8) -> u8 {
    min(a.unchecked_mul(b), a.unchecked_mul(c))
}

#[no_mangle]
pub unsafe fn tgt1(a: u8, b: u8, c: u8) -> u8 {
    a.unchecked_mul(min(b, c))
}

// * `umin(a *usat b, a *usat c) => a *usat (b umin c)`
#[no_mangle]
pub unsafe fn src2(a: u8, b: u8, c: u8) -> u8 {
    min(a.saturating_mul(b), a.saturating_mul(c))
}

#[no_mangle]
pub unsafe fn tgt2(a: u8, b: u8, c: u8) -> u8 {
    a.saturating_mul(min(b, c))
}

// * `umax(a *nuw b, a *nuw c) => a *nuw (b umax c)`
#[no_mangle]
pub unsafe fn src3(a: u8, b: u8, c: u8) -> u8 {
    max(a.unchecked_mul(b), a.unchecked_mul(c))
}

#[no_mangle]
pub unsafe fn tgt3(a: u8, b: u8, c: u8) -> u8 {
    a.unchecked_mul(max(b, c))
}

// * `umax(a *usat b, a *usat c) => a *usat (b umax c)`
#[no_mangle]
pub unsafe fn src4(a: u8, b: u8, c: u8) -> u8 {
    max(a.saturating_mul(b), a.saturating_mul(c))
}

#[no_mangle]
pub unsafe fn tgt4(a: u8, b: u8, c: u8) -> u8 {
    a.saturating_mul(max(b, c))
}
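Editorial sketch, not part of the original comment: the difference between unsigned and signed multiplication can be reproduced with an exhaustive 8-bit search. Here overflow reported by `checked_mul` stands in for the poison produced by `mul nuw`/`mul nsw`, which is an assumption of this model, and `mul_min_counterexample!`, `unsigned_counterexample` and `signed_counterexample` are hypothetical names.

```rust
use std::cmp::min;

// Generates a search for inputs where min(a*b, a*c) != a*min(b, c),
// considering only inputs for which neither product overflows.
// All names here are illustrative, not taken from LLVM or the issue.
macro_rules! mul_min_counterexample {
    ($name:ident, $t:ty) => {
        fn $name() -> Option<($t, $t, $t)> {
            for a in <$t>::MIN..=<$t>::MAX {
                for b in <$t>::MIN..=<$t>::MAX {
                    // Overflow models poison in the source pattern; skip it.
                    let Some(ab) = a.checked_mul(b) else { continue };
                    for c in <$t>::MIN..=<$t>::MAX {
                        let Some(ac) = a.checked_mul(c) else { continue };
                        // The factored form must produce the same value.
                        if Some(min(ab, ac)) != a.checked_mul(min(b, c)) {
                            return Some((a, b, c));
                        }
                    }
                }
            }
            None
        }
    };
}

mul_min_counterexample!(unsigned_counterexample, u8);
mul_min_counterexample!(signed_counterexample, i8);

fn main() {
    println!("u8: {:?}", unsigned_counterexample());
    println!("i8: {:?}", signed_counterexample());
}
```

A counterexample that is easy to check by hand is `a = -1, b = 1, c = 2`: `smin(-1, -2)` is `-2`, but `-1 * smin(1, 2)` is `-1`. Multiplying by a negative `a` reverses the order, which is why only the unsigned form is valid.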

jf-botto · 2024-05-18T16:45:09Z

I'd love to work on this if possible?

dtcxzyw · 2024-05-20T08:12:00Z

> I'd love to work on this if possible?

Sure! Welcome to LLVM!

Please read https://llvm.org/docs/InstCombineContributorGuide.html before submitting your patch :)

wizardengineer · 2024-07-23T04:26:50Z

@jf-botto are you still working on this?

jf-botto · 2024-07-23T06:59:13Z

> @jf-botto are you still working on this?

Yes. I just need to figure out the most complex cases before making a PR.

nikic · 2024-07-23T07:46:52Z

@jf-botto It's not necessary to handle all cases at once -- in fact, if you tried to do that, we'd likely ask for the PR to be split up...

This PR fixes part of #92433. It specifically adds the 4 cases mentioned in #92433 (comment). I've added 8 positive tests: 4 are the cases from the comment above and 4 are their commutative equivalents. Alive proof: https://alive2.llvm.org/ce/z/z6eFTb I've also added 8 negative tests to make sure we do not optimise when the relevant flags are not present, because the optimisation would be unsound without them. Alive proof that the optimisation is invalid in that case: https://alive2.llvm.org/ce/z/NvNjTD I had to use `i4` as the integer type to keep Alive from timing out and to fit all the cases on one page.
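Editorial sketch, not taken from the PR: the need for the negative tests can be illustrated with a small 4-bit model, mirroring the `i4` width used in the Alive proofs. The helpers `add4` and `count_mismatches` are hypothetical, and skipping overflowing inputs is only an approximation of how `nuw` makes the result poison.

```rust
use std::cmp::max;

// Largest value of the modelled 4-bit unsigned type.
const MASK: u8 = 0xF;

/// 4-bit unsigned addition: returns the wrapped sum and whether it overflowed.
/// `a` and `b` are expected to be in 0..=15, so the u8 sum cannot overflow.
fn add4(a: u8, b: u8) -> (u8, bool) {
    let sum = a + b;
    (sum & MASK, sum > MASK)
}

/// Counts triples where umax(a + b, a + c) differs from a + umax(b, c).
/// With `require_nuw`, inputs whose source additions overflow are skipped,
/// modelling the `nuw` flag; without it, additions simply wrap.
fn count_mismatches(require_nuw: bool) -> usize {
    let mut mismatches = 0;
    for a in 0..=MASK {
        for b in 0..=MASK {
            for c in 0..=MASK {
                let (ab, overflow_b) = add4(a, b);
                let (ac, overflow_c) = add4(a, c);
                if require_nuw && (overflow_b || overflow_c) {
                    continue;
                }
                let (factored, _) = add4(a, max(b, c));
                if max(ab, ac) != factored {
                    mismatches += 1;
                }
            }
        }
    }
    mismatches
}

fn main() {
    println!("mismatches with nuw:    {}", count_mismatches(true));
    println!("mismatches without nuw: {}", count_mismatches(false));
}
```

One wrapping counterexample is `a = 1, b = 0, c = 15`: `umax(1, 0)` is `1`, while `1 + umax(0, 15)` wraps to `0`.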


Kmeakin added llvm:instcombine missed-optimization labels May 16, 2024

dtcxzyw added the good first issue label May 17, 2024

dtcxzyw assigned jf-botto May 20, 2024

c8ef mentioned this issue Jun 25, 2024

[InstCombine] factorize max/min using distributivity #96645

Closed

jrose-signal mentioned this issue Jul 23, 2024

Missed optimization combining saturating-add with min #100277

Open

This was referenced Aug 1, 2024

[InstCombine] Factorise add/sub and max/min using distributivity #101507

Open

[InstCombine] Factorise Add and Min/Max using Distributivity #101717

Merged
