Skip to content

Enable the gimple verification #4

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 11, 2021
Merged

Enable the gimple verification #4

merged 1 commit into from
Oct 11, 2021

Conversation

antoyo
Copy link

@antoyo antoyo commented Oct 10, 2021

No description provided.

@antoyo antoyo merged commit 5182fae into master Oct 11, 2021
@antoyo antoyo deleted the enable-verify-gimple branch October 11, 2021 14:26
antoyo pushed a commit that referenced this pull request Jan 9, 2023
With many thanks to H.J. for doing all the hard work, this patch resolves
two P1 regressions; PR target/106933 and PR target/106959.

Although superficially similar, the i386 backend's two scalar-to-vector
(STV) passes perform their transformations in importantly different ways.
The original pass converting SImode and DImode operations to V4SImode
or V2DImode operations is "soft", allowing values to be maintained in
both integer and vector hard registers.  The newer pass converting TImode
operations to V1TImode is "hard" (all or nothing) that converts all uses
of a pseudo to vector form.  To implement this it invokes powerful ju-ju
calling SET_MODE on a reg_rtx, which due to RTL sharing, often updates
this pseudo's mode everywhere in the RTL chain.  Hence, TImode STV can only
be performed when all uses of a pseudo are convertible to V1TImode form.
To ensure this the STV passes currently use data-flow analysis to inspect
all DEFs and USEs in a chain.  This works fine for chains that are in
the usual single assignment form, but the occurrence of uninitialized
variables, or multiple assignments that split a pseudo's usage into
several independent chains (lifetimes) can lead to situations where
some but not all of a pseudo's occurrences need to be updated.  This is
safe for the SImode/DImode pass, but leads to the above bugs during
the TImode pass.

My one minor tweak to HJ's patch from comment #4 of bugzilla PR106959
is to only perform the new single_def_chain_p check for TImode STV; it
turns out that STV of SImode/DImode min/max operates safely on multiple-def
chains, and prohibiting this leads to testsuite regressions.  We don't
(yet) support V1TImode min/max, so this idiom isn't an issue during the
TImode STV pass.

For the record, the two alternate possible fixes are (i) make the TImode
STV pass "soft", by eliminating use of SET_MODE, instead using replace_rtx
with a new pseudo, or (ii) merging "chains" so that multiple DFA
chains/lifetimes are considered a single STV chain.

2022-12-23  H.J. Lu  <[email protected]>
	    Roger Sayle  <[email protected]>

gcc/ChangeLog
	PR target/106933
	PR target/106959
	* config/i386/i386-features.cc (single_def_chain_p): New predicate
	function to check that a pseudo's use-def chain is in SSA form.
	(timode_scalar_to_vector_candidate_p): Check that TImode regs that
	are SET_DEST or SET_SRC of an insn match/are single_def_chain_p.

gcc/testsuite/ChangeLog
	PR target/106933
	PR target/106959
	* gcc.target/i386/pr106933-1.c: New test case.
	* gcc.target/i386/pr106933-2.c: Likewise.
	* gcc.target/i386/pr106959-1.c: Likewise.
	* gcc.target/i386/pr106959-2.c: Likewise.
	* gcc.target/i386/pr106959-3.c: Likewise.
antoyo pushed a commit that referenced this pull request Nov 21, 2024
Whenever C1 and C2 are integer constants, X is of a wrapping type, and
cmp is a relational operator, the expression X +- C1 cmp C2 can be
simplified in the following cases:

(a) If cmp is <= and C2 -+ C1 == +INF(1), we can transform the initial
comparison in the following way:
   X +- C1 <= C2
   -INF <= X +- C1 <= C2 (add left hand side which holds for any X, C1)
   -INF -+ C1 <= X <= C2 -+ C1 (add -+C1 to all 3 expressions)
   -INF -+ C1 <= X <= +INF (due to (1))
   -INF -+ C1 <= X (eliminate the right hand side since it holds for any X)

(b) By analogy, if cmp if >= and C2 -+ C1 == -INF(1), use the following
sequence of transformations:

   X +- C1 >= C2
   +INF >= X +- C1 >= C2 (add left hand side which holds for any X, C1)
   +INF -+ C1 >= X >= C2 -+ C1 (add -+C1 to all 3 expressions)
   +INF -+ C1 >= X >= -INF (due to (1))
   +INF -+ C1 >= X (eliminate the right hand side since it holds for any X)

(c) The > and < cases are negations of (a) and (b), respectively.

This transformation allows to occasionally save add / sub instructions,
for instance the expression

3 + (uint32_t)f() < 2

compiles to

cmn     w0, #4
cset    w0, ls

instead of

add     w0, w0, 3
cmp     w0, 2
cset    w0, ls

on aarch64.

Testcases that go together with this patch have been split into two
separate files, one containing testcases for unsigned variables and the
other for wrapping signed ones (and thus compiled with -fwrapv).
Additionally, one aarch64 test has been adjusted since the patch has
caused the generated code to change from

cmn     w0, #2
csinc   w0, w1, wzr, cc   (x < -2)

to

cmn     w0, #3
csinc   w0, w1, wzr, cs   (x <= -3)

This patch has been bootstrapped and regtested on aarch64, x86_64, and
i386, and additionally regtested on riscv32.

gcc/ChangeLog:

	PR tree-optimization/116024
	* match.pd: New transformation around integer comparison.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr116024-2.c: New test.
	* gcc.dg/tree-ssa/pr116024-2-fwrapv.c: Ditto.
	* gcc.target/aarch64/gtu_to_ltu_cmp_1.c: Adjust.
antoyo pushed a commit that referenced this pull request Nov 21, 2024
Update test case for armv8.1-m.main that supports conditional
arithmetic.

armv7-m:
        push    {r4, lr}
        ldr     r4, .L6
        ldr     r4, [r4]
        lsls    r4, r4, #29
        it      mi
        addmi   r2, r2, #1
        bl      bar
        movs    r0, #0
        pop     {r4, pc}

armv8.1-m.main:
        push    {r3, r4, r5, lr}
        ldr     r4, .L5
        ldr     r5, [r4]
        tst     r5, #4
        csinc   r2, r2, r2, eq
        bl      bar
        movs    r0, #0
        pop     {r3, r4, r5, pc}

gcc/testsuite/ChangeLog:

	* gcc.target/arm/epilog-1.c: Use check-function-bodies.

Signed-off-by: Torbjörn SVENSSON <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant