Commit 24a7980

PR target/106933: Limit TImode STV to SSA-like def-use chains on x86.
With many thanks to H.J. for doing all the hard work, this patch resolves two P1 regressions: PR target/106933 and PR target/106959.

Although superficially similar, the i386 backend's two scalar-to-vector (STV) passes perform their transformations in importantly different ways. The original pass, which converts SImode and DImode operations to V4SImode or V2DImode operations, is "soft": it allows values to be maintained in both integer and vector hard registers. The newer pass, which converts TImode operations to V1TImode, is "hard" (all or nothing): it converts all uses of a pseudo to vector form. To implement this it invokes powerful ju-ju, calling SET_MODE on a reg_rtx, which, due to RTL sharing, often updates this pseudo's mode everywhere in the RTL chain. Hence, TImode STV can only be performed when all uses of a pseudo are convertible to V1TImode form.

To ensure this, the STV passes currently use data-flow analysis to inspect all DEFs and USEs in a chain. This works fine for chains that are in the usual single assignment form, but uninitialized variables, or multiple assignments that split a pseudo's usage into several independent chains (lifetimes), can lead to situations where some but not all of a pseudo's occurrences need to be updated. This is safe for the SImode/DImode pass, but leads to the above bugs during the TImode pass.

My one minor tweak to HJ's patch from comment #4 of bugzilla PR106959 is to only perform the new single_def_chain_p check for TImode STV; it turns out that STV of SImode/DImode min/max operates safely on multiple-def chains, and prohibiting this leads to testsuite regressions. We don't (yet) support V1TImode min/max, so this idiom isn't an issue during the TImode STV pass.
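The "hard" behaviour described above can be illustrated with a small standalone model. This is only a sketch, not GCC code: `Mode`, `Reg`, `Occurrence`, `hard_convert` and `safe_to_hard_convert` are hypothetical names invented for illustration, and a `std::shared_ptr` stands in for RTL sharing of a register rtx. It shows that changing the mode on the shared object is seen by every occurrence, including ones in a lifetime the pass never inspected.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for machine modes; not GCC's machine_mode.
enum class Mode { TI, V1TI };

// Toy register object.  All occurrences of a pseudo share one instance,
// which is the assumption this sketch makes about RTL sharing.
struct Reg {
  unsigned regno;
  Mode mode;
};

// One occurrence of a pseudo in some instruction.
struct Occurrence {
  std::shared_ptr<Reg> reg;
  int chain_id;      // which def-use chain (lifetime) it belongs to
  bool convertible;  // whether its instruction has a V1TImode form
};

// "Hard" conversion: mutate the shared object in place.
inline void hard_convert(Reg& reg) { reg.mode = Mode::V1TI; }

// The in-place update is safe only if every occurrence of the register,
// across all chains, is convertible.
inline bool safe_to_hard_convert(const std::vector<Occurrence>& occs,
                                 const Reg& reg) {
  for (const Occurrence& o : occs)
    if (o.reg.get() == &reg && !o.convertible)
      return false;
  return true;
}
```

For instance, with two occurrences sharing one `Reg` where only the one in chain 0 is convertible, `safe_to_hard_convert` returns false, and calling `hard_convert` anyway would also switch the mode seen by the occurrence in chain 1.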
For the record, the two alternate possible fixes are (i) make the TImode STV pass "soft", by eliminating use of SET_MODE, instead using replace_rtx with a new pseudo, or (ii) merging "chains" so that multiple DFA chains/lifetimes are considered a single STV chain.

2022-12-23  H.J. Lu  <[email protected]>
            Roger Sayle  <[email protected]>

gcc/ChangeLog
        PR target/106933
        PR target/106959
        * config/i386/i386-features.cc (single_def_chain_p): New predicate
        function to check that a pseudo's use-def chain is in SSA form.
        (timode_scalar_to_vector_candidate_p): Check that TImode regs that
        are SET_DEST or SET_SRC of an insn match/are single_def_chain_p.

gcc/testsuite/ChangeLog
        PR target/106933
        PR target/106959
        * gcc.target/i386/pr106933-1.c: New test case.
        * gcc.target/i386/pr106933-2.c: Likewise.
        * gcc.target/i386/pr106959-1.c: Likewise.
        * gcc.target/i386/pr106959-2.c: Likewise.
        * gcc.target/i386/pr106959-3.c: Likewise.

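The check described in the ChangeLog can be modelled in isolation. The sketch below is a toy model under stated assumptions, not the code in i386-features.cc: `DefRef` and `single_def_chain_model` are hypothetical names, and a singly linked list stands in for the per-register list of definitions that the patch walks with DF_REG_DEF_CHAIN and DF_REF_NEXT_REG.

```cpp
#include <cassert>

// Hypothetical model of one definition of a register.  `next_reg` links
// to the next definition of the same register, or is null at the end.
struct DefRef {
  int insn_uid;            // illustrative identifier of the defining insn
  const DefRef* next_reg;  // next definition of the same register
};

// Return true only when the register has exactly one definition.
// A register with no definition at all (for example, one that is used
// uninitialized) is rejected, as is one with two or more definitions.
inline bool single_def_chain_model(const DefRef* first_def) {
  if (!first_def)
    return false;
  return first_def->next_reg == nullptr;
}
```

Under this model, a pseudo that is assigned in two places, such as a variable initialized before a loop and reassigned inside it, has a second entry in its definition list and is therefore not a candidate for the all-or-nothing conversion.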
1 parent db3c583 commit 24a7980

6 files changed: +122 -0 lines changed

gcc/config/i386/i386-features.cc

Lines changed: 18 additions & 0 deletions
@@ -1756,6 +1756,19 @@ pseudo_reg_set (rtx_insn *insn)
   return set;
 }
 
+/* Return true if the register REG is defined in a single DEF chain.
+   If it is defined in more than one DEF chains, we may not be able
+   to convert it in all chains.  */
+
+static bool
+single_def_chain_p (rtx reg)
+{
+  df_ref ref = DF_REG_DEF_CHAIN (REGNO (reg));
+  if (!ref)
+    return false;
+  return DF_REF_NEXT_REG (ref) == nullptr;
+}
+
 /* Check if comparison INSN may be transformed into vector comparison.
    Currently we transform equality/inequality checks which look like:
    (set (reg:CCZ 17 flags) (compare:CCZ (reg:TI x) (reg:TI y)))  */
@@ -1972,9 +1985,14 @@ timode_scalar_to_vector_candidate_p (rtx_insn *insn)
       && !TARGET_SSE_UNALIGNED_STORE_OPTIMAL)
     return false;
 
+  if (REG_P (dst) && !single_def_chain_p (dst))
+    return false;
+
   switch (GET_CODE (src))
     {
     case REG:
+      return single_def_chain_p (src);
+
     case CONST_WIDE_INT:
       return true;
 

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2" } */
+
+short int
+bar (void);
+
+__int128
+empty (void)
+{
+}
+
+__attribute__ ((simd)) int
+foo (__int128 *p)
+{
+  int a = 0x80000000;
+
+  *p = empty ();
+
+  return *p == (a < bar ());
+}
+

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-msse4 -Os" } */
+
+__int128 n;
+
+__int128
+empty (void)
+{
+}
+
+int
+foo (void)
+{
+  n = empty ();
+
+  return n == 0;
+}

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-msse4 -O2 -fno-tree-loop-im --param max-combine-insns=2 -Wno-shift-count-overflow" } */
+
+unsigned __int128 n;
+
+int
+foo (int x)
+{
+  __int128 a = 0;
+  int b = !!(n * 2);
+
+  while (x < 2)
+    {
+      if (a)
+        {
+          if (n)
+            n ^= 1;
+          else
+            x <<= 32;
+        }
+
+      a = 1;
+    }
+
+  return b;
+}

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-msse4 -O2 -fno-tree-loop-im -Wno-shift-count-overflow" } */
+
+unsigned __int128 n;
+
+int
+foo (int x)
+{
+  __int128 a = 0;
+  int b = !!(n * 2);
+
+  while (x < 2)
+    {
+      if (a)
+        {
+          if (n)
+            n ^= 1;
+          else
+            x <<= 32;
+        }
+
+      a = 1;
+    }
+
+  return b;
+}

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2 -fpeel-loops" } */
+
+unsigned __int128 m;
+int n;
+
+__attribute__ ((simd)) void
+foo (int x)
+{
+  x = n ? n : (short int) x;
+  if (x)
+    m /= 2;
+}
+
