Skip to content

[AutoBump] Merge with bd56950b (Jan 22) (13) #551

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 202 commits into
base: bump_to_67b9d3ff
Choose a base branch
from

Conversation

jorickert
Copy link

No description provided.

dstutt and others added 30 commits January 21, 2025 09:37
Old entry-point metadata being updated. Nothing is required
to account for deprecation as nothing uses the old style
Store exports in SymbolTable instead of Configuration.
Extends what we already do for i1 types and don't serialize vXi1 logical expressions to improve ILP.

llvm-test-suite numbers
llvm#64840 (comment)
indicate that both reassociations are a net win.

Fixes llvm#64840
Fixes llvm#63946
Extends NPM pipeline support till PostRegAlloc passes (greedy is in the
works)
Add CUDA versions 12.7, 12.8, 12.9 which support PTX8.6+ (enables using Blackwell-specific instructions).
This prevents legacy PM from mistakenly removing these analyses if
`SILowerWWMCopies` is the last user of them. (it removes dead analyses
after its last use)
This was missed during the introduction of select. This also unifies the
various .inc files used for each, as they were essentially identical.

The __clc_select function is now also built for SPIR-V targets.
The half variants were missing. The integer bitselect builtins weren't
going through __clc_bitselect due to an oversight when the CLC version
was introduced.
The FEAT_SPEv1p2 feature (known to LLVM as FeatureSPE_EEF and +spe-eef)
was incorrectly marked as a required feature of Armv8.7-A (and later),
which is incorrect because it is optional, and some CPUs do not
implement it. This moves it to the default features list, so that it is
still enabled by -march=armv8.7-a, but can be configured individually
for each processor.

For Cortex-A520 and Cortex-A520AE, I've checked that these do not have any of
the FEAT_SPE* features, so updated the tests accordingly. All other
Arm-designed v8.7A+ and v9.2A+ CPUs should continue to have it enabled. For
Ampere1B and Fujitsu Monaka, these CPUs do not have the feature, so I've
removed it from their tests. For Apple M4, I haven't found any reference for
whether that CPU should have this feature, so I've added it to the CPU
definition to avoid this being a functional change.
…123071)

PR llvm#118823 added a
DAG combine for extracting elements of a vector returned from
SETCC, however it doesn't correctly deal with the case where
the vector element type is not i1. In this case we have to
take account of the boolean contents, which are represented
differently between vectors and scalars. The code now
explicitly performs an inreg sign extend in order to get the
same result.

Fixes llvm#121372
Hopefuly fixes MSVC build after 547bfda.
…est case and fix a memory leak (llvm#123725)

Adding SPIRV to LLVM_ALL_TARGETS
(llvm#119653) revealed a series of
minor compilation problems and sanitizer complaints. This PR is to move
unit tests resources (a Module ptr) from the class-scope to a local
scope of the class member function to be sure that before the test env
is teared down the ptr is released.
This commit promotes the SPIR-V backend from experimental to official
status. As a result, SPIR-V will be built by default, simplifying
integration and increasing accessibility for downstream projects.

Discussion and RFC on Discourse:
https://discourse.llvm.org/t/rfc-promoting-spir-v-to-an-official-target/83614

The PR reapplies the original patch
llvm#119653 and consecutive
llvm#123654, reverted due to
buildbot failures.
llvm#116831)

SVE2.2 introduces instructions with predicated forms with zeroing of
the inactive lanes. This allows in some cases to save a `movprfx` or
a `mov` instruction when emitting code for `_x` or `_z` variants of
intrinsics.

This patch adds support for emitting the zeroing forms of certain
`SCVTF`, and `UCVTF` instructions.
llvm#123384)

This commit is a follow-up to 99a562b,
which migrated some of the mlir-vulkan-runner tests to mlir-cpu-runner
using a new pipeline and set of wrappers. That commit could not migrate
all the tests, because the existing calling conventions/ABIs for kernel
arguments generated by GPUToLLVMConversionPass were not a good fit for
the Vulkan runtime. This commit fixes this and migrates the remaining
tests. With this commit, mlir-vulkan-runner and many related components
are now unused, and they will be removed in a later commit (see llvm#73457).

The old calling conventions require both the caller (host LLVM code) and
callee (device code) to have compile-time knowledge of the precise
argument types. This works for CUDA, ROCm and SYCL, where there is a
C-like calling convention agreed between the host and device code, and
the runtime passes through arguments as raw data without comprehension.
For Vulkan, however, the interface declared by the shader/kernel is in a
more abstract form, so the device code has indirect access to the
argument data, and the runtime must process the arguments to set up and
bind appropriately-sized buffer descriptors.

This commit introduces a new calling convention option to meet the
Vulkan runtime's needs. It lowers memref arguments to {void*, size_t}
pairs, which can be trivially interpreted by the runtime without it
needing to know the original argument types. Unlike the stopgap measure
in the previous commit, this system can support memrefs of various ranks
and element types, which unblocked migrating the remaining tests.
Change scope of resource usage info MC symbols to align with the function linkage type
…vm#123493)

This renames the `ExportBlockIndentation` option and adds a config parse
test, as requested in llvm#110381.
…3585)

Update the following intrinsics to have FP8 variants:
``` c
svuint8_t svdup_laneq[_u8](svuint8_t zn, uint64_t imm_idx);
svuint8_t svextq[_u8](svuint8_t zdn, svuint8_t zm, uint64_t imm);
svint8_t svtblq[_s8](svint8_t zn, svuint8_t zm);
svint8_t svtbxq[_s8](svint8_t fallback, svint8_t zn, svuint8_t zm);
svuint8_t svuzpq1[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svuzpq2[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svzipq1[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svzipq2[_u8](svuint8_t zn, svuint8_t zm);
```
llvm#116832)

SVE2.2 introduces instructions with predicated forms with zeroing of
the inactive lanes. This allows in some cases to save a `movprfx` or
a `mov` instruction when emitting code for `_x` or `_z` variants of
intrinsics.

This patch adds support for emitting the zeroing forms of certain
`CLS`, `CLZ`, `CNT`, `CNOT`, and `NOT` instructions.
llvm#123738)

…s isa<> calls in isa<> calls

To ease integration with downstream projects.

Follow-up to PR llvm#123326.
Summary:
This is buggy and is currently being tracked in
llvm#123241. For now, replace it
with a macro so that we can use address spaces directly.
…nt use vector registers when computing VALU hazard (llvm#123627)
… to `simd` (llvm#122632)

Extends conversion support for `loop` directives. This PR handles
standalone `loop` constructs that do not have a `bind` clause attached
by rewriting them to equivalent `simd` constructs. The reasoning behind
that decision is documented in the rewrite function itself.
…#104661)

Clang uses a long-time special handling of the case where 3 element
vector loads and stores are performed as 4 element, and then a
shufflevector is used to extract the used elements. Odd sized vector
codegen should now work reasonably well.

This patch removes the compiler argument `-fpreserve-vec3-type` and adds
a target hook to determine if the special handling of vector type is
needed.

---------

Co-authored-by: Matt Arsenault <[email protected]>
…premature optimization

Later SimplifyDemandedVectorElts calls will simplify any remaining shuffles though the X86ISD::PMULUDQ node.

Avoids regression in llvm#123596
As stated on the commit by @shafik, the previous patch left in some code
from development.  This removes it, as it is unreachable.
bernhardu and others added 30 commits January 22, 2025 10:22
…llvm#123844)

This patch adds several instructions seen when trying to run a
executable built with ASan with llvm-mingw.
(x86 and x86_64, using the git tip in llvm-project).

Also includes instructions collected by
Roman Pišl and Eric Pouech in the Wine bug reports below.
```
Related: llvm#96270

Co-authored-by: Roman Pišl <[email protected]>
                https://bugs.winehq.org/show_bug.cgi?id=50993
                https://bugs.winehq.org/attachment.cgi?id=70233
Co-authored-by: Eric Pouech <[email protected]>
                https://bugs.winehq.org/show_bug.cgi?id=52386
                https://bugs.winehq.org/attachment.cgi?id=71626
```
…module of other units in the same TU

See the test for the case. It is similar with
llvm@baa5b76
…llvm#123709)

This patch adds several instructions seen when trying to run a
executable built with ASan with llvm-mingw.
(x86 and x86_64, using the git tip in llvm-project).

Also includes instructions collected by
Roman Pišl and Eric Pouech in the Wine bug reports below.

```
Related: llvm#96270

Co-authored-by: Roman Pišl <[email protected]>
                https://bugs.winehq.org/show_bug.cgi?id=50993
                https://bugs.winehq.org/attachment.cgi?id=70233
Co-authored-by: Eric Pouech <[email protected]>
                https://bugs.winehq.org/show_bug.cgi?id=52386
                https://bugs.winehq.org/attachment.cgi?id=71626
```
…lvm#122606)

Tunnels `Manger` object into the `ScanningAllProjectModules` so it can
be used to perform necessary command-line modifications (which also adds
`--resources` path previously added there explicitly). This allows using
the experimental C++ modules support with gcc.

This was discussed in the issue with @ChuanqiXu9 and @kadircet

Closes llvm#112635
```llvm
define <16 x i8> @scalar_to_16xi8(i8 %val) {
  %ret = insertelement <16 x i8> undef, i8 %val, i32 0
  ret <16 x i8> %ret
}
```

before
```asm
addi.d	$sp, $sp, -16
st.b	$a0, $sp, 0
vld	$vr0, $sp, 0
addi.d	$sp, $sp, 16
ret
```

after
```asm
vinsgr2vr.b $vr0, $a0, 0
ret
```

---------

Co-authored-by: Lu Weining <[email protected]>
…64 (llvm#123363)

This PR adds the release note point for LLDB 20, discussed in
llvm#104547 (comment)
for the same ticket

---------

Co-authored-by: David Spickett <[email protected]>
Having the fp16 pragmas enabled in the header file is risky. The macros
defined by that header don't (and can't) include the pragmas that make
fp16 types themselves legal, and another header may disable the fp16
pragma before the macro's use.

The safest thing to do is the use of pragmas surrounding each use of the
macro in the implementation files. This pattern is also far more common
across the codebase.
…lvm#123219)

I've removed the HasUncountableEarlyExit variable, since we can
already determine whether or not a loop has an early exit by seeing
if we found an uncountable exit.

I have also deleted the old UncountableExitingBlocks and
UncountableExitBlocks lists and replaced them with a single
uncountable edge. This means we don't need to worry about keeping the
list entries in sync and makes it clear which exiting block
corresponds to which exit block.
llvm#112041 replaced `llvm-mc` with `clang`.
The args are now feeding to clang.
… when coalescing SUBREG_TO_REG" (llvm#123632)"

There's a regression with one of the bootstrap builds for x86.
I'll revert this while I investigate.

This reverts commit 4df6d3d.
madvise/mprotect/msync/mincore calls with care for signature difference
for the latter.
…1454)

0-d vectors are supported now and so these patterns are no longer
required. This covers a part of this issue
llvm#112913 . Additionally this
removes %arg2 in mlir/test/Conversion/GPUCommon/transfer_write.mlir and
renames %arg3 to %arg2 as %arg2 was originally not required.
…3350)

On PS5, if a custom --sysroot is supplied, `<sysroot>/target/lib` should
be added to the library search paths (this already occurs if the default
`--sysroot` is not overridden). Until now, this has been hardcoded as a
downstream patch in lld. Add it to the driver so that the private patch
can be removed.

On PS4 the library search paths remain unchanged. The proprietary linker
will continue to handle this aspect.

On either platform, warn if `<sysroot>/target/lib` is absent.
Previously, such warnings were emitted only when the default --sysroot
was not overridden.

SIE tracker: TOOLCHAIN-16704
…vm#121943)

Re-write the sema and codegen for the atomic_test_and_set and
atomic_clear builtin functions to go via AtomicExpr, like the other
atomic builtins do. This simplifies the code, because AtomicExpr already
handles things like generating code for to dynamically select the memory
ordering, which was duplicated for these builtins. This also fixes a few
crash bugs, one when passing an integer to the pointer argument, and one
when using an array.

This also adds diagnostics for the memory orderings which are not valid
for atomic_clear according to
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html, which
were missing before.

Fixes llvm#111293.

This is a re-land of llvm#120449, modified to allow any non-const pointer
type for the first argument.
…3204)

This patch makes Arm builtins aware of endianness in VMOVs.

Before this patch, the functions' definitions assumed little endian,
which made any program compiled for big endian incorrect.
…m#123919)

These intrinsics are overloaded. The documentation should not single out
the i32 overload.
Add MCA test for jump instructions.
…23882)

The flag set in `MCInstrDesc` is not accurate and we should use
the result of `MCInstrAnalysis`.
Remove CUDA_127 and CUDA_129 defines incorrectly added in
llvm#123398
This patch updates the motivating test for the above PR so that it does
not conflict with urem PR llvm#122236
Similarly to llvm#122918, leading
comments are currently not being moved.

```
struct Foo {
  // This one is the cool field.
  int a;
  int b;
};
```

becomes:

```
struct Foo {
  // This one is the cool field.
  int b;
  int a;
};
```

but should be:

```
struct Foo {
  int b;
  // This one is the cool field.
  int a;
};
```
Fixes llvm#123669

- Update Dockerfiles to work with the LLVM trunk
- Adapt Documentation accordingly
- Fix duplicate `-c` flag
…inter (llvm#122088)

We currently have ad-hoc filtering logic for temporary object member
access in `VisitGSLPointerArg`. This logic filters out more cases than
it should, leading to false negatives. Furthermore, this location lacks
sufficient context to implement a more accurate solution.

This patch refines the filtering logic by moving it to the central
filtering location, `analyzePathForGSLPointer`, consolidating the logic
and avoiding scattered filtering across multiple places. As a result,
the special handling for conditional operators (llvm#120233) is no longer
necessary.

This change also resolves llvm#120543.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.