[AutoBump] Merge with bd56950b (Jan 22) (13) #551

jorickert · 2025-05-21T06:47:02Z

No description provided.

Old entry-point metadata being updated. Nothing is required to account for deprecation as nothing uses the old style

Store exports in SymbolTable instead of Configuration.

Extends what we already do for i1 types: don't serialize vXi1 logical expressions, which improves ILP. The llvm-test-suite numbers in llvm#64840 (comment) indicate that both reassociations are a net win. Fixes llvm#64840. Fixes llvm#63946.

Extends NPM pipeline support till PostRegAlloc passes (greedy is in the works)

Add CUDA versions 12.7, 12.8, 12.9 which support PTX8.6+ (enables using Blackwell-specific instructions).

This prevents legacy PM from mistakenly removing these analyses if `SILowerWWMCopies` is the last user of them. (it removes dead analyses after its last use)

This was missed during the introduction of select. This also unifies the various .inc files used for each, as they were essentially identical. The __clc_select function is now also built for SPIR-V targets.

The half variants were missing. The integer bitselect builtins weren't going through __clc_bitselect due to an oversight when the CLC version was introduced.

The FEAT_SPEv1p2 feature (known to LLVM as FeatureSPE_EEF and +spe-eef) was incorrectly marked as a required feature of Armv8.7-A (and later). It is actually optional, and some CPUs do not implement it. This moves it to the default features list, so that it is still enabled by -march=armv8.7-a, but can be configured individually for each processor. For Cortex-A520 and Cortex-A520AE, I've checked that these do not have any of the FEAT_SPE* features, so updated the tests accordingly. All other Arm-designed v8.7A+ and v9.2A+ CPUs should continue to have it enabled. For Ampere1B and Fujitsu Monaka, these CPUs do not have the feature, so I've removed it from their tests. For Apple M4, I haven't found any reference for whether that CPU should have this feature, so I've added it to the CPU definition to avoid this being a functional change.

…123071) PR llvm#118823 added a DAG combine for extracting elements of a vector returned from SETCC, however it doesn't correctly deal with the case where the vector element type is not i1. In this case we have to take account of the boolean contents, which are represented differently between vectors and scalars. The code now explicitly performs an inreg sign extend in order to get the same result. Fixes llvm#121372
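The "inreg sign extend" mentioned above can be illustrated outside of LLVM. The following is only a plain C++ model of the idea, not the DAG combine itself: it assumes a scalar boolean stored as 0/1 in its low bit and a vector-lane boolean stored as 0/all-ones, which is one common (target-dependent) convention. The function name `signExtendInReg` is made up for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low `bits` bits of `value` to the full 32-bit width.
// This models a "sign extend in register" operation in plain C++;
// it is an illustration and does not use any LLVM API.
int32_t signExtendInReg(uint32_t value, unsigned bits) {
  assert(bits >= 1 && bits <= 32 && "field width out of range");
  if (bits == 32)
    return static_cast<int32_t>(value);
  const uint32_t mask = (1u << bits) - 1u;    // keeps the low `bits` bits
  const uint32_t signBit = 1u << (bits - 1);  // top bit of the field
  const uint32_t field = value & mask;
  // Flip the field's sign bit, then subtract it again: this propagates
  // the sign bit into all higher bits.
  return static_cast<int32_t>((field ^ signBit) - signBit);
}
```

With a 1-bit field, a "true" of 1 becomes all-ones (-1) and 0 stays 0, which is the kind of conversion needed when one side of a comparison result uses 0/1 and the other uses 0/-1.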

Hopefully fixes MSVC build after 547bfda.

…est case and fix a memory leak (llvm#123725) Adding SPIRV to LLVM_ALL_TARGETS (llvm#119653) revealed a series of minor compilation problems and sanitizer complaints. This PR moves unit test resources (a Module ptr) from class scope to the local scope of the class member function, to be sure that the ptr is released before the test env is torn down.

This commit promotes the SPIR-V backend from experimental to official status. As a result, SPIR-V will be built by default, simplifying integration and increasing accessibility for downstream projects. Discussion and RFC on Discourse: https://discourse.llvm.org/t/rfc-promoting-spir-v-to-an-official-target/83614 The PR reapplies the original patch llvm#119653 and consecutive llvm#123654, reverted due to buildbot failures.

llvm#116831) SVE2.2 introduces instructions with predicated forms with zeroing of the inactive lanes. This allows in some cases to save a `movprfx` or a `mov` instruction when emitting code for `_x` or `_z` variants of intrinsics. This patch adds support for emitting the zeroing forms of certain `SCVTF`, and `UCVTF` instructions.

llvm#123272) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965

llvm#123384) This commit is a follow-up to 99a562b, which migrated some of the mlir-vulkan-runner tests to mlir-cpu-runner using a new pipeline and set of wrappers. That commit could not migrate all the tests, because the existing calling conventions/ABIs for kernel arguments generated by GPUToLLVMConversionPass were not a good fit for the Vulkan runtime. This commit fixes this and migrates the remaining tests. With this commit, mlir-vulkan-runner and many related components are now unused, and they will be removed in a later commit (see llvm#73457). The old calling conventions require both the caller (host LLVM code) and callee (device code) to have compile-time knowledge of the precise argument types. This works for CUDA, ROCm and SYCL, where there is a C-like calling convention agreed between the host and device code, and the runtime passes through arguments as raw data without comprehension. For Vulkan, however, the interface declared by the shader/kernel is in a more abstract form, so the device code has indirect access to the argument data, and the runtime must process the arguments to set up and bind appropriately-sized buffer descriptors. This commit introduces a new calling convention option to meet the Vulkan runtime's needs. It lowers memref arguments to {void*, size_t} pairs, which can be trivially interpreted by the runtime without it needing to know the original argument types. Unlike the stopgap measure in the previous commit, this system can support memrefs of various ranks and element types, which unblocked migrating the remaining tests.
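As a rough host-side picture of the `{void*, size_t}` pairs described above, consider the sketch below. The struct and function names (`BufferArg`, `makeBufferArg`, `totalBytesToBind`) are hypothetical, and treating the `size_t` as a size in bytes is an assumption of this example, not something the commit message states.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view of one lowered memref argument: a raw pointer plus a
// size. The size is assumed to be in bytes for this sketch.
struct BufferArg {
  void *data;
  std::size_t sizeInBytes;
};

// Build a BufferArg from typed host storage. The element type is erased,
// which is the point: the consumer no longer needs to know it.
template <typename T>
BufferArg makeBufferArg(std::vector<T> &storage) {
  return BufferArg{storage.data(), storage.size() * sizeof(T)};
}

// A runtime-like consumer that only sees pointer/size pairs, for example
// to decide how much buffer space to set up.
std::size_t totalBytesToBind(const std::vector<BufferArg> &args) {
  std::size_t total = 0;
  for (const BufferArg &arg : args) {
    assert((arg.data != nullptr || arg.sizeInBytes == 0) &&
           "non-empty buffer must have a valid pointer");
    total += arg.sizeInBytes;
  }
  return total;
}
```

The consumer never inspects element types or ranks, which mirrors why such pairs can be interpreted without knowledge of the original argument types.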

Change scope of resource usage info MC symbols to align with the function linkage type

This resolves llvm#123212

…vm#123493) This renames the `ExportBlockIndentation` option and adds a config parse test, as requested in llvm#110381.

…3585) Update the following intrinsics to have FP8 variants:

``` c
svuint8_t svdup_laneq[_u8](svuint8_t zn, uint64_t imm_idx);
svuint8_t svextq[_u8](svuint8_t zdn, svuint8_t zm, uint64_t imm);
svint8_t svtblq[_s8](svint8_t zn, svuint8_t zm);
svint8_t svtbxq[_s8](svint8_t fallback, svint8_t zn, svuint8_t zm);
svuint8_t svuzpq1[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svuzpq2[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svzipq1[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svzipq2[_u8](svuint8_t zn, svuint8_t zm);
```

llvm#116832) SVE2.2 introduces instructions with predicated forms with zeroing of the inactive lanes. This allows in some cases to save a `movprfx` or a `mov` instruction when emitting code for `_x` or `_z` variants of intrinsics. This patch adds support for emitting the zeroing forms of certain `CLS`, `CLZ`, `CNT`, `CNOT`, and `NOT` instructions.

llvm#123738) …s isa<> calls in isa<> calls To ease integration with downstream projects. Follow-up to PR llvm#123326.

…23634) Fixes llvm#113926. Fixes llvm#63976.

Summary: This is buggy and is currently being tracked in llvm#123241. For now, replace it with a macro so that we can use address spaces directly.

…nt use vector registers when computing VALU hazard (llvm#123627)

… to `simd` (llvm#122632) Extends conversion support for `loop` directives. This PR handles standalone `loop` constructs that do not have a `bind` clause attached by rewriting them to equivalent `simd` constructs. The reasoning behind that decision is documented in the rewrite function itself.

…#104661) Clang has long special-cased 3-element vector loads and stores: they are performed as 4-element operations, and a shufflevector is then used to extract the used elements. Odd-sized vector codegen should now work reasonably well. This patch removes the compiler argument `-fpreserve-vec3-type` and adds a target hook to determine if the special handling of vector type is needed. --------- Co-authored-by: Matt Arsenault <[email protected]>

…premature optimization Later SimplifyDemandedVectorElts calls will simplify any remaining shuffles through the X86ISD::PMULUDQ node. Avoids regression in llvm#123596

@shafik

As stated on the commit by @shafik, the previous patch left in some code from development. This removes it, as it is unreachable.

…llvm#123844) This patch adds several instructions seen when trying to run an executable built with ASan with llvm-mingw. (x86 and x86_64, using the git tip in llvm-project). Also includes instructions collected by Roman Pišl and Eric Pouech in the Wine bug reports below.

```
Related: llvm#96270
Co-authored-by: Roman Pišl <[email protected]>
https://bugs.winehq.org/show_bug.cgi?id=50993
https://bugs.winehq.org/attachment.cgi?id=70233
Co-authored-by: Eric Pouech <[email protected]>
https://bugs.winehq.org/show_bug.cgi?id=52386
https://bugs.winehq.org/attachment.cgi?id=71626
```

…module of other units in the same TU See the test for the case. It is similar to llvm@baa5b76

…3772)

…llvm#123709) This patch adds several instructions seen when trying to run an executable built with ASan with llvm-mingw. (x86 and x86_64, using the git tip in llvm-project). Also includes instructions collected by Roman Pišl and Eric Pouech in the Wine bug reports below.

```
Related: llvm#96270
Co-authored-by: Roman Pišl <[email protected]>
https://bugs.winehq.org/show_bug.cgi?id=50993
https://bugs.winehq.org/attachment.cgi?id=70233
Co-authored-by: Eric Pouech <[email protected]>
https://bugs.winehq.org/show_bug.cgi?id=52386
https://bugs.winehq.org/attachment.cgi?id=71626
```

@ChuanqiXu9

…lvm#122606) Tunnels `Manger` object into the `ScanningAllProjectModules` so it can be used to perform necessary command-line modifications (which also adds `--resources` path previously added there explicitly). This allows using the experimental C++ modules support with gcc. This was discussed in the issue with @ChuanqiXu9 and @kadircet Closes llvm#112635

```llvm
define <16 x i8> @scalar_to_16xi8(i8 %val) {
  %ret = insertelement <16 x i8> undef, i8 %val, i32 0
  ret <16 x i8> %ret
}
```

before

```asm
addi.d $sp, $sp, -16
st.b $a0, $sp, 0
vld $vr0, $sp, 0
addi.d $sp, $sp, 16
ret
```

after

```asm
vinsgr2vr.b $vr0, $a0, 0
ret
```

--------- Co-authored-by: Lu Weining <[email protected]>

…64 (llvm#123363) This PR adds the release note point for LLDB 20, discussed in llvm#104547 (comment) for the same ticket --------- Co-authored-by: David Spickett <[email protected]>

Having the fp16 pragmas enabled in the header file is risky. The macros defined by that header don't (and can't) include the pragmas that make fp16 types themselves legal, and another header may disable the fp16 pragma before the macro's use. The safest thing to do is the use of pragmas surrounding each use of the macro in the implementation files. This pattern is also far more common across the codebase.

…lvm#123219) I've removed the HasUncountableEarlyExit variable, since we can already determine whether or not a loop has an early exit by seeing if we found an uncountable exit. I have also deleted the old UncountableExitingBlocks and UncountableExitBlocks lists and replaced them with a single uncountable edge. This means we don't need to worry about keeping the list entries in sync and makes it clear which exiting block corresponds to which exit block.
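The data-structure change described above can be sketched as follows. This is a simplified stand-in, not the actual LLVM class: `Block`, `EarlyExitInfo`, and the member names are hypothetical, chosen only to show how one optional edge replaces a flag plus two parallel lists.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Stand-in for a basic block; purely illustrative.
struct Block {
  const char *name;
};

// Hypothetical holder for early-exit information. A single optional
// (exiting block, exit block) edge replaces a boolean flag and two lists.
class EarlyExitInfo {
  std::optional<std::pair<Block *, Block *>> uncountableEdge;

public:
  void setUncountableEdge(Block *exiting, Block *exit) {
    assert(exiting && exit && "both ends of the edge must be valid");
    uncountableEdge = std::make_pair(exiting, exit);
  }

  // No separate flag: presence of the edge answers the question.
  bool hasUncountableEarlyExit() const { return uncountableEdge.has_value(); }

  Block *exitingBlock() const {
    return uncountableEdge ? uncountableEdge->first : nullptr;
  }
  Block *exitBlock() const {
    return uncountableEdge ? uncountableEdge->second : nullptr;
  }
};
```

Because the exiting block and its exit block live in one pair, they cannot drift out of sync the way two separately maintained lists can.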

llvm#112041 replaced `llvm-mc` with `clang`. The args are now fed to clang.

…s. (llvm#123914) The pid_t first argument was missing. Fixes llvm#123839

Fixes 0165d04

… when coalescing SUBREG_TO_REG" (llvm#123632)" There's a regression with one of the bootstrap builds for x86. I'll revert this while I investigate. This reverts commit 4df6d3d.

Intercepts madvise/mprotect/msync/mincore calls, taking care of the signature difference for the latter.

…1454) 0-d vectors are now supported, so these patterns are no longer required. This covers part of issue llvm#112913. Additionally, this removes %arg2 in mlir/test/Conversion/GPUCommon/transfer_write.mlir and renames %arg3 to %arg2, as %arg2 was originally not required.

…3350) On PS5, if a custom --sysroot is supplied, `<sysroot>/target/lib` should be added to the library search paths (this already occurs if the default `--sysroot` is not overridden). Until now, this has been hardcoded as a downstream patch in lld. Add it to the driver so that the private patch can be removed. On PS4 the library search paths remain unchanged. The proprietary linker will continue to handle this aspect. On either platform, warn if `<sysroot>/target/lib` is absent. Previously, such warnings were emitted only when the default --sysroot was not overridden. SIE tracker: TOOLCHAIN-16704
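The path logic described above amounts to appending `target/lib` to the sysroot and warning when that directory is absent. A minimal sketch using only the C++ standard library is shown below. The helper names are hypothetical and this is not the clang driver code, which uses LLVM's own path and diagnostic utilities.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: the directory that should be added to the
// library search paths for a given sysroot.
fs::path targetLibDir(const fs::path &sysroot) {
  assert(!sysroot.empty() && "expected a non-empty sysroot");
  return sysroot / "target" / "lib";
}

// Hypothetical helper: true when a "missing directory" warning is due.
// The non-throwing overload is used, so filesystem errors count as absent.
bool shouldWarnMissingTargetLib(const fs::path &sysroot) {
  std::error_code ec;
  return !fs::is_directory(targetLibDir(sysroot), ec);
}
```

In this sketch the warning decision depends only on whether the directory exists, not on whether the sysroot was the default or user-supplied, matching the "on either platform, warn if absent" behaviour described in the message.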

…lvm#123787

…vm#121943) Re-write the sema and codegen for the atomic_test_and_set and atomic_clear builtin functions to go via AtomicExpr, like the other atomic builtins do. This simplifies the code, because AtomicExpr already handles things like generating code to dynamically select the memory ordering, which was duplicated for these builtins. This also fixes a few crash bugs, one when passing an integer to the pointer argument, and one when using an array. This also adds diagnostics for the memory orderings which are not valid for atomic_clear according to https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html, which were missing before. Fixes llvm#111293. This is a re-land of llvm#120449, modified to allow any non-const pointer type for the first argument.
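For readers unfamiliar with these builtins, here is a small usage sketch of `__atomic_test_and_set` and `__atomic_clear` as documented in the GCC manual linked above. The `SpinFlag` type and `exampleUse` function are made up for illustration. The release ordering passed to `__atomic_clear` is one of the orderings the GCC documentation lists as valid for it.

```cpp
#include <cassert>

// Hypothetical try-lock flag built on the GCC/Clang __atomic builtins.
struct SpinFlag {
  bool state = false;

  // Atomically sets the flag and reports whether we were the one to set
  // it (the builtin returns the previous "set" state).
  bool tryAcquire() {
    return !__atomic_test_and_set(&state, __ATOMIC_ACQUIRE);
  }

  // Atomically clears the flag so that another caller can acquire it.
  void release() { __atomic_clear(&state, __ATOMIC_RELEASE); }
};

// Single-threaded walk through the acquire/release sequence.
void exampleUse() {
  SpinFlag flag;
  assert(flag.tryAcquire());   // first caller gets the flag
  assert(!flag.tryAcquire());  // second attempt sees it already set
  flag.release();
  assert(flag.tryAcquire());   // available again after clearing
}
```

The memory orderings here are constants, but the builtins also accept a runtime value, which is the "dynamically select the memory ordering" case the message refers to.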

…3204) This patch makes Arm builtins aware of endianness in VMOVs. Before this patch, the functions' definitions assumed little endian, which made any program compiled for big endian incorrect.

…m#123919) These intrinsics are overloaded. The documentation should not single out the i32 overload.

Add MCA test for jump instructions.

…23882) The flag set in `MCInstrDesc` is not accurate and we should use the result of `MCInstrAnalysis`.

Remove CUDA_127 and CUDA_129 defines incorrectly added in llvm#123398

This patch updates the motivating test for the above PR so that it does not conflict with urem PR llvm#122236

Similarly to llvm#122918, leading comments are currently not being moved.

```
struct Foo {
  // This one is the cool field.
  int a;
  int b;
};
```

becomes:

```
struct Foo {
  // This one is the cool field.
  int b;
  int a;
};
```

but should be:

```
struct Foo {
  int b;
  // This one is the cool field.
  int a;
};
```

Fixes llvm#123669
- Update Dockerfiles to work with the LLVM trunk
- Adapt Documentation accordingly
- Fix duplicate `-c` flag

…inter (llvm#122088) We currently have ad-hoc filtering logic for temporary object member access in `VisitGSLPointerArg`. This logic filters out more cases than it should, leading to false negatives. Furthermore, this location lacks sufficient context to implement a more accurate solution. This patch refines the filtering logic by moving it to the central filtering location, `analyzePathForGSLPointer`, consolidating the logic and avoiding scattered filtering across multiple places. As a result, the special handling for conditional operators (llvm#120233) is no longer necessary. This change also resolves llvm#120543.

dstutt and others added 30 commits January 21, 2025 09:37

[AMDGPU] Update entry point name for PAL metadata (llvm#123581)

ebc5020

Old entry-point metadata being updated. Nothing is required to account for deprecation as nothing uses the old style

[LLD][COFF] Separate EC and native exports for ARM64X (llvm#123652)

455b3d6

Store exports in SymbolTable instead of Configuration.

[AMDGPU][NewPM] Port SIFixVGPRCopies to NPM (llvm#123592)

9b6e8df

Extends NPM pipeline support till PostRegAlloc passes (greedy is in the works)

[NVPTX] Add support for PTX 8.6 and CUDA 12.6 (12.8) (llvm#123398)

616979e

Add CUDA versions 12.7, 12.8, 12.9 which support PTX8.6+ (enables using Blackwell-specific instructions).

[AMDGPU][CodeGen] SILowerWWMCopies: Declare used analyses (llvm#123710)

7acad68

This prevents legacy PM from mistakenly removing these analyses if `SILowerWWMCopies` is the last user of them. (it removes dead analyses after its last use)

[libclc] Route select through __clc_select (llvm#123647)

d96ec48

This was missed during the introduction of select. This also unifies the various .inc files used for each, as they were essentially identical. The __clc_select function is now also built for SPIR-V targets.

[libclc] Route int bitselect through CLC; add half (llvm#123653)

eaf3e1b

The half variants were missing. The integer bitselect builtins weren't going through __clc_bitselect due to an oversight when the CLC version was introduced.

[Clang] Add numeric for iota.

6dc356d

Hopefully fixes MSVC build after 547bfda.

[X86][AVX10.2-MINMAX][NFC] Remove NE[P] from intrinsic and instruction (

13c6abf

llvm#123272) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965

[AMDGPU] Change scope of resource usage info symbols (llvm#114810)

8294459

Change scope of resource usage info MC symbols to align with the function linkage type

[GISel] Fold shifts to constant result. (llvm#123510)

5d9c717

This resolves llvm#123212

[clang-format] Rename ExportBlockIndentation -> IndentExportBlock (ll…

3365693

…vm#123493) This renames the `ExportBlockIndentation` option and adds a config parse test, as requested in llvm#110381.

[mlir][IR] CommonTypeConstraints: fully qualify low-precision FP type… (

67a412f

llvm#123738) …s isa<> calls in isa<> calls To ease integration with downstream projects. Follow-up to PR llvm#123326.

[include-cleaner] Respect langopts when analyzing macro names (llvm#1…

ec6c344

…23634) Fixes llvm#113926. Fixes llvm#63976.

[OpenMP] Remove usage of pointer-to-member in lookup (llvm#123671)

f233a54

Summary: This is buggy and is currently being tracked in llvm#123241. For now, replace it with a macro so that we can use address spaces directly.

[AMDGPU] Fix crash due to missing check for FLAT instructions that do…

9ca1323

…nt use vector registers when computing VALU hazard (llvm#123627)

[X86] urem-seteq-vec-tautological.ll - regenerate VPTERNLOG comment

5183ec4

[X86] LowerMUL/LowerRotate - avoid undefs in shuffle mask to prevent …

0eb7195

…premature optimization Later SimplifyDemandedVectorElts calls will simplify any remaining shuffles through the X86ISD::PMULUDQ node. Avoids regression in llvm#123596

[OpenACC] Remove unreachable code

13918f5

As stated on the commit by @shafik, the previous patch left in some code from development. This removes it, as it is unreachable.

bernhardu and others added 30 commits January 22, 2025 10:22

[AMDGPU][NewPM] Port SILowerWWMCopies to NPM (llvm#123695)

a343b8e

[C++20] [Modules] Correct the visibility of decls in implicit global …

d2e5103

…module of other units in the same TU See the test for the case. It is similar to llvm@baa5b76

[llvm][Docs] Add lldb user expressions related release notes (llvm#12…

0d24130

…3772)

[llvm][Docs] Release note for LLDB optionally disabled regsets for RV…

ef37c3d

…64 (llvm#123363) This PR adds the release note point for LLDB 20, discussed in llvm#104547 (comment) for the same ticket --------- Co-authored-by: David Spickett <[email protected]>

[Clang] Fix tests broken by 0a9c08c

28c819c

[HIP] [NFC] Rename to ClangArgs

0165d04

llvm#112041 replaced `llvm-mc` with `clang`. The args are now fed to clang.

[compiler-rt][rtsan] Fix process_vm_readv/process_vm_writev signature…

6123a81

…s. (llvm#123914) The pid_t first argument was missing. Fixes llvm#123839

[HIP] [NFC] Rename to ClangArgs (really)

974f678

Fixes 0165d04

Revert "Reland "RegisterCoalescer: Add implicit-def of super register…

6b1db79

… when coalescing SUBREG_TO_REG" (llvm#123632)" There's a regression with one of the bootstrap builds for x86. I'll revert this while I investigate. This reverts commit 4df6d3d.

[compiler-rt][rtsan] page regions api interception update. (llvm#123601)

c745ece

Intercepts madvise/mprotect/msync/mincore calls, taking care of the signature difference for the latter.

[X86] fixup-bw-inst.ll - regenerate test checks to simplify diff for l…

58be6fd

…lvm#123787

[compiler-rt] Make Arm builtins aware of endianness in VMOVs (llvm#12…

ffde268

…3204) This patch makes Arm builtins aware of endianness in VMOVs. Before this patch, the functions' definitions assumed little endian, which made any program compiled for big endian incorrect.

[AMDGPU] Remove .i32 suffix from comments documenting intrinsics (llv…

b7423e9

…m#123919) These intrinsics are overloaded. The documentation should not single out the i32 overload.

[RISCV] Add precommit test for llvm#123882

d03fab1

Add MCA test for jump instructions.

[MCA] Use MCInstrAnalysis to analyse call/return instructions (llvm#1…

9d676e2

…23882) The flag set in `MCInstrDesc` is not accurate and we should use the result of `MCInstrAnalysis`.

Remove incorrect CUDA defines (llvm#123898)

97c3a99

Remove CUDA_127 and CUDA_129 defines incorrectly added in llvm#123398

[SLP][NFC] Update test for PR llvm#118055 (llvm#122696)

c6c6475

This patch updates the motivating test for the above PR so that it does not conflict with urem PR llvm#122236

[Tools][Docker] Update Dockerfiles and Docker guide (llvm#123841)

5136c6d

Fixes llvm#123669
- Update Dockerfiles to work with the LLVM trunk
- Adapt Documentation accordingly
- Fix duplicate `-c` flag

[AutoBump] Merge with bd56950 (Jan 22)

11819d6
