forked from llvm/llvm-project
[AutoBump] Merge with bd56950b (Jan 22) (13) #551
Open: jorickert wants to merge 202 commits into bump_to_67b9d3ff from bump_to_bd56950b
Old entry-point metadata being updated. Nothing is required to account for deprecation as nothing uses the old style
Store exports in SymbolTable instead of Configuration.
Extends what we already do for i1 types: don't serialize vXi1 logical expressions, which improves ILP. llvm-test-suite numbers (llvm#64840 (comment)) indicate that both reassociations are a net win. Fixes llvm#64840 Fixes llvm#63946
Extends NPM pipeline support till PostRegAlloc passes (greedy is in the works)
Add CUDA versions 12.7, 12.8, 12.9 which support PTX8.6+ (enables using Blackwell-specific instructions).
This prevents the legacy PM from mistakenly removing these analyses if `SILowerWWMCopies` is the last user of them. (The legacy PM removes dead analyses after their last use.)
This was missed during the introduction of select. This also unifies the various .inc files used for each, as they were essentially identical. The __clc_select function is now also built for SPIR-V targets.
The half variants were missing. The integer bitselect builtins weren't going through __clc_bitselect due to an oversight when the CLC version was introduced.
The FEAT_SPEv1p2 feature (known to LLVM as FeatureSPE_EEF and +spe-eef) was marked as a required feature of Armv8.7-A (and later), which is incorrect because it is optional and some CPUs do not implement it. This moves it to the default features list, so that it is still enabled by -march=armv8.7-a, but can be configured individually for each processor. For Cortex-A520 and Cortex-A520AE, I've checked that these do not have any of the FEAT_SPE* features, so updated the tests accordingly. All other Arm-designed v8.7A+ and v9.2A+ CPUs should continue to have it enabled. For Ampere1B and Fujitsu Monaka, these CPUs do not have the feature, so I've removed it from their tests. For Apple M4, I haven't found any reference for whether that CPU should have this feature, so I've added it to the CPU definition to avoid this being a functional change.
…123071) PR llvm#118823 added a DAG combine for extracting elements of a vector returned from SETCC, however it doesn't correctly deal with the case where the vector element type is not i1. In this case we have to take account of the boolean contents, which are represented differently between vectors and scalars. The code now explicitly performs an inreg sign extend in order to get the same result. Fixes llvm#121372
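The boolean-contents mismatch can be illustrated with a minimal C++ sketch. This is not the DAG combine itself. It assumes a target where a scalar compare yields 0 or 1 while each lane of a vector compare is all-zeros or all-ones; the helper names `signExtendInReg` and `scalarBoolToLane` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: treat the low `bits` bits of `value` as a signed
// integer and sign-extend it to 32 bits (the effect of an "inreg" sign extend).
int32_t signExtendInReg(uint32_t value, unsigned bits) {
  assert(bits >= 1 && bits <= 32 && "bit width out of range");
  if (bits == 32)
    return static_cast<int32_t>(value);
  const uint32_t signBit = 1u << (bits - 1);
  const uint32_t lowMask = (1u << bits) - 1u;
  const uint32_t low = value & lowMask;
  // XOR-then-subtract maps values with the sign bit set to negative numbers.
  return static_cast<int32_t>(low ^ signBit) - static_cast<int32_t>(signBit);
}

// Convert a scalar compare result (0 or 1) into the assumed vector-lane
// convention (0 or -1, i.e. all bits set).
int32_t scalarBoolToLane(bool cmp) {
  return signExtendInReg(cmp ? 1u : 0u, 1);
}
```

Under the stated assumption, sign-extending from one bit is what makes the extracted scalar agree with the lane value the vector compare would have produced.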
Hopefully fixes MSVC build after 547bfda.
…est case and fix a memory leak (llvm#123725) Adding SPIRV to LLVM_ALL_TARGETS (llvm#119653) revealed a series of minor compilation problems and sanitizer complaints. This PR moves the unit test resources (a Module ptr) from class scope to the local scope of the class member function, to be sure that the ptr is released before the test env is torn down.
This commit promotes the SPIR-V backend from experimental to official status. As a result, SPIR-V will be built by default, simplifying integration and increasing accessibility for downstream projects. Discussion and RFC on Discourse: https://discourse.llvm.org/t/rfc-promoting-spir-v-to-an-official-target/83614 The PR reapplies the original patch llvm#119653 and consecutive llvm#123654, reverted due to buildbot failures.
llvm#116831) SVE2.2 introduces instructions with predicated forms with zeroing of the inactive lanes. In some cases this saves a `movprfx` or a `mov` instruction when emitting code for `_x` or `_z` variants of intrinsics. This patch adds support for emitting the zeroing forms of certain `SCVTF` and `UCVTF` instructions.
llvm#123384) This commit is a follow-up to 99a562b, which migrated some of the mlir-vulkan-runner tests to mlir-cpu-runner using a new pipeline and set of wrappers. That commit could not migrate all the tests, because the existing calling conventions/ABIs for kernel arguments generated by GPUToLLVMConversionPass were not a good fit for the Vulkan runtime. This commit fixes this and migrates the remaining tests. With this commit, mlir-vulkan-runner and many related components are now unused, and they will be removed in a later commit (see llvm#73457). The old calling conventions require both the caller (host LLVM code) and callee (device code) to have compile-time knowledge of the precise argument types. This works for CUDA, ROCm and SYCL, where there is a C-like calling convention agreed between the host and device code, and the runtime passes through arguments as raw data without comprehension. For Vulkan, however, the interface declared by the shader/kernel is in a more abstract form, so the device code has indirect access to the argument data, and the runtime must process the arguments to set up and bind appropriately-sized buffer descriptors. This commit introduces a new calling convention option to meet the Vulkan runtime's needs. It lowers memref arguments to {void*, size_t} pairs, which can be trivially interpreted by the runtime without it needing to know the original argument types. Unlike the stopgap measure in the previous commit, this system can support memrefs of various ranks and element types, which unblocked migrating the remaining tests.
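The `{void*, size_t}` lowering described above can be pictured with a small C++ sketch. The struct and function names are hypothetical and are not taken from the MLIR sources. The sketch also assumes the second member is the buffer size in bytes, which the commit message does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mirror of one lowered memref argument: an opaque pointer plus
// a size. Treating the size as a byte count is an assumption of this sketch.
struct BufferArg {
  void *data;
  std::size_t sizeInBytes;
};

// Build a BufferArg from typed host storage. The element type is erased here,
// so the consumer never needs to know it.
template <typename T>
BufferArg makeBufferArg(std::vector<T> &storage) {
  return BufferArg{storage.data(), storage.size() * sizeof(T)};
}

// A runtime can size its buffer descriptors from the pairs alone, without
// knowing the original ranks or element types.
std::size_t totalBytesToBind(const std::vector<BufferArg> &args) {
  std::size_t total = 0;
  for (const BufferArg &arg : args) {
    assert((arg.data != nullptr || arg.sizeInBytes == 0) &&
           "non-empty buffer must have storage");
    total += arg.sizeInBytes;
  }
  return total;
}
```

Because the pair carries no type information, the same code path handles buffers of any rank or element type, which matches the motivation given in the commit message.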
Change scope of resource usage info MC symbols to align with the function linkage type
…vm#123493) This renames the `ExportBlockIndentation` option and adds a config parse test, as requested in llvm#110381.
…3585) Update the following intrinsics to have FP8 variants:
```c
svuint8_t svdup_laneq[_u8](svuint8_t zn, uint64_t imm_idx);
svuint8_t svextq[_u8](svuint8_t zdn, svuint8_t zm, uint64_t imm);
svint8_t svtblq[_s8](svint8_t zn, svuint8_t zm);
svint8_t svtbxq[_s8](svint8_t fallback, svint8_t zn, svuint8_t zm);
svuint8_t svuzpq1[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svuzpq2[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svzipq1[_u8](svuint8_t zn, svuint8_t zm);
svuint8_t svzipq2[_u8](svuint8_t zn, svuint8_t zm);
```
llvm#116832) SVE2.2 introduces instructions with predicated forms with zeroing of the inactive lanes. In some cases this saves a `movprfx` or a `mov` instruction when emitting code for `_x` or `_z` variants of intrinsics. This patch adds support for emitting the zeroing forms of certain `CLS`, `CLZ`, `CNT`, `CNOT`, and `NOT` instructions.
llvm#123738) …s isa<> calls in isa<> calls To ease integration with downstream projects. Follow-up to PR llvm#123326.
Summary: This is buggy and is currently being tracked in llvm#123241. For now, replace it with a macro so that we can use address spaces directly.
…nt use vector registers when computing VALU hazard (llvm#123627)
… to `simd` (llvm#122632) Extends conversion support for `loop` directives. This PR handles standalone `loop` constructs that do not have a `bind` clause attached by rewriting them to equivalent `simd` constructs. The reasoning behind that decision is documented in the rewrite function itself.
…#104661) Clang uses a long-time special handling of the case where 3 element vector loads and stores are performed as 4 element, and then a shufflevector is used to extract the used elements. Odd sized vector codegen should now work reasonably well. This patch removes the compiler argument `-fpreserve-vec3-type` and adds a target hook to determine if the special handling of vector type is needed. --------- Co-authored-by: Matt Arsenault <[email protected]>
…premature optimization Later SimplifyDemandedVectorElts calls will simplify any remaining shuffles though the X86ISD::PMULUDQ node. Avoids regression in llvm#123596
As stated on the commit by @shafik, the previous patch left in some code from development. This removes it, as it is unreachable.
…llvm#123844) This patch adds several instructions seen when trying to run an executable built with ASan with llvm-mingw. (x86 and x86_64, using the git tip in llvm-project). Also includes instructions collected by Roman Pišl and Eric Pouech in the Wine bug reports below.
```
Related: llvm#96270

Co-authored-by: Roman Pišl <[email protected]>
    https://bugs.winehq.org/show_bug.cgi?id=50993
    https://bugs.winehq.org/attachment.cgi?id=70233

Co-authored-by: Eric Pouech <[email protected]>
    https://bugs.winehq.org/show_bug.cgi?id=52386
    https://bugs.winehq.org/attachment.cgi?id=71626
```
…module of other units in the same TU See the test for the case. It is similar to llvm@baa5b76
…llvm#123709) This patch adds several instructions seen when trying to run an executable built with ASan with llvm-mingw. (x86 and x86_64, using the git tip in llvm-project). Also includes instructions collected by Roman Pišl and Eric Pouech in the Wine bug reports below.
```
Related: llvm#96270

Co-authored-by: Roman Pišl <[email protected]>
    https://bugs.winehq.org/show_bug.cgi?id=50993
    https://bugs.winehq.org/attachment.cgi?id=70233

Co-authored-by: Eric Pouech <[email protected]>
    https://bugs.winehq.org/show_bug.cgi?id=52386
    https://bugs.winehq.org/attachment.cgi?id=71626
```
…lvm#122606) Tunnels `Manger` object into the `ScanningAllProjectModules` so it can be used to perform necessary command-line modifications (which also adds `--resources` path previously added there explicitly). This allows using the experimental C++ modules support with gcc. This was discussed in the issue with @ChuanqiXu9 and @kadircet Closes llvm#112635
```llvm
define <16 x i8> @scalar_to_16xi8(i8 %val) {
  %ret = insertelement <16 x i8> undef, i8 %val, i32 0
  ret <16 x i8> %ret
}
```
before
```asm
addi.d $sp, $sp, -16
st.b $a0, $sp, 0
vld $vr0, $sp, 0
addi.d $sp, $sp, 16
ret
```
after
```asm
vinsgr2vr.b $vr0, $a0, 0
ret
```
---------
Co-authored-by: Lu Weining <[email protected]>
…64 (llvm#123363) This PR adds the release note point for LLDB 20, discussed in llvm#104547 (comment) for the same ticket --------- Co-authored-by: David Spickett <[email protected]>
Having the fp16 pragmas enabled in the header file is risky. The macros defined by that header don't (and can't) include the pragmas that make fp16 types themselves legal, and another header may disable the fp16 pragma before the macro's use. The safest approach is to surround each use of the macro in the implementation files with the pragmas. This pattern is also far more common across the codebase.
…lvm#123219) I've removed the HasUncountableEarlyExit variable, since we can already determine whether or not a loop has an early exit by seeing if we found an uncountable exit. I have also deleted the old UncountableExitingBlocks and UncountableExitBlocks lists and replaced them with a single uncountable edge. This means we don't need to worry about keeping the list entries in sync and makes it clear which exiting block corresponds to which exit block.
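A rough C++ sketch of the data-structure change described above, using stand-in types rather than LLVM's own classes. `Block`, `LoopExitInfo` and the member names are hypothetical. Only the idea of replacing a flag and two parallel lists with a single edge comes from the commit message.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Stand-in for a basic block (hypothetical; not llvm::BasicBlock).
struct Block {
  std::string name;
};

// Hypothetical container showing one optional (exiting, exit) edge in place
// of a separate flag plus two lists that had to be kept in sync.
class LoopExitInfo {
  std::optional<std::pair<const Block *, const Block *>> uncountableEdge;

public:
  void setUncountableEdge(const Block *exiting, const Block *exitBlock) {
    assert(exiting && exitBlock && "edge endpoints must be non-null");
    uncountableEdge = std::make_pair(exiting, exitBlock);
  }

  // The former flag is now derived from whether an edge was recorded.
  bool hasUncountableEarlyExit() const { return uncountableEdge.has_value(); }

  const Block *uncountableExitingBlock() const {
    return uncountableEdge ? uncountableEdge->first : nullptr;
  }

  const Block *uncountableExitBlock() const {
    return uncountableEdge ? uncountableEdge->second : nullptr;
  }
};
```

Storing the pair as one value means the exiting block and its exit block cannot drift apart, which is the benefit the commit message points to.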
llvm#112041 replaced `llvm-mc` with `clang`. The args are now fed to clang.
…s. (llvm#123914) missing pid_t first argument. Fix llvm#123839
… when coalescing SUBREG_TO_REG" (llvm#123632)" There's a regression with one of the bootstrap builds for x86. I'll revert this while I investigate. This reverts commit 4df6d3d.
madvise/mprotect/msync/mincore calls with care for signature difference for the latter.
…1454) 0-d vectors are supported now and so these patterns are no longer required. This covers a part of this issue llvm#112913 . Additionally this removes %arg2 in mlir/test/Conversion/GPUCommon/transfer_write.mlir and renames %arg3 to %arg2 as %arg2 was originally not required.
…3350) On PS5, if a custom --sysroot is supplied, `<sysroot>/target/lib` should be added to the library search paths (this already occurs if the default `--sysroot` is not overridden). Until now, this has been hardcoded as a downstream patch in lld. Add it to the driver so that the private patch can be removed. On PS4 the library search paths remain unchanged. The proprietary linker will continue to handle this aspect. On either platform, warn if `<sysroot>/target/lib` is absent. Previously, such warnings were emitted only when the default --sysroot was not overridden. SIE tracker: TOOLCHAIN-16704
…vm#121943) Re-write the sema and codegen for the atomic_test_and_set and atomic_clear builtin functions to go via AtomicExpr, like the other atomic builtins do. This simplifies the code, because AtomicExpr already handles things like generating code to dynamically select the memory ordering, which was duplicated for these builtins. This also fixes a few crash bugs, one when passing an integer to the pointer argument, and one when using an array. This also adds diagnostics for the memory orderings which are not valid for atomic_clear according to https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html, which were missing before. Fixes llvm#111293. This is a re-land of llvm#120449, modified to allow any non-const pointer type for the first argument.
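For readers unfamiliar with the two builtins, here is a minimal sketch using `__atomic_test_and_set` and `__atomic_clear` as provided by GCC and Clang. The `SpinFlag` wrapper and `exampleUsage` function are hypothetical and exist only for illustration. The release ordering passed to `__atomic_clear` is one of the orderings the linked GCC documentation lists as valid for it.

```cpp
#include <cassert>

// Hypothetical try-lock built on the two builtins discussed above.
struct SpinFlag {
  bool flag = false;

  // Returns true if this call took the flag, i.e. it was previously clear.
  bool tryAcquire() {
    return !__atomic_test_and_set(&flag, __ATOMIC_ACQUIRE);
  }

  // Clears the flag with release ordering.
  void release() { __atomic_clear(&flag, __ATOMIC_RELEASE); }
};

// Single-threaded walk-through of the expected state transitions.
void exampleUsage() {
  SpinFlag f;
  assert(f.tryAcquire());   // flag was clear, so the first attempt succeeds
  assert(!f.tryAcquire());  // flag is now set, so the second attempt fails
  f.release();
  assert(f.tryAcquire());   // cleared again, so acquisition succeeds
}
```

The sketch only exercises the builtins from one thread; it says nothing about how the re-landed patch implements them internally.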
…3204) This patch makes Arm builtins aware of endianness in VMOVs. Before this patch, the functions' definitions assumed little endian, which made any program compiled for big endian incorrect.
…m#123919) These intrinsics are overloaded. The documentation should not single out the i32 overload.
Add MCA test for jump instructions.
…23882) The flag set in `MCInstrDesc` is not accurate and we should use the result of `MCInstrAnalysis`.
Remove CUDA_127 and CUDA_129 defines incorrectly added in llvm#123398
This patch updates the motivating test for the above PR so that it does not conflict with urem PR llvm#122236
Similarly to llvm#122918, leading comments are currently not being moved.
```
struct Foo {
  // This one is the cool field.
  int a;
  int b;
};
```
becomes:
```
struct Foo {
  // This one is the cool field.
  int b;
  int a;
};
```
but should be:
```
struct Foo {
  int b;
  // This one is the cool field.
  int a;
};
```
Fixes llvm#123669
- Update Dockerfiles to work with the LLVM trunk
- Adapt Documentation accordingly
- Fix duplicate `-c` flag
…inter (llvm#122088) We currently have ad-hoc filtering logic for temporary object member access in `VisitGSLPointerArg`. This logic filters out more cases than it should, leading to false negatives. Furthermore, this location lacks sufficient context to implement a more accurate solution. This patch refines the filtering logic by moving it to the central filtering location, `analyzePathForGSLPointer`, consolidating the logic and avoiding scattered filtering across multiple places. As a result, the special handling for conditional operators (llvm#120233) is no longer necessary. This change also resolves llvm#120543.
No description provided.