forked from llvm/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 3
Flaub sync #23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Flaub sync #23
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Optimize some specific immediates selection by materializing them with sub/mvn instructions as opposed to loading them from the constant pool. Patch by Ben Shi, [email protected]. Differential Revision: https://reviews.llvm.org/D83745
I have added tests to: CodeGen/AArch64/sve-intrinsics-int-arith.ll for doing simple integer add operations on tuple types. Since these tests introduced new warnings due to incorrect use of getVectorNumElements() I have also fixed up these warnings in the same patch. These fixes are: 1. In narrowExtractedVectorBinOp I have changed the code to bail out early for scalable vector types, since we've not yet hit a case that proves the optimisations are profitable for scalable vectors. 2. In DAGTypeLegalizer::WidenVecRes_CONCAT_VECTORS I have replaced calls to getVectorNumElements with getVectorMinNumElements in cases that work with scalable vectors. For the other cases I have added asserts that the vector is not scalable because we should not be using shuffle vectors and build vectors in such cases. Differential revision: https://reviews.llvm.org/D84016
Currently, getCastInstrCost has limited information about the cast it's rating, often just the opcode and types. Sometimes there is a context instruction as well, but it isn't trustworthy: for instance, when the vectorizer is rating a plan, it calls getCastInstrCost with the old instructions when, in fact, it's trying to evaluate the cost of the instruction post-vectorization. Thus, the current system can get the cost of certain casts incorrect as the correct cost can vary greatly based on the context in which it's used. For example, if the vectorizer queries getCastInstrCost to evaluate the cost of a sext(load) with tail predication enabled, getCastInstrCost will think it's free most of the time, but it's not always free. On ARM MVE, a VLD2 group cannot be extended like a normal VLDR can. Similar situations can come up with how masked loads can be extended when being split. To fix that, this path adds a new parameter to getCastInstrCost to give it a hint about the context of the cast. It adds a CastContextHint enum which contains the type of the load/store being created by the vectorizer - one for each of the types it can produce. Original patch by Pierre van Houtryve Differential Revision: https://reviews.llvm.org/D79162
… masked stores This patch uses the feature added in D79162 to fix the cost of a sext/zext of a masked load, or a trunc for a masked store. Previously, those were considered cheap or even free, but it's not the case as we cannot split the load in the same way we would for normal loads. This updates the costs to better reflect reality, and adds a test for it in test/Analysis/CostModel/ARM/cast.ll. It also adds a vectorizer test that showcases the improvement: in some cases, the vectorizer will now choose a smaller VF when tail-predication is enabled, which results in better codegen. (Because if it were to use a higher VF in those cases, the code we see above would be generated, and the vmovs would block tail-predication later in the process, resulting in very poor codegen overall) Original Patch by Pierre van Houtryve Differential Revision: https://reviews.llvm.org/D79163
`std.dim` currently only accepts ranked memrefs and `std.rank` is limited to tensors. Differential Revision: https://reviews.llvm.org/D84790
…InstrCost variant This will simplify target overrides, and matches what we do for most integer intrinsic costs.
Differential Revision: https://reviews.llvm.org/D84749
…s used A list of target features is disabled when there is no hardware floating-point support. This is the case when one of the following options is passed to clang: - -mfloat-abi=soft - -mfpu=none This option list is missing, however, the extension "+nofp" that can be specified in -march flags, such as "-march=armv8-a+nofp". This patch also disables unsupported target features when nofp is passed to -march. Differential Revision: https://reviews.llvm.org/D82948
Fix testcase introduced in d1a3396.
There's a slight difference in functionality with the new CHECK lines: before, we allowed either -0.0 or 0.0 for maxnum/minnum. That matches the definition, but we should always get a deterministic result from constant folding within the compiler, so now we assert that we got the single expected result in all cases.
Differential Revision: https://reviews.llvm.org/D84832
In general Decl::getASTContext() is relatively expensive and here the changes are non-invasive. NFC.
The lowering does not support all types for its source operations. This change makes the patterns fail in a well-defined manner. Differential Revision: https://reviews.llvm.org/D84443
This patch teaches SCEVExpander to directly preserve LCSSA. As it is currently, SCEV does not look through PHI nodes in loops, as it might break LCSSA form. Once SCEVExpander can preserve LCSSA form, it should be safe for SCEV to look through PHIs. To preserve LCSSA form, this patch uses formLCSSAForInstructions on operands of newly created instructions, if the definition is inside a different loop than the new instruction. The final value we return from expandCodeFor may also need LCSSA phis, depending on the insert point. As no user for it exists there yet, create a temporary instruction at the insert point, which can be passed to formLCSSAForInstructions. This temporary instruction is removed after LCSSA construction. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D71538
When an instrinsic function is declared in a type declaration statement we need to set the INTRINSIC attribute and (per 8.2(3)) ignore the specified type. To simplify the check, add IsIntrinsic utility to BaseVisitor. Also, intrinsics and external procedures were getting assigned a size and offset and they shouldn't be. Differential Revision: https://reviews.llvm.org/D84702
This patch introduces 2 new address spaces in OpenCL: global_device and global_host which are a subset of a global address space, so the address space scheme will be looking like: ``` generic->global->host ->device ->private ->local constant ``` Justification: USM allocations may be associated with both host and device memory. We want to give users a way to tell the compiler the allocation type of a USM pointer for optimization purposes. (Link to the Unified Shared Memory extension: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/USM/cl_intel_unified_shared_memory.asciidoc) Before this patch USM pointer could be only in opencl_global address space, hence a device backend can't tell if a particular pointer points to host or device memory. On FPGAs at least we can generate more efficient hardware code if the user tells us where the pointer can point - being able to distinguish between these types of pointers at compile time allows us to instantiate simpler load-store units to perform memory transactions. Patch by Dmitry Sidorov. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D82174
The previous fix for this, https://reviews.llvm.org/D76761, Passed test cases but failed in the real world as std::string has a non trivial destructor so creates a CXXBindTemporaryExpr. This handles that shortfall and updates the test case std::basic_string implementation to use a non trivial destructor to reflect real world behaviour. Reviewed By: gribozavr2 Differential Revision: https://reviews.llvm.org/D84831
This commit is part of a greater project which aims to add full end-to-end support for convolutions inside mlir. The reason behind having conv ops for each rank rather than having one generic ConvOp is to enable better optimizations for every N-D case which reflects memory layout of input/kernel buffers better and simplifies code as well. We expect plain linalg.conv to be progressively retired. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D83879
… types Differential Revision: https://reviews.llvm.org/D84444
Differential Revision: https://reviews.llvm.org/D84069
If both operands are undef, return undef. If one operand is undef, clamp to limit constant.
This patch replaces 'AddrSize'/'SegSize' with 'AddressSize'/'SegmentSelectorSize'. NFC.
…ities Not a bug that is ever likely to materialise, but still worth fixing Reviewed By: DmitryPolukhin Differential Revision: https://reviews.llvm.org/D84850
…verlappingMultipleDef In MachineCopyPropagation::BackwardPropagatableCopy(), a check is added for multiple destination registers. The copy propagation is avoided if the copied destination register is the same register as another destination on the same instruction. A new test is added. This used to fail on ARM like this: error: unpredictable instruction, RdHi and RdLo must be different umull r9, r9, lr, r0 Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D82638
We previously used a non-aggregate RValue to represent the passed value, which violated the assumptions of call arg lowering in some cases, in particular on 32-bit Windows, where we'd end up producing an FCA store with TBAA metadata, that the IR verifier would reject.
Reviewed By: gribozavr2 Differential Revision: https://reviews.llvm.org/D84926
This allow declaring buffers and alloc of vectors so that we can support vector load/store. Differential Revision: https://reviews.llvm.org/D84982
Avoid recursively calling copyPhysReg for AGPR handling. This was dropping the necessary super register implicit defs to avoid liveness verifier errors.
Differential Revision: https://reviews.llvm.org/D84984
This allows clients to detect invalid transformations applied by JITLink passes (e.g. inserting or removing symbols in unexpected ways) and terminate linking with an error. This change is used to simplify the error propagation logic in ObjectLinkingLayer.
The -harness option enables new testing use-cases for llvm-jitlink. It takes a list of objects to treat as a test harness for any regular objects passed to llvm-jitlink. If any files are passed using the -harness option then the following transformations are applied to all other files: (1) Symbols definitions that are referenced by the harness files are promoted to default scope. (This enables access to statics from test harness). (2) Symbols definitions that clash with definitions in the harness files are deleted. (This enables interposition by test harness). (3) All other definitions in regular files are demoted to local scope. (This causes untested code to be dead stripped, reducing memory cost and eliminating spurious unresolved symbol errors from untested code). These transformations allow the harness files to reference and interpose symbols in the regular object files, which can be used to support execution tests (including fuzz tests) of functions in relocatable objects produced by a build.
…cessary fptosi/fptoui have similar, but not identical, semantics. In particular, the behavior on overflow is different. Fixes https://bugs.llvm.org/show_bug.cgi?id=46844 for 64-bit. (The corresponding patch for 32-bit is more involved because the equivalent intrinsics don't exist, as far as I can tell.) Differential Revision: https://reviews.llvm.org/D84703
This tool will be used to generate C wrappers for the C++ LLVM libc implementations. This change does not hook this tool up to anything yet. However, it can be useful for cases where one does not want to run the objcopy step (to insert the C symbol in the object file) but can make use of LTO to eliminate the cost of the additional wrapper call. This can be relevant for certain downstream platforms. If this tool can benefit other libc platforms in general, then it can be integrated into the build system with options to use or not use the wrappers. An example of such a platform is CUDA. Reviewed By: abrachet Differential Revision: https://reviews.llvm.org/D84848
clang-tidy's llvm-header-guard rule references the LLVM style - where it's missing. Differential Revision: https://reviews.llvm.org/D84989
…INSIC_LRINT. Differential Revision: https://reviews.llvm.org/D84552
Just the obvious implementation that rewrites the result type. Also fix warning from EXTRACT_SUBVECTOR legalization that triggers on the test. Differential Revision: https://reviews.llvm.org/D84706
…nslated processes When we detect a process that the native debugserver cannot handle, handoff the connection fd to the translated debugserver.
InstrProfilingBuffer.c.o is generic code that must support compilation into freestanding projects. This gets rid of its dependence on the _getpagesize symbol from libc, shifting it to InstrProfilingFile.c.o. This fixes a build failure seen in a firmware project. rdar://66249701
…nsic This includes basic support for computeKnownBits on abs. I've left FIXMEs for more complicated things we could do. Differential Revision: https://reviews.llvm.org/D84963
This patch addes time trace functionality to have a better understanding of the analysis times. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D84980
…res and tuning features After the recent change to the tuning settings for pentium4 to improve our default 32-bit behavior, I've decided to see about implementing -mtune support. This way we could have a default architecture CPU of "pentium4" or "x86-64" and a default tuning cpu of "generic". And we could change our "pentium4" tuning settings back to what they were before. As a step to supporting this, this patch separates all of the features lists for the CPUs into 2 lists. I'm using the Proc class and a new ProcModel class to concat the 2 lists before passing to the target independent ProcessorModel. Future work to truly support mtune would change ProcessorModel to take 2 lists separately. I've diffed the X86GenSubtargetInfo.inc file before and after this patch to ensure that the final feature list for the CPUs isn't changed. Differential Revision: https://reviews.llvm.org/D84879
…VI) mitigations Fix for the issue raised in rust-lang/rust#74632. The current heuristic for inserting LFENCEs uses a quadratic-time algorithm. This can apparently cause substantial compilation slowdowns for building Rust projects, where functions > 5000 LoC are apparently common. The updated heuristic in this patch implements a linear-time algorithm. On a set of benchmarks, the slowdown factor for the generated code was comparable (2.55x geo mean for the quadratic-time heuristic, vs. 2.58x for the linear-time heuristic). Both heuristics offer the same security properties, namely, mitigating LVI. This patch also includes some formatting fixes. Differential Revision: https://reviews.llvm.org/D84471
Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D84903
Refactored the function `target` to make preparation for fixing the issue of ahead-of-time device memory deallocation. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D84816
Differential Revision: https://reviews.llvm.org/D84616
haoyouab
pushed a commit
to Flex-plaidml-team/llvm-project
that referenced
this pull request
Sep 4, 2020
When `Target::GetEntryPointAddress()` calls `exe_module->GetObjectFile()->GetEntryPointAddress()`, and the returned `entry_addr` is valid, it can immediately be returned. However, just before that, an `llvm::Error` value has been setup, but in this case it is not consumed before returning, like is done further below in the function. In https://bugs.freebsd.org/248745 we got a bug report for this, where a very simple test case aborts and dumps core: ``` * thread plaidml#1, name = 'testcase', stop reason = breakpoint 1.1 frame #0: 0x00000000002018d4 testcase`main(argc=1, argv=0x00007fffffffea18) at testcase.c:3:5 1 int main(int argc, char *argv[]) 2 { -> 3 return 0; 4 } (lldb) p argc Program aborted due to an unhandled Error: Error value was Success. (Note: Success values must still be checked prior to being destroyed). Thread 1 received signal SIGABRT, Aborted. thr_kill () at thr_kill.S:3 3 thr_kill.S: No such file or directory. (gdb) bt #0 thr_kill () at thr_kill.S:3 plaidml#1 0x00000008049a0004 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52 plaidml#2 0x0000000804916229 in abort () at /usr/src/lib/libc/stdlib/abort.c:67 plaidml#3 0x000000000451b5f5 in fatalUncheckedError () at /usr/src/contrib/llvm-project/llvm/lib/Support/Error.cpp:112 plaidml#4 0x00000000019cf008 in GetEntryPointAddress () at /usr/src/contrib/llvm-project/llvm/include/llvm/Support/Error.h:267 plaidml#5 0x0000000001bccbd8 in ConstructorSetup () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:67 plaidml#6 0x0000000001bcd2c0 in ThreadPlanCallFunction () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:114 plaidml#7 0x00000000020076d4 in InferiorCallMmap () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/Utility/InferiorCallPOSIX.cpp:97 plaidml#8 0x0000000001f4be33 in DoAllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/FreeBSD/ProcessFreeBSD.cpp:604 plaidml#9 0x0000000001fe51b9 in AllocatePage () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:347 plaidml#10 0x0000000001fe5385 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:383 plaidml#11 0x0000000001974da2 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2301 plaidml#12 CanJIT () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2331 plaidml#13 0x0000000001a1bf3d in Evaluate () at /usr/src/contrib/llvm-project/lldb/source/Expression/UserExpression.cpp:190 plaidml#14 0x00000000019ce7a2 in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Target/Target.cpp:2372 plaidml#15 0x0000000001ad784c in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:414 plaidml#16 0x0000000001ad86ae in DoExecute () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:646 plaidml#17 0x0000000001a5e3ed in Execute () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandObject.cpp:1003 plaidml#18 0x0000000001a6c4a3 in HandleCommand () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:1762 plaidml#19 0x0000000001a6f98c in IOHandlerInputComplete () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2760 plaidml#20 0x0000000001a90b08 in Run () at /usr/src/contrib/llvm-project/lldb/source/Core/IOHandler.cpp:548 plaidml#21 0x00000000019a6c6a in ExecuteIOHandlers () at /usr/src/contrib/llvm-project/lldb/source/Core/Debugger.cpp:903 plaidml#22 0x0000000001a70337 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2946 plaidml#23 0x0000000001d9d812 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/API/SBDebugger.cpp:1169 plaidml#24 0x0000000001918be8 in MainLoop () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:675 plaidml#25 0x000000000191a114 in main () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:890``` Fix the incorrect error catch by only instantiating an `Error` object if it is necessary. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D86355
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.