Skip to content

Bump to latest master #28

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1,985 commits into from
Sep 21, 2020
Merged

Bump to latest master #28

merged 1,985 commits into from
Sep 21, 2020

Conversation

flaub
Copy link

@flaub flaub commented Sep 18, 2020

No description provided.

JDevlieghere and others added 30 commits September 10, 2020 18:50
This fixes the following assertion in TestPlatformPython.py.

  Assertion failed: (id != 0 && "Forgot to add function to
  registry?")
…fic TLS slot

An upcoming change to Scudo will change how we use the TLS slot
in tsd_shared.h, which will be a little easier to deal with if
we can remove the code path that calls pthread_getspecific and
pthread_setspecific. The only known user of this code path is Fuchsia.

We can't eliminate this code path by making Fuchsia use ELF TLS
because although Fuchsia supports ELF TLS, it is not supported within
libc itself. To address this, Roland McGrath on the Fuchsia team has
proposed that Scudo will optionally call a platform-provided function
to access a TLS slot reserved for Scudo. Android also has a reserved
TLS slot, but the code that accesses the TLS slot lives in Scudo.

We can eliminate some complexity and duplicated code by having Android
use the same mechanism that was proposed for Fuchsia, which is what
this change does. A separate change to Android implements it.

Differential Revision: https://reviews.llvm.org/D87420
Replace all remaining uses with thread_local, which is a C++11
standard feature.

Differential Revision: https://reviews.llvm.org/D87478
- It seems no long required for shared library builds.
This reverts commit c982682 as it
breaks regression tests.
Update both thread and stack.
Update thread and stack as atomic operation.
Keep all 32bit of TID as now we have enough bits.

Depends on D87135.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D87217
In addition to calculate hash consistently by swapping SELECT's
operands, we also need to inverse the select pattern favor to match the
original logic.

[EarlyCSE] Equivalent SELECTs should hash equally

DenseMap<SimpleValue> assumes that, if its isEqual method returns true
for two elements, then its getHashValue method must return the same value
for them. This invariant is broken when one SELECT node is a min/max
operation, and the other can be transformed into an equivalent min/max by
inverting its predicate and swapping its operands. This patch fixes an
assertion failure that would occur intermittently while compiling the
following IR:

    define i32 @t(i32 %i) {
      %cmp = icmp sle i32 0, %i
      %twin1 = select i1 %cmp, i32 %i, i32 0
      %cmpinv = icmp sgt i32 0, %i
      %twin2 = select i1 %cmpinv,  i32 0, i32 %i
      %sink = add i32 %twin1, %twin2
      ret i32 %sink
    }

Differential Revision: https://reviews.llvm.org/D86843
…ubrange.

    This is to fix CodeView build failure https://bugs.llvm.org/show_bug.cgi?id=47287
    after DIsSubrange upgrade D80197

    Assert condition is now removed and Count is calculated in case LowerBound
    is absent or zero and Count or UpperBound is constant. If Count is unknown
    it is later handled as VLA (currently Count is set to zero).

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D87406
- As min/max are commutative operators, there is no need to swap
  operands. That breaks the convention calculating the hash value.
This patch updates the documentation about `__builtin_memcpy_inline` and reorders the sections so it is more consitent and understandable.

Differential Revision: https://reviews.llvm.org/D87458
…or aarch64

The following EmitWinEHHandlerData() implicitly switches to .xdata, just
like on x86_64.

This became orphaned from the original code requiring it in
0b61d22 / https://reviews.llvm.org/D61095.

Differential Revision: https://reviews.llvm.org/D87447
Convert 2-byte opcodes to equivalent 1-byte ones.

Adjust the existing exhaustive testcase to avoid being altered by
the simplification rules (to keep that test exercising all individual
opcodes).

Fix the assembler parser limits for register pairs; for .seh_save_regp
and .seh_save_regp_x, we can allow up to x29, for a x29+x30 pair
(which gets remapped to the UOP_SaveFPLR(X) opcodes), for .seh_save_fregp
and .seh_save_fregpx, allow up to d14+d15.

Not creating .seh_save_next for float register pairs, as the
actual unwinder implementation in current versions of Windows is buggy
for that case.

This gives a minimal but measurable size reduction. (For a 6.5 MB
DLL with 300 KB .xdata, the .xdata shrinks by 48 bytes. The opcode
sequences are padded to a 4 byte boundary, so very small improvements
might not end up mattering directly.)

Differential Revision: https://reviews.llvm.org/D87367
This gives a pretty substantial size reduction; for a 6.5 MB
DLL with 300 KB .xdata, the .xdata shrinks by 66 KB.

Differential Revision: https://reviews.llvm.org/D87369
Check that all passes, which report they preserve CFG,
are really preserving CFG.
A new standard instrumentation is introduced. It can be
switched on/off by the flag verify-cfg-preserved, which
is on by default for debug builds.

Reviewers: kuhar, fedor.sergeev

Differential Revision: https://reviews.llvm.org/D81558
When inlining functions containing allocas of scalable vectors we
cannot specify the size in the lifetime markers, since we don't
know this at compile time.

Added new test here:

  test/Transforms/Inline/AArch64/sve-alloca-merge.ll

Differential Revision: https://reviews.llvm.org/D87139
…peration.

The LinalgTilingPattern class dervied from the base deletes the
original operation. This allows for the use case where the more
transformations are necessary on the original operation after
tiling. In such cases the pattern can derive from
LinalgBaseTilingPattern instead of LinalgTilingPattern.

Differential Revision: https://reviews.llvm.org/D87308
As reported in Bug 42535, `clang` doesn't inline atomic ops on 32-bit
Sparc, unlike `gcc` on Solaris.  In a 1-stage build with `gcc`, only two
testcases are affected (currently `XFAIL`ed), while in a 2-stage build more
than 100 tests `FAIL` due to this issue.

The reason for this `gcc`/`clang` difference is that `gcc` on 32-bit
Solaris/SPARC defaults to `-mpcu=v9` where atomic ops are supported, unlike
with `clang`'s default of `-mcpu=v8`.  This patch changes `clang` to use
`-mcpu=v9` on 32-bit Solaris/SPARC, too.

Doing so uncovered two bugs:

`clang -m32 -mcpu=v9` chokes with any Solaris system headers included:

  /usr/include/sys/isa_defs.h:461:2: error: "Both _ILP32 and _LP64 are defined"
  #error "Both _ILP32 and _LP64 are defined"

While `clang` currently defines `__sparcv9` in a 32-bit `-mcpu=v9`
compilation, neither `gcc` nor Studio `cc` do.  In fact, the Studio 12.6
`cc(1)` man page clearly states:

            These predefinitions are valid in all modes:
  [...]
               __sparcv8 (SPARC)
               __sparcv9 (SPARC -m64)

At the same time, the patch defines `__GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]`
for a 32-bit Sparc compilation with any V9 cpu.  I've also changed
`MaxAtomicInlineWidth` for V9, matching what `gcc` does and the Oracle
Developer Studio 12.6: C User's Guide documents (Ch. 3, Support for Atomic
Types, 3.1 Size and Alignment of Atomic C Types).

The two testcases that had been `XFAIL`ed for Bug 42535 are un-`XFAIL`ed
again.

Tested on `sparcv9-sun-solaris2.11` and `amd64-pc-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D86621
This changes adjusts the documentation generation for the AVX512 dialect. The machanism to generate documentation was changed with llvm@1a083f0.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D87460
This fixes a failed assert if expensive checks are enabled,
since 1308bb9.
…nique_ptr."

This reverts commit c74900c.

This appears to be breaking some builds on macOS and has been causing
build failures on Green Dragon (see below). I am reverting this for now,
to unblock testing on Green Dragon.

http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/18144/console

[65/187] /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++  -DBUILD_EXAMPLES -DGTEST_HAS_RTTI=0 -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Iexamples/ThinLtoJIT -I/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/examples/ThinLtoJIT -Iinclude -I/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -O3  -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -mmacosx-version-min=10.9    -fno-exceptions -fno-rtti -UNDEBUG -std=c++14 -MD -MT examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o -MF examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o.d -o examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o -c /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/examples/ThinLtoJIT/ThinLtoDiscoveryThread.cpp
FAILED: examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++  -DBUILD_EXAMPLES -DGTEST_HAS_RTTI=0 -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Iexamples/ThinLtoJIT -I/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/examples/ThinLtoJIT -Iinclude -I/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -fdiagnostics-color -O3  -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -mmacosx-version-min=10.9    -fno-exceptions -fno-rtti -UNDEBUG -std=c++14 -MD -MT examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o -MF examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o.d -o examples/ThinLtoJIT/CMakeFiles/ThinLtoJIT.dir/ThinLtoDiscoveryThread.cpp.o -c /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/examples/ThinLtoJIT/ThinLtoDiscoveryThread.cpp
In file included from /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/examples/ThinLtoJIT/ThinLtoDiscoveryThread.cpp:7:
/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/examples/ThinLtoJIT/ThinLtoInstrumentationLayer.h:37:68: error: non-virtual member function marked 'override' hides virtual member function
  void emit(MaterializationResponsibility R, ThreadSafeModule TSM) override;
                                                                   ^
/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm-project/llvm/include/llvm/ExecutionEngine/Orc/Layer.h:103:16: note: hidden overloaded virtual function 'llvm::orc::IRLayer::emit' declared here: type mismatch at 1st parameter ('std::unique_ptr<MaterializationResponsibility>' vs 'llvm::orc::MaterializationResponsibility')
  virtual void emit(std::unique_ptr<MaterializationResponsibility> R,
               ^
1 error generated.
Previously only the input type was printed, and the parser applied it to
both input and output, creating an invalid transpose. Print and parse
both types, and verify that they match.

Differential Revision: https://reviews.llvm.org/D87462
Without this, future optimizer improvements can optimize the entire
function to "return 0".
rotateright and others added 26 commits September 14, 2020 10:07
The new tests are duplicated from the sibling patch for codegen:
D87571
We use the same code structure for folding integer min/max.
…ype conversion operations.

Added support to the Std dialect cast operations to do casts in vector types when feasible.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D87410
In C++20, since P0896R4, std::ostream_iterator and std::ostreambuf_iterator
must have std::ptrdiff_t instead of void as a difference_type.

Tests by Casey Carter (thanks!).

Differential Revision: https://reviews.llvm.org/D87459
1ce8201 added a fix to restrict phi optimizations after phi
translations. But the current use of performedPhiTranslation only
checked whether phi translation happened for the first iterator and
missed cases where phi translations happens at subsequent
iterators/upwards defs.

This patch changes upward_defs_iteartor to take a pointer to a bool, so
we can easily ensure the final value includes all visited defs, while
still being able to conveniently use it with make_range & co.
maxnum(ninf X, +FLT_MAX) --> +FLT_MAX
minnum(ninf X, -FLT_MAX) --> -FLT_MAX

This is based on the similar codegen transform proposed in:
D87571
This adds the `__swift_objc_members__` attribute to the semantic
analysis.  It allows for annotating ObjC interfaces to provide Swift
semantics indicating that the types derived from this interface will be
back-bridged to Objective-C to allow interoperability with Objective-C
and Swift.

This is based on the work of the original changes in
llvm/llvm-project-staging@8afaf3a

Differential Revision: https://reviews.llvm.org/D87395
Reviewed By: Aaron Ballman, Dmitri Gribenko
maximum(nnan X, +INF) --> +INF
minimum(nnan X, -INF) --> -INF

This is based on the similar codegen transform proposed in:
D87571
Integer case values were being compared as unsigned by operator<
on evaluate::value::Integer. Change that to signed so that overlap
can be detected correctly.

Explicit CompareUnsigned and BLT are still available if unsigned
comparison is needed.

Fixes https://bugs.llvm.org/show_bug.cgi?id=47309

Differential Revision: https://reviews.llvm.org/D87595
This revision removes dependencies that exist between different string functions. This allows for the libc user to use a specific function X of this library without also depending on Y and Z.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D87421
…ent PPC64 thunk range errors

Prefer `errorOrWarn` to `fatal` for recoverable errors and graceful degradation
when --noinhibit-exec is specified.

Mention the destination symbol, otherwise the diagnostic is not really actionable.
Two errors are not tested but the patch does not intend to add the coverage.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D87486
Change the analyzed form of type-bound assignment to match that of call
statements. Resolve the binding name to a specific subprogram when
possible by using `GetBindingResolution`. Otherwise leave it as a
type-bound procedure call.

Differential Revision: https://reviews.llvm.org/D87541
…b_addr_map section, instead of emitting special unary-encoded symbols.

This patch introduces the new .bb_addr_map section feature which allows us to emit the bits needed for mapping binary profiles to basic blocks into a separate section.
The format of the emitted data is represented as follows. It includes a header for every function:

|  Address of the function                      |  -> 8 bytes (pointer size)
|  Number of basic blocks in this function (>0) |  -> ULEB128

The header is followed by a BB record for every basic block. These records are ordered in the same order as MachineBasicBlocks are placed in the function. Each BB Info is structured as follows:

|  Offset of the basic block relative to function begin |  -> ULEB128
|  Binary size of the basic block                       |  -> ULEB128
|  BB metadata                                          |  -> ULEB128  [ MBB.isReturn() OR MBB.hasTailCall() << 1  OR  MBB.isEHPad() << 2 ]

The new feature will replace the existing "BB labels" functionality with -basic-block-sections=labels.
The .bb_addr_map section scrubs the specially-encoded BB symbols from the binary and makes it friendly to profilers and debuggers.
Furthermore, the new feature reduces the binary size overhead from 70% bloat to only 12%.

For more information and results please refer to the RFC: https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html

Reviewed By: MaskRay, snehasish

Differential Revision: https://reviews.llvm.org/D85408
Fixes clang-tidy warnings first noticed on D87452.
- Fix a small issue caused by a conflicting name (GetObject) on Windows.
  The fix was to rename the internal GetObject function to
  GetNextFunction.
Similar to D87415, this folds the various float min/max opcodes
with a constant INF or -INF operand, or FLT_MAX / -FLT_MAX operand
if the ninf flag is set. Some of the folds are only possible under
nnan.

The fminnum(X, INF) with nnan and fmaxnum(X, -INF) with nnan cases
are needed to improve the VECREDUCE_FMIN/FMAX lowerings on X86,
the rest is here for the sake of completeness.

Differential Revision: https://reviews.llvm.org/D87571
For selects of the type X == Y ? A : B, check if we can simplify A
by using the X == Y equality and replace the operand if that's
possible. We already try to do this in InstSimplify, but will only
fold if the result of the simplification is the same as B, in which
case the select can be dropped entirely. Here the select will be
retained, just one operand simplified.

As we are performing an actual replacement here, we don't have
problems with refinement / poison values.

Differential Revision: https://reviews.llvm.org/D87480
Add signed aliases for integral types, as well as the "DF" abbreviation for the FWORD type.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D87246
MASM structs are end-padded to have size a multiple of the smaller of the requested alignment and the size of their largest field (taken recursively, if they have a field of STRUCT type).

This matches the behavior of ml.exe and ml64.exe. Our original implementation followed the MASM 6.0 documentation, which instead specified that MASM structs were padded to a multiple of their requested alignment.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D87248
Resolved a conflict.
@flaub flaub force-pushed the llitchev-move-llvm-pointer branch from 8901161 to 39bd349 Compare September 21, 2020 18:46
@flaub flaub merged commit 7c0ec48 into plaidml/plaidml-v1 Sep 21, 2020
@flaub flaub deleted the llitchev-move-llvm-pointer branch September 21, 2020 22:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.