Allow fixed vector operand for LLVM_AtomicRMWOp #110553
Thank you for submitting a Pull Request (PR) to the LLVM Project!
This PR will be automatically labeled and the relevant teams will be notified.
If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using `@` followed by their GitHub username.
If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment "Ping". The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.
If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-llvm
Author: Ilya V (joviliast)
Changes
Since the AMDGPU target supports vectorization for the atomic_rmw operation, allow construction of LLVM_AtomicRMWOp with 16-bit floating-point values. This patch enables building LLVM_AtomicRMWOp with fixed vectors of 16-bit fp values as operands.
See also: #94845, #95393, #95394
Full diff: https://github.com/llvm/llvm-project/pull/110553.diff
3 Files Affected:
diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
index 030160821bd823..615c0a39f3acd0 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
@@ -1737,7 +1737,8 @@ def LLVM_ConstantOp
// Atomic operations.
//
-def LLVM_AtomicRMWType : AnyTypeOf<[LLVM_AnyFloat, LLVM_AnyPointer, AnySignlessInteger]>;
+def LLVM_AtomicRMWType
+ : AnyTypeOf<[LLVM_AnyPointer, AnySignlessInteger, LLVM_ScalarOrVectorOf<LLVM_AnyFloat>]>;
def LLVM_AtomicRMWOp : LLVM_MemAccessOpBase<"atomicrmw", [
TypesMatchWith<"result #0 and operand #1 have the same type",
diff --git a/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp b/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
index 0561c364c7d591..99b3dc79fda664 100644
--- a/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
+++ b/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
@@ -3008,9 +3008,19 @@ void AtomicRMWOp::build(OpBuilder &builder, OperationState &state,
LogicalResult AtomicRMWOp::verify() {
auto valType = getVal().getType();
- if (getBinOp() == AtomicBinOp::fadd || getBinOp() == AtomicBinOp::fsub ||
- getBinOp() == AtomicBinOp::fmin || getBinOp() == AtomicBinOp::fmax) {
- if (!mlir::LLVM::isCompatibleFloatingPointType(valType))
+ if (getBinOp() == AtomicBinOp::fadd && isCompatibleVectorType(valType)) {
+ // Currently, only fadd operation supports fixed vector operands.
+ if (isScalableVectorType(valType))
+ return emitOpError("expected LLVM IR fixed vector type");
+ Type elemType = getVectorElementType(valType);
+ if (!(isCompatibleFloatingPointType(elemType) &&
+ elemType.getIntOrFloatBitWidth() == 16))
+ return emitOpError("unexpected LLVM IR type for vector element");
+ } else if (getBinOp() == AtomicBinOp::fadd ||
+ getBinOp() == AtomicBinOp::fsub ||
+ getBinOp() == AtomicBinOp::fmin ||
+ getBinOp() == AtomicBinOp::fmax) {
+ if (!isCompatibleFloatingPointType(valType))
return emitOpError("expected LLVM IR floating point type");
} else if (getBinOp() == AtomicBinOp::xchg) {
DataLayout dataLayout = DataLayout::closest(*this);
diff --git a/mlir/test/Dialect/LLVMIR/invalid.mlir b/mlir/test/Dialect/LLVMIR/invalid.mlir
index 9388d7ef24936e..978572a2b3cca2 100644
--- a/mlir/test/Dialect/LLVMIR/invalid.mlir
+++ b/mlir/test/Dialect/LLVMIR/invalid.mlir
@@ -643,6 +643,22 @@ func.func @atomicrmw_expected_float(%i32_ptr : !llvm.ptr, %i32 : i32) {
// -----
+func.func @atomicrmw_unexpected_scalable_vector(%i32_ptr : !llvm.ptr, %i16_fvec : vector<[3]xf16>) {
+ // expected-error@+1 {{expected LLVM IR fixed vector type}}
+ %0 = llvm.atomicrmw fadd %i32_ptr, %i16_fvec unordered : !llvm.ptr, i32
+ llvm.return
+}
+
+// -----
+
+func.func @atomicrmw_unexpected_vector_element(%i32_ptr : !llvm.ptr, %i16_fvec : vector<3xi16>) {
+ // expected-error@+1 {{unexpected LLVM IR type for vector element}}
+ %0 = llvm.atomicrmw fadd %i32_ptr, %i16_fvec unordered : !llvm.ptr, i32
+ llvm.return
+}
+
+// -----
+
func.func @atomicrmw_unexpected_xchg_type(%i1_ptr : !llvm.ptr, %i1 : i1) {
// expected-error@+1 {{unexpected LLVM IR type for 'xchg' bin_op}}
%0 = llvm.atomicrmw xchg %i1_ptr, %i1 unordered : !llvm.ptr, i1
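The branch structure of the verifier change in this first revision can be easier to follow in isolation. The following is a minimal, self-contained sketch that only models the decision logic shown in the diff above; `BinOp`, `ValueType`, and `verifyInitialRevision` are hypothetical stand-ins invented for illustration and are not part of the MLIR API. An empty string means the operand is accepted.

```cpp
#include <string>

// Hypothetical stand-ins for illustration only; not the MLIR API.
enum class BinOp { xchg, add, fadd, fsub, fmin, fmax };

struct ValueType {
  bool isFloat;       // scalar type (or vector element type) is floating point
  unsigned bitWidth;  // bit width of the scalar or of each vector element
  bool isVector;      // operand is a vector
  bool isScalable;    // vector is scalable, e.g. vector<[3]xf16>
};

// Models the checks of the first revision: only fadd may take a vector, and
// that vector must be fixed-size with 16-bit floating-point elements.
// Returns the error message, or an empty string when the operand is accepted.
std::string verifyInitialRevision(BinOp op, const ValueType &ty) {
  if (op == BinOp::fadd && ty.isVector) {
    if (ty.isScalable)
      return "expected LLVM IR fixed vector type";
    if (!(ty.isFloat && ty.bitWidth == 16))
      return "unexpected LLVM IR type for vector element";
    return "";
  }
  if (op == BinOp::fadd || op == BinOp::fsub || op == BinOp::fmin ||
      op == BinOp::fmax) {
    // Every other floating-point operation still requires a scalar float.
    if (ty.isVector || !ty.isFloat)
      return "expected LLVM IR floating point type";
  }
  return "";
}
```

Under this model, `fadd` on `vector<2xf16>` is accepted, while `fsub` on the same type is rejected, which is the restriction the reviewers question later in the thread.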
Force-pushed from 844d68f to fdb0108.
LGTM! Only a few nits.
Force-pushed from fdb0108 to 3728f65.
Force-pushed from f8c56ec to 4f9c2d9.
LGTM! @antiagainst do you want to have a final look?
@@ -1535,11 +1536,13 @@ llvm.func @atomicrmw(
%17 = llvm.atomicrmw usub_cond %i32_ptr, %i32 monotonic : !llvm.ptr, i32
// CHECK: atomicrmw usub_sat ptr %{{.*}}, i32 %{{.*}} monotonic
%18 = llvm.atomicrmw usub_sat %i32_ptr, %i32 monotonic : !llvm.ptr, i32
// CHECK: atomicrmw fadd ptr %{{.*}}, <2 x half> %{{.*}} monotonic
%19 = llvm.atomicrmw fadd %f16_vec_ptr, %f16_vec monotonic : !llvm.ptr, vector<2xf16>
also test fmin/fmax/fsub with vector.
Also the scalar cases are supported, as well as bfloat
> also test fmin/fmax/fsub with vector.
I'm not sure about fmin/fmax/fsub, so for now I have filtered out the other operations here.
I suggest just fixing the title and description for now. WDYT?
> Also the scalar cases are supported, as well as bfloat
Agreed, bfloat should also be tested (done), but the scalar cases are out of scope for this PR.
You're applying a bunch of restrictions to these that simply do not exist in the underlying IR. FP vectors are supported for all the FP operations (except xchg, for now, which doesn't really count)
Ideally the LLVM dialect verifiers should follow the LLVM IR verifiers as closely as possible:
https://github.com/llvm/llvm-project/blob/6c331e50e4bfb4158d16ec3fe17ad7bb5c739e9f/llvm/lib/IR/Verifier.cpp#L4332C1-L4333C1
-> seems to contain the relevant code for this PR.
At the moment, the LLVM dialect verifiers are still very incomplete since it is mostly a lowering dialect in MLIR. However, we should try to avoid having verifiers that fail on correct LLVM IR (as it obviously was the case before your PR).
@@ -643,6 +643,14 @@ func.func @atomicrmw_expected_float(%i32_ptr : !llvm.ptr, %i32 : i32) {

// -----

func.func @atomicrmw_unexpected_vector_element(%ptr : !llvm.ptr, %f32_vec : vector<3xf32>) {
// expected-error@+1 {{unexpected LLVM IR type for vector element}}
This is not an IR rule, just pass through any FP vector?
As I understand it, this test needs to exercise the verifier's logic, so it is better to use a vector of ints here, for example, to match the documentation.
F32 is to be supported in the future.
Force-pushed from 4f9c2d9 to 31d07e6.
Thanks for fixing the verifier!
Force-pushed from 31d07e6 to 39ed304.
Got rid of the restrictions on FP operations other than fadd, updated the verifier logic, and changed the tests a bit.
Thanks LGTM modulo the last comments.
%1 = llvm.atomicrmw volatile fsub %ptr, %val syncscope("singlethread") monotonic {alignment = 16 : i64} : !llvm.ptr, f32
%1 = llvm.atomicrmw volatile fsub %f32_ptr, %f32 syncscope("singlethread") monotonic {alignment = 16 : i64} : !llvm.ptr, f32
// CHECK: llvm.atomicrmw fmin %{{.*}}, %{{.*}} monotonic : !llvm.ptr, vector<2xf16>
%2 = llvm.atomicrmw fmin %f32_vec_ptr, %f16_vec monotonic : !llvm.ptr, vector<2xf16>
%2 = llvm.atomicrmw fmin %f32_vec_ptr, %f16_vec monotonic : !llvm.ptr, vector<2xf16>
%2 = llvm.atomicrmw fmin %ptr, %f16_vec monotonic : !llvm.ptr, vector<2xf16>
nit: this should probably be an f16_vec_ptr. However, since we do not work with typed pointers, feel free to pass just one pointer argument to the test.
func.func @atomicrmw_unexpected_vector_element(%ptr : !llvm.ptr, %i32_vec : vector<3xi32>) {
// expected-error@+1 {{expected LLVM IR floating point type for vector element}}
%0 = llvm.atomicrmw fadd %ptr, %i32_vec unordered : !llvm.ptr, vector<3xi32>
llvm.return
Can you add a test with a scalable vector as well?
Yes, please check https://github.com/llvm/llvm-project/pull/110553/files#diff-ff9a14cb96ea30dc57bad4dc2c44b34d54d57a25777288c26f305279e387f1a1R646
But that test exercises the rule introduced in TableGen, not the verifier.
The verifier then checks scalability a second time. In my opinion, it is more consistent to have this check in the verifier than to omit it. WDYT?
Ah I see there is also LLVM_AnyFixedVector. I think it is ok to keep the redundant check, yes.
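The two layers discussed in this exchange can be sketched as follows. This is a simplified model rather than the merged code: `BinOp`, `ValueType`, `satisfiesTypeConstraint`, and `verifyRelaxed` are hypothetical names, and the model assumes that the declarative operand constraint rejects scalable vectors before the hand-written verifier runs, as the comments above describe. The error strings are the ones quoted in the diffs in this thread.

```cpp
#include <string>

// Hypothetical stand-ins for illustration only; not the MLIR API.
enum class BinOp { xchg, add, fadd, fsub, fmin, fmax };

struct ValueType {
  bool isFloat;     // scalar type (or vector element type) is floating point
  bool isVector;    // operand is a vector
  bool isScalable;  // vector is scalable, e.g. vector<[3]xf16>
};

// Layer 1: models the declarative operand type constraint. In this model it
// only rules out scalable vectors; fixed vectors pass through.
bool satisfiesTypeConstraint(const ValueType &ty) {
  return !(ty.isVector && ty.isScalable);
}

bool isFloatingPointOp(BinOp op) {
  return op == BinOp::fadd || op == BinOp::fsub || op == BinOp::fmin ||
         op == BinOp::fmax;
}

// Layer 2: models the relaxed verifier. All floating-point operations accept
// a float scalar or a fixed vector of floats. The scalability check repeats
// layer 1 on purpose, mirroring the redundant check kept in the review.
// Returns the error message, or an empty string when the operand is accepted.
std::string verifyRelaxed(BinOp op, const ValueType &ty) {
  if (!isFloatingPointOp(op))
    return "";
  if (ty.isVector) {
    if (ty.isScalable)
      return "expected LLVM IR fixed vector type";
    if (!ty.isFloat)
      return "expected LLVM IR floating point type for vector element";
    return "";
  }
  if (!ty.isFloat)
    return "expected LLVM IR floating point type";
  return "";
}
```

In this model a scalable vector is stopped by `satisfiesTypeConstraint`, so the matching branch in `verifyRelaxed` is only reachable if the constraint is later loosened, which is the consistency argument made for keeping it.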
Force-pushed from 39ed304 to cb11888.
This PR fixes `LLVM_AtomicRMWOp` allowed semantics and verifier logic to enable building of `LLVM_AtomicRMWOp` with fixed vectors of compatible fp values as operands for fp rmw operation. See also: https://llvm.org/docs/LangRef.html#id231 Signed-off-by: Ilya Veselov <[email protected]>
Hi @arsenm, could you have another look?
@joviliast Congratulations on having your first Pull Request (PR) merged into the LLVM Project!
Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.
Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.
How to do this, and the rest of the post-merge process, is covered in detail here.
If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.
If you don't get any reports, no action is required from you. Your changes are working as expected, well done!
This updates LLVM to pull in two fixes we need for AMD: * llvm/llvm-project#110553 * llvm/llvm-project#104743 Fixed `LLVM::CallOp` and `LLVM::CallIntrinsicOp` builder API after * llvm/llvm-project#108933