Allow fixed vector operand for LLVM_AtomicRMWOp #110553

Merged
merged 1 commit into from
Oct 3, 2024

Conversation

joviliast
Contributor

@joviliast joviliast commented Sep 30, 2024

This PR fixes the allowed operand types and the verifier logic of LLVM_AtomicRMWOp
so that the op can be built with fixed vectors of compatible floating-point values
as operands of floating-point RMW operations.

See also: https://llvm.org/docs/LangRef.html#id231

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Member

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-llvm

Author: Ilya V (joviliast)

Changes

Because the AMDGPU target supports vectorization of the atomic_rmw operation, allow construction of LLVM_AtomicRMWOp with 16-bit floating-point values. This patch enables building LLVM_AtomicRMWOp with fixed vectors of 16-bit floating-point values as operands.

See also: #94845, #95393, #95394


Full diff: https://github.com/llvm/llvm-project/pull/110553.diff

3 Files Affected:

  • (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td (+2-1)
  • (modified) mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp (+13-3)
  • (modified) mlir/test/Dialect/LLVMIR/invalid.mlir (+16)
diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
index 030160821bd823..615c0a39f3acd0 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
@@ -1737,7 +1737,8 @@ def LLVM_ConstantOp
 // Atomic operations.
 //
 
-def LLVM_AtomicRMWType : AnyTypeOf<[LLVM_AnyFloat, LLVM_AnyPointer, AnySignlessInteger]>;
+def LLVM_AtomicRMWType
+    : AnyTypeOf<[LLVM_AnyPointer, AnySignlessInteger, LLVM_ScalarOrVectorOf<LLVM_AnyFloat>]>;
 
 def LLVM_AtomicRMWOp : LLVM_MemAccessOpBase<"atomicrmw", [
       TypesMatchWith<"result #0 and operand #1 have the same type",
diff --git a/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp b/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
index 0561c364c7d591..99b3dc79fda664 100644
--- a/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
+++ b/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
@@ -3008,9 +3008,19 @@ void AtomicRMWOp::build(OpBuilder &builder, OperationState &state,
 
 LogicalResult AtomicRMWOp::verify() {
   auto valType = getVal().getType();
-  if (getBinOp() == AtomicBinOp::fadd || getBinOp() == AtomicBinOp::fsub ||
-      getBinOp() == AtomicBinOp::fmin || getBinOp() == AtomicBinOp::fmax) {
-    if (!mlir::LLVM::isCompatibleFloatingPointType(valType))
+  if (getBinOp() == AtomicBinOp::fadd && isCompatibleVectorType(valType)) {
+    // Currently, only fadd operation supports fixed vector operands.
+    if (isScalableVectorType(valType))
+      return emitOpError("expected LLVM IR fixed vector type");
+    Type elemType = getVectorElementType(valType);
+    if (!(isCompatibleFloatingPointType(elemType) &&
+          elemType.getIntOrFloatBitWidth() == 16))
+      return emitOpError("unexpected LLVM IR type for vector element");
+  } else if (getBinOp() == AtomicBinOp::fadd ||
+             getBinOp() == AtomicBinOp::fsub ||
+             getBinOp() == AtomicBinOp::fmin ||
+             getBinOp() == AtomicBinOp::fmax) {
+    if (!isCompatibleFloatingPointType(valType))
       return emitOpError("expected LLVM IR floating point type");
   } else if (getBinOp() == AtomicBinOp::xchg) {
     DataLayout dataLayout = DataLayout::closest(*this);
diff --git a/mlir/test/Dialect/LLVMIR/invalid.mlir b/mlir/test/Dialect/LLVMIR/invalid.mlir
index 9388d7ef24936e..978572a2b3cca2 100644
--- a/mlir/test/Dialect/LLVMIR/invalid.mlir
+++ b/mlir/test/Dialect/LLVMIR/invalid.mlir
@@ -643,6 +643,22 @@ func.func @atomicrmw_expected_float(%i32_ptr : !llvm.ptr, %i32 : i32) {
 
 // -----
 
+func.func @atomicrmw_unexpected_scalable_vector(%i32_ptr : !llvm.ptr, %i16_fvec : vector<[3]xf16>) {
+  // expected-error@+1 {{expected LLVM IR fixed vector type}}
+  %0 = llvm.atomicrmw fadd %i32_ptr, %i16_fvec unordered : !llvm.ptr, i32
+  llvm.return
+}
+
+// -----
+
+func.func @atomicrmw_unexpected_vector_element(%i32_ptr : !llvm.ptr, %i16_fvec : vector<3xi16>) {
+  // expected-error@+1 {{unexpected LLVM IR type for vector element}}
+  %0 = llvm.atomicrmw fadd %i32_ptr, %i16_fvec unordered : !llvm.ptr, i32
+  llvm.return
+}
+
+// -----
+
 func.func @atomicrmw_unexpected_xchg_type(%i1_ptr : !llvm.ptr, %i1 : i1) {
   // expected-error@+1 {{unexpected LLVM IR type for 'xchg' bin_op}}
   %0 = llvm.atomicrmw xchg %i1_ptr, %i1 unordered : !llvm.ptr, i1

@joviliast joviliast force-pushed the atomicrmw-vector-operand branch from 844d68f to fdb0108 Compare October 1, 2024 10:24
Contributor

@giuseros giuseros left a comment

LGTM! Only a few nits.

@joviliast joviliast force-pushed the atomicrmw-vector-operand branch from fdb0108 to 3728f65 Compare October 1, 2024 17:43
@joviliast joviliast force-pushed the atomicrmw-vector-operand branch 2 times, most recently from f8c56ec to 4f9c2d9 Compare October 2, 2024 10:00
Contributor

@giuseros giuseros left a comment

LGTM! @antiagainst do you want to have a final look?

@@ -1535,11 +1536,13 @@ llvm.func @atomicrmw(
%17 = llvm.atomicrmw usub_cond %i32_ptr, %i32 monotonic : !llvm.ptr, i32
// CHECK: atomicrmw usub_sat ptr %{{.*}}, i32 %{{.*}} monotonic
%18 = llvm.atomicrmw usub_sat %i32_ptr, %i32 monotonic : !llvm.ptr, i32
// CHECK: atomicrmw fadd ptr %{{.*}}, <2 x half> %{{.*}} monotonic
%19 = llvm.atomicrmw fadd %f16_vec_ptr, %f16_vec monotonic : !llvm.ptr, vector<2xf16>
Contributor

also test fmin/fmax/fsub with vector.

Also the scalar cases are supported, as well as bfloat

Contributor Author

@joviliast joviliast Oct 2, 2024

also test fmin/fmax/fsub with vector.

I'm not sure about fmin/fmax/fsub, so for now I have filtered out the other operations here.
I suggest just fixing the title and description for now. WDYT?

Also the scalar cases are supported, as well as bfloat

Agreed, bfloat should also be tested (done), but the scalar cases are out of scope for this PR.

Contributor

You're applying a bunch of restrictions to these that simply do not exist in the underlying IR. FP vectors are supported for all the FP operations (except xchg, for now, which doesn't really count)

Contributor

Ideally the LLVM dialect verifiers should follow the LLVM IR verifiers as closely as possible:
https://github.com/llvm/llvm-project/blob/6c331e50e4bfb4158d16ec3fe17ad7bb5c739e9f/llvm/lib/IR/Verifier.cpp#L4332C1-L4333C1
-> seems to contain the relevant code for this PR.

At the moment, the LLVM dialect verifiers are still very incomplete, since the dialect is mostly used as a lowering target in MLIR. However, we should try to avoid verifiers that fail on correct LLVM IR (as was obviously the case before your PR).

@@ -643,6 +643,14 @@ func.func @atomicrmw_expected_float(%i32_ptr : !llvm.ptr, %i32 : i32) {

// -----

func.func @atomicrmw_unexpected_vector_element(%ptr : !llvm.ptr, %f32_vec : vector<3xf32>) {
// expected-error@+1 {{unexpected LLVM IR type for vector element}}
Contributor

This is not an IR rule, just pass through any FP vector?

Contributor Author

@joviliast joviliast Oct 2, 2024

As I understand it, we need to test the verifier's logic here, so it is better to use, for example, a vector of ints, to stay consistent with the documentation.
F32 is to be supported in the future.

@joviliast joviliast force-pushed the atomicrmw-vector-operand branch from 4f9c2d9 to 31d07e6 Compare October 2, 2024 16:51
@joviliast joviliast changed the title from "Allow 16 bit floating point operand for LLVM_AtomicRMWOp" to "Allow 16 bit floating point operand for LLVM_AtomicRMWOp fadd" Oct 2, 2024
@joviliast joviliast requested a review from arsenm October 2, 2024 17:05
Contributor

@gysit gysit left a comment

Thanks for fixing the verifier!

@joviliast joviliast force-pushed the atomicrmw-vector-operand branch from 31d07e6 to 39ed304 Compare October 3, 2024 12:41
@joviliast joviliast changed the title from "Allow 16 bit floating point operand for LLVM_AtomicRMWOp fadd" to "Allow fixed vector operand for LLVM_AtomicRMWOp" Oct 3, 2024
@joviliast joviliast requested a review from gysit October 3, 2024 13:53
@joviliast
Contributor Author

Got rid of the restrictions for fp operations other than fadd, updated the verifier logic, and changed the tests a bit.
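
The relaxed rule discussed in this thread (every floating-point bin op accepts a scalar fp value or a fixed vector of fp values) can be modelled in a few lines of standalone C++. This is only a sketch under stated assumptions, not the merged verifier: `BinOp`, `Elem`, `ValueType`, and `verifyAtomicRMW` are hypothetical stand-ins for the MLIR types and for `AtomicRMWOp::verify()`, and only the three diagnostic strings are taken from this PR.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for the MLIR types; not the real LLVM dialect API.
enum class BinOp { xchg, add, fadd, fsub, fmin, fmax };
enum class Elem { Int, Float };

struct ValueType {
  Elem elem;               // Element kind (or scalar kind if not a vector).
  bool isVector = false;   // Models vector<...> versus a scalar.
  bool isScalable = false; // Models vector<[N]x...>.
};

inline bool isFloatBinOp(BinOp op) {
  return op == BinOp::fadd || op == BinOp::fsub || op == BinOp::fmin ||
         op == BinOp::fmax;
}

// Returns a diagnostic on failure and std::nullopt on success.
std::optional<std::string> verifyAtomicRMW(BinOp op, const ValueType &val) {
  if (!isFloatBinOp(op))
    return std::nullopt; // Integer and xchg rules are not modelled here.
  if (val.isVector) {
    // Vector operands: must be fixed-size and have fp elements.
    if (val.isScalable)
      return "expected LLVM IR fixed vector type";
    if (val.elem != Elem::Float)
      return "expected LLVM IR floating point type for vector element";
    return std::nullopt;
  }
  // Scalar operands: must be fp.
  if (val.elem != Elem::Float)
    return "expected LLVM IR floating point type";
  return std::nullopt;
}
```

In this model, `verifyAtomicRMW(BinOp::fmin, {Elem::Float, true, false})` succeeds, while the same call with a scalable vector or with integer elements returns the corresponding diagnostic.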

Contributor

@gysit gysit left a comment

Thanks, LGTM modulo the last comments.

%1 = llvm.atomicrmw volatile fsub %ptr, %val syncscope("singlethread") monotonic {alignment = 16 : i64} : !llvm.ptr, f32
%1 = llvm.atomicrmw volatile fsub %f32_ptr, %f32 syncscope("singlethread") monotonic {alignment = 16 : i64} : !llvm.ptr, f32
// CHECK: llvm.atomicrmw fmin %{{.*}}, %{{.*}} monotonic : !llvm.ptr, vector<2xf16>
%2 = llvm.atomicrmw fmin %f32_vec_ptr, %f16_vec monotonic : !llvm.ptr, vector<2xf16>
Contributor

Suggested change
- %2 = llvm.atomicrmw fmin %f32_vec_ptr, %f16_vec monotonic : !llvm.ptr, vector<2xf16>
+ %2 = llvm.atomicrmw fmin %ptr, %f16_vec monotonic : !llvm.ptr, vector<2xf16>

nit: this should probably be an f16_vec_ptr. However, since we no longer work with typed pointers, feel free to pass just one pointer argument to the test.

func.func @atomicrmw_unexpected_vector_element(%ptr : !llvm.ptr, %i32_vec : vector<3xi32>) {
// expected-error@+1 {{expected LLVM IR floating point type for vector element}}
%0 = llvm.atomicrmw fadd %ptr, %i32_vec unordered : !llvm.ptr, vector<3xi32>
llvm.return
Contributor

Can you add a test with a scalable vector as well?

Contributor Author

@joviliast joviliast Oct 3, 2024

Yes, please check https://github.com/llvm/llvm-project/pull/110553/files#diff-ff9a14cb96ea30dc57bad4dc2c44b34d54d57a25777288c26f305279e387f1a1R646
But it exercises the rule introduced in TableGen, not the verifier.
The verifier checks scalability a second time. In my opinion, it is more consistent to have such a check in the verifier than not to have it. WDYT?

Contributor

Ah I see there is also LLVM_AnyFixedVector. I think it is ok to keep the redundant check yes.
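
To illustrate why the second check is redundant but harmless, here is a minimal standalone C++ sketch. It assumes, as described in this thread, that the TableGen operand constraint already rejects scalable vectors before the verifier runs. `VecType`, `passesTypeConstraint`, `verifierRejectsScalable`, and `countVerifierOnlyRejections` are hypothetical names that only model the two layers; they are not MLIR APIs.

```cpp
#include <vector>

// Hypothetical model of a vector operand type; not the MLIR type system.
struct VecType {
  bool isFloatElem;
  bool isScalable;
};

// Layer 1: models the TableGen operand constraint, assumed here to admit
// only fixed vectors of floating-point elements.
bool passesTypeConstraint(const VecType &t) {
  return t.isFloatElem && !t.isScalable;
}

// Layer 2: models the extra scalability check kept in the verifier.
bool verifierRejectsScalable(const VecType &t) { return t.isScalable; }

// Counts the types that get past layer 1 and are then rejected by the
// scalability check in layer 2.
int countVerifierOnlyRejections(const std::vector<VecType> &types) {
  int count = 0;
  for (const VecType &t : types)
    if (passesTypeConstraint(t) && verifierRejectsScalable(t))
      ++count;
  return count;
}
```

Under that assumption the count is zero for any input, which matches the observation that a test with a scalable vector exercises the type constraint rather than the verifier.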

@joviliast joviliast force-pushed the atomicrmw-vector-operand branch from 39ed304 to cb11888 Compare October 3, 2024 15:04
This PR fixes `LLVM_AtomicRMWOp` allowed semantics and verifier logic to
enable building of `LLVM_AtomicRMWOp` with fixed vectors of compatible fp values
as operands for fp rmw operation.

See also: https://llvm.org/docs/LangRef.html#id231

Signed-off-by: Ilya Veselov <[email protected]>
@giuseros
Contributor

giuseros commented Oct 3, 2024

Hi @arsenm, could you have another look?

@giuseros giuseros merged commit 1b4b0c4 into llvm:main Oct 3, 2024
8 checks passed

github-actions bot commented Oct 3, 2024

@joviliast Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

antiagainst added a commit to triton-lang/triton that referenced this pull request Oct 4, 2024
This updates LLVM to pull in two fixes we need for AMD:

* llvm/llvm-project#110553
* llvm/llvm-project#104743

Fixed `LLVM::CallOp` and `LLVM::CallIntrinsicOp` builder API after
* llvm/llvm-project#108933
Luosuu pushed a commit to Luosuu/triton that referenced this pull request Nov 13, 2024
This updates LLVM to pull in two fixes we need for AMD:

* llvm/llvm-project#110553
* llvm/llvm-project#104743

Fixed `LLVM::CallOp` and `LLVM::CallIntrinsicOp` builder API after
* llvm/llvm-project#108933
bertmaher pushed a commit to bertmaher/triton that referenced this pull request Dec 10, 2024
This updates LLVM to pull in two fixes we need for AMD:

* llvm/llvm-project#110553
* llvm/llvm-project#104743

Fixed `LLVM::CallOp` and `LLVM::CallIntrinsicOp` builder API after
* llvm/llvm-project#108933