Add clang-linker-wrapper changes to call clang-sycl-linker for SYCL offloads #6

Draft
wants to merge 5 commits into
base: master
2 changes: 2 additions & 0 deletions clang/docs/ClangOffloadPackager.rst
@@ -112,6 +112,8 @@ the following values for the :ref:`offload kind<table-offload_kind>` and the
+------------+-------+---------------------------------------+
| OFK_HIP | 0x03 | The producer was HIP |
+------------+-------+---------------------------------------+
| OFK_SYCL | 0x04 | The producer was SYCL |
+------------+-------+---------------------------------------+

The flags are used to signify certain conditions, such as the presence of
debugging information or whether or not LTO was used. The string entry table is
10 changes: 10 additions & 0 deletions clang/test/Driver/linker-wrapper.c
@@ -2,13 +2,15 @@
// REQUIRES: x86-registered-target
// REQUIRES: nvptx-registered-target
// REQUIRES: amdgpu-registered-target
// REQUIRES: spirv-registered-target

// An externally visible variable so static libraries extract.
__attribute__((visibility("protected"), used)) int x;

// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.elf.o
// RUN: %clang -cc1 %s -triple nvptx64-nvidia-cuda -emit-llvm-bc -o %t.nvptx.bc
// RUN: %clang -cc1 %s -triple amdgcn-amd-amdhsa -emit-llvm-bc -o %t.amdgpu.bc
// RUN: %clang -cc1 %s -triple spirv64-unknown-unknown -emit-llvm-bc -o %t.spirv.bc

// RUN: clang-offload-packager -o %t.out \
// RUN: --image=file=%t.elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70 \
@@ -49,6 +51,14 @@ __attribute__((visibility("protected"), used)) int x;

// AMDGPU-LTO-TEMPS: clang{{.*}} --target=amdgcn-amd-amdhsa -mcpu=gfx1030 -flto {{.*}}-save-temps

// RUN: clang-offload-packager -o %t.out \
// RUN: --image=file=%t.spirv.bc,kind=sycl,triple=spirv64-unknown-unknown,arch=generic
// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o -fembed-offload-object=%t.out
// RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run \
// RUN: --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=SPIRV-LINK

// SPIRV-LINK: clang{{.*}} -o {{.*}}.img --target=spirv64-unknown-unknown {{.*}}.o --sycl-link -Xlinker -triple=spirv64-unknown-unknown -Xlinker -arch=

// RUN: clang-offload-packager -o %t.out \
// RUN: --image=file=%t.elf.o,kind=openmp,triple=x86_64-unknown-linux-gnu \
// RUN: --image=file=%t.elf.o,kind=openmp,triple=x86_64-unknown-linux-gnu
71 changes: 67 additions & 4 deletions clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -464,7 +464,8 @@ fatbinary(ArrayRef<std::pair<StringRef, StringRef>> InputFiles,
} // namespace amdgcn

namespace generic {
Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args) {
Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args,
bool HasSYCLOffloadKind = false) {
llvm::TimeTraceScope TimeScope("Clang");
// Use `clang` to invoke the appropriate device tools.
Expected<std::string> ClangPath =
@@ -554,6 +555,17 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args) {
if (Args.hasArg(OPT_embed_bitcode))
CmdArgs.push_back("-Wl,--lto-emit-llvm");

// For linking device code with the SYCL offload kind, special handling is
// required. Passing --sycl-link to clang results in a call to
// clang-sycl-linker. Additional linker flags required by clang-sycl-linker
// will be communicated via the -Xlinker option.
if (HasSYCLOffloadKind) {
CmdArgs.push_back("--sycl-link");
CmdArgs.append(
{"-Xlinker", Args.MakeArgString("-triple=" + Triple.getTriple())});
CmdArgs.append({"-Xlinker", Args.MakeArgString("-arch=" + Arch)});
}

for (StringRef Arg : Args.getAllArgValues(OPT_linker_arg_EQ))
CmdArgs.append({"-Xlinker", Args.MakeArgString(Arg)});
for (StringRef Arg : Args.getAllArgValues(OPT_compiler_arg_EQ))
@@ -567,7 +579,8 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args) {
} // namespace generic

Expected<StringRef> linkDevice(ArrayRef<StringRef> InputFiles,
const ArgList &Args) {
const ArgList &Args,
bool HasSYCLOffloadKind = false) {

Is there any way we can avoid this? omp/hip/cuda seem to work without special handling here, and it would be ideal if SYCL did as well.

@asudarsa (Owner, Author), Apr 11, 2025

As it stands today, SYCL offloads require calling clang with the --sycl-link option, which invokes clang-sycl-linker to handle device code splitting and other SYCL-specific steps. omp/cuda/hip have no such requirement: they need no special handling, and no special options passed to clang, during the linking stage.

clang-linker-wrapper is designed so that the offload kind does not matter during device code linking; only the targets matter. Unfortunately, for SYCL offload, it is not clear how we can avoid this special handling during device code linking.

We can eventually try to align the SYCL device code linking flow with the community flow, but we need a SYCL device code linking flow in place first.

Please let me know if I am missing something here.

Thanks
P.S.: I have tried to keep the deviation as clean as possible here. Any tips to make this better would be great.
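
To make the special handling concrete, here is a minimal standalone sketch of the extra arguments described above. It uses `std::vector<std::string>` in place of LLVM's `ArgList`, and `buildDeviceLinkArgs` is a hypothetical name, not a function in clang-linker-wrapper. Only the `--sycl-link`, `-Xlinker -triple=` and `-Xlinker -arch=` arguments are taken from the diff; the rest of the command line is simplified.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: models the part of the clang command line that the
// linker wrapper assembles for one device target. Not part of LLVM.
std::vector<std::string> buildDeviceLinkArgs(const std::string &Triple,
                                             const std::string &Arch,
                                             bool HasSYCLOffloadKind) {
  assert(!Triple.empty() && "a target triple is required");
  std::vector<std::string> CmdArgs = {"clang", "--target=" + Triple};

  // Only SYCL inputs get the extra flags; the other offload kinds are linked
  // by target alone, with no kind-specific options.
  if (HasSYCLOffloadKind) {
    CmdArgs.push_back("--sycl-link");
    CmdArgs.push_back("-Xlinker");
    CmdArgs.push_back("-triple=" + Triple);
    CmdArgs.push_back("-Xlinker");
    CmdArgs.push_back("-arch=" + Arch);
  }
  return CmdArgs;
}
```

With `HasSYCLOffloadKind` set to false the function returns only the base command, which is the kind-agnostic behaviour the reviewer was asking about.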


Discussed offline; this makes sense for now.

@asudarsa (Owner, Author), Apr 14, 2025

@sarnex, I added a detailed PR description, as we discussed offline.

Thanks

const llvm::Triple Triple(Args.getLastArgValue(OPT_triple_EQ));
switch (Triple.getArch()) {
case Triple::nvptx:
@@ -582,7 +595,7 @@ Expected<StringRef> linkDevice(ArrayRef<StringRef> InputFiles,
case Triple::spirv64:
case Triple::systemz:
case Triple::loongarch64:
return generic::clang(InputFiles, Args);
return generic::clang(InputFiles, Args, HasSYCLOffloadKind);
default:
return createStringError(Triple.getArchName() +
" linking is not supported");
@@ -924,9 +937,20 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
auto LinkerArgs = getLinkerArgs(Input, BaseArgs);

DenseSet<OffloadKind> ActiveOffloadKinds;
for (const auto &File : Input)
// Currently, SYCL device code linking process differs from generic device
// code linking.
// TODO: Remove check for offload kind, once SYCL device code linking is
// aligned with generic linking.
bool HasSYCLOffloadKind = false;
bool HasNonSYCLOffloadKind = false;
for (const auto &File : Input) {
if (File.getBinary()->getOffloadKind() != OFK_None)
ActiveOffloadKinds.insert(File.getBinary()->getOffloadKind());
if (File.getBinary()->getOffloadKind() == OFK_SYCL)
HasSYCLOffloadKind = true;
else
HasNonSYCLOffloadKind = true;
}

// Write any remaining device inputs to an output file.
SmallVector<StringRef> InputFiles;
@@ -937,13 +961,47 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
InputFiles.emplace_back(*FileNameOrErr);
}

if (HasSYCLOffloadKind) {
// Link the remaining device files using the device linker.
auto OutputOrErr = linkDevice(InputFiles, LinkerArgs, HasSYCLOffloadKind);
if (!OutputOrErr)
return OutputOrErr.takeError();
// Output is a packaged object of device images. Unpackage the images and
// copy them to Images[Kind]
ErrorOr<std::unique_ptr<MemoryBuffer>> BufferOrErr =
MemoryBuffer::getFileOrSTDIN(*OutputOrErr);
if (std::error_code EC = BufferOrErr.getError())
return createFileError(*OutputOrErr, EC);

MemoryBufferRef Buffer = **BufferOrErr;
SmallVector<OffloadFile> Binaries;
if (Error Err = extractOffloadBinaries(Buffer, Binaries))
return std::move(Err);
for (auto &OffloadFile : Binaries) {
auto TheBinary = OffloadFile.getBinary();
OffloadingImage TheImage{};
TheImage.TheImageKind = TheBinary->getImageKind();
TheImage.TheOffloadKind = TheBinary->getOffloadKind();
TheImage.StringData["triple"] = TheBinary->getTriple();
TheImage.StringData["arch"] = TheBinary->getArch();
TheImage.Image = MemoryBuffer::getMemBufferCopy(TheBinary->getImage());
Images[OFK_SYCL].emplace_back(std::move(TheImage));
}
}

if (!HasNonSYCLOffloadKind)
Collaborator

I went back and forth on whether this is necessary, or whether we can just check ActiveOffloadKinds for kinds other than OFK_SYCL.
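
The alternative raised here could look roughly like the standalone sketch below, which derives the answer from the set of active kinds instead of tracking a separate flag. The enum mirrors the values in this PR's OffloadBinary.h change (with `OFK_None = 0` assumed); `hasNonSYCLKind` is a hypothetical name, and `std::set` stands in for `DenseSet`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>

// Mirrors the enum in the OffloadBinary.h hunk; OFK_None = 0 is assumed.
enum OffloadKind : uint16_t {
  OFK_None = 0,
  OFK_OpenMP,
  OFK_Cuda,
  OFK_HIP,
  OFK_SYCL,
  OFK_LAST,
};

// Hypothetical helper: true if any active kind is something other than SYCL.
bool hasNonSYCLKind(const std::set<OffloadKind> &ActiveKinds) {
  // In the diff, OFK_None is never inserted into ActiveOffloadKinds.
  assert(ActiveKinds.count(OFK_None) == 0 && "OFK_None is never active");
  return std::any_of(ActiveKinds.begin(), ActiveKinds.end(),
                     [](OffloadKind K) { return K != OFK_SYCL; });
}
```

One difference to keep in mind: in the diff, HasNonSYCLOffloadKind is also set for inputs whose kind is OFK_None, and those are never inserted into ActiveOffloadKinds, so the two checks can disagree when such inputs are present.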

return Error::success();

// Link the remaining device files using the device linker.
auto OutputOrErr = linkDevice(InputFiles, LinkerArgs);
if (!OutputOrErr)
return OutputOrErr.takeError();

// Store the offloading image for each linked output file.
for (OffloadKind Kind : ActiveOffloadKinds) {
// For SYCL, Offloading images were created inside clang-sycl-linker
if (Kind == OFK_SYCL)
continue;
llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> FileOrErr =
llvm::MemoryBuffer::getFileOrSTDIN(*OutputOrErr);
if (std::error_code EC = FileOrErr.getError()) {
@@ -986,6 +1044,11 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
A.StringData["arch"] > B.StringData["arch"] ||
A.TheOffloadKind < B.TheOffloadKind;
});
if (Kind == OFK_SYCL) {
// TODO: Update once SYCL offload wrapping logic is available.
reportError(
createStringError("SYCL offload wrapping logic is not available"));
}
auto BundledImagesOrErr = bundleLinkedOutput(Input, Args, Kind);
if (!BundledImagesOrErr)
return BundledImagesOrErr.takeError();
84 changes: 65 additions & 19 deletions clang/tools/clang-sycl-linker/ClangSYCLLinker.cpp
@@ -70,6 +70,8 @@ static StringRef OutputFile;
/// Directory to dump SPIR-V IR if requested by user.
static SmallString<128> SPIRVDumpDir;

using OffloadingImage = OffloadBinary::OffloadingImage;

static void printVersion(raw_ostream &OS) {
OS << clang::getClangToolFullVersion("clang-sycl-linker") << '\n';
}
@@ -168,10 +170,10 @@ Expected<SmallVector<std::string>> getInput(const ArgList &Args) {
/// are LLVM IR bitcode files.
// TODO: Support SPIR-V IR files.
Expected<std::unique_ptr<Module>> getBitcodeModule(StringRef File,
LLVMContext &C) {
LLVMContext &Ctx) {
SMDiagnostic Err;

auto M = getLazyIRFileModule(File, Err, C);
auto M = getLazyIRFileModule(File, Err, Ctx);
if (M)
return std::move(M);
return createStringError(Err.getMessage());
@@ -211,16 +213,16 @@ Expected<SmallVector<std::string>> getSYCLDeviceLibs(const ArgList &Args) {
/// 3. Link all the images gathered in Step 2 with the output of Step 1 using
/// linkInModule API. LinkOnlyNeeded flag is used.
Expected<StringRef> linkDeviceCode(ArrayRef<std::string> InputFiles,
const ArgList &Args, LLVMContext &C) {
const ArgList &Args, LLVMContext &Ctx) {
llvm::TimeTraceScope TimeScope("SYCL link device code");

assert(InputFiles.size() && "No inputs to link");

auto LinkerOutput = std::make_unique<Module>("sycl-device-link", C);
auto LinkerOutput = std::make_unique<Module>("sycl-device-link", Ctx);
Linker L(*LinkerOutput);
// Link SYCL device input files.
for (auto &File : InputFiles) {
auto ModOrErr = getBitcodeModule(File, C);
auto ModOrErr = getBitcodeModule(File, Ctx);
if (!ModOrErr)
return ModOrErr.takeError();
if (L.linkInModule(std::move(*ModOrErr)))
@@ -235,7 +237,7 @@ Expected<StringRef> linkDeviceCode(ArrayRef<std::string> InputFiles,
// Link in SYCL device library files.
const llvm::Triple Triple(Args.getLastArgValue(OPT_triple_EQ));
for (auto &File : *SYCLDeviceLibFiles) {
auto LibMod = getBitcodeModule(File, C);
auto LibMod = getBitcodeModule(File, Ctx);
if (!LibMod)
return LibMod.takeError();
if ((*LibMod)->getTargetTriple() == Triple) {
@@ -278,18 +280,18 @@ Expected<StringRef> linkDeviceCode(ArrayRef<std::string> InputFiles,
/// Converts 'File' from LLVM bitcode to SPIR-V format using SPIR-V backend.
/// 'Args' encompasses all arguments required for linking device code and will
/// be parsed to generate options required to be passed into the backend.
static Expected<StringRef> runSPIRVCodeGen(StringRef File, const ArgList &Args,
LLVMContext &C) {
static Error runSPIRVCodeGen(StringRef File, const ArgList &Args,
StringRef OutputFile, LLVMContext &Ctx) {
llvm::TimeTraceScope TimeScope("SPIR-V code generation");

// Parse input module.
SMDiagnostic Err;
std::unique_ptr<Module> M = parseIRFile(File, Err, C);
SMDiagnostic E;
std::unique_ptr<Module> M = parseIRFile(File, E, Ctx);
if (!M)
return createStringError(Err.getMessage());
return createStringError(E.getMessage());

if (Error Err = M->materializeAll())
return std::move(Err);
return Err;

Triple TargetTriple(Args.getLastArgValue(OPT_triple_EQ));
M->setTargetTriple(TargetTriple);
@@ -333,7 +335,7 @@ static Expected<StringRef> runSPIRVCodeGen(StringRef File, const ArgList &Args,
errs() << formatv("SPIR-V Backend: input: {0}, output: {1}\n", File,
OutputFile);

return OutputFile;
return Error::success();
}

/// Performs the following steps:
@@ -342,17 +344,61 @@ static Expected<StringRef> runSPIRVCodeGen(StringRef File, const ArgList &Args,
Error runSYCLLink(ArrayRef<std::string> Files, const ArgList &Args) {
llvm::TimeTraceScope TimeScope("SYCL device linking");

LLVMContext C;
LLVMContext Ctx;

// Link all input bitcode files and SYCL device library files, if any.
auto LinkedFile = linkDeviceCode(Files, Args, C);
auto LinkedFile = linkDeviceCode(Files, Args, Ctx);
if (!LinkedFile)
reportError(LinkedFile.takeError());

// TODO: SYCL post link functionality involves device code splitting and will
Owner, Author

The code splitting PR is under review here: llvm#131347

Thanks

// result in multiple bitcode files.
// The following lines are placeholders to represent multiple files and will
// be refactored once SYCL post link support is available.
SmallVector<std::string> SplitModules;
SplitModules.emplace_back(*LinkedFile);

// SPIR-V code generation step.
auto SPVFile = runSPIRVCodeGen(*LinkedFile, Args, C);
if (!SPVFile)
return SPVFile.takeError();
for (size_t I = 0, E = SplitModules.size(); I != E; ++I) {
auto Stem = OutputFile.rsplit('.').first;
std::string SPVFile(Stem);
SPVFile.append("_" + utostr(I) + ".spv");
auto Err = runSPIRVCodeGen(SplitModules[I], Args, SPVFile, Ctx);
if (Err)
return std::move(Err);
SplitModules[I] = SPVFile;
}

// Write the final output into file.
int FD = -1;
if (std::error_code EC = sys::fs::openFileForWrite(OutputFile, FD))
return errorCodeToError(EC);
llvm::raw_fd_ostream FS(FD, /*shouldClose=*/true);

for (size_t I = 0, E = SplitModules.size(); I != E; ++I) {
auto File = SplitModules[I];
llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> FileOrErr =
llvm::MemoryBuffer::getFileOrSTDIN(File);
if (std::error_code EC = FileOrErr.getError()) {
if (DryRun)
FileOrErr = MemoryBuffer::getMemBuffer("");
else
return createFileError(File, EC);
}
OffloadingImage TheImage{};
TheImage.TheImageKind = IMG_Object;
TheImage.TheOffloadKind = OFK_SYCL;
TheImage.StringData["triple"] =
Args.MakeArgString(Args.getLastArgValue(OPT_triple_EQ));
TheImage.StringData["arch"] =
Args.MakeArgString(Args.getLastArgValue(OPT_arch_EQ));
TheImage.Image = std::move(*FileOrErr);

llvm::SmallString<0> Buffer = OffloadBinary::write(TheImage);
if (Buffer.size() % OffloadBinary::getAlignment() != 0)
return createStringError("Offload binary has invalid size alignment");
FS << Buffer;
}
return Error::success();
}


nit: remove whitespace changes

@@ -394,7 +440,7 @@ int main(int argc, char **argv) {
DryRun = Args.hasArg(OPT_dry_run);
SaveTemps = Args.hasArg(OPT_save_temps);

OutputFile = "a.spv";
OutputFile = "a.out";
if (Args.hasArg(OPT_o))
OutputFile = Args.getLastArgValue(OPT_o);

1 change: 1 addition & 0 deletions llvm/include/llvm/Object/OffloadBinary.h
@@ -35,6 +35,7 @@ enum OffloadKind : uint16_t {
OFK_OpenMP,
OFK_Cuda,
OFK_HIP,
OFK_SYCL,
OFK_LAST,
};

3 changes: 3 additions & 0 deletions llvm/lib/Object/OffloadBinary.cpp
@@ -301,6 +301,7 @@ OffloadKind object::getOffloadKind(StringRef Name) {
.Case("openmp", OFK_OpenMP)
.Case("cuda", OFK_Cuda)
.Case("hip", OFK_HIP)
.Case("sycl", OFK_SYCL)
.Default(OFK_None);
}

@@ -312,6 +313,8 @@ StringRef object::getOffloadKindName(OffloadKind Kind) {
return "cuda";
case OFK_HIP:
return "hip";
case OFK_SYCL:
return "sycl";
default:
return "none";
}
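
Taken together, the OffloadBinary.h and OffloadBinary.cpp hunks let the string "sycl" and OFK_SYCL round-trip through the name helpers. Below is a self-contained sketch of that mapping, assuming the sequential enum values implied by the documentation table (OFK_HIP = 0x03, OFK_SYCL = 0x04) and using `std::string` instead of `StringRef` and `StringSwitch`. `kindFromName` and `nameFromKind` are hypothetical stand-ins for `object::getOffloadKind` and `object::getOffloadKindName`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Mirrors the enum in the OffloadBinary.h hunk; OFK_None = 0 is assumed.
enum OffloadKind : uint16_t {
  OFK_None = 0,
  OFK_OpenMP,
  OFK_Cuda,
  OFK_HIP,
  OFK_SYCL, // added by this PR; documented as 0x04
  OFK_LAST,
};

static_assert(OFK_SYCL == 0x04, "must match the documented table value");

// Stand-in for object::getOffloadKind: unknown names map to OFK_None.
OffloadKind kindFromName(const std::string &Name) {
  if (Name == "openmp")
    return OFK_OpenMP;
  if (Name == "cuda")
    return OFK_Cuda;
  if (Name == "hip")
    return OFK_HIP;
  if (Name == "sycl")
    return OFK_SYCL;
  return OFK_None;
}

// Stand-in for object::getOffloadKindName: unknown kinds map to "none".
std::string nameFromKind(OffloadKind Kind) {
  assert(Kind < OFK_LAST && "kind out of range");
  switch (Kind) {
  case OFK_OpenMP:
    return "openmp"; // assumed; this case lies outside the hunk shown
  case OFK_Cuda:
    return "cuda";
  case OFK_HIP:
    return "hip";
  case OFK_SYCL:
    return "sycl";
  default:
    return "none";
  }
}
```

This is also what allows `kind=sycl` to be accepted by clang-offload-packager in the new linker-wrapper.c test.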