[LLD] Adding a modern target-selection flag to the drivers #97124

Closed
Ericson2314 opened this issue Jun 28, 2024 · 5 comments
Labels
enhancement, lld:ELF

Comments

@Ericson2314

LLD's frontends are currently very faithful to the linkers they are based on, but that means the target-selection mechanisms they have are rather underpowered. I think it would be good to have modern flags that would allow us to set the EM_* choice, ELFOSABI_* choice, and ELFKind independently (for valid combinations).

A good use for this would be better handling of the "OSABI" field. For example:

Problems

FreeBSD existing hacks

if (s.ends_with("_fbsd")) {
  s = s.drop_back(5);
  osabi = ELFOSABI_FREEBSD;
}

is an ad-hoc hack for FreeBSD. The corresponding code in Clang to use it is even uglier:

// Explicitly set the linker emulation for platforms that might not
// be the default emulation for the linker.
switch (Arch) {
case llvm::Triple::x86:
  CmdArgs.push_back("-m");
  CmdArgs.push_back("elf_i386_fbsd");
  break;
case llvm::Triple::ppc:
  CmdArgs.push_back("-m");
  CmdArgs.push_back("elf32ppc_fbsd");
  break;
case llvm::Triple::ppcle:
  CmdArgs.push_back("-m");
  // Use generic -- only usage is for freestanding.
  CmdArgs.push_back("elf32lppc");
  break;
case llvm::Triple::mips:
  CmdArgs.push_back("-m");
  CmdArgs.push_back("elf32btsmip_fbsd");
  break;
case llvm::Triple::mipsel:
  CmdArgs.push_back("-m");
  CmdArgs.push_back("elf32ltsmip_fbsd");
  break;
case llvm::Triple::mips64:
  CmdArgs.push_back("-m");
  if (tools::mips::hasMipsAbiArg(Args, "n32"))
    CmdArgs.push_back("elf32btsmipn32_fbsd");
  else
    CmdArgs.push_back("elf64btsmip_fbsd");
  break;
case llvm::Triple::mips64el:
  CmdArgs.push_back("-m");
  if (tools::mips::hasMipsAbiArg(Args, "n32"))
    CmdArgs.push_back("elf32ltsmipn32_fbsd");
  else
    CmdArgs.push_back("elf64ltsmip_fbsd");
  break;
case llvm::Triple::riscv64:
  CmdArgs.push_back("-m");
  CmdArgs.push_back("elf64lriscv");
  break;
default:
  break;
}

If Clang could use "regular" code to transform the CPU into a -m flag, and then separately tell LLD that the ELFOSABI_* is ELFOSABI_FREEBSD, that would be much cleaner.
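To make the "much cleaner" shape concrete, here is a rough sketch of what the driver side might look like if the emulation and the OSABI were passed separately. Everything here is illustrative: the `Arch` and `OS` enums are stand-ins for `llvm::Triple`, the helper names are made up, and `--osabi=` is a hypothetical flag spelling that LLD does not have today.

```cpp
#include <optional>
#include <string>
#include <vector>

// Stand-ins for llvm::Triple::ArchType / OSType (hypothetical, for illustration).
enum class Arch { X86, PPC, PPCLE, RISCV64 };
enum class OS { FreeBSD, Linux };

// OS-independent step: map the architecture to a plain -m emulation name.
std::optional<std::string> emulationFor(Arch arch) {
  switch (arch) {
  case Arch::X86:     return std::string("elf_i386");
  case Arch::PPC:     return std::string("elf32ppc");
  case Arch::PPCLE:   return std::string("elf32lppc");
  case Arch::RISCV64: return std::string("elf64lriscv");
  }
  return std::nullopt;
}

// Arch-independent step: map the OS to an OSABI name, if it needs one.
std::optional<std::string> osabiNameFor(OS os) {
  if (os == OS::FreeBSD)
    return std::string("freebsd");
  return std::nullopt;
}

// Build the target-selection part of the linker command line.
std::vector<std::string> linkerTargetArgs(Arch arch, OS os) {
  std::vector<std::string> args;
  if (auto emu = emulationFor(arch)) {
    args.push_back("-m");
    args.push_back(*emu);
  }
  // "--osabi=" is an assumed spelling, not an existing LLD flag.
  if (auto abi = osabiNameFor(os))
    args.push_back("--osabi=" + *abi);
  return args;
}
```

With this split, `linkerTargetArgs(Arch::X86, OS::FreeBSD)` would yield `-m elf_i386 --osabi=freebsd`, and no per-OS copy of the architecture switch is needed.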

OpenBSD has similar needs

As discussed in #92675, we ought to have CI for OpenBSD, but OpenBSD has some outstanding downstream changes that need to be upstreamed before upstream-tool-produced binaries will work. Many of those changes today assume OpenBSD->OpenBSD native compilation and so are unfit to upstream as is.

#97122 is the first such patch I've rebased. This is a somewhat borderline case, as there is already blanket handling of .openbsd.random regardless of the ELFOSABI_* in use. Still, "stealing names" from all ELF usages of LLD doesn't seem very elegant, even if the .openbsd.random case is grandfathered in. I would much rather start requiring ELFOSABI_OPENBSD, with the .openbsd.random case deprecated with a warning. That said, if doing this would require an _obsd hack like FreeBSD's _fbsd, I can't help but think the medicine is almost as bad as the disease.
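As a sketch of the policy described above (not code from #97122 or from LLD), the decision could look roughly like this: OpenBSD-specific section names get special treatment only under ELFOSABI_OPENBSD, while the grandfathered name keeps working elsewhere but produces a deprecation warning. The function name, the `Handling` enum, and the warning text are all assumptions; the section name is spelled as in the text above.

```cpp
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

constexpr std::uint8_t ELFOSABI_OPENBSD = 12; // value from the ELF specification

enum class Handling { Ordinary, OpenBSDSpecial };

bool startsWith(std::string_view s, std::string_view prefix) {
  return s.substr(0, prefix.size()) == prefix;
}

// Hypothetical policy function: decide how a section name is treated.
Handling classifySection(std::string_view name, std::uint8_t osabi,
                         std::vector<std::string> &warnings) {
  if (!startsWith(name, ".openbsd."))
    return Handling::Ordinary;
  // With the right OSABI, OpenBSD-specific names are honored.
  if (osabi == ELFOSABI_OPENBSD)
    return Handling::OpenBSDSpecial;
  // Grandfathered case: still honored without the OSABI, but deprecated.
  if (startsWith(name, ".openbsd.random")) {
    warnings.push_back(std::string(name) +
                       " without ELFOSABI_OPENBSD is deprecated");
    return Handling::OpenBSDSpecial;
  }
  // Any other .openbsd.* name is just an ordinary section elsewhere.
  return Handling::Ordinary;
}
```

The point of the sketch is that new OpenBSD-only names stop leaking into every ELF link, while existing users of the old name get a migration path.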

Solutions

Proposal A: --target flag

Add a flag for "regular" LLVM triples (like Clang's --target or LLVM's -mtriple) so we have more expressive power. It should be possible to map most of those triples to the choices above, and -m flags could still be used to fill in the gaps.

Advantages:

  • "no new syntax"
  • tools like Clang can just forward their --target argument as-is and hope for the best

Disadvantages:

  • LLVM triples can say both too much and too little
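For a feel of what Proposal A involves, here is a minimal sketch of mapping a triple string onto the three choices. It deliberately avoids `llvm::Triple` so that it stands alone; the `ElfTarget` struct and function names are made up, and only a handful of architectures are covered. The numeric constants are the standard `EM_*` and `ELFOSABI_*` values.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical bundle of the three independent choices.
struct ElfTarget {
  std::uint16_t machine; // EM_* value
  bool is64;             // together with littleEndian, this is the ELFKind
  bool littleEndian;
  std::uint8_t osabi;    // ELFOSABI_* value
};

std::vector<std::string_view> splitTriple(std::string_view s) {
  std::vector<std::string_view> parts;
  while (true) {
    std::size_t pos = s.find('-');
    parts.push_back(s.substr(0, pos));
    if (pos == std::string_view::npos)
      break;
    s.remove_prefix(pos + 1);
  }
  return parts;
}

bool startsWith(std::string_view s, std::string_view prefix) {
  return s.substr(0, prefix.size()) == prefix;
}

std::optional<ElfTarget> targetFromTriple(std::string_view triple) {
  std::vector<std::string_view> parts = splitTriple(triple);
  ElfTarget t{};
  std::string_view arch = parts[0];
  if (arch == "i386")         t = {3, false, true, 0};   // EM_386
  else if (arch == "x86_64")  t = {62, true, true, 0};   // EM_X86_64
  else if (arch == "aarch64") t = {183, true, true, 0};  // EM_AARCH64
  else if (arch == "riscv64") t = {243, true, true, 0};  // EM_RISCV
  else return std::nullopt;
  // The vendor and any OS version suffix are ignored: the triple says
  // more than the linker needs here.
  for (std::size_t i = 1; i < parts.size(); ++i) {
    if (startsWith(parts[i], "freebsd"))      t.osabi = 9;  // ELFOSABI_FREEBSD
    else if (startsWith(parts[i], "openbsd")) t.osabi = 12; // ELFOSABI_OPENBSD
  }
  return t;
}
```

The ignored vendor and version fields illustrate the "too much" side. The "too little" side is what this function cannot derive from the triple alone and would still need -m for, such as the MIPS n32 case in the Clang switch above.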

Proposal B: New greenfield flags

Add multiple new flags for specifying these parts independently. Certainly ELFOSABI_ needs one. -m does the EM_ and ELFKind residual alright, perhaps, or perhaps they get fresh new flags too.

Advantages: No syntax vs semantics mismatch / friction / corner cases.

Disadvantage: Greenfield new flags for other tooling to have to learn about.
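A minimal sketch of the linker side of Proposal B, assuming a hypothetical `--osabi=NAME` flag (this spelling is invented for illustration and is not an existing LLD option). The existing `_fbsd` suffix keeps working, and an explicit flag, when present, takes precedence over the suffix.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

constexpr std::uint8_t ELFOSABI_NONE = 0;
constexpr std::uint8_t ELFOSABI_FREEBSD = 9;
constexpr std::uint8_t ELFOSABI_OPENBSD = 12;

// Hypothetical result of target selection.
struct TargetSelection {
  std::string emulation;
  std::uint8_t osabi = ELFOSABI_NONE;
};

bool startsWith(std::string_view s, std::string_view prefix) {
  return s.substr(0, prefix.size()) == prefix;
}

bool endsWith(std::string_view s, std::string_view suffix) {
  return s.size() >= suffix.size() &&
         s.substr(s.size() - suffix.size()) == suffix;
}

// Returns std::nullopt if an unknown OSABI name is given.
std::optional<TargetSelection>
parseTargetFlags(const std::vector<std::string> &args) {
  TargetSelection sel;
  std::optional<std::uint8_t> explicitOsabi;
  for (std::size_t i = 0; i < args.size(); ++i) {
    std::string_view arg = args[i];
    if (arg == "-m" && i + 1 < args.size()) {
      std::string_view emu = args[++i];
      // Legacy path: the _fbsd suffix implies the FreeBSD OSABI.
      bool fbsd = endsWith(emu, "_fbsd");
      if (fbsd)
        emu.remove_suffix(5);
      sel.osabi = fbsd ? ELFOSABI_FREEBSD : ELFOSABI_NONE;
      sel.emulation = std::string(emu);
    } else if (startsWith(arg, "--osabi=")) {
      std::string_view name = arg.substr(8);
      if (name == "none")         explicitOsabi = ELFOSABI_NONE;
      else if (name == "freebsd") explicitOsabi = ELFOSABI_FREEBSD;
      else if (name == "openbsd") explicitOsabi = ELFOSABI_OPENBSD;
      else return std::nullopt;
    }
  }
  if (explicitOsabi)
    sel.osabi = *explicitOsabi;
  return sel;
}
```

Under these assumptions, `-m elf_i386 --osabi=openbsd` selects OpenBSD without needing an `_obsd` emulation suffix, while `-m elf_i386_fbsd` behaves as it does today.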


CC @brad0 because OpenBSD

CC @mstorsjo because I am curious if Windows stuff has similar needs / not sure what to do about lld-link (clang-cl takes --target and many other GNU-style flags, but lld-link only takes /flag MS-style-flags)

@llvmbot

llvmbot commented Jun 28, 2024

@llvm/issue-subscribers-lld-elf

Author: John Ericson (Ericson2314)


@MaskRay

MaskRay commented Jun 29, 2024

We can do something more lightweight. Does #97144 work for you?
OpenBSD can make all relocatable files tagged with ELFOSABI_OPENBSD. It could also ensure that Scrt1.o/crtbeginS.o are tagged.
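One way a linker could benefit from tagged relocatable files is to read the OSABI byte from its inputs instead of taking it from a flag. The sketch below illustrates that general idea only and is not a description of what #97144 implements. `Ident`, `isElf`, and `inferOsabi` are made-up names; `EI_OSABI` is byte 7 of `e_ident` per the ELF specification.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t EI_NIDENT = 16; // size of e_ident
constexpr std::size_t EI_OSABI = 7;   // index of the OSABI byte in e_ident

// The first 16 bytes of an ELF file.
using Ident = std::array<std::uint8_t, EI_NIDENT>;

bool isElf(const Ident &id) {
  return id[0] == 0x7f && id[1] == 'E' && id[2] == 'L' && id[3] == 'F';
}

// Return the first non-zero OSABI found among the inputs, or 0
// (ELFOSABI_NONE) if no input is tagged.
std::uint8_t inferOsabi(const std::vector<Ident> &inputs) {
  for (const Ident &id : inputs)
    if (isElf(id) && id[EI_OSABI] != 0)
      return id[EI_OSABI];
  return 0;
}
```

With something like this, a tagged startup object such as Scrt1.o or crtbeginS.o would be enough to select the OSABI for the whole link.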

While OpenBSD has proposed many interesting ideas for security hardening, I am deeply concerned about its development practices and the rapid pace of introducing questionable extensions. Particularly on the object file format side, the frequent addition of new .openbsd.* sections and PT_OPENBSD_* program headers raises my eyebrows.
In getOutputSectionName, the number of prefixes matters for performance.

I hope that there is special support to merge .openbsd.xxx.$unique into .openbsd.xxx. Just define .openbsd.xxx instead of .openbsd.xxx.$unique.
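To illustrate why the number of prefixes matters, here is a simplified sketch of prefix-based output section naming. It is not LLD's actual getOutputSectionName, and the prefix list is illustrative only. Every input section name is compared against each known prefix in turn, so each extra prefix adds work per input section.

```cpp
#include <string_view>

// True if `name` is exactly `prefix` or `prefix` followed by ".something".
bool isSectionPrefix(std::string_view prefix, std::string_view name) {
  return name.substr(0, prefix.size()) == prefix &&
         (name.size() == prefix.size() || name[prefix.size()] == '.');
}

// Map an input section name to its output section name.
std::string_view outputSectionName(std::string_view name) {
  // Illustrative subset of prefixes; a real linker has a longer list.
  static constexpr std::string_view prefixes[] = {".text", ".data", ".rodata",
                                                  ".bss"};
  for (std::string_view p : prefixes)
    if (isSectionPrefix(p, name))
      return p;
  // No prefix matched: the section keeps its own name.
  return name;
}
```

In this sketch, a section named `.openbsd.xxx` needs no entry in the list, whereas supporting `.openbsd.xxx.$unique` would require adding a prefix that is checked for every input section.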

@mstorsjo

> because I am curious if Windows stuff has similar needs

Not in particular, I think. The main distinction, probably somewhat similar to the target OS of an ELF object, is about whether it targets the mingw or MSVC ABI. And we already handle that in LLD, by using two separate entry points (ld.lld with a windows -m parameter, or lld-link), and the mingw entry point invokes lld-link with the parameter -lldmingw.

> not sure what to do about lld-link (clang-cl takes --target and many other GNU-style flags, but lld-link only takes /flag MS-style-flags)

I don't think it's needed, but you could invent an lld-link style spelling of it, e.g. /target:<triple>. If you had Clang pass it automatically, that should only be done when we know the linker is lld-link and not plain link.exe. In practice, though, when linking with lld-link you seldom use the compiler driver (clang) to invoke the linker; the user or build system most often invokes lld-link directly. So in that case, we wouldn't implicitly get the target triple for free anyway, unless we teach all build systems to pass it.

But as said above, I don't see a direct need for it at all.

@Ericson2314

Ericson2314 commented Jun 30, 2024

> We can do something more lightweight. Does #97144 work for you?

Yes, I think it does! Assuming Clang already sets that or can easily be made to do so, which I think is true.

@mstorsjo Thanks for the feedback. It's nice that @MaskRay's solution sidesteps the need for a new flag. clang-cl already takes --target so I think we are good here!

@Ericson2314

Closing because #97144 does work for me :)
