Skip to content

[LLD][COFF] Add support for including native ARM64 objects in ARM64EC images #137653

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 15, 2025

Conversation

cjacek
Copy link
Contributor

@cjacek cjacek commented Apr 28, 2025

MSVC linker accepts native ARM64 object files as input with -machine:arm64ec, similar to -machine:arm64x. Its usefulness is very limited; for example, both exports and imports are not reflected in the PE structures and can't work. However, their symbol tables are otherwise functional.

Since we already have handling of multiple symbol tables implemented for ARM64X, the required changes are mostly about adjusting relevant checks to account for them on the ARM64EC target.

Delay-load helper handling is a bit of a shortcut. The patch never pulls it for native object files and just ensures that the code is fine with that. In general, I think it would be nice to adjust the driver to pull it only when it's actually referenced, which would allow applying the same logic to the native symbol table on ARM64EC without worrying about pulling too much.

@llvmbot
Copy link
Member

llvmbot commented Apr 28, 2025

@llvm/pr-subscribers-platform-windows

@llvm/pr-subscribers-lld

Author: Jacek Caban (cjacek)

Changes

MSVC linker accepts native ARM64 object files as input with -machine:arm64ec, similar to -machine:arm64x. Its usefulness is very limited; for example, both exports and imports are not reflected in the PE structures and can't work. However, their symbol tables are otherwise functional.

Since we already have handling of multiple symbol tables implemented for ARM64X, the required changes are mostly about adjusting relevant checks to account for them on the ARM64EC target.

Delay-load helper handling is a bit of a shortcut. The patch never pulls it for native object files and just ensures that the code is fine with that. In general, I think it would be nice to adjust the driver to pull it only when it's actually referenced, which would allow applying the same logic to the native symbol table on ARM64EC without worrying about pulling too much.


Patch is 49.70 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/137653.diff

23 Files Affected:

  • (modified) lld/COFF/COFFLinkerContext.h (+8)
  • (modified) lld/COFF/Chunks.cpp (+1-1)
  • (modified) lld/COFF/DLL.cpp (+43-31)
  • (modified) lld/COFF/Driver.cpp (+16-15)
  • (modified) lld/COFF/InputFiles.cpp (+1-3)
  • (modified) lld/COFF/SymbolTable.cpp (+1-1)
  • (modified) lld/COFF/Writer.cpp (+7-6)
  • (modified) lld/test/COFF/arm64ec-entry-mangle.test (+1-1)
  • (modified) lld/test/COFF/arm64ec-hybmp.s (+2-2)
  • (modified) lld/test/COFF/arm64ec-lib.test (+3-1)
  • (modified) lld/test/COFF/arm64ec-patchable-thunks.test (+1-1)
  • (modified) lld/test/COFF/arm64ec-range-thunks.s (+5)
  • (modified) lld/test/COFF/arm64ec.test (+5-4)
  • (modified) lld/test/COFF/arm64x-altnames.s (+6)
  • (modified) lld/test/COFF/arm64x-buildid.s (+3)
  • (modified) lld/test/COFF/arm64x-comm.s (+3)
  • (modified) lld/test/COFF/arm64x-crt-sec.s (+3)
  • (modified) lld/test/COFF/arm64x-ctors-sec.s (+4)
  • (modified) lld/test/COFF/arm64x-guardcf.s (+25-17)
  • (modified) lld/test/COFF/arm64x-import.test (+93-57)
  • (modified) lld/test/COFF/arm64x-symtab.s (+16)
  • (modified) lld/test/COFF/arm64x-wrap.s (+4)
  • (modified) lld/test/COFF/autoimport-arm64ec-data.test (+1-1)
diff --git a/lld/COFF/COFFLinkerContext.h b/lld/COFF/COFFLinkerContext.h
index 2c5f6415e5d4b..f45b754384ef9 100644
--- a/lld/COFF/COFFLinkerContext.h
+++ b/lld/COFF/COFFLinkerContext.h
@@ -50,6 +50,14 @@ class COFFLinkerContext : public CommonLinkerContext {
     f(symtab);
   }
 
+  // Invoke the specified callback for each active symbol table,
+  // skipping the native symbol table on pure ARM64EC targets.
+  void forEachActiveSymtab(std::function<void(SymbolTable &symtab)> f) {
+    if (symtab.ctx.config.machine == ARM64X)
+      f(*hybridSymtab);
+    f(symtab);
+  }
+
   std::vector<ObjFile *> objFileInstances;
   std::map<std::string, PDBInputFile *> pdbInputFileInstances;
   std::vector<ImportFile *> importFileInstances;
diff --git a/lld/COFF/Chunks.cpp b/lld/COFF/Chunks.cpp
index ff2bc40932c04..01752cdc6a9da 100644
--- a/lld/COFF/Chunks.cpp
+++ b/lld/COFF/Chunks.cpp
@@ -580,7 +580,7 @@ void SectionChunk::getBaserels(std::vector<Baserel> *res) {
   // to match the value in the EC load config, which is expected to be
   // a relocatable pointer to the __chpe_metadata symbol.
   COFFLinkerContext &ctx = file->symtab.ctx;
-  if (ctx.hybridSymtab && ctx.hybridSymtab->loadConfigSym &&
+  if (ctx.config.machine == ARM64X && ctx.hybridSymtab->loadConfigSym &&
       ctx.hybridSymtab->loadConfigSym->getChunk() == this &&
       ctx.symtab.loadConfigSym &&
       ctx.hybridSymtab->loadConfigSize >=
diff --git a/lld/COFF/DLL.cpp b/lld/COFF/DLL.cpp
index 0440507b71756..c327da28ce138 100644
--- a/lld/COFF/DLL.cpp
+++ b/lld/COFF/DLL.cpp
@@ -560,7 +560,8 @@ class TailMergeChunkARM64 : public NonSectionCodeChunk {
     memcpy(buf, tailMergeARM64, sizeof(tailMergeARM64));
     applyArm64Addr(buf + 44, desc->getRVA(), rva + 44, 12);
     applyArm64Imm(buf + 48, desc->getRVA() & 0xfff, 0);
-    applyArm64Branch26(buf + 52, helper->getRVA() - rva - 52);
+    if (helper)
+      applyArm64Branch26(buf + 52, helper->getRVA() - rva - 52);
   }
 
   Chunk *desc = nullptr;
@@ -781,6 +782,7 @@ void IdataContents::create(COFFLinkerContext &ctx) {
     // ordinal values to the table.
     size_t base = lookups.size();
     Chunk *lookupsTerminator = nullptr, *addressesTerminator = nullptr;
+    uint32_t nativeOnly = 0;
     for (DefinedImportData *s : syms) {
       uint16_t ord = s->getOrdinal();
       HintNameChunk *hintChunk = nullptr;
@@ -806,8 +808,8 @@ void IdataContents::create(COFFLinkerContext &ctx) {
       // the native terminator, they will be ignored in the native view.
       // In the EC view, they should act as terminators, so emit ZEROFILL
       // relocations overriding them.
-      if (ctx.hybridSymtab && !lookupsTerminator && s->file->isEC() &&
-          !s->file->hybridFile) {
+      if (ctx.config.machine == ARM64X && !lookupsTerminator &&
+          s->file->isEC() && !s->file->hybridFile) {
         lookupsTerminator = lookupsChunk;
         addressesTerminator = addressesChunk;
         lookupsChunk = make<NullChunk>(ctx);
@@ -841,6 +843,7 @@ void IdataContents::create(COFFLinkerContext &ctx) {
         // Fill the auxiliary IAT with null chunks for native-only imports.
         auxIat.push_back(make<NullChunk>(ctx));
         auxIatCopy.push_back(make<NullChunk>(ctx));
+        ++nativeOnly;
       }
     }
     // Terminate with null values.
@@ -862,18 +865,15 @@ void IdataContents::create(COFFLinkerContext &ctx) {
     // Create the import table header.
     dllNames.push_back(make<StringChunk>(syms[0]->getDLLName()));
     auto *dir = make<ImportDirectoryChunk>(dllNames.back());
-    dir->lookupTab = lookups[base];
-    dir->addressTab = addresses[base];
-    dirs.push_back(dir);
 
-    if (ctx.hybridSymtab) {
-      // If native-only imports exist, they will appear as a prefix to all
-      // imports. Emit ARM64X relocations to skip them in the EC view.
-      uint32_t nativeOnly =
-          llvm::find_if(syms,
-                        [](DefinedImportData *s) { return s->file->isEC(); }) -
-          syms.begin();
-      if (nativeOnly) {
+    if (ctx.hybridSymtab && nativeOnly) {
+      if (ctx.config.machine != ARM64X)
+        // On pure ARM64EC targets, skip native-only imports in the import
+        // directory.
+        base += nativeOnly;
+      else if (nativeOnly) {
+        // If native-only imports exist, they will appear as a prefix to all
+        // imports. Emit ARM64X relocations to skip them in the EC view.
         ctx.dynamicRelocs->add(
             IMAGE_DVRT_ARM64X_FIXUP_TYPE_DELTA, 0,
             Arm64XRelocVal(
@@ -886,6 +886,10 @@ void IdataContents::create(COFFLinkerContext &ctx) {
             nativeOnly * sizeof(uint64_t));
       }
     }
+
+    dir->lookupTab = lookups[base];
+    dir->addressTab = addresses[base];
+    dirs.push_back(dir);
   }
   // Add null terminator.
   dirs.push_back(make<NullChunk>(sizeof(ImportDirectoryTableEntry), 4));
@@ -922,21 +926,25 @@ void DelayLoadContents::create() {
 
     size_t base = addresses.size();
     ctx.forEachSymtab([&](SymbolTable &symtab) {
-      if (ctx.hybridSymtab && symtab.isEC()) {
-        // For hybrid images, emit null-terminated native import entries
-        // followed by null-terminated EC entries. If a view is missing imports
-        // for a given module, only terminators are emitted. Emit ARM64X
-        // relocations to skip native entries in the EC view.
-        ctx.dynamicRelocs->add(
-            IMAGE_DVRT_ARM64X_FIXUP_TYPE_DELTA, 0,
-            Arm64XRelocVal(dir, offsetof(delay_import_directory_table_entry,
-                                         DelayImportAddressTable)),
-            (addresses.size() - base) * sizeof(uint64_t));
-        ctx.dynamicRelocs->add(
-            IMAGE_DVRT_ARM64X_FIXUP_TYPE_DELTA, 0,
-            Arm64XRelocVal(dir, offsetof(delay_import_directory_table_entry,
-                                         DelayImportNameTable)),
-            (addresses.size() - base) * sizeof(uint64_t));
+      if (symtab.isEC()) {
+        if (ctx.config.machine == ARM64X) {
+          // For hybrid images, emit null-terminated native import entries
+          // followed by null-terminated EC entries. If a view is missing
+          // imports for a given module, only terminators are emitted. Emit
+          // ARM64X relocations to skip native entries in the EC view.
+          ctx.dynamicRelocs->add(
+              IMAGE_DVRT_ARM64X_FIXUP_TYPE_DELTA, 0,
+              Arm64XRelocVal(dir, offsetof(delay_import_directory_table_entry,
+                                           DelayImportAddressTable)),
+              (addresses.size() - base) * sizeof(uint64_t));
+          ctx.dynamicRelocs->add(
+              IMAGE_DVRT_ARM64X_FIXUP_TYPE_DELTA, 0,
+              Arm64XRelocVal(dir, offsetof(delay_import_directory_table_entry,
+                                           DelayImportNameTable)),
+              (addresses.size() - base) * sizeof(uint64_t));
+        } else {
+          base = addresses.size();
+        }
       }
 
       Chunk *tm = nullptr;
@@ -981,7 +989,7 @@ void DelayLoadContents::create() {
           chunk = make<AuxImportChunk>(s->file);
           auxIatCopy.push_back(chunk);
           s->file->auxImpCopySym->setLocation(chunk);
-        } else if (ctx.hybridSymtab) {
+        } else if (ctx.config.machine == ARM64X) {
           // Fill the auxiliary IAT with null chunks for native imports.
           auxIat.push_back(make<NullChunk>(ctx));
           auxIatCopy.push_back(make<NullChunk>(ctx));
@@ -995,6 +1003,10 @@ void DelayLoadContents::create() {
         symtab.addSynthetic(tmName, tm);
       }
 
+      // Skip terminators on pure ARM64EC target if there are no native imports.
+      if (!tm && !symtab.isEC() && ctx.config.machine != ARM64X)
+        return;
+
       // Terminate with null values.
       addresses.push_back(make<NullChunk>(ctx, 8));
       names.push_back(make<NullChunk>(ctx, 8));
@@ -1024,7 +1036,7 @@ void DelayLoadContents::create() {
 }
 
 Chunk *DelayLoadContents::newTailMergeChunk(SymbolTable &symtab, Chunk *dir) {
-  auto helper = cast<Defined>(symtab.delayLoadHelper);
+  auto helper = cast_or_null<Defined>(symtab.delayLoadHelper);
   switch (symtab.machine) {
   case AMD64:
   case ARM64EC:
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index e3ff647209e72..0ad8f9ccec0dc 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -190,7 +190,6 @@ static bool compatibleMachineType(COFFLinkerContext &ctx, MachineTypes mt) {
   case ARM64:
     return mt == ARM64 || mt == ARM64X;
   case ARM64EC:
-    return isArm64EC(mt) || mt == AMD64;
   case ARM64X:
     return isAnyArm64(mt) || mt == AMD64;
   case IMAGE_FILE_MACHINE_UNKNOWN:
@@ -499,7 +498,7 @@ void LinkerDriver::parseDirectives(InputFile *file) {
     case OPT_entry:
       if (!arg->getValue()[0])
         Fatal(ctx) << "missing entry point symbol name";
-      ctx.forEachSymtab([&](SymbolTable &symtab) {
+      ctx.forEachActiveSymtab([&](SymbolTable &symtab) {
         symtab.entry = symtab.addGCRoot(symtab.mangle(arg->getValue()), true);
       });
       break;
@@ -657,7 +656,7 @@ void LinkerDriver::setMachine(MachineTypes machine) {
 
   ctx.config.machine = machine;
 
-  if (machine != ARM64X) {
+  if (!isArm64EC(machine)) {
     ctx.symtab.machine = machine;
   } else {
     ctx.symtab.machine = ARM64EC;
@@ -979,7 +978,7 @@ void LinkerDriver::createImportLibrary(bool asLib) {
   };
 
   getExports(ctx.symtab, exports);
-  if (ctx.hybridSymtab)
+  if (ctx.config.machine == ARM64X)
     getExports(*ctx.hybridSymtab, nativeExports);
 
   std::string libName = getImportName(asLib);
@@ -1383,13 +1382,13 @@ void LinkerDriver::maybeExportMinGWSymbols(const opt::InputArgList &args) {
       return;
 
     if (ctx.symtab.hadExplicitExports ||
-        (ctx.hybridSymtab && ctx.hybridSymtab->hadExplicitExports))
+        (ctx.config.machine == ARM64X && ctx.hybridSymtab->hadExplicitExports))
       return;
     if (args.hasArg(OPT_exclude_all_symbols))
       return;
   }
 
-  ctx.forEachSymtab([&](SymbolTable &symtab) {
+  ctx.forEachActiveSymtab([&](SymbolTable &symtab) {
     AutoExporter exporter(symtab, excludedSymbols);
 
     for (auto *arg : args.filtered(OPT_wholearchive_file))
@@ -2304,7 +2303,7 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
   if (auto *arg = args.getLastArg(OPT_deffile)) {
     // parseModuleDefs mutates Config object.
     ctx.symtab.parseModuleDefs(arg->getValue());
-    if (ctx.hybridSymtab) {
+    if (ctx.config.machine == ARM64X) {
       // MSVC ignores the /defArm64Native argument on non-ARM64X targets.
       // It is also ignored if the /def option is not specified.
       if (auto *arg = args.getLastArg(OPT_defarm64native))
@@ -2331,7 +2330,7 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
   }
 
   // Handle /entry and /dll
-  ctx.forEachSymtab([&](SymbolTable &symtab) {
+  ctx.forEachActiveSymtab([&](SymbolTable &symtab) {
     llvm::TimeTraceScope timeScope("Entry point");
     if (auto *arg = args.getLastArg(OPT_entry)) {
       if (!arg->getValue()[0])
@@ -2363,7 +2362,7 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
     llvm::TimeTraceScope timeScope("Delay load");
     for (auto *arg : args.filtered(OPT_delayload)) {
       config->delayLoads.insert(StringRef(arg->getValue()).lower());
-      ctx.forEachSymtab([&](SymbolTable &symtab) {
+      ctx.forEachActiveSymtab([&](SymbolTable &symtab) {
         if (symtab.machine == I386) {
           symtab.delayLoadHelper = symtab.addGCRoot("___delayLoadHelper2@8");
         } else {
@@ -2533,7 +2532,9 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
             u->setWeakAlias(symtab.addUndefined(to));
           }
         }
+      });
 
+      ctx.forEachActiveSymtab([&](SymbolTable &symtab) {
         // If any inputs are bitcode files, the LTO code generator may create
         // references to library functions that are not explicit in the bitcode
         // file's symbol table. If any of those library functions are defined in
@@ -2545,7 +2546,6 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
           for (auto *s : lto::LTO::getRuntimeLibcallSymbols(TT))
             symtab.addLibcall(s);
         }
-
         // Windows specific -- if __load_config_used can be resolved, resolve
         // it.
         if (symtab.findUnderscore("_load_config_used"))
@@ -2563,7 +2563,7 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
 
   // Handle /includeglob
   for (StringRef pat : args::getStrings(args, OPT_incl_glob))
-    ctx.forEachSymtab(
+    ctx.forEachActiveSymtab(
         [&](SymbolTable &symtab) { symtab.addUndefinedGlob(pat); });
 
   // Create wrapped symbols for -wrap option.
@@ -2680,12 +2680,12 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
   // need to create a .lib file. In MinGW mode, we only do that when the
   // -implib option is given explicitly, for compatibility with GNU ld.
   if (config->dll || !ctx.symtab.exports.empty() ||
-      (ctx.hybridSymtab && !ctx.hybridSymtab->exports.empty())) {
+      (ctx.config.machine == ARM64X && !ctx.hybridSymtab->exports.empty())) {
     llvm::TimeTraceScope timeScope("Create .lib exports");
-    ctx.forEachSymtab([](SymbolTable &symtab) { symtab.fixupExports(); });
+    ctx.forEachActiveSymtab([](SymbolTable &symtab) { symtab.fixupExports(); });
     if (!config->noimplib && (!config->mingw || !config->implib.empty()))
       createImportLibrary(/*asLib=*/false);
-    ctx.forEachSymtab(
+    ctx.forEachActiveSymtab(
         [](SymbolTable &symtab) { symtab.assignExportOrdinals(); });
   }
 
@@ -2751,7 +2751,8 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
 
   if (ctx.symtab.isEC())
     ctx.symtab.initializeECThunks();
-  ctx.forEachSymtab([](SymbolTable &symtab) { symtab.initializeLoadConfig(); });
+  ctx.forEachActiveSymtab(
+      [](SymbolTable &symtab) { symtab.initializeLoadConfig(); });
 
   // Identify unreferenced COMDAT sections.
   if (config->doGC) {
diff --git a/lld/COFF/InputFiles.cpp b/lld/COFF/InputFiles.cpp
index 7fb42bb681939..e10b6419b5ad5 100644
--- a/lld/COFF/InputFiles.cpp
+++ b/lld/COFF/InputFiles.cpp
@@ -137,10 +137,8 @@ void ArchiveFile::parse() {
         ctx.symtab.addLazyArchive(this, sym);
 
       // Read both EC and native symbols on ARM64X.
-      if (!ctx.hybridSymtab)
-        return;
       archiveSymtab = &*ctx.hybridSymtab;
-    } else if (ctx.hybridSymtab) {
+    } else {
       // If the ECSYMBOLS section is missing in the archive, the archive could
       // be either a native-only ARM64 or x86_64 archive. Check the machine type
       // of the object containing a symbol to determine which symbol table to
diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index 8fb0ee4e890d6..d6f771284aa83 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -551,7 +551,7 @@ void SymbolTable::initializeLoadConfig() {
       Warn(ctx) << "EC version of '_load_config_used' is missing";
       return;
     }
-    if (ctx.hybridSymtab) {
+    if (ctx.config.machine == ARM64X) {
       Warn(ctx) << "native version of '_load_config_used' is missing for "
                    "ARM64X target";
       return;
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index a5582cc8074d1..982701d28a140 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -1369,7 +1369,7 @@ void Writer::createExportTable() {
       }
     }
   }
-  ctx.forEachSymtab([&](SymbolTable &symtab) {
+  ctx.forEachActiveSymtab([&](SymbolTable &symtab) {
     if (symtab.edataStart) {
       if (symtab.hadExplicitExports)
         Warn(ctx) << "literal .edata sections override exports";
@@ -1759,7 +1759,8 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
   assert(coffHeaderOffset == buf - buffer->getBufferStart());
   auto *coff = reinterpret_cast<coff_file_header *>(buf);
   buf += sizeof(*coff);
-  SymbolTable &symtab = ctx.hybridSymtab ? *ctx.hybridSymtab : ctx.symtab;
+  SymbolTable &symtab =
+      ctx.config.machine == ARM64X ? *ctx.hybridSymtab : ctx.symtab;
   coff->Machine = symtab.isEC() ? AMD64 : symtab.machine;
   coff->NumberOfSections = ctx.outputSections.size();
   coff->Characteristics = IMAGE_FILE_EXECUTABLE_IMAGE;
@@ -2391,7 +2392,7 @@ void Writer::setECSymbols() {
     return a.first->getRVA() < b.first->getRVA();
   });
 
-  ChunkRange &chpePdata = ctx.hybridSymtab ? hybridPdata : pdata;
+  ChunkRange &chpePdata = ctx.config.machine == ARM64X ? hybridPdata : pdata;
   Symbol *rfeTableSym = ctx.symtab.findUnderscore("__arm64x_extra_rfe_table");
   replaceSymbol<DefinedSynthetic>(rfeTableSym, "__arm64x_extra_rfe_table",
                                   chpePdata.first);
@@ -2436,7 +2437,7 @@ void Writer::setECSymbols() {
       delayIdata.getAuxIatCopy().empty() ? nullptr
                                          : delayIdata.getAuxIatCopy().front());
 
-  if (ctx.hybridSymtab) {
+  if (ctx.config.machine == ARM64X) {
     // For the hybrid image, set the alternate entry point to the EC entry
     // point. In the hybrid view, it is swapped to the native entry point
     // using ARM64X relocations.
@@ -2826,7 +2827,7 @@ void Writer::fixTlsAlignment() {
 }
 
 void Writer::prepareLoadConfig() {
-  ctx.forEachSymtab([&](SymbolTable &symtab) {
+  ctx.forEachActiveSymtab([&](SymbolTable &symtab) {
     if (!symtab.loadConfigSym)
       return;
 
@@ -2886,7 +2887,7 @@ void Writer::prepareLoadConfig(SymbolTable &symtab, T *loadConfig) {
   IF_CONTAINS(CHPEMetadataPointer) {
     // On ARM64X, only the EC version of the load config contains
     // CHPEMetadataPointer. Copy its value to the native load config.
-    if (ctx.hybridSymtab && !symtab.isEC() &&
+    if (ctx.config.machine == ARM64X && !symtab.isEC() &&
         ctx.symtab.loadConfigSize >=
             offsetof(T, CHPEMetadataPointer) + sizeof(T::CHPEMetadataPointer)) {
       OutputSection *sec =
diff --git a/lld/test/COFF/arm64ec-entry-mangle.test b/lld/test/COFF/arm64ec-entry-mangle.test
index 6db16ef218dc8..1f029077ba51d 100644
--- a/lld/test/COFF/arm64ec-entry-mangle.test
+++ b/lld/test/COFF/arm64ec-entry-mangle.test
@@ -97,7 +97,7 @@ RUN: not lld-link -machine:arm64ec -dll -out:test.dll demangled-func.obj loadcon
 RUN:              "-entry:#func" 2>&1 | FileCheck -check-prefix=FUNC-NOT-FOUND %s
 RUN: not lld-link -machine:arm64ec -dll -out:test.dll demangled-func.obj loadconfig-arm64ec.obj \
 RUN:              -noentry "-export:#func" 2>&1 | FileCheck -check-prefix=FUNC-NOT-FOUND %s
-FUNC-NOT-FOUND: undefined symbol: #func
+FUNC-NOT-FOUND: undefined symbol: #func (EC symbol)
 
 Verify that the linker recognizes the demangled x86_64 _DllMainCRTStartup.
 RUN: lld-link -machine:arm64ec -dll -out:test.dll x64-dll-main.obj loadconfig-arm64ec.obj
diff --git a/lld/test/COFF/arm64ec-hybmp.s b/lld/test/COFF/arm64ec-hybmp.s
index 5fc24d4250704..670ee3926ab5c 100644
--- a/lld/test/COFF/arm64ec-hybmp.s
+++ b/lld/test/COFF/arm64ec-hybmp.s
@@ -62,7 +62,7 @@ thunk:
 
 // RUN: llvm-mc -filetype=obj -triple=arm64ec-windows undef-func.s -o undef-func.obj
 // RUN: not lld-link -machine:arm64ec -dll -noentry -out:test.dll undef-func.obj 2>&1 | FileCheck -check-prefix=UNDEF-FUNC %s
-// UNDEF-FUNC: error: undefined symbol: func
+// UNDEF-FUNC: error: undefined symbol: func (EC symbol)
 
 #--- undef-thunk.s
     .section .text,"xr",discard,func
@@ -79,7 +79,7 @@ func:
 
 // RUN: llvm-mc -filetype=obj -triple=arm64ec-windows undef-thunk.s -o undef-thunk.obj
 // RUN: not lld-link -machine:arm64ec -dll -noentry -out:test.dll undef-thunk.obj 2>&1 | FileCheck -check-prefix=UNDEF-THUNK %s
-// UNDEF-THUNK: error: undefined symbol: thunk
+// UNDEF-THUNK: error: undefined symbol: thunk (EC symbol)
 
 #--- invalid-type.s
     .section .text,"xr",discard,func
diff --git a/lld/test/COFF/arm64ec-lib.test b/lld/test/COFF/arm64ec-lib.test
index 8698a5ceccbe7..1e6fa60209d94 100644
--- a/lld/test/COFF/arm64ec-lib.test
+++ b/lld/test/COFF/arm64ec-lib.test
@@ -29,11 +29,13 @@ RUN: lld-link -machine:arm64ec -dll -noentry -out:test2.dll symref-arm64ec.obj s
 Verify that both native and EC symbols can be referenced in a hybrid target.
 RUN: lld-link -machine:arm64x -dll -noentry -out:test3.dll symref-arm64ec.obj nsymref-aarch64.obj sym-arm64ec.lib \
 RUN:          loadconfig-arm64.obj loadconfig-arm64ec.obj
+RUN: lld-link -machine:arm64ec -dll -noentry -ou...
[truncated]

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a quite large change. That makes it feel hard to grasp - even if the final state perhaps isn't that much more complicated than the initial state.

However I'm not sure I understand the full context of this change. Is this input case something that we do expect that we'd ever want to do? (In my understanding of arm64ec/arm64x use cases, I don't see when we'd ever want to do this?) Doesn't this just cause a quite confusing situation where you can link in object files which do end up in the final image, but which are pretty much inactive (not referenced, not used at runtime at all)? As I don't see the real use case, I also feel a bit more hesitant about a change like this which is yet another small step towards a more complex internal linker state.

I'm also a little curious about how you managed to do this change, to check all changes (e.g. regarding delay imports), as we don't really exercise this new case in any of the existing tests? But I guess the fact that you always create two symbol tables even if we essentially never use one of them, causes us to force testing all codepaths in the existing tests anyway...

@@ -2545,7 +2546,6 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
for (auto *s : lto::LTO::getRuntimeLibcallSymbols(TT))
symtab.addLibcall(s);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Unrelated changed line

@@ -657,7 +656,7 @@ void LinkerDriver::setMachine(MachineTypes machine) {

ctx.config.machine = machine;

if (machine != ARM64X) {
if (!isArm64EC(machine)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this change makes us create two symbol tables for all arm64ec linking, even if the very fast majority of them ever only will use one of them?

That feels odd, but I guess it's good for consistency - otherwise we'd probably have lots of bugs for the very rare cases when we do need both symbol tables for arm64ec. And I guess this is the change which forces the extra (EC symbol) to be printed in many tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that’s right, this is the core part of the change; the rest of the patch updates various components to handle it properly. The extra (EC symbol) annotation doesn’t add much value when there’s only a single symbol table, but it becomes quite useful with this PR. If someone accidentally passes a native object file, error messages that include (native symbol) should make the issue easy to diagnose.

@cjacek
Copy link
Contributor Author

cjacek commented May 12, 2025

However I'm not sure I understand the full context of this change. Is this input case something that we do expect that we'd ever want to do? (In my understanding of arm64ec/arm64x use cases, I don't see when we'd ever want to do this?) Doesn't this just cause a quite confusing situation where you can link in object files which do end up in the final image, but which are pretty much inactive (not referenced, not used at runtime at all)? As I don't see the real use case, I also feel a bit more hesitant about a change like this which is yet another small step towards a more complex internal linker state.

As I mentioned in the description, I think the usefulness of this feature is questionable, but it's how MSVC behaves. I encountered it a few times during testing, mainly because most linker-related aspects of EC are undocumented. I had to resort to throwing various test cases at MSVC to see how it behaves, and ideally, infer the intent behind that behavior.

The most recent case where I observed this was while investigating the undocumented -arm64xsameaddress for issue #131712. One hypothesis was that it allows referencing native aarch64 code from EC code. I now know that’s not the case. -arm64xsameaddress is a no-op on EC targets. It only has meaning for hybrid images, where it replaces function symbols in both symbol tables with a single thunk that jumps to the actual implementation (so the function has the same address in both views). In that case, additional ARM64X relocs are emitted so that the thunk jumps to the EC variant in the EC view. On pure EC targets, such a thunk can also be emitted if you pass a native aarch64 object file, but it’s not particularly useful there.

Another situation where I encountered native objects was with import libraries. ARM64EC import libraries use native object files for import descriptors. This doesn’t affect LLD much since we synthesize them anyway, but MSVC appears to rely on them (see #84834), so it needs to support native object files.

I don’t have a strong real-world example that requires this, and in fact, using native objects with code is probably almost always a bad idea. One acceptable case might be resource object files. Hypothetically, someone could use an aarch64 build system to build ARM64EC binaries by passing something like CFLAGS=-arm64EC. That wouldn't affect cvtres, so its output would remain a native aarch64 object. MSVC handles that fine, but current LLD would reject it.

That said, I don’t have a strong reason why we need to support this. Now that we have ARM64X support, it was easy to implement when I came across it again. My thought was that it might be a good time to finalize this part of the code while it’s still fresh. But if you don’t think it’s worthwhile, I will skip it.

I'm also a little curious about how you managed to do this change, to check all changes (e.g. regarding delay imports), as we don't really exercise this new case in any of the existing tests? But I guess the fact that you always create two symbol tables even if we essentially never use one of them, causes us to force testing all codepaths in the existing tests anyway...

I grepped the source code for all symbol table references, adjusted them as needed, and reviewed their associated test cases where appropriate. Then I did the reverse: I went through all ARM64X test cases to identify ones where it made sense to add an ARM64EC variant.

Delay-load is covered by arm64ec-delayimport.test. If we didn’t skip the native part when it’s missing, the RVAs would shift due to the presence of null chunks. Also, without adjusting how __delayLoadHelper2 is handled, LLD would either crash or fail to locate the helper, depending on whether we tried to pull it in.

@mstorsjo
Copy link
Member

Another situation where I encountered native objects was with import libraries. ARM64EC import libraries use native object files for import descriptors. This doesn’t affect LLD much since we synthesize them anyway, but MSVC appears to rely on them (see #84834), so it needs to support native object files.

That's a good point. The fact that LLD doesn't actually touch the import library header/footer object files but just makes up its own, is a bit odd at times. Not sure if that's going to change at some point though, but if it would, I guess we'd need this.

I don’t have a strong real-world example that requires this, and in fact, using native objects with code is probably almost always a bad idea. One acceptable case might be resource object files. Hypothetically, someone could use an aarch64 build system to build ARM64EC binaries by passing something like CFLAGS=-arm64EC. That wouldn't affect cvtres, so its output would remain a native aarch64 object. MSVC handles that fine, but current LLD would reject it.

Thanks, that's also a reasonable real-world case. (Although in MSVC style build systems one usually just passes the plain .res files to the linker and let the linker convert it to object file form.)

That said, I don’t have a strong reason why we need to support this. Now that we have ARM64X support, it was easy to implement when I came across it again. My thought was that it might be a good time to finalize this part of the code while it’s still fresh. But if you don’t think it’s worthwhile, I will skip it.

I think it can be reasonable to include it, to make our implementation a bit more consistent with MS link.exe. Perhaps it would be good to add a comment, e.g. in LinkerDriver::setMachine, that we always create 2 symbol tables for that case, even if one is mostly ignored, to make the design decision stand out a bit more.

As I mentioned somwehere, my reaction to reviewing the PR was mostly that it was a lot of changes, but it probably doesn't add much to the overall complexity of LLD, it just changes the existing code. So given these explanations, I think this change may be ok to merge.

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -657,7 +656,7 @@ void LinkerDriver::setMachine(MachineTypes machine) {

ctx.config.machine = machine;

if (machine != ARM64X) {
if (!isArm64EC(machine)) {
ctx.symtab.machine = machine;
} else {
ctx.symtab.machine = ARM64EC;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably be good to add a comment here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment, thanks for review!

cjacek added 2 commits May 14, 2025 22:59
… images

MSVC linker accepts native ARM64 object files as input with -machine:arm64ec,
similar to -machine:arm64x. Its usefulness is very limited; for example, both
exports and imports are not reflected in the PE structures and can't work.
However, their symbol tables are otherwise functional.

Since we already have handling of multiple symbol tables implemented for ARM64X,
the required changes are mostly about adjusting relevant checks to account for
them on the ARM64EC target.

Delay-load helper handling is a bit of a shortcut. The patch never pulls it for
native object files and just ensures that the code is fine with that. In general,
I think it would be nice to adjust the driver to pull it only when it's actually
referenced, which would allow applying the same logic to the native symbol table
on ARM64EC without worrying about pulling too much.
Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@cjacek cjacek merged commit 5b05728 into llvm:main May 15, 2025
11 checks passed
@cjacek cjacek deleted the arm64ec-native branch May 15, 2025 09:38
TIFitis pushed a commit to TIFitis/llvm-project that referenced this pull request May 19, 2025
… images (llvm#137653)

MSVC linker accepts native ARM64 object files as input with
`-machine:arm64ec`, similar to `-machine:arm64x`. Its usefulness is very
limited; for example, both exports and imports are not reflected in the
PE structures and can't work. However, their symbol tables are otherwise
functional.

Since we already have handling of multiple symbol tables implemented for
ARM64X, the required changes are mostly about adjusting relevant checks
to account for them on the ARM64EC target.

Delay-load helper handling is a bit of a shortcut. The patch never pulls
it for native object files and just ensures that the code is fine with
that. In general, I think it would be nice to adjust the driver to pull
it only when it's actually referenced, which would allow applying the
same logic to the native symbol table on ARM64EC without worrying about
pulling too much.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants