Commit f377387

wasm: Store rlib metadata in wasm object files
The goal of this commit is to remove warnings emitted by LLVM tip-of-tree `wasm-ld`. As of llvm/llvm-project#78658 the `wasm-ld` LLD driver no longer looks at archive indices and instead looks at every object in an archive. Previously `lib.rmeta` files were raw rustc metadata bytes, not wasm objects, so `wasm-ld` would emit a warning saying so.

WebAssembly targets previously passed `--fatal-warnings` to `wasm-ld` by default, which meant that if Rust were to update to LLVM 18 then all wasm targets would stop working. That immediate blocker was resolved in #120278, which removed `--fatal-warnings` and made an update to LLVM 18 possible in principle for wasm targets.

The current state is ok-enough for now because rustc squashes all linker output by default if the link doesn't fail. This means, for example, that rustc hides all the `wasm-ld` warnings about `lib.rmeta` files with LLVM 18. This isn't a pressing issue because the information is all hidden, but it risks being annoying if an unrelated linker error were to happen: the output would then include all these warnings, which the user couldn't fix.

Thus, this PR comes into the picture. The goal of this PR is to resolve these warnings by using the WebAssembly object file format on wasm targets instead of raw rustc metadata. When I first implemented the rlib-in-objects scheme in #84449 I remember concluding either that `wasm-ld` would include the metadata in the output or that we didn't have to do anything there at all. I think I was wrong on both counts: `wasm-ld` does not include the metadata in the final output unless the object is referenced, and we do actually need to do something to resolve these warnings.

This PR updates the object file format containing rustc metadata on WebAssembly targets to be an actual WebAssembly file. To avoid bringing in any new dependencies I've opted to hand-code this encoding at this time. If the object gets more complicated, though, it'd probably be best to pull in `wasmparser` and `wasm-encoder`. For now there are two adjacent functions reading and writing wasm.

The only caveat I know of is that if `wasm-ld` does indeed look at the object file then the metadata will be included in the final output. I believe the only thing that could cause that at this time is `--whole-archive`, which I don't think is passed for rlibs. I'm not 100% certain about this, however.
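To make the hand-coded encoding concrete, here is a minimal standalone sketch of the layout the message describes: the 8-byte wasm header, a `linking` custom section holding only the version byte 2, then the metadata custom section. The names `write_uleb128`, `append_custom_section` and `metadata_object` are hypothetical and used only for illustration; the commit itself relies on `rustc_serialize::leb128` rather than a local LEB128 writer.

```rust
/// Writes `value` as unsigned LEB128: 7 bits per byte, with the high bit
/// set on every byte except the last. (Hypothetical helper for this sketch.)
fn write_uleb128(out: &mut Vec<u8>, mut value: usize) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Appends one custom section: id 0, payload length, name length, name,
/// then the contents. The payload length covers everything after itself.
fn append_custom_section(out: &mut Vec<u8>, name: &[u8], contents: &[u8]) {
    let mut name_len = Vec::new();
    write_uleb128(&mut name_len, name.len());

    out.push(0);
    write_uleb128(out, name_len.len() + name.len() + contents.len());
    out.extend_from_slice(&name_len);
    out.extend_from_slice(name);
    out.extend_from_slice(contents);
}

/// Hypothetical stand-in for the commit's `create_metadata_file_for_wasm`.
fn metadata_object(section_name: &[u8], data: &[u8]) -> Vec<u8> {
    // "\0asm" magic followed by version 1 as a little-endian u32.
    let mut bytes = b"\0asm\x01\0\0\0".to_vec();
    // A `linking` section containing only the version byte 2.
    append_custom_section(&mut bytes, b"linking", &[2]);
    append_custom_section(&mut bytes, section_name, data);
    bytes
}

fn main() {
    let obj = metadata_object(b".rmeta", b"hi");
    println!("{:02x?}", obj);
}
```

With a two-byte payload the whole object is 30 bytes: 8 for the header and 11 for each custom section.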
1 parent bf3c6c5 commit f377387

1 file changed: +99 -20 lines changed

compiler/rustc_codegen_ssa/src/back/metadata.rs

Lines changed: 99 additions & 20 deletions
@@ -15,6 +15,8 @@ use rustc_data_structures::owned_slice::{try_slice_owned, OwnedSlice};
 use rustc_metadata::creader::MetadataLoader;
 use rustc_metadata::fs::METADATA_FILENAME;
 use rustc_metadata::EncodedMetadata;
+use rustc_serialize::leb128;
+use rustc_serialize::{opaque::MemDecoder, Decoder};
 use rustc_session::Session;
 use rustc_span::sym;
 use rustc_target::abi::Endian;
@@ -62,6 +64,8 @@ impl MetadataLoader for DefaultMetadataLoader {
                 .map_err(|e| format!("failed to parse rlib '{}': {}", path.display(), e))?;
             if target.is_like_aix {
                 return get_metadata_xcoff(path, data);
+            } else if target.is_like_wasm {
+                return get_metadata_wasm(data, ".rmeta");
             } else {
                 return search_for_section(path, data, ".rmeta");
             }
@@ -420,10 +424,9 @@ pub enum MetadataPosition {
 ///   it's not in an allowlist of otherwise well known dwarf section names to
 ///   go into the final artifact.
 ///
-/// * WebAssembly - we actually don't have any container format for this
-///   target. WebAssembly doesn't support the `dylib` crate type anyway so
-///   there's no need for us to support this at this time. Consequently the
-///   metadata bytes are simply stored as-is into an rlib.
+/// * WebAssembly - this uses wasm files themselves as the object file format
+///   so an empty file with no linking metadata but a single custom section is
+///   created holding our metadata.
 ///
 /// * COFF - Windows-like targets create an object with a section that has
 ///   the `IMAGE_SCN_LNK_REMOVE` flag set which ensures that if the linker
@@ -438,22 +441,13 @@ pub fn create_wrapper_file(
     data: &[u8],
 ) -> (Vec<u8>, MetadataPosition) {
     let Some(mut file) = create_object_file(sess) else {
-        // This is used to handle all "other" targets. This includes targets
-        // in two categories:
-        //
-        // * Some targets don't have support in the `object` crate just yet
-        //   to write an object file. These targets are likely to get filled
-        //   out over time.
-        //
-        // * Targets like WebAssembly don't support dylibs, so the purpose
-        //   of putting metadata in object files, to support linking rlibs
-        //   into dylibs, is moot.
-        //
-        // In both of these cases it means that linking into dylibs will
-        // not be supported by rustc. This doesn't matter for targets like
-        // WebAssembly and for targets not supported by the `object` crate
-        // yet it means that work will need to be done in the `object` crate
-        // to add a case above.
+        if sess.target.is_like_wasm {
+            return (create_metadata_file_for_wasm(data, &section_name), MetadataPosition::First);
+        }
+
+        // Targets using this branch don't have support implemented here yet or
+        // they're not yet implemented in the `object` crate and will likely
+        // fill out this module over time.
         return (data.to_vec(), MetadataPosition::Last);
     };
     let section = if file.format() == BinaryFormat::Xcoff {
@@ -532,6 +526,9 @@ pub fn create_compressed_metadata_file(
     packed_metadata.extend(metadata.raw_data());
 
     let Some(mut file) = create_object_file(sess) else {
+        if sess.target.is_like_wasm {
+            return create_metadata_file_for_wasm(&packed_metadata, b".rustc");
+        }
         return packed_metadata.to_vec();
     };
     if file.format() == BinaryFormat::Xcoff {
@@ -624,3 +621,85 @@ pub fn create_compressed_metadata_file_for_xcoff(
     file.append_section_data(section, data, 1);
     file.write().unwrap()
 }
+
+/// Creates a simple WebAssembly object file, which is itself a wasm module,
+/// that contains a custom section of the name `section_name` with contents
+/// `data`.
+///
+/// NB: the wasm file format is simple enough that for now an extra crate from
+/// crates.io (such as `wasm-encoder`) isn't used at this time (nor `wasmparser`
+/// for example to parse). The file format is:
+///
+/// * 4-byte header "\0asm"
+/// * 4-byte version number - 1u32 in little-endian format
+/// * concatenated sections, which for this object is always "custom sections"
+///
+/// Custom sections are then defined by:
+/// * 1-byte section identifier - 0 for a custom section
+/// * leb-encoded section length (size of the contents beneath this bullet)
+/// * leb-encoded custom section name length
+/// * custom section name
+/// * section contents
+///
+/// One custom section, `linking`, is added here in accordance with
+/// <https://github.com/WebAssembly/tool-conventions/blob/main/Linking.md>
+/// which is required to inform LLD that this is an object file but it should
+/// otherwise basically ignore it if it otherwise looks at it. The linking
+/// section currently is defined by a single version byte (2) and then further
+/// sections, but we have no more sections, so it's just the byte "2".
+///
+/// The next custom section is the one we're interested in.
+pub fn create_metadata_file_for_wasm(data: &[u8], section_name: &[u8]) -> Vec<u8> {
+    let mut bytes = b"\0asm\x01\0\0\0".to_vec();
+
+    let mut append_custom_section = |section_name: &[u8], data: &[u8]| {
+        let mut section_name_len = [0; leb128::max_leb128_len::<usize>()];
+        let off = leb128::write_usize_leb128(&mut section_name_len, section_name.len());
+        let section_name_len = &section_name_len[..off];
+
+        let mut section_len = [0; leb128::max_leb128_len::<usize>()];
+        let off = leb128::write_usize_leb128(
+            &mut section_len,
+            data.len() + section_name_len.len() + section_name.len(),
+        );
+        let section_len = &section_len[..off];
+
+        bytes.push(0u8);
+        bytes.extend_from_slice(section_len);
+        bytes.extend_from_slice(section_name_len);
+        bytes.extend_from_slice(section_name);
+        bytes.extend_from_slice(data);
+    };
+
+    append_custom_section(b"linking", &[2]);
+    append_custom_section(section_name, data);
+    bytes
+}
+
+// NB: see documentation on `create_metadata_file_for_wasm` above for
+// particulars on the wasm format.
+fn get_metadata_wasm<'a>(data: &'a [u8], expected_section_name: &str) -> Result<&'a [u8], String> {
+    let data = data
+        .strip_prefix(b"\0asm\x01\0\0\0")
+        .ok_or_else(|| format!("metadata has an invalid wasm header"))?;
+
+    let mut decoder = MemDecoder::new(data, 0);
+    let mut next_custom_section = |expected_section_name: &str| {
+        if decoder.read_u8() != 0 {
+            return Err(format!("metadata did not start with a custom section"));
+        }
+        let section_len = leb128::read_usize_leb128(&mut decoder);
+
+        let section_start = decoder.position();
+        let mut section = MemDecoder::new(decoder.read_raw_bytes(section_len), 0);
+        let name_len = leb128::read_usize_leb128(&mut section);
+        let section_name = section.read_raw_bytes(name_len);
+        if section_name != expected_section_name.as_bytes() {
+            return Err(format!("unexpected section name in metadata object"));
+        }
+        Ok(&data[section_start + section.position()..][..section.remaining()])
+    };
+
+    next_custom_section("linking")?;
+    next_custom_section(expected_section_name)
+}
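The reading side can be sketched without any rustc internals. The following is a minimal, assumption-laden illustration rather than the commit's code: `read_uleb128`, `next_custom_section` and `find_metadata` are hypothetical names, and unlike the diff, which decodes with `MemDecoder`, this version works on plain slices and reports truncated input as an error.

```rust
/// Reads an unsigned LEB128 value from the front of `input`, returning the
/// value and the remaining bytes, or `None` if the input ends early or the
/// encoding is too long for a `usize`. (Hypothetical helper.)
fn read_uleb128(input: &[u8]) -> Option<(usize, &[u8])> {
    let mut value = 0usize;
    for (i, &byte) in input.iter().enumerate() {
        let shift = 7 * i as u32;
        if shift >= usize::BITS {
            return None;
        }
        value |= ((byte & 0x7f) as usize) << shift;
        if byte & 0x80 == 0 {
            return Some((value, &input[i + 1..]));
        }
    }
    None
}

/// Splits one custom section off the front of `input`, returning
/// `(name, contents, rest_of_file)`.
fn next_custom_section(input: &[u8]) -> Result<(&[u8], &[u8], &[u8]), String> {
    let (&id, input) = input.split_first().ok_or("unexpected end of file")?;
    if id != 0 {
        return Err(format!("expected a custom section, found section id {id}"));
    }
    let (len, input) = read_uleb128(input).ok_or("bad section length")?;
    if len > input.len() {
        return Err("truncated section".to_string());
    }
    let (section, rest) = input.split_at(len);
    let (name_len, section) = read_uleb128(section).ok_or("bad name length")?;
    if name_len > section.len() {
        return Err("truncated section name".to_string());
    }
    let (name, contents) = section.split_at(name_len);
    Ok((name, contents, rest))
}

/// Hypothetical stand-in for the commit's `get_metadata_wasm`: expects the
/// header, then `linking`, then the section named `expected`.
fn find_metadata<'a>(file: &'a [u8], expected: &[u8]) -> Result<&'a [u8], String> {
    let body = file.strip_prefix(b"\0asm\x01\0\0\0").ok_or("invalid wasm header")?;
    let (name, _, rest) = next_custom_section(body)?;
    if name != b"linking" {
        return Err("first section is not `linking`".to_string());
    }
    let (name, contents, _) = next_custom_section(rest)?;
    if name != expected {
        return Err("unexpected section name in metadata object".to_string());
    }
    Ok(contents)
}

fn main() {
    // Header, `linking` section with version byte 2, `.rmeta` section "hi".
    let file = b"\0asm\x01\0\0\0\x00\x09\x07linking\x02\x00\x09\x06.rmetahi";
    match find_metadata(file, b".rmeta") {
        Ok(data) => println!("metadata: {:?}", data),
        Err(e) => println!("error: {e}"),
    }
}
```

Returning `Result` on short input is a design choice of this sketch; it keeps a malformed rlib member from causing a panic while the sections are being walked.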
