Tagged pointers, now with strict provenance! #110243

Merged
merged 22 commits into from
Apr 17, 2023
Changes from 17 commits
c738dcc
Add `bits_for` helper for tagged pointers & fixup docs
WaffleLapkin Apr 11, 2023
f028636
Sprinkle some whitespace & uses
WaffleLapkin Apr 11, 2023
3c6f4c1
Bless tagged pointers (comply to strict provenance)
WaffleLapkin Apr 11, 2023
12fd610
Refactor tagged ptr packing into a function
WaffleLapkin Apr 11, 2023
ad92677
Fix doc test
WaffleLapkin Apr 11, 2023
26232f1
Remove useless parameter from ghost
WaffleLapkin Apr 12, 2023
9051331
Lift `Pointer`'s requirement for the pointer to be thin
WaffleLapkin Apr 12, 2023
c6acd5c
Remove `Pointer::with_ref` in favour implementing it on tagged pointe…
WaffleLapkin Apr 12, 2023
c7c0b85
Make tagged pointers debug impls print the pointer
WaffleLapkin Apr 12, 2023
8f40820
Remove `pointer_{ref,mut}` from tagged pointers
WaffleLapkin Apr 12, 2023
3df9a7b
Shorten `COMPARE_PACKED` => `CP` where it is not important
WaffleLapkin Apr 12, 2023
6f64ae3
Move code around
WaffleLapkin Apr 12, 2023
5e4577e
Add `TaggedPtr::set_tag`
WaffleLapkin Apr 12, 2023
6f9b15c
Add tests for tagged pointers
WaffleLapkin Apr 12, 2023
838c549
Document tagged pointers better
WaffleLapkin Apr 12, 2023
dc19dc2
doc fixes
WaffleLapkin Apr 12, 2023
c155d51
Implement `Send`/`Sync` for `CopyTaggedPtr`
WaffleLapkin Apr 13, 2023
8d49e94
Doc fixes from review
WaffleLapkin Apr 14, 2023
251f662
Share `Tag2` impl between `CopyTaggedPtr` and `TaggedPtr` tests
WaffleLapkin Apr 14, 2023
36f5918
Test `CopyTaggedPtr`'s `HashStable` impl
WaffleLapkin Apr 14, 2023
014c6f2
Use `ptr::Alignment` for extra coolness points
WaffleLapkin Apr 14, 2023
5571dd0
fix broken intradoclinks
WaffleLapkin Apr 14, 2023
31 changes: 31 additions & 0 deletions compiler/rustc_data_structures/src/aligned.rs
@@ -0,0 +1,31 @@
use std::mem;

/// Returns the ABI-required minimum alignment of a type in bytes.
///
/// This is equivalent to [`mem::align_of`], but also works for some unsized
/// types (e.g. slices or rustc's `List`s).
pub const fn align_of<T: ?Sized + Aligned>() -> usize {
    T::ALIGN
}

/// A type with a statically known alignment.
///
/// # Safety
///
/// `Self::ALIGN` must be equal to the alignment of `Self`. For sized types it
/// is [`mem::align_of<Self>()`], for unsized types it depends on the type, for
/// example `[T]` has alignment of `T`.
///
/// [`mem::align_of<Self>()`]: mem::align_of
pub unsafe trait Aligned {
    /// Alignment of `Self`.
    const ALIGN: usize;
}

unsafe impl<T> Aligned for T {
    const ALIGN: usize = mem::align_of::<Self>();
}

unsafe impl<T> Aligned for [T] {
    const ALIGN: usize = mem::align_of::<T>();
}
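
To illustrate how the two impls above coexist, here is a minimal standalone sketch. It re-declares the trait and `align_of` locally so it compiles outside rustc; the `U32_ALIGN` and `U32_SLICE_ALIGN` constants are hypothetical names added only for the example. The blanket impl covers `Sized` types only (the implicit `T: Sized` bound), so it does not overlap with the slice impl.

```rust
use std::mem;

/// Local copy of the trait from the diff, so the sketch compiles on its own.
pub unsafe trait Aligned {
    const ALIGN: usize;
}

// Every sized type reports `mem::align_of` (the bound `T: Sized` is implicit).
unsafe impl<T> Aligned for T {
    const ALIGN: usize = mem::align_of::<Self>();
}

// A slice is laid out like its elements, so it shares their alignment.
unsafe impl<T> Aligned for [T] {
    const ALIGN: usize = mem::align_of::<T>();
}

/// Like `mem::align_of`, but also accepts the unsized types that implement `Aligned`.
pub const fn align_of<T: ?Sized + Aligned>() -> usize {
    T::ALIGN
}

// Hypothetical constants: the function is usable in `const` contexts.
pub const U32_ALIGN: usize = align_of::<u32>();
pub const U32_SLICE_ALIGN: usize = align_of::<[u32]>();
```

With this in place, `align_of::<[u32]>()` is accepted, whereas `mem::align_of::<[u32]>()` is rejected because `[u32]` is not `Sized`.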
2 changes: 2 additions & 0 deletions compiler/rustc_data_structures/src/lib.rs
@@ -29,6 +29,7 @@
#![feature(get_mut_unchecked)]
#![feature(lint_reasons)]
#![feature(unwrap_infallible)]
#![feature(strict_provenance)]
#![allow(rustc::default_hash_types)]
#![allow(rustc::potential_query_instability)]
#![deny(rustc::untranslatable_diagnostic)]
@@ -82,6 +83,7 @@ pub mod transitive_relation;
pub mod vec_linked_list;
pub mod work_queue;
pub use atomic_ref::AtomicRef;
pub mod aligned;
pub mod frozen;
pub mod owned_slice;
pub mod sso;
234 changes: 148 additions & 86 deletions compiler/rustc_data_structures/src/tagged_ptr.rs
@@ -3,166 +3,228 @@
//! In order to utilize the pointer packing, you must have two types: a pointer,
//! and a tag.
//!
//! The pointer must implement the `Pointer` trait, with the primary requirement
//! being conversion to and from a usize. Note that the pointer must be
//! dereferenceable, so raw pointers generally cannot implement the `Pointer`
//! trait. This implies that the pointer must also be nonzero.
//! The pointer must implement the [`Pointer`] trait, with the primary
//! requirement being convertible to and from a raw pointer. Note that the
//! pointer must be dereferenceable, so raw pointers generally cannot implement
//! the [`Pointer`] trait. This implies that the pointer must also be non-null.
//!
//! Many common pointer types already implement the `Pointer` trait.
//! Many common pointer types already implement the [`Pointer`] trait.
//!
//! The tag must implement the `Tag` trait. We assert that the tag and `Pointer`
//! are compatible at compile time.
//! The tag must implement the [`Tag`] trait.
//!
//! We assert that the tag and the [`Pointer`] types are compatible at compile
//! time.

use std::mem::ManuallyDrop;
use std::ops::Deref;
use std::ptr::NonNull;
use std::rc::Rc;
use std::sync::Arc;

use crate::aligned::Aligned;

mod copy;
mod drop;

pub use copy::CopyTaggedPtr;
pub use drop::TaggedPtr;

/// This describes the pointer type encapsulated by TaggedPtr.
/// This describes the pointer type encapsulated by [`TaggedPtr`] and
/// [`CopyTaggedPtr`].
///
/// # Safety
///
/// The usize returned from `into_usize` must be a valid, dereferenceable,
/// pointer to `<Self as Deref>::Target`. Note that pointers to `Pointee` must
/// be thin, even though `Pointee` may not be sized.
/// The pointer returned from [`into_ptr`] must be a [valid] pointer to
/// [`<Self as Deref>::Target`].
///
/// Note that if `Self` implements [`DerefMut`] the pointer returned from
/// [`into_ptr`] must be valid for writes (and thus calling [`NonNull::as_mut`]
/// on it must be safe).
///
/// Note that the returned pointer from `into_usize` should be castable to `&mut
/// <Self as Deref>::Target` if `Pointer: DerefMut`.
/// The [`BITS`] constant must be correct. At least the [`BITS`] least
/// significant bits must be zero on all pointers returned from [`into_ptr`].
///
/// The BITS constant must be correct. At least `BITS` bits, least-significant,
/// must be zero on all returned pointers from `into_usize`.
/// For example, if the alignment of [`Self::Target`] is 2, then `BITS` should be 1.
///
/// For example, if the alignment of `Pointee` is 2, then `BITS` should be 1.
/// [`BITS`]: Pointer::BITS
/// [`into_ptr`]: Pointer::into_ptr
/// [valid]: std::ptr#safety
/// [`<Self as Deref>::Target`]: Deref::Target
/// [`Self::Target`]: Deref::Target
/// [`DerefMut`]: std::ops::DerefMut
pub unsafe trait Pointer: Deref {
/// Number of unused (always zero) **least significant bits** in this
/// pointer, usually related to the pointee's alignment.
///
/// Most likely the value you want to use here is the following, unless
/// your Pointee type is unsized (e.g., `ty::List<T>` in rustc) in which
/// case you'll need to manually figure out what the right type to pass to
/// align_of is.
/// your [`Self::Target`] type is unsized (e.g., `ty::List<T>` in rustc)
/// or your pointer is over/under aligned, in which case you'll need to
/// manually figure out what the right type to pass to [`bits_for`] is, or
/// what value to set here.
///
/// ```ignore UNSOLVED (what to do about the Self)
/// ```rust
/// # use std::ops::Deref;
/// std::mem::align_of::<<Self as Deref>::Target>().trailing_zeros() as usize;
/// # use rustc_data_structures::tagged_ptr::bits_for;
/// # struct T;
/// # impl Deref for T { type Target = u8; fn deref(&self) -> &u8 { &0 } }
/// # impl T {
/// const BITS: usize = bits_for::<<Self as Deref>::Target>();
/// # }
/// ```
///
/// [`Self::Target`]: Deref::Target
const BITS: usize;
fn into_usize(self) -> usize;

/// # Safety
/// Turns this pointer into a raw, non-null pointer.
///
/// The passed `ptr` must be returned from `into_usize`.
/// The inverse of this function is [`from_ptr`].
///
/// This acts as `ptr::read` semantically, it should not be called more than
/// once on non-`Copy` `Pointer`s.
unsafe fn from_usize(ptr: usize) -> Self;
/// This function guarantees that the least-significant [`Self::BITS`] bits
/// are zero.
///
/// [`from_ptr`]: Pointer::from_ptr
/// [`Self::BITS`]: Pointer::BITS
fn into_ptr(self) -> NonNull<Self::Target>;

/// This provides a reference to the `Pointer` itself, rather than the
/// `Deref::Target`. It is used for cases where we want to call methods that
/// may be implement differently for the Pointer than the Pointee (e.g.,
/// `Rc::clone` vs cloning the inner value).
/// Re-creates the original pointer, from a raw pointer returned by [`into_ptr`].
///
/// # Safety
///
/// The passed `ptr` must be returned from `into_usize`.
unsafe fn with_ref<R, F: FnOnce(&Self) -> R>(ptr: usize, f: F) -> R;
/// The passed `ptr` must be returned from [`into_ptr`].
///
/// This acts as [`ptr::read::<Self>()`] semantically; it should not be called more than
/// once on non-[`Copy`] `Pointer`s.
///
/// [`into_ptr`]: Pointer::into_ptr
/// [`ptr::read::<Self>()`]: std::ptr::read
unsafe fn from_ptr(ptr: NonNull<Self::Target>) -> Self;
}
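
As a rough illustration of the `into_ptr`/`from_ptr` contract described above, the sketch below uses a cut-down stand-in for the trait (no `BITS` constant) with a `Box` impl that mirrors the one in this diff. `round_trip` is a hypothetical helper, not part of the PR.

```rust
use std::ops::Deref;
use std::ptr::NonNull;

/// Cut-down stand-in for the `Pointer` trait in the diff (the `BITS` constant is omitted).
pub unsafe trait Pointer: Deref {
    /// Gives up ownership and returns the raw, non-null pointer.
    fn into_ptr(self) -> NonNull<Self::Target>;

    /// # Safety
    ///
    /// `ptr` must come from `into_ptr`, and must not be reused afterwards
    /// for pointer types that are not `Copy`.
    unsafe fn from_ptr(ptr: NonNull<Self::Target>) -> Self;
}

unsafe impl<T> Pointer for Box<T> {
    fn into_ptr(self) -> NonNull<T> {
        // SAFETY: `Box::into_raw` never returns a null pointer.
        unsafe { NonNull::new_unchecked(Box::into_raw(self)) }
    }

    unsafe fn from_ptr(ptr: NonNull<T>) -> Self {
        // SAFETY: the caller promises `ptr` came from `into_ptr`,
        // which obtained it from `Box::into_raw`.
        unsafe { Box::from_raw(ptr.as_ptr()) }
    }
}

/// Hypothetical helper: round-trips a pointer through its raw form.
pub fn round_trip<P: Pointer>(p: P) -> P {
    let raw = p.into_ptr();
    // SAFETY: `raw` was just produced by `into_ptr` and is consumed exactly once.
    unsafe { P::from_ptr(raw) }
}
```

Because `from_ptr` behaves like `ptr::read`, calling it twice on the same raw pointer for a `Box` would create two owners and lead to a double free; the helper avoids this by consuming `raw` once.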

/// This describes tags that the `TaggedPtr` struct can hold.
/// This describes tags that the [`TaggedPtr`] struct can hold.
///
/// # Safety
///
/// The BITS constant must be correct.
/// The [`BITS`] constant must be correct.
///
/// No more than [`BITS`] least significant bits may be set in the returned usize.
///
/// No more than `BITS` least significant bits may be set in the returned usize.
/// [`BITS`]: Tag::BITS
pub unsafe trait Tag: Copy {
Review comment (Member):

I wonder whether it would be worth it to make this trait safe with an `assert!(value.into_usize() < (1 << BITS))`. That should be optimized away from all the uses I can imagine, although it would have to be tested.

Review comment (Member):

We could even assert this at compile time using extremely ugly beautiful hacks like https://godbolt.org/z/fobx1szMq, although I am not sure whether this would be a good idea (almost certainly not).
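
For concreteness, the runtime-check idea from the first comment could look roughly like the sketch below. `SafeTag`, `checked_tag_bits`, and `Flag` are hypothetical names that do not appear in this PR, and the sketch makes no claim about whether the assertions are actually optimized away.

```rust
/// Hypothetical safe variant of `Tag`: a wrong `BITS` value leads to a panic
/// instead of undefined behavior, because every conversion is checked.
pub trait SafeTag: Copy {
    const BITS: usize;
    fn into_usize(self) -> usize;
}

/// Returns the tag's integer value, panicking if it does not fit in `T::BITS` bits.
pub fn checked_tag_bits<T: SafeTag>(tag: T) -> usize {
    let value = tag.into_usize();
    // Guard the shift below against overflow.
    assert!(T::BITS < usize::BITS as usize, "`BITS` must be smaller than the pointer width");
    // The check suggested in the review comment.
    assert!(value < (1usize << T::BITS), "tag does not fit in `BITS` bits");
    value
}

/// Hypothetical one-bit tag used to exercise the check.
#[derive(Clone, Copy)]
pub struct Flag(pub bool);

impl SafeTag for Flag {
    const BITS: usize = 1;

    fn into_usize(self) -> usize {
        self.0 as usize
    }
}
```

The trade-off is the one the comment names: the trait no longer needs `unsafe`, at the cost of a check whose removal by the optimizer would have to be measured.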

/// Number of least-significant bits in the return value of [`into_usize`]
/// which may be non-zero. In other words this is the bit width of the
/// value.
///
/// [`into_usize`]: Tag::into_usize
const BITS: usize;

/// Turns this tag into an integer.
///
/// The inverse of this function is [`from_usize`].
///
/// This function guarantees that only the least-significant [`Self::BITS`]
/// bits can be non-zero.
///
/// [`from_usize`]: Tag::from_usize
/// [`Self::BITS`]: Tag::BITS
fn into_usize(self) -> usize;

/// Re-creates the tag from the integer returned by [`into_usize`].
///
/// # Safety
///
/// The passed `tag` must be returned from `into_usize`.
/// The passed `tag` must be returned from [`into_usize`].
///
/// [`into_usize`]: Tag::into_usize
unsafe fn from_usize(tag: usize) -> Self;
}
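
A minimal sketch of a `Tag` implementation that satisfies the safety contract above, assuming a local copy of the trait so the snippet compiles on its own. The `Mode` enum is hypothetical and chosen only because its discriminants fit in two bits.

```rust
/// Local copy of the `Tag` trait from the diff, so the sketch compiles on its own.
pub unsafe trait Tag: Copy {
    const BITS: usize;
    fn into_usize(self) -> usize;
    unsafe fn from_usize(tag: usize) -> Self;
}

/// Hypothetical tag with three states, which needs two bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Mode {
    Read = 0b00,
    Write = 0b01,
    Exec = 0b10,
}

// SAFETY: every discriminant is below `1 << 2`, and `from_usize` inverts `into_usize`.
unsafe impl Tag for Mode {
    const BITS: usize = 2;

    fn into_usize(self) -> usize {
        self as usize
    }

    unsafe fn from_usize(tag: usize) -> Self {
        match tag {
            0b00 => Mode::Read,
            0b01 => Mode::Write,
            0b10 => Mode::Exec,
            // SAFETY: the caller guarantees `tag` came from `into_usize`,
            // so no other value can reach this arm.
            _ => unsafe { std::hint::unreachable_unchecked() },
        }
    }
}
```

A pointer type can carry this tag only if its own `BITS` is at least 2, that is, if the pointee is aligned to 4 bytes or more.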

unsafe impl<T> Pointer for Box<T> {
const BITS: usize = std::mem::align_of::<T>().trailing_zeros() as usize;
unsafe impl<T: ?Sized + Aligned> Pointer for Box<T> {
const BITS: usize = bits_for::<Self::Target>();

#[inline]
fn into_usize(self) -> usize {
Box::into_raw(self) as usize
fn into_ptr(self) -> NonNull<T> {
// Safety: pointers from `Box::into_raw` are valid & non-null
unsafe { NonNull::new_unchecked(Box::into_raw(self)) }
}

#[inline]
unsafe fn from_usize(ptr: usize) -> Self {
Box::from_raw(ptr as *mut T)
}
unsafe fn with_ref<R, F: FnOnce(&Self) -> R>(ptr: usize, f: F) -> R {
let raw = ManuallyDrop::new(Self::from_usize(ptr));
f(&raw)
unsafe fn from_ptr(ptr: NonNull<T>) -> Self {
// Safety: `ptr` comes from `into_ptr` which calls `Box::into_raw`
Box::from_raw(ptr.as_ptr())
}
}

unsafe impl<T> Pointer for Rc<T> {
const BITS: usize = std::mem::align_of::<T>().trailing_zeros() as usize;
unsafe impl<T: ?Sized + Aligned> Pointer for Rc<T> {
const BITS: usize = bits_for::<Self::Target>();

#[inline]
fn into_usize(self) -> usize {
Rc::into_raw(self) as usize
fn into_ptr(self) -> NonNull<T> {
// Safety: pointers from `Rc::into_raw` are valid & non-null
unsafe { NonNull::new_unchecked(Rc::into_raw(self).cast_mut()) }
}

#[inline]
unsafe fn from_usize(ptr: usize) -> Self {
Rc::from_raw(ptr as *const T)
}
unsafe fn with_ref<R, F: FnOnce(&Self) -> R>(ptr: usize, f: F) -> R {
let raw = ManuallyDrop::new(Self::from_usize(ptr));
f(&raw)
unsafe fn from_ptr(ptr: NonNull<T>) -> Self {
// Safety: `ptr` comes from `into_ptr` which calls `Rc::into_raw`
Rc::from_raw(ptr.as_ptr())
}
}

unsafe impl<T> Pointer for Arc<T> {
const BITS: usize = std::mem::align_of::<T>().trailing_zeros() as usize;
unsafe impl<T: ?Sized + Aligned> Pointer for Arc<T> {
const BITS: usize = bits_for::<Self::Target>();

#[inline]
fn into_usize(self) -> usize {
Arc::into_raw(self) as usize
fn into_ptr(self) -> NonNull<T> {
// Safety: pointers from `Arc::into_raw` are valid & non-null
unsafe { NonNull::new_unchecked(Arc::into_raw(self).cast_mut()) }
}

#[inline]
unsafe fn from_usize(ptr: usize) -> Self {
Arc::from_raw(ptr as *const T)
}
unsafe fn with_ref<R, F: FnOnce(&Self) -> R>(ptr: usize, f: F) -> R {
let raw = ManuallyDrop::new(Self::from_usize(ptr));
f(&raw)
unsafe fn from_ptr(ptr: NonNull<T>) -> Self {
// Safety: `ptr` comes from `into_ptr` which calls `Arc::into_raw`
Arc::from_raw(ptr.as_ptr())
}
}

unsafe impl<'a, T: 'a> Pointer for &'a T {
const BITS: usize = std::mem::align_of::<T>().trailing_zeros() as usize;
unsafe impl<'a, T: 'a + ?Sized + Aligned> Pointer for &'a T {
const BITS: usize = bits_for::<Self::Target>();

#[inline]
fn into_usize(self) -> usize {
self as *const T as usize
fn into_ptr(self) -> NonNull<T> {
NonNull::from(self)
}

#[inline]
unsafe fn from_usize(ptr: usize) -> Self {
&*(ptr as *const T)
}
unsafe fn with_ref<R, F: FnOnce(&Self) -> R>(ptr: usize, f: F) -> R {
f(&*(&ptr as *const usize as *const Self))
unsafe fn from_ptr(ptr: NonNull<T>) -> Self {
// Safety:
// `ptr` comes from `into_ptr` which gets the pointer from a reference
ptr.as_ref()
}
}

unsafe impl<'a, T: 'a> Pointer for &'a mut T {
const BITS: usize = std::mem::align_of::<T>().trailing_zeros() as usize;
unsafe impl<'a, T: 'a + ?Sized + Aligned> Pointer for &'a mut T {
const BITS: usize = bits_for::<Self::Target>();

#[inline]
fn into_usize(self) -> usize {
self as *mut T as usize
fn into_ptr(self) -> NonNull<T> {
NonNull::from(self)
}

#[inline]
unsafe fn from_usize(ptr: usize) -> Self {
&mut *(ptr as *mut T)
}
unsafe fn with_ref<R, F: FnOnce(&Self) -> R>(ptr: usize, f: F) -> R {
f(&*(&ptr as *const usize as *const Self))
unsafe fn from_ptr(mut ptr: NonNull<T>) -> Self {
// Safety:
// `ptr` comes from `into_ptr` which gets the pointer from a reference
ptr.as_mut()
}
}

/// Returns the number of bits available for use for tags in a pointer to `T`
/// (this is based on `T`'s alignment).
pub const fn bits_for<T: ?Sized + Aligned>() -> usize {
let bits = crate::aligned::align_of::<T>().trailing_zeros();

// This is a replacement for `.try_into().unwrap()`, which is unavailable in `const`
// (it's fine to make an assert here, since this is only called at compile time)
assert!((bits as u128) < usize::MAX as u128);

bits as usize
}
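
To show what the bit count from `bits_for` buys, the sketch below packs a tag into the unused low bits of a raw pointer with the strict-provenance method `map_addr`. It is a simplified illustration, not necessarily the layout this PR uses: `bits_for` is reduced to sized types, `pack` and `unpack` are hypothetical helpers, and the code assumes Rust 1.84 or later, where `addr` and `map_addr` are stable on raw pointers.

```rust
use std::mem;

/// Simplified `bits_for` for sized types only (the version in the diff also
/// accepts unsized types through the `Aligned` trait).
pub const fn bits_for<T>() -> usize {
    mem::align_of::<T>().trailing_zeros() as usize
}

/// Hypothetical helper: stores `tag` in the low bits of an aligned `ptr`.
pub fn pack<T>(ptr: *mut T, tag: usize) -> *mut T {
    let mask = (1usize << bits_for::<T>()) - 1;
    assert_eq!(ptr.addr() & mask, 0, "pointer is not aligned");
    assert!(tag <= mask, "tag needs more bits than the alignment frees up");
    // `map_addr` changes only the address; the provenance of `ptr` is kept,
    // which is what makes this compatible with strict provenance.
    ptr.map_addr(|addr| addr | tag)
}

/// Hypothetical helper: splits a packed pointer back into pointer and tag.
pub fn unpack<T>(packed: *mut T) -> (*mut T, usize) {
    let mask = (1usize << bits_for::<T>()) - 1;
    let ptr = packed.map_addr(|addr| addr & !mask);
    let tag = packed.addr() & mask;
    (ptr, tag)
}
```

The difference from the older `as usize` round trip is that the pointer is never converted to an integer and back, so tools such as Miri can keep tracking which allocation it belongs to.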