-
Notifications
You must be signed in to change notification settings - Fork 120
[WIP] Nested Virtualization: KVM_GET_NESTED_STATE and KVM_SET_NESTED_STATE #322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
faa8422
to
a24356b
Compare
Heya!
If I'm reading the kernel code correctly, then since commit 6ca00dfafda7 ("KVM: x86: Modify struct kvm_nested_state to have explicit fields for data") this isn't actually true anymore: The u8 flexible array member that kvm_nested_state used to have got removed and replaced with a strongly typed union field with a statically known size. The only reason it was declared as a zero-sized member was backwards compatibility with Linux <5.2. But here we don't really need to worry about that, so we can just have the ioctl wrappers take cc @bonzini who probably knows more about this than me though |
Please note that kvm_nested_state from the current bindgen is too small and only contains the "header". Various checks in my code verify this. The actual size of the nested state also differs between VMX and SVM, I also verified this on my developer machine. The effective size can be verified during runtime using My introduced helper type solves this using stack allocated memory. But for SVM, we waste a page every time as the SVM state is smaller. |
argh, you're right, the bindgen version is indeed rubbish, hadnt actually looked at bindings.rs.
yes this is all fine - I misinterpreted your PR description, sorry. I thought you were treating the kvm_nested_state struct as having a flexible array member (I got confused by the &[u8] conversions). As for the serde thingy, I think the diff below is more in line with how serde support for other structures is handled in this crate, and allows the ioctl wrappers to be implemented in terms diff --git a/kvm-bindings/src/x86_64/bindings.rs b/kvm-bindings/src/x86_64/bindings.rs
index d49ee9a4f4f3..248b4a8d76ac 100644
--- a/kvm-bindings/src/x86_64/bindings.rs
+++ b/kvm-bindings/src/x86_64/bindings.rs
@@ -1999,6 +1999,10 @@ impl Default for kvm_vmx_nested_state_data {
}
#[repr(C)]
#[derive(Debug, Default, Copy, Clone, PartialEq)]
+#[cfg_attr(
+ feature = "serde",
+ derive(zerocopy::IntoBytes, zerocopy::Immutable, zerocopy::FromBytes)
+)]
pub struct kvm_vmx_nested_state_hdr {
pub vmxon_pa: __u64,
pub vmcs12_pa: __u64,
@@ -2009,6 +2013,10 @@ pub struct kvm_vmx_nested_state_hdr {
}
#[repr(C)]
#[derive(Debug, Default, Copy, Clone, PartialEq)]
+#[cfg_attr(
+ feature = "serde",
+ derive(zerocopy::IntoBytes, zerocopy::Immutable, zerocopy::FromBytes)
+)]
pub struct kvm_vmx_nested_state_hdr__bindgen_ty_1 {
pub flags: __u16,
}
@@ -2065,6 +2073,10 @@ impl Default for kvm_svm_nested_state_data {
}
#[repr(C)]
#[derive(Debug, Default, Copy, Clone, PartialEq)]
+#[cfg_attr(
+ feature = "serde",
+ derive(zerocopy::IntoBytes, zerocopy::Immutable, zerocopy::FromBytes)
+)]
pub struct kvm_svm_nested_state_hdr {
pub vmcb_pa: __u64,
}
@@ -2087,6 +2099,10 @@ pub struct kvm_nested_state {
}
#[repr(C)]
#[derive(Copy, Clone)]
+#[cfg_attr(
+ feature = "serde",
+ derive(zerocopy::Immutable, zerocopy::FromBytes)
+)]
pub union kvm_nested_state__bindgen_ty_1 {
pub vmx: kvm_vmx_nested_state_hdr,
pub svm: kvm_svm_nested_state_hdr,
diff --git a/kvm-bindings/src/x86_64/mod.rs b/kvm-bindings/src/x86_64/mod.rs
index 08e1bab89f26..9d9b69150fdc 100644
--- a/kvm-bindings/src/x86_64/mod.rs
+++ b/kvm-bindings/src/x86_64/mod.rs
@@ -9,6 +9,8 @@ pub mod fam_wrappers;
#[cfg(feature = "serde")]
mod serialize;
+pub mod nested;
+
pub use self::bindings::*;
#[cfg(feature = "fam-wrappers")]
pub use self::fam_wrappers::*;
diff --git a/kvm-bindings/src/x86_64/nested.rs b/kvm-bindings/src/x86_64/nested.rs
new file mode 100644
index 000000000000..1298984f7538
--- /dev/null
+++ b/kvm-bindings/src/x86_64/nested.rs
@@ -0,0 +1,59 @@
+use ::{kvm_nested_state__bindgen_ty_1, KVM_STATE_NESTED_VMX_VMCS_SIZE};
+use KVM_STATE_NESTED_SVM_VMCB_SIZE;
+
+
+#[derive(Clone, Copy)]
+#[cfg_attr(
+ feature = "serde",
+ derive(zerocopy::Immutable, zerocopy::FromBytes)
+)]
+#[repr(C)]
+pub union kvm_nested_state__data {
+ pub vmx: kvm_vmx_nested_state_data,
+ pub svm: kvm_svm_nested_state_data,
+}
+
+impl Default for kvm_nested_state__data {
+ fn default() -> Self {
+ let mut s = ::std::mem::MaybeUninit::<Self>::uninit();
+ unsafe {
+ ::std::ptr::write_bytes(s.as_mut_ptr(), 0, 1);
+ s.assume_init()
+ }
+ }
+}
+
+#[derive(Clone, Copy)]
+#[cfg_attr(
+ feature = "serde",
+ derive(zerocopy::IntoBytes, zerocopy::Immutable, zerocopy::FromBytes)
+)]
+#[repr(C)]
+pub struct kvm_vmx_nested_state_data {
+ pub vmcs12: [u8; KVM_STATE_NESTED_VMX_VMCS_SIZE as usize],
+ pub shadow_vmcs12: [u8; KVM_STATE_NESTED_VMX_VMCS_SIZE as usize],
+}
+
+#[derive(Clone, Copy)]
+#[cfg_attr(
+ feature = "serde",
+ derive(zerocopy::IntoBytes, zerocopy::Immutable, zerocopy::FromBytes)
+)]
+#[repr(C)]
+pub struct kvm_svm_nested_state_data {
+ pub vmcb12: [u8; KVM_STATE_NESTED_SVM_VMCB_SIZE as usize],
+}
+
+#[derive(Clone, Copy, Default)]
+#[cfg_attr(
+ feature = "serde",
+ derive(zerocopy::IntoBytes, zerocopy::Immutable, zerocopy::FromBytes)
+)]
+#[repr(C)]
+pub struct kvm_nested_state {
+ pub flags: u16,
+ pub format: u16,
+ pub size: u32,
+ pub hdr: kvm_nested_state__bindgen_ty_1,
+ pub data: kvm_nested_state__data,
+}
diff --git a/kvm-bindings/src/x86_64/serialize.rs b/kvm-bindings/src/x86_64/serialize.rs
index 11d90d533936..e0e109e15cd7 100644
--- a/kvm-bindings/src/x86_64/serialize.rs
+++ b/kvm-bindings/src/x86_64/serialize.rs
@@ -10,6 +10,8 @@ use bindings::{
use fam_wrappers::kvm_xsave2;
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use zerocopy::{transmute, FromBytes, FromZeros, Immutable, IntoBytes};
+use kvm_nested_state__bindgen_ty_1;
+use nested::{kvm_nested_state, kvm_nested_state__data};
serde_impls!(
kvm_regs,
@@ -31,7 +33,8 @@ serde_impls!(
kvm_cpuid2,
kvm_xsave,
kvm_xsave2,
- kvm_irqchip
+ kvm_irqchip,
+ kvm_nested_state
);
// SAFETY: zerocopy's derives explicitly disallow deriving for unions where
@@ -94,10 +97,35 @@ unsafe impl Immutable for kvm_irqchip__bindgen_ty_1 {
}
}
+// SAFETY: zerocopy's derives explicitly disallow deriving for unions where
+// the fields have different sizes, due to the smaller fields having padding.
+// Miri however does not complain about these implementations (e.g. about
+// reading the "padding" for one union field as valid data for a bigger one)
+unsafe impl IntoBytes for kvm_nested_state__bindgen_ty_1 {
+ fn only_derive_is_allowed_to_implement_this_trait()
+ where
+ Self: Sized
+ {
+ }
+}
+
+// SAFETY: zerocopy's derives explicitly disallow deriving for unions where
+// the fields have different sizes, due to the smaller fields having padding.
+// Miri however does not complain about these implementations (e.g. about
+// reading the "padding" for one union field as valid data for a bigger one)
+unsafe impl IntoBytes for kvm_nested_state__data {
+ fn only_derive_is_allowed_to_implement_this_trait()
+ where
+ Self: Sized
+ {
+ }
+}
+
#[cfg(test)]
mod tests {
use super::*;
use bindings::*;
+ use nested;
fn is_serde<T: Serialize + for<'de> Deserialize<'de> + Default>() {
let config = bincode::config::standard();
@@ -152,6 +180,7 @@ mod tests {
is_serde::<kvm_xcrs>();
is_serde::<kvm_irqchip>();
is_serde::<kvm_mp_state>();
+ is_serde::<nested::kvm_nested_state>();
}
fn is_serde_json<T: Serialize + for<'de> Deserialize<'de> + Default>() {
@@ -184,5 +213,6 @@ mod tests {
is_serde_json::<kvm_xcrs>();
is_serde_json::<kvm_irqchip>();
is_serde_json::<kvm_mp_state>();
+ is_serde_json::<nested::kvm_nested_state>();
}
}
|
Co-authored-by: Patrick Roy <[email protected]> Co-authored-by: Philipp Schuster <[email protected]> Signed-off-by: Philipp Schuster <[email protected]>
More info: https://docs.kernel.org/virt/kvm/api.html#capabilities-that-can-be-enabled-on-vcpus Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
7b3a492
to
ee6c962
Compare
Thanks for your guidance, I gave it another shot. I think the outcome is quite nice. Thoughts? |
This type is a helper, making the use of get_nested_state() and set_nested_state(), which are added in the next commit, much more convenient. Note that this type's name uses UpperCamelCase as it is not just a plain old data type but actually contains some logic: the `size` field is properly initialized. Effectively, KVM expects a dynamic buffer with a header reporting the size to either store the nested state in or load it from. As such data structures with a certain alignment are challenging to work with (note that Vec<u8> always have an alignment of 1 but we need 4), this type sacrifices a little memory overhead in some cases for better UX. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
These calls are relevant for live-migration and state save/resume when nested virtualization is used. I tested everything with a nested guest in Cloud Hypervisor, but these patches are not yet upstream. Signed-off-by: Philipp Schuster <[email protected]> On-behalf-of: SAP [email protected]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, this looks good to me! Could you just squash the commits a little bit so that we dont first introduce struct kvm_nested_state and then modify it in two follow up commits to rename / add the non-exhaustiveness?
// Prevent constructor bypass in public API. | ||
_priv: core::marker::PhantomData<()>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can use #[non_exhaustive]
on the struct for this?
@@ -18,6 +18,8 @@ extern crate serde; | |||
#[cfg(feature = "serde")] | |||
extern crate zerocopy; | |||
|
|||
extern crate core; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this? :o
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because kvm-bindings
still uses Rust edition 2015; needed to access core::mem
.
Strictly speaking, the crate doesn't specify any Rust edition and falls backto 2015
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whew. I'll submit a PR to get us onto edition 2024 tmrw, no reason for us to be on 2015 except that no one ever bothered to update it I guess (we dont really have a MSRV policy around here either)
Summary of the PR
For live-migration and state save/resume in Cloud Hypervisor [PR] and other VMMs using nested virtualization, the relevant IOCTLs were missing so far to get the state from/into KVM.
Handling nested state is a little more complex than anticipated due to the "dynamic buffer" (structure with header reporting the length) that KVM uses in its interface. Creating a nice, safe, and easy interface took multiple iterations.. To mitigate this, my solution is a helper type acting as stack buffer that is good enough for most use-cases. To provide users more flexibility, the new functions consumes
AsRef/AsMut<[u8]>
, in case a user wants to use a type that isserde
compatible over the newKvmNestedStateBuffer
.Hints for Reviewers
Steps to Undraft
Requirements
Before submitting your PR, please make sure you addressed the following
requirements:
git commit -s
), and the commit message has max 60 characters for thesummary and max 75 characters for each description line.
test.
Release" section of CHANGELOG.md (if no such section exists, please create one).
unsafe
code is properly documented.