Skip to content

How to handle Kunit killing the current thread on assertion failures? #759

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
bjorn3 opened this issue May 5, 2022 · 3 comments
Open
Labels
• lib Related to the `rust/` library.

Comments

@bjorn3
Copy link
Member

bjorn3 commented May 5, 2022

Documentation tests currently use kunit_do_failed_assertion which kills the current thread without unwinding the stack. This is UB in Rust as it deallocates the stack even if there may still be references to it. The first case where this can happen is in case of pinned stack variables. Pin requires that the value has it's Drop impl invoked before the memory is reused. This explicitly allows for example intrusive data structures. The second case is with scoped threads where child threads can reference variables on the stack of the parent thread. In this case the parent thread must wait for the completion of the child threads before returning from the current function. Exiting the current thread forcefully doesn't give the scoped thread implementation the chance of doing this. And finally there may be other unsafe code that temporarily puts references to variables on the stack of the current thread in global places. Yesterday me and @ojeda discussed in private how we should handle this. We came up with several options:

Keep aborting the current thread

This option is based on two observations:

  • Kunit is explicitly not meant for production, so UB is not as catastrophic as it would normally be.
  • Kunit prints the test failure before exiting the current thread. This means that even if everything goes wrong later on, at the test suite still reports at least one failure if there are any failures. The results of later tests may not be reliable anymore, but that is not a big deal.

We could make this option less likely to lead to crashes by avoiding pinning of stack variables.

Unwind the stack on test failures

In this option we would do panic!() in case of test failures and then unwind the stack. This option is free of any UB on the rust side and is the option how rust's builtin test framework works. The disadvantage of this option is that it requires adding a full DWARF unwinder in kernel space. Furthermore this depends on the currently unstable extern "C-unwind" abi to initiate the unwinding. It also gives some questions around how to handle unwinding past C kernel code.

Require all functions using the Kunit assertion macros to return TestResult<T>

In this option the macros would return Err(TestFailure) after reporting the failure allowing the failure to bubble up like a regular error. The advantage of this option is that it is simple to implement. The disadvantage is that it clutters the code a bit, which is not all that nice, especially for doctests.

@ojeda ojeda added • lib Related to the `rust/` library. prio: normal labels May 5, 2022
@sulix
Copy link

sulix commented May 10, 2022

I'm not sure if there's anything you could do with KUnit resources — for instance, it's possible to use kunit_add_resource to execute some code when the test exits, which should run even if the thread is killed via a failed assertion.

In addition, KUnit resource allocations form a stack: they're guaranteed to run in the opposite order that they were created, which could make them useful for stack unwinding.

I suspect something based on this would take a lot of work to implement, and even then only partly solve the problem. But it might prove useful somewhere…

@bjorn3
Copy link
Member Author

bjorn3 commented May 10, 2022

Interesting. That would be useful to for example prevent deadlocks due to locks not being unlocked. I think it is not possible to create a watertight unwinding implementation without compiler support to automatically register destructors akin to SjLj unwinding though.

@DemiMarie
Copy link

I don’t think this is actually specific to Kunit. Any Rust panic or C BUG() call will also trigger this, which makes this a soundness problem for the bindings.

There are a few solutions I can think of:

  • Call Rust destructors during an oops. Linux already has a built-in unwinder, but I am not sure if it is sufficient for this purpose. Also, the destructors could make things worse, by interacting with inconsistent C-side state.
  • Require panic_on_oops. This is unlikely to be viable as it has already been NAKd by Linus.
  • Leak the current thread on BUG(): do not deallocate the kernel stack, and pretend as if the thread is stuck in an infinite loop. Userspace would not be impacted.
  • Leak the current thread if there are any Rust destructors that would be skipped. Not sure if this is a viable option.

Darksonn pushed a commit to Darksonn/linux that referenced this issue Jan 17, 2025
…king

With CONFIG_DMA_API_DEBUG enabled, the following warning is observed:

DMA-API: snd_hda_intel 0000:03:00.1: device driver failed to check map error[device address=0x00000000ffff0000] [size=20480 bytes] [mapped as single]
WARNING: CPU: 28 PID: 2255 at kernel/dma/debug.c:1036 check_unmap+0x1408/0x2430
CPU: 28 UID: 42 PID: 2255 Comm: wireplumber Tainted: G  W L  6.12.0-10-133577cad6bf48e5a7848c4338124081393bfe8a+ Rust-for-Linux#759
debug_dma_unmap_page+0xe9/0xf0
snd_dma_wc_free+0x85/0x130 [snd_pcm]
snd_pcm_lib_free_pages+0x1e3/0x440 [snd_pcm]
snd_pcm_common_ioctl+0x1c9a/0x2960 [snd_pcm]
snd_pcm_ioctl+0x6a/0xc0 [snd_pcm]
...

Check for returned DMA addresses using specialized dma_mapping_error()
helper which is generally recommended for this purpose by
Documentation/core-api/dma-api.rst.

Fixes: c880a51 ("ALSA: memalloc: Use proper DMA mapping API for x86 WC buffer allocations")
Reported-by: Mikhail Gavrilov <[email protected]>
Closes: https://lore.kernel.org/r/CABXGCsNB3RsMGvCucOy3byTEOxoc-Ys+zB_HQ=Opb_GhX1ioDA@mail.gmail.com/
Tested-by: Mikhail Gavrilov <[email protected]>
Signed-off-by: Fedor Pchelkin <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
• lib Related to the `rust/` library.
Development

No branches or pull requests

4 participants