Commit e2a1cda

danvet authored and Jocelyn Falempe committed
drm/panic: Add drm panic locking
Rough sketch for the locking of drm panic printing code. The upshot of this approach is that we can pretty much entirely rely on the atomic commit flow, with the pair of raw_spin_lock/unlock providing any barriers we need, without having to create really big critical sections in code.

This also means drivers do not have to explicitly update the panic handler state, which they might forget to do, or not do consistently, and then we blow up at the worst possible time.

It is somewhat racy against a concurrent atomic update, and we might write into a buffer which the hardware will never display. But there's fundamentally no way to avoid that: if we do the panic state update explicitly after writing to the hardware, we might instead write to an old buffer that the user will barely ever see.

Note that an rcu protected dereference of plane->state would give us the same guarantees, but it has the downside that we then need to protect the plane state freeing functions with call_rcu too. That would very widely impact a lot of code and therefore doesn't seem worth the complexity compared to a raw spinlock with very tiny critical sections. Plus rcu cannot be used to protect access to peek/poke registers anyway, so we'd still need the spinlock for those cases.

Peek/poke registers for vram access (or a gart pte reserved just for panic code) are also the reason I've gone with a per-device and not per-plane spinlock, since usually these things are global for the entire display. Going with per-plane locks would mean drivers for such hardware would need additional locks, which we don't want, since it deviates from the per-console takeover locks design.

Longer term it might be useful if the panic notifiers grow a bit more structure than just the absolute bare EXPORT_SYMBOL(panic_notifier_list) - somewhat aside, why is that not EXPORT_SYMBOL_GPL ...
If panic notifiers were more like console drivers with proper register/unregister interfaces we could perhaps reuse the very fancy console lock with all its check and takeover semantics that John Ogness is developing to fix the console_lock mess. But for the initial cut of drm panic printing support I don't think we need that, because the critical sections are extremely small and only happen once per display refresh. So generally just 60 tiny locked sections per second, which is nothing compared to a serial console running at 115kbaud doing really slow mmio writes for each byte. So for now the raw spin_trylock in the drm panic notifier callback should be good enough.

Another benefit of making panic notifiers more like full blown consoles (that are used in panics only) would be that we get the two stage design, where first all the safe outputs are used, and only then the dangerous takeover tricks are deployed (where for display drivers we also might try to intercept any in-flight display buffer flips, which if we race and misprogram fifos and watermarks can hang the memory controller on some hw).

For context the actual implementation on the drm side is by Jocelyn and this patch is meant to be combined with the overall approach in v7 (v8 is a bit less flexible, which I think is the wrong direction):

https://lore.kernel.org/dri-devel/[email protected]/

Note that the locking is very much not correct there, hence this separate rfc.

Starting from v10, I (Jocelyn) have included this patch in the drm_panic series, and made the corresponding changes.
v2:
- fix authorship, this was all my typing
- some typo oopsies
- link to the drm panic work by Jocelyn for context

v10:
- Use spinlock_irqsave/restore (John Ogness)

v11:
- Use macro instead of inline functions for drm_panic_lock/unlock (John Ogness)

Signed-off-by: Daniel Vetter <[email protected]>
Cc: Jocelyn Falempe <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: "Peter Zijlstra (Intel)" <[email protected]>
Cc: Lukas Wunner <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: John Ogness <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: Maxime Ripard <[email protected]>
Cc: Thomas Zimmermann <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Daniel Vetter <[email protected]>
Signed-off-by: Jocelyn Falempe <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Acked-by: Daniel Vetter <[email protected]>
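To make the scheme in the commit message easier to follow, here is a minimal userspace sketch. It is not kernel code: a C11 atomic boolean stands in for the raw spinlock, there is no interrupt handling, and every `model_*` name is hypothetical. It only illustrates the two roles described above: the commit path takes the lock around the plane state pointer swap, and the panic path uses a trylock and gives up when the lock is contended.

```c
/* Userspace model only: C11 atomics stand in for the kernel raw spinlock. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_plane_state {
	int fb_id;				/* stands in for the pinned framebuffer */
};

struct model_device {
	atomic_bool panic_lock;			/* models mode_config.panic_lock */
	struct model_plane_state *plane_state;	/* models plane->state */
};

static void model_device_init(struct model_device *dev,
			      struct model_plane_state *initial)
{
	atomic_init(&dev->panic_lock, false);
	dev->plane_state = initial;
}

static bool model_panic_trylock(struct model_device *dev)
{
	/* exchange returns the previous value: false means we took the lock */
	return !atomic_exchange_explicit(&dev->panic_lock, true,
					 memory_order_acquire);
}

static void model_panic_lock(struct model_device *dev)
{
	while (!model_panic_trylock(dev))
		;	/* spin until the holder releases the lock */
}

static void model_panic_unlock(struct model_device *dev)
{
	atomic_store_explicit(&dev->panic_lock, false, memory_order_release);
}

/* Commit path: the critical section is nothing more than the pointer swap. */
static struct model_plane_state *
model_swap_state(struct model_device *dev, struct model_plane_state *new_state)
{
	struct model_plane_state *old;

	model_panic_lock(dev);
	old = dev->plane_state;
	dev->plane_state = new_state;
	model_panic_unlock(dev);
	return old;
}

/* Panic path: never spins. Returns the fb id, or -1 if the lock is busy. */
static int model_panic_print(struct model_device *dev)
{
	int fb_id;

	if (!model_panic_trylock(dev))
		return -1;
	fb_id = dev->plane_state->fb_id;	/* safe to dereference here */
	model_panic_unlock(dev);
	return fb_id;
}

/* Small walk-through of the three situations. */
static int model_demo(void)
{
	struct model_plane_state old_state = { .fb_id = 1 };
	struct model_plane_state new_state = { .fb_id = 2 };
	struct model_device dev;

	model_device_init(&dev, &old_state);
	assert(model_panic_print(&dev) == 1);	/* before the swap */

	model_swap_state(&dev, &new_state);
	assert(model_panic_print(&dev) == 2);	/* after the swap */

	model_panic_lock(&dev);			/* pretend a commit is mid-swap */
	assert(model_panic_print(&dev) == -1);	/* panic printing is aborted */
	model_panic_unlock(&dev);
	return 0;
}
```

Calling `model_demo()` from a `main()` walks through the uncontended and contended cases. The model leaves out the race the commit message accepts: the panic path may still draw into a buffer that the hardware is about to stop scanning out.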
1 parent 96a9151 commit e2a1cda

4 files changed, 110 additions and 0 deletions

drivers/gpu/drm/drm_atomic_helper.c

Lines changed: 4 additions & 0 deletions
@@ -38,6 +38,7 @@
 #include <drm/drm_drv.h>
 #include <drm/drm_framebuffer.h>
 #include <drm/drm_gem_atomic_helper.h>
+#include <drm/drm_panic.h>
 #include <drm/drm_print.h>
 #include <drm/drm_self_refresh_helper.h>
 #include <drm/drm_vblank.h>
@@ -3016,6 +3017,7 @@ int drm_atomic_helper_swap_state(struct drm_atomic_state *state,
 				  bool stall)
 {
 	int i, ret;
+	unsigned long flags;
 	struct drm_connector *connector;
 	struct drm_connector_state *old_conn_state, *new_conn_state;
 	struct drm_crtc *crtc;
@@ -3099,6 +3101,7 @@ int drm_atomic_helper_swap_state(struct drm_atomic_state *state,
 		}
 	}
 
+	drm_panic_lock(state->dev, flags);
 	for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
 		WARN_ON(plane->state != old_plane_state);
 
@@ -3108,6 +3111,7 @@ int drm_atomic_helper_swap_state(struct drm_atomic_state *state,
 		state->planes[i].state = old_plane_state;
 		plane->state = new_plane_state;
 	}
+	drm_panic_unlock(state->dev, flags);
 
 	for_each_oldnew_private_obj_in_state(state, obj, old_obj_state, new_obj_state, i) {
 		WARN_ON(obj->state != old_obj_state);

drivers/gpu/drm/drm_drv.c

Lines changed: 1 addition & 0 deletions
@@ -638,6 +638,7 @@ static int drm_dev_init(struct drm_device *dev,
 	mutex_init(&dev->filelist_mutex);
 	mutex_init(&dev->clientlist_mutex);
 	mutex_init(&dev->master_mutex);
+	raw_spin_lock_init(&dev->mode_config.panic_lock);
 
 	ret = drmm_add_action_or_reset(dev, drm_dev_init_release, NULL);
 	if (ret)

include/drm/drm_mode_config.h

Lines changed: 10 additions & 0 deletions
@@ -505,6 +505,16 @@ struct drm_mode_config {
 	 */
 	struct list_head plane_list;
 
+	/**
+	 * @panic_lock:
+	 *
+	 * Raw spinlock used to protect critical sections of code that access
+	 * the display hardware or modeset software state, which the panic
+	 * printing code must be protected against. See drm_panic_trylock(),
+	 * drm_panic_lock() and drm_panic_unlock().
+	 */
+	struct raw_spinlock panic_lock;
+
 	/**
 	 * @num_crtc:
 	 *

include/drm/drm_panic.h

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
+/* SPDX-License-Identifier: GPL-2.0 or MIT */
+#ifndef __DRM_PANIC_H__
+#define __DRM_PANIC_H__
+
+#include <drm/drm_device.h>
+/*
+ * Copyright (c) 2024 Intel
+ */
+
+/**
+ * drm_panic_trylock - try to enter the panic printing critical section
+ * @dev: struct drm_device
+ * @flags: unsigned long irq flags you need to pass to the unlock() counterpart
+ *
+ * This function must be called by any panic printing code. The panic printing
+ * attempt must be aborted if the trylock fails.
+ *
+ * Panic printing code can make the following assumptions while holding the
+ * panic lock:
+ *
+ * - Anything protected by drm_panic_lock() and drm_panic_unlock() pairs is safe
+ *   to access.
+ *
+ * - Furthermore the panic printing code only registers in drm_dev_register()
+ *   and gets removed in drm_dev_unregister(). This allows the panic code to
+ *   safely access any state which is invariant in between these two function
+ *   calls, like the list of planes &drm_mode_config.plane_list or most of the
+ *   struct drm_plane structure.
+ *
+ * Specifically thanks to the protection around plane updates in
+ * drm_atomic_helper_swap_state() the following additional guarantees hold:
+ *
+ * - It is safe to dereference the drm_plane.state pointer.
+ *
+ * - Anything in struct drm_plane_state or the driver's subclass thereof which
+ *   stays invariant after the atomic check code has finished is safe to access.
+ *   Specifically this includes the reference counted pointers to framebuffer
+ *   and buffer objects.
+ *
+ * - Anything set up by &drm_plane_helper_funcs.prepare_fb and cleaned up by
+ *   &drm_plane_helper_funcs.cleanup_fb is safe to access, as long as it stays
+ *   invariant between these two calls. This also means that for drivers using
+ *   dynamic buffer management the framebuffer is pinned, and therefore all
+ *   relevant data structures can be accessed without taking any further locks
+ *   (which would be impossible in panic context anyway).
+ *
+ * - Importantly, software and hardware state set up by
+ *   &drm_plane_helper_funcs.begin_fb_access and
+ *   &drm_plane_helper_funcs.end_fb_access is not safe to access.
+ *
+ * Drivers must not make any assumptions about the actual state of the hardware,
+ * unless they explicitly protected this hardware access with drm_panic_lock()
+ * and drm_panic_unlock().
+ *
+ * Return:
+ * %0 when failing to acquire the raw spinlock, nonzero on success.
+ */
+#define drm_panic_trylock(dev, flags) \
+	raw_spin_trylock_irqsave(&(dev)->mode_config.panic_lock, flags)
+
+/**
+ * drm_panic_lock - protect panic printing relevant state
+ * @dev: struct drm_device
+ * @flags: unsigned long irq flags you need to pass to the unlock() counterpart
+ *
+ * This function must be called to protect software and hardware state that the
+ * panic printing code must be able to rely on. The protected sections must be
+ * as small as possible. It uses the irqsave/irqrestore variant, and can be
+ * called from an irq handler. Examples include:
+ *
+ * - Access to peek/poke or other similar registers, if that is the way the
+ *   driver prints the pixels into the scanout buffer at panic time.
+ *
+ * - Updates to pointers like &drm_plane.state, allowing the panic handler to
+ *   safely dereference these. This is done in drm_atomic_helper_swap_state().
+ *
+ * - Any state that isn't invariant and that the driver must be able to access
+ *   during panic printing.
+ */
+
+#define drm_panic_lock(dev, flags) \
+	raw_spin_lock_irqsave(&(dev)->mode_config.panic_lock, flags)
+
+/**
+ * drm_panic_unlock - end of the panic printing critical section
+ * @dev: struct drm_device
+ * @flags: irq flags that were returned when acquiring the lock
+ *
+ * Unlocks the raw spinlock acquired by either drm_panic_lock() or
+ * drm_panic_trylock().
+ */
+#define drm_panic_unlock(dev, flags) \
+	raw_spin_unlock_irqrestore(&(dev)->mode_config.panic_lock, flags)
+
+#endif /* __DRM_PANIC_H__ */
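The kernel-doc above lists peek/poke register access as one case that drivers should wrap in drm_panic_lock() and drm_panic_unlock(). The following userspace sketch illustrates why, under loud assumptions: the index/data window, the `sketch_*` names, and the 16-word "vram" are all invented for illustration, and a C11 atomic boolean stands in for the raw spinlock (no irq flags are modelled). A two-step access through a shared window must not be interleaved with the panic path, so the panic path backs off when the lock is held.

```c
/* Userspace model only: every sketch_* name and the window are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_VRAM_WORDS 16

struct sketch_device {
	atomic_bool panic_lock;			/* models mode_config.panic_lock */
	uint32_t poke_index;			/* hypothetical index register */
	uint32_t vram[SKETCH_VRAM_WORDS];	/* memory behind the data register */
};

static void sketch_device_init(struct sketch_device *dev)
{
	atomic_init(&dev->panic_lock, false);
	dev->poke_index = 0;
	memset(dev->vram, 0, sizeof(dev->vram));
}

static bool sketch_trylock(struct sketch_device *dev)
{
	return !atomic_exchange_explicit(&dev->panic_lock, true,
					 memory_order_acquire);
}

static void sketch_lock(struct sketch_device *dev)
{
	while (!sketch_trylock(dev))
		;	/* spin */
}

static void sketch_unlock(struct sketch_device *dev)
{
	atomic_store_explicit(&dev->panic_lock, false, memory_order_release);
}

/* Two-step access: select an offset, then write through the window. */
static void sketch_poke_locked(struct sketch_device *dev, uint32_t offset,
			       uint32_t value)
{
	dev->poke_index = offset % SKETCH_VRAM_WORDS;
	dev->vram[dev->poke_index] = value;
}

/* Regular driver path: a tiny critical section around both steps. */
static void sketch_driver_write(struct sketch_device *dev, uint32_t offset,
				uint32_t value)
{
	sketch_lock(dev);
	sketch_poke_locked(dev, offset, value);
	sketch_unlock(dev);
}

/* Panic path: leaves the window untouched and returns false when contended. */
static bool sketch_panic_put_pixel(struct sketch_device *dev, uint32_t offset,
				   uint32_t color)
{
	if (!sketch_trylock(dev))
		return false;
	sketch_poke_locked(dev, offset, color);
	sketch_unlock(dev);
	return true;
}

/* Small walk-through of the uncontended and contended cases. */
static int sketch_demo(void)
{
	struct sketch_device dev;

	sketch_device_init(&dev);
	sketch_driver_write(&dev, 3, 0xabcd);
	assert(dev.vram[3] == 0xabcd);

	assert(sketch_panic_put_pixel(&dev, 5, 0xffffff));
	assert(dev.vram[5] == 0xffffff);

	sketch_lock(&dev);	/* pretend the driver is mid-access */
	assert(!sketch_panic_put_pixel(&dev, 6, 0x1));
	assert(dev.vram[6] == 0);
	sketch_unlock(&dev);
	return 0;
}
```

Because such a window is typically shared by the whole display, one lock per device covers it. This matches the commit message's reason for choosing a per-device lock over per-plane locks.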
