Description
TL;DR
The function `async_context_threadsafe_background_execute_sync()` incorrectly asserts that the recursive mutex is completely unentered (i.e., `enter_count == 0`) during cross-core invocations. However, it fails to consider who owns the mutex. As a result, it prohibits valid scenarios where the target core (e.g., the async core) legitimately holds the lock while doing its work.
This leads to hard assertion failures and makes the function unusable under typical dual-core setups involving background task scheduling.
Description
In multicore builds, `async_context_threadsafe_background_execute_sync()` uses this assertion in the cross-core execution path:

```c
hard_assert(!recursive_mutex_enter_count(&self->lock_mutex));
```

This assertion only checks whether the recursive mutex's `enter_count` is zero. It does not verify which core, if any, owns the mutex. This check is too strict for valid cross-core calls.

For example, the async core (e.g., Core 1) may legitimately be processing background work while holding the lock. In this situation, a cross-core call from Core 0 should be allowed to run `execute_sync()`. However, the assertion fails because `enter_count` is non-zero, even though the usage is valid and deadlock-free.
This behaviour contradicts the SDK's documentation:

> “You should NOT call this method while holding the `async_context`’s lock.”
The documentation only states that the method should not be called while holding the lock, not that the other core must not acquire the lock.
Instead, the check should ensure that the calling core does not already own the lock.
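A minimal sketch of such an ownership-aware check, assuming the `recursive_mutex_t` fields visible in the GDB dump below (`owner`, `enter_count`) and the SDK's `lock_get_caller_owner_id()` helper; this is a proposal, not the SDK's current code:

```c
if (self_base->core_num != get_core_num()) {
    // Assert only on same-core recursion: the lock is entered AND this core
    // already owns it. The async core holding the lock is a normal,
    // deadlock-free state and must not trip the assert.
    // (Unsynchronised read of owner; adequate for an assertion.)
    hard_assert(!(recursive_mutex_enter_count(&self->lock_mutex) &&
                  self->lock_mutex.owner == lock_get_caller_owner_id()));
    // ... existing cross-core path unchanged ...
}
```

This mirrors the ownership-aware check already used by the FreeRTOS variant (see the comparison at the end of this report).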
Steps to Reproduce
- Set up a Pico SDK test environment on an Arduino Nano RP2040 Connect (or compatible dual-core RP2040 board).
- Configure an `async_context_threadsafe_background_t` to run on Core 1.
- From Core 0, repeatedly invoke `async_context_execute_sync()` while Core 1 is processing scheduled work via `async_context_add_at_time_worker_in_ms()` (a reproduction sketch follows this list).
- Wait until a hard assertion is triggered.
- When the assert fires and execution halts under GDB, inspect frame 4 and examine `mutex_state`.
- Compare the findings against the provided disassembly and GDB logs.
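A hypothetical minimal reproduction along those lines, assuming a bare-metal Pico SDK build linking `pico_multicore` and `pico_async_context_threadsafe_background`; the symbol names (`repro_worker`, `repro_func`, `core1_main`) are illustrative, not taken from the original test:

```c
#include "pico/stdlib.h"
#include "pico/multicore.h"
#include "pico/async_context_threadsafe_background.h"

static async_context_threadsafe_background_t async_ctx;

// Worker that immediately reschedules itself, so Core 1 frequently
// holds the context lock while servicing background work.
static void repro_do_work(async_context_t *ctx, async_at_time_worker_t *worker) {
    async_context_add_at_time_worker_in_ms(ctx, worker, 1);
}

static async_at_time_worker_t repro_worker = { .do_work = repro_do_work };

// Trivial function to run synchronously on the async core.
static uint32_t repro_func(void *param) {
    (void) param;
    return 0;
}

// Core 1 owns the async context: initialising it here makes Core 1 the
// core that services the background IRQs (self_base->core_num == 1).
static void core1_main(void) {
    async_context_threadsafe_background_config_t cfg =
            async_context_threadsafe_background_default_config();
    async_context_threadsafe_background_init(&async_ctx, &cfg);
    async_context_add_at_time_worker_in_ms(&async_ctx.core, &repro_worker, 1);
    while (true) tight_loop_contents();
}

int main(void) {
    stdio_init_all();
    multicore_launch_core1(core1_main);
    sleep_ms(100); // crude wait for Core 1 to finish initialising the context

    // Cross-core calls from Core 0; sooner or later one lands while Core 1
    // holds the lock inside its worker, and the hard_assert fires.
    while (true) {
        async_context_execute_sync(&async_ctx.core, repro_func, NULL);
    }
}
```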
Expected Behaviour
- Cross-core execution of `async_context_execute_sync()` should succeed when the target core holds the lock (e.g., while performing scheduled work).
- Only same-core recursive misuse (i.e., when the calling core already owns the lock) should result in assertion failure.
- The assertion should verify ownership, not just the lock’s enter count.
- Behaviour should align with the documentation: the method must not be called while the calling core holds the lock.
Actual Behaviour
- The assertion fails even when the calling core does not hold the mutex.
- GDB inspection reveals that:
  - the mutex `owner` is `1` (i.e., Core 1),
  - `enter_count` is `1`,
  - the calling core (Core 0) does not own the lock.
- This confirms that the assertion triggers due to activity on the other core, not the one making the call.
```
(gdb) frame 4
#4  async_context_threadsafe_background_execute_sync (self_base=0x20001590 <async_ctx>, ...) at async_context_threadsafe_background.c:144
144         hard_assert(!recursive_mutex_enter_count((recursive_mutex_t *)&mutex_state));
(gdb) info locals
mutex_state = {
  core = {
    spin_lock = 0xd0000144
  },
  owner = 1 '\001',
  enter_count = 1 '\001'
}
```
- Disassembly confirms that execution is mid-call inside `async_context_add_at_time_worker_in_ms()`, which holds the lock on the async core. For example:

```
0x1000a6d0 <+124>: add r6, pc, #196 @ (adr r6, 0x1000a798 <async_context_base_add_at_time_worker+12>)
```
This indicates that the assertion is triggered during normal background operation, not by a logic error on the user's part.
Environment
Component | Version/Details |
---|---|
OS | Ubuntu 24.04.2 LTS |
Toolchain | Custom GCC 14.2 / Newlib 4.3 (toolchain-rp2040-earlephilhower @ 5.140200.240929 (14.2.0)) |
CMake Version | 3.28.3 |
Compiler | arm-none-eabi-gcc 14.2.0 |
Board | Arduino Nano RP2040 Connect |
Arduino Core | arduino-pico (earlephilhower) |
Pico SDK Version | 2.1.1-6-gb1676c18 |
Additional Information
- Introduced a local `volatile` copy of the `recursive_mutex_t` to prevent compiler optimisations from eliding reads:

```c
if (self_base->core_num != get_core_num()) {
    volatile recursive_mutex_t mutex_state = self->lock_mutex; // Atomic copy of entire mutex state
    hard_assert(!recursive_mutex_enter_count((recursive_mutex_t *)&mutex_state));
    // ... rest of the cross-core path unchanged ...
}
```
- Designed a test that reliably triggers the condition during async activity on Core 1.
- Used GDB to verify that the mutex is held by the target core (`self->core_num`), not by the calling core.
Comparison with RTOS implementation
The FreeRTOS variant of `async_context_execute_sync()` performs an ownership-aware check:

```c
hard_assert(xSemaphoreGetMutexHolder(self->lock_mutex) != xTaskGetCurrentTaskHandle());
```
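An analogous core-ownership check in the threadsafe_background implementation, as sketched in the Description above, would bring the bare-metal behaviour in line with both the RTOS variant and the documentation.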