You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
bpf, sockmap: Fix concurrency issues between memory charge and uncharge
Triggering WARN_ON_ONCE(sk->sk_forward_alloc) by running the following
command, followed by pressing Ctrl-C after 2 seconds:
./bench sockmap -c 2 -p 1 -a --rx-verdict-ingress
'''
------------[ cut here ]------------
WARNING: CPU: 2 PID: 40 at net/ipv4/af_inet.c inet_sock_destruct
Call Trace:
<TASK>
__sk_destruct+0x46/0x222
sk_psock_destroy+0x22f/0x242
process_one_work+0x504/0x8a8
? process_one_work+0x39d/0x8a8
? __pfx_process_one_work+0x10/0x10
? worker_thread+0x44/0x2ae
? __list_add_valid_or_report+0x83/0xea
? srso_return_thunk+0x5/0x5f
? __list_add+0x45/0x52
process_scheduled_works+0x73/0x82
worker_thread+0x1ce/0x2ae
'''
Reason:
When we are in the backlog process, we allocate sk_msg and then perform
the charge process. Meanwhile, in the user process context, the recvmsg()
operation performs the uncharge process, leading to concurrency issues
between them.
The charge process (2 functions):
1. sk_rmem_schedule(size) -> sk_forward_alloc increases by PAGE_SIZE
multiples
2. sk_mem_charge(size) -> sk_forward_alloc -= size
The uncharge process (sk_mem_uncharge()):
3. sk_forward_alloc += size
4. check if sk_forward_alloc > PAGE_SIZE
5. reclaim -> sk_forward_alloc decreases, possibly becoming 0
Because the sk performing charge and uncharge is not locked
(mainly because the backlog process does not lock the socket), therefore,
steps 1 to 5 will execute concurrently as follows:
cpu0 cpu1
1
3
4 --> sk_forward_alloc >= PAGE_SIZE
5 --> reclaim sk_forward_alloc
2 --> sk_forward_alloc may
become negative
Solution:
1. Add locking to the kfree_sk_msg() process, which is only called in the
user process context.
2. Integrate the charge process into sk_psock_create_ingress_msg() in the
backlog process and add locking.
3. Reuse the existing psock->ingress_lock.
Fixes: 799aa7f ("skmsg: Avoid lock_sock() in sk_psock_backlog()")
Signed-off-by: Jiayuan Chen <[email protected]>
0 commit comments