linux-xiaomi-chiron/kernel/locking
Waiman Long d257cc8cb8 locking/rwsem: Make handoff bit handling more consistent
There are some inconsistency in the way that the handoff bit is being
handled in readers and writers that lead to a race condition.

Firstly, when a queue head writer set the handoff bit, it will clear
it when the writer is being killed or interrupted on its way out
without acquiring the lock. That is not the case for a queue head
reader. The handoff bit will simply be inherited by the next waiter.

Secondly, in the out_nolock path of rwsem_down_read_slowpath(), both
the waiter and handoff bits are cleared if the wait queue becomes
empty.  For rwsem_down_write_slowpath(), however, the handoff bit is
not checked and cleared if the wait queue is empty. This can
potentially make the handoff bit set with empty wait queue.

Worse, the situation in rwsem_down_write_slowpath() relies on wstate,
a variable set outside of the critical section containing the ->count
manipulation, this leads to race condition where RWSEM_FLAG_HANDOFF
can be double subtracted, corrupting ->count.

To make the handoff bit handling more consistent and robust, extract
out handoff bit clearing code into the new rwsem_del_waiter() helper
function. Also, completely eradicate wstate; always evaluate
everything inside the same critical section.

The common function will only use atomic_long_andnot() to clear bits
when the wait queue is empty to avoid possible race condition.  If the
first waiter with handoff bit set is killed or interrupted to exit the
slowpath without acquiring the lock, the next waiter will inherit the
handoff bit.

While at it, simplify the trylock for loop in
rwsem_down_write_slowpath() to make it easier to read.

Fixes: 4f23dbc1e6 ("locking/rwsem: Implement lock handoff to prevent lock starvation")
Reported-by: Zhenhua Ma <mazhenhua@xiaomi.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211116012912.723980-1-longman@redhat.com
2021-11-23 09:45:35 +01:00
..
irqflag-debug.c lockdep: Noinstr annotate warn_bogus_irq_restore() 2021-02-10 14:44:39 +01:00
lock_events.c locking/lock_events: Don't show pvqspinlock events on bare metal 2019-04-10 10:56:05 +02:00
lock_events.h locking/lock_events: Use raw_cpu_{add,inc}() for stats 2019-06-03 12:32:56 +02:00
lock_events_list.h locking/rwsem: Remove reader optimistic spinning 2020-12-09 17:08:48 +01:00
lockdep.c Merge branch 'akpm' (patches from Andrew) 2021-11-09 10:11:53 -08:00
lockdep_internals.h lockdep: Allow tuning tracing capacity constants. 2021-04-05 20:33:57 +09:00
lockdep_proc.c locking/lockdep: Fix meaningless /proc/lockdep output of lock classes on !CONFIG_PROVE_LOCKING 2021-07-05 10:44:52 +02:00
lockdep_states.h locking/lockdep: Rework FS_RECLAIM annotation 2017-08-10 12:29:03 +02:00
locktorture.c locktorture: Warn on individual lock_torture_init() error conditions 2021-09-13 16:36:16 -07:00
Makefile locking/ww_mutex: Implement rtmutex based ww_mutex API functions 2021-08-17 19:05:26 +02:00
mcs_spinlock.h locking: Fix typos in comments 2021-03-22 02:45:52 +01:00
mutex-debug.c locking/ww_mutex: Gather mutex_waiter initialization 2021-08-17 19:04:41 +02:00
mutex.c locking: Remove rcu_read_{,un}lock() for preempt_{dis,en}able() 2021-10-19 17:27:06 +02:00
mutex.h locking/mutex: Move the 'struct mutex_waiter' definition from <linux/mutex.h> to the internal header 2021-08-17 18:24:31 +02:00
osq_lock.c locking: Fix typos in comments 2021-03-22 02:45:52 +01:00
percpu-rwsem.c locking/percpu-rwsem: Use this_cpu_{inc,dec}() for read_count 2020-09-16 16:26:56 +02:00
qrwlock.c locking/qrwlock: Cleanup queued_write_lock_slowpath() 2021-05-06 15:33:49 +02:00
qspinlock.c x86/kvm: Add "nopvspin" parameter to disable PV spinlocks 2020-07-08 16:21:57 -04:00
qspinlock_paravirt.h Revert "locking/pvqspinlock: Don't wait if vCPU is preempted" 2019-09-25 10:22:37 +02:00
qspinlock_stat.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00
rtmutex.c rtmutex: Wake up the waiters lockless while dropping the read lock. 2021-10-01 13:57:52 +02:00
rtmutex_api.c locking/rtmutex: Prevent lockdep false positive with PI futexes 2021-08-17 19:06:02 +02:00
rtmutex_common.h locking/rtmutex: Dont dereference waiter lockless 2021-08-25 15:42:32 +02:00
rwbase_rt.c locking/rwbase: Optimize rwbase_read_trylock 2021-10-07 13:51:07 +02:00
rwsem.c locking/rwsem: Make handoff bit handling more consistent 2021-11-23 09:45:35 +01:00
semaphore.c locking/semaphore: Add might_sleep() to down_*() family 2021-08-20 12:33:17 +02:00
spinlock.c locking: Remove spin_lock_flags() etc 2021-10-30 16:37:28 +02:00
spinlock_debug.c locking/rwlock: Provide RT variant 2021-08-17 17:50:51 +02:00
spinlock_rt.c locking/rt: Take RCU nesting into account for __might_resched() 2021-10-01 13:57:51 +02:00
test-ww_mutex.c locking/ww-mutex: Fix uninitialized use of ret in test_aa() 2021-10-01 13:57:49 +02:00
ww_mutex.h locking/ww_mutex: Add rt_mutex based lock type and accessors 2021-08-17 19:05:11 +02:00
ww_rt_mutex.c kernel/locking: Add context to ww_mutex_trylock() 2021-09-17 15:08:41 +02:00