When using "-l", test_progs is often executed as a non-root user, so
load_bpf_testmod() fails and outputs errors. This patch skips loading bpf
testmod when "-l" is specified, making the output cleaner.
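The idea can be sketched in plain userspace C. This is only an illustration: runner_opts, fake_load_testmod() and prepare_runner() are hypothetical names, not the actual test_progs code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the test runner's options. */
struct runner_opts {
	bool list_only;	/* set when "-l" was given */
};

/* Counts load attempts so the behaviour can be observed in this sketch. */
static int testmod_load_attempts;

/* Stand-in for load_bpf_testmod(); the real one needs root privileges. */
static int fake_load_testmod(void)
{
	testmod_load_attempts++;
	return 0;
}

/* Load the test module only when tests are actually going to run. */
static int prepare_runner(const struct runner_opts *opts)
{
	assert(opts);

	if (opts->list_only)
		return 0;	/* listing test names needs no kernel module */

	return fake_load_testmod();
}
```

In list mode the load is never attempted, so an unprivileged user sees no module-related errors.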
Signed-off-by: Yucong Sun <fallentree@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210817044732.3263066-2-fallentree@fb.com
Commit 0a0a66c984 ("clk: staging: Specify IOMEM dependency for Xilinx
Clocking Wizard driver") introduces a dependency on the non-existing config
IOMEM, which basically makes it impossible to include this driver into any
build. Fortunately, ./scripts/checkkconfigsymbols.py warns:
IOMEM
Referencing files: drivers/staging/clocking-wizard/Kconfig
The config for IOMEM support is called HAS_IOMEM. Correct this reference to
the intended config.
Fixes: 0a0a66c984 ("clk: staging: Specify IOMEM dependency for Xilinx Clocking Wizard driver")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20210817105404.13146-1-lukas.bulwahn@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove unneeded DBG_88E macro call from the rtl8188e_Add_RateATid
function in hal/rtl8188e_cmd.c, as it is not particularly clear in my
opinion, and we should strive towards use of existing kernel machinery
for debugging purposes.
Acked-by: Michael Straube <straube.linux@gmail.com>
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Link: https://lore.kernel.org/r/20210816234459.132239-3-phil@philpotter.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove set but unused variable init_rate from the rtl8188e_Add_RateATid
function in hal/rtl8188e_cmd.c, as this fixes a kernel test robot
warning. Removing the call to get_highest_rate_idx has no side effects
here so is safe.
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Michael Straube <straube.linux@gmail.com>
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Link: https://lore.kernel.org/r/20210816234459.132239-2-phil@philpotter.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues reported by checkpatch in the os_dep
directory.
CHECK: spaces preferred around that ...
CHECK: No space is necessary after a cast
WARNING: space prohibited before semicolon
WARNING: space prohibited between function name and open parenthesis '('
ERROR: spaces required around that '=' (ctx:VxV)
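For illustration, the kinds of spacing these reports refer to look roughly as follows. The function below is a made-up example, not code from the driver; the offending forms are shown only in the comment.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Forms that would trigger the reports listed above:
 *   total=a+b ;                 no spaces around '=' and '+', space before ';'
 *   len = (size_t) total;       space after a cast
 *   len = add_lengths (a, b);   space between function name and '('
 *
 * The function body shows the same statements in the preferred style.
 */
static size_t add_lengths(int a, int b)
{
	int total = a + b;

	assert(total >= 0);
	return (size_t)total;
}
```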
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816211053.31728-1-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues reported by checkpatch in the remaining
files in the hal directory.
CHECK: spaces preferred around that ...
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816205511.20068-4-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues reported by checkpatch in the next 10
files in the hal directory.
CHECK: spaces preferred around that ...
CHECK: No space is necessary after a cast
WARNING: space prohibited before semicolon
WARNING: space prohibited between function name and open parenthesis '('
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816205511.20068-3-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues reported by checkpatch in the first 10
files in the hal directory.
CHECK: spaces preferred around that ...
CHECK: No space is necessary after a cast
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816205511.20068-2-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_sta_mgt.c reported by
checkpatch.
WARNING: space prohibited between function name and open parenthesis '('
CHECK: spaces preferred around that '/' (ctx:VxV)
WARNING: space prohibited before semicolon
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-23-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add missing space around operator in core/rtw_sreset.c reported by
checkpatch.
CHECK: spaces preferred around that '|' (ctx:VxV)
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-22-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_debug.c reported by checkpatch.
CHECK: spaces preferred around that '%' (ctx:VxV)
WARNING: space prohibited before semicolon
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-21-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_xmit.c reported by checkpatch.
CHECK: spaces preferred around that ...
WARNING: space prohibited between function name and open parenthesis '('
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-20-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_security.c reported by
checkpatch.
CHECK: spaces preferred around that ...
CHECK: No space is necessary after a cast
WARNING: space prohibited before semicolon
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-18-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_recv.c reported by checkpatch.
CHECK: spaces preferred around that ...
CHECK: No space is necessary after a cast
WARNING: space prohibited between function name and open parenthesis '('
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-17-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_pwrctrl.c reported by
checkpatch.
CHECK: spaces preferred around that ...
WARNING: space prohibited before semicolon
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-16-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_p2p.c reported by checkpatch.
CHECK: spaces preferred around that ...
CHECK: No space is necessary after a cast
WARNING: space prohibited before semicolon
WARNING: space prohibited between function name and open parenthesis '('
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-15-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_mp_ioctl.c reported by
checkpatch.
CHECK: spaces preferred around that '|' (ctx:VxV)
WARNING: space prohibited between function name and open parenthesis '('
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-14-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_mp.c reported by checkpatch.
CHECK: spaces preferred around that ...
WARNING: space prohibited between function name and open parenthesis '('
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-13-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_mlme_ext.c reported by
checkpatch.
CHECK: spaces preferred around that ...
WARNING: space prohibited between function name and open parenthesis '('
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-12-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_mlme.c reported by checkpatch.
CHECK: spaces preferred around that ...
CHECK: No space is necessary after a cast
WARNING: space prohibited between function name and open parenthesis '('
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-11-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_ioctl_set.c reported by
checkpatch.
CHECK: spaces preferred around that ...
WARNING: space prohibited before semicolon
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-9-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Simplify multiplication in core/rtw_ioctl_set.c to improve readability
and clear a checkpatch issue.
CHECK: spaces preferred around that '/' (ctx:VxV)
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-8-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_ieee80211.c reported by
checkpatch.
CHECK: spaces preferred around that ...
CHECK: No space is necessary after a cast
WARNING: space prohibited between function name and open parenthesis '('
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-7-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_efuse.c reported by checkpatch.
CHECK: spaces preferred around that ...
WARNING: space prohibited before semicolon
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-6-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clean up spacing style issues in core/rtw_cmd.c reported by checkpatch.
CHECK: spaces preferred around that ...
CHECK: No space is necessary after a cast
WARNING: space prohibited between function name and open parenthesis '('
Signed-off-by: Michael Straube <straube.linux@gmail.com>
Link: https://lore.kernel.org/r/20210816155818.24005-5-straube.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[Why]
Userspace should get back a copy of drm_wait_vblank that has been
modified, even when drm_wait_vblank_ioctl returns a failure.
Rationale:
drm_wait_vblank_ioctl modifies the request and expects the user to read
it back. When the type is RELATIVE, it changes it to ABSOLUTE and updates
the sequence to current_vblank_count + sequence, so the previously
RELATIVE sequence becomes ABSOLUTE.
drmWaitVBlank (in libdrm) relies on this: it changes the request type to
ABSOLUTE, so it expects the sequence to have been updated as well.
The change is in compat_drm_wait_vblank, which is called by
drm_compat_ioctl. Copying the data back regardless of the return value
brings it on par with drm_ioctl, which always copies the data before
returning.
[How]
Return from the function after everything has been copied to user.
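A minimal userspace model of the pattern, assuming made-up names (toy_vbl_req, fake_wait_vblank(), compat_wait_vblank()) and using memcpy() in place of copy_from_user()/copy_to_user():

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical request types; not the real DRM constants. */
enum toy_vbl_type { TOY_VBL_ABSOLUTE, TOY_VBL_RELATIVE };

struct toy_vbl_req {
	enum toy_vbl_type type;
	unsigned int sequence;
};

/* Stand-in for the ioctl body: rewrites the request, then fails. */
static int fake_wait_vblank(struct toy_vbl_req *req, unsigned int current_count)
{
	if (req->type == TOY_VBL_RELATIVE) {
		req->sequence += current_count;
		req->type = TOY_VBL_ABSOLUTE;
	}
	return -EINTR;	/* pretend a signal interrupted the wait */
}

/* Compat-style wrapper: copy the request back before returning the error. */
static int compat_wait_vblank(struct toy_vbl_req *user, unsigned int current_count)
{
	struct toy_vbl_req req;
	int err;

	assert(user);
	memcpy(&req, user, sizeof(req));	/* models copy_from_user() */
	err = fake_wait_vblank(&req, current_count);
	memcpy(user, &req, sizeof(req));	/* models copy_to_user(), done unconditionally */

	return err;
}
```

Because the copy back happens before the error is returned, a caller that retries after an interrupted wait sees the ABSOLUTE type together with the already adjusted sequence.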
Fixes IGT:kms_flip::modeset-vs-vblank-race-interruptible
Tested on ChromeOS Trogdor(msm)
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Mark Yacoub <markyacoub@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210812194917.1703356-1-markyacoub@chromium.org
There was no strong reason for or against flushing barrier work items in
flush_workqueue(). And we have to make barrier work items not participate
in nr_active, so we had been using WORK_NO_COLOR for them, which also means
they can't be flushed by flush_workqueue().
And the users of flush_workqueue() often do not intend to wait for barrier
work items issued by flush_work(). That made the choice sound perfect.
But barrier work items hold a reference to an internal structure
(pool_workqueue), and the worker thread(s) are still busy for the workqueue
user while the barrier work items are not done. So it is reasonable to make
flush_workqueue() also watch for flush_work() to make it more robust.
And a problem[1] reported by Li Zhe shows that we need such robustness.
The warning logs are listed below:
WARNING: CPU: 0 PID: 19336 at kernel/workqueue.c:4430 destroy_workqueue+0x11a/0x2f0
*****
destroy_workqueue: test_workqueue9 has the following busy pwq
pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=0/1 refcnt=2
in-flight: 5658:wq_barrier_func
Showing busy workqueues and worker pools:
*****
It shows that even after drain_workqueue() returns, the barrier work item
is still in flight and the pwq (and a worker) is still busy on it.
The problem is caused by flush_workqueue() not watching flush_work():
  Thread A                              Worker
                                        /* normal work item with linked */
                                        process_scheduled_works()
  destroy_workqueue()                     process_one_work()
    drain_workqueue()                       /* run normal work item */
                                 /--        pwq_dec_nr_in_flight()
      flush_workqueue()     <---/
                /* the last normal work item is done */
    sanity_check                          process_one_work()
                                       /--  raw_spin_unlock_irq(&pool->lock)
    raw_spin_lock_irq(&pool->lock)  <-/     /* maybe preempt */
    *WARNING*                               wq_barrier_func()
                                            /* maybe preempt by cond_resched() */
Thread A can get the pool lock after the Worker unlocks the pool lock and
before it runs wq_barrier_func(). And if any preemption happens around
wq_barrier_func(), destroy_workqueue()'s sanity check is more likely to
get the lock and catch it. (Note: preemption is not necessary to cause the
bug; the unlocking alone is enough to possibly trigger the WARNING.)
A simple solution might be to execute all linked barrier work items at
once, without releasing the pool lock, after the head work item's
pwq_dec_nr_in_flight(). But this solution has two problems:
  1) The head work item might itself be a barrier work item when the
     user-queued work item is cancelled. For example:
        thread 1:                       thread 2:
        queue_work(wq, &my_work)
        flush_work(&my_work)
                                        cancel_work_sync(&my_work);
        /* Neither my_work nor the barrier work is scheduled. */
                                        destroy_workqueue(wq);
        /* This is an easier way to catch the WARNING. */
  2) There might be too many linked barrier work items, and running them
     all at once without releasing the pool lock just causes trouble.
The only solution is to make flush_workqueue() also watch barrier work
items. So we have to assign a color to these barrier work items, which
is the color of the head (user-queued) work item.
Assigning a color doesn't cause any problem in active management, because
the previous patch made barrier work items not participate in nr_active
via WORK_STRUCT_INACTIVE rather than relying on the (old) WORK_NO_COLOR.
[1]: https://lore.kernel.org/lkml/20210812083814.32453-1-lizhe.67@bytedance.com/
Reported-by: Li Zhe <lizhe.67@bytedance.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Currently, WORK_NO_COLOR has two meanings:
  - the work item does not participate in flushing
  - the work item does not participate in nr_active
And only non-barrier work items are marked with WORK_STRUCT_INACTIVE
when they are in the inactive_works list. The barrier work items are not
marked INACTIVE even when linked in the inactive_works list, since these
tail items are always moved together with the head work item.
These definitions are simple, clean and practical. (Except a small
blemish that only the first meaning of WORK_NO_COLOR is documented in
include/linux/workqueue.h while both meanings are in workqueue.c)
But dual-purpose WORK_NO_COLOR used for barrier work items has proven to
be problematical[1]. Only the second purpose is obligatory. So we plan
to make barrier work items participate in flushing but keep them still
not participating in nr_active.
So the plan is to mark barrier work items inactive without using
WORK_NO_COLOR in this patch so that we can assign a flushing color to
them in next patch.
The reasonable way is to add or reuse a bit in work data of the work
item. But adding a bit will double the size of pool_workqueue.
Currently, WORK_STRUCT_INACTIVE is only used in try_to_grab_pending()
for user-queued work items, and try_to_grab_pending() can't work for
barrier work items. So we extend WORK_STRUCT_INACTIVE to also mark
barrier work items no matter which list they are in, because we don't
need to determine which list a barrier work item is in.
So the meaning of WORK_STRUCT_INACTIVE becomes just "the work items don't
participate in nr_active" (no matter whether it is a barrier work item or
a user-queued work item). And WORK_STRUCT_INACTIVE for user-queued work
items means they are in inactive_works list.
This patch does it by setting WORK_STRUCT_INACTIVE for barrier work items
in insert_wq_barrier() and checking WORK_STRUCT_INACTIVE first in
pwq_dec_nr_in_flight(). And the meaning of WORK_NO_COLOR is reduced to
only "not participating in flushing".
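The split between the two meanings can be modelled in a few lines of userspace C. This is a simplified sketch under stated assumptions: the bit layout, the toy_* names and the number of colors are invented for illustration and do not match the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical layout: bit 0 is the "inactive" flag, bits 4..7 hold the color. */
#define TOY_INACTIVE	(1ul << 0)
#define TOY_COLOR_SHIFT	4
#define TOY_COLOR_BITS	4
#define TOY_NR_COLORS	((1u << TOY_COLOR_BITS) - 1)	/* 15 usable colors */
#define TOY_NO_COLOR	TOY_NR_COLORS			/* value 15: not flushed */

struct toy_pwq {
	int nr_active;
	int nr_in_flight[TOY_NR_COLORS];
};

/* Pack a color and the inactive flag into one work_data word. */
static unsigned long toy_work_data(unsigned int color, int inactive)
{
	assert(color <= TOY_NO_COLOR);
	return ((unsigned long)color << TOY_COLOR_SHIFT) |
	       (inactive ? TOY_INACTIVE : 0);
}

/* Check the inactive flag first, then handle the flush color separately. */
static void toy_dec_nr_in_flight(struct toy_pwq *pwq, unsigned long work_data)
{
	unsigned int color = (work_data >> TOY_COLOR_SHIFT) & TOY_NO_COLOR;

	/* Only items without the inactive flag were counted in nr_active. */
	if (!(work_data & TOY_INACTIVE))
		pwq->nr_active--;

	/* Uncolored items do not participate in flushing. */
	if (color == TOY_NO_COLOR)
		return;

	pwq->nr_in_flight[color]--;
}
```

With the two properties stored independently, an item can later be given a flush color while still staying out of nr_active accounting.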
There is no functional change intended in this patch, because
WORK_NO_COLOR+WORK_STRUCT_INACTIVE has the same meaning as the previous
WORK_NO_COLOR, and try_to_grab_pending() is not used for barrier work
items, so it is not confused by this extended WORK_STRUCT_INACTIVE.
Several comments for nr_active & WORK_STRUCT_INACTIVE are also added to
document how WORK_STRUCT_INACTIVE works in nr_active management.
[1]: https://lore.kernel.org/lkml/20210812083814.32453-1-lizhe.67@bytedance.com/
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Add a local var @work_flags to calculate work_flags step by step, so that
we don't need to squeeze several flags into only the last line of code.
Prepare for the next patch, which adds a bit to the barrier work item's
flags. Not squashing this into the next patch makes it clear what that
patch changes.
No functional change intended.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Make pwq_dec_nr_in_flight() use work_data rather than just work_color.
Prepare for later patch to get WORK_STRUCT_INACTIVE bit from work_data
in pwq_dec_nr_in_flight().
No functional change intended.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
There are two kinds of "delayed" work items in workqueue subsystem.
One is for timer-delayed work items which are visible to workqueue users.
The other kind is for work items delayed by active management which can
not be directly visible to workqueue users. We used the word "delayed"
for both kinds, which caused some ambiguity.
This patch renames the latter one (delayed by active management) to
"inactive", because it is used for workqueue active management and
most of its related symbols are named with "active" or "activate".
All "delayed" and "DELAYED" are carefully checked and renamed one by
one to avoid accidentally changing the name of the other kind for
timer-delayed.
No functional change intended.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Merge tag 'iio-for-5.15b' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next
Jonathan writes:
2nd set of new IIO device support and cleanups for the 5.15 cycle.
A small pull request to pick up a few new drivers and some cleanup
and fix patches.
New device support
* ad5110 non-volatile digital potentiometer
- New driver
* Renesas RZ/G2L 12-bit / 8 channel ADC block
- New driver and bindings
Minor or late breaking fixes and cleanups
* ltc2983
- Fix a false assumption of initial interrupt during probe().
* hp03
- Use devm_* to simplify probe and allow the remove function to be dropped.
* rockchip_saradc
- Use a regulator notifier to reduce overheads of querying the scale.
* tag 'iio-for-5.15b' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio:
iio: adc: Add driver for Renesas RZ/G2L A/D converter
dt-bindings: iio: adc: Add binding documentation for Renesas RZ/G2L A/D converter
iio: pressure: hp03: update device probe to register with devm functions
iio: adc: rockchip_saradc: add voltage notifier so get referenced voltage once at probe
iio: ltc2983: fix device probe
iio: potentiometer: Add driver support for AD5110
dt-bindings: iio: potentiometer: Add AD5110 in trivial-devices
There is a spelling mistake in a dev_info message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Update the comment with the new features.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/YQwIorQBHEq+s73b@hirez.programming.kicks-ass.net
On PREEMPT_RT enabled kernels local_lock maps to a per CPU 'sleeping'
spinlock which protects the critical section while staying preemptible. CPU
locality is established by disabling migration.
Provide the necessary types and macros to substitute the non-RT variant.
Co-developed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211306.023630962@linutronix.de
Add the static and runtime initializer mechanics to support the RT variant
of local_lock, which requires the lock type in the lockdep map to be set
to LD_LOCK_PERCPU.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211305.967526724@linutronix.de
Going to sleep when locks are contended can be quite inefficient when the
contention time is short and the lock owner is running on a different CPU.
The MCS mechanism cannot be used because MCS is strictly FIFO ordered while
for rtmutex based locks the waiter ordering is priority based.
Provide a simple adaptive spinwait mechanism which currently restricts the
spinning to the top priority waiter.
[ tglx: Provide a contemporary changelog, extended it to all rtmutex based
locks and updated it to match the other spin on owner implementations ]
Originally-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211305.912050691@linutronix.de
The current logic only allows lock stealing to occur if the current task is
of higher priority than the pending owner.
Significant throughput improvements can be gained by allowing the lock
stealing to include tasks of equal priority when the contended lock is a
spin_lock or a rw_lock and the tasks are not in a RT scheduling class.
The assumption was that the system will make faster progress by allowing
the task already on the CPU to take the lock rather than waiting for the
system to wake up a different task.
This does add a degree of unfairness, but in reality no negative side
effects have been observed in the many years that this has been used in the
RT kernel.
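The rule described above can be written down as a small predicate. This is a sketch, not the kernel's implementation: toy_waiter and toy_can_steal() are hypothetical names, and a lower prio value is assumed to mean higher priority.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical waiter description; lower prio value means higher priority. */
struct toy_waiter {
	int prio;
	bool rt_class;	/* task runs in a realtime scheduling class */
};

/*
 * Decide whether @w may take the lock ahead of the pending owner @top.
 * @sleeping_spinlock is true for spin_lock/rw_lock substitutions and
 * false for ordinary sleeping locks such as mutexes.
 */
static bool toy_can_steal(const struct toy_waiter *w,
			  const struct toy_waiter *top,
			  bool sleeping_spinlock)
{
	assert(w && top);

	/* A strictly higher priority waiter may always steal. */
	if (w->prio < top->prio)
		return true;

	/* Equal-priority steals are only considered for spin/rw locks. */
	if (!sleeping_spinlock)
		return false;

	/* RT tasks are excluded from equal-priority steals. */
	if (w->rt_class)
		return false;

	return w->prio == top->prio;
}
```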
[ tglx: Refactored and rewritten several times by Steve Rostedt, Sebastian
Siewior and myself ]
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211305.857240222@linutronix.de
On PREEMPT_RT regular spinlocks and rwlocks are substituted with rtmutex
based constructs. spin/rwlock held regions are preemptible on PREEMPT_RT,
so PREEMPT_LOCK_OFFSET has to be 0 to make the various cond_resched_*lock()
functions work correctly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211305.804246275@linutronix.de
On PREEMPT_RT the futex hashbucket spinlock becomes 'sleeping' and rtmutex
based. That causes a lockdep false positive because some of the futex
functions invoke spin_unlock(&hb->lock) with the wait_lock of the rtmutex
associated to the pi_futex held. spin_unlock() in turn takes wait_lock of
the rtmutex on which the spinlock is based which makes lockdep notice a
lock recursion.
Give the futex/rtmutex wait_lock a separate key.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211305.750701219@linutronix.de
The requeue_pi() operation on RT kernels creates a problem versus the
task::pi_blocked_on state when a waiter is woken early (signal, timeout)
and that early wake up interleaves with the requeue_pi() operation.
When the requeue manages to block the waiter on the rtmutex which is
associated to the second futex, then a concurrent early wakeup of that
waiter faces the problem that it has to acquire the hash bucket spinlock,
which is not an issue on non-RT kernels, but on RT kernels spinlocks are
substituted by 'sleeping' spinlocks based on rtmutex. If the hash bucket
lock is contended then blocking on that spinlock would result in an
impossible situation: blocking on two locks at the same time (the hash
bucket lock and the rtmutex representing the PI futex).
It was considered to make the hash bucket locks raw_spinlocks, but
especially requeue operations with a large number of waiters can introduce
significant latencies, so that's not an option for RT.
The RT tree carried a solution which (ab)used task::pi_blocked_on to store
the information about an ongoing requeue and an early wakeup which worked,
but required adding checks for these special states all over the place.
The disentangling of an early wakeup of a waiter for a requeue_pi()
operation already has to look at quite a few different states, and the
task::pi_blocked_on magic just expanded that into a hard to understand
'state machine'.
This can be avoided by keeping track of the waiter/requeue state in the
futex_q object itself.
Add a requeue_state field to struct futex_q with the following possible
states:
Q_REQUEUE_PI_NONE
Q_REQUEUE_PI_IGNORE
Q_REQUEUE_PI_IN_PROGRESS
Q_REQUEUE_PI_WAIT
Q_REQUEUE_PI_DONE
Q_REQUEUE_PI_LOCKED
The waiter starts with state = NONE and the following state transitions are
valid:
On the waiter side:
Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_IGNORE
Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_WAIT
On the requeue side:
Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_IN_PROGRESS
Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_DONE/LOCKED
Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_NONE (requeue failed)
Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_DONE/LOCKED
Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_IGNORE (requeue failed)
The requeue side ignores a waiter with state Q_REQUEUE_PI_IGNORE as this
signals that the waiter is already on the way out. It also means that
the waiter is still on the 'wait' futex, i.e. uaddr1.
The waiter side signals early wakeup to the requeue side either through
setting state to Q_REQUEUE_PI_IGNORE or to Q_REQUEUE_PI_WAIT depending
on the current state. In case of Q_REQUEUE_PI_IGNORE it can immediately
proceed to take the hash bucket lock of uaddr1. If it set state to WAIT,
which means the wakeup is interleaving with a requeue in progress it has
to wait for the requeue side to change the state. Either to DONE/LOCKED
or to IGNORE. DONE/LOCKED means the waiter q is now on the uaddr2 futex
and either blocked (DONE) or has acquired it (LOCKED). IGNORE is set by
the requeue side when the requeue attempt failed via deadlock detection
and therefore the waiter's futex_q is still on the uaddr1 futex.
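The transition rules listed above can be modelled in userspace with C11 atomics. This is a sketch only: the rq_* names are hypothetical, the waiting step after setting WAIT is left out, and the kernel uses its own atomic primitives rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum rq_state { RQ_NONE, RQ_IGNORE, RQ_IN_PROGRESS, RQ_WAIT, RQ_DONE, RQ_LOCKED };
enum rq_side { SIDE_WAITER, SIDE_REQUEUE };

/* Encode the table of valid transitions for each side. */
static bool rq_transition_valid(enum rq_side side, int from, int to)
{
	if (side == SIDE_WAITER)
		return (from == RQ_NONE && to == RQ_IGNORE) ||
		       (from == RQ_IN_PROGRESS && to == RQ_WAIT);

	switch (from) {
	case RQ_NONE:
		return to == RQ_IN_PROGRESS;
	case RQ_IN_PROGRESS:
		/* NONE means the requeue failed */
		return to == RQ_DONE || to == RQ_LOCKED || to == RQ_NONE;
	case RQ_WAIT:
		/* IGNORE means the requeue failed */
		return to == RQ_DONE || to == RQ_LOCKED || to == RQ_IGNORE;
	default:
		return false;
	}
}

/*
 * Waiter side of an early wakeup: signal the requeue side by moving to
 * IGNORE or WAIT depending on the current state. Returns the state the
 * waiter ends up observing.
 */
static int rq_waiter_early_wakeup(atomic_int *state)
{
	int old = atomic_load(state);
	int next;

	do {
		if (old == RQ_NONE)
			next = RQ_IGNORE;
		else if (old == RQ_IN_PROGRESS)
			next = RQ_WAIT;
		else
			return old;	/* nothing to signal, e.g. DONE/LOCKED */
		assert(rq_transition_valid(SIDE_WAITER, old, next));
	} while (!atomic_compare_exchange_weak(state, &old, next));

	return next;
}
```

If the function returns RQ_WAIT, a real waiter would then have to wait until the requeue side changes the state to DONE/LOCKED or IGNORE.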
While this is not strictly required on !RT making this unconditional has
the benefit of common code and it also allows the waiter to avoid taking
the hash bucket lock on the way out in certain cases, which reduces
contention.
Add the helpers required for the state transitions, invoke them at
the right places and restructure the futex_wait_requeue_pi() code to handle
the return from wait (early or not) based on the state machine values.
On !RT enabled kernels the waiter spin waits for the state going from
Q_REQUEUE_PI_WAIT to some other state, on RT enabled kernels this is
handled by rcuwait_wait_event() and the corresponding wake up on the
requeue side.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211305.693317658@linutronix.de
Move the futex key match out of handle_early_requeue_pi_wakeup(), which
allows that function to be simplified. The upcoming state machine for
requeue_pi() will make that go away.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211305.638938670@linutronix.de
No point in allocating memory when the input parameters are bogus.
Validate all parameters before proceeding.
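The ordering can be illustrated with a small userspace sketch. The function toy_requeue_setup(), its parameters and the allocation size are all hypothetical and only stand in for the real setup path.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Counts allocations so the ordering can be observed in this sketch. */
static int toy_alloc_calls;

/*
 * Validate every input before allocating, so that bogus parameters never
 * cost an allocation that would have to be undone again.
 */
static int toy_requeue_setup(const void *uaddr2, int nr_wake, int nr_requeue,
			     void **state_out)
{
	assert(state_out);

	/* Reject bogus input first. */
	if (!uaddr2 || nr_wake < 0 || nr_requeue < 0)
		return -EINVAL;

	/* Only now spend memory on the operation. */
	toy_alloc_calls++;
	*state_out = malloc(64);

	return *state_out ? 0 : -ENOMEM;
}
```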
Suggested-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211305.581789253@linutronix.de
The comment about the restriction of the number of waiters to wake for the
REQUEUE_PI case is confusing at best. Rewrite it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211305.524990421@linutronix.de