mm/vmstat.c code calls cpumask_weight() to check if any bit of a given
cpumask is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds first set
bit, while cpumask_weight() counts all bits unconditionally.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
clocksource_verify_percpu() calls cpumask_weight() to check if any bit of
a given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
__irq_build_affinity_masks() calls cpumask_weight() to check if
any bit of a given cpumask is set. We can do it more efficiently with
cpumask_empty() because cpumask_empty() stops traversing the cpumask as
soon as it finds first set bit, while cpumask_weight() counts all bits
unconditionally.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
bcm6345_l1_of_init() calls cpumask_weight() to check if any bit of a given
cpumask is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds first set
bit, while cpumask_weight() counts all bits unconditionally.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
i915_pmu_cpu_online() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
In some cases, arch/x86 code calls cpumask_weight() to check if any bit of
a given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
setup_arch() calls cpumask_weight() to check if any bit of a given cpumask
is set. We can do it more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds first set
bit, while cpumask_weight() counts all bits unconditionally.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
common_shutdown_1() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
bitmap_empty() is better than bitmap_weight() because it may return
earlier, and improves on readability.
CC: Albert Ou <aou@eecs.berkeley.edu>
CC: Anup Patel <anup@brainfault.org>
CC: Atish Patra <atishp@atishpatra.org>
CC: Jisheng Zhang <jszhang@kernel.org>
CC: Palmer Dabbelt <palmer@dabbelt.com>
CC: Paul Walmsley <paul.walmsley@sifive.com>
CC: Tsukasa OI <research_trasio@irq.a4lg.com>
CC: linux-riscv@lists.infradead.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Add the maintainer information for the LoongArch (LA or LArch for short)
architecture.
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add Non-Uniform Memory Access (NUMA) support for LoongArch. LoongArch
has 48-bit physical address, but the HyperTransport I/O bus only support
40-bit address, so we need a custom phys_to_dma() and dma_to_phys() to
extract the 4-bit node id (bit 44~47) from Loongson-3's 48-bit physical
address space and embed it into 40-bit. In the 40-bit dma address, node
id offset can be read from the LS7A_DMA_CFG register.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
LoongArch-based procesors have 4, 8 or 16 cores per package. This patch
adds multi-processor (SMP) support for LoongArch.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add VDSO and VSYSCALL support (sigreturn, gettimeofday and its friends)
for LoongArch.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add some misc common routines for LoongArch, including: asm-offsets
routines, futex functions, i/o memory access functions, frame-buffer
functions, procfs information display, etc.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add ucontext/sigcontext definition and signal handling support for
LoongArch.
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add system call support and related uaccess.h for LoongArch.
Q: Why keep _ARCH_WANT_SYS_CLONE definition while there is clone3:
A: The latest glibc release has some basic support for clone3 but it is
not complete. E.g., pthread_create() and spawni() have converted to
use clone3 but fork() will still use clone. Moreover, some seccomp
related applications can still not work perfectly with clone3. E.g.,
Chromium sandbox cannot work at all and there is no solution for it,
which is more terrible than the fork() story [1].
[1] https://chromium-review.googlesource.com/c/chromium/src/+/2936184
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add memory management support for LoongArch, including: cache and tlb
management, page fault handling and ioremap/mmap support.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add process management support for LoongArch, including: thread info
definition, context switch and process tracing.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add the exception and interrupt handling machanism for basic LoongArch
support.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add basic boot, setup and reset routines for LoongArch. Now, LoongArch
machines use UEFI-based firmware. The firmware passes configuration
information to the kernel via ACPI and DMI/SMBIOS.
Currently an existing interface between the kernel and the bootloader
is implemented. Kernel gets 2 values from the bootloader, passed in
registers a0 and a1; a0 is an "EFI boot flag" distinguishing UEFI and
non-UEFI firmware, while a1 is a pointer to an FDT with systable,
memmap, cmdline and initrd information.
The standard UEFI boot protocol (EFISTUB) will be added later.
Cc: linux-efi@vger.kernel.org
Cc: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Co-developed-by: Yun Liu <liuyun@loongson.cn>
Signed-off-by: Yun Liu <liuyun@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add some other common headers for basic LoongArch support.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add common headers (atomic, bitops, barrier and locking) for basic
LoongArch support.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add common headers (CPU definition and address space layout) for basic
LoongArch support.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
LoongArch maintains cache coherency in hardware, but its WUC attribute
(Weak-ordered UnCached, which is similar to WC) is out of the scope of
cache coherency machanism. This means WUC can only used for write-only
memory regions.
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add some basic documentation (zh_CN version) for LoongArch. LoongArch is
a new RISC ISA, which is a bit like MIPS or RISC-V. LoongArch includes a
reduced 32-bit version (LA32R), a standard 32-bit version (LA32S) and a
64-bit version (LA64).
Reviewed-by: Alex Shi <alexs@kernel.org>
Reviewed-by: Yanteng Si <siyanteng@loongson.cn>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Co-developed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add some basic documentation for LoongArch. LoongArch is a new RISC ISA,
which is a bit like MIPS or RISC-V. LoongArch includes a reduced 32-bit
version (LA32R), a standard 32-bit version (LA32S) and a 64-bit version
(LA64).
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Co-developed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
liointc driver is shared by MIPS and LoongArch, this patch adjust the
code to fix build error for LoongArch.
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
HTVEC will be shared by both MIPS-based and LoongArch-based Loongson
processors (not only Loongson-3), so we adjust its description. HTPIC is
only used by MIPS-based Loongson, so we add a MIPS dependency.
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Merge series from Charles Keepax <ckeepax@opensource.cirrus.com>:
Mostly the usage of the SX controls seems to match the lowest gain
value + number of gain levels expected. The one notable exception
there being cs53l30 as David noted. However, there are a couple of
other places where the minimum value/TLVs are slightly incorrectly
specified.
The minimum value for the PGA Volume is given as 0x1A, however the
values from there to 0x19 are all the same volume and this is not
represented in the TLV structure. The number of volumes given is correct
so this leads to all the volumes being shifted. Move the minimum value
up to 0x19 to fix this.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220602162119.3393857-7-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
A couple of the SX volume controls specify 0x84 as the lowest volume
value, however the correct value from the datasheet is 0x44. The
datasheet don't include spaces in the value it displays as binary so
this was almost certainly just a typo reading 1000100.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220602162119.3393857-6-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The Bypass Volume is accidentally using a -6dB minimum TLV rather than
the correct -60dB minimum. Add a new TLV to correct this.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220602162119.3393857-5-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
This driver specified the maximum value rather than the number of volume
levels on the SX controls, this is incorrect, so correct them.
Reported-by: David Rhodes <david.rhodes@cirrus.com>
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220602162119.3393857-4-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The digital volume TLV specifies the step as 0.25dB but the actual step
of the control is 0.125dB. Update the TLV to correct this.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220602162119.3393857-3-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The datasheet specifies the range of the mixer volumes as between
-51.5dB and 12dB with a 0.5dB step. Update the TLVs for this.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220602162119.3393857-2-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
With the kernel 5.18, the system will hang on boot if it is compiled with
CONFIG_SCHED_MC. The last printed message is "Brought up 1 node, 1 CPU".
The crash happens in sd_init
tl->mask (which is cpu_coregroup_mask) returns an empty mask. This happens
because cpu_topology[0].core_sibling is empty.
Consequently, sd_span is set to an empty mask
sd_id = cpumask_first(sd_span) sets sd_id == NR_CPUS (because the mask is
empty)
sd->shared = *per_cpu_ptr(sdd->sds, sd_id); sets sd->shared to NULL
because sd_id is out of range
atomic_inc(&sd->shared->ref); crashes without printing anything
We can fix it by calling reset_cpu_topology() from init_cpu_topology() -
this will initialize the sibling masks on CPUs, so that they're not empty.
This patch also removes the variable "dualcores_found", it is useless,
because during boot, init_cpu_topology is called before
store_cpu_topology. Thus, set_sched_topology(parisc_mc_topology) is never
called. We don't need to call it at all because default_topology in
kernel/sched/topology.c contains the same items as parisc_mc_topology.
Note that we should not call store_cpu_topology() from init_per_cpu()
because it is called too early in the kernel initialization process and it
results in the message "Failure to register CPU0 device". Before this
patch, store_cpu_topology() would exit immediatelly because
cpuid_topo->core id was uninitialized and it was 0.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v5.18
Signed-off-by: Helge Deller <deller@gmx.de>
When being read, a sysfs attribute is already protected against removal
with the kobject node active reference counter. As a result, in
blk_ia_range_sysfs_show(), there is no need to take the queue sysfs
lock when reading the value of a range attribute. Using the queue sysfs
lock in this function creates a potential deadlock situation with the
disk removal, something that a lockdep signals with a splat when the
device is removed:
[ 760.703551] Possible unsafe locking scenario:
[ 760.703551]
[ 760.703554] CPU0 CPU1
[ 760.703556] ---- ----
[ 760.703558] lock(&q->sysfs_lock);
[ 760.703565] lock(kn->active#385);
[ 760.703573] lock(&q->sysfs_lock);
[ 760.703579] lock(kn->active#385);
[ 760.703587]
[ 760.703587] *** DEADLOCK ***
Solve this by removing the mutex_lock()/mutex_unlock() calls from
blk_ia_range_sysfs_show().
Fixes: a2247f19ee ("block: Add independent access ranges support")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20220603021905.1441419-1-damien.lemoal@opensource.wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This contains a single patch from a series that's ready to go for v5.10
but is also a shared build-time dependency for an IOMMU series that is
planned for v5.20. The idea is to take this into v5.19 to fulfill that
dependency and remove the need for close coordination for the two
series.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAmKXN/sTHHRyZWRpbmdA
bnZpZGlhLmNvbQAKCRDdI6zXfz6zoXZdEAC+jn6g9btaBl8UMF1SPkTnxHdXsjs2
Q9kD3Ipn7QN8UB0da7t1imqnT0yRbbGtn9qdh7Lq/vuF3YPu1b2UHjz9v09+4jK0
Vcw22nADbCiP6XcK0LEzXJTEomejSy7QWADo5aOnjg3exCbA2+6NOcMm9BuzlBOb
15bOi83L86WW2kyjUeHPVUYOCixwidgqDWjRPYwxQJr0n450dWqhbZJVzID72Ufl
hMKGFqtWqs5Wi2yepRxRSIS8VEMUtgkMo5XAsCCGep9E/f8cUCHNjFmQ5+Q2G2rr
ubczqaAxMamAHeOSMT3TWemUoLH19iqXgrh11JZzwKWTCIQWqdlYjON7p6oCreGo
+bZOHh/z9CV9sBWmzdTsnZcBkGT+0oei44Q/+uC32oJ2XK9US6rlVxicwzKiLAD8
2BwyBUbiq1hXwtcJ7a2SW54HavnVSkL76cHDrs0xAHILbYuW7TDVdwiOJvji9o7t
kzrTQsiRJ5iBNL593UyYqZo4Q6vVC2jl9hwdlgH9kWYyEPi05JV7UujkrgbKNHLP
A+ZppjcfQR+YgFgfRw1shSZM652dtv+RVyM9R/1IClocBURPqG8gZswnzj1gHjw0
FHtdH6wqKWg8OgXLGUntgpmMyiG6ioaGFGNkktTgUMnGCsrrasmX26dbxPetqCdQ
4I05RsVAFihTZA==
=7+me
-----END PGP SIGNATURE-----
Merge tag 'drm/tegra/for-5.19-prep-work' of https://gitlab.freedesktop.org/drm/tegra into drm-next
drm/tegra: Preparatory work for v5.19
This contains a single patch from a series that's ready to go for v5.10
but is also a shared build-time dependency for an IOMMU series that is
planned for v5.20. The idea is to take this into v5.19 to fulfill that
dependency and remove the need for close coordination for the two
series.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601100335.3841301-1-thierry.reding@gmail.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJil/W4AAoJEJMTUIYYBUIppm4H/1lqhAzQhbMSM8TDA5LUI2+2
e09mvq7KsZo30uHazCRwnci8AahDWFenfQ78LVOosQHGSXtSYpsgVs0czMWXvRlE
fPNlo75eAwPENNzux2Uo1K3uT32N0oeia0yYJqTzYuZg0Lmkx2cDPd4alUokXqaK
8J+6xJsY2d3YM0xl5gIFmM9gifd+jGFYrVsiEY8Dwr59zkdmoMD9/QIi5ZsWXH7G
LCJA6NUqr31TLrGA7hV/GAME9IFjxNWNotYLjxsoAUn5FPhwHY7EeJsloLJEl4HP
UCDJ47SfLfvYjUHeBXvUDbzTD2FPpWuDhharEqyLtStcVQjnpJCOZOO2e52qqMo=
=PgRN
-----END PGP SIGNATURE-----
Merge tag 'msm-next-5.19-fixes-06-01' of https://gitlab.freedesktop.org/abhinavk/msm into drm-next
5.19 fixes for msm-next
- Fix to add minimum ICC vote in the msm_mdss pm_resume path to address
bootup splats
- Fix to avoid dereferencing without checking in WB encoder
- Fix to avoid crash during suspend in DP driver by ensuring interrupt
mask bits are updated
- Remove unused code from dpu_encoder_virt_atomic_check()
- Fix to remove redundant init of dsc variable
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Abhinav Kumar <quic_abhinavk@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/927b201e-a734-a29d-b9fb-b9889e1f7795@quicinc.com
During TCP sockmap redirect pressure test, the following warning is triggered:
WARNING: CPU: 3 PID: 2145 at net/core/stream.c:205 sk_stream_kill_queues+0xbc/0xd0
CPU: 3 PID: 2145 Comm: iperf Kdump: loaded Tainted: G W 5.10.0+ #9
Call Trace:
inet_csk_destroy_sock+0x55/0x110
inet_csk_listen_stop+0xbb/0x380
tcp_close+0x41b/0x480
inet_release+0x42/0x80
__sock_release+0x3d/0xa0
sock_close+0x11/0x20
__fput+0x9d/0x240
task_work_run+0x62/0x90
exit_to_user_mode_prepare+0x110/0x120
syscall_exit_to_user_mode+0x27/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The reason we observed is that:
When the listener is closing, a connection may have completed the three-way
handshake but not accepted, and the client has sent some packets. The child
sks in accept queue release by inet_child_forget()->inet_csk_destroy_sock(),
but psocks of child sks have not released.
To fix, add sock_map_destroy to release psocks.
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220524075311.649153-1-wangyufen@huawei.com
One strategy employed by libbpf to guess the pointer size is by finding
the size of "unsigned long" type. This is achieved by looking for a type
of with the expected name and checking its size.
Unfortunately, the C syntax is friendlier to humans than to computers
as there is some variety in how such a type can be named. Specifically,
gcc and clang do not use the same names for integer types in debug info:
- clang uses "unsigned long"
- gcc uses "long unsigned int"
Lookup all the names for such a type so that libbpf can hope to find the
information it wants.
Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220524094447.332186-1-douglas.raillard@arm.com
Syzbot found a Use After Free bug in compute_effective_progs().
The reproducer creates a number of BPF links, and causes a fault
injected alloc to fail, while calling bpf_link_detach on them.
Link detach triggers the link to be freed by bpf_link_free(),
which calls __cgroup_bpf_detach() and update_effective_progs().
If the memory allocation in this function fails, the function restores
the pointer to the bpf_cgroup_link on the cgroup list, but the memory
gets freed just after it returns. After this, every subsequent call to
update_effective_progs() causes this already deallocated pointer to be
dereferenced in prog_list_length(), and triggers KASAN UAF error.
To fix this issue don't preserve the pointer to the prog or link in the
list, but remove it and replace it with a dummy prog without shrinking
the table. The subsequent call to __cgroup_bpf_detach() or
__cgroup_bpf_detach() will correct it.
Fixes: af6eea5743 ("bpf: Implement bpf_link-based cgroup BPF program attachment")
Reported-by: <syzbot+f264bffdfbd5614f3bb2@syzkaller.appspotmail.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://syzkaller.appspot.com/bug?id=8ebf179a95c2a2670f7cf1ba62429ec044369db4
Link: https://lore.kernel.org/bpf/20220517180420.87954-1-tadeusz.struk@linaro.org
bpf_object__btf() can return a NULL value. If bpf_object__btf returns
null, do not progress through codegen_asserts(). This avoids a null ptr
dereference at the call btf__type_cnt() in the function find_type_for_map()
Signed-off-by: Michael Mullin <masmullin@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220523194917.igkgorco42537arb@jup
In the commit da00d2f117 ("bpf: Add test ops for BPF_PROG_TYPE_TRACING"),
the bpf_fentry_test1 function was moved into bpf_prog_test_run_tracing(),
which is the test_run function of the tracing BPF programs.
Thus calling 'bpf_prog_test_run_opts(filter_fd, &topts)' will not trigger
bpf_fentry_test1 function as filter_fd is a sk_filter BPF program.
Fix it by replacing filter_fd with fexit_fd in the bpf_prog_test_run_opts()
function.
Fixes: da00d2f117 ("bpf: Add test ops for BPF_PROG_TYPE_TRACING")
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220521151329.648013-1-ytcoode@gmail.com