linux-xiaomi-chiron

Author	SHA1	Message	Date
Heiko Stuebner	ffb0b0afbd	riscv: move boot alternatives to after fill_hwcap Move the application of boot alternatives to after the hw-capabilities are populated. This allows to check for available extensions when determining which alternatives to apply and also makes it actually work if CONFIG_SMP is disabled for whatever reason. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20220511192921.2223629-8-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-05-11 21:36:32 -07:00
Heiko Stuebner	49b290e430	riscv: prevent compressed instructions in alternatives Instructions are opportunistically compressed by the RISC-V assembler when possible, but in alternatives-blocks both the old and new content need to be the same size, so having the toolchain do somewhat random optimizations will cause strange side-effects like "attempt to move .org backwards" compile-time errors. Already a simple "and" used in alternatives assembly will cause these mismatched code sizes. So prevent compressed instructions to be generated in alternatives- code and use option-push and -pop to only limit this to the relevant code blocks Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu> Link: https://lore.kernel.org/r/20220511192921.2223629-7-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-05-11 21:36:32 -07:00
Heiko Stuebner	e509204acb	riscv: extend concatenated alternatives-lines to the same length ALT_NEW_CONTENT already uses same-length assembler lines, so extend this to the other elements as well. This makes it more readable when these elements need to be extended in the future. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu> Link: https://lore.kernel.org/r/20220511192921.2223629-6-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-05-11 21:36:32 -07:00
Heiko Stuebner	fbdba60b81	riscv: implement ALTERNATIVE_2 macro When the alternatives were added the commit already provided a template on how to implement 2 different alternatives for one piece of code. Make this usable. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu> Link: https://lore.kernel.org/r/20220511192921.2223629-5-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-05-11 21:36:32 -07:00
Heiko Stuebner	a8e910168b	riscv: implement module alternatives This allows alternatives to also be applied when loading modules and follows the implementation of other architectures (e.g. arm64). Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu> Link: https://lore.kernel.org/r/20220511192921.2223629-4-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-05-11 21:36:31 -07:00
Heiko Stuebner	d14ca1f8d3	riscv: allow different stages with alternatives Future features may need to be applied at a different time during boot, so allow defining stages for alternatives and handling them differently depending on the stage. Also make the alternatives-location more flexible so that future stages may provide their own location. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu> Link: https://lore.kernel.org/r/20220511192921.2223629-3-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-05-11 21:36:31 -07:00
Heiko Stuebner	e64f737ad7	riscv: integrate alternatives better into the main architecture Right now the alternatives need to be explicitly enabled and erratas are limited to SiFive ones. We want to use alternatives not only for patching soc erratas, but in the future also for handling different behaviour depending on the existence of future extensions. So move the core alternatives over to the kernel subdirectory and move the CONFIG_RISCV_ALTERNATIVE to be a hidden symbol which we expect relevant erratas and extensions to just select if needed. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu> Link: https://lore.kernel.org/r/20220511192921.2223629-2-heiko@sntech.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-05-11 21:36:31 -07:00
Yuntao Wang	a2aa95b71c	bpf: Fix potential array overflow in bpf_trampoline_get_progs() The cnt value in the 'cnt >= BPF_MAX_TRAMP_PROGS' check does not include BPF_TRAMP_MODIFY_RETURN bpf programs, so the number of the attached BPF_TRAMP_MODIFY_RETURN bpf programs in a trampoline can exceed BPF_MAX_TRAMP_PROGS. When this happens, the assignment '*progs++ = aux->prog' in bpf_trampoline_get_progs() will cause progs array overflow as the progs field in the bpf_tramp_progs struct can only hold at most BPF_MAX_TRAMP_PROGS bpf programs. Fixes: `88fd9e5352` ("bpf: Refactor trampoline update code") Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Link: https://lore.kernel.org/r/20220430130803.210624-1-ytcoode@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 21:24:20 -07:00
Andrii Nakryiko	5790a2fee0	selftests/bpf: make fexit_stress test run in serial mode fexit_stress is attaching maximum allowed amount of fexit programs to bpf_fentry_test1 kernel function, which is used by a bunch of other parallel tests, thus pretty frequently interfering with their execution. Given the test assumes nothing else is attaching to bpf_fentry_test1, mark it serial. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220511232012.609370-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 18:22:21 -07:00
Alexei Starovoitov	0bed8f374a	Merge branch 'Introduce access remote cpu elem support in BPF percpu map' Feng zhou says: ==================== From: Feng Zhou <zhoufeng.zf@bytedance.com> Trace some functions, such as enqueue_task_fair, need to access the corresponding cpu, not the current cpu, and bpf_map_lookup_elem percpu map cannot do it. So add bpf_map_lookup_percpu_elem to accomplish it for percpu_array_map, percpu_hash_map, lru_percpu_hash_map. v1->v2: Addressed comments from Alexei Starovoitov. - add a selftest for bpf_map_lookup_percpu_elem. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 18:16:55 -07:00
Feng Zhou	ed7c13776e	selftests/bpf: add test case for bpf_map_lookup_percpu_elem test_progs: Tests new ebpf helpers bpf_map_lookup_percpu_elem. Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com> Link: https://lore.kernel.org/r/20220511093854.411-3-zhoufeng.zf@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 18:16:55 -07:00
Feng Zhou	07343110b2	bpf: add bpf_map_lookup_percpu_elem for percpu map Add new ebpf helpers bpf_map_lookup_percpu_elem. The implementation method is relatively simple, refer to the implementation method of map_lookup_elem of percpu map, increase the parameters of cpu, and obtain it according to the specified cpu. Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com> Link: https://lore.kernel.org/r/20220511093854.411-2-zhoufeng.zf@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 18:16:54 -07:00
Jakub Kicinski	a48ab883c4	bluetooth pull request for net: - Fix the creation of hdev->name when index is greater than 9999 -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmJ8U9YZHGx1aXoudm9u LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKRpdD/4sYxNpqdhUuScsTJ2oZzjT dLE8REPwE/Rvwcw7+eBtnce3L+/UlQQwxJrGbvsBtGsbLxqdT3YGHlTi9MQrty9I IDXxLvYRKs39YjumlpzNgoLJmRguIAZaGg+n/LdIqhh+AhXx6mKWD4Fp5ovoFOWW Z9vNqKWLRhTHHm88xNazOIdP5jw2YEZYaR6sVYfIF89gp2W4gIU+POMFo1DAhw1Q jtf+CrJFtkUiLxGz4iTKKJZcnJHItOQrrsigHgWBlCSRhZLKEXMjwMKQp9uVYcC9 WxqHh0TmfcjNA/E4k0i6fjXcd7zx38JTAtTQIkGyNrBaBHAxIZrbWeIQgJ/9gxpQ JuWGubpYeRFPYf9TvO5WdnMSLVibTqPf+LQYzicqNnIpsD8NUcno4AV9gtHC7GaE hbcOL7z/zOEr1cIkFis2bfNhYrzBgv2OJW0VWREt/OdV8bI0/ABXKlRqZIheC9+y LKxb+tTDCV4w2E9TkCeuehk/Pki4F7sJ5zRNlg/zrzkGZVp7909i5hy6lGESqOAI fs0J9dNXralljsEjOGEnJgyQFbdSHBlpE07NFWOo+Tt3l664pVglmmAz8SrMElUI F7PBPO2lwKUB2V2OM87bzeNBh2KnOkfZkJrFh3knXFc9jW32WAMORQb7ukW/E8le gOO8aZtRtEXtSEm8NB8ilw== =MbKC -----END PGP SIGNATURE----- Merge tag 'for-net-2022-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fix the creation of hdev->name when index is greater than 9999 * tag 'for-net-2022-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: Fix the creation of hdev->name ==================== Link: https://lore.kernel.org/r/20220512002901.823647-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-11 17:40:39 -07:00
Jakub Kicinski	8bf6008c8b	wireless fixes for v5.18 Second set of fixes for v5.18 and hopefully the last one. We have a new iwlwifi maintainer, a fix to rfkill ioctl interface and important fixes to both stack and two drivers. -----BEGIN PGP SIGNATURE----- iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmJ72akRHGt2YWxvQGtl cm5lbC5vcmcACgkQbhckVSbrbZso0wf+PmQev6QTWG/LPBfcIp7H6upRMS09/+Se SEhS9UAE/qh4GJM0Mn1XE6T5mcokQZ/Ck5uaWT3Be9Dwbbk/ucAvYGEvf4OxmUEM wwtDaA0BaFwS417iW5FLLAsu2ascN8yeje/+yK+Uu9DpB2KxXSIQB7OJpy3/HVAj jEavgZN/fQEiTba9/JDa6DBMm2RVAZrmc+1sB5FakUocVTuN2pZAkM+lOBXvlHS4 4jd/KEFDyto2BMOR46IOwXTNKgBk2UovqeYFrTdonMz7W7nhzWJcguFU0e6rej5q MCWxT8PryZ5yD5wl7pfOKZRTqFf+Mb+Up+yFipEEgd2SYnwxjplkaQ== =qa3F -----END PGP SIGNATURE----- Merge tag 'wireless-2022-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v5.18 Second set of fixes for v5.18 and hopefully the last one. We have a new iwlwifi maintainer, a fix to rfkill ioctl interface and important fixes to both stack and two drivers. * tag 'wireless-2022-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: rfkill: uapi: fix RFKILL_IOCTL_MAX_SIZE ioctl request definition nl80211: fix locking in nl80211_set_tx_bitrate_mask() mac80211_hwsim: call ieee80211_tx_prepare_skb under RCU protection mac80211_hwsim: fix RCU protected chanctx access mailmap: update Kalle Valo's email mac80211: Reset MBSSID parameters upon connection cfg80211: retrieve S1G operating channel number nl80211: validate S1G channel width mac80211: fix rx reordering with non explicit / psmp ack policy ath11k: reduce the wait time of 11d scan and hw scan while add interface MAINTAINERS: update iwlwifi driver maintainer iwlwifi: iwl-dbg: Use del_timer_sync() before freeing ==================== Link: https://lore.kernel.org/r/20220511154535.A1A12C340EE@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-11 17:33:01 -07:00
Itay Iellin	103a2f3255	Bluetooth: Fix the creation of hdev->name Set a size limit of 8 bytes of the written buffer to "hdev->name" including the terminating null byte, as the size of "hdev->name" is 8 bytes. If an id value which is greater than 9999 is allocated, then the "snprintf(hdev->name, sizeof(hdev->name), "hci%d", id)" function call would lead to a truncation of the id value in decimal notation. Set an explicit maximum id parameter in the id allocation function call. The id allocation function defines the maximum allocated id value as the maximum id parameter value minus one. Therefore, HCI_MAX_ID is defined as 10000. Signed-off-by: Itay Iellin <ieitayie@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2022-05-11 17:18:42 -07:00
Alexei Starovoitov	571b8739dd	Merge branch 'Follow ups for kptr series' Kumar Kartikeya Dwivedi says: ==================== Fix a build time warning, and address comments from Alexei on the merged version [0]. [0]: https://lore.kernel.org/bpf/20220424214901.2743946-1-memxor@gmail.com Changelog: ---------- v1 -> v2 v1: https://lore.kernel.org/bpf/20220510211727.575686-1-memxor@gmail.com * Add Fixes tag to patch 1 * Fix test_progs-noalu32 failure in CI due to different alloc_insn (Alexei) * Remove per-CPU struct, use global struct (Alexei) ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 16:57:27 -07:00
Kumar Kartikeya Dwivedi	0ef6740e97	selftests/bpf: Add tests for kptr_ref refcounting Check at runtime how various operations for kptr_ref affect its refcount and verify against the actual count. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220511194654.765705-5-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 16:57:27 -07:00
Kumar Kartikeya Dwivedi	04accf794b	selftests/bpf: Add negative C tests for kptrs This uses the newly added SEC("?foo") naming to disable autoload of programs, and then loads them one by one for the object and verifies that loading fails and matches the returned error string from verifier. This is similar to already existing verifier tests but provides coverage for BPF C. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220511194654.765705-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 16:57:27 -07:00
Kumar Kartikeya Dwivedi	5cdccadcac	bpf: Prepare prog_test_struct kfuncs for runtime tests In an effort to actually test the refcounting logic at runtime, add a refcount_t member to prog_test_ref_kfunc and use it in selftests to verify and test the whole logic more exhaustively. The kfunc calls for prog_test_member do not require runtime refcounting, as they are only used for verifier selftests, not during runtime execution. Hence, their implementation now has a WARN_ON_ONCE as it is not meant to be reachable code at runtime. It is strictly used in tests triggering failure cases in the verifier. bpf_kfunc_call_memb_release is called from map free path, since prog_test_member is embedded in map value for some verifier tests, so we skip WARN_ON_ONCE for it. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220511194654.765705-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 16:57:27 -07:00
Kumar Kartikeya Dwivedi	5b74c690e1	bpf: Fix sparse warning for bpf_kptr_xchg_proto Kernel Test Robot complained about missing static storage class annotation for bpf_kptr_xchg_proto variable. sparse: symbol 'bpf_kptr_xchg_proto' was not declared. Should it be static? This caused by missing extern definition in the header. Add it to suppress the sparse warning. Fixes: `c0a5a21c25` ("bpf: Allow storing referenced kptr in map") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220511194654.765705-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 16:57:27 -07:00
Jakub Kicinski	bb709987f1	Merge branch 'count-tc-taprio-window-drops-in-enetc-driver' Vladimir Oltean says: ==================== Count tc-taprio window drops in enetc driver This series includes a patch from Po Liu (no longer with NXP) which counts frames dropped by the tc-taprio offload in ethtool -S and in ndo_get_stats64. It also contains a preparation patch from myself. ==================== Link: https://lore.kernel.org/r/20220510163615.6096-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-11 16:37:12 -07:00
Po Liu	285e8dedb4	net: enetc: count the tc-taprio window drops The enetc scheduler for IEEE 802.1Qbv has 2 options (depending on PTGCR[TG_DROP_DISABLE]) when we attempt to send an oversized packet which will never fit in its allotted time slot for its traffic class: either block the entire port due to head-of-line blocking, or drop the packet and set a bit in the writeback format of the transmit buffer descriptor, allowing other packets to be sent. We obviously choose the second option in the driver, but we do not detect the drop condition, so from the perspective of the network stack, the packet is sent and no error counter is incremented. This change checks the writeback of the TX BD when tc-taprio is enabled, and increments a specific ethtool statistics counter and a generic "tx_dropped" counter in ndo_get_stats64. Signed-off-by: Po Liu <Po.Liu@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-11 16:37:10 -07:00
Vladimir Oltean	32bf8e1f6f	net: enetc: manage ENETC_F_QBV in priv->active_offloads only when enabled Future work in this driver would like to look at priv->active_offloads & ENETC_F_QBV to determine whether a tc-taprio qdisc offload was installed, but this does not produce the intended effect. All the other flags in priv->active_offloads are managed dynamically, except ENETC_F_QBV which is set statically based on the probed SI capability. This change makes priv->active_offloads & ENETC_F_QBV really track the presence of a tc-taprio schedule on the port. Some existing users, like the enetc_sched_speed_set() call from phylink_mac_link_up(), are best kept using the old logic: the tc-taprio offload does not re-trigger another link mode resolve, so the scheduler needs to be functional from the get go, as long as Qbv is supported at all on the port. So to preserve functionality there, look at the static station interface capability from pf->si->hw_features instead. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-11 16:37:10 -07:00
Jakub Kicinski	d7722973a1	Merge branch 'macb-napi-improvements' Robert Hancock says: ==================== MACB NAPI improvements Simplify the logic in the Cadence MACB/GEM driver for determining when to reschedule NAPI processing, and update it to use NAPI for the TX path as well as the RX path. Changes since v1: Changed to use separate TX and RX NAPI instances and poll functions to avoid unnecessary checks of the other ring (TX/RX) states during polling and to use budget handling for both RX and TX. Fixed locking to protect against concurrent access to TX ring on TX transmit and TX poll paths. ==================== Link: https://lore.kernel.org/r/20220509194635.3094080-1-robert.hancock@calian.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-11 16:14:15 -07:00
Robert Hancock	138badbc21	net: macb: use NAPI for TX completion path This driver was using the TX IRQ handler to perform all TX completion tasks. Under heavy TX network load, this can cause significant irqs-off latencies (found to be in the hundreds of microseconds using ftrace). This can cause other issues, such as overrunning serial UART FIFOs when using high baud rates with limited UART FIFO sizes. Switch to using a NAPI poll handler to perform the TX completion work to get this out of hard IRQ context and avoid the IRQ latency impact. A separate NAPI instance is used for TX and RX to avoid checking the other ring's state unnecessarily when doing the poll, and so that the NAPI budget handling can work for both TX and RX packets. A new per-queue tx_ptr_lock spinlock has been added to avoid using the main device lock (with IRQs needing to be disabled) across the entire TX mapping operation, and also to protect the TX queue pointers from concurrent access between the TX start and TX poll operations. The TX Used Bit Read interrupt (TXUBR) handling also needs to be moved into the TX NAPI poll handler to maintain the proper order of operations. A flag is used to notify the poll handler that a UBR condition needs to be handled. The macb_tx_restart handler has had some locking added for global register access, since this could now potentially happen concurrently on different queues. Signed-off-by: Robert Hancock <robert.hancock@calian.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-11 16:14:13 -07:00
Robert Hancock	1900e30d0e	net: macb: simplify/cleanup NAPI reschedule checking Previously the macb_poll method was checking the RSR register after completing its RX receive work to see if additional packets had been received since IRQs were disabled, since this controller does not maintain the pending IRQ status across IRQ disable. It also had to double-check the register after re-enabling IRQs to detect if packets were received after the first check but before IRQs were enabled. Using the RSR register for this purpose is problematic since it reflects the global device state rather than the per-queue state, so if packets are being received on multiple queues it may end up retriggering receive on a queue where the packets did not actually arrive and not on the one where they did arrive. This will also cause problems with an upcoming change to use NAPI for the TX path where use of multiple queues is more likely. Add a macb_rx_pending function to check the RX ring to see if more packets have arrived in the queue, and use that to check if NAPI should be rescheduled rather than the RSR register. By doing this, we can just ignore the global RSR register entirely, and thus save some extra device register accesses at the same time. This also makes the previous first check for pending packets rather redundant, since it would be checking the RX ring state which was just checked in the receive work function. Therefore we can get rid of it and just check after enabling interrupts whether packets are already pending. Signed-off-by: Robert Hancock <robert.hancock@calian.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-11 16:14:13 -07:00
Chengming Zhou	2a371f7d5f	blk-iocost: combine local_stat and desc_stat to stat When we flush usage, wait, indebt stat in iocg_flush_stat(), we use local_stat and desc_stat, which has no point since the leaf iocg only has local_stat and the inner iocg only has desc_stat. Also we don't need to flush percpu abs_vusage for these inner iocgs. This patch combine local_stat and desc_stat to stat, only flush percpu abs_vusage for active leaf iocgs, then build inner walk list to propagate. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220510034757.21761-1-zhouchengming@bytedance.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-05-11 17:05:52 -06:00
Delyan Kratunov	9c2136be08	sched/tracing: Append prev_state to tp args instead Commit `fa2c3254d7` (sched/tracing: Don't re-read p->state when emitting sched_switch event, 2022-01-20) added a new prev_state argument to the sched_switch tracepoint, before the prev task_struct pointer. This reordering of arguments broke BPF programs that use the raw tracepoint (e.g. tp_btf programs). The type of the second argument has changed and existing programs that assume a task_struct* argument (e.g. for bpf_task_storage access) will now fail to verify. If we instead append the new argument to the end, all existing programs would continue to work and can conditionally extract the prev_state argument on supported kernel versions. Fixes: `fa2c3254d7` (sched/tracing: Don't re-read p->state when emitting sched_switch event, 2022-01-20) Signed-off-by: Delyan Kratunov <delyank@fb.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/c8a6930dfdd58a4a5755fc01732675472979732b.camel@fb.com	2022-05-12 00:37:11 +02:00
Vladimir Oltean	11ecf3412b	net: dsa: ocelot: accept 1000base-X for VSC9959 and VSC9953 Switches using the Lynx PCS driver support 1000base-X optical SFP modules. Accept this interface type on a port. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220510164320.10313-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-11 15:24:37 -07:00
Yicong Yang	a91ee0e9fc	PCI: Avoid pci_dev_lock() AB/BA deadlock with sriov_numvfs_store() The sysfs sriov_numvfs_store() path acquires the device lock before the config space access lock: sriov_numvfs_store device_lock # A (1) acquire device lock sriov_configure vfio_pci_sriov_configure # (for example) vfio_pci_core_sriov_configure pci_disable_sriov sriov_disable pci_cfg_access_lock pci_wait_cfg # B (4) wait for dev->block_cfg_access == 0 Previously, pci_dev_lock() acquired the config space access lock before the device lock: pci_dev_lock pci_cfg_access_lock dev->block_cfg_access = 1 # B (2) set dev->block_cfg_access = 1 device_lock # A (3) wait for device lock Any path that uses pci_dev_lock(), e.g., pci_reset_function(), may deadlock with sriov_numvfs_store() if the operations occur in the sequence (1) (2) (3) (4). Avoid the deadlock by reversing the order in pci_dev_lock() so it acquires the device lock before the config space access lock, the same as the sriov_numvfs_store() path. [bhelgaas: combined and adapted commit log from Jay Zhou's independent subsequent posting: https://lore.kernel.org/r/20220404062539.1710-1-jianjay.zhou@huawei.com] Link: https://lore.kernel.org/linux-pci/1583489997-17156-1-git-send-email-yangyicong@hisilicon.com/ Also-posted-by: Jay Zhou <jianjay.zhou@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2022-05-11 17:20:09 -05:00
Xiaomeng Tong	3f95a7472d	i40e: i40e_main: fix a missing check on list iterator The bug is here: ret = i40e_add_macvlan_filter(hw, ch->seid, vdev->dev_addr, &aq_err); The list iterator 'ch' will point to a bogus position containing HEAD if the list is empty or no element is found. This case must be checked before any use of the iterator, otherwise it will lead to a invalid memory access. To fix this bug, use a new variable 'iter' as the list iterator, while use the origin variable 'ch' as a dedicated pointer to point to the found element. Cc: stable@vger.kernel.org Fixes: `1d8d80b4e4` ("i40e: Add macvlan support on i40e") Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220510204846.2166999-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-11 15:19:28 -07:00
Vladimir Oltean	b57c7e8b76	selftests: forwarding: tc_actions: allow mirred egress test to run on non-offloaded h2 The host interfaces $h1 and $h2 don't have to be switchdev interfaces, but due to the fact that we pass $tcflags which may have the value of "skip_sw", we force $h2 to offload a drop rule for dst_ip, something which it may not be able to do. The selftest only wants to verify the hit count of this rule as a means of figuring out whether the packet was received, so remove the $tcflags for it and let it be done in software. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20220510220904.284552-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-11 15:12:23 -07:00
Jakub Kicinski	ddae9bc467	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 1GbE Intel Wired LAN Driver Updates 2022-05-10 This series contains updates to igc driver only. Sasha cleans up the code by removing an unused function and removing an enum for PHY type as there is only one PHY. The return type for igc_check_downshift() is changed to void as it always returns success. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: igc: Change type of the 'igc_check_downshift' method igc: Remove unused phy_type enum igc: Remove igc_set_spd_dplx method ==================== Link: https://lore.kernel.org/r/20220510210656.2168393-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-11 15:11:32 -07:00
Paolo Abeni	8b796475fd	net/sched: act_pedit: really ensure the skb is writable Currently pedit tries to ensure that the accessed skb offset is writable via skb_unclone(). The action potentially allows touching any skb bytes, so it may end-up modifying shared data. The above causes some sporadic MPTCP self-test failures, due to this code: tc -n $ns2 filter add dev ns2eth$i egress \ protocol ip prio 1000 \ handle 42 fw \ action pedit munge offset 148 u8 invert \ pipe csum tcp \ index 100 The above modifies a data byte outside the skb head and the skb is a cloned one, carrying a TCP output packet. This change addresses the issue by keeping track of a rough over-estimate highest skb offset accessed by the action and ensuring such offset is really writable. Note that this may cause performance regressions in some scenarios, but hopefully pedit is not in the critical path. Fixes: `db2c24175d` ("act_pedit: access skb->data safely") Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/1fcf78e6679d0a287dd61bb0f04730ce33b3255d.1652194627.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-11 15:06:42 -07:00
Jason A. Donenfeld	065b8ced7c	openrisc: remove bogus nops and shutdowns Nop 42 is some leftover debugging thing by the looks of it. Nop 1 will shut down the simulator, which isn't what we want, since it makes it impossible to handle errors. Cc: Stafford Horne <shorne@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Stafford Horne <shorne@gmail.com>	2022-05-12 06:02:02 +09:00
Julia Lawall	d49401999a	openrisc: fix typos in comments Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Stafford Horne <shorne@gmail.com>	2022-05-12 06:01:32 +09:00
Yonghong Song	fd0ad6f1d1	selftests/bpf: fix a few clang compilation errors With latest clang, I got the following compilation errors: .../prog_tests/test_tunnel.c:291:6: error: variable 'local_ip_map_fd' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized] if (attach_tc_prog(&tc_hook, -1, set_dst_prog_fd)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .../bpf/prog_tests/test_tunnel.c:312:6: note: uninitialized use occurs here if (local_ip_map_fd >= 0) ^~~~~~~~~~~~~~~ ... .../prog_tests/kprobe_multi_test.c:346:6: error: variable 'err' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized] if (IS_ERR(map)) ^~~~~~~~~~~ .../prog_tests/kprobe_multi_test.c:388:6: note: uninitialized use occurs here if (err) { ^~~ This patch fixed the above compilation errors. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20220511184735.3670214-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-11 12:58:12 -07:00
Peter Zijlstra	31cae1eaae	sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state Currently ptrace_stop() / do_signal_stop() rely on the special states TASK_TRACED and TASK_STOPPED resp. to keep unique state. That is, this state exists only in task->__state and nowhere else. There's two spots of bother with this: - PREEMPT_RT has task->saved_state which complicates matters, meaning task_is_{traced,stopped}() needs to check an additional variable. - An alternative freezer implementation that itself relies on a special TASK state would loose TASK_TRACED/TASK_STOPPED and will result in misbehaviour. As such, add additional state to task->jobctl to track this state outside of task->__state. NOTE: this doesn't actually fix anything yet, just adds extra state. --EWB * didn't add a unnecessary newline in signal.h * Update t->jobctl in signal_wake_up and ptrace_signal_wake_up instead of in signal_wake_up_state. This prevents the clearing of TASK_STOPPED and TASK_TRACED from getting lost. * Added warnings if JOBCTL_STOPPED or JOBCTL_TRACED are not cleared Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220421150654.757693825@infradead.org Tested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20220505182645.497868-12-ebiederm@xmission.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2022-05-11 14:37:06 -05:00
Eric W. Biederman	5b4197cb28	ptrace: Always take siglock in ptrace_resume Make code analysis simpler and future changes easier by always taking siglock in ptrace_resume. Tested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20220505182645.497868-11-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-05-11 14:36:30 -05:00
Eric W. Biederman	2500ad1c7f	ptrace: Don't change __state Stop playing with tsk->__state to remove TASK_WAKEKILL while a ptrace command is executing. Instead remove TASK_WAKEKILL from the definition of TASK_TRACED, and implement a new jobctl flag TASK_PTRACE_FROZEN. This new flag is set in jobctl_freeze_task and cleared when ptrace_stop is awoken or in jobctl_unfreeze_task (when ptrace_stop remains asleep). In signal_wake_up add __TASK_TRACED to state along with TASK_WAKEKILL when the wake up is for a fatal signal. Skip adding __TASK_TRACED when TASK_PTRACE_FROZEN is not set. This has the same effect as changing TASK_TRACED to __TASK_TRACED as all of the wake_ups that use TASK_KILLABLE go through signal_wake_up. Handle a ptrace_stop being called with a pending fatal signal. Previously it would have been handled by schedule simply failing to sleep. As TASK_WAKEKILL is no longer part of TASK_TRACED schedule will sleep with a fatal_signal_pending. The code in signal_wake_up guarantees that the code will be awaked by any fatal signal that codes after TASK_TRACED is set. Previously the __state value of __TASK_TRACED was changed to TASK_RUNNING when woken up or back to TASK_TRACED when the code was left in ptrace_stop. Now when woken up ptrace_stop now clears JOBCTL_PTRACE_FROZEN and when left sleeping ptrace_unfreezed_traced clears JOBCTL_PTRACE_FROZEN. Tested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20220505182645.497868-10-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-05-11 14:35:32 -05:00
Eric W. Biederman	57b6de08b5	ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs Long ago and far away there was a BUG_ON at the start of ptrace_stop that did "BUG_ON(!(current->ptrace & PT_PTRACED));" [1]. The BUG_ON had never triggered but examination of the code showed that the BUG_ON could actually trigger. To complement removing the BUG_ON an attempt to better handle the race was added. The code detected the tracer had gone away and did not call do_notify_parent_cldstop. The code also attempted to prevent ptrace_report_syscall from sending spurious SIGTRAPs when the tracer went away. The code to detect when the tracer had gone away before sending a signal to tracer was a legitimate fix and continues to work to this date. The code to prevent sending spurious SIGTRAPs is a failure. At the time and until today the code only catches it when the tracer goes away after siglock is dropped and before read_lock is acquired. If the tracer goes away after read_lock is dropped a spurious SIGTRAP can still be sent to the tracee. The tracer going away after read_lock is dropped is the far likelier case as it is the bigger window. Given that the attempt to prevent the generation of a SIGTRAP was a failure and continues to be a failure remove the code that attempts to do that. This simplifies the code in ptrace_stop and makes ptrace_stop much easier to reason about. To successfully deal with the tracer going away, all of the tracer's instrumentation of the child would need to be removed, and reliably detecting when the tracer has set a signal to continue with would need to be implemented. [1] 66519f549ae5 ("[PATCH] fix ptracer death race yielding bogus BUG_ON") History-Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Tested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20220505182645.497868-9-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-05-11 14:35:00 -05:00
Eric W. Biederman	7b0fe1367e	ptrace: Document that wait_task_inactive can't fail After ptrace_freeze_traced succeeds it is known that the tracee has a __state value of __TASK_TRACED and that no __ptrace_unlink will happen because the tracer is waiting for the tracee, and the tracee is in ptrace_stop. The function ptrace_freeze_traced can succeed at any point after ptrace_stop has set TASK_TRACED and dropped siglock. The read_lock on tasklist_lock only excludes ptrace_attach. This means that the !current->ptrace which executes under a read_lock of tasklist_lock will never see a ptrace_freeze_trace as the tracer must have gone away before the tasklist_lock was taken and ptrace_attach can not occur until the read_lock is dropped. As ptrace_freeze_traced depends upon ptrace_attach running before it can run that excludes ptrace_freeze_traced until __state is set to TASK_RUNNING. This means that task_is_traced will fail in ptrace_freeze_attach and ptrace_freeze_attached will fail. On the current->ptrace branch of ptrace_stop which will be reached any time after ptrace_freeze_traced has succeed it is known that __state is __TASK_TRACED and schedule() will be called with that state. Use a WARN_ON_ONCE to document that wait_task_inactive(TASK_TRACED) should never fail. Remove the stale comment about may_ptrace_stop. Strictly speaking this is not true because if PREEMPT_RT is enabled wait_task_inactive can fail because __state can be changed. I don't see this as a problem as the ptrace code is currently broken on PREMPT_RT, and this is one of the issues. Failing and warning when the assumptions of the code are broken is good. Tested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20220505182645.497868-8-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-05-11 14:34:47 -05:00
Eric W. Biederman	6a2d90ba02	ptrace: Reimplement PTRACE_KILL by always sending SIGKILL The current implementation of PTRACE_KILL is buggy and has been for many years as it assumes it's target has stopped in ptrace_stop. At a quick skim it looks like this assumption has existed since ptrace support was added in linux v1.0. While PTRACE_KILL has been deprecated we can not remove it as a quick search with google code search reveals many existing programs calling it. When the ptracee is not stopped at ptrace_stop some fields would be set that are ignored except in ptrace_stop. Making the userspace visible behavior of PTRACE_KILL a noop in those case. As the usual rules are not obeyed it is not clear what the consequences are of calling PTRACE_KILL on a running process. Presumably userspace does not do this as it achieves nothing. Replace the implementation of PTRACE_KILL with a simple send_sig_info(SIGKILL) followed by a return 0. This changes the observable user space behavior only in that PTRACE_KILL on a process not stopped in ptrace_stop will also kill it. As that has always been the intent of the code this seems like a reasonable change. Cc: stable@vger.kernel.org Reported-by: Al Viro <viro@zeniv.linux.org.uk> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20220505182645.497868-7-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-05-11 14:34:28 -05:00
Eric W. Biederman	cb3c19c93d	signal: Use lockdep_assert_held instead of assert_spin_locked The distinction is that assert_spin_locked() checks if the lock is held byanyone* whereas lockdep_assert_held() asserts the current context holds the lock. Also, the check goes away if you build without lockdep. Suggested-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/Ympr/+PX4XgT/UKU@hirez.programming.kicks-ass.net Tested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20220505182645.497868-6-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-05-11 14:34:14 -05:00
Eric W. Biederman	16cc1bc67d	ptrace: Remove arch_ptrace_attach The last remaining implementation of arch_ptrace_attach is ia64's ptrace_attach_sync_user_rbs which was added at the end of 2007 in commit `aa91a2e900` ("[IA64] Synchronize RBS on PTRACE_ATTACH"). Reading the comments and examining the code ptrace_attach_sync_user_rbs has the sole purpose of saving registers to the stack when ptrace_attach changes TASK_STOPPED to TASK_TRACED. In all other cases arch_ptrace_stop takes care of the register saving. In commit `d79fdd6d96` ("ptrace: Clean transitions between TASK_STOPPED and TRACED") modified ptrace_attach to wake up the thread and enter ptrace_stop normally even when the thread starts out stopped. This makes ptrace_attach_sync_user_rbs completely unnecessary. So just remove it. I read through the code to verify that ptrace_attach_sync_user_rbs is unnecessary. What I found is that the code is quite dead. Reading ptrace_attach_sync_user_rbs it is easy to see that the it does nothing unless __state == TASK_STOPPED. Calling arch_ptrace_attach (aka ptrace_attach_sync_user_rbs) after ptrace_traceme it is easy to see that because we are talking about the current process the value of __state is TASK_RUNNING. Which means ptrace_attach_sync_user_rbs does nothing. The only other call of arch_ptrace_attach (aka ptrace_attach_sync_user_rbs) is after ptrace_attach. If the task is running (and PTRACE_SEIZE is not specified), a SIGSTOP is sent which results in do_signal_stop setting JOBCTL_TRAP_STOP on the target task (as it is ptraced) and the target task stopping in ptrace_stop with __state == TASK_TRACED. If the task was already stopped then ptrace_attach sets JOBCTL_TRAPPING and JOBCTL_TRAP_STOP, wakes it out of __TASK_STOPPED, and waits until the JOBCTL_TRAPPING_BIT is clear. At which point the task stops in ptrace_stop. In both cases there are a couple of funning excpetions such as if the traced task receiveds a SIGCONT, or is set a fatal signal. However in all of those cases the tracee never stops in __state TASK_STOPPED. Which is a long way of saying that ptrace_attach_sync_user_rbs is guaranteed never to do anything. Cc: linux-ia64@vger.kernel.org Tested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20220505182645.497868-4-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-05-11 14:33:54 -05:00
Eric W. Biederman	4a3d2717d1	ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP xtensa is the last user of the PT_SINGLESTEP flag. Changing tsk->ptrace in user_enable_single_step and user_disable_single_step without locking could potentiallly cause problems. So use a thread info flag instead of a flag in tsk->ptrace. Use TIF_SINGLESTEP that xtensa already had defined but unused. Remove the definitions of PT_SINGLESTEP and PT_BLOCKSTEP as they have no more users. Cc: stable@vger.kernel.org Acked-by: Max Filippov <jcmvbkbc@gmail.com> Tested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20220505182645.497868-4-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-05-11 14:33:44 -05:00
Eric W. Biederman	c200e4bb44	ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP User mode linux is the last user of the PT_DTRACE flag. Using the flag to indicate single stepping is a little confusing and worse changing tsk->ptrace without locking could potentionally cause problems. So use a thread info flag with a better name instead of flag in tsk->ptrace. Remove the definition PT_DTRACE as uml is the last user. Cc: stable@vger.kernel.org Acked-by: Johannes Berg <johannes@sipsolutions.net> Tested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20220505182645.497868-3-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-05-11 14:33:33 -05:00
Eric W. Biederman	e71ba12407	signal: Replace __group_send_sig_info with send_signal_locked The function __group_send_sig_info is just a light wrapper around send_signal_locked with one parameter fixed to a constant value. As the wrapper adds no real value update the code to directly call the wrapped function. Tested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20220505182645.497868-2-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-05-11 14:33:17 -05:00
Eric W. Biederman	157cc18122	signal: Rename send_signal send_signal_locked Rename send_signal and __send_signal to send_signal_locked and __send_signal_locked to make send_signal usable outside of signal.c. Tested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20220505182645.497868-1-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2022-05-11 14:32:57 -05:00
Jason Gunthorpe	ff806cbd90	vfio/pci: Remove vfio_device_get_from_dev() The last user of this function is in PCI callbacks that want to convert their struct pci_dev to a vfio_device. Instead of searching use the vfio_device available trivially through the drvdata. When a callback in the device_driver is called, the caller must hold the device_lock() on dev. The purpose of the device_lock is to prevent remove() from being called (see __device_release_driver), and allow the driver to safely interact with its drvdata without races. The PCI core correctly follows this and holds the device_lock() when calling error_detected (see report_error_detected) and sriov_configure (see sriov_numvfs_store). Further, since the drvdata holds a positive refcount on the vfio_device any access of the drvdata, under the device_lock(), from a driver callback needs no further protection or refcounting. Thus the remark in the vfio_device_get_from_dev() comment does not apply here, VFIO PCI drivers all call vfio_unregister_group_dev() from their remove callbacks under the device_lock() and cannot race with the remaining callers. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v4-c841817a0349+8f-vfio_get_from_dev_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-05-11 13:32:56 -06:00

... 102 103 104 105 106 ...

1105317 commits