linux-xiaomi-chiron

Author	SHA1	Message	Date
Eric Dumazet	68822bdf76	net: generalize skb freeing deferral to per-cpu lists Logic added in commit `f35f821935` ("tcp: defer skb freeing after socket lock is released") helped bulk TCP flows to move the cost of skbs frees outside of critical section where socket lock was held. But for RPC traffic, or hosts with RFS enabled, the solution is far from being ideal. For RPC traffic, recvmsg() has to return to user space right after skb payload has been consumed, meaning that BH handler has no chance to pick the skb before recvmsg() thread. This issue is more visible with BIG TCP, as more RPC fit one skb. For RFS, even if BH handler picks the skbs, they are still picked from the cpu on which user thread is running. Ideally, it is better to free the skbs (and associated page frags) on the cpu that originally allocated them. This patch removes the per socket anchor (sk->defer_list) and instead uses a per-cpu list, which will hold more skbs per round. This new per-cpu list is drained at the end of net_action_rx(), after incoming packets have been processed, to lower latencies. In normal conditions, skbs are added to the per-cpu list with no further action. In the (unlikely) cases where the cpu does not run net_action_rx() handler fast enough, we use an IPI to raise NET_RX_SOFTIRQ on the remote cpu. Also, we do not bother draining the per-cpu list from dev_cpu_dead() This is because skbs in this list have no requirement on how fast they should be freed. Note that we can add in the future a small per-cpu cache if we see any contention on sd->defer_lock. Tested on a pair of hosts with 100Gbit NIC, RFS enabled, and /proc/sys/net/ipv4/tcp_rmem[2] tuned to 16MB to work around page recycling strategy used by NIC driver (its page pool capacity being too small compared to number of skbs/pages held in sockets receive queues) Note that this tuning was only done to demonstrate worse conditions for skb freeing for this particular test. These conditions can happen in more general production workload. 10 runs of one TCP_STREAM flow Before: Average throughput: 49685 Mbit. Kernel profiles on cpu running user thread recvmsg() show high cost for skb freeing related functions () 57.81% [kernel] [k] copy_user_enhanced_fast_string () 12.87% [kernel] [k] skb_release_data () 4.25% [kernel] [k] __free_one_page () 3.57% [kernel] [k] __list_del_entry_valid 1.85% [kernel] [k] __netif_receive_skb_core 1.60% [kernel] [k] __skb_datagram_iter () 1.59% [kernel] [k] free_unref_page_commit () 1.16% [kernel] [k] __slab_free 1.16% [kernel] [k] _copy_to_iter () 1.01% [kernel] [k] kfree () 0.88% [kernel] [k] free_unref_page 0.57% [kernel] [k] ip6_rcv_core 0.55% [kernel] [k] ip6t_do_table 0.54% [kernel] [k] flush_smp_call_function_queue () 0.54% [kernel] [k] free_pcppages_bulk 0.51% [kernel] [k] llist_reverse_order 0.38% [kernel] [k] process_backlog () 0.38% [kernel] [k] free_pcp_prepare 0.37% [kernel] [k] tcp_recvmsg_locked () 0.37% [kernel] [k] __list_add_valid 0.34% [kernel] [k] sock_rfree 0.34% [kernel] [k] _raw_spin_lock_irq () 0.33% [kernel] [k] __page_cache_release 0.33% [kernel] [k] tcp_v6_rcv () 0.33% [kernel] [k] __put_page () 0.29% [kernel] [k] __mod_zone_page_state 0.27% [kernel] [k] _raw_spin_lock After patch: Average throughput: 73076 Mbit. Kernel profiles on cpu running user thread recvmsg() looks better: 81.35% [kernel] [k] copy_user_enhanced_fast_string 1.95% [kernel] [k] _copy_to_iter 1.95% [kernel] [k] __skb_datagram_iter 1.27% [kernel] [k] __netif_receive_skb_core 1.03% [kernel] [k] ip6t_do_table 0.60% [kernel] [k] sock_rfree 0.50% [kernel] [k] tcp_v6_rcv 0.47% [kernel] [k] ip6_rcv_core 0.45% [kernel] [k] read_tsc 0.44% [kernel] [k] _raw_spin_lock_irqsave 0.37% [kernel] [k] _raw_spin_lock 0.37% [kernel] [k] native_irq_return_iret 0.33% [kernel] [k] __inet6_lookup_established 0.31% [kernel] [k] ip6_protocol_deliver_rcu 0.29% [kernel] [k] tcp_rcv_established 0.29% [kernel] [k] llist_reverse_order v2: kdoc issue (kernel bots) do not defer if (alloc_cpu == smp_processor_id()) (Paolo) replace the sk_buff_head with a single-linked list (Jakub) add a READ_ONCE()/WRITE_ONCE() for the lockless read of sd->defer_list Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20220422201237.416238-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-26 17:05:59 -07:00
Guillaume Nault	5e7260712b	qed: Remove IP services API. qed_nvmetcp_ip_services.c and its corresponding header file were introduced in commit `806ee7f81a` ("qed: Add IP services APIs support") but there's still no users for any of the functions they declare. Since these files are effectively unused, let's just drop them. Found by code inspection. Compile-tested only. Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://lore.kernel.org/r/351ac8c847980e22850eb390553f8cc0e1ccd0ce.1650545051.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-22 15:15:23 -07:00
Kuniyuki Iwashima	89e9c72800	ipv6: Remove __ipv6_only_sock(). Since commit `9fe516ba3f` ("inet: move ipv6only in sock_common"), ipv6_only_sock() and __ipv6_only_sock() are the same macro. Let's remove the one. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-22 12:47:50 +01:00
Paolo Abeni	f70925bf99	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net drivers/net/ethernet/microchip/lan966x/lan966x_main.c `d08ed85256` ("net: lan966x: Make sure to release ptp interrupt") `c834963932` ("net: lan966x: Add FDMA functionality") Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-04-22 09:56:00 +02:00
Song Liu	559089e0a9	vmalloc: replace VM_NO_HUGE_VMAP with VM_ALLOW_HUGE_VMAP Huge page backed vmalloc memory could benefit performance in many cases. However, some users of vmalloc may not be ready to handle huge pages for various reasons: hardware constraints, potential pages split, etc. VM_NO_HUGE_VMAP was introduced to allow vmalloc users to opt-out huge pages. However, it is not easy to track down all the users that require the opt-out, as the allocation are passed different stacks and may cause issues in different layers. To address this issue, replace VM_NO_HUGE_VMAP with an opt-in flag, VM_ALLOW_HUGE_VMAP, so that users that benefit from huge pages could ask specificially. Also, remove vmalloc_no_huge() and add opt-in helper vmalloc_huge(). Fixes: `fac54e2bfb` ("x86/Kconfig: Select HAVE_ARCH_HUGE_VMALLOC with HAVE_ARCH_HUGE_VMAP") Link: https://lore.kernel.org/netdev/14444103-d51b-0fb3-ee63-c3f182f0b546@molgen.mpg.de/" Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org> Reviewed-by: Rik van Riel <riel@surriel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-19 12:08:57 -07:00
Christian Brauner	705191b03d	fs: fix acl translation Last cycle we extended the idmapped mounts infrastructure to support idmapped mounts of idmapped filesystems (No such filesystem yet exist.). Since then, the meaning of an idmapped mount is a mount whose idmapping is different from the filesystems idmapping. While doing that work we missed to adapt the acl translation helpers. They still assume that checking for the identity mapping is enough. But they need to use the no_idmapping() helper instead. Note, POSIX ACLs are always translated right at the userspace-kernel boundary using the caller's current idmapping and the initial idmapping. The order depends on whether we're coming from or going to userspace. The filesystem's idmapping doesn't matter at the border. Consequently, if a non-idmapped mount is passed we need to make sure to always pass the initial idmapping as the mount's idmapping and not the filesystem idmapping. Since it's irrelevant here it would yield invalid ids and prevent setting acls for filesystems that are mountable in a userns and support posix acls (tmpfs and fuse). I verified the regression reported in [1] and verified that this patch fixes it. A regression test will be added to xfstests in parallel. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215849 [1] Fixes: `bd303368b7` ("fs: support mapped mounts of mapped filesystems") Cc: Seth Forshee <sforshee@digitalocean.com> Cc: Christoph Hellwig <hch@lst.de> Cc: <stable@vger.kernel.org> # 5.17 Cc: <regressions@lists.linux.dev> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-19 10:19:02 -07:00
Marc Kleine-Budde	eb38c2053b	can: rx-offload: rename can_rx_offload_queue_sorted() -> can_rx_offload_queue_timestamp() This patch renames the function can_rx_offload_queue_sorted() to can_rx_offload_queue_timestamp(). This better describes what the function does, it adds a newly RX'ed skb to the sorted queue by its timestamp. Link: https://lore.kernel.org/all/20220417194327.2699059-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2022-04-19 16:58:04 +02:00
Tonghao Zhang	2f1e85b1ae	net: sched: use queue_mapping to pick tx queue This patch fixes issue: * If we install tc filters with act_skbedit in clsact hook. It doesn't work, because netdev_core_pick_tx() overwrites queue_mapping. $ tc filter ... action skbedit queue_mapping 1 And this patch is useful: * We can use FQ + EDT to implement efficient policies. Tx queues are picked by xps, ndo_select_queue of netdev driver, or skb hash in netdev_core_pick_tx(). In fact, the netdev driver, and skb hash are _not_ under control. xps uses the CPUs map to select Tx queues, but we can't figure out which task_struct of pod/containter running on this cpu in most case. We can use clsact filters to classify one pod/container traffic to one Tx queue. Why ? In containter networking environment, there are two kinds of pod/ containter/net-namespace. One kind (e.g. P1, P2), the high throughput is key in these applications. But avoid running out of network resource, the outbound traffic of these pods is limited, using or sharing one dedicated Tx queues assigned HTB/TBF/FQ Qdisc. Other kind of pods (e.g. Pn), the low latency of data access is key. And the traffic is not limited. Pods use or share other dedicated Tx queues assigned FIFO Qdisc. This choice provides two benefits. First, contention on the HTB/FQ Qdisc lock is significantly reduced since fewer CPUs contend for the same queue. More importantly, Qdisc contention can be eliminated completely if each CPU has its own FIFO Qdisc for the second kind of pods. There must be a mechanism in place to support classifying traffic based on pods/container to different Tx queues. Note that clsact is outside of Qdisc while Qdisc can run a classifier to select a sub-queue under the lock. In general recording the decision in the skb seems a little heavy handed. This patch introduces a per-CPU variable, suggested by Eric. The xmit.skip_txqueue flag is firstly cleared in __dev_queue_xmit(). - Tx Qdisc may install that skbedit actions, then xmit.skip_txqueue flag is set in qdisc->enqueue() though tx queue has been selected in netdev_tx_queue_mapping() or netdev_core_pick_tx(). That flag is cleared firstly in __dev_queue_xmit(), is useful: - Avoid picking Tx queue with netdev_tx_queue_mapping() in next netdev in such case: eth0 macvlan - eth0.3 vlan - eth0 ixgbe-phy: For example, eth0, macvlan in pod, which root Qdisc install skbedit queue_mapping, send packets to eth0.3, vlan in host. In __dev_queue_xmit() of eth0.3, clear the flag, does not select tx queue according to skb->queue_mapping because there is no filters in clsact or tx Qdisc of this netdev. Same action taked in eth0, ixgbe in Host. - Avoid picking Tx queue for next packet. If we set xmit.skip_txqueue in tx Qdisc (qdisc->enqueue()), the proper way to clear it is clearing it in __dev_queue_xmit when processing next packets. For performance reasons, use the static key. If user does not config the NET_EGRESS, the patch will not be compiled. +----+ +----+ +----+ \| P1 \| \| P2 \| \| Pn \| +----+ +----+ +----+ \| \| \| +-----------+-----------+ \| \| clsact/skbedit \| MQ v +-----------+-----------+ \| q0 \| q1 \| qn v v v HTB/FQ HTB/FQ ... FIFO Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jonathan Lemon <jonathan.lemon@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Alexander Lobakin <alobakin@pm.me> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Talal Ahmad <talalahmad@google.com> Cc: Kevin Hao <haokexin@gmail.com> Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org> Cc: Kees Cook <keescook@chromium.org> Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com> Cc: Antoine Tenart <atenart@kernel.org> Cc: Wei Wang <weiwan@google.com> Cc: Arnd Bergmann <arnd@arndb.de> Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-04-19 12:20:45 +02:00
Eric Dumazet	8fbf195798	tcp: add drop reason support to tcp_ofo_queue() packets in OFO queue might be redundant, and dropped. tcp_drop() is no longer needed. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-17 13:31:32 +01:00
Eric Dumazet	e7c89ae407	tcp: add drop reason support to tcp_prune_ofo_queue() Add one reason for packets dropped from OFO queue because of memory pressure. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-17 13:31:31 +01:00
Eric Dumazet	4b506af9c5	tcp: add two drop reasons for tcp_ack() Add TCP_TOO_OLD_ACK and TCP_ACK_UNSENT_DATA drop reasons so that tcp_rcv_established() can report them. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-17 13:31:31 +01:00
Eric Dumazet	669da7a718	tcp: add drop reasons to tcp_rcv_state_process() Add basic support for drop reasons in tcp_rcv_state_process() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-17 13:31:31 +01:00
Eric Dumazet	da40b613f8	tcp: add drop reason support to tcp_validate_incoming() Creates four new drop reasons for the following cases: 1) packet being rejected by RFC 7323 PAWS check 2) packet being rejected by SEQUENCE check 3) Invalid RST packet 4) Invalid SYN packet Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-17 13:31:31 +01:00
Arun Ajith S	f9a2fb7331	net/ipv6: Introduce accept_unsolicited_na knob to implement router-side changes for RFC9131 Add a new neighbour cache entry in STALE state for routers on receiving an unsolicited (gratuitous) neighbour advertisement with target link-layer-address option specified. This is similar to the arp_accept configuration for IPv4. A new sysctl endpoint is created to turn on this behaviour: /proc/sys/net/ipv6/conf/interface/accept_unsolicited_na. Signed-off-by: Arun Ajith S <aajith@arista.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-17 13:23:49 +01:00
Linus Torvalds	de6e933668	gpio fixes for v5.18-rc3 - fix the set/get_multiple() callbacks in gpio-sim - use correct format characters in gpiolib-acpi - use an unsigned type for pins in gpiolib-acpi -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmJbI2UACgkQEacuoBRx 13Iu1hAAmPpHwe64eECqWdCR0b6L/XZg4gihDaXa4EObL91gBNQL/SR+CnhCGmjm 8PJwoKSUIHyUGI9qQ83Na4jtDrcNV+9H2EgdgA7PHMDX6P5kS6NuaxzvJ7UPi+xB tBVFgqGwx25JAE71/22bmH3CDWTdPDXm9FSTPItQiOXweBBzJKNGQegJYzo3Z1M7 3jLwkvU0LUI7YvjZHRG8vSp2MV3PGWUsDnaBOQ+ajhBQLnEgTudQSsh9Z2e/zZVR MJ26FG5pkzbJ+fY4ern9S84PR/IDcPPMcjO63VHO8AY0gR78aNNBuOj7TTOdsbk6 Rs+z9LPFsNsJBNxqD6iWXMKK35ox6jThqja31w3v9C41uf1sfMFidaUqrBX+k6uN d8oX2VcvsXVtnh9KqKsAijpQQxkZbrfkfh/O4xy/IOZv/r6WHUUtzae/DelfmNhu eM+dVfaPywxQ+HSiqWjklgz6vKzAxNEJJbklFpbwfyJPeWucpD5HGhyWftI4QzsP 1kx0HxcQ20sE5Euiw8yqcR0ZaBc+Sdu7TU8kt2NbMxpBDf58Rs6bd19MsD/p9+Ge JF09kNTh3iWTbtpcK7yww1TccbuzvKUVImsUWWnjLIk8e3Ar0XP3LGWCs0WgqYEf z39CQJ8iVN2/xOtsm0WLUDnce6FATqc2O75sLB3qkf1TaMYQj/s= =Yt3p -----END PGP SIGNATURE----- Merge tag 'gpio-fixes-for-v5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: "A single fix for gpio-sim and two patches for GPIO ACPI pulled from Andy: - fix the set/get_multiple() callbacks in gpio-sim - use correct format characters in gpiolib-acpi - use an unsigned type for pins in gpiolib-acpi" * tag 'gpio-fixes-for-v5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: sim: fix setting and getting multiple lines gpiolib: acpi: Convert type for pin to be unsigned gpiolib: acpi: use correct format characters	2022-04-16 17:01:43 -07:00
Linus Torvalds	92edbe32e3	Random number generator fixes for Linux 5.18-rc3. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmJavk0ACgkQSfxwEqXe A640KA/9FHngPjK8AOuFJJRfdNYu5CWcn4PD/xmYGhP1QUuKOfIF6rUGrl7w4jCh +mKICOt2Rt2ZqSscS1ccpddiwAcoURrra6pW8Oo5IiuHlPBWR/jow9FzeLZpP4II 5mcWSZX4/MJFs9o6T+25FE9tPjsVcEi1hgYEU0ecXtXYK+mIUbPobF1pWlI+C61F 8sYQ4GrpJHLWreou7SYfzbI3siaXmewie8aYgrqvU8Bt7U2UVTA0j8VxeX7+r87/ xZ+/n+E3WOiEt2h0UyOB6+5bIL0t6qJI/plc8kQN/R5UHSoMrT06MdwrThI4exI3 YDDf488aKiYPdeQt3kFXH4o1PVK9R072Z+ZK63jfUxGOQs1DXI2DxCSqgO3fPfIR v9uy0kG6zWG+C8RuX3VV12gIL6/XOFRx7UHVwSnt+A1Li/DPEdNoYlKesSGOKuel vbXBmed2z5v02KMd1Y9LtrioLco8JDFYD0OEtbEaGjv7Kt+EcVtapo1e7N2VSmk6 IKxp7soEorJSR13rR5vyeF+yOZmxn6BOePc7m48O0Wqx76DHlyXiMPNTJUrHIC2z XfD8+P1mHA2Iz71T7YI9Dmzw9uIVZuUteiEpvc0o/uy8z3YzOUftnqQBsAKwbxz5 HKrIdkocQNbXK46GKpjK6OY8BCwOeRM+z/XdJwYDek7G1+1ULpg= =2JCC -----END PGP SIGNATURE----- Merge tag 'random-5.18-rc3-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator fixes from Jason Donenfeld: - Per your suggestion, random reads now won't fail if there's a page fault after some non-zero amount of data has been read, which makes the behavior consistent with all other reads in the kernel. - Rather than an inconsistent mix of random_get_entropy() returning an unsigned long or a cycles_t, now it just returns an unsigned long. - A memcpy() was replaced with an memmove(), because the addresses are sometimes overlapping. In practice the destination is always before the source, so not really an issue, but better to be correct than not. * tag 'random-5.18-rc3-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: random: use memmove instead of memcpy for remaining 32 bytes random: make random_get_entropy() return an unsigned long random: allow partial reads if later user copies fail	2022-04-16 16:42:53 -07:00
Bartosz Golaszewski	0ebb4fbe31	intel-gpio for v5.18-2 * Couple of fixes related to handling unsigned value of the pin from ACPI The following is an automated git shortlog grouped by driver: gpiolib: - acpi: Convert type for pin to be unsigned - acpi: use correct format characters -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEqaflIX74DDDzMJJtb7wzTHR8rCgFAmJUN7cACgkQb7wzTHR8 rCjeMQ/+KZHhBgPHE2Rl928bI/IQYuhcGjFmsLUt/KRFFsjVpzFQro9QI0IFRava /8j6cUy8mTV6v5cxScljusS7++j8K3C6kazsXr2ieGPKpaGDHtKs93QjbC52jF/T x/CI7NbnkvJHPWmpIYSuc/vGHHCHAAsnI62OHAIyEARRYBbd5VAkyrvnchH2skaK bHH0+1La4l2fBkmfA6fpnwLVnRNXEeRTUdQdGk/14Snv1zGNeciGR2/wbg0YtEtf CvEPi9U7d7RJCjBiege9kcb8R4ABOngypL1C9r7XvtFy17glxkc+6nrJx3NgW3bY Eksi4naML2oe4R/PyZ740wztJCUKdR6oV/OmyRz5FhTA4+ZWRz3pL3o2Rv+DTZ5g AcugQZCMe8mNGerFLPxBHpD1s0eD5Gxrnq7CeEpQ0wpBqBj90iZPAvE2mMTghqjH Vm9OsLX/X8/vAPLeMLgHy9tDwl5FRqkM7BxV3qBp14MBpPfzDmQc3VS5SGcw/car eZI7mQH85iSpnYjkqfDAwQtr3Jbxt8OuEnADWPiJknuTwGigQm9MLh37cW9xih0r BQWsHMO9ephWhDSwpKVl+7fbQpIUd2TQFFBOnzIW69ALevD16FkEdnn08HEI2Kd6 mEB60/aA04PNyS6EtbD4ZbCNDjdB3S7EITPPrVPeGycD2JAp5sA= =i+RD -----END PGP SIGNATURE----- Merge tag 'intel-gpio-v5.18-2' of gitolite.kernel.org:pub/scm/linux/kernel/git/andy/linux-gpio-intel into gpio/for-current intel-gpio for v5.18-2 * Couple of fixes related to handling unsigned value of the pin from ACPI gpiolib: - acpi: Convert type for pin to be unsigned - acpi: use correct format characters	2022-04-16 21:57:00 +02:00
Linus Torvalds	59250f8a7f	Merge branch 'akpm' (patches from Andrew) Merge misc fixes from Andrew Morton: "14 patches. Subsystems affected by this patch series: MAINTAINERS, binfmt, and mm (tmpfs, secretmem, kasan, kfence, pagealloc, zram, compaction, hugetlb, vmalloc, and kmemleak)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: kmemleak: take a full lowmem check in kmemleak_*_phys() mm/vmalloc: fix spinning drain_vmap_work after reading from /proc/vmcore revert "fs/binfmt_elf: use PT_LOAD p_align values for static PIE" revert "fs/binfmt_elf: fix PT_LOAD p_align values for loaders" hugetlb: do not demote poisoned hugetlb pages mm: compaction: fix compiler warning when CONFIG_COMPACTION=n mm: fix unexpected zeroed page mapping with zram swap mm, page_alloc: fix build_zonerefs_node() mm, kfence: support kmem_dump_obj() for KFENCE objects kasan: fix hw tags enablement when KUNIT tests are disabled irq_work: use kasan_record_aux_stack_noalloc() record callstack mm/secretmem: fix panic when growing a memfd_secret tmpfs: fix regressions from wider use of ZERO_PAGE MAINTAINERS: Broadcom internal lists aren't maintainers	2022-04-15 15:57:18 -07:00
Marco Elver	2dfe63e61c	mm, kfence: support kmem_dump_obj() for KFENCE objects Calling kmem_obj_info() via kmem_dump_obj() on KFENCE objects has been producing garbage data due to the object not actually being maintained by SLAB or SLUB. Fix this by implementing __kfence_obj_info() that copies relevant information to struct kmem_obj_info when the object was allocated by KFENCE; this is called by a common kmem_obj_info(), which also calls the slab/slub/slob specific variant now called __kmem_obj_info(). For completeness, kmem_dump_obj() now displays if the object was allocated by KFENCE. Link: https://lore.kernel.org/all/20220323090520.GG16885@xsang-OptiPlex-9020/ Link: https://lkml.kernel.org/r/20220406131558.3558585-1-elver@google.com Fixes: `b89fb5ef0c` ("mm, kfence: insert KFENCE hooks for SLUB") Fixes: `d3fb45f370` ("mm, kfence: insert KFENCE hooks for SLAB") Signed-off-by: Marco Elver <elver@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Reported-by: kernel test robot <oliver.sang@intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> [slab] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-15 14:49:55 -07:00
Jie Wang	4dc84c06a3	net: ethtool: extend ringparam set/get APIs for tx_push Currently tx push is a standard driver feature which controls use of a fast path descriptor push. So this patch extends the ringparam APIs and data structures to support set/get tx push by ethtool -G/g. Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-15 11:41:35 -07:00
Linus Torvalds	fb649bda6f	block-5.18-2022-04-15 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmJZaB0QHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpr97EACWF0YUnJ8H3CHLKtQ19EvHVdx8I4b96NSI 9zNVUCUBau4BfrutkddMpCaqeRMY1tdgoHB1kDACFJV3YV7PmJbQoTe8OQZ/Zo4c OaXYEaijHylLFhKIp9/zmKSROcSFfDk3qtPkk6z55QuXveQ1UIFlMKkIga161CRQ SaHD7D5oJVe0AgWkxsTu+D9eSd8mRvgEfLpTzIOWq/CYj8IT3ZoJMBApg9Z0KwdF ro2qUOpRuhkIYNi5dneNyBf7X9q0RLPWakzY4dm5icronpiR7RchvPDPvaoVaCfO LpsU4Hgk0CUnzyUBuA7QDPEo/XK44g/leK5HGhZlXLPPTm6ajYMzlcnX71Png8Rq Npe3lc1DaB11Vcph+8GuhoAGpniS03/6GPzpmLTf17OQcR2w7Sr8OuW8bIjUDxp6 Qv9tphS+jthCwYol3Vf4jkAQIeWsi5ZALCw9+/yG8WksY+DtDtnXeA1qRmWBRh3l soPKuzvNM3ZiAJcrpF6twVsn07x7tqU5Ab3ZzjaAbVXPciOSPmJtI3LQC4QCXf7X kEhsfq3Ndmcybjs2ML0KBoSFqZQcQ5RY/5IlSU/eEMOT1CVTflkX2S8UbkYiINHw rpCyX+oMUCIXD/7pOghuAhiXgPyt5gHA1zCEf4JZYwjZJDzT/5w1QMrlB2rrlipA rkcj6ikM9A== =7gVd -----END PGP SIGNATURE----- Merge tag 'block-5.18-2022-04-15' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: - Moving of lower_48_bits() to the block layer and a fix for the unaligned_be48 added with that originally (Alexander, Keith) - Fix a bad WARN_ON() for trim size checking (Ming) - A polled IO timeout fix for null_blk (Ming) - Silence IO error printing for dead disks (Christoph) - Compat mode range fix (Khazhismel) - NVMe pull request via Christoph: - Tone down the error logging added this merge window a bit (Chaitanya Kulkarni) - Quirk devices with non-unique unique identifiers (Christoph) * tag 'block-5.18-2022-04-15' of git://git.kernel.dk/linux-block: block: don't print I/O error warning for dead disks block/compat_ioctl: fix range check in BLKGETSIZE nvme-pci: disable namespace identifiers for Qemu controllers nvme-pci: disable namespace identifiers for the MAXIO MAP1002/1202 nvme: add a quirk to disable namespace identifiers nvme: don't print verbose errors for internal passthrough requests block: null_blk: end timed out poll request block: fix offset/size check in bio_trim() asm-generic: fix __get_unaligned_be48() on 32 bit platforms block: move lower_48_bits() to block	2022-04-15 11:38:55 -07:00
Paolo Abeni	edf45f007a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2022-04-15 09:26:00 +02:00
Linus Torvalds	38a5e3fb17	VFIO fixes for v5.18-rc3 - Fix VF token checking for vfio-pci variant drivers (Jason Gunthorpe) -----BEGIN PGP SIGNATURE----- iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmJYeycbHGFsZXgud2ls bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsi7J8QAJSuFZvnYNG5MJ6mlUwU iYtCojDdzKloFFcbZU1J1v2dyLuYsaQb0YRwOFJdMOAteFHCjOJzyKCaek4JTK6x uk4MPJWWb3EHP7kNgFsH/EqXJiyjQGJpD3gJNtGNDGjEvAbtBpy/cYoDqPxy3mhx fLHox4O1eWVsBh2Kzgf2GikrDOrK+Xd0zT+7x4SbCRimbRaLQxtndIOuSIEBbGmZ zH4dz5RZ0/1zRLupDfj9voGGIRSxgWjSAvUXBhCxeQKGgHL7zoDkSHbxCmIZN1B5 6RzjbeJPnUSno4QM6WU7hl3lv6qq+VHAb58F7SJCaQGueV6wRGrym1RAxeZQPO0R SmW7s7lPgWG5qXX9glAgha9Nv0gTs1+TEQprYG9pJShvSFGffpo/LHiMcvJDoiND jqnbY5smFKGE/uBDeamESUwotPFiFR9yLug2kZ8qgFRXfYS91/omDlyvii6wOtd/ R9IAtqpQgnpCk7+gH/y8Cf2SlYnffKY3d+dSii67dOcyJoH8KwrRqYp8SWBFir6L VX7n2/zIjxfIiuMjE/YzpW2StZnytxsHS6KT5lJVSbIz4KwnWu0kbBU0SP/i8Tti nGI7QdN4itjN6aYXb2WRbT1RTJZbslIiwlJR1ouO+nn9SGBo9X92ZTGDeMNdIZ6Q BNxxOFN0EgcPu+h9c0Spb8ui =Gif3 -----END PGP SIGNATURE----- Merge tag 'vfio-v5.18-rc3' of https://github.com/awilliam/linux-vfio Pull vfio fix from Alex Williamson: - Fix VF token checking for vfio-pci variant drivers (Jason Gunthorpe) * tag 'vfio-v5.18-rc3' of https://github.com/awilliam/linux-vfio: vfio/pci: Fix vf_token mechanism when device-specific VF drivers are used	2022-04-14 16:06:56 -07:00
Linus Torvalds	ec9c57a732	fscache fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmJW6J0ACgkQ+7dXa6fL C2sXZw//Z5if0BzqI0SjZ7JeozZ4ADgAKYrT0WvYRg7TPfJBThZsa9NEaQfueXBp 0DlZpM/ToNQXMYV/jRlDjzooPstKt45IdS4wMewTtXZAXDuZNDygOIRTY5S3zKHY wW4ZuZ/3AtNejY8YAnxFtykcbVfxMzYkQ+cgX79fd/+ZLgpYImtk4k3fS53pyRhv Q6bT7zxIOIGzaPVda/lu4t1vdTXNTNI2b/9lYOPEXMC6OFmHQ94dnqsyW6AKUD6K oYldKUFIFcv5CeiH8sTXFrkckNqgkUln5OXdkFLUjRPHE8Yyq57ZyX7ilp4yFdfS HE8VpDaDfk11hVpFLngkPu5RJKsRbad5CQ+Jl8+xiEn0fbuR4q7LFvqSVoykh/x5 hh5veuSc/KvfpRIijNlCff4gry9gnAhHlLbt8xFbpNWPaYFTVdkkuDtoB8Kv/KoV HmxyNyQEW2eL7i7JXfSI4EjNPI85RVhVXmccbJJoKHQRvSVCZuWBlR7NUkykzuXg CQbnO/SeqilpOpbFvhTUXdOXNQEHbtcKp9yO9hd/f67E7gNKiDOewOmItMt4HfFx pjKv48tLR1cnZtL8pNf7nSMKuGe0bofDu46rnra0Z4mcz4d4y9mkIrdoIVT/heae E3Ut/1GJ0MkETl+nYXdV9IxTMHWIVGyzt7HyJTziXj03dfVquqA= =fUQK -----END PGP SIGNATURE----- Merge tag 'fscache-fixes-20220413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull fscache fixes from David Howells: "Here's a collection of fscache and cachefiles fixes and misc small cleanups. The two main fixes are: - Add a missing unmark of the inode in-use mark in an error path. - Fix a KASAN slab-out-of-bounds error when setting the xattr on a cachefiles volume due to the wrong length being given to memcpy(). In addition, there's the removal of an unused parameter, removal of an unused Kconfig option, conditionalising a bit of procfs-related stuff and some doc fixes" * tag 'fscache-fixes-20220413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: fscache: remove FSCACHE_OLD_API Kconfig option fscache: Use wrapper fscache_set_cache_state() directly when relinquishing fscache: Move fscache_cookies_seq_ops specific code under CONFIG_PROC_FS fscache: Remove the cookie parameter from fscache_clear_page_bits() docs: filesystems: caching/backend-api.rst: fix an object withdrawn API docs: filesystems: caching/backend-api.rst: correct two relinquish APIs use cachefiles: Fix KASAN slab-out-of-bounds in cachefiles_set_volume_xattr cachefiles: unmark inode in use in error path	2022-04-14 10:51:20 -07:00
Lech Perczak	36e747972d	rndis_host: enable the bogus MAC fixup for ZTE devices from cdc_ether Certain ZTE modems, namely: MF823. MF831, MF910, built-in modem from MF286R, expose both CDC-ECM and RNDIS network interfaces. They have a trait of ignoring the locally-administered MAC address configured on the interface both in CDC-ECM and RNDIS part, and this leads to dropping of incoming traffic by the host. However, the workaround was only present in CDC-ECM, and MF286R explicitly requires it in RNDIS mode. Re-use the workaround in rndis_host as well, to fix operation of MF286R module, some versions of which expose only the RNDIS interface. Do so by introducing new flag, RNDIS_DRIVER_DATA_DST_MAC_FIXUP, and testing for it in rndis_rx_fixup. This is required, as RNDIS uses frame batching, and all of the packets inside the batch need the fixup. This might introduce a performance penalty, because test is done for every returned Ethernet frame. Apply the workaround to both "flavors" of RNDIS interfaces, as older ZTE modems, like MF823 found in the wild, report the USB_CLASS_COMM class interfaces, while MF286R reports USB_CLASS_WIRELESS_CONTROLLER. Suggested-by: Bjørn Mork <bjorn@mork.no> Cc: Kristian Evensen <kristian.evensen@gmail.com> Cc: Oliver Neukum <oliver@neukum.org> Signed-off-by: Lech Perczak <lech.perczak@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-04-14 15:08:12 +02:00
Lech Perczak	64b97df995	cdc_ether: export usbnet_cdc_zte_rx_fixup Commit `bfe9b9d2df` ("cdc_ether: Improve ZTE MF823/831/910 handling") introduces a workaround for certain ZTE modems reporting invalid MAC addresses over CDC-ECM. The same issue was present on their RNDIS interface,which was fixed in commit `a5a18bdf74` ("rndis_host: Set valid random MAC on buggy devices"). However, internal modem of ZTE MF286R router, on its RNDIS interface, also exhibits a second issue fixed already in CDC-ECM, of the device not respecting configured random MAC address. In order to share the fixup for this with rndis_host driver, export the workaround function, which will be re-used in the following commit in rndis_host. Cc: Kristian Evensen <kristian.evensen@gmail.com> Cc: Bjørn Mork <bjorn@mork.no> Cc: Oliver Neukum <oliver@neukum.org> Signed-off-by: Lech Perczak <lech.perczak@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-04-14 15:08:12 +02:00
Jason Gunthorpe	1ef3342a93	vfio/pci: Fix vf_token mechanism when device-specific VF drivers are used get_pf_vdev() tries to check if a PF is a VFIO PF by looking at the driver: if (pci_dev_driver(physfn) != pci_dev_driver(vdev->pdev)) { However now that we have multiple VF and PF drivers this is no longer reliable. This means that security tests realted to vf_token can be skipped by mixing and matching different VFIO PCI drivers. Instead of trying to use the driver core to find the PF devices maintain a linked list of all PF vfio_pci_core_device's that we have called pci_enable_sriov() on. When registering a VF just search the list to see if the PF is present and record the match permanently in the struct. PCI core locking prevents a PF from passing pci_disable_sriov() while VF drivers are attached so the VFIO owned PF becomes a static property of the VF. In common cases where vfio does not own the PF the global list remains empty and the VF's pointer is statically NULL. This also fixes a lockdep splat from recursive locking of the vfio_group::device_lock between vfio_device_get_from_name() and vfio_device_get_from_dev(). If the VF and PF share the same group this would deadlock. Fixes: `ff53edf6d6` ("vfio/pci: Split the pci_driver code out of vfio_pci_core.c") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/0-v3-876570980634+f2e8-vfio_vf_token_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2022-04-13 11:37:44 -06:00
Menglong Dong	1ad6d548e2	net: icmp: introduce function icmpv6_param_prob_reason() In order to add the skb drop reasons support to icmpv6_param_prob(), introduce the function icmpv6_param_prob_reason() and make icmpv6_param_prob() an inline call to it. This new function will be used in the following patches. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-13 13:09:57 +01:00
Menglong Dong	2edc1a383f	net: ip: add skb drop reasons to ip forwarding Replace kfree_skb() which is used in ip6_forward() and ip_forward() with kfree_skb_reason(). The new drop reason 'SKB_DROP_REASON_PKT_TOO_BIG' is introduced for the case that the length of the packet exceeds MTU and can't fragment. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-13 13:09:57 +01:00
Menglong Dong	c4eb664191	net: ipv4: add skb drop reasons to ip_error() Eventually, I find out the handler function for inputting route lookup fail: ip_error(). The drop reasons we used in ip_error() are almost corresponding to IPSTATS_MIB_, and following new reasons are introduced: SKB_DROP_REASON_IP_INADDRERRORS SKB_DROP_REASON_IP_INNOROUTES Isn't the name SKB_DROP_REASON_IP_HOSTUNREACH and SKB_DROP_REASON_IP_NETUNREACH more accurate? To make them corresponding to IPSTATS_MIB_, we keep their name still. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-13 13:09:57 +01:00
Menglong Dong	d6d3146ce5	skb: add some helpers for skb drop reasons In order to simply the definition and assignment for 'enum skb_drop_reason', introduce some helpers. SKB_DR() is used to define a variable of type 'enum skb_drop_reason' with the 'SKB_DROP_REASON_NOT_SPECIFIED' initial value. SKB_DR_SET() is used to set the value of the variable. Seems it is a little useless? But it makes the code shorter. SKB_DR_OR() is used to set the value of the variable if it is not set yet, which means its value is SKB_DROP_REASON_NOT_SPECIFIED. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-13 13:09:56 +01:00
Jason A. Donenfeld	b0c3e796f2	random: make random_get_entropy() return an unsigned long Some implementations were returning type `unsigned long`, while others that fell back to get_cycles() were implicitly returning a `cycles_t` or an untyped constant int literal. That makes for weird and confusing code, and basically all code in the kernel already handled it like it was an `unsigned long`. I recently tried to handle it as the largest type it could be, a `cycles_t`, but doing so doesn't really help with much. Instead let's just make random_get_entropy() return an unsigned long all the time. This also matches the commonly used `arch_get_random_long()` function, so now RDRAND and RDTSC return the same sized integer, which means one can fallback to the other more gracefully. Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Theodore Ts'o <tytso@mit.edu> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>	2022-04-13 13:58:57 +02:00
Nikolay Aleksandrov	1306d5362a	net: add ndo_fdb_del_bulk Add a new netdev op called ndo_fdb_del_bulk, it will be later used for driver-specific bulk delete implementation dispatched from rtnetlink. The first user will be the bridge, we need it to signal to rtnetlink from the driver that we support bulk delete operation (NLM_F_BULK). Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-13 12:46:26 +01:00
Russell King (Oracle)	1a95e04e29	net: phylink: remove phylink_helper_basex_speed() As there are now no users of phylink_helper_basex_speed(), we can remove this obsolete functionality. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-13 12:38:44 +01:00
Linus Torvalds	c1488c9751	NFSD bug fixes for 5.18-rc: - Fix a write performance regression - Fix crashes during request deferral on RDMA transports -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmJVnIgACgkQM2qzM29m f5cdQA//SS2pbiAcjePL6zuVrd/eeALWRJIdCj8fP9eQ4dfi7OM0xQE3qHyLDC3n 2fjciUSnoE+lIYiRXD7Ii+GqYThSrrxlR7aCT+vrIAL/ksnzTeUiUSuEihPOOYjn ZEz9vkP4wlCTFh4SdeBoETEjksvRU3ql37l4S04nxxgidpY/NkF5wZH5iUlV8Z14 Q8OxJJk1tFJARPz6RqGrRkMws24NhYzeuXQktVA+AOoKjUeQSp4bN1ZSCJv3eX/E EjAPDFMYVdenp8OY9RJoP3Xpxb6e1mv5flXYQa7YpJUVH3AccMJ8aKuFadjGfGNS 2tEzoipJR0fmxkxpDX1g0Bzm8fAIc467l4tskVzUxLnqNS/FjXv2P85h8p9U/R06 BWbF5CxRu89h1tZ/ZIh4lfw/Ro94XvYaYCDeu9V1TndP+QroNDDY9Ypl13+uy6uS zLwEEo5nMUbc7FVmT7UidHqeEukwbNzmXEXYrgQD5hRaT9L+85R0L5/Y+gi+ODDK 7SKu1Bomi3WN7WKzjvZspHMivVT6JNy/ngHlKSYkWYl/dJzMy5I4Z4sNRVCHKoKX 17SpKHfKkZAt5oS54dZ40O1PhICZTMclAB7Vb8bs/yggrHTsqwqf21v8FTJW79K9 MjnYigEWgwi62GC5WdZr5LqAkqRHi2K2rLnFwVSLtZvpb2sxPdM= =3AeS -----END PGP SIGNATURE----- Merge tag 'nfsd-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix a write performance regression - Fix crashes during request deferral on RDMA transports * tag 'nfsd-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: Fix the svc_deferred_event trace class SUNRPC: Fix NFSD's request deferral on RDMA transports nfsd: Clean up nfsd_file_put() nfsd: Fix a write performance regression SUNRPC: Return true/false (not 1/0) from bool functions	2022-04-12 14:23:19 -10:00
Jakub Kicinski	e69a837f58	Merge branch 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Leon Romanovsky says: ==================== Mellanox shared branch that includes: * Removal of FPGA TLS code https://lore.kernel.org/all/cover.1649073691.git.leonro@nvidia.com Mellanox INNOVA TLS cards are EOL in May, 2018 [1]. As such, the code is unmaintained, untested and not in-use by any upstream/distro oriented customers. In order to reduce code complexity, drop the kernel code, clean build config options and delete useless kTLS vs. TLS separation. [1] https://network.nvidia.com/related-docs/eol/LCR-000286.pdf * Removal of FPGA IPsec code https://lore.kernel.org/all/cover.1649232994.git.leonro@nvidia.com Together with FPGA TLS, the IPsec went to EOL state in the November of 2019 [1]. Exactly like FPGA TLS, no active customers exist for this upstream code and all the complexity around that area can be deleted. [2] https://network.nvidia.com/related-docs/eol/LCR-000535.pdf * Fix to undefined behavior from Borislav https://lore.kernel.org/all/20220405151517.29753-11-bp@alien8.de * 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: (23 commits) net/mlx5: Remove not-implemented IPsec capabilities net/mlx5: Remove ipsec_ops function table net/mlx5: Reduce kconfig complexity while building crypto support net/mlx5: Move IPsec file to relevant directory net/mlx5: Remove not-needed IPsec config net/mlx5: Align flow steering allocation namespace to common style net/mlx5: Unify device IPsec capabilities check net/mlx5: Remove useless IPsec device checks net/mlx5: Remove ipsec vs. ipsec offload file separation RDMA/core: Delete IPsec flow action logic from the core RDMA/mlx5: Drop crypto flow steering API RDMA/mlx5: Delete never supported IPsec flow action net/mlx5: Remove FPGA ipsec specific statistics net/mlx5: Remove XFRM no_trailer flag net/mlx5: Remove not-used IDA field from IPsec struct net/mlx5: Delete metadata handling logic net/mlx5_fpga: Drop INNOVA IPsec support IB/mlx5: Fix undefined behavior due to shift overflowing the constant net/mlx5: Cleanup kTLS function names and their exposure net/mlx5: Remove tls vs. ktls separation as it is the same ... ==================== Link: https://lore.kernel.org/r/20220409055303.1223644-1-leon@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-11 20:34:02 -07:00
Keith Busch	868e6139c5	block: move lower_48_bits() to block The function is not generally applicable enough to be included in the core kernel header. Move it to block since it's the only subsystem using it. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20220327173316.315-1-kbusch@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-04-11 19:18:27 -06:00
Menglong Dong	b384c95a86	net: icmp: add skb drop reasons to icmp protocol Replace kfree_skb() used in icmp_rcv() and icmpv6_rcv() with kfree_skb_reason(). In order to get the reasons of the skb drops after icmp message handle, we change the return type of 'handler()' in 'struct icmp_control' from 'bool' to 'enum skb_drop_reason'. This may change its original intention, as 'false' means failure, but 'SKB_NOT_DROPPED_YET' means success now. Therefore, all 'handler' and the call of them need to be handled. Following 'handler' functions are involved: icmp_unreach() icmp_redirect() icmp_echo() icmp_timestamp() icmp_discard() And following new drop reasons are added: SKB_DROP_REASON_ICMP_CSUM SKB_DROP_REASON_INVALID_PROTO The reason 'INVALID_PROTO' is introduced for the case that the packet doesn't follow rfc 1122 and is dropped. This is not a common case, and I believe we can locate the problem from the data in the packet. For now, this 'INVALID_PROTO' is used for the icmp broadcasts with wrong types. Maybe there should be a document file for these reasons. For example, list all the case that causes the 'UNHANDLED_PROTO' and 'INVALID_PROTO' drop reason. Therefore, users can locate their problems according to the document. Reviewed-by: Hao Peng <flyingpeng@tencent.com> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-11 10:38:37 +01:00
Menglong Dong	9f8ed577c2	net: skb: rename SKB_DROP_REASON_PTYPE_ABSENT As David Ahern suggested, the reasons for skb drops should be more general and not be code based. Therefore, rename SKB_DROP_REASON_PTYPE_ABSENT to SKB_DROP_REASON_UNHANDLED_PROTO, which is used for the cases of no L3 protocol handler, no L4 protocol handler, version extensions, etc. From previous discussion, now we have the aim to make these reasons more abstract and users based, avoiding code based. Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-11 10:38:37 +01:00
Linus Torvalds	33563138ac	Driver core changes for 5.18-rc2 Here are 2 small driver core changes for 5.18-rc2. They are the final bits in the removal of the default_attrs field in struct kobj_type. I had to wait until after 5.18-rc1 for all of the changes to do this came in through different development trees, and then one new user snuck in. So this series has 2 changes: - removal of the default_attrs field in the powerpc/pseries/vas code. Change has been acked by the PPC maintainers to come through this tree - removal of default_attrs from struct kobj_type now that all in-kernel users are removed. This cleans up the kobject code a little bit and removes some duplicated functionality that confused people (now there is only one way to do default groups.) All of these have been in linux-next for all of this week with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYlLRHg8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+yn+9gCfXN0OvKmw5QD55z8YGp/jIycK0ToAnifJ/OX+ sU2V8ZQfNbV8xw7iXfc2 =L+Uc -----END PGP SIGNATURE----- Merge tag 'driver-core-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here are two small driver core changes for 5.18-rc2. They are the final bits in the removal of the default_attrs field in struct kobj_type. I had to wait until after 5.18-rc1 for all of the changes to do this came in through different development trees, and then one new user snuck in. So this series has two changes: - removal of the default_attrs field in the powerpc/pseries/vas code. The change has been acked by the PPC maintainers to come through this tree - removal of default_attrs from struct kobj_type now that all in-kernel users are removed. This cleans up the kobject code a little bit and removes some duplicated functionality that confused people (now there is only one way to do default groups) Both of these have been in linux-next for all of this week with no reported problems" * tag 'driver-core-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: kobject: kobj_type: remove default_attrs powerpc/pseries/vas: use default_groups in kobj_type	2022-04-10 09:55:09 -10:00
Linus Torvalds	50c94de67c	- Allow the compiler to optimize away unused percpu accesses and change the local_lock_* macros back to inline functions - A couple of fixes to static call insn patching -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmJStZ4ACgkQEsHwGGHe VUpUpA/8DHOMUQa7rM8z49ZWBV01HNVCLECTeeKshQBLyJfWc84MNOfdPbpgEGvY XE/eIZDnTMB5UKD0bfRqD+AQ0fXjl3NiLnJrdDZJqEQAiP/wGBswKNXMire8xPT8 9MfaOKYWYPl0LY2uZBWVLcdC+lVe4kRGfhqAcl4LRx0ZSvMzgjcFy34NeXY8LlXD kFQJEzHa97CTROje54mtmXEt7Y5bxjxWwVTSyfEt0hJPGo1bJtJP6FaY01Muj+Xu h/OGNx3KLOYf9MqQC31caAwKgtUOptm8bTpvG3onaHg29qJgz2umKwONyOjYrUUn 2PE3NREfMuKI38nf88pX+lOCs6/I1uVIjJPvAVJijIcuI1ZBXrfm26IP0lZ3LqG1 h/9Y5gChiZPn1j90VnF4UCJUm4u3bYEAHqKIQgUdpcpUqX0NlxbDiXoYxJWfHnmB PBJ0PE7Vdo4MPK0n3BGVrzXAFeOyHsohAsKFijT8afRCMAOF/ebmVs/tI5NygFrK 11e/U13/78iKkazZSxWew8vU3yXA39W5Rym7aPnhR2lWxvN+xQOjNTgZTxF9hUcZ 6AcsaYJgHR7nD8SM7Y9+cwHWOWaDEdZMg9XSkgvyd1p0tHb4u+Ve/SQK7sA3j9q7 ZmZyFSE1X3K+M1i+75rUSVmIEVM5cpfhodN89iRje/JIZ1KyRT8= =hSOc -----END PGP SIGNATURE----- Merge tag 'locking_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Borislav Petkov: - Allow the compiler to optimize away unused percpu accesses and change the local_lock_* macros back to inline functions - A couple of fixes to static call insn patching * tag 'locking_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "mm/page_alloc: mark pagesets as __maybe_unused" Revert "locking/local_lock: Make the empty local_lock_*() function a macro." x86/percpu: Remove volatile from arch_raw_cpu_ptr(). static_call: Remove __DEFINE_STATIC_CALL macro static_call: Properly initialise DEFINE_STATIC_CALL_RET0() static_call: Don't make __static_call_return0 static x86,static_call: Fix __static_call_return0 for i386	2022-04-10 06:56:46 -10:00
Linus Torvalds	fa3b895da8	gpio fixes for v5.18-rc2 - fix a race condition with consumers accessing the fields of GPIO IRQ chips before they're fully initialized -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmJR8OMACgkQEacuoBRx 13LPfA//Qm/qAwlREBIVAhn/vdjAdLKM+JtMVRarI7V8RNQjEkuwbYFMissSMp44 HxChhzaPMfmJh1kd0oa4t9GL34d83oI6Pa/vgqlcIYg5DjaeYD11wjiYKgE1Hsbs 3t/s77pX3Swl0WNT/P7wl1nVjjbZsNZxS6tqCesvWCII5kQsTJSs4cuRDCYnxjgA 3Xe1Dzs71c4ypSbXPJJ8LGxOi3Y2/fOG3M5Jc8MUO0CAY+B4byZopH5yaSurRWcO 9rGQa7hfbgxfVkqpRgiFk9Vny/laoZQ7Hf1sTotXYjsOs5wa/mi8Zd6mu1X9/gYl Wr2g3VnpuFkfJSu3igxc+o2iwLD2fyxD/+4sIkVPFhvgX3Z0tmlK8yTRQcULAUre zk9eoAsDkJNNXh6wMUJ9no4S0mdSg77TAuJvBZTC727U8I4+xGem1PSjWc6WUW1n IoyRCBGgME5qllsCknFGvYBBLMtbv/UsCNc+0l/9lX20+At2pDH82eSX7keKK49z MmSEIvFtSHNpja0RXeA6byr0V5i4+eyNDnFenApXxx9h4EkC+s/dDjZU/hbTF0TJ NpcUJIU4BmXwl6WXVDLEddvQ3pvDH3mAQY8L3uPn5LLgZLRRlsfJisH1r1FThRFU A/bPbqsqEWCTgLo6lEZCN/WfOXoD1hbBLwWM/axpQVuD0WXt0RM= =faCF -----END PGP SIGNATURE----- Merge tag 'gpio-fixes-for-v5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fix from Bartosz Golaszewski: - fix a race condition with consumers accessing the fields of GPIO IRQ chips before they're fully initialized * tag 'gpio-fixes-for-v5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: Restrict usage of GPIO chip irq members before initialization	2022-04-09 18:17:43 -10:00
Leon Romanovsky	2984287c4c	net/mlx5: Remove not-implemented IPsec capabilities Clean a capabilities enum to remove not-implemented bits. Link: https://lore.kernel.org/r/1044bb7b779107ff38e48e3f6553421104f3f819.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2022-04-09 08:25:07 +03:00
Leon Romanovsky	f2b41b32cd	net/mlx5: Remove ipsec_ops function table There is only one IPsec implementation and ipsec_ops is not needed at all in this situation. Together with removal of ipsec_ops, we can drop the entry checks as these functions are called for IPsec devices only. Link: https://lore.kernel.org/r/bc8dd1c8a77b65dbf5e2cf92c813ffaca2505c5f.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2022-04-09 08:25:07 +03:00
Leon Romanovsky	54deb0e775	net/mlx5: Remove not-needed IPsec config In current code, the CONFIG_MLX5_IPSEC and CONFIG_MLX5_EN_IPSEC are the same. So remove useless indirection. Link: https://lore.kernel.org/r/fd14492cbc01a0d51a5bfedde02bcd2154123fde.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2022-04-09 08:25:07 +03:00
Leon Romanovsky	2451da081a	net/mlx5: Unify device IPsec capabilities check Merge two different function to one in order to provide coherent picture if the device is IPsec capable or not. Link: https://lore.kernel.org/r/8f10ea06ad19c6f651e9fb33921009658f01e1d5.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2022-04-09 08:25:07 +03:00
Leon Romanovsky	de8bdb4769	RDMA/mlx5: Drop crypto flow steering API The mlx5 flow steering crypto API was intended to be used in FPGA devices, which is not supported for years already. The removal of mlx5 crypto FPGA code together with inability to configure encryption keys makes the low steering API completely unusable. So delete the code, so any ESP flow steering requests will fail with not supported error, as it is happening now anyway as no device support this type of API. Link: https://lore.kernel.org/r/634a5face7734381463d809bfb89850f6998deac.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2022-04-09 08:25:06 +03:00
Leon Romanovsky	2fa33b3518	net/mlx5_fpga: Drop INNOVA IPsec support Mellanox INNOVA IPsec cards are EOL in Nov, 2019 [1]. As such, the code is unmaintained, untested and not in-use by any upstream/distro oriented customers. In order to reduce code complexity, drop the kernel code. [1] https://network.nvidia.com/related-docs/eol/LCR-000535.pdf Link: https://lore.kernel.org/r/2afe88ec5020a491079eacf6fe3c89b64d65195c.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2022-04-09 08:23:47 +03:00
Linus Torvalds	911b2b9516	Merge branch 'akpm' (patches from Andrew) Merge fixes from Andrew Morton: "9 patches. Subsystems affected by this patch series: mm (migration, highmem, sparsemem, mremap, mempolicy, and memcg), lz4, mailmap, and MAINTAINERS" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: MAINTAINERS: add Tom as clang reviewer mm/list_lru.c: revert "mm/list_lru: optimize memcg_reparent_list_lru_node()" mailmap: update Vasily Averin's email address mm/mempolicy: fix mpol_new leak in shared_policy_replace mmmremap.c: avoid pointless invalidate_range_start/end on mremap(old_size=0) mm/sparsemem: fix 'mem_section' will never be NULL gcc 12 warning lz4: fix LZ4_decompress_safe_partial read out of bound highmem: fix checks in __kmap_local_sched_{in,out} mm: migrate: use thp_order instead of HPAGE_PMD_ORDER for new page allocation.	2022-04-08 14:31:41 -10:00
Waiman Long	a431dbbc54	mm/sparsemem: fix 'mem_section' will never be NULL gcc 12 warning The gcc 12 compiler reports a "'mem_section' will never be NULL" warning on the following code: static inline struct mem_section __nr_to_section(unsigned long nr) { #ifdef CONFIG_SPARSEMEM_EXTREME if (!mem_section) return NULL; #endif if (!mem_section[SECTION_NR_TO_ROOT(nr)]) return NULL; : It happens with CONFIG_SPARSEMEM_EXTREME off. The mem_section definition is #ifdef CONFIG_SPARSEMEM_EXTREME extern struct mem_section *mem_section; #else extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT]; #endif In the !CONFIG_SPARSEMEM_EXTREME case, mem_section is a static 2-dimensional array and so the check "!mem_section[SECTION_NR_TO_ROOT(nr)]" doesn't make sense. Fix this warning by moving the "!mem_section[SECTION_NR_TO_ROOT(nr)]" check up inside the CONFIG_SPARSEMEM_EXTREME block and adding an explicit NR_SECTION_ROOTS check to make sure that there is no out-of-bound array access. Link: https://lkml.kernel.org/r/20220331180246.2746210-1-longman@redhat.com Fixes: `3e347261a8` ("sparsemem extreme implementation") Signed-off-by: Waiman Long <longman@redhat.com> Reported-by: Justin Forbes <jforbes@redhat.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Rafael Aquini <aquini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-08 14:20:36 -10:00

1 2 3 4 5 ...

83395 commits