linux-xiaomi-chiron

Author	SHA1	Message	Date
Jakub Kicinski	812df69beb	net: liquidio: reject unsupported coalescing params Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-10 16:28:53 -07:00
Jose Abreu	f213bbe8a9	net: stmmac: Integrate it with DesignWare XPCS Adds all the necessary logic so that stmmac can be used with Synopsys DesignWare XPCS. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-09 20:13:16 -07:00
Jose Abreu	fcb26bd2b6	net: phy: Add Synopsys DesignWare XPCS MDIO module Synopsys DesignWare XPCS is an MMD that can manage link status, auto-negotiation, link training, ... In this commit we add basic support for XPCS using USXGMII interface and Clause 73 Auto-negotiation. This is highly tied with PHYLINK and can't be used without it. A given ethernet driver can use the provided callbacks to add the support for XPCS. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-09 20:13:16 -07:00
Saeed Mahameed	a70ed9d8ec	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux This series adds some HW bits and definitions for mlx5 driver, to be used by downstream features in both rdma and netdev branches. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: HW bit for goto chain offload support net/mlx5: Expose link speed directly net/mlx5: Introduce TLS and IPSec objects enums net/mlx5: Introduce egress acl forward-to-vport capability net/mlx5: Expose raw packet pacing APIs net/mlx5e: Replace zero-length array with flexible-array member net/mlx5: fix spelling mistake "reserverd" -> "reserved" Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-09 16:58:26 -07:00
Alex Elder	d7f5f3c89c	remoteproc: add IPA notification to q6v5 driver Set up a subdev in the q6v5 modem remoteproc driver that generates event notifications for the IPA driver to use for initialization and recovery following a modem shutdown or crash. A pair of new functions provides a way for the IPA driver to register and deregister a notification callback function that will be called whenever modem events (about to boot, running, about to shut down, etc.) occur. A void pointer value (provided by the IPA driver at registration time) and an event type are supplied to the callback function. One event, MODEM_REMOVING, is signaled whenever the q6v5 driver is about to remove the notification subdevice. It requires the IPA driver de-register its callback. This sub-device is only used by the modem subsystem (MSS) driver, so the code that adds the new subdev and allows registration and deregistration of the notifier is found in "qcom_q6v6_mss.c". Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-08 22:07:09 -07:00
Eli Cohen	e0ebd8eb36	net/mlx5: HW bit for goto chain offload support Add the HW bit definition indecating goto chain offload support. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-07 13:19:25 -08:00
Mark Bloch	dc392fc56f	net/mlx5: Expose link speed directly Expose port rate as part of the port speed register fields. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-07 13:19:25 -08:00
Saeed Mahameed	bd673da6d9	net/mlx5: Introduce TLS and IPSec objects enums Expose the TLS encryption key general object type enum correctly, and add the IPSec encryption key general object type enum. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-07 13:19:25 -08:00
Vu Pham	86f5d0f3d4	net/mlx5: Introduce egress acl forward-to-vport capability Add HCA_CAP.egress_acl_forward_to_vport field to check whether HW supports e-switch vport's egress acl to forward packets to other e-switch vport or not. By default E-Switch egress ACL forwards eswitch vports egress packets to their corresponding NIC/VF vports. With this cap enabled, the driver is allowed to alter this behavior and forward packets to arbitrary NIC/VF vports with the following limitations: a. Multiple processing paths are supported if all of the following conditions are met: - HCA_CAP.egress_acl_forward_to_vport is set ==1. - A destination of type Flow Table only appears once, as the last destination in the list. - Vport destination is supported if HCA_CAP.egress_acl_forward_to_vport==1. Vport must not be the Uplink. b. Flow_tag not supported. c. This table is only applicable after an FDB table is created. d. Push VLAN action is not supported. e. Pop VLAN action cannot be added concurrently to this table and FDB table. This feature will be used during port failover in bonding scenario where two VFs representors are bonded to handle failover egress traffic (VM's ingress/receive traffic). Signed-off-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-03-07 13:19:25 -08:00
Jacob Keller	70c0923b0e	PCI: Introduce pci_get_dsn Several device drivers read their Device Serial Number from the PCIe extended config space. Introduce a new helper function, pci_get_dsn(). This function reads the eight bytes of the DSN and returns them as a u64. If the capability does not exist for the device, the function returns 0. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Michael Chan <michael.chan@broadcom.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-05 17:36:24 -08:00
Petr Machata	aaca940807	net: sched: Make FIFO Qdisc offloadable Invoke ndo_setup_tc() as appropriate to signal init / replacement, destroying and dumping of pFIFO / bFIFO Qdisc. A lot of the FIFO logic is used for pFIFO_head_drop as well, but that's a semantically very different Qdisc that isn't really in the same boat as pFIFO / bFIFO. Split some of the functions to keep the Qdisc intact. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-05 14:03:31 -08:00
Jakub Kicinski	95cddcb5cc	ethtool: add infrastructure for centralized checking of coalescing parameters Linux supports 22 different interrupt coalescing parameters. No driver implements them all. Some drivers just ignore the ones they don't support, while others have to carry a long list of checks to reject unsupported settings. To simplify the drivers add the ability to specify inside ethtool_ops which parameters are supported and let the core reject attempts to set any other one. This commit makes the mechanism an opt-in, only drivers which set ethtool_opts->coalesce_types to a non-zero value will have the checks enforced. The same mask is used for global and per queue settings. v3: - move the (temporary) check if driver defines types earlier (Michal) - rename used_types -> nonzero_params, and coalesce_types -> supported_coalesce_params (Alex) - use EOPNOTSUPP instead of EINVAL (Andrew, Michal) Leaving the long series of ifs for now, it seems nice to be able to grep for the field and flag names. This will probably have to be revisited once netlink support lands. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-05 12:12:34 -08:00
Yishai Hadas	1326034b3c	net/mlx5: Expose raw packet pacing APIs Expose raw packet pacing APIs to be used by DEVX based applications. The existing code was refactored to have a single flow with the new raw APIs. The new raw APIs considered the input of 'pp_rate_limit_context', uid, 'dedicated', upon looking for an existing entry. This raw mode enables future device specification data in the raw context without changing the existing logic and code. The ability to ask for a dedicated entry gives control for application to allocate entries according to its needs. A dedicated entry may not be used by some other process and it also enables the process spreading its resources to some different entries for use different hardware resources as part of enforcing the rate. The counter per entry was changed to be u64 to prevent any option to overflow. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2020-03-05 14:18:09 +02:00
Heiner Kallweit	ec5d9e8784	PCI: Add pci_status_get_and_clear_errors Several drivers use the following code sequence: 1. Read PCI_STATUS 2. Mask out non-error bits 3. Action based on error bits set 4. Write back set error bits to clear them As this is a repeated pattern, add a helper to the PCI core. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-04 14:21:00 -08:00
Heiner Kallweit	d6e055e873	PCI: Add constant PCI_STATUS_ERROR_BITS This collection of PCI error bits is used in more than one driver, so move it to the PCI core. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-04 14:21:00 -08:00
Gustavo A. R. Silva	bb4cf02d4c	netdevice: Replace zero-length array with flexible-array member The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-02 11:16:27 -08:00
Cris Forno	70ae1e127b	ethtool: Factored out similar ethtool link settings for virtual devices to core Three virtual devices (ibmveth, virtio_net, and netvsc) all have similar code to set link settings and validate ethtool command. To eliminate duplication of code, it is factored out into core/ethtool.c. Signed-off-by: Cris Forno <cforno12@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-29 21:48:54 -08:00
David S. Miller	9f0ca0c1a5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-02-28 The following pull-request contains BPF updates for your net-next tree. We've added 41 non-merge commits during the last 7 day(s) which contain a total of 49 files changed, 1383 insertions(+), 499 deletions(-). The main changes are: 1) BPF and Real-Time nicely co-exist. 2) bpftool feature improvements. 3) retrieve bpf_sk_storage via INET_DIAG. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-29 15:53:35 -08:00
Paolo Abeni	e427cad6ee	net: datagram: drop 'destructor' argument from several helpers The only users for such argument are the UDP protocol and the UNIX socket family. We can safely reclaim the accounted memory directly from the UDP code and, after the previous patch, we can do scm stats accounting outside the datagram helpers. Overall this cleans up a bit some datagram-related helpers, and avoids an indirect call per packet in the UDP receive path. v1 -> v2: - call scm_stat_del() only when not peeking - Kirill - fix build issue with CONFIG_INET_ESPINTCP Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-28 12:12:53 -08:00
Gustavo A. R. Silva	8402a31dd8	net: dccp: Replace zero-length array with flexible-array member The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-28 12:08:37 -08:00
David S. Miller	549da33801	mlx5-updates-2020-02-27 mlx5 misc updates and minor cleanups: 1) Use per vport tables for mirroring 2) Improve log messages for SW steering (DR) 3) Add devlink fdb_large_groups parameter 4) E-Switch, Allow goto earlier chain 5) Don't allow forwarding between uplink representors 6) Add support for devlink-port in non-representors mode 7) Minor misc cleanups -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl5YYYwACgkQSD+KveBX +j7vIQgAsFQSjJGdyUzVDfeHu94Ze79OMRstuEh9fbcNXeyawMTShPLPE94E91qo Nj6AfZp8sH9M6fLtFcOMrPou87ITslP3mQ/C2cYWWJn/iq9RN13kO+howZNeRhlk iettT8qHGYR7aU3tBU38pc9/bWVHRPt+FZGBeCAcvwgQ1zfjBEtPcMJ8svjsfETA 0bzEDEq0eKEaynHr3phJpzbcnvzm3154HkfZtDZFAApgr4tpEGSlFa4n+8ctcmqy zf82HH8GnB+GhTUPU3W33NwyH2otHOcE3dPAepQNmS3f8EemMJebAtjkexo6SfjY rwUs5qTU05myOy5RPmzwhQsnzRyPBw== =z2AA -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2020-02-27' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-02-27 mlx5 misc updates and minor cleanups: 1) Use per vport tables for mirroring 2) Improve log messages for SW steering (DR) 3) Add devlink fdb_large_groups parameter 4) E-Switch, Allow goto earlier chain 5) Don't allow forwarding between uplink representors 6) Add support for devlink-port in non-representors mode 7) Minor misc cleanups ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-28 11:59:53 -08:00
Martin KaFai Lau	085c20cacf	bpf: inet_diag: Dump bpf_sk_storages in inet_diag_dump() This patch will dump out the bpf_sk_storages of a sk if the request has the INET_DIAG_REQ_SK_BPF_STORAGES nlattr. An array of SK_DIAG_BPF_STORAGE_REQ_MAP_FD can be specified in INET_DIAG_REQ_SK_BPF_STORAGES to select which bpf_sk_storage to dump. If no map_fd is specified, all bpf_sk_storages of a sk will be dumped. bpf_sk_storages can be added to the system at runtime. It is difficult to find a proper static value for cb->min_dump_alloc. This patch learns the nlattr size required to dump the bpf_sk_storages of a sk. If it happens to be the very first nlmsg of a dump and it cannot fit the needed bpf_sk_storages, it will try to expand the skb by "pskb_expand_head()". Instead of expanding it in inet_sk_diag_fill(), it is expanded at a sleepable context in __inet_diag_dump() so __GFP_DIRECT_RECLAIM can be used. In __inet_diag_dump(), it will retry as long as the skb is empty and the cb->min_dump_alloc becomes larger than before. cb->min_dump_alloc is bounded by KMALLOC_MAX_SIZE. The min_dump_alloc is also changed from 'u16' to 'u32' to accommodate a sk that may have a few large bpf_sk_storages. The updated cb->min_dump_alloc will also be used to allocate the skb in the next dump. This logic already exists in netlink_dump(). Here is the sample output of a locally modified 'ss' and it could be made more readable by using BTF later: [root@arch-fb-vm1 ~]# ss --bpf-map-id 14 --bpf-map-id 13 -t6an 'dst [::1]:8989' State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess ESTAB 0 0 [::1]:51072 [::1]:8989 bpf_map_id:14 value:[ 3feb ] bpf_map_id:13 value:[ 3f ] ESTAB 0 0 [::1]:51070 [::1]:8989 bpf_map_id:14 value:[ 3feb ] bpf_map_id:13 value:[ 3f ] [root@arch-fb-vm1 ~]# ~/devshare/github/iproute2/misc/ss --bpf-maps -t6an 'dst [::1]:8989' State Recv-Q Send-Q Local Address:Port Peer Address:Port Process ESTAB 0 0 [::1]:51072 [::1]:8989 bpf_map_id:14 value:[ 3feb ] bpf_map_id:13 value:[ 3f ] bpf_map_id:12 value:[ 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000... total:65407 ] ESTAB 0 0 [::1]:51070 [::1]:8989 bpf_map_id:14 value:[ 3feb ] bpf_map_id:13 value:[ 3f ] bpf_map_id:12 value:[ 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000... total:65407 ] Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200225230427.1976129-1-kafai@fb.com	2020-02-27 18:50:19 -08:00
Martin KaFai Lau	1ed4d92458	bpf: INET_DIAG support in bpf_sk_storage This patch adds INET_DIAG support to bpf_sk_storage. 1. Although this series adds bpf_sk_storage diag capability to inet sk, bpf_sk_storage is in general applicable to all fullsock. Hence, the bpf_sk_storage logic will operate on SK_DIAG_* nlattr. The caller will pass in its specific nesting nlattr (e.g. INET_DIAG_) as the argument. 2. The request will be like: INET_DIAG_REQ_SK_BPF_STORAGES (nla_nest) (defined in latter patch) SK_DIAG_BPF_STORAGE_REQ_MAP_FD (nla_put_u32) SK_DIAG_BPF_STORAGE_REQ_MAP_FD (nla_put_u32) ...... Considering there could have multiple bpf_sk_storages in a sk, instead of reusing INET_DIAG_INFO ("ss -i"), the user can select some specific bpf_sk_storage to dump by specifying an array of SK_DIAG_BPF_STORAGE_REQ_MAP_FD. If no SK_DIAG_BPF_STORAGE_REQ_MAP_FD is specified (i.e. an empty INET_DIAG_REQ_SK_BPF_STORAGES), it will dump all bpf_sk_storages of a sk. 3. The reply will be like: INET_DIAG_BPF_SK_STORAGES (nla_nest) (defined in latter patch) SK_DIAG_BPF_STORAGE (nla_nest) SK_DIAG_BPF_STORAGE_MAP_ID (nla_put_u32) SK_DIAG_BPF_STORAGE_MAP_VALUE (nla_reserve_64bit) SK_DIAG_BPF_STORAGE (nla_nest) SK_DIAG_BPF_STORAGE_MAP_ID (nla_put_u32) SK_DIAG_BPF_STORAGE_MAP_VALUE (nla_reserve_64bit) ...... 4. Unlike other INET_DIAG info of a sk which is pretty static, the size required to dump the bpf_sk_storage(s) of a sk is dynamic as the system adding more bpf_sk_storage_map. It is hard to set a static min_dump_alloc size. Hence, this series learns it at the runtime and adjust the cb->min_dump_alloc as it iterates all sk(s) of a system. The "unsigned int res_diag_size" in bpf_sk_storage_diag_put() is for this purpose. The next patch will update the cb->min_dump_alloc as it iterates the sk(s). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200225230421.1975729-1-kafai@fb.com	2020-02-27 18:50:19 -08:00
Martin KaFai Lau	0df6d32842	inet_diag: Move the INET_DIAG_REQ_BYTECODE nlattr to cb->data The INET_DIAG_REQ_BYTECODE nlattr is currently re-found every time when the "dump()" is re-started. In a latter patch, it will also need to parse the new INET_DIAG_REQ_SK_BPF_STORAGES nlattr to learn the map_fds. Thus, this patch takes this chance to store the parsed nlattr in cb->data during the "start" time of a dump. By doing this, the "bc" argument also becomes unnecessary and is removed. Also, the two copies of the INET_DIAG_REQ_BYTECODE parsing-audit logic between compat/current version can be consolidated to one. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200225230415.1975555-1-kafai@fb.com	2020-02-27 18:50:19 -08:00
Martin KaFai Lau	5682d393b4	inet_diag: Refactor inet_sk_diag_fill(), dump(), and dump_one() In a latter patch, there is a need to update "cb->min_dump_alloc" in inet_sk_diag_fill() as it learns the diffierent bpf_sk_storages stored in a sk while dumping all sk(s) (e.g. tcp_hashinfo). The inet_sk_diag_fill() currently does not take the "cb" as an argument. One of the reason is inet_sk_diag_fill() is used by both dump_one() and dump() (which belong to the "struct inet_diag_handler". The dump_one() interface does not pass the "cb" along. This patch is to make dump_one() pass a "cb". The "cb" is created in inet_diag_cmd_exact(). The "nlh" and "in_skb" are stored in "cb" as the dump() interface does. The total number of args in inet_sk_diag_fill() is also cut from 10 to 7 and that helps many callers to pass fewer args. In particular, "struct user_namespace *user_ns", "u32 pid", and "u32 seq" can be replaced by accessing "cb->nlh" and "cb->skb". A similar argument reduction is also made to inet_twsk_diag_fill() and inet_req_diag_fill(). inet_csk_diag_dump() and inet_csk_diag_fill() are also removed. They are mostly equivalent to inet_sk_diag_fill(). Their repeated usages are very limited. Thus, inet_sk_diag_fill() is directly used in those occasions. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200225230409.1975173-1-kafai@fb.com	2020-02-27 18:50:19 -08:00
David S. Miller	9f6e055907	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net The mptcp conflict was overlapping additions. The SMC conflict was an additional and removal happening at the same time. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-27 18:31:39 -08:00
Eli Cohen	96e326878f	net/mlx5e: Eswitch, Use per vport tables for mirroring When using port mirroring, we forward the traffic to another table and use that table to forward to the mirrored vport. Since the hardware loses the values of reg c, and in particular reg c0, we fail the match on the input vport which previously existed in reg c0. To overcome this situation, we use a set of per vport tables, positioned at the lowest priority, and forward traffic to those tables. Since these tables are per vport, we can avoid matching on reg c0. Fixes: `c01cfd0f11` ("net/mlx5: E-Switch, Add match on vport metadata for rule in fast path") Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-02-27 16:40:05 -08:00
Linus Torvalds	7058b83789	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from David Miller: 1) Fix leak in nl80211 AP start where we leak the ACL memory, from Johannes Berg. 2) Fix double mutex unlock in mac80211, from Andrei Otcheretianski. 3) Fix RCU stall in ipset, from Jozsef Kadlecsik. 4) Fix devlink locking in devlink_dpipe_table_register, from Madhuparna Bhowmik. 5) Fix race causing TX hang in ll_temac, from Esben Haabendal. 6) Stale eth hdr pointer in br_dev_xmit(), from Nikolay Aleksandrov. 7) Fix TX hash calculation bounds checking wrt. tc rules, from Amritha Nambiar. 8) Size netlink responses properly in schedule action code to take into consideration TCA_ACT_FLAGS. From Jiri Pirko. 9) Fix firmware paths for mscc PHY driver, from Antoine Tenart. 10) Don't register stmmac notifier multiple times, from Aaro Koskinen. 11) Various rmnet bug fixes, from Taehee Yoo. 12) Fix vsock deadlock in vsock transport release, from Stefano Garzarella. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits) net: dsa: mv88e6xxx: Fix masking of egress port mlxsw: pci: Wait longer before accessing the device after reset sfc: fix timestamp reconstruction at 16-bit rollover points vsock: fix potential deadlock in transport->release() unix: It's CONFIG_PROC_FS not CONFIG_PROCFS net: rmnet: fix packet forwarding in rmnet bridge mode net: rmnet: fix bridge mode bugs net: rmnet: use upper/lower device infrastructure net: rmnet: do not allow to change mux id if mux id is duplicated net: rmnet: remove rcu_read_lock in rmnet_force_unassociate_device() net: rmnet: fix suspicious RCU usage net: rmnet: fix NULL pointer dereference in rmnet_changelink() net: rmnet: fix NULL pointer dereference in rmnet_newlink() net: phy: marvell: don't interpret PHY status unless resolved mlx5: register lag notifier for init network namespace only unix: define and set show_fdinfo only if procfs is enabled hinic: fix a bug of rss configuration hinic: fix a bug of setting hw_ioctxt hinic: fix a irq affinity bug net/smc: check for valid ib_client_data ...	2020-02-27 16:34:41 -08:00
Gustavo A. R. Silva	d7f10df862	bpf: Replace zero-length array with flexible-array member The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit `7649773293` ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200227001744.GA3317@embeddedor	2020-02-28 01:21:02 +01:00
Russell King	91a208f218	net: phylink: propagate resolved link config via mac_link_up() Propagate the resolved link parameters via the mac_link_up() call for MACs that do not automatically track their PCS state. We propagate the link parameters via function arguments so that inappropriate members of struct phylink_link_state can't be accessed, and creating a new structure just for this adds needless complexity to the API. Tested-by: Andre Przywara <andre.przywara@arm.com> Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-27 12:02:14 -08:00
Linus Torvalds	278de45e14	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID subsystem fixes from Jiri Kosina: - syzkaller-reported error handling fixes in various drivers, from various people - increase of HID report buffer size to 8K, which is apparently needed by certain modern devices - a few new device-ID-specific fixes / quirks - battery charging status reporting fix in logitech-hidpp, from Filipe Laíns * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: hid-bigbenff: fix race condition for scheduled work during removal HID: hid-bigbenff: call hid_hw_stop() in case of error HID: hid-bigbenff: fix general protection fault caused by double kfree HID: i2c-hid: add Trekstor Surfbook E11B to descriptor override HID: alps: Fix an error handling path in 'alps_input_configured()' HID: hiddev: Fix race in in hiddev_disconnect() HID: core: increase HID report buffer size to 8KiB HID: core: fix off-by-one memset in hid_report_raw_event() HID: apple: Add support for recent firmware on Magic Keyboards HID: ite: Only bind to keyboard USB interface on Acer SW5-012 keyboard dock HID: logitech-hidpp: BatteryVoltage: only read chargeStatus if extPower is active	2020-02-27 11:13:27 -08:00
Christian Brauner	b8f33e5d76	device: add device_change_owner() Add a helper to change the owner of a device's sysfs entries. This needs to happen when the ownership of a device is changed, e.g. when moving network devices between network namespaces. This function will be used to correctly account for ownership changes, e.g. when moving network devices between network namespaces. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-26 20:07:25 -08:00
Christian Brauner	2c4f9401ce	sysfs: add sysfs_change_owner() Add a helper to change the owner of sysfs objects. This function will be used to correctly account for kobject ownership changes, e.g. when moving network devices between network namespaces. This mirrors how a kobject is added through driver core which in its guts is done via kobject_add_internal() which in summary creates the main directory via create_dir(), populates that directory with the groups associated with the ktype of the kobject (if any) and populates the directory with the basic attributes associated with the ktype of the kobject (if any). These are the basic steps that are associated with adding a kobject in sysfs. Any additional properties are added by the specific subsystem itself (not by driver core) after it has registered the device. So for the example of network devices, a network device will e.g. register a queue subdirectory under the basic sysfs directory for the network device and than further subdirectories within that queues subdirectory. But that is all specific to network devices and they call the corresponding sysfs functions to do that directly when they create those queue objects. So anything that a subsystem adds outside of what driver core does must also be changed by it (That's already true for removal of files it created outside of driver core.) and it's the same for ownership changes. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-26 20:07:25 -08:00
Christian Brauner	303a42769c	sysfs: add sysfs_group{s}_change_owner() Add helpers to change the owner of sysfs groups. This function will be used to correctly account for kobject ownership changes, e.g. when moving network devices between network namespaces. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-26 20:07:25 -08:00
Christian Brauner	0666a3aee7	sysfs: add sysfs_link_change_owner() Add a helper to change the owner of a sysfs link. This function will be used to correctly account for kobject ownership changes, e.g. when moving network devices between network namespaces. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-26 20:07:25 -08:00
Christian Brauner	f70ce18568	sysfs: add sysfs_file_change_owner() Add helpers to change the owner of a sysfs files. This function will be used to correctly account for kobject ownership changes, e.g. when moving network devices between network namespaces. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-26 20:07:25 -08:00
David S. Miller	574b238f64	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes: 1) Perform garbage collection from workqueue to fix rcu detected stall in ipset hash set types, from Jozsef Kadlecsik. 2) Fix the forceadd evaluation path, also from Jozsef. 3) Fix nft_set_pipapo selftest, from Stefano Brivio. 4) Crash when add-flush-add element in pipapo set, also from Stefano. Add test to cover this crash. 5) Remove sysctl entry under mutex in hashlimit, from Cong Wang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-26 16:30:17 -08:00
Linus Torvalds	91ad64a84e	Tracing updates: Change in API of bootconfig (before it comes live in a release) - Have a magic value "BOOTCONFIG" in initrd to know a bootconfig exists - Set CONFIG_BOOT_CONFIG to 'n' by default - Show error if "bootconfig" on cmdline but not compiled in - Prevent redefining the same value - Have a way to append values - Added a SELECT BLK_DEV_INITRD to fix a build failure Synthetic event fixes: - Switch to raw_smp_processor_id() for recording CPU value in preempt section. (No care for what the value actually is) - Fix samples always recording u64 values - Fix endianess - Check number of values matches number of fields - Fix a printing bug Fix of trace_printk() breaking postponed start up tests Make a function static that is only used in a single file. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXlW4vxQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qtioAP0WLEm3dWO0z3321h/a0DSshC+Bslu3 HDPTsGVGrXmvggEA/lr1ikRHd8PsO7zW8BfaZMxoXaTqXiuSrzEWxnMlFw0= =O8PM -----END PGP SIGNATURE----- Merge tag 'trace-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing and bootconfig updates: "Fixes and changes to bootconfig before it goes live in a release. Change in API of bootconfig (before it comes live in a release): - Have a magic value "BOOTCONFIG" in initrd to know a bootconfig exists - Set CONFIG_BOOT_CONFIG to 'n' by default - Show error if "bootconfig" on cmdline but not compiled in - Prevent redefining the same value - Have a way to append values - Added a SELECT BLK_DEV_INITRD to fix a build failure Synthetic event fixes: - Switch to raw_smp_processor_id() for recording CPU value in preempt section. (No care for what the value actually is) - Fix samples always recording u64 values - Fix endianess - Check number of values matches number of fields - Fix a printing bug Fix of trace_printk() breaking postponed start up tests Make a function static that is only used in a single file" * tag 'trace-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: bootconfig: Fix CONFIG_BOOTTIME_TRACING dependency issue bootconfig: Add append value operator support bootconfig: Prohibit re-defining value on same key bootconfig: Print array as multiple commands for legacy command line bootconfig: Reject subkey and value on same parent key tools/bootconfig: Remove unneeded error message silencer bootconfig: Add bootconfig magic word for indicating bootconfig explicitly bootconfig: Set CONFIG_BOOT_CONFIG=n by default tracing: Clear trace_state when starting trace bootconfig: Mark boot_config_checksum() static tracing: Disable trace_printk() on post poned tests tracing: Have synthetic event test use raw_smp_processor_id() tracing: Fix number printing bug in print_synth_event() tracing: Check that number of vals matches number of synth event fields tracing: Make synth_event trace functions endian-correct tracing: Make sure synth_event_trace() example always uses u64	2020-02-26 10:34:42 -08:00
Pablo Neira Ayuso	9ea4894ba4	Merge branch 'master' of git://blackhole.kfki.hu/nf Jozsef Kadlecsik says: ==================== ipset patches for nf The first one is larger than usual, but the issue could not be solved simpler. Also, it's a resend of the patch I submitted a few days ago, with a one line fix on top of that: the size of the comment extensions was not taken into account at reporting the full size of the set. - Fix "INFO: rcu detected stall in hash_xxx" reports of syzbot by introducing region locking and using workqueue instead of timer based gc of timed out entries in hash types of sets in ipset. - Fix the forceadd evaluation path - the bug was also uncovered by the syzbot. ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-02-26 13:55:15 +01:00
Jason A. Donenfeld	a8e41f6033	icmp: allow icmpv6_ndo_send to work with CONFIG_IPV6=n The icmpv6_send function has long had a static inline implementation with an empty body for CONFIG_IPV6=n, so that code calling it doesn't need to be ifdef'd. The new icmpv6_ndo_send function, which is intended for drivers as a drop-in replacement with an identical function signature, should follow the same pattern. Without this patch, drivers that used to work with CONFIG_IPV6=n now result in a linker error. Cc: Chen Zhou <chenzhou10@huawei.com> Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: `0b41713b60` ("icmp: introduce helper for nat'd source address in network device context") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-25 11:01:39 -08:00
Thomas Gleixner	c518cfa0c5	bpf: Provide recursion prevention helpers The places which need to prevent the execution of trace type BPF programs to prevent deadlocks on the hash bucket lock do this open coded. Provide two inline functions, bpf_disable/enable_instrumentation() to replace these open coded protection constructs. Use migrate_disable/enable() instead of preempt_disable/enable() right away so this works on RT enabled kernels. On a !RT kernel migrate_disable / enable() are mapped to preempt_disable/enable(). These helpers use this_cpu_inc/dec() instead of __this_cpu_inc/dec() on an RT enabled kernel because migrate disabled regions are preemptible and preemption might hit in the middle of a RMW operation which can lead to inconsistent state. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145644.103910133@linutronix.de	2020-02-24 16:20:09 -08:00
David Miller	2a916f2f54	bpf: Use migrate_disable/enable in array macros and cgroup/lirc code. Replace the preemption disable/enable with migrate_disable/enable() to reflect the actual requirement and to allow PREEMPT_RT to substitute it with an actual migration disable mechanism which does not disable preemption. Including the code paths that go via __bpf_prog_run_save_cb(). Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145643.998293311@linutronix.de	2020-02-24 16:20:09 -08:00
David Miller	3d9f773cf2	bpf: Use bpf_prog_run_pin_on_cpu() at simple call sites. All of these cases are strictly of the form: preempt_disable(); BPF_PROG_RUN(...); preempt_enable(); Replace this with bpf_prog_run_pin_on_cpu() which wraps BPF_PROG_RUN() with: migrate_disable(); BPF_PROG_RUN(...); migrate_enable(); On non RT enabled kernels this maps to preempt_disable/enable() and on RT enabled kernels this solely prevents migration, which is sufficient as there is no requirement to prevent reentrancy to any BPF program from a preempting task. The only requirement is that the program stays on the same CPU. Therefore, this is a trivially correct transformation. The seccomp loop does not need protection over the loop. It only needs protection per BPF filter program [ tglx: Converted to bpf_prog_run_pin_on_cpu() ] Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145643.691493094@linutronix.de	2020-02-24 16:20:09 -08:00
Thomas Gleixner	37e1d92022	bpf: Replace cant_sleep() with cant_migrate() As already discussed in the previous change which introduced BPF_RUN_PROG_PIN_ON_CPU() BPF only requires to disable migration to guarantee per CPUness. If RT substitutes the preempt disable based migration protection then the cant_sleep() check will obviously trigger as preemption is not disabled. Replace it by cant_migrate() which maps to cant_sleep() on a non RT kernel and will verify that migration is disabled on a full RT kernel. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145643.583038889@linutronix.de	2020-02-24 16:20:09 -08:00
Thomas Gleixner	3c58482a38	bpf: Provide bpf_prog_run_pin_on_cpu() helper BPF programs require to run on one CPU to completion as they use per CPU storage, but according to Alexei they don't need reentrancy protection as obviously BPF programs running in thread context can always be 'preempted' by hard and soft interrupts and instrumentation and the same program can run concurrently on a different CPU. The currently used mechanism to ensure CPUness is to wrap the invocation into a preempt_disable/enable() pair. Disabling preemption is also disabling migration for a task. preempt_disable/enable() is used because there is no explicit way to reliably disable only migration. Provide a separate macro to invoke a BPF program which can be used in migrateable task context. It wraps BPF_PROG_RUN() in a migrate_disable/enable() pair which maps on non RT enabled kernels to preempt_disable/enable(). On RT enabled kernels this merely disables migration. Both methods ensure that the invoked BPF program runs on one CPU to completion. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200224145643.474592620@linutronix.de	2020-02-24 16:20:05 -08:00
Jeremy Linton	ce69e2162f	mdio_bus: Add generic mdio_find_bus() It appears most ethernet drivers follow one of two main strategies for mdio bus/phy management. A monolithic model where the net driver itself creates, probes and uses the phy, and one where an external mdio/phy driver instantiates the mdio bus/phy and the net driver only attaches to a known phy. Usually in this latter model the phys are discovered via DT relationships or simply phy name/address hardcoding. This is a shame because modern well behaved mdio buses are self describing and can be probed. The mdio layer itself is fully capable of this, yet there isn't a clean way for a standalone net driver to attach and enumerate the discovered devices. This is because outside of of_mdio_find_bus() there isn't a straightforward way to acquire the mii_bus pointer. So, lets add a mdio_find_bus which can return the mii_bus based only on its name. Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-02-24 15:31:23 -08:00
Linus Torvalds	63623fd449	Bugfixes, including the fix for CVE-2020-2732 and a few issues found by "make W=1". -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJeVBwcAAoJEL/70l94x66DB9AH/AxWhmtf6YVXMNyZjXydxa1f hYVm9wg9GCsZS+7cktMhq0/uDEu5IjaCv7d+bzIcYZdFAOcs5nBUUjn1LtVl9w1y 48vobyOa8pXpORerBtZtaO1kt4sfFR63zm7uau32DzXrz3qpHlMUjPdL08A1e35V cSSPAHHsl9S1TbDryc/VUNCOgauJes6LHbd3CdeAXU6lzMBW8JWbF2b/MAkvHG6n Hw5LpicWSeTxoPjR4Oi0Yx3VKvWfS9608netSJmuCNsv36wrhzKR1iuyb3kNCkAy AIlALn4PZq1Y5i1INi/XIkpC8d9yTqt5heRxYwp+yHadWO6E7ZMlITfxLZii+mM= =7EpO -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "Bugfixes, including the fix for CVE-2020-2732 and a few issues found by 'make W=1'" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: s390: rstify new ioctls in api.rst KVM: nVMX: Check IO instruction VM-exit conditions KVM: nVMX: Refactor IO bitmap checks into helper function KVM: nVMX: Don't emulate instructions in guest mode KVM: nVMX: Emulate MTF when performing instruction emulation KVM: fix error handling in svm_hardware_setup KVM: SVM: Fix potential memory leak in svm_cpu_init() KVM: apic: avoid calculating pending eoi from an uninitialized val KVM: nVMX: clear PIN_BASED_POSTED_INTR from nested pinbased_ctls only when apicv is globally disabled KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1 kvm: x86: svm: Fix NULL pointer dereference when AVIC not enabled KVM: VMX: Add VMX_FEATURE_USR_WAIT_PAUSE KVM: nVMX: Hold KVM's srcu lock when syncing vmcs12->shadow KVM: x86: don't notify userspace IOAPIC on edge-triggered interrupt EOI kvm/emulate: fix a -Werror=cast-function-type KVM: x86: fix incorrect comparison in trace event KVM: nVMX: Fix some obsolete comments and grammar error KVM: x86: fix missing prototypes KVM: x86: enable -Werror	2020-02-24 11:48:17 -08:00
Linus Torvalds	f3cc24942e	Two fixes for the irq core code which are follow ups to the recent MSI fixes: - The WARN_ON which was put into the MSI setaffinity callback for paranoia reasons actually triggered via a callchain which escaped when all the possible ways to reach that code were analyzed. The proc/irq/$N/affinity interfaces have a quirk which came in when ALPHA moved to the generic interface: In case that the written affinity mask does not contain any online CPU it calls into ALPHAs magic auto affinity setting code. A few years later this mechanism was also made available to x86 for no good reasons and in a way which circumvents all sanity checks for interrupts which cannot have their affinity set from process context on X86 due to the way the X86 interrupt delivery works. It would be possible to make this work properly, but there is no point in doing so. If the interrupt is not yet started then the affinity setting has no effect and if it is started already then it is already assigned to an online CPU so there is no point to randomly move it to some other CPU. Just return EINVAL as the code has done before that change forever. - The new MSI quirk bit in the irq domain flags turned out to be already occupied, which escaped the author and the reviewers because the already in use bits were 0,6,2,3,4,5 listed in that order. That bit 6 was simply overlooked because the ordering was straight forward linear otherwise. So the new bit ended up being a duplicate. Fix it up by switching the oddball 6 to the obvious 1. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl5Rh+sTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoQ8gD/0aO/tFJrRLmcSlS5r8r70dtHMHIhcq L6bUkg0GKsud6oLVFEhU43K7aWXZkTEqm1bIjX9x0FnMHIYsASmkHyloP/8OBxoA VOG6Q1THxklge/YLBLz7I2NbZNKs/w+WkRKl63FNV9LmdtxtQblYc67CkaQVXiwC tHuijaKOwT/t4V0+HNRVX15FApiDWOSguwcDNigUwk03Uo+hmJsPXuGtJfvyVAf6 Oa00sbvy/XV3qNr2Zm01Pb4osL4FOwCcGuyoeXMZqSvlRbxgT2qz0TzYObqKzZb5 pvZj7coaxKSimNZsVFQ6mluCp9IvFpZOnp1FXcFrzbbFPxEr0G/IMWgVmgf2s3w/ 8XjFCuF33J4mVz36/HYzR2ieH5FsVcGJQR96ZOHArONmszmbun7D7fC3j8RJlkKC 9QpYDcHK17xbLdhR8YanEIDl5QQT/C/qEOdLWrgON7uWEZLs5Y+u39x479MfknyV fYQqi7CC6qUTxQhrt29puaWy8WQvrG4GzWV+vS9d+8v0FIK2KcbA66p/c7UioV3r F9PhxfneSYjswvLhAPmJCM4PSYWvrYZcoGkT+/OzmPXcm+r+Tg4Stgac0EVmhvq5 rBpenELnSuK1MGiFXzL0DKlyNJhpj+UdeAuCoUzConfBMpMwQdvPXpmnT73CQtRb pOgBI4qB3YgP4Q== =jvVI -----END PGP SIGNATURE----- Merge tag 'irq-urgent-2020-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "Two fixes for the irq core code which are follow ups to the recent MSI fixes: - The WARN_ON which was put into the MSI setaffinity callback for paranoia reasons actually triggered via a callchain which escaped when all the possible ways to reach that code were analyzed. The proc/irq/$N/affinity interfaces have a quirk which came in when ALPHA moved to the generic interface: In case that the written affinity mask does not contain any online CPU it calls into ALPHAs magic auto affinity setting code. A few years later this mechanism was also made available to x86 for no good reasons and in a way which circumvents all sanity checks for interrupts which cannot have their affinity set from process context on X86 due to the way the X86 interrupt delivery works. It would be possible to make this work properly, but there is no point in doing so. If the interrupt is not yet started then the affinity setting has no effect and if it is started already then it is already assigned to an online CPU so there is no point to randomly move it to some other CPU. Just return EINVAL as the code has done before that change forever. - The new MSI quirk bit in the irq domain flags turned out to be already occupied, which escaped the author and the reviewers because the already in use bits were 0,6,2,3,4,5 listed in that order. That bit 6 was simply overlooked because the ordering was straight forward linear otherwise. So the new bit ended up being a duplicate. Fix it up by switching the oddball 6 to the obvious 1" * tag 'irq-urgent-2020-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/irqdomain: Make sure all irq domain flags are distinct genirq/proc: Reject invalid affinity masks (again)	2020-02-22 17:25:46 -08:00
Jozsef Kadlecsik	f66ee0410b	netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports In the case of huge hash:* types of sets, due to the single spinlock of a set the processing of the whole set under spinlock protection could take too long. There were four places where the whole hash table of the set was processed from bucket to bucket under holding the spinlock: - During resizing a set, the original set was locked to exclude kernel side add/del element operations (userspace add/del is excluded by the nfnetlink mutex). The original set is actually just read during the resize, so the spinlocking is replaced with rcu locking of regions. However, thus there can be parallel kernel side add/del of entries. In order not to loose those operations a backlog is added and replayed after the successful resize. - Garbage collection of timed out entries was also protected by the spinlock. In order not to lock too long, region locking is introduced and a single region is processed in one gc go. Also, the simple timer based gc running is replaced with a workqueue based solution. The internal book-keeping (number of elements, size of extensions) is moved to region level due to the region locking. - Adding elements: when the max number of the elements is reached, the gc was called to evict the timed out entries. The new approach is that the gc is called just for the matching region, assuming that if the region (proportionally) seems to be full, then the whole set does. We could scan the other regions to check every entry under rcu locking, but for huge sets it'd mean a slowdown at adding elements. - Listing the set header data: when the set was defined with timeout support, the garbage collector was called to clean up timed out entries to get the correct element numbers and set size values. Now the set is scanned to check non-timed out entries, without actually calling the gc for the whole set. Thanks to Florian Westphal for helping me to solve the SOFTIRQ-safe -> SOFTIRQ-unsafe lock order issues during working on the patch. Reported-by: syzbot+4b0e9d4ff3cf117837e5@syzkaller.appspotmail.com Reported-by: syzbot+c27b8d5010f45c666ed1@syzkaller.appspotmail.com Reported-by: syzbot+68a806795ac89df3aa1c@syzkaller.appspotmail.com Fixes: `23c42a403a` ("netfilter: ipset: Introduction of new commands and protocol version 7") Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>	2020-02-22 12:00:06 +01:00
Alexei Starovoitov	8eece07c01	Two migrate disable related stubs for BPF to base the RT patches on -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl5O6asTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoRRwEACaqcIHhYRVZJZkVqtjYoYPnP74XCaS ZTbJiFMJDxcM0lt5YfjjNA7qFA8AVfOOTBdPUrjqw05LuykibpNXEBpk+n2lDEgc x9tZHcbylFlmhJxsnAI/S++l++I7ieqYNbNXUhUkkiN86OMxfn1rSoCo639ManxB WiwR+R7Q/aicUN95kLGPzvAt41K/DQ30pNpjAE5Y2z+Hl26JPti2jcMHhfFWDD23 91mRd0ryuMSQm+ZkyiZdobfd6OzOGtYLnnxRjNJVSFz7q0s+hLUarWI+wnOxh4fD Jb+eahKPXSvA787/TbOvM8iNkIPX70HeBpmQIpIjbI+JEjo05amaEiYZ3cwZgs5P rDpFadyfi4peftbGs7G+/8RnlDJ4Towbw0X8Sl6t1RaKAfz04zsAlmW4zyNucRUD 1pN0mxfA/OCssGahXceP0VKpqqm6hDZipFfwFheXk6zX+5YWyNGX2iP3Fh4u2vbx vTwxQPmsusQuehQfmR23HzNG9I67knumsJI0PiFNhnroiNklevY+cPNMRN+tj3fB osOSgcKVmbsKUc9jcn7yfIxSyZ1exnT2Wsri4lMHo7+kZGEguX2qnLRd1tOtVrgO hjawoDrhXaA9AjU80e3p9eXmRYke9oZmQdeA8X0vl9YVTsLdxlishYCNBTysHvUl ULgHkTY/WoxZkA== =4Vci -----END PGP SIGNATURE----- Merge tag 'sched-for-bpf-2020-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into bpf-next Two migrate disable related stubs for BPF to base the RT patches on	2020-02-21 18:27:47 -08:00

1 2 3 4 5 ...

70575 commits