linux-xiaomi-chiron

Author	SHA1	Message	Date
Arnd Bergmann	25ec92fbdd	hamradio: use ndo_siocdevprivate hamradio uses a set of private ioctls that do seem to work correctly in compat mode, as they only rely on the ifr_data pointer. Move them over to the ndo_siocdevprivate callback as a cleanup. Cc: Thomas Sailer <t.sailer@alumni.ethz.ch> Cc: Joerg Reuter <jreuter@yaina.de> Cc: Jean-Paul Roubelat <jpr@f6fbb.org> Cc: linux-hams@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 20:11:44 +01:00
Arnd Bergmann	b9067f5dc4	net: split out SIOCDEVPRIVATE handling from dev_ioctl SIOCDEVPRIVATE ioctl commands are mainly used in really old drivers, and they have a number of problems: - They hide behind the normal .ndo_do_ioctl function that is also used for other things in modern drivers, so it's hard to spot a driver that actually uses one of these - Since drivers use a number different calling conventions, it is impossible to support compat mode for them in a generic way. - With all drivers using the same 16 commands codes, there is no way to introspect the data being passed through things like strace. Add a new net_device_ops callback pointer, to address the first two of these. Separating them from .ndo_do_ioctl makes it easy to grep for drivers with a .ndo_siocdevprivate callback, and the unwieldy name hopefully makes it easier to spot in code review. By passing the ifreq structure and the ifr_data pointer separately, it is no longer necessary to overload these, and the driver can use either one for a given command. Cc: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 20:11:43 +01:00
Tonghao Zhang	409f386b8e	qdisc: add new field for qdisc_enqueue tracepoint qdisc_enqueue tracepoint can work with qdisc:qdisc_dequeue to measure packets latency in qdisc queues. Add a new field txq for it, then we can retrieve more info. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 14:16:38 +01:00
Mark Gray	784dcfa56e	openvswitch: fix alignment issues Signed-off-by: Mark Gray <mark.d.gray@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 11:48:42 +01:00
Mark Gray	e4252cb666	openvswitch: update kdoc OVS_DP_ATTR_PER_CPU_PIDS Signed-off-by: Mark Gray <mark.d.gray@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 11:48:42 +01:00
Yajun Deng	f9b282b36d	net: netlink: add the case when nlh is NULL Add the case when nlh is NULL in nlmsg_report(), so that the caller doesn't need to deal with this case. Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 11:43:50 +01:00
Vladimir Oltean	edac6f6332	Revert "net: dsa: Allow drivers to filter packets they can decode source port from" This reverts commit `cc1939e4b3`. Currently 2 classes of DSA drivers are able to send/receive packets directly through the DSA master: - drivers with DSA_TAG_PROTO_NONE - sja1105 Now that sja1105 has gained the ability to perform traffic termination even under the tricky case (VLAN-aware bridge), and that is much more functional (we can perform VLAN-aware bridging with foreign interfaces), there is no reason to keep this code in the receive path of the network core. So delete it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:35:22 +01:00
Vladimir Oltean	b6ad86e6ad	net: dsa: sja1105: add bridge TX data plane offload based on tag_8021q The main desire for having this feature in sja1105 is to support network stack termination for traffic coming from a VLAN-aware bridge. For sja1105, offloading the bridge data plane means sending packets as-is, with the proper VLAN tag, to the chip. The chip will look up its FDB and forward them to the correct destination port. But we support bridge data plane offload even for VLAN-unaware bridges, and the implementation there is different. In fact, VLAN-unaware bridging is governed by tag_8021q, so it makes sense to have the .bridge_fwd_offload_add() implementation fully within tag_8021q. The key difference is that we only support 1 VLAN-aware bridge, but we support multiple VLAN-unaware bridges. So we need to make sure that the forwarding domain is not crossed by packets injected from the stack. For this, we introduce the concept of a tag_8021q TX VLAN for bridge forwarding offload. As opposed to the regular TX VLANs which contain only 2 ports (the user port and the CPU port), a bridge data plane TX VLAN is "multicast" (or "imprecise"): it contains all the ports that are part of a certain bridge, and the hardware will select where the packet goes within this "imprecise" forwarding domain. Each VLAN-unaware bridge has its own "imprecise" TX VLAN, so we make use of the unique "bridge_num" provided by DSA for the data plane offload. We use the same 3 bits from the tag_8021q VLAN ID format to encode this bridge number. Note that these 3 bit positions have been used before for sub-VLANs in best-effort VLAN filtering mode. The difference is that for best-effort, the sub-VLANs were only valid on RX (and it was documented that the sub-VLAN field needed to be transmitted as zero). Whereas for the bridge data plane offload, these 3 bits are only valid on TX. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:35:22 +01:00
Vladimir Oltean	ee80dd2e89	net: bridge: add a helper for retrieving port VLANs from the data path Introduce a brother of br_vlan_get_info() which is protected by the RCU mechanism, as opposed to br_vlan_get_info() which relies on taking the write-side rtnl_mutex. This is needed for drivers which need to find out whether a bridge port has a VLAN configured or not. For example, certain DSA switches might not offer complete source port identification to the CPU on RX, just the VLAN in which the packet was received. Based on this VLAN, we cannot set an accurate skb->dev ingress port, but at least we can configure one that behaves the same as the correct one would (this is possible because DSA sets skb->offload_fwd_mark = 1). When we look at the bridge RX handler (br_handle_frame), we see that what matters regarding skb->dev is the VLAN ID and the port STP state. So we need to select an skb->dev that has the same bridge VLAN as the packet we're receiving, and is in the LEARNING or FORWARDING STP state. The latter is easy, but for the former, we should somehow keep a shadow list of the bridge VLANs on each port, and a lookup table between VLAN ID and the 'designated port for imprecise RX'. That is rather complicated to keep in sync properly (the designated port per VLAN needs to be updated on the addition and removal of a VLAN, as well as on the join/leave events of the bridge on that port). So, to avoid all that complexity, let's just iterate through our finite number of ports and ask the bridge, for each packet: "do you have this VLAN configured on this port?". Cc: Roopa Prabhu <roopa@nvidia.com> Cc: Nikolay Aleksandrov <nikolay@nvidia.com> Cc: Ido Schimmel <idosch@nvidia.com> Cc: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:35:22 +01:00
David S. Miller	9bff668419	mlx5-updates-2021-07-24 This series aims to reduce coupling in mlx5e, particularly between RX resources (TIRs, RQTs) and numerous code units that use them. This refactoring is required for upcoming features, ADQ and TX lag hashing. The issue with the current code is that TIRs and RQTs are unmanaged, different places all over the driver create, destroy, track and configure them, often in an uncoordinated way. The responsibilities of different units become vague, leading to a lot of hidden dependencies between numerous units and tight coupling between them, which is prone to bugs and hard to maintain. The result of this refactoring is: 1. Creating a manager for RX resources, that controls their lifecycle and provides a clear API, which restricts the set of actions that other units can do. 2. Using object-oriented approach for TIRs, RQTs and RX resource manager (struct mlx5e_rx_res). 3. Fixing a few bugs and misbehaviors found during the refactoring. 4. Reducing the amount of dependencies, removing hidden dependencies, making them one-directional and organizing the code in clear abstraction layers. 5. Explicitly exposing the remaining weird dependencies. 6. Simplifying and organizing code that creates and modifies TIRs and RQTs. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmD+5+EACgkQSD+KveBX +j6l/Qf9FHU8xRCBW0Bn4Ja0CnS/C3MJMIBTRbe+Qq2noj6qr8y6ut8Z7jV3hGaw dvcA6bVxpWPsqrY5CX4errfo2T4QiAsYhIeXrE2liCZsMUeYeZLusAYquks0oo57 ai2FnihMvL5Xf9U2r+aAeT4dxpQBJ7ANON+b+ZRCFjwRl8CZkd7i4pnifLYpptye d6de6QSZkvxc818homiALdu0VH8HOBZALzfMI9y2Oh9peW+irc2UhEoThukD5VCD Turm6WBQ4xteQq3uNKebbqg0cEevyQjHoUmBg799xD0tHzFbSp+kV3ivHKxsksUF sOBdEumQjDc6J22uTnA086dIStN7sA== =jxI/ -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2021-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-updates-2021-07-24 This series aims to reduce coupling in mlx5e, particularly between RX resources (TIRs, RQTs) and numerous code units that use them. This refactoring is required for upcoming features, ADQ and TX lag hashing. The issue with the current code is that TIRs and RQTs are unmanaged, different places all over the driver create, destroy, track and configure them, often in an uncoordinated way. The responsibilities of different units become vague, leading to a lot of hidden dependencies between numerous units and tight coupling between them, which is prone to bugs and hard to maintain. The result of this refactoring is: 1. Creating a manager for RX resources, that controls their lifecycle and provides a clear API, which restricts the set of actions that other units can do. 2. Using object-oriented approach for TIRs, RQTs and RX resource manager (struct mlx5e_rx_res). 3. Fixing a few bugs and misbehaviors found during the refactoring. 4. Reducing the amount of dependencies, removing hidden dependencies, making them one-directional and organizing the code in clear abstraction layers. 5. Explicitly exposing the remaining weird dependencies. 6. Simplifying and organizing code that creates and modifies TIRs and RQTs. Saeed Mahameed says: ==================== mlx5 updates 2021-07-24 This series provides some refactoring to mlx5e RX resource management, it is required for upcoming ADQ and TX lag hashing features. The first two patches in this series : net/mlx5e: Prohibit inner indir TIRs in IPoIB net/mlx5e: Block LRO if firmware asks for tunneled LRO Were supposed to go to net, but due to dependency and timing they were included here. I would appreciate it if you'd apply them to net and mark for -stable. For more information please see tag log below. Please pull and let me know if there is any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:32:24 +01:00
Maxim Mikityanskiy	26ab7b3845	net/mlx5e: Block LRO if firmware asks for tunneled LRO This commit does a cleanup in LRO configuration. LRO is a parameter of an RQ, but its state is changed by modifying a TIR related to the RQ. The current status: LRO for tunneled packets is not supported in the driver, inner TIRs may enable LRO on creation, but LRO status of inner TIRs isn't changed in mlx5e_modify_tirs_lro(). This is inconsistent, but as long as the firmware doesn't declare support for tunneled LRO, it works, because the same RQs are shared between the inner and outer TIRs. This commit does two fixes: 1. If the firmware has the tunneled LRO capability, LRO is blocked altogether, because it's not possible to block it for inner TIRs only, when the same RQs are shared between inner and outer TIRs, and the driver won't be able to handle tunneled LRO traffic. 2. mlx5e_modify_tirs_lro() is patched to modify LRO state for all TIRs, including inner ones, because all TIRs related to an RQ should agree on their LRO state. Fixes: `7b3722fa9e` ("net/mlx5e: Support RSS for GRE tunneled packets") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:37 -07:00
Angelo Dureghello	896e7f3e74	can: flexcan: add platform data header Add platform data header for flexcan. Link: https://lore.kernel.org/r/20210702094841.327679-1-angelo@kernel-space.org Signed-off-by: Angelo Dureghello <angelo@kernel-space.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-07-25 11:36:29 +02:00
Marc Kleine-Budde	8345a33073	can: bittiming: fix documentation for struct can_tdc This patch fixes a typo in the documentation for struct can_tdc::tdcv. The number "0" refers to automatic mode not the letter "O". Further two grammar errors in the documentation for struct can_tdc are fixed. First grammar error: add a missing third person 's'. Second grammar error: replace "such as" by "such that". The intent is to give a condition, not an example. Fixes: `289ea9e4ae` ("can: add new CAN FD bittiming parameters: Transmitter Delay Compensation (TDC)") Link: https://lore.kernel.org/r/20210616095922.2430415-1-mkl@pengutronix.de Link: https://lore.kernel.org/r/20210616124057.60723-1-mailhol.vincent@wanadoo.fr Co-developed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-07-25 11:36:25 +02:00
Marc Kleine-Budde	30bfec4fec	can: rx-offload: can_rx_offload_threaded_irq_finish(): add new function to be called from threaded interrupt After reading all CAN frames from the controller in the IRQ handler and storing them into a skb_queue, the driver calls napi_schedule(). In the napi poll function the skb from the skb_queue are then pushed into the networking stack. However if napi_schedule() is called from a threaded IRQ handler this triggers the following error: \| NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!! To avoid this, create a new rx-offload function (can_rx_offload_threaded_irq_finish()) with a call to local_bh_disable()/local_bh_enable() around the napi_schedule() call. Convert all drivers that call can_rx_offload_irq_finish() from threaded IRQ context to can_rx_offload_threaded_irq_finish(). Link: https://lore.kernel.org/r/20210724204745.736053-4-mkl@pengutronix.de Suggested-by: Daniel Glöckner <dg@emlix.com> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-07-25 11:36:25 +02:00
Marc Kleine-Budde	1e0d8e507e	can: rx-offload: can_rx_offload_irq_finish(): directly call napi_schedule() Instead of calling can_rx_offload_schedule() call napi_schedule() directly. As this was the last use of can_rx_offload_schedule() remove this helper function. Link: https://lore.kernel.org/r/20210724204745.736053-3-mkl@pengutronix.de Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-07-25 11:36:25 +02:00
Marc Kleine-Budde	c757096ea1	can: rx-offload: add skb queue for use during ISR Adding a skb to the skb_queue in rx-offload requires to take a lock. This commit avoids this by adding an unlocked skb queue that is appended at the end of the ISR. Having one lock at the end of the ISR should be OK as the HW is empty, not about to overflow. Link: https://lore.kernel.org/r/20210724204745.736053-2-mkl@pengutronix.de Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Co-developed-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Signed-off-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2021-07-25 11:36:25 +02:00
Krzysztof Kozlowski	7186aac9c2	nfc: constify nfc_digital_ops Neither the core nor the drivers modify the passed pointer to struct nfc_digital_ops, so make it a pointer to const for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-25 09:21:21 +01:00
Krzysztof Kozlowski	094c45c84d	nfc: constify nfc_hci_ops Neither the core nor the drivers modify the passed pointer to struct nfc_hci_ops, so make it a pointer to const for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-25 09:21:21 +01:00
Krzysztof Kozlowski	f6c802a726	nfc: constify nfc_ops Neither the core nor the drivers modify the passed pointer to struct nfc_ops, so make it a pointer to const for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-25 09:21:21 +01:00
Krzysztof Kozlowski	15944ad2e5	nfc: constify pointer to nfc_vendor_cmd Neither the core nor the drivers modify the passed pointer to struct nfc_vendor_cmd, so make it a pointer to const for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-25 09:21:21 +01:00
Krzysztof Kozlowski	cb8caa3c6c	nfc: constify nci_driver_ops (prop_ops and core_ops) Neither the core nor the drivers modify the passed pointer to struct nci_driver_ops (consisting of function pointers), so make it a pointer to const for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-25 09:21:21 +01:00
Krzysztof Kozlowski	b9c28286d8	nfc: constify nci_ops The struct nci_ops is modified by NFC core in only one case: nci_allocate_device() receives too many proprietary commands (prop_ops) to configure. This is a build time known constrain, so a graceful handling of such case is not necessary. Instead, fail the nci_allocate_device() and add BUILD_BUG_ON() to places which set these. This allows to constify the struct nci_ops (consisting of function pointers) for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-25 09:21:20 +01:00
Krzysztof Kozlowski	48d5440393	nfc: constify payload argument in nci_send_cmd() The nci_send_cmd() payload argument is passed directly to skb_put_data() which already accepts a pointer to const, so make it const as well for correctness and safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-25 09:21:20 +01:00
Vladimir Oltean	123abc06e7	net: dsa: add support for bridge TX forwarding offload For a DSA switch, to offload the forwarding process of a bridge device means to send the packets coming from the software bridge as data plane packets. This is contrary to everything that DSA has done so far, because the current taggers only know to send control packets (ones that target a specific destination port), whereas data plane packets are supposed to be forwarded according to the FDB lookup, much like packets ingressing on any regular ingress port. If the FDB lookup process returns multiple destination ports (flooding, multicast), then replication is also handled by the switch hardware - the bridge only sends a single packet and avoids the skb_clone(). DSA keeps for each bridge port a zero-based index (the number of the bridge). Multiple ports performing TX forwarding offload to the same bridge have the same dp->bridge_num value, and ports not offloading the TX data plane of a bridge have dp->bridge_num = -1. The tagger can check if the packet that is being transmitted on has skb->offload_fwd_mark = true or not. If it does, it can be sure that the packet belongs to the data plane of a bridge, further information about which can be obtained based on dp->bridge_dev and dp->bridge_num. It can then compose a DSA tag for injecting a data plane packet into that bridge number. For the switch driver side, we offer two new dsa_switch_ops methods, called .port_bridge_fwd_offload_{add,del}, which are modeled after .port_bridge_{join,leave}. These methods are provided in case the driver needs to configure the hardware to treat packets coming from that bridge software interface as data plane packets. The switchdev <-> bridge interaction happens during the netdev_master_upper_dev_link() call, so to switch drivers, the effect is that the .port_bridge_fwd_offload_add() method is called immediately after .port_bridge_join(). If the bridge number exceeds the number of bridges for which the switch driver can offload the TX data plane (and this includes the case where the driver can offload none), DSA falls back to simply returning tx_fwd_offload = false in the switchdev_bridge_port_offload() call. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-23 16:32:37 +01:00
Vladimir Oltean	5b22d3669f	net: dsa: track the number of switches in a tree In preparation of supporting data plane forwarding on behalf of a software bridge, some drivers might need to view bridges as virtual switches behind the CPU port in a cross-chip topology. Give them some help and let them know how many physical switches there are in the tree, so that they can count the virtual switches starting from that number on. Note that the first dsa_switch_ops method where this information is reliably available is .setup(). This is because of how DSA works: in a tree with 3 switches, each calling dsa_register_switch(), the first 2 will advance until dsa_tree_setup() -> dsa_tree_setup_routing_table() and exit with error code 0 because the topology is not complete. Since probing is parallel at this point, one switch does not know about the existence of the other. Then the third switch comes, and for it, dsa_tree_setup_routing_table() returns complete = true. This switch goes ahead and calls dsa_tree_setup_switches() for everybody else, calling their .setup() methods too. This acts as the synchronization point. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-23 16:32:37 +01:00
Tobias Waldekranz	472111920f	net: bridge: switchdev: allow the TX data plane forwarding to be offloaded Allow switchdevs to forward frames from the CPU in accordance with the bridge configuration in the same way as is done between bridge ports. This means that the bridge will only send a single skb towards one of the ports under the switchdev's control, and expects the driver to deliver the packet to all eligible ports in its domain. Primarily this improves the performance of multicast flows with multiple subscribers, as it allows the hardware to perform the frame replication. The basic flow between the driver and the bridge is as follows: - When joining a bridge port, the switchdev driver calls switchdev_bridge_port_offload() with tx_fwd_offload = true. - The bridge sends offloadable skbs to one of the ports under the switchdev's control using skb->offload_fwd_mark = true. - The switchdev driver checks the skb->offload_fwd_mark field and lets its FDB lookup select the destination port mask for this packet. v1->v2: - convert br_input_skb_cb::fwd_hwdoms to a plain unsigned long - introduce a static key "br_switchdev_fwd_offload_used" to minimize the impact of the newly introduced feature on all the setups which don't have hardware that can make use of it - introduce a check for nbp->flags & BR_FWD_OFFLOAD to optimize cache line access - reorder nbp_switchdev_frame_mark_accel() and br_handle_vlan() in __br_forward() - do not strip VLAN on egress if forwarding offload on VLAN-aware bridge is being used - propagate errors from .ndo_dfwd_add_station() if not EOPNOTSUPP v2->v3: - replace the solution based on .ndo_dfwd_add_station with a solution based on switchdev_bridge_port_offload - rename BR_FWD_OFFLOAD to BR_TX_FWD_OFFLOAD v3->v4: rebase v4->v5: - make sure the static key is decremented on bridge port unoffload - more function and variable renaming and comments for them: br_switchdev_fwd_offload_used to br_switchdev_tx_fwd_offload br_switchdev_accels_skb to br_switchdev_frame_uses_tx_fwd_offload nbp_switchdev_frame_mark_tx_fwd to nbp_switchdev_frame_mark_tx_fwd_to_hwdom nbp_switchdev_frame_mark_accel to nbp_switchdev_frame_mark_tx_fwd_offload fwd_accel to tx_fwd_offload Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-23 16:32:37 +01:00
David S. Miller	5af84df962	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Conflicts are simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-23 16:13:06 +01:00
Arnd Bergmann	29c4964822	net: socket: rework compat_ifreq_ioctl() compat_ifreq_ioctl() is one of the last users of copy_in_user() and compat_alloc_user_space(), as it attempts to convert the 'struct ifreq' arguments from 32-bit to 64-bit format as used by dev_ioctl() and a couple of socket family specific interpretations. The current implementation works correctly when calling dev_ioctl(), inet_ioctl(), ieee802154_sock_ioctl(), atalk_ioctl(), qrtr_ioctl() and packet_ioctl(). The ioctl handlers for x25, netrom, rose and x25 do not interpret the arguments and only block the corresponding commands, so they do not care. For af_inet6 and af_decnet however, the compat conversion is slightly incorrect, as it will copy more data than the native handler accesses, both of them use a structure that is shorter than ifreq. Replace the copy_in_user() conversion with a pair of accessor functions to read and write the ifreq data in place with the correct length where needed, while leaving the other ones to copy the (already compatible) structures directly. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-23 14:20:25 +01:00
Arnd Bergmann	876f0bf9d0	net: socket: simplify dev_ifconf handling The dev_ifconf() calling conventions make compat handling more complicated than necessary, simplify this by moving the in_compat_syscall() check into the function. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-23 14:20:25 +01:00
Arnd Bergmann	b0e99d0377	net: socket: remove register_gifconf Since dynamic registration of the gifconf() helper is only used for IPv4, and this can not be in a loadable module, this can be simplified noticeably by turning it into a direct function call as a preparation for cleaning up the compat handling. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-23 14:20:25 +01:00
Arnd Bergmann	dd98d2895d	ethtool: improve compat ioctl handling The ethtool compat ioctl handling is hidden away in net/socket.c, which introduces a couple of minor oddities: - The implementation may end up diverging, as seen in the RXNFC extension in commit `84a1d9c482` ("net: ethtool: extend RXNFC API to support RSS spreading of filter matches") that does not work in compat mode. - Most architectures do not need the compat handling at all because u64 and compat_u64 have the same alignment. - On x86, the conversion is done for both x32 and i386 user space, but it's actually wrong to do it for x32 and cannot work there. - On 32-bit Arm, it never worked for compat oabi user space, since that needs to do the same conversion but does not. - It would be nice to get rid of both compat_alloc_user_space() and copy_in_user() throughout the kernel. None of these actually seems to be a serious problem that real users are likely to encounter, but fixing all of them actually leads to code that is both shorter and more readable. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-23 14:20:25 +01:00
Arnd Bergmann	1a33b18b3b	compat: make linux/compat.h available everywhere Parts of linux/compat.h are under an #ifdef, but we end up using more of those over time, moving things around bit by bit. To get it over with once and for all, make all of this file uncondititonal now so it can be accessed everywhere. There are only a few types left that are in asm/compat.h but not yet in the asm-generic version, so add those in the process. This requires providing a few more types in asm-generic/compat.h that were not already there. The only tricky one is compat_sigset_t, which needs a little help on 32-bit architectures and for x86. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-23 14:20:24 +01:00
Linus Torvalds	9f42f674a8	arm64 fixes for -rc3 - Fix hang when issuing SMC on SVE-capable system due to clobbered LR - Fix boot failure due to missing block mappings with folded page-table -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmD5VM4QHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNLpcCACQWJ/MsBEQbyg7YfYioOOm4a2qIcci0EN1 Su4rkMsjVXQN4nWsP8tpu1AVNKNe3dX3O4Vl1KQy1W0/8LY+Sbkws35RHur/kdpr aY12nh9Jt3+L0Q5Vt8OkuN18K3W+CrVFQtUWEVsbvfX8KnE6ralqSlKWNhNhSHBZ 1ETIWotZ/1d95y8C9FO/HcvGgbWxk6KYCNYECeLgK23+vne1O/9eoMvdOdnAQUjy 2aHlEMKIn4fLs5PnUJRLhh+tFi517uWBJWV1SraxomVBwr4Ng8ywYdRLJsawCHXo OtpMDBphQb7F5dIKGBw+LqN46PNznv8bVjdQ4rbdFqKZ4xrmNo2d =hFuL -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "A pair of arm64 fixes for -rc3. The straightforward one is a fix to our firmware calling stub, which accidentally started corrupting the link register on machines with SVE. Since these machines don't really exist yet, it wasn't spotted in -next. The other fix is a revert-and-a-bit of a patch originally intended to allow PTE-level huge mappings for the VMAP area on 32-bit PPC 8xx. A side-effect of this change was that our pXd_set_huge() implementations could be replaced with generic dummy functions depending on the levels of page-table being used, which in turn broke the boot if we fail to create the linear mapping as a result of using these functions to operate on the pgd. Huge thanks to Michael Ellerman for modifying the revert so as not to regress PPC 8xx in terms of functionality. Anyway, that's the background and it's also available in the commit message along with Link tags pointing at all of the fun. Summary: - Fix hang when issuing SMC on SVE-capable system due to clobbered LR - Fix boot failure due to missing block mappings with folded page-table" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge" arm64: smccc: Save lr before calling __arm_smccc_sve_check()	2021-07-22 10:38:19 -07:00
Linus Torvalds	4784dc99c7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from David Miller: 1) Fix type of bind option flag in af_xdp, from Baruch Siach. 2) Fix use after free in bpf_xdp_link_release(), from Xuan Zhao. 3) PM refcnt imbakance in r8152, from Takashi Iwai. 4) Sign extension ug in liquidio, from Colin Ian King. 5) Mising range check in s390 bpf jit, from Colin Ian King. 6) Uninit value in caif_seqpkt_sendmsg(), from Ziyong Xuan. 7) Fix skb page recycling race, from Ilias Apalodimas. 8) Fix memory leak in tcindex_partial_destroy_work, from Pave Skripkin. 9) netrom timer sk refcnt issues, from Nguyen Dinh Phi. 10) Fix data races aroun tcp's tfo_active_disable_stamp, from Eric Dumazet. 11) act_skbmod should only operate on ethernet packets, from Peilin Ye. 12) Fix slab out-of-bpunds in fib6_nh_flush_exceptions(),, from Psolo Abeni. 13) Fix sparx5 dependencies, from Yajun Deng. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (74 commits) dpaa2-switch: seed the buffer pool after allocating the swp net: sched: cls_api: Fix the the wrong parameter net: sparx5: fix unmet dependencies warning net: dsa: tag_ksz: dont let the hardware process the layer 4 checksum net: dsa: ensure linearized SKBs in case of tail taggers ravb: Remove extra TAB ravb: Fix a typo in comment net: dsa: sja1105: make VID 4095 a bridge VLAN too tcp: disable TFO blackhole logic by default sctp: do not update transport pathmtu if SPP_PMTUD_ENABLE is not set net: ixp46x: fix ptp build failure ibmvnic: Remove the proper scrq flush selftests: net: add ESP-in-UDP PMTU test udp: check encap socket in __udp_lib_err sctp: update active_key for asoc when old key is being replaced r8169: Avoid duplicate sysfs entry creation error ixgbe: Fix packet corruption due to missing DMA sync Revert "qed: fix possible unpaired spin_{un}lock_bh in _qed_mcp_cmd_and_union()" ipv6: fix another slab-out-of-bounds in fib6_nh_flush_exceptions fsl/fman: Add fibre support ...	2021-07-22 10:11:27 -07:00
Vladimir Oltean	4e51bf44a0	net: bridge: move the switchdev object replay helpers to "push" mode Starting with commit `4f2673b3a2` ("net: bridge: add helper to replay port and host-joined mdb entries"), DSA has introduced some bridge helpers that replay switchdev events (FDB/MDB/VLAN additions and deletions) that can be lost by the switchdev drivers in a variety of circumstances: - an IP multicast group was host-joined on the bridge itself before any switchdev port joined the bridge, leading to the host MDB entries missing in the hardware database. - during the bridge creation process, the MAC address of the bridge was added to the FDB as an entry pointing towards the bridge device itself, but with no switchdev ports being part of the bridge yet, this local FDB entry would remain unknown to the switchdev hardware database. - a VLAN/FDB/MDB was added to a bridge port that is a LAG interface, before any switchdev port joined that LAG, leading to the hardware database missing those entries. - a switchdev port left a LAG that is a bridge port, while the LAG remained part of the bridge, and all FDB/MDB/VLAN entries remained installed in the hardware database of the switchdev port. Also, since commit `0d2cfbd41c` ("net: bridge: ignore switchdev events for LAG ports which didn't request replay"), DSA introduced a method, based on a const void ctx, to ensure that two switchdev ports under the same LAG that is a bridge port do not see the same MDB/VLAN entry being replayed twice by the bridge, once for every bridge port that joins the LAG. With so many ordering corner cases being possible, it seems unreasonable to expect a switchdev driver writer to get it right from the first try. Therefore, now that DSA has experimented with the bridge replay helpers for a little bit, we can move the code to the bridge driver where it is more readily available to all switchdev drivers. To convert the switchdev object replay helpers from "pull mode" (where the driver asks for them) to a "push mode" (where the bridge offers them automatically), the biggest problem is that the bridge needs to be aware when a switchdev port joins and leaves, even when the switchdev is only indirectly a bridge port (for example when the bridge port is a LAG upper of the switchdev). Luckily, we already have a hook for that, in the form of the newly introduced switchdev_bridge_port_offload() and switchdev_bridge_port_unoffload() calls. These offer a natural place for hooking the object addition and deletion replays. Extend the above 2 functions with: - pointers to the switchdev atomic notifier (for FDB replays) and the blocking notifier (for MDB and VLAN replays). - the "const void ctx" argument required for drivers to be able to disambiguate between which port is targeted, when multiple ports are lowers of the same LAG that is a bridge port. Most of the drivers pass NULL to this argument, except the ones that support LAG offload and have the proper context check already in place in the switchdev blocking notifier handler. Also unexport the replay helpers, since nobody except the bridge calls them directly now. Note that: (a) we abuse the terminology slightly, because FDB entries are not "switchdev objects", but we count them as objects nonetheless. With no direct way to prove it, I think they are not modeled as switchdev objects because those can only be installed by the bridge to the hardware (as opposed to FDB entries which can be propagated in the other direction too). This is merely an abuse of terms, FDB entries are replayed too, despite not being objects. (b) the bridge does not attempt to sync port attributes to newly joined ports, just the countable stuff (the objects). The reason for this is simple: no universal and symmetric way to sync and unsync them is known. For example, VLAN filtering: what to do on unsync, disable or leave it enabled? Similarly, STP state, ageing timer, etc etc. What a switchdev port does when it becomes standalone again is not really up to the bridge's competence, and the driver should deal with it. On the other hand, replaying deletions of switchdev objects can be seen a matter of cleanup and therefore be treated by the bridge, hence this patch. We make the replay helpers opt-in for drivers, because they might not bring immediate benefits for them: - nbp_vlan_init() is called _after_ netdev_master_upper_dev_link(), so br_vlan_replay() should not do anything for the new drivers on which we call it. The existing drivers where there was even a slight possibility for there to exist a VLAN on a bridge port before they join it are already guarded against this: mlxsw and prestera deny joining LAG interfaces that are members of a bridge. - br_fdb_replay() should now notify of local FDB entries, but I patched all drivers except DSA to ignore these new entries in commit `2c4eca3ef7` ("net: bridge: switchdev: include local flag in FDB notifications"). Driver authors can lift this restriction as they wish, and when they do, they can also opt into the FDB replay functionality. - br_mdb_replay() should fix a real issue which is described in commit `4f2673b3a2` ("net: bridge: add helper to replay port and host-joined mdb entries"). However most drivers do not offload the SWITCHDEV_OBJ_ID_HOST_MDB to see this issue: only cpsw and am65_cpsw offload this switchdev object, and I don't completely understand the way in which they offload this switchdev object anyway. So I'll leave it up to these drivers' respective maintainers to opt into br_mdb_replay(). So most of the drivers pass NULL notifier blocks for the replay helpers, except: - dpaa2-switch which was already acked/regression-tested with the helpers enabled (and there isn't much of a downside in having them) - ocelot which already had replay logic in "pull" mode - DSA which already had replay logic in "pull" mode An important observation is that the drivers which don't currently request bridge event replays don't even have the switchdev_bridge_port_{offload,unoffload} calls placed in proper places right now. This was done to avoid unnecessary rework for drivers which might never even add support for this. For driver writers who wish to add replay support, this can be used as a tentative placement guide: https://patchwork.kernel.org/project/netdevbpf/patch/20210720134655.892334-11-vladimir.oltean@nxp.com/ Cc: Vadym Kochan <vkochan@marvell.com> Cc: Taras Chornyi <tchornyi@marvell.com> Cc: Ioana Ciornei <ioana.ciornei@nxp.com> Cc: Lars Povlsen <lars.povlsen@microchip.com> Cc: Steen Hegelund <Steen.Hegelund@microchip.com> Cc: UNGLinuxDriver@microchip.com Cc: Claudiu Manoil <claudiu.manoil@nxp.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-22 00:26:23 -07:00
Vladimir Oltean	2f5dc00f7a	net: bridge: switchdev: let drivers inform which bridge ports are offloaded On reception of an skb, the bridge checks if it was marked as 'already forwarded in hardware' (checks if skb->offload_fwd_mark == 1), and if it is, it assigns the source hardware domain of that skb based on the hardware domain of the ingress port. Then during forwarding, it enforces that the egress port must have a different hardware domain than the ingress one (this is done in nbp_switchdev_allowed_egress). Non-switchdev drivers don't report any physical switch id (neither through devlink nor .ndo_get_port_parent_id), therefore the bridge assigns them a hardware domain of 0, and packets coming from them will always have skb->offload_fwd_mark = 0. So there aren't any restrictions. Problems appear due to the fact that DSA would like to perform software fallback for bonding and team interfaces that the physical switch cannot offload. +-- br0 ---+ / / \| \ / / \| \ / \| \| bond0 / \| \| / \ swp0 swp1 swp2 swp3 swp4 There, it is desirable that the presence of swp3 and swp4 under a non-offloaded LAG does not preclude us from doing hardware bridging beteen swp0, swp1 and swp2. The bandwidth of the CPU is often times high enough that software bridging between {swp0,swp1,swp2} and bond0 is not impractical. But this creates an impossible paradox given the current way in which port hardware domains are assigned. When the driver receives a packet from swp0 (say, due to flooding), it must set skb->offload_fwd_mark to something. - If we set it to 0, then the bridge will forward it towards swp1, swp2 and bond0. But the switch has already forwarded it towards swp1 and swp2 (not to bond0, remember, that isn't offloaded, so as far as the switch is concerned, ports swp3 and swp4 are not looking up the FDB, and the entire bond0 is a destination that is strictly behind the CPU). But we don't want duplicated traffic towards swp1 and swp2, so it's not ok to set skb->offload_fwd_mark = 0. - If we set it to 1, then the bridge will not forward the skb towards the ports with the same switchdev mark, i.e. not to swp1, swp2 and bond0. Towards swp1 and swp2 that's ok, but towards bond0? It should have forwarded the skb there. So the real issue is that bond0 will be assigned the same hardware domain as {swp0,swp1,swp2}, because the function that assigns hardware domains to bridge ports, nbp_switchdev_add(), recurses through bond0's lower interfaces until it finds something that implements devlink (calls dev_get_port_parent_id with bool recurse = true). This is a problem because the fact that bond0 can be offloaded by swp3 and swp4 in our example is merely an assumption. A solution is to give the bridge explicit hints as to what hardware domain it should use for each port. Currently, the bridging offload is very 'silent': a driver registers a netdevice notifier, which is put on the netns's notifier chain, and which sniffs around for NETDEV_CHANGEUPPER events where the upper is a bridge, and the lower is an interface it knows about (one registered by this driver, normally). Then, from within that notifier, it does a bunch of stuff behind the bridge's back, without the bridge necessarily knowing that there's somebody offloading that port. It looks like this: ip link set swp0 master br0 \| v br_add_if() calls netdev_master_upper_dev_link() \| v call_netdevice_notifiers \| v dsa_slave_netdevice_event \| v oh, hey! it's for me! \| v .port_bridge_join What we do to solve the conundrum is to be less silent, and change the switchdev drivers to present themselves to the bridge. Something like this: ip link set swp0 master br0 \| v br_add_if() calls netdev_master_upper_dev_link() \| v bridge: Aye! I'll use this call_netdevice_notifiers ^ ppid as the \| \| hardware domain for v \| this port, and zero dsa_slave_netdevice_event \| if I got nothing. \| \| v \| oh, hey! it's for me! \| \| \| v \| .port_bridge_join \| \| \| +------------------------+ switchdev_bridge_port_offload(swp0, swp0) Then stacked interfaces (like bond0 on top of swp3/swp4) would be treated differently in DSA, depending on whether we can or cannot offload them. The offload case: ip link set bond0 master br0 \| v br_add_if() calls netdev_master_upper_dev_link() \| v bridge: Aye! I'll use this call_netdevice_notifiers ^ ppid as the \| \| switchdev mark for v \| bond0. dsa_slave_netdevice_event \| Coincidentally (or not), \| \| bond0 and swp0, swp1, swp2 v \| all have the same switchdev hmm, it's not quite for me, \| mark now, since the ASIC but my driver has already \| is able to forward towards called .port_lag_join \| all these ports in hw. for it, because I have \| a port with dp->lag_dev == bond0. \| \| \| v \| .port_bridge_join \| for swp3 and swp4 \| \| \| +------------------------+ switchdev_bridge_port_offload(bond0, swp3) switchdev_bridge_port_offload(bond0, swp4) And the non-offload case: ip link set bond0 master br0 \| v br_add_if() calls netdev_master_upper_dev_link() \| v bridge waiting: call_netdevice_notifiers ^ huh, switchdev_bridge_port_offload \| \| wasn't called, okay, I'll use a v \| hwdom of zero for this one. dsa_slave_netdevice_event : Then packets received on swp0 will \| : not be software-forwarded towards v : swp1, but they will towards bond0. it's not for me, but bond0 is an upper of swp3 and swp4, but their dp->lag_dev is NULL because they couldn't offload it. Basically we can draw the conclusion that the lowers of a bridge port can come and go, so depending on the configuration of lowers for a bridge port, it can dynamically toggle between offloaded and unoffloaded. Therefore, we need an equivalent switchdev_bridge_port_unoffload too. This patch changes the way any switchdev driver interacts with the bridge. From now on, everybody needs to call switchdev_bridge_port_offload and switchdev_bridge_port_unoffload, otherwise the bridge will treat the port as non-offloaded and allow software flooding to other ports from the same ASIC. Note that these functions lay the ground for a more complex handshake between switchdev drivers and the bridge in the future. For drivers that will request a replay of the switchdev objects when they offload and unoffload a bridge port (DSA, dpaa2-switch, ocelot), we place the call to switchdev_bridge_port_unoffload() strategically inside the NETDEV_PRECHANGEUPPER notifier's code path, and not inside NETDEV_CHANGEUPPER. This is because the switchdev object replay helpers need the netdev adjacency lists to be valid, and that is only true in NETDEV_PRECHANGEUPPER. Cc: Vadym Kochan <vkochan@marvell.com> Cc: Taras Chornyi <tchornyi@marvell.com> Cc: Ioana Ciornei <ioana.ciornei@nxp.com> Cc: Lars Povlsen <lars.povlsen@microchip.com> Cc: Steen Hegelund <Steen.Hegelund@microchip.com> Cc: UNGLinuxDriver@microchip.com Cc: Claudiu Manoil <claudiu.manoil@nxp.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch: regression Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com> # ocelot-switch Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-22 00:26:23 -07:00
Linus Torvalds	7c3d49b0b5	regulator: Fixes for v5.14 A few driver specific fixes that came in since the merge window, plus a change to mark the regulator-fixed-domain DT binding as deprecated in order to try to to discourage any new users while a better solution is put in place. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmD4WckACgkQJNaLcl1U h9DltAgAgxwqrdGPLt/0rGlhpI183L6x81fbLcbvVYwWiyUDqU1rSqZhR+UnxkOf 3tOX4B+4EJTNpArLRcoXD2uu1KP92AhOwv3uGKsJGibS0OC6AwV7adyK+qkBuZsS IJgyI63YfRGP1udWfklZle+CxK6JIRMILbT9oTMGoWrPnK3QfS2rNDapTtpfYRn1 PTIu0TqMQ6xKHg8XPbtmD4INbrbhxP0H3848g0ZhohGozEwggKSEwB1c55cbQc2J I3HpBMVk3sO0UrG6NBpX+fSj0BT4MwjNdFDgmstgwEn/31gSSDjAxaHnfZqWLJqJ cW6rlhw0eSqRXAjrqh6WfK5QiojRXw== =+jG2 -----END PGP SIGNATURE----- Merge tag 'regulator-fix-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A few driver specific fixes that came in since the merge window, plus a change to mark the regulator-fixed-domain DT binding as deprecated in order to try to to discourage any new users while a better solution is put in place" * tag 'regulator-fix-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: hi6421: Fix getting wrong drvdata regulator: mtk-dvfsrc: Fix wrong dev pointer for devm_regulator_register regulator: fixed: Mark regulator-fixed-domain as deprecated regulator: bd9576: Fix testing wrong flag in check_temp_flag_mismatch regulator: hi6421v600: Fix getting wrong drvdata that causes boot failure regulator: rt5033: Fix n_voltages settings for BUCK and LDO regulator: rtmv20: Fix wrong mask for strobe-polarity-high	2021-07-21 12:37:49 -07:00
Linus Torvalds	b4e62aaf95	AFS fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmD4LMkACgkQ+7dXa6fL C2t4vg//eOjVirl2L9NSwW27btAv3tySeX7zN78i4IIy1KesVTV4An/yNYnZ1XDg yackjIIgRaw/pX+W/dWiQalRtAcYno4vFam874lzBq+Yourk9j+7mU7WDtkofRh4 UtuWAyARGVfLZdkJQp1Eb1YoZj0oro31Qa7eFeHQ/1Wlx6Wo1gdsRAdGfRpukLeC fTv/DXrpK+cC36ENZjP3VuhfsOAKIxKVqf7rLKiMBbeyAGW7jRs3fSTxgOJAgJN0 opj/WTGZf8vskq+u3uTbAv8dwK9CakqB6QEkDq22kZSVh+LSxKzIB6E/3wSXfhAj 1NAKUVQb4e6caXKk/vxFUhnyq8wWdzWW6ExVzH2DYSdNn442A6eo9g6d6qE7wIGK ui0sL1p7ilwV/1aBVA7imYM5bSS3HDpVFCW1+HCZpbVxnARDv87su8iOJGErWpz+ 5fDqifhm+Wrtrhoo05RnsYoygCf5elnz5iUu4MaOYKno1dFQrHBBJhFsv8PAduUq nZkr+q8HUFH7EYcUjidJiMR+NSiVpe8bCV4LoaGczY4j/AYc5rOfD1s3kW0tfiqG zqx+pGy5AXTwrZqzVf28I5T6cBCW1GxLnFLmSWZ4DsQdNGziYbPUTYDCJcJxfT0U cASaEXQSItfxlkVw+JIr+Hb/QXYEfoNEdwWLvUQ2LrPueJwH2V8= =Pn/w -----END PGP SIGNATURE----- Merge tag 'afs-fixes-20210721' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: - Fix a tracepoint that causes one of the tracing subsystem query files to crash if the module is loaded - Fix afs_writepages() to take account of whether the storage rpc actually succeeded when updating the cyclic writeback counter - Fix some error code propagation/handling - Fix place where afs_writepages() was setting writeback_index to a file position rather than a page index * tag 'afs-fixes-20210721' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Remove redundant assignment to ret afs: Fix setting of writeback_index afs: check function return afs: Fix tracepoint string placement with built-in AFS	2021-07-21 11:51:59 -07:00
Vadim Fedorenko	ac6627a28d	net: ipv4: Consolidate ipv4_mtu and ip_dst_mtu_maybe_forward Consolidate IPv4 MTU code the same way it is done in IPv6 to have code aligned in both address families Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-21 08:22:03 -07:00
Vadim Fedorenko	427faee167	net: ipv6: introduce ip6_dst_mtu_maybe_forward Replace ip6_dst_mtu_forward with ip6_dst_mtu_maybe_forward and reuse this code in ip6_mtu. Actually these two functions were almost duplicates, this change will simplify the maintaince of mtu calculation code. Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-21 08:22:02 -07:00
Justin Iurman	3edede08ff	ipv6: ioam: Support for IOAM injection with lwtunnels Add support for the IOAM inline insertion (only for the host-to-host use case) which is per-route configured with lightweight tunnels. The target is iproute2 and the patch is ready. It will be posted as soon as this patchset is merged. Here is an overview: $ ip -6 ro ad fc00::1/128 encap ioam6 trace type 0x800000 ns 1 size 12 dev eth0 This example configures an IOAM Pre-allocated Trace option attached to the fc00::1/128 prefix. The IOAM namespace (ns) is 1, the size of the pre-allocated trace data block is 12 octets (size) and only the first IOAM data (bit 0: hop_limit + node id) is included in the trace (type) represented as a bitfield. The reason why the in-transit (IPv6-in-IPv6 encapsulation) use case is not implemented is explained on the patchset cover. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-21 08:14:33 -07:00
Justin Iurman	8c6f6fa677	ipv6: ioam: IOAM Generic Netlink API Add Generic Netlink commands to allow userspace to configure IOAM namespaces and schemas. The target is iproute2 and the patch is ready. It will be posted as soon as this patchset is merged. Here is an overview: $ ip ioam Usage: ip ioam { COMMAND \| help } ip ioam namespace show ip ioam namespace add ID [ data DATA32 ] [ wide DATA64 ] ip ioam namespace del ID ip ioam schema show ip ioam schema add ID DATA ip ioam schema del ID ip ioam namespace set ID schema { ID \| none } Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-21 08:14:33 -07:00
Justin Iurman	9ee11f0fff	ipv6: ioam: Data plane support for Pre-allocated Trace Implement support for processing the IOAM Pre-allocated Trace with IPv6, see [1] and [2]. Introduce a new IPv6 Hop-by-Hop TLV option, see IANA [3]. A new per-interface sysctl is introduced. The value is a boolean to accept (=1) or ignore (=0, by default) IPv6 IOAM options on ingress for an interface: - net.ipv6.conf.XXX.ioam6_enabled Two other sysctls are introduced to define IOAM IDs, represented by an integer. They are respectively per-namespace and per-interface: - net.ipv6.ioam6_id - net.ipv6.conf.XXX.ioam6_id The value of the first one represents the IOAM ID of the node itself (u32; max and default value = U32_MAX>>8, due to hop limit concatenation) while the other represents the IOAM ID of an interface (u16; max and default value = U16_MAX). Each "ioam6_id" sysctl has a "_wide" equivalent: - net.ipv6.ioam6_id_wide - net.ipv6.conf.XXX.ioam6_id_wide The value of the first one represents the wide IOAM ID of the node itself (u64; max and default value = U64_MAX>>8, due to hop limit concatenation) while the other represents the wide IOAM ID of an interface (u32; max and default value = U32_MAX). The use of short and wide equivalents is not exclusive, a deployment could choose to leverage both. For example, net.ipv6.conf.XXX.ioam6_id (short format) could be an identifier for a physical interface, whereas net.ipv6.conf.XXX.ioam6_id_wide (wide format) could be an identifier for a logical sub-interface. Documentation about new sysctls is provided at the end of this patchset. Two relativistic hash tables are used: one for IOAM namespaces, the other for IOAM schemas. A namespace can only have a single active schema and a schema can only be attached to a single namespace (1:1 relationship). [1] https://tools.ietf.org/html/draft-ietf-ippm-ioam-ipv6-options [2] https://tools.ietf.org/html/draft-ietf-ippm-ioam-data [3] https://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xhtml#ipv6-parameters-2 Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-21 08:14:33 -07:00
Justin Iurman	db67f219fc	uapi: IPv6 IOAM headers definition This patch provides the IPv6 IOAM option header [1] as well as the IOAM Trace header [2]. An IOAM option must be 4n-aligned. Here is an overview of a Hop-by-Hop with an IOAM Trace option: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \| Next header \| Hdr Ext Len \| Padding \| Padding \| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \| Option Type \| Opt Data Len \| Reserved \| IOAM Type \| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \| Namespace-ID \| NodeLen \| Flags \| RemainingLen\| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \| IOAM-Trace-Type \| Reserved \| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ \| \| \| \| node data [n] \| \| \| \| \| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ D \| \| a \| node data [n-1] \| t \| \| a +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~ ... ~ S +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ p \| \| a \| node data [1] \| c \| \| e +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \| \| \| \| \| node data [0] \| \| \| \| \| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ The IOAM option header starts at "Option Type" and ends after "IOAM Type". The IOAM Trace header starts at "Namespace-ID" and ends after "IOAM-Trace-Type/Reserved". IOAM Type: either Pre-allocated Trace (=0), Incremental Trace (=1), Proof-of-Transit (=2) or Edge-to-Edge (=3). Note that both the Pre-allocated Trace and the Incremental Trace look the same. The two others are not implemented. Namespace-ID: IOAM namespace identifier, not to be confused with network namespaces. It adds further context to IOAM options and associated data, and allows devices which are IOAM capable to determine whether IOAM options must be processed or ignored. It can also be used by an operator to distinguish different operational domains or to identify different sets of devices. NodeLen: Length of data added by each node. It depends on the Trace Type. Flags: Only the Overflow (O) flag for now. The O flag is set by a transit node when there are not enough octets left to record its data. RemainingLen: Remaining free space to record data. IOAM-Trace-Type: Bit field where each bit corresponds to a specific kind of IOAM data. See [2] for a detailed list. [1] https://tools.ietf.org/html/draft-ietf-ippm-ioam-ipv6-options [2] https://tools.ietf.org/html/draft-ietf-ippm-ioam-data Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-21 08:14:33 -07:00
Vladimir Oltean	94111dfc18	net: switchdev: remove stray semicolon in switchdev_handle_fdb_del_to_device shim With the semicolon at the end, the compiler sees the shim function as a declaration and not as a definition, and warns: 'switchdev_handle_fdb_del_to_device' declared 'static' but never defined Reported-by: kernel test robot <lkp@intel.com> Fixes: `8ca07176ab` ("net: switchdev: introduce a fanout helper for SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-21 08:00:16 -07:00
David Howells	6c881ca0b3	afs: Fix tracepoint string placement with built-in AFS To quote Alexey[1]: I was adding custom tracepoint to the kernel, grabbed full F34 kernel .config, disabled modules and booted whole shebang as VM kernel. Then did perf record -a -e ... It crashed: general protection fault, probably for non-canonical address 0x435f5346592e4243: 0000 [#1] SMP PTI CPU: 1 PID: 842 Comm: cat Not tainted 5.12.6+ #26 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 RIP: 0010:t_show+0x22/0xd0 Then reproducer was narrowed to # cat /sys/kernel/tracing/printk_formats Original F34 kernel with modules didn't crash. So I started to disable options and after disabling AFS everything started working again. The root cause is that AFS was placing char arrays content into a section full of _pointers_ to strings with predictable consequences. Non canonical address 435f5346592e4243 is "CB.YFS_" which came from CM_NAME macro. Steps to reproduce: CONFIG_AFS=y CONFIG_TRACING=y # cat /sys/kernel/tracing/printk_formats Fix this by the following means: (1) Add enum->string translation tables in the event header with the AFS and YFS cache/callback manager operations listed by RPC operation ID. (2) Modify the afs_cb_call tracepoint to print the string from the translation table rather than using the string at the afs_call name pointer. (3) Switch translation table depending on the service we're being accessed as (AFS or YFS) in the tracepoint print clause. Will this cause problems to userspace utilities? Note that the symbolic representation of the YFS service ID isn't available to this header, so I've put it in as a number. I'm not sure if this is the best way to do this. (4) Remove the name wrangling (CM_NAME) macro and put the names directly into the afs_call_type structs in cmservice.c. Fixes: `8e8d7f13b6` ("afs: Add some tracepoints") Reported-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: Andrew Morton <akpm@linux-foundation.org> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/YLAXfvZ+rObEOdc%2F@localhost.localdomain/ [1] Link: https://lore.kernel.org/r/643721.1623754699@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/162430903582.2896199.6098150063997983353.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/162609463957.3133237.15916579353149746363.stgit@warthog.procyon.org.uk/ # v1 (repost) Link: https://lore.kernel.org/r/162610726860.3408253.445207609466288531.stgit@warthog.procyon.org.uk/ # v2	2021-07-21 15:08:35 +01:00
Jonathan Marek	d8a719059b	Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge" This reverts commit `c742199a01`. `c742199a01` ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge") breaks arm64 in at least two ways for configurations where PUD or PMD folding occur: 1. We no longer install huge-vmap mappings and silently fall back to page-granular entries, despite being able to install block entries at what is effectively the PGD level. 2. If the linear map is backed with block mappings, these will now silently fail to be created in alloc_init_pud(), causing a panic early during boot. The pgtable selftests caught this, although a fix has not been forthcoming and Christophe is AWOL at the moment, so just revert the change for now to get a working -rc3 on which we can queue patches for 5.15. A simple revert breaks the build for 32-bit PowerPC 8xx machines, which rely on the default function definitions when the corresponding page-table levels are folded, since commit `a6a8f7c4aa` ("powerpc/8xx: add support for huge pages on VMAP and VMALLOC"), eg: powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pud_range': linux/mm/vmalloc.c:362: undefined reference to `pud_clear_huge' To avoid that, add stubs for pud_clear_huge() and pmd_clear_huge() in arch/powerpc/mm/nohash/8xx.c as suggested by Christophe. Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: `c742199a01` ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> [mpe: Fold in 8xx.c changes from Christophe and mention in change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/linux-arm-kernel/CAMuHMdXShORDox-xxaeUfDW3wx2PeggFSqhVSHVZNKCGK-y_vQ@mail.gmail.com/ Link: https://lore.kernel.org/r/20210717160118.9855-1-jonathan@marek.ca Link: https://lore.kernel.org/r/87r1fs1762.fsf@mpe.ellerman.id.au Signed-off-by: Will Deacon <will@kernel.org>	2021-07-21 11:28:09 +01:00
Vladimir Oltean	8ca07176ab	net: switchdev: introduce a fanout helper for SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE Currently DSA has an issue with FDB entries pointing towards the bridge in the presence of br_fdb_replay() being called at port join and leave time. In particular, each bridge port will ask for a replay for the FDB entries pointing towards the bridge when it joins, and for another replay when it leaves. This means that for example, a bridge with 4 switch ports will notify DSA 4 times of the bridge MAC address. But if the MAC address of the bridge changes during the normal runtime of the system, the bridge notifies switchdev [ once ] of the deletion of the old MAC address as a local FDB towards the bridge, and of the insertion [ again once ] of the new MAC address as a local FDB. This is a problem, because DSA keeps the old MAC address as a host FDB entry with refcount 4 (4 ports asked for it using br_fdb_replay). So the old MAC address will not be deleted. Additionally, the new MAC address will only be installed with refcount 1, and when the first switch port leaves the bridge (leaving 3 others as still members), it will delete with it the new MAC address of the bridge from the local FDB entries kept by DSA (because the br_fdb_replay call on deletion will bring the entry's refcount from 1 to 0). So the problem, really, is that the number of br_fdb_replay() calls is not matched with the refcount that a host FDB is offloaded to DSA during normal runtime. An elegant way to solve the problem would be to make the switchdev notification emitted by br_fdb_change_mac_address() result in a host FDB kept by DSA which has a refcount exactly equal to the number of ports under that bridge. Then, no matter how many DSA ports join or leave that bridge, the host FDB entry will always be deleted when there are exactly zero remaining DSA switch ports members of the bridge. To implement the proposed solution, we remember that the switchdev objects and port attributes have some helpers provided by switchdev, which can be optionally called by drivers: switchdev_handle_port_obj_{add,del} and switchdev_handle_port_attr_set. These helpers: - fan out a switchdev object/attribute emitted for the bridge towards all the lower interfaces that pass the check_cb(). - fan out a switchdev object/attribute emitted for a bridge port that is a LAG towards all the lower interfaces that pass the check_cb(). In other words, this is the model we need for the FDB events too: something that will keep an FDB entry emitted towards a physical port as it is, but translate an FDB entry emitted towards the bridge into N FDB entries, one per physical port. Of course, there are many differences between fanning out a switchdev object (VLAN) on 3 lower interfaces of a LAG and fanning out an FDB entry on 3 lower interfaces of a LAG. Intuitively, an FDB entry towards a LAG should be treated specially, because FDB entries are unicast, we can't just install the same address towards 3 destinations. It is imaginable that drivers might want to treat this case specifically, so create some methods for this case and do not recurse into the LAG lower ports, just the bridge ports. DSA also listens for FDB entries on "foreign" interfaces, aka interfaces bridged with us which are not part of our hardware domain: think an Ethernet switch bridged with a Wi-Fi AP. For those addresses, DSA installs host FDB entries. However, there we have the same problem (those host FDB entries are installed with a refcount of only 1) and an even bigger one which we did not have with FDB entries towards the bridge: br_fdb_replay() is currently not called for FDB entries on foreign interfaces, just for the physical port and for the bridge itself. So when DSA sniffs an address learned by the software bridge towards a foreign interface like an e1000 port, and then that e1000 leaves the bridge, DSA remains with the dangling host FDB address. That will be fixed separately by replaying all FDB entries and not just the ones towards the port and the bridge. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-20 07:04:27 -07:00
Vladimir Oltean	c6451cda10	net: switchdev: introduce helper for checking dynamically learned FDB entries It is a bit difficult to understand what DSA checks when it tries to avoid installing dynamically learned addresses on foreign interfaces as local host addresses, so create a generic switchdev helper that can be reused and is generally more readable. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-20 07:04:27 -07:00
Xu Liang	8b72b301b4	net: phy: add API to read 802.3-c45 IDs Add API to read 802.3-c45 IDs so that C22/C45 mixed device can use C45 APIs without failing ID checks. Signed-off-by: Xu Liang <lxu@maxlinear.com> Acked-by: Hauke Mehrtens <hmehrtens@maxlinear.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-20 06:55:20 -07:00

1 2 3 4 5 ...

131298 commits