Commit graph

81365 commits

Author SHA1 Message Date
Sebastian Andrzej Siewior
3c118547f8 u64_stats: Disable preemption on 32bit UP+SMP PREEMPT_RT during updates.
On PREEMPT_RT the seqcount_t for synchronisation is required on 32bit
architectures even on UP because the softirq (and the threaded IRQ handler) can
be preempted.

With the seqcount_t for synchronisation, a reader with higher priority can
preempt the writer and then spin endlessly in read_seqcount_begin() while the
writer can't make progress.

To avoid such a lock up on PREEMPT_RT the writer must disable preemption during
the update. There is no need to disable interrupts because no writer is using
this API in hard-IRQ context on PREEMPT_RT.

Disable preemption on 32bit-RT within the u64_stats write section.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13 12:42:08 +00:00
Vladimir Oltean
950a419d9d net: dsa: tag_sja1105: split sja1105_tagger_data into private and public sections
The sja1105 driver messes with the tagging protocol's state when PTP RX
timestamping is enabled/disabled. This is fundamentally necessary
because the tagger needs to know what to do when it receives a PTP
packet. If RX timestamping is enabled, then a metadata follow-up frame
is expected, and this holds the (partial) timestamp. So the tagger plays
hide-and-seek with the network stack until it also gets the metadata
frame, and then presents a single packet, the timestamped PTP packet.
But when RX timestamping isn't enabled, there is no metadata frame
expected, so the hide-and-seek game must be turned off and the packet
must be delivered right away to the network stack.

Considering this, we create a pseudo isolation by devising two tagger
methods callable by the switch: one to get the RX timestamping state,
and one to set it. Since we can't export symbols between the tagger and
the switch driver, these methods are exposed through function pointers.

After this change, the public portion of the sja1105_tagger_data
contains only function pointers.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 12:51:34 +00:00
Vladimir Oltean
fcbf979a5b Revert "net: dsa: move sja1110_process_meta_tstamp inside the tagging protocol driver"
This reverts commit 6d709cadfd.

The above change was done to avoid calling symbols exported by the
switch driver from the tagging protocol driver.

With the tagger-owned storage model, we have a new option on our hands,
and that is for the switch driver to provide a data consumer handler in
the form of a function pointer inside the ->connect_tag_protocol()
method. Having a function pointer avoids the problems of the exported
symbols approach.

By creating a handler for metadata frames holding TX timestamps on
SJA1110, we are able to eliminate an skb queue from the tagger data, and
replace it with a simple, and stateless, function pointer. This skb
queue is now handled exclusively by sja1105_ptp.c, which makes the code
easier to follow, as it used to be before the reverted patch.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 12:51:34 +00:00
Vladimir Oltean
c79e84866d net: dsa: tag_sja1105: convert to tagger-owned data
Currently, struct sja1105_tagger_data is a part of struct
sja1105_private, and is used by the sja1105 driver to populate dp->priv.

With the movement towards tagger-owned storage, the sja1105 driver
should not be the owner of this memory.

This change implements the connection between the sja1105 switch driver
and its tagging protocol, which means that sja1105_tagger_data no longer
stays in dp->priv but in ds->tagger_data, and that the sja1105 driver
now only populates the sja1105_port_deferred_xmit callback pointer.
The kthread worker is now the responsibility of the tagger.

The sja1105 driver also alters the tagger's state some more, especially
with regard to the PTP RX timestamping state. This will be fixed up a
bit in further changes.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 12:51:33 +00:00
Vladimir Oltean
22ee9f8e40 net: dsa: sja1105: move ts_id from sja1105_tagger_data
The TX timestamp ID is incremented by the SJA1110 PTP timestamping
callback (->port_tx_timestamp) for every packet, when cloning it.
It isn't used by the tagger at all, even though it sits inside the
struct sja1105_tagger_data.

Also, serialization to this structure is currently done through
tagger_data->meta_lock, which is a cheap hack because the meta_lock
isn't used for anything else on SJA1110 (sja1105_rcv_meta_state_machine
isn't called).

This change moves ts_id from sja1105_tagger_data to sja1105_private and
introduces a dedicated spinlock for it, also in sja1105_private.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 12:51:33 +00:00
Vladimir Oltean
bfcf142522 net: dsa: sja1105: make dp->priv point directly to sja1105_tagger_data
The design of the sja1105 tagger dp->priv is that each port has a
separate struct sja1105_port, and the sp->data pointer points to a
common struct sja1105_tagger_data.

We have removed all per-port members accessible by the tagger, and now
only struct sja1105_tagger_data remains. Make dp->priv point directly to
this.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 12:51:33 +00:00
Vladimir Oltean
6f6770ab1c net: dsa: sja1105: remove hwts_tx_en from tagger data
This tagger property is in fact not used at all by the tagger, only by
the switch driver. Therefore it makes sense to be moved to
sja1105_private.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 12:51:33 +00:00
Vladimir Oltean
d38049bbe7 net: dsa: sja1105: bring deferred xmit implementation in line with ocelot-8021q
When the ocelot-8021q driver was converted to deferred xmit as part of
commit 8d5f7954b7 ("net: dsa: felix: break at first CPU port during
init and teardown"), the deferred implementation was deliberately made
subtly different from what sja1105 has.

The implementation differences lied on the following observations:

- There might be a race between these two lines in tag_sja1105.c:

       skb_queue_tail(&sp->xmit_queue, skb_get(skb));
       kthread_queue_work(sp->xmit_worker, &sp->xmit_work);

  and the skb dequeue logic in sja1105_port_deferred_xmit(). For
  example, the xmit_work might be already queued, however the work item
  has just finished walking through the skb queue. Because we don't
  check the return code from kthread_queue_work, we don't do anything if
  the work item is already queued.

  However, nobody will take that skb and send it, at least until the
  next timestampable skb is sent. This creates additional (and
  avoidable) TX timestamping latency.

  To close that race, what the ocelot-8021q driver does is it doesn't
  keep a single work item per port, and a skb timestamping queue, but
  rather dynamically allocates a work item per packet.

- It is also unnecessary to have more than one kthread that does the
  work. So delete the per-port kthread allocations and replace them with
  a single kthread which is global to the switch.

This change brings the two implementations in line by applying those
observations to the sja1105 driver as well.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 12:51:33 +00:00
Vladimir Oltean
35d9768021 net: dsa: tag_ocelot: convert to tagger-owned data
The felix driver makes very light use of dp->priv, and the tagger is
effectively stateless. dp->priv is practically only needed to set up a
callback to perform deferred xmit of PTP and STP packets using the
ocelot-8021q tagging protocol (the main ocelot tagging protocol makes no
use of dp->priv, although this driver sets up dp->priv irrespective of
actual tagging protocol in use).

struct felix_port (what used to be pointed to by dp->priv) is removed
and replaced with a two-sided structure. The public side of this
structure, visible to the switch driver, is ocelot_8021q_tagger_data.
The private side is ocelot_8021q_tagger_private, and the latter
structure physically encapsulates the former. The public half of the
tagger data structure can be accessed through a helper of the same name
(ocelot_8021q_tagger_data) which also sanity-checks the protocol
currently in use by the switch. The public/private split was requested
by Andrew Lunn.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12 12:51:33 +00:00
Jakub Kicinski
be3158290d Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Andrii Nakryiko says:

====================
bpf-next 2021-12-10 v2

We've added 115 non-merge commits during the last 26 day(s) which contain
a total of 182 files changed, 5747 insertions(+), 2564 deletions(-).

The main changes are:

1) Various samples fixes, from Alexander Lobakin.

2) BPF CO-RE support in kernel and light skeleton, from Alexei Starovoitov.

3) A batch of new unified APIs for libbpf, logging improvements, version
   querying, etc. Also a batch of old deprecations for old APIs and various
   bug fixes, in preparation for libbpf 1.0, from Andrii Nakryiko.

4) BPF documentation reorganization and improvements, from Christoph Hellwig
   and Dave Tucker.

5) Support for declarative initialization of BPF_MAP_TYPE_PROG_ARRAY in
   libbpf, from Hengqi Chen.

6) Verifier log fixes, from Hou Tao.

7) Runtime-bounded loops support with bpf_loop() helper, from Joanne Koong.

8) Extend branch record capturing to all platforms that support it,
   from Kajol Jain.

9) Light skeleton codegen improvements, from Kumar Kartikeya Dwivedi.

10) bpftool doc-generating script improvements, from Quentin Monnet.

11) Two libbpf v0.6 bug fixes, from Shuyi Cheng and Vincent Minet.

12) Deprecation warning fix for perf/bpf_counter, from Song Liu.

13) MAX_TAIL_CALL_CNT unification and MIPS build fix for libbpf,
    from Tiezhu Yang.

14) BTF_KING_TYPE_TAG follow-up fixes, from Yonghong Song.

15) Selftests fixes and improvements, from Ilya Leoshkevich, Jean-Philippe
    Brucker, Jiri Olsa, Maxim Mikityanskiy, Tirthendu Sarkar, Yucong Sun,
    and others.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (115 commits)
  libbpf: Add "bool skipped" to struct bpf_map
  libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition
  bpftool: Switch bpf_object__load_xattr() to bpf_object__load()
  selftests/bpf: Remove the only use of deprecated bpf_object__load_xattr()
  selftests/bpf: Add test for libbpf's custom log_buf behavior
  selftests/bpf: Replace all uses of bpf_load_btf() with bpf_btf_load()
  libbpf: Deprecate bpf_object__load_xattr()
  libbpf: Add per-program log buffer setter and getter
  libbpf: Preserve kernel error code and remove kprobe prog type guessing
  libbpf: Improve logging around BPF program loading
  libbpf: Allow passing user log setting through bpf_object_open_opts
  libbpf: Allow passing preallocated log_buf when loading BTF into kernel
  libbpf: Add OPTS-based bpf_btf_load() API
  libbpf: Fix bpf_prog_load() log_buf logic for log_level 0
  samples/bpf: Remove unneeded variable
  bpf: Remove redundant assignment to pointer t
  selftests/bpf: Fix a compilation warning
  perf/bpf_counter: Use bpf_map_create instead of bpf_create_map
  samples: bpf: Fix 'unknown warning group' build warning on Clang
  samples: bpf: Fix xdp_sample_user.o linking with Clang
  ...
====================

Link: https://lore.kernel.org/r/20211210234746.2100561-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 15:56:13 -08:00
Eric Dumazet
04a931e58d net: add netns refcount tracker to struct seq_net_private
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 06:38:26 -08:00
Eric Dumazet
9ba74e6c9e net: add networking namespace refcount tracker
We have 100+ syzbot reports about netns being dismantled too soon,
still unresolved as of today.

We think a missing get_net() or an extra put_net() is the root cause.

In order to find the bug(s), and be able to spot future ones,
this patch adds CONFIG_NET_NS_REFCNT_TRACKER and new helpers
to precisely pair all put_net() with corresponding get_net().

To use these helpers, each data structure owning a refcount
should also use a "netns_tracker" to pair the get and put.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 06:38:26 -08:00
Jakub Kicinski
3150a73366 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 13:23:02 -08:00
Kees Cook
1a2fb220ed skbuff: Extract list pointers to silence compiler warnings
Under both -Warray-bounds and the object_size sanitizer, the compiler is
upset about accessing prev/next of sk_buff when the object it thinks it
is coming from is sk_buff_head. The warning is a false positive due to
the compiler taking a conservative approach, opting to warn at casting
time rather than access time.

However, in support of enabling -Warray-bounds globally (which has
found many real bugs), arrange things for sk_buff so that the compiler
can unambiguously see that there is no intention to access anything
except prev/next.  Introduce and cast to a separate struct sk_buff_list,
which contains _only_ the first two fields, silencing the warnings:

In file included from ./include/net/net_namespace.h:39,
                 from ./include/linux/netdevice.h:37,
                 from net/core/netpoll.c:17:
net/core/netpoll.c: In function 'refill_skbs':
./include/linux/skbuff.h:2086:9: warning: array subscript 'struct sk_buff[0]' is partly outside array bounds of 'struct sk_buff_head[1]' [-Warray-bounds]
 2086 |         __skb_insert(newsk, next->prev, next, list);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
net/core/netpoll.c:49:28: note: while referencing 'skb_pool'
   49 | static struct sk_buff_head skb_pool;
      |                            ^~~~~~~~

This change results in no executable instruction differences.

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20211207062758.2324338-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 11:54:24 -08:00
Linus Torvalds
ded746bfc9 Networking fixes for 5.16-rc5, including fixes from bpf, can and netfilter.
Current release - regressions:
 
  - bpf, sockmap: re-evaluate proto ops when psock is removed from sockmap
 
 Current release - new code bugs:
 
  - bpf: fix bpf_check_mod_kfunc_call for built-in modules
 
  - ice: fixes for TC classifier offloads
 
  - vrf: don't run conntrack on vrf with !dflt qdisc
 
 Previous releases - regressions:
 
  - bpf: fix the off-by-two error in range markings
 
  - seg6: fix the iif in the IPv6 socket control block
 
  - devlink: fix netns refcount leak in devlink_nl_cmd_reload()
 
  - dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"
 
  - dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports
 
 Previous releases - always broken:
 
  - ethtool: do not perform operations on net devices being unregistered
 
  - udp: use datalen to cap max gso segments
 
  - ice: fix races in stats collection
 
  - fec: only clear interrupt of handling queue in fec_enet_rx_queue()
 
  - m_can: pci: fix incorrect reference clock rate
 
  - m_can: disable and ignore ELO interrupt
 
  - mvpp2: fix XDP rx queues registering
 
 Misc:
 
  - treewide: add missing includes masked by cgroup -> bpf.h dependency
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGyN1AACgkQMUZtbf5S
 IrtgMA/8D0qk3c75ts0hCzGXwdNdEBs+e7u1bJVPqdyU8x/ZLAp2c0EKB/7IWuxA
 CtsnbanPcmibqvQJDI1hZEBdafi43BmF5VuFSIxYC4EM/1vLoRprurXlIwL2YWki
 aWi//tyOIGBl6/ClzJ9Vm51HTJQwDmdv8GRnKAbsC1eOTM3pmmcg+6TLbDhycFEQ
 F9kkDCvyB9kWIH645QyJRH+Y5qQOvneCyQCPkkyjTgEADzV5i7YgtRol6J3QIbw3
 umPHSckCBTjMacYcCLsbhQaF2gTMgPV1basNLPMjCquJVrItE0ZaeX3MiD6nBFae
 yY5+Wt5KAZDzjERhneX8AINHoRPA/tNIahC1+ytTmsTA8Hj230FHE5hH1ajWiJ9+
 GSTBCBqjtZXce3r2Efxfzy0Kb9JwL3vDi7LS2eKQLv0zBLfYp2ry9Sp9qe4NhPkb
 OYrxws9kl5GOPvrFB5BWI9XBINciC9yC3PjIsz1noi0vD8/Hi9dPwXeAYh36fXU3
 rwRg9uAt6tvFCpwbuQ9T2rsMST0miur2cDYd8qkJtuJ7zFvc+suMXwBZyI29nF2D
 uyymIC2XStHJfAjUkFsGVUSXF5FhML9OQsqmisdQ8KdH26jMnDeMjIWJM7UWK+zY
 E/fqWT8UyS3mXWqaggid4ZbotipCwA0gxiDHuqqUGTM+dbKrzmk=
 =F6rS
 -----END PGP SIGNATURE-----

Merge tag 'net-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf, can and netfilter.

  Current release - regressions:

   - bpf, sockmap: re-evaluate proto ops when psock is removed from
     sockmap

  Current release - new code bugs:

   - bpf: fix bpf_check_mod_kfunc_call for built-in modules

   - ice: fixes for TC classifier offloads

   - vrf: don't run conntrack on vrf with !dflt qdisc

  Previous releases - regressions:

   - bpf: fix the off-by-two error in range markings

   - seg6: fix the iif in the IPv6 socket control block

   - devlink: fix netns refcount leak in devlink_nl_cmd_reload()

   - dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"

   - dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports

  Previous releases - always broken:

   - ethtool: do not perform operations on net devices being
     unregistered

   - udp: use datalen to cap max gso segments

   - ice: fix races in stats collection

   - fec: only clear interrupt of handling queue in fec_enet_rx_queue()

   - m_can: pci: fix incorrect reference clock rate

   - m_can: disable and ignore ELO interrupt

   - mvpp2: fix XDP rx queues registering

  Misc:

   - treewide: add missing includes masked by cgroup -> bpf.h
     dependency"

* tag 'net-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits)
  net: dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports
  net: wwan: iosm: fixes unable to send AT command during mbim tx
  net: wwan: iosm: fixes net interface nonfunctional after fw flash
  net: wwan: iosm: fixes unnecessary doorbell send
  net: dsa: felix: Fix memory leak in felix_setup_mmio_filtering
  MAINTAINERS: s390/net: remove myself as maintainer
  net/sched: fq_pie: prevent dismantle issue
  net: mana: Fix memory leak in mana_hwc_create_wq
  seg6: fix the iif in the IPv6 socket control block
  nfp: Fix memory leak in nfp_cpp_area_cache_add()
  nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done
  nfc: fix segfault in nfc_genl_dump_devices_done
  udp: using datalen to cap max gso segments
  net: dsa: mv88e6xxx: error handling for serdes_power functions
  can: kvaser_usb: get CAN clock frequency from device
  can: kvaser_pciefd: kvaser_pciefd_rx_error_frame(): increase correct stats->{rx,tx}_errors counter
  net: mvpp2: fix XDP rx queues registering
  vmxnet3: fix minimum vectors alloc issue
  net, neigh: clear whole pneigh_entry at alloc time
  net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"
  ...
2021-12-09 11:26:44 -08:00
Russell King (Oracle)
001f4261fe net: phylink: use legacy_pre_march2020
Use the legacy flag to indicate whether we should operate in legacy
mode. This allows us to stop using the presence of a PCS as an
indicator to the age of the phylink user, and make PCS presence
optional.

Legacy mode involves:
1) calling mac_config() whenever the link comes up
2) calling mac_config() whenever the inband advertisement changes,
   possibly followed by a call to mac_an_restart()
3) making use of mac_an_restart()
4) making use of mac_pcs_get_state()

All the above functionality was moved to a seperate "PCS" block of
operations in March 2020.

Update the documents to indicate that the differences that this flag
makes.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 11:21:03 -08:00
Russell King (Oracle)
3e5b1fecce net: phylink: add legacy_pre_march2020 indicator
Add a boolean to phylink_config to indicate whether a driver has not
been updated for the changes in commit 7cceb599d1 ("net: phylink:
avoid mac_config calls"), and thus are reliant on the old behaviour.

We were currently keying the phylink behaviour on the presence of a
PCS, but this is sub-optimal for modern drivers that may not have a
PCS.

This commit merely introduces the new flag, but does not add any use,
since we need all legacy drivers to set this flag before it can be
used. Once these legacy drivers have been updated, we can remove this
flag.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 11:21:03 -08:00
Linus Torvalds
03090cc76e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:

 - fixes for various drivers which assume that a HID device is on USB
   transport, but that might not necessarily be the case, as the device
   can be faked by uhid. (Greg, Benjamin Tissoires)

 - fix for spurious wakeups on certain Lenovo notebooks (Thomas
   Weißschuh)

 - a few other device-specific quirks

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: Ignore battery for Elan touchscreen on Asus UX550VE
  HID: intel-ish-hid: ipc: only enable IRQ wakeup when requested
  HID: google: add eel USB id
  HID: add USB_HID dependancy to hid-prodikeys
  HID: add USB_HID dependancy to hid-chicony
  HID: bigbenff: prevent null pointer dereference
  HID: sony: fix error path in probe
  HID: add USB_HID dependancy on some USB HID drivers
  HID: check for valid USB device for many HID drivers
  HID: wacom: fix problems when device is not a valid USB device
  HID: add hid_is_usb() function to make it simpler for USB detection
  HID: quirks: Add quirk for the Microsoft Surface 3 type-cover
2021-12-09 11:08:19 -08:00
Sergey Ryazanov
283e6f5a81 net: wwan: make debugfs optional
Debugfs interface is optional for the regular modem use. Some distros
and users will want to disable this feature for security or kernel
size reasons. So add a configuration option that allows to completely
disable the debugfs interface of the WWAN devices.

A primary considered use case for this option was embedded firmwares.
For example, in OpenWrt, you can not completely disable debugfs, as a
lot of wireless stuff can only be configured and monitored with the
debugfs knobs. At the same time, reducing the size of a kernel and
modules is an essential task in the world of embedded software.
Disabling the WWAN and IOSM debugfs interfaces allows us to save 50K
(x86-64 build) of space for module storage. Not much, but already
considerable when you only have 16MB of storage.

So it is hard to just disable whole debugfs. Users need some fine
grained set of options to control which debugfs interface is important
and should be available and which is not.

The new configuration symbol is enabled by default and is hidden under
the EXPERT option. So a regular user would not be bothered by another
one configuration question. While an embedded distro maintainer will be
able to a little more reduce the final image size.

Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Acked-by: M Chetan Kumar <m.chetan.kumar@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 17:58:59 -08:00
Jakub Kicinski
a43a072021 linux-can-next-for-5.17-20211208
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEK3kIWJt9yTYMP3ehqclaivrt76kFAmGwqbcTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRCpyVqK+u3vqSGtB/sF8wFDyY50dRG8WQLaeeFIFIZetHni
 NY6+9ch5xWM6aFDP3ecmGXqZtNaHgeAJVChmyALKs4Sa+5L2F66bGwHHuKu7uG0y
 42tz4XSI8B3mZgtVTJ7lKyeu4BismzhJm+iG6I/x3CXo482AxEB2UP53nYssv1vA
 8jfjH9gx95sQDXs0WLY2QDvkWS55BgZzpqqcT0K+uqeT+5Ufa2z3oLuCdQbSejNR
 3v3kizMp6mBL55z/UbEiR2G7bqZbOtzmAMdhd7g69fKnrav67qnXtyUQpGcOSMWB
 Y6/xUuwq8JoYVMQfpnzJ1KebW+uwpIyIGL/e8GOLlkVioWWfMqLz5bv0
 =7p/y
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-next-for-5.17-20211208' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
can-next 2021-12-08

The first patch is by Vincent Mailhol and replaces the custom CAN
units with generic one form linux/units.h.

The next 3 patches are by Evgeny Boger and add Allwinner R40 support
to the sun4i CAN driver.

Andy Shevchenko contributes 4 patches to the hi311x CAN driver,
consisting of cleanups and converting the driver to the device
property API.

* tag 'linux-can-next-for-5.17-20211208' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
  can: hi311x: hi3110_can_probe(): convert to use dev_err_probe()
  can: hi311x: hi3110_can_probe(): make use of device property API
  can: hi311x: hi3110_can_probe(): try to get crystal clock rate from property
  can: hi311x: hi3110_can_probe(): use devm_clk_get_optional() to get the input clock
  ARM: dts: sun8i: r40: add node for CAN controller
  can: sun4i_can: add support for R40 CAN controller
  dt-bindings: net: can: add support for Allwinner R40 CAN controller
  can: bittiming: replace CAN units with the generic ones from linux/units.h
====================

Link: https://lore.kernel.org/r/20211208125055.223141-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 17:06:58 -08:00
Jakub Kicinski
6efcdadc15 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
bpf 2021-12-08

We've added 12 non-merge commits during the last 22 day(s) which contain
a total of 29 files changed, 659 insertions(+), 80 deletions(-).

The main changes are:

1) Fix an off-by-two error in packet range markings and also add a batch of
   new tests for coverage of these corner cases, from Maxim Mikityanskiy.

2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh.

3) Fix two functional regressions and a build warning related to BTF kfunc
   for modules, from Kumar Kartikeya Dwivedi.

4) Fix outdated code and docs regarding BPF's migrate_disable() use on non-
   PREEMPT_RT kernels, from Sebastian Andrzej Siewior.

5) Add missing includes in order to be able to detangle cgroup vs bpf header
   dependencies, from Jakub Kicinski.

6) Fix regression in BPF sockmap tests caused by missing detachment of progs
   from sockets when they are removed from the map, from John Fastabend.

7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF
   dispatcher, from Björn Töpel.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Add selftests to cover packet access corner cases
  bpf: Fix the off-by-two error in range markings
  treewide: Add missing includes masked by cgroup -> bpf dependency
  tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets
  bpf: Fix bpf_check_mod_kfunc_call for built-in modules
  bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
  mips, bpf: Fix reference to non-existing Kconfig symbol
  bpf: Make sure bpf_disable_instrumentation() is safe vs preemption.
  Documentation/locking/locktypes: Update migrate_disable() bits.
  bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
  bpf, sockmap: Attach map progs to psock early for feature probes
  bpf, x86: Fix "no previous prototype" warning
====================

Link: https://lore.kernel.org/r/20211208155125.11826-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 16:06:44 -08:00
Vladimir Oltean
d3eed0e57d net: dsa: keep the bridge_dev and bridge_num as part of the same structure
The main desire behind this is to provide coherent bridge information to
the fast path without locking.

For example, right now we set dp->bridge_dev and dp->bridge_num from
separate code paths, it is theoretically possible for a packet
transmission to read these two port properties consecutively and find a
bridge number which does not correspond with the bridge device.

Another desire is to start passing more complex bridge information to
dsa_switch_ops functions. For example, with FDB isolation, it is
expected that drivers will need to be passed the bridge which requested
an FDB/MDB entry to be offloaded, and along with that bridge_dev, the
associated bridge_num should be passed too, in case the driver might
want to implement an isolation scheme based on that number.

We already pass the {bridge_dev, bridge_num} pair to the TX forwarding
offload switch API, however we'd like to remove that and squash it into
the basic bridge join/leave API. So that means we need to pass this
pair to the bridge join/leave API.

During dsa_port_bridge_leave, first we unset dp->bridge_dev, then we
call the driver's .port_bridge_leave with what used to be our
dp->bridge_dev, but provided as an argument.

When bridge_dev and bridge_num get folded into a single structure, we
need to preserve this behavior in dsa_port_bridge_leave: we need a copy
of what used to be in dp->bridge.

Switch drivers check bridge membership by comparing dp->bridge_dev with
the provided bridge_dev, but now, if we provide the struct dsa_bridge as
a pointer, they cannot keep comparing dp->bridge to the provided
pointer, since this only points to an on-stack copy. To make this
obvious and prevent driver writers from forgetting and doing stupid
things, in this new API, the struct dsa_bridge is provided as a full
structure (not very large, contains an int and a pointer) instead of a
pointer. An explicit comparison function needs to be used to determine
bridge membership: dsa_port_offloads_bridge().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 14:31:16 -08:00
Vladimir Oltean
3f9bb0301d net: dsa: make dp->bridge_num one-based
I have seen too many bugs already due to the fact that we must encode an
invalid dp->bridge_num as a negative value, because the natural tendency
is to check that invalid value using (!dp->bridge_num). Latest example
can be seen in commit 1bec0f0506 ("net: dsa: fix bridge_num not
getting cleared after ports leaving the bridge").

Convert the existing users to assume that dp->bridge_num == 0 is the
encoding for invalid, and valid bridge numbers start from 1.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 14:31:14 -08:00
Vincent Mailhol
330c6d3bfa can: bittiming: replace CAN units with the generic ones from linux/units.h
In [1], we introduced a set of units in linux/can/bittiming.h. Since
then, generic SI prefixes were added to linux/units.h in [2]. Those
new prefixes can perfectly replace CAN specific ones.

This patch replaces all occurrences of the CAN units with their
corresponding prefix (from linux/units) and the unit (as a comment)
according to below table.

 CAN units	SI metric prefix (from linux/units) + unit (as a comment)
 ------------------------------------------------------------------------
 CAN_KBPS	KILO /* BPS */
 CAN_MBPS	MEGA /* BPS */
 CAM_MHZ	MEGA /* Hz */

The definition are then removed from linux/can/bittiming.h

[1] commit 1d7750760b ("can: bittiming: add CAN_KBPS, CAN_MBPS and
CAN_MHZ macros")

[2] commit 26471d4a6c ("units: Add SI metric prefix definitions")

Link: https://lore.kernel.org/all/20211124014536.782550-1-mailhol.vincent@wanadoo.fr
Suggested-by: Jimmy Assarsson <extja@kvaser.com>
Suggested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-08 10:12:58 +01:00
Yanteng Si
a97770cc40 net: phy: Remove unnecessary indentation in the comments of phy_device
Fix warning as:

linux-next/Documentation/networking/kapi:122: ./include/linux/phy.h:543: WARNING: Unexpected indentation.
linux-next/Documentation/networking/kapi:122: ./include/linux/phy.h:544: WARNING: Block quote ends without a blank line; unexpected unindent.
linux-next/Documentation/networking/kapi:122: ./include/linux/phy.h:546: WARNING: Unexpected indentation.

Suggested-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 21:26:06 -08:00
Jakub Kicinski
150791442e wireless-drivers-next patches for v5.17
First set of patches for v5.17. The biggest change is the iwlmei
 driver for Intel's AMT devices. Also now WCN6855 support in ath11k
 should be usable.
 
 Major changes:
 
 ath10k
 
 * fetch (pre-)calibration data via nvmem subsystem
 
 ath11k
 
 * enable 802.11 power save mode in station mode for qca6390 and wcn6855
 
 * trace log support
 
 * proper board file detection for WCN6855 based on PCI ids
 
 * BSS color change support
 
 rtw88
 
 * add debugfs file to force lowest basic rate
 
 * add quirk to disable PCI ASPM on HP 250 G7 Notebook PC
 
 mwifiex
 
 * add quirk to disable deep sleep with certain hardware revision in
   Surface Book 2 devices
 
 iwlwifi
 
 * add iwlmei driver for co-operating with Intel's Active Management
   Technology (AMT) devices
 -----BEGIN PGP SIGNATURE-----
 
 iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmGvclcRHGt2YWxvQGtl
 cm5lbC5vcmcACgkQbhckVSbrbZtK6wgAoOoT83JKMreXlXXhVegqlJbC3HyElF5r
 DJlpDDJkJUT9airol2nd0yFfP+5WFyQPrt5shsQmqz43U4jlgfFpFXZIjQufK+gn
 VAGvVGfsanRXEFlsVgFZeSZvAEyEyNSggxADC03Ky0xtiCGc89r2o71jD3HA/ZzO
 1X8gbKH7YLWj4G/GQkrKsvIAwzoZXT7nwQSdW73M8QVzk4OSNhLBLdiqKYq0yViM
 7Ea2Vj27hiyk/RXNUZHy+bKa8vKN5sA91VHJ836aPZBQ4OLotGzF3AgHfgIhIpdr
 hI4BVJbngpjQho1EkCnZZuISPes0YQWJB5hK5xpL98yuIL4wyJRfeQ==
 =I8Dj
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-next-2021-12-07' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for v5.17

First set of patches for v5.17. The biggest change is the iwlmei
driver for Intel's AMT devices. Also now WCN6855 support in ath11k
should be usable.

Major changes:

ath10k
 * fetch (pre-)calibration data via nvmem subsystem

ath11k
 * enable 802.11 power save mode in station mode for qca6390 and wcn6855
 * trace log support
 * proper board file detection for WCN6855 based on PCI ids
 * BSS color change support

rtw88
 * add debugfs file to force lowest basic rate
 * add quirk to disable PCI ASPM on HP 250 G7 Notebook PC

mwifiex
 * add quirk to disable deep sleep with certain hardware revision in
  Surface Book 2 devices

iwlwifi
 * add iwlmei driver for co-operating with Intel's Active Management
   Technology (AMT) devices

* tag 'wireless-drivers-next-2021-12-07' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next: (87 commits)
  iwlwifi: mei: fix linking when tracing is not enabled
  rtlwifi: rtl8192de: Style clean-ups
  mwl8k: Use named struct for memcpy() region
  intersil: Use struct_group() for memcpy() region
  libertas_tf: Use struct_group() for memcpy() region
  libertas: Use struct_group() for memcpy() region
  wlcore: no need to initialise statics to false
  rsi: Fix out-of-bounds read in rsi_read_pkt()
  rsi: Fix use-after-free in rsi_rx_done_handler()
  brcmfmac: Configure keep-alive packet on suspend
  wilc1000: remove '-Wunused-but-set-variable' warning in chip_wakeup()
  iwlwifi: mvm: read the rfkill state and feed it to iwlmei
  iwlwifi: mvm: add vendor commands needed for iwlmei
  iwlwifi: integrate with iwlmei
  iwlwifi: mei: add debugfs hooks
  iwlwifi: mei: add the driver to allow cooperation with CSME
  mei: bus: add client dma interface
  mwifiex: Ignore BTCOEX events from the 88W8897 firmware
  mwifiex: Ensure the version string from the firmware is 0-terminated
  mwifiex: Add quirk to disable deep sleep with certain hardware revision
  ...
====================

Link: https://lore.kernel.org/r/20211207144211.A9949C341C1@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 21:01:18 -08:00
Eric Dumazet
f12bf6f3f9 net: watchdog: add net device refcount tracker
Add a netdevice_tracker inside struct net_device, to track
the self reference when a device has an active watchdog timer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 20:44:58 -08:00
Eric Dumazet
19c9ebf6ed vlan: add net device refcount tracker
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 20:44:58 -08:00
Eric Dumazet
08f0b22d73 net: eql: add net device refcount tracker
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 20:44:58 -08:00
Maxim Galaganov
6fadaa5658 tcp: expose __tcp_sock_set_cork and __tcp_sock_set_nodelay
Expose __tcp_sock_set_cork() and __tcp_sock_set_nodelay() for use in
MPTCP setsockopt code -- namely for syncing MPTCP socket options with
subflows inside sync_socket_options() while already holding the subflow
socket lock.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Maxim Galaganov <max@internet.ru>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 11:36:30 -08:00
Eric Dumazet
45cac67545 net: fix recent csum changes
Vladimir reported csum issues after my recent change in skb_postpull_rcsum()

Issue here is the following:

initial skb->csum is the csum of

[part to be pulled][rest of packet]

Old code:
 skb->csum = csum_sub(skb->csum, csum_partial(pull, pull_length, 0));

New code:
 skb->csum = ~csum_partial(pull, pull_length, ~skb->csum);

This is broken if the csum of [pulled part]
happens to be equal to skb->csum, because end
result of skb->csum is 0 in new code, instead
of being 0xffffffff

David Laight suggested to use

skb->csum = -csum_partial(pull, pull_length, -skb->csum);

I based my patches on existing code present in include/net/seg6.h,
update_csum_diff4() and update_csum_diff16() which might need
a similar fix.

I guess that my tests, mostly pulling 40 bytes of IPv6 header
were not providing enough entropy to hit this bug.

v2: added wsum_negate() to make sparse happy.

Fixes: 29c3002644 ("net: optimize skb_postpull_rcsum()")
Fixes: 0bd28476f6 ("gro: optimize skb_gro_postpull_rcsum()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Cc: David Lebrun <dlebrun@google.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211204045356.3659278-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:26:46 -08:00
Eric Dumazet
5fa5ae6058 netpoll: add net device refcount tracker to struct netpoll
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:06:02 -08:00
Eric Dumazet
42120a8643 ipmr, ip6mr: add net device refcount tracker to struct vif_device
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:06:02 -08:00
Eric Dumazet
63f13937cb net: linkwatch: add net device refcount tracker
Add a netdevice_tracker inside struct net_device, to track
the self reference when a device is in lweventlist.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:44 -08:00
Eric Dumazet
c04438f58d ipv4: add net device refcount tracker to struct in_device
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:11 -08:00
Eric Dumazet
9038c32000 net: dst: add net device refcount tracking to dst_entry
We want to track all dev_hold()/dev_put() to ease leak hunting.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:10 -08:00
Eric Dumazet
0b688f24b7 net: add net device refcount tracker to struct netdev_queue
This will help debugging pesky netdev reference leaks.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:10 -08:00
Eric Dumazet
80e8921b2b net: add net device refcount tracker to struct netdev_rx_queue
This helps debugging net device refcount leaks.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:10 -08:00
Eric Dumazet
4d92b95ff2 net: add net device refcount tracker infrastructure
net device are refcounted. Over the years we had numerous bugs
caused by imbalanced dev_hold() and dev_put() calls.

The general idea is to be able to precisely pair each decrement with
a corresponding prior increment. Both share a cookie, basically
a pointer to private data storing stack traces.

This patch adds dev_hold_track() and dev_put_track().

To use these helpers, each data structure owning a refcount
should also use a "netdevice_tracker" to pair the hold and put.

netdevice_tracker dev_tracker;
...
dev_hold_track(dev, &dev_tracker, GFP_ATOMIC);
...
dev_put_track(dev, &dev_tracker);

Whenever a leak happens, we will get precise stack traces
of the point dev_hold_track() happened, at device dismantle phase.

We will also get a stack trace if too many dev_put_track() for the same
netdevice_tracker are attempted.

This is guarded by CONFIG_NET_DEV_REFCNT_TRACKER option.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:07 -08:00
Eric Dumazet
4e66934eaa lib: add reference counting tracking infrastructure
It can be hard to track where references are taken and released.

In networking, we have annoying issues at device or netns dismantles,
and we had various proposals to ease root causing them.

This patch adds new infrastructure pairing refcount increases
and decreases. This will self document code, because programmers
will have to associate increments/decrements.

This is controled by CONFIG_REF_TRACKER which can be selected
by users of this feature.

This adds both cpu and memory costs, and thus should probably be
used with care.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:04:44 -08:00
Linus Torvalds
b806bec538 regulator: Documentation fix for v5.16
A fix for bitrot in the documentation for protection interrupts that
 crept in as the code was revised during review.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmGuLzQACgkQJNaLcl1U
 h9CooQf/R65eO6RU0FqfEOP81kMHicfucJF4zcFiEJepGF8Do5ULx7eZpD+KdyLw
 zH2xPVqXlOCX9kEFgd/evYftZG3W9UMFBXUWVwNSl29hqWUkX+VuLZOK2LTpQzNO
 uH+UrYM1z4wg/6j8UrOVeROlQ28fvNwimNLWoNOy4yXJIiXLZpM04oA+2kNxyfmC
 6gG0XP3c/6zGQkqX+qKu9R3Ni5qZzqYESvgJ5d4aCZgTgU6SseDVxeA6awmbPcSf
 35d9T0tX3QNBRQT37WS0Rsdv2aVwcAuHiXwaBNm5hpcy1NrCzRKRxvIj3lQLUHmV
 5/ERi03h62tyNESyd2ik/Wve6I1Igg==
 =gA31
 -----END PGP SIGNATURE-----

Merge tag 'regulator-fix-v5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
 "Documentation fix for v5.17.

  A fix for bitrot in the documentation for protection interrupts that
  crept in as the code was revised during review"

* tag 'regulator-fix-v5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: Update protection IRQ helper docs
2021-12-06 10:14:12 -08:00
Linus Torvalds
1d213767dc - Properly init uclamp_flags of a runqueue, on first enqueuing
- Fix preempt= callback return values
 
 - Correct utime/stime resource usage reporting on nohz_full to return
 the proper times instead of shorter ones
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmGspTcACgkQEsHwGGHe
 VUqHhBAAhEd9DMoJwREKDCDMqc3pttNYpTpSVo1K6oBTsOh7mwEilPdlmsTl239V
 jRocVJST+/JmJ424j7t0Sp42tREMKNlbyf+ddvr0oUwi0mLUnN6J83NU4WK4Jisf
 gyXFIkeMR+/W6/LO7gDdq/+rlRDtJcllwHoOm1yyiy5Zc0qDrcy6CjgP5/9hEsh6
 xvRvPOXbeZZVA+a+n+G9xGN836aBe1VptoABbdAlOSTiOvAVkS95UCb9rfPTvMtq
 /71jjZMmhTxGUhg5oLpgvfRRZE608X6b2RCTcAPKa5mfMpN5YMQLcD9G0f8XZjkq
 iOO/+arE6XQJlTzhAEsGxkSXaVweYxRHHP1yAlWYlWV/xGhoaAyq/tXE1KusAnng
 16/eTbrPb1eawpI6p1AAScCQuF/TlYZCMqjbFVhViXM5Rkd6jrii9vz/JnkdokGR
 3TH0n4WAJkdZeg18WS3B0eIt6zDTvxbR9g5ap2/10xYnYHMNdHXGH8A+5Grw9/Ln
 Qsv0V43OjdUK2tVuIHYblx1X9dOlLdpTEg9FCfjiZTQVor1pTwcbG62qNMozanlf
 lQqI6f63E0jugHqhrqrfBvl4lUuoajN5SvXfBNFDIzxwWBGSdr+hJQXstUatfSZZ
 MdmJX+Dk5cAk4CpQQ1ofPvYkS3Ade0vxaL4H++KHYtRvpPvxCXA=
 =XQFF
 -----END PGP SIGNATURE-----

Merge tag 'sched_urgent_for_v5.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Borislav Petkov:

 - Properly init uclamp_flags of a runqueue, on first enqueuing

 - Fix preempt= callback return values

 - Correct utime/stime resource usage reporting on nohz_full to return
   the proper times instead of shorter ones

* tag 'sched_urgent_for_v5.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/uclamp: Fix rq->uclamp_max not set on first enqueue
  preempt/dynamic: Fix setup_preempt_mode() return value
  sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full
2021-12-05 08:53:31 -08:00
Hou Tao
866de40744 bpf: Disallow BPF_LOG_KERNEL log level for bpf(BPF_BTF_LOAD)
BPF_LOG_KERNEL is only used internally, so disallow bpf_btf_load()
to set log level as BPF_LOG_KERNEL. The same checking has already
been done in bpf_check(), so factor out a helper to check the
validity of log attributes and use it in both places.

Fixes: 8580ac9404 ("bpf: Process in-kernel BTF")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211203053001.740945-1-houtao1@huawei.com
2021-12-04 10:10:24 -08:00
Manish Chopra
823163ba6e qed*: esl priv flag support through ethtool
ESL(Enhanced System Lockdown) was designed to lock PCI adapter firmware
images and prevent changes to critical non-volatile configuration data
so that uncontrolled, malicious or unintentional modification to the
adapters are avoided, ensuring it's operational state. Once this feature is
enabled, the device is locked, rejecting any modification to non-volatile
images. Once unlocked, the protection is off such that firmware and
non-volatile configurations may be altered.

Driver just reflects the capability and status of this through
the ethtool private flag.

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-03 18:24:21 -08:00
Manish Chopra
0cc3a80179 qed*: enhance tx timeout debug info
This patch add some new qed APIs to query status block
info and report various data to MFW on tx timeout event

Along with that it enhances qede to dump more debug logs
(not just specific to the queue which was reported by stack)
on tx timeout which includes various other basic metadata about
all tx queues and other info (like status block etc.)

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-03 18:24:20 -08:00
Jakub Kicinski
8581fd402a treewide: Add missing includes masked by cgroup -> bpf dependency
cgroup.h (therefore swap.h, therefore half of the universe)
includes bpf.h which in turn includes module.h and slab.h.
Since we're about to get rid of that dependency we need
to clean things up.

v2: drop the cpu.h include from cacheinfo.h, it's not necessary
and it makes riscv sensitive to ordering of include files.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Krzysztof Wilczyński <kw@linux.com>
Acked-by: Peter Chen <peter.chen@kernel.org>
Acked-by: SeongJae Park <sj@kernel.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/all/20211120035253.72074-1-kuba@kernel.org/  # v1
Link: https://lore.kernel.org/all/20211120165528.197359-1-kuba@kernel.org/ # cacheinfo discussion
Link: https://lore.kernel.org/bpf/20211202203400.1208663-1-kuba@kernel.org
2021-12-03 10:58:13 -08:00
Avihai Horon
b247f32aec net/mlx5: Dynamically resize flow counters query buffer
The flow counters bulk query buffer is allocated once during
mlx5_fc_init_stats(). For PFs and VFs this buffer usually takes a little
more than 512KB of memory, which is aligned to the next power of 2, to
1MB. For SFs, this buffer is reduced and takes around 128 Bytes.

The buffer size determines the maximum number of flow counters that
can be queried at a time. Thus, having a bigger buffer can improve
performance for users that need to query many flow counters.

There are cases that don't use many flow counters and don't need a big
buffer (e.g. SFs, VFs). Since this size is critical with large scale,
in these cases the buffer size should be reduced.

In order to reduce memory consumption while maintaining query
performance, change the query buffer's allocation scheme to the
following:
- First allocate the buffer with small initial size.
- If the number of counters surpasses the initial size, resize the
  buffer to the maximum size.

The buffer only grows and isn't shrank, because users with many flow
counters don't care about the buffer size and we don't want to add
resize overhead if the current number of counters drops.

This solution is preferable to the current one, which is less accurate
and only addresses SFs.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-02 16:53:16 -08:00
Kumar Kartikeya Dwivedi
d9847eb8be bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
Vinicius Costa Gomes reported [0] that build fails when
CONFIG_DEBUG_INFO_BTF is enabled and CONFIG_BPF_SYSCALL is disabled.
This leads to btf.c not being compiled, and then no symbol being present
in vmlinux for the declarations in btf.h. Since BTF is not useful
without enabling BPF subsystem, disallow this combination.

However, theoretically disabling both now could still fail, as the
symbol for kfunc_btf_id_list variables is not available. This isn't a
problem as the compiler usually optimizes the whole register/unregister
call, but at lower optimization levels it can fail the build in linking
stage.

Fix that by adding dummy variables so that modules taking address of
them still work, but the whole thing is a noop.

  [0]: https://lore.kernel.org/bpf/20211110205418.332403-1-vinicius.gomes@intel.com

Fixes: 14f267d95f ("bpf: btf: Introduce helpers for dynamic BTF set registration")
Reported-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211122144742.477787-2-memxor@gmail.com
2021-12-02 13:39:46 -08:00
Jakub Kicinski
fc993be36f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-02 11:44:56 -08:00
Linus Torvalds
a51e3ac43d Networking fixes for 5.16-rc4, including fixes from wireless,
and wireguard.
 
 Current release - regressions:
 
  - smc: keep smc_close_final()'s error code during active close
 
 Current release - new code bugs:
 
  - iwlwifi: various static checker fixes (int overflow, leaks, missing
    error codes)
 
  - rtw89: fix size of firmware header before transfer, avoid crash
 
  - mt76: fix timestamp check in tx_status; fix pktid leak;
 
  - mscc: ocelot: fix missing unlock on error in ocelot_hwstamp_set()
 
 Previous releases - regressions:
 
  - smc: fix list corruption in smc_lgr_cleanup_early
 
  - ipv4: convert fib_num_tclassid_users to atomic_t
 
 Previous releases - always broken:
 
  - tls: fix authentication failure in CCM mode
 
  - vrf: reset IPCB/IP6CB when processing outbound pkts, prevent
    incorrect processing
 
  - dsa: mv88e6xxx: fixes for various device errata
 
  - rds: correct socket tunable error in rds_tcp_tune()
 
  - ipv6: fix memory leak in fib6_rule_suppress
 
  - wireguard: reset peer src endpoint when netns exits
 
  - wireguard: improve resilience to DoS around incoming handshakes
 
  - tcp: fix page frag corruption on page fault which involves TCP
 
  - mpls: fix missing attributes in delete notifications
 
  - mt7915: fix NULL pointer dereference with ad-hoc mode
 
 Misc:
 
  - rt2x00: be more lenient about EPROTO errors during start
 
  - mlx4_en: update reported link modes for 1/10G
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGo6KMACgkQMUZtbf5S
 Iruq+BAAhRMTcL+X4eRIL9lIEvWEKHMKLCA/pUaQWNlSxsbEeydJWRNSc37Cs3pv
 z0rYIEhfieOz8+QXS1Kq+yZwJVXjA8Jvgld2qw9V9Y5w+N15Mj8RUtG8NaUw+o4E
 U8PCAbaamnbzyPdlCYcVHschd8MD0BCXm5+jAGeIyCP+KQCnhEpFZv+bvHaWzQR8
 FZLYrhXTR9W0DFsrKG9+haqFwFBR3+VDqTGILhaHPE+r2o6wKQQ5yJMhd8fq0SaC
 nne8zDkGuFEeW3cxj0VbhdRMyrV97eMK+P4dZ2P0Z7xcrsed9/2XJkNQNJGtuRnj
 GGJV6utupJRAY+lnJNUkifqS4Wt7KirfZsSsyaKKa4plyoVgtGhiqEYFTQVLagC0
 CF4Qe+3qks6rESbRu6PEFN4oWSkMEhRzdcDpg7vBDURUKcrRs9fgtNUJUCi8nKFA
 A/F/K+7IHBoBZyQYZbYmnGdNsNauKbF3rUY3hwMGBfQZIr/wsql9+jhtLsmZX77m
 V/L7KzT2jhhNc5gDzuLps25K3P7snKuV19qQSsY2LeuGj1x3gmWZ+ibN6ynhB+Gt
 KBnfHDMTI/4aciZBIbwJmwfeRhCF8tOfw0WZdUP7FRIXukbfVuDBoznWLz4BKKgf
 GSYSTNDs/PHZQo5vCQ/onvTwUK5aN6zoPNy5ih7lp9YZBYtN2TI=
 =r0Jh
 -----END PGP SIGNATURE-----

Merge tag 'net-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from wireless, and wireguard.

  Mostly scattered driver changes this week, with one big clump in
  mv88e6xxx. Nothing of note, really.

  Current release - regressions:

   - smc: keep smc_close_final()'s error code during active close

  Current release - new code bugs:

   - iwlwifi: various static checker fixes (int overflow, leaks, missing
     error codes)

   - rtw89: fix size of firmware header before transfer, avoid crash

   - mt76: fix timestamp check in tx_status; fix pktid leak;

   - mscc: ocelot: fix missing unlock on error in ocelot_hwstamp_set()

  Previous releases - regressions:

   - smc: fix list corruption in smc_lgr_cleanup_early

   - ipv4: convert fib_num_tclassid_users to atomic_t

  Previous releases - always broken:

   - tls: fix authentication failure in CCM mode

   - vrf: reset IPCB/IP6CB when processing outbound pkts, prevent
     incorrect processing

   - dsa: mv88e6xxx: fixes for various device errata

   - rds: correct socket tunable error in rds_tcp_tune()

   - ipv6: fix memory leak in fib6_rule_suppress

   - wireguard: reset peer src endpoint when netns exits

   - wireguard: improve resilience to DoS around incoming handshakes

   - tcp: fix page frag corruption on page fault which involves TCP

   - mpls: fix missing attributes in delete notifications

   - mt7915: fix NULL pointer dereference with ad-hoc mode

  Misc:

   - rt2x00: be more lenient about EPROTO errors during start

   - mlx4_en: update reported link modes for 1/10G"

* tag 'net-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
  net: dsa: b53: Add SPI ID table
  gro: Fix inconsistent indenting
  selftests: net: Correct case name
  net/rds: correct socket tunable error in rds_tcp_tune()
  mctp: Don't let RTM_DELROUTE delete local routes
  net/smc: Keep smc_close_final rc during active close
  ibmvnic: drop bad optimization in reuse_tx_pools()
  ibmvnic: drop bad optimization in reuse_rx_pools()
  net/smc: fix wrong list_del in smc_lgr_cleanup_early
  Fix Comment of ETH_P_802_3_MIN
  ethernet: aquantia: Try MAC address from device tree
  ipv4: convert fib_num_tclassid_users to atomic_t
  net: avoid uninit-value from tcp_conn_request
  net: annotate data-races on txq->xmit_lock_owner
  octeontx2-af: Fix a memleak bug in rvu_mbox_init()
  net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()
  vrf: Reset IPCB/IP6CB when processing outbound pkts in vrf dev xmit
  net: qlogic: qlcnic: Fix a NULL pointer dereference in qlcnic_83xx_add_rings()
  net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed
  net: dsa: mv88e6xxx: Fix inband AN for 2500base-x on 88E6393X family
  ...
2021-12-02 11:22:06 -08:00