Commit graph

75300 commits

Author SHA1 Message Date
Jakub Kicinski
846c3c9cfe wireless-drivers-next patches for v5.11
First set of patches for v5.11. rtw88 getting improvements to work
 better with Bluetooth and other driver also getting some new features.
 mhi-ath11k-immutable branch was pulled from mhi tree to avoid
 conflicts with mhi tree.
 
 Major changes:
 
 rtw88
 
 * major bluetooth co-existance improvements
 
 wilc1000
 
 * Wi-Fi Multimedia (WMM) support
 
 ath11k
 
 * Fast Initial Link Setup (FILS) discovery and unsolicited broadcast
   probe response support
 
 * qcom,ath11k-calibration-variant Device Tree setting
 
 * cold boot calibration support
 
 * new DFS region: JP
 
 wnc36xx
 
 * enable connection monitoring and keepalive in firmware
 
 ath10k
 
 * firmware IRAM recovery feature
 
 mhi
 
 * merge mhi-ath11k-immutable branch to make MHI API change go smoothly
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJfyTQyAAoJEG4XJFUm622bCdcIAIyVnqdW7pnoDmWIyQmAEnD9
 vGARkzghPHXnufpOzohyDdxT12X9klhrxSVIgzEgH1/pl3i1PpnF6KXyGFCC44Lw
 wrLXhQygPzmIW1IZtJJE3G72WExXoRjWx6LD1I7C7oEIduqFixXADmK2tKzFp795
 Jxum+sOeT6+Dk1OvO/fIroBHX73mRE9zAuiTIMpt2G1j8uXs9QVfcTbTrUshLASN
 0sX9J6JutltBuM4G7+bFpVzKnLnlQ7ebUaF6nvTCQsgHWZwkS7yAubSWX9sFohbR
 UXgQHNE83s/esOg7nBxAfqTKP8mbxsobmxZtxE5GR5vFY5FJDxqP9Zc2KzPp39w=
 =CbX/
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-next-2020-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for v5.11

First set of patches for v5.11. rtw88 getting improvements to work
better with Bluetooth and other driver also getting some new features.
mhi-ath11k-immutable branch was pulled from mhi tree to avoid
conflicts with mhi tree.

Major changes:

rtw88
 * major bluetooth co-existance improvements
wilc1000
 * Wi-Fi Multimedia (WMM) support
ath11k
 * Fast Initial Link Setup (FILS) discovery and unsolicited broadcast
   probe response support
 * qcom,ath11k-calibration-variant Device Tree setting
 * cold boot calibration support
 * new DFS region: JP
wnc36xx
 * enable connection monitoring and keepalive in firmware
ath10k
 * firmware IRAM recovery feature
mhi
 * merge mhi-ath11k-immutable branch to make MHI API change go smoothly

* tag 'wireless-drivers-next-2020-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next: (180 commits)
  wl1251: remove trailing semicolon in macro definition
  airo: remove trailing semicolon in macro definition
  wilc1000: added queue support for WMM
  wilc1000: call complete() for failure in wilc_wlan_txq_add_cfg_pkt()
  wilc1000: free resource in wilc_wlan_txq_add_mgmt_pkt() for failure path
  wilc1000: free resource in wilc_wlan_txq_add_net_pkt() for failure path
  wilc1000: added 'ndo_set_mac_address' callback support
  brcmfmac: expose firmware config files through modinfo
  wlcore: Switch to using the new API kobj_to_dev()
  rtw88: coex: add feature to enhance HID coexistence performance
  rtw88: coex: upgrade coexistence A2DP mechanism
  rtw88: coex: add action for coexistence in hardware initial
  rtw88: coex: add function to avoid cck lock
  rtw88: coex: change the coexistence mechanism for WLAN connected
  rtw88: coex: change the coexistence mechanism for HID
  rtw88: coex: update AFH information while in free-run mode
  rtw88: coex: update the mechanism for A2DP + PAN
  rtw88: coex: add debug message
  rtw88: coex: run coexistence when WLAN entering/leaving LPS
  Revert "rtl8xxxu: Add Buffalo WI-U3-866D to list of supported devices"
  ...
====================

Link: https://lore.kernel.org/r/20201203185732.9CFA5C433ED@smtp.codeaurora.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04 10:56:37 -08:00
Jakub Kicinski
a1dd1d8697 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-12-03

The main changes are:

1) Support BTF in kernel modules, from Andrii.

2) Introduce preferred busy-polling, from Björn.

3) bpf_ima_inode_hash() and bpf_bprm_opts_set() helpers, from KP Singh.

4) Memcg-based memory accounting for bpf objects, from Roman.

5) Allow bpf_{s,g}etsockopt from cgroup bind{4,6} hooks, from Stanislav.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (118 commits)
  selftests/bpf: Fix invalid use of strncat in test_sockmap
  libbpf: Use memcpy instead of strncpy to please GCC
  selftests/bpf: Add fentry/fexit/fmod_ret selftest for kernel module
  selftests/bpf: Add tp_btf CO-RE reloc test for modules
  libbpf: Support attachment of BPF tracing programs to kernel modules
  libbpf: Factor out low-level BPF program loading helper
  bpf: Allow to specify kernel module BTFs when attaching BPF programs
  bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier
  selftests/bpf: Add CO-RE relocs selftest relying on kernel module BTF
  selftests/bpf: Add support for marking sub-tests as skipped
  selftests/bpf: Add bpf_testmod kernel module for testing
  libbpf: Add kernel module BTF support for CO-RE relocations
  libbpf: Refactor CO-RE relocs to not assume a single BTF object
  libbpf: Add internal helper to load BTF data by FD
  bpf: Keep module's btf_data_size intact after load
  bpf: Fix bpf_put_raw_tracepoint()'s use of __module_address()
  selftests/bpf: Add Userspace tests for TCP_WINDOW_CLAMP
  bpf: Adds support for setting window clamp
  samples/bpf: Fix spelling mistake "recieving" -> "receiving"
  bpf: Fix cold build of test_progs-no_alu32
  ...
====================

Link: https://lore.kernel.org/r/20201204021936.85653-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04 07:48:12 -08:00
Andrii Nakryiko
290248a5b7 bpf: Allow to specify kernel module BTFs when attaching BPF programs
Add ability for user-space programs to specify non-vmlinux BTF when attaching
BTF-powered BPF programs: raw_tp, fentry/fexit/fmod_ret, LSM, etc. For this,
attach_prog_fd (now with the alias name attach_btf_obj_fd) should specify FD
of a module or vmlinux BTF object. For backwards compatibility reasons,
0 denotes vmlinux BTF. Only kernel BTF (vmlinux or module) can be specified.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-11-andrii@kernel.org
2020-12-03 17:38:21 -08:00
Andrii Nakryiko
22dc4a0f5e bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier
Remove a permeating assumption thoughout BPF verifier of vmlinux BTF. Instead,
wherever BTF type IDs are involved, also track the instance of struct btf that
goes along with the type ID. This allows to gradually add support for kernel
module BTFs and using/tracking module types across BPF helper calls and
registers.

This patch also renames btf_id() function to btf_obj_id() to minimize naming
clash with using btf_id to denote BTF *type* ID, rather than BTF *object*'s ID.

Also, altough btf_vmlinux can't get destructed and thus doesn't need
refcounting, module BTFs need that, so apply BTF refcounting universally when
BPF program is using BTF-powered attachment (tp_btf, fentry/fexit, etc). This
makes for simpler clean up code.

Now that BTF type ID is not enough to uniquely identify a BTF type, extend BPF
trampoline key to include BTF object ID. To differentiate that from target
program BPF ID, set 31st bit of type ID. BTF type IDs (at least currently) are
not allowed to take full 32 bits, so there is no danger of confusing that bit
with a valid BTF type ID.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-10-andrii@kernel.org
2020-12-03 17:38:21 -08:00
Jakub Kicinski
55fd59b003 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:
	drivers/net/ethernet/ibm/ibmvnic.c

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-03 15:44:09 -08:00
Linus Torvalds
bbe2ba04c5 Networking fixes for 5.10-rc7, including fixes from bpf, netfilter,
wireless drivers, wireless mesh and can.
 
 Current release - regressions:
 
  - mt76: usb: fix crash on device removal
 
 Current release - always broken:
 
  - xsk: Fix umem cleanup from wrong context in socket destruct
 
 Previous release - regressions:
 
  - net: ip6_gre: set dev->hard_header_len when using header_ops
 
  - ipv4: Fix TOS mask in inet_rtm_getroute()
 
  - net, xsk: Avoid taking multiple skbuff references
 
 Previous release - always broken:
 
  - net/x25: prevent a couple of overflows
 
  - netfilter: ipset: prevent uninit-value in hash_ip6_add
 
  - geneve: pull IP header before ECN decapsulation
 
  - mpls: ensure LSE is pullable in TC and openvswitch paths
 
  - vxlan: respect needed_headroom of lower device
 
  - batman-adv: Consider fragmentation for needed packet headroom
 
  - can: drivers: don't count arbitration loss as an error
 
  - netfilter: bridge: reset skb->pkt_type after POST_ROUTING
               traversal
 
  - inet_ecn: Fix endianness of checksum update when setting ECT(1)
 
  - ibmvnic: fix various corner cases around reset handling
 
  - net/mlx5: fix rejecting unsupported Connect-X6DX SW steering
 
  - net/mlx5: Enforce HW TX csum offload with kTLS
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl/JS3sACgkQMUZtbf5S
 Irs7QA/9ELcJ2gklCJwrlVGXNhUddGpZH9OX2K3WL/c1ZzgARt3e0jkO88lY25Tk
 tXTRTelx7xzHUNBmXJhBx1Wj8H+S/5A1FLMdl3ZqkeFrvrYIUxSvnbRoFB0CALrV
 OXYtsd7P86BHrT5hQNGte9V5JV5LpYAUvH6+QSD7mWOzul0gtIcKEJ7claypYuRT
 hm+wt2ENSRU3bNNwOVG8SoA1CEFFXePfyqEr6cBTs+1/OyzYV4880LvJXVdwwOx0
 DogwsPt5L53Y2uoOaFKVRr2SUVzOi9Y79FAX3rfqIqoi89xcbK6ihHsb4ldGxkAy
 ILZEU/Y4lB6YsdtJjGGrB7cPhiWOl0AzPYgmOczWHw/5LMzgWKEt6H/JvkjGSlQJ
 pXixi6/cmsQOS6o5ydQT9Iu5qLMOOduv2mmQmOPJHkq8/SgiYTuTUiJkXgL8pPv+
 Mq4Qm4JL+6aB2WL0NNzlqjVnIbFQmmGdrYGWdQnSeTN6X4T/uFQIz4fSQlQmFils
 qw1MBLZfhgjc4npfC0j5LdcABhC0BwEGelTJBKnc6+MbZlDTv2NdzP7wldzpjalR
 /a0/hLHsDMCkft92BQ3jp0C1LSikSYAhBPRJLSQiQbxzBv5JnDr6S5WpBTtBoDKT
 LdEqlS+mo0GwRK3pm2vSHQ4iVJY9v0PV0SbeJXH/SlJGYieUqJc=
 =HskU
 -----END PGP SIGNATURE-----

Merge tag 'net-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes for 5.10-rc7, including fixes from bpf, netfilter,
  wireless drivers, wireless mesh and can.

  Current release - regressions:

   - mt76: usb: fix crash on device removal

  Current release - always broken:

   - xsk: Fix umem cleanup from wrong context in socket destruct

  Previous release - regressions:

   - net: ip6_gre: set dev->hard_header_len when using header_ops

   - ipv4: Fix TOS mask in inet_rtm_getroute()

   - net, xsk: Avoid taking multiple skbuff references

  Previous release - always broken:

   - net/x25: prevent a couple of overflows

   - netfilter: ipset: prevent uninit-value in hash_ip6_add

   - geneve: pull IP header before ECN decapsulation

   - mpls: ensure LSE is pullable in TC and openvswitch paths

   - vxlan: respect needed_headroom of lower device

   - batman-adv: Consider fragmentation for needed packet headroom

   - can: drivers: don't count arbitration loss as an error

   - netfilter: bridge: reset skb->pkt_type after POST_ROUTING traversal

   - inet_ecn: Fix endianness of checksum update when setting ECT(1)

   - ibmvnic: fix various corner cases around reset handling

   - net/mlx5: fix rejecting unsupported Connect-X6DX SW steering

   - net/mlx5: Enforce HW TX csum offload with kTLS"

* tag 'net-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (62 commits)
  net/mlx5: DR, Proper handling of unsupported Connect-X6DX SW steering
  net/mlx5e: kTLS, Enforce HW TX csum offload with kTLS
  net: mlx5e: fix fs_tcp.c build when IPV6 is not enabled
  net/mlx5: Fix wrong address reclaim when command interface is down
  net/sched: act_mpls: ensure LSE is pullable before reading it
  net: openvswitch: ensure LSE is pullable before reading it
  net: skbuff: ensure LSE is pullable before decrementing the MPLS ttl
  net: mvpp2: Fix error return code in mvpp2_open()
  chelsio/chtls: fix a double free in chtls_setkey()
  rtw88: debug: Fix uninitialized memory in debugfs code
  vxlan: fix error return code in __vxlan_dev_create()
  net: pasemi: fix error return code in pasemi_mac_open()
  cxgb3: fix error return code in t3_sge_alloc_qset()
  net/x25: prevent a couple of overflows
  dpaa_eth: copy timestamp fields to new skb in A-050385 workaround
  net: ip6_gre: set dev->hard_header_len when using header_ops
  mt76: usb: fix crash on device removal
  iwlwifi: pcie: add some missing entries for AX210
  iwlwifi: pcie: invert values of NO_160 device config entries
  iwlwifi: pcie: add one missing entry for AX210
  ...
2020-12-03 13:10:11 -08:00
Florian Westphal
41dd9596d6 security: add const qualifier to struct sock in various places
A followup change to tcp_request_sock_op would have to drop the 'const'
qualifier from the 'route_req' function as the
'security_inet_conn_request' call is moved there - and that function
expects a 'struct sock *'.

However, it turns out its also possible to add a const qualifier to
security_inet_conn_request instead.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-03 12:56:03 -08:00
Yevgeny Kliteynik
d421e466c2 net/mlx5: DR, Proper handling of unsupported Connect-X6DX SW steering
STEs format for Connect-X5 and Connect-X6DX different. Currently, on
Connext-X6DX the SW steering would break at some point when building STEs
w/o giving a proper error message. Fix this by checking the STE format of
the current device when initializing domain: add mlx5_ifc definitions for
Connect-X6DX SW steering, read FW capability to get the current format
version, and check this version when domain is being created.

Fixes: 26d688e33f ("net/mlx5: DR, Add Steering entry (STE) utilities")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-03 11:18:36 -08:00
Thomas Karlsson
d4bff72c84 macvlan: Support for high multicast packet rate
Background:
Broadcast and multicast packages are enqueued for later processing.
This queue was previously hardcoded to 1000.

This proved insufficient for handling very high packet rates.
This resulted in packet drops for multicast.
While at the same time unicast worked fine.

The change:
This patch make the queue length adjustable to accommodate
for environments with very high multicast packet rate.
But still keeps the default value of 1000 unless specified.

The queue length is specified as a request per macvlan
using the IFLA_MACVLAN_BC_QUEUE_LEN parameter.

The actual used queue length will then be the maximum of
any macvlan connected to the same port. The actual used
queue length for the port can be retrieved (read only)
by the IFLA_MACVLAN_BC_QUEUE_LEN_USED parameter for verification.

This will be followed up by a patch to iproute2
in order to adjust the parameter from userspace.

Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
Link: https://lore.kernel.org/r/dd4673b2-7eab-edda-6815-85c67ce87f63@paneda.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-03 08:21:29 -08:00
Roman Gushchin
3ac1f01b43 bpf: Eliminate rlimit-based memory accounting for bpf progs
Do not use rlimit-based memory accounting for bpf progs. It has been
replaced with memcg-based memory accounting.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201201215900.3569844-34-guro@fb.com
2020-12-02 18:32:47 -08:00
Roman Gushchin
80ee81e040 bpf: Eliminate rlimit-based memory accounting infra for bpf maps
Remove rlimit-based accounting infrastructure code, which is not used
anymore.

To provide a backward compatibility, use an approximation of the
bpf map memory footprint as a "memlock" value, available to a user
via map info. The approximation is based on the maximal number of
elements and key and value sizes.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201201215900.3569844-33-guro@fb.com
2020-12-02 18:32:47 -08:00
Roman Gushchin
48edc1f78a bpf: Prepare for memcg-based memory accounting for bpf maps
Bpf maps can be updated from an interrupt context and in such
case there is no process which can be charged. It makes the memory
accounting of bpf maps non-trivial.

Fortunately, after commit 4127c6504f ("mm: kmem: enable kernel
memcg accounting from interrupt contexts") and commit b87d8cefe4
("mm, memcg: rework remote charging API to support nesting")
it's finally possible.

To make the ownership model simple and consistent, when the map
is created, the memory cgroup of the current process is recorded.
All subsequent allocations related to the bpf map are charged to
the same memory cgroup. It includes allocations made by any processes
(even if they do belong to a different cgroup) and from interrupts.

This commit introduces 3 new helpers, which will be used by following
commits to enable the accounting of bpf maps memory:
  - bpf_map_kmalloc_node()
  - bpf_map_kzalloc()
  - bpf_map_alloc_percpu()

They are wrapping popular memory allocation functions. They set
the active memory cgroup to the map's memory cgroup and add
__GFP_ACCOUNT to the passed gfp flags. Then they call into
the corresponding memory allocation function and restore
the original active memory cgroup.

These helpers are supposed to use everywhere except the map creation
path. During the map creation when the map structure is allocated by
itself, it cannot be passed to those helpers. In those cases default
memory allocation function will be used with the __GFP_ACCOUNT flag.

Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201201215900.3569844-7-guro@fb.com
2020-12-02 18:32:34 -08:00
Roman Gushchin
18b2db3b03 mm: Convert page kmemcg type to a page memcg flag
PageKmemcg flag is currently defined as a page type (like buddy, offline,
table and guard).  Semantically it means that the page was accounted as a
kernel memory by the page allocator and has to be uncharged on the
release.

As a side effect of defining the flag as a page type, the accounted page
can't be mapped to userspace (look at page_has_type() and comments above).
In particular, this blocks the accounting of vmalloc-backed memory used
by some bpf maps, because these maps do map the memory to userspace.

One option is to fix it by complicating the access to page->mapcount,
which provides some free bits for page->page_type.

But it's way better to move this flag into page->memcg_data flags.
Indeed, the flag makes no sense without enabled memory cgroups and memory
cgroup pointer set in particular.

This commit replaces PageKmemcg() and __SetPageKmemcg() with
PageMemcgKmem() and an open-coded OR operation setting the memcg pointer
with the MEMCG_DATA_KMEM bit.  __ClearPageKmemcg() can be simple deleted,
as the whole memcg_data is zeroed at once.

As a bonus, on !CONFIG_MEMCG build the PageMemcgKmem() check will be
compiled out.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Link: https://lkml.kernel.org/r/20201027001657.3398190-5-guro@fb.com
Link: https://lore.kernel.org/bpf/20201201215900.3569844-5-guro@fb.com
2020-12-02 18:28:06 -08:00
Roman Gushchin
87944e2992 mm: Introduce page memcg flags
The lowest bit in page->memcg_data is used to distinguish between struct
memory_cgroup pointer and a pointer to a objcgs array.  All checks and
modifications of this bit are open-coded.

Let's formalize it using page memcg flags, defined in enum
page_memcg_data_flags.

Additional flags might be added later.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Link: https://lkml.kernel.org/r/20201027001657.3398190-4-guro@fb.com
Link: https://lore.kernel.org/bpf/20201201215900.3569844-4-guro@fb.com
2020-12-02 18:28:06 -08:00
Roman Gushchin
270c6a7146 mm: memcontrol/slab: Use helpers to access slab page's memcg_data
To gather all direct accesses to struct page's memcg_data field in one
place, let's introduce 3 new helpers to use in the slab accounting code:

  struct obj_cgroup **page_objcgs(struct page *page);
  struct obj_cgroup **page_objcgs_check(struct page *page);
  bool set_page_objcgs(struct page *page, struct obj_cgroup **objcgs);

They are similar to the corresponding API for generic pages, except that
the setter can return false, indicating that the value has been already
set from a different thread.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lkml.kernel.org/r/20201027001657.3398190-3-guro@fb.com
Link: https://lore.kernel.org/bpf/20201201215900.3569844-3-guro@fb.com
2020-12-02 18:28:06 -08:00
Roman Gushchin
bcfe06bf26 mm: memcontrol: Use helpers to read page's memcg data
Patch series "mm: allow mapping accounted kernel pages to userspace", v6.

Currently a non-slab kernel page which has been charged to a memory cgroup
can't be mapped to userspace.  The underlying reason is simple: PageKmemcg
flag is defined as a page type (like buddy, offline, etc), so it takes a
bit from a page->mapped counter.  Pages with a type set can't be mapped to
userspace.

But in general the kmemcg flag has nothing to do with mapping to
userspace.  It only means that the page has been accounted by the page
allocator, so it has to be properly uncharged on release.

Some bpf maps are mapping the vmalloc-based memory to userspace, and their
memory can't be accounted because of this implementation detail.

This patchset removes this limitation by moving the PageKmemcg flag into
one of the free bits of the page->mem_cgroup pointer.  Also it formalizes
accesses to the page->mem_cgroup and page->obj_cgroups using new helpers,
adds several checks and removes a couple of obsolete functions.  As the
result the code became more robust with fewer open-coded bit tricks.

This patch (of 4):

Currently there are many open-coded reads of the page->mem_cgroup pointer,
as well as a couple of read helpers, which are barely used.

It creates an obstacle on a way to reuse some bits of the pointer for
storing additional bits of information.  In fact, we already do this for
slab pages, where the last bit indicates that a pointer has an attached
vector of objcg pointers instead of a regular memcg pointer.

This commits uses 2 existing helpers and introduces a new helper to
converts all read sides to calls of these helpers:
  struct mem_cgroup *page_memcg(struct page *page);
  struct mem_cgroup *page_memcg_rcu(struct page *page);
  struct mem_cgroup *page_memcg_check(struct page *page);

page_memcg_check() is intended to be used in cases when the page can be a
slab page and have a memcg pointer pointing at objcg vector.  It does
check the lowest bit, and if set, returns NULL.  page_memcg() contains a
VM_BUG_ON_PAGE() check for the page not being a slab page.

To make sure nobody uses a direct access, struct page's
mem_cgroup/obj_cgroups is converted to unsigned long memcg_data.

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Link: https://lkml.kernel.org/r/20201027001657.3398190-1-guro@fb.com
Link: https://lkml.kernel.org/r/20201027001657.3398190-2-guro@fb.com
Link: https://lore.kernel.org/bpf/20201201215900.3569844-2-guro@fb.com
2020-12-02 18:28:05 -08:00
Jakub Kicinski
32e417024f mlx5-next-2020-12-02
Low level mlx5 updates required by both netdev and rdma trees:
 
   net/mlx5: Treat host PF vport as other (non eswitch manager) vport
   net/mlx5: Enable host PF HCA after eswitch is initialized
   net/mlx5: Rename peer_pf to host_pf
   net/mlx5: Make API mlx5_core_is_ecpf accept const pointer
   net/mlx5: Export steering related functions
   net/mlx5: Expose other function ifc bits
   net/mlx5: Expose IP-in-IP TX and RX capability bits
   net/mlx5: Update the hardware interface definition for vhca state
   net/mlx5: Update the list of the PCI supported devices
   net/mlx5: Avoid exposing driver internal command helpers
   net/mlx5: Add ts_cqe_to_dest_cqn related bits
   net/mlx5: Add misc4 to mlx5_ifc_fte_match_param_bits
   net/mlx5: Check dr mask size against mlx5_match_param size
   net/mlx5: Add sampler destination type
   net/mlx5: Add sample offload hardware bits and structures
 
 Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl/IOZcACgkQSD+KveBX
 +j4J8wgAuxwflrYrbCWXV7LE08J7T7ZHRDE+jEbaZ0Zp9mOsYDDpcifpKwy2EVRf
 RKcpMYh/GzAljmEpeWIAlMxmlpXhKWXTDruWCx73r1jvdXf/RU24/zQHa6BjeiDo
 rMB8bgiW4a66+z4LcN/U6ahbVM5gScBNEt2sS1OIi9ZInngGVo9FgfhYMpERPNcH
 3+mcHulCnGBNbbLwoTllOcgbxexn+xoByukg5Z0ddBJp007DMjzBIWDpDS0y2HaT
 jGo1LYONgRc3zoGVmdeu9F+tSsWBIgsaiyGxKj1T/8sZUaNz2TKj9VOiYIj9BLff
 cp6GRc88k7HWA4tImSHQiLbK6cx+yA==
 =mjvI
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-next-2020-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Saeed Mahameed says:

====================
mlx5-next-2020-12-02

Low level mlx5 updates required by both netdev and rdma trees.

* tag 'mlx5-next-2020-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Treat host PF vport as other (non eswitch manager) vport
  net/mlx5: Enable host PF HCA after eswitch is initialized
  net/mlx5: Rename peer_pf to host_pf
  net/mlx5: Make API mlx5_core_is_ecpf accept const pointer
  net/mlx5: Export steering related functions
  net/mlx5: Expose other function ifc bits
  net/mlx5: Expose IP-in-IP TX and RX capability bits
  net/mlx5: Update the hardware interface definition for vhca state
  net/mlx5: Update the list of the PCI supported devices
  net/mlx5: Avoid exposing driver internal command helpers
  net/mlx5: Add ts_cqe_to_dest_cqn related bits
  net/mlx5: Add misc4 to mlx5_ifc_fte_match_param_bits
  net/mlx5: Check dr mask size against mlx5_match_param size
  net/mlx5: Add sampler destination type
  net/mlx5: Add sample offload hardware bits and structures
====================

Link: https://lore.kernel.org/r/20201203011010.213440-1-saeedm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-02 17:29:23 -08:00
Stanislav Fomichev
427167c0b0 bpf: Allow bpf_{s,g}etsockopt from cgroup bind{4,6} hooks
I have to now lock/unlock socket for the bind hook execution.
That shouldn't cause any overhead because the socket is unbound
and shouldn't receive any traffic.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/20201202172516.3483656-3-sdf@google.com
2020-12-02 13:25:11 -08:00
Vladimir Oltean
c214550ff8 net: delete __dev_getfirstbyhwtype
The last user of the RTNL brother of dev_getfirstbyhwtype (the latter
being synchronized under RCU) has been deleted in commit b4db2b35fc
("afs: Use core kernel UUID generation").

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20201129200550.2433401-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 15:40:01 -08:00
Linus Torvalds
ef6900acc8 Tracing fixes for 5.10-rc6
- Use correct timestamp variable for ring buffer write stamp update
  - Fix up before stamp and write stamp when crossing ring buffer sub
    buffers
  - Keep a zero delta in ring buffer in slow path if cmpxchg fails
  - Fix trace_printk static buffer for archs that care
  - Fix ftrace record accounting for ftrace ops with trampolines
  - Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency
  - Remove WARN_ON in hwlat tracer that triggers on something that is OK
  - Make "my_tramp" trampoline in ftrace direct sample code global
  - Fixes in the bootconfig tool for better alignment management
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCX8ZzghQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qg0JAQCII1bDQyF3APLlNFRqfHf3bTo7Zl5z
 WaUd1Cd7JkY+WAD/eF1dWjN0JRtfU+oRlk6UZ4oNmp8WMJvQ7oV26ub2egE=
 =lts8
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Use correct timestamp variable for ring buffer write stamp update

 - Fix up before stamp and write stamp when crossing ring buffer sub
   buffers

 - Keep a zero delta in ring buffer in slow path if cmpxchg fails

 - Fix trace_printk static buffer for archs that care

 - Fix ftrace record accounting for ftrace ops with trampolines

 - Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency

 - Remove WARN_ON in hwlat tracer that triggers on something that is OK

 - Make "my_tramp" trampoline in ftrace direct sample code global

 - Fixes in the bootconfig tool for better alignment management

* tag 'trace-v5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ring-buffer: Always check to put back before stamp when crossing pages
  ftrace: Fix DYNAMIC_FTRACE_WITH_DIRECT_CALLS dependency
  ftrace: Fix updating FTRACE_FL_TRAMP
  tracing: Fix alignment of static buffer
  tracing: Remove WARN_ON in start_thread()
  samples/ftrace: Mark my_tramp[12]? global
  ring-buffer: Set the right timestamp in the slow path of __rb_reserve_next()
  ring-buffer: Update write stamp with the correct ts
  docs: bootconfig: Update file format on initrd image
  tools/bootconfig: Align the bootconfig applied initrd image size to 4
  tools/bootconfig: Fix to check the write failure correctly
  tools/bootconfig: Fix errno reference after printf()
2020-12-01 15:30:18 -08:00
Marco Elver
fa69ee5aa4 net: switch to storing KCOV handle directly in sk_buff
It turns out that usage of skb extensions can cause memory leaks. Ido
Schimmel reported: "[...] there are instances that blindly overwrite
'skb->extensions' by invoking skb_copy_header() after __alloc_skb()."

Therefore, give up on using skb extensions for KCOV handle, and instead
directly store kcov_handle in sk_buff.

Fixes: 6370cc3bbd ("net: add kcov handle to skb extensions")
Fixes: 85ce50d337 ("net: kcov: don't select SKB_EXTENSIONS when there is no NET")
Fixes: 97f53a08cb ("net: linux/skbuff.h: combine SKB_EXTENSIONS + KCOV handling")
Link: https://lore.kernel.org/linux-wireless/20201121160941.GA485907@shredder.lan/
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20201125224840.2014773-1-elver@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 11:26:19 -08:00
Björn Töpel
7fd3253a7d net: Introduce preferred busy-polling
The existing busy-polling mode, enabled by the SO_BUSY_POLL socket
option or system-wide using the /proc/sys/net/core/busy_read knob, is
an opportunistic. That means that if the NAPI context is not
scheduled, it will poll it. If, after busy-polling, the budget is
exceeded the busy-polling logic will schedule the NAPI onto the
regular softirq handling.

One implication of the behavior above is that a busy/heavy loaded NAPI
context will never enter/allow for busy-polling. Some applications
prefer that most NAPI processing would be done by busy-polling.

This series adds a new socket option, SO_PREFER_BUSY_POLL, that works
in concert with the napi_defer_hard_irqs and gro_flush_timeout
knobs. The napi_defer_hard_irqs and gro_flush_timeout knobs were
introduced in commit 6f8b12d661 ("net: napi: add hard irqs deferral
feature"), and allows for a user to defer interrupts to be enabled and
instead schedule the NAPI context from a watchdog timer. When a user
enables the SO_PREFER_BUSY_POLL, again with the other knobs enabled,
and the NAPI context is being processed by a softirq, the softirq NAPI
processing will exit early to allow the busy-polling to be performed.

If the application stops performing busy-polling via a system call,
the watchdog timer defined by gro_flush_timeout will timeout, and
regular softirq handling will resume.

In summary; Heavy traffic applications that prefer busy-polling over
softirq processing should use this option.

Example usage:

  $ echo 2 | sudo tee /sys/class/net/ens785f1/napi_defer_hard_irqs
  $ echo 200000 | sudo tee /sys/class/net/ens785f1/gro_flush_timeout

Note that the timeout should be larger than the userspace processing
window, otherwise the watchdog will timeout and fall back to regular
softirq processing.

Enable the SO_BUSY_POLL/SO_PREFER_BUSY_POLL options on your socket.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/bpf/20201130185205.196029-2-bjorn.topel@gmail.com
2020-12-01 00:09:25 +01:00
Jakub Kicinski
3771b82242 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2020-11-28

1) Do not reference the skb for xsk's generic TX side since when looped
   back into RX it might crash in generic XDP, from Björn Töpel.

2) Fix umem cleanup on a partially set up xsk socket when being destroyed,
   from Magnus Karlsson.

3) Fix an incorrect netdev reference count when failing xsk_bind() operation,
   from Marek Majtyka.

4) Fix bpftool to set an error code on failed calloc() in build_btf_type_table(),
   from Zhen Lei.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Add MAINTAINERS entry for BPF LSM
  bpftool: Fix error return value in build_btf_type_table
  net, xsk: Avoid taking multiple skbuff references
  xsk: Fix incorrect netdev reference count
  xsk: Fix umem cleanup bug at socket destruct
  MAINTAINERS: Update XDP and AF_XDP entries
====================

Link: https://lore.kernel.org/r/20201128005104.1205-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-28 12:04:57 -08:00
Jakub Kicinski
5c39f26e67 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Trivial conflict in CAN, keep the net-next + the byteswap wrapper.

Conflicts:
	drivers/net/can/usb/gs_usb.c

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 18:25:27 -08:00
Linus Torvalds
c84e1efae0 asm-generic: add correct MAX_POSSIBLE_PHYSMEM_BITS setting
This is a single bugfix for a bug that Stefan Agner found on 32-bit
 Arm, but that exists on several other architectures.
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAl/BZx4ACgkQmmx57+YA
 GNnSPA/9HK0dwaGuXHRxKpt2ShHt5kOmixlmRJszYmuSIJde945EJNTP/+2l2Qs2
 TDXmOU8pdZSAZX2EHLLEksNsnhUoTBWzsn4WxHRTNVc2cYuHHA6PKMdAPV136ag/
 U0gnC7eCYKCDM3A1A/G4437PDI3vfm0Wzo6Biikxwhi861bshxjVs3DapDQw5+Zn
 bOS8CCNpmwpDC26ZAfIY8es32Hg063GhdJXQ01uqkaZLJdRn7ui6bkv18vi+b3gM
 QLeaubDT4+oH+HpJJpFZ01iugBFah5iJtg/JtWyap/LJSkelyjU9Gr7qrrpI7M3t
 hfDzk7fRjHO1XPn2bDc4InWJEoekE9vde5M0QKn3ID8dFO1M5tNqov2uH40m4fQD
 UM7irWe0BmP9Nms5LV7dMWChPn8FUEr34ZYAwF9B+YPL1Ec6GGn8mA/E0Iz8pre0
 MUgv5LZ8LYdeYvSSpXrgBkgv2pwni5rTc7/K9KtvGdkLQ3rOuihPBbPyR0YTYa8f
 UkboIky80lcx/uyhhu+OxWxe0q+Ug8WF87UkPIDDhsaF9W2DoErIwiCQhqS+AKs4
 9DiCBzLgF6mZ11ijK73DtLNBmQnKdssV9Bs5lnOO0XqYdoqiQ5gRJWrixvI0OWSa
 WGt66UV481rV/Oxlt1A/1lynYkZU0b121fFFB/EPbuFuUwZu9So=
 =xgYa
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-fixes-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic fix from Arnd Bergmann:
 "Add correct MAX_POSSIBLE_PHYSMEM_BITS setting to asm-generic.

  This is a single bugfix for a bug that Stefan Agner found on 32-bit
  Arm, but that exists on several other architectures"

* tag 'asm-generic-fixes-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  arch: pgtable: define MAX_POSSIBLE_PHYSMEM_BITS where needed
2020-11-27 15:00:35 -08:00
Linus Torvalds
303bc93472 ARM: SoC fixes for v5.10, part 3
Another set of patches for devicetree files and Arm
 SoC specific drivers:
 
  - A fix for OP-TEE shared memory on non-SMP systems
 
  - multiple code fixes for the OMAP platform, including
    one regression for the CPSW network driver and a few
    runtime warning fixes
 
  - Some DT patches for the Rockchip RK3399 platform,
    in particular fixing the MMC device ordering that
    recently became nondeterministic with async probe.
 
  - Multiple DT fixes for the Tegra platform, including
    a regression fix for suspend/resume on TX2
 
  - A regression fix for a user-triggered fault in the
    NXP dpio driver
 
  - A regression fix for a bug caused by an earlier bug
    fix in the xilinx firmware driver
 
  - Two more DTC warning fixes
 
  - Sylvain Lemieux steps down as maintainer for the
    NXP LPC32xx platform
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAl/BZkEACgkQmmx57+YA
 GNkl/xAAiFj6+N5iTVv/l1p28x+YhID/A9ahCCexEpvq+sU/PCFmGub0chw/ns6W
 xIM2+YAFcuIbfPt7J/eYG+q1FkQl3N+hsJ5yi9NOC4ugQZtq8Ag7ZKEzlLMCtBUU
 XD7Y6cz7BD/4FZ4XIn9w84qh7LoehOgH1MKW/wt+sCBpkwMroqmVmF/N9XzcruaB
 LX4M9bt5Ibt+fc+rkC4ka03jq41DCquQsSjSroLzSuFNkAy+OwvrOTJH2fLgqqlM
 Eu6//AYQzE8hz+2kHkpc5mCqfxvRN6HcITwgopQwMhXn092WoPu5zRPiUrILw2CK
 TtEhMDfJ1Q60A2NSuCDAho98rTsPEf4zMrql7rzDwo0M0wdv0xE6hWKglAVhPywT
 Hs1SFmd1Z3+7n5IcwufU3JHVJ9VViJxXJK3WrLU9skm+CfZQpGOmXrmqVTfNtkb2
 BK58guf11APvojzZ0nb8FGkxn/mCgCgNCMwRna1rjvtzQsnL+d8t7Cz5hXDABgpy
 QVXDhrGT2cLTizGGRcMIuHMs2pNB25Hj4mgLsuarDwofZktFOWtRARoz6ialv/MI
 H6ff5/8nOxgVFyE7GYjuVz69igBfnb4NYN/O3A0d6MroiTzZbwNXbq3B2vVWmVa5
 hBSqj54n6Me2Bk/q0KPJlpL8qRKUuoU9lKTzY7ZyaQVHZzEO5+c=
 =yGOq
 -----END PGP SIGNATURE-----

Merge tag 'arm-soc-fixes-v5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Another set of patches for devicetree files and Arm SoC specific
  drivers:

   - A fix for OP-TEE shared memory on non-SMP systems

   - multiple code fixes for the OMAP platform, including one regression
     for the CPSW network driver and a few runtime warning fixes

   - Some DT patches for the Rockchip RK3399 platform, in particular
     fixing the MMC device ordering that recently became
     nondeterministic with async probe.

   - Multiple DT fixes for the Tegra platform, including a regression
     fix for suspend/resume on TX2

   - A regression fix for a user-triggered fault in the NXP dpio driver

   - A regression fix for a bug caused by an earlier bug fix in the
     xilinx firmware driver

   - Two more DTC warning fixes

   - Sylvain Lemieux steps down as maintainer for the NXP LPC32xx
     platform"

* tag 'arm-soc-fixes-v5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits)
  arm64: tegra: Fix Tegra234 VDK node names
  arm64: tegra: Wrong AON HSP reg property size
  arm64: tegra: Fix USB_VBUS_EN0 regulator on Jetson TX1
  arm64: tegra: Correct the UART for Jetson Xavier NX
  arm64: tegra: Disable the ACONNECT for Jetson TX2
  optee: add writeback to valid memory type
  firmware: xilinx: Use hash-table for api feature check
  firmware: xilinx: Fix SD DLL node reset issue
  soc: fsl: dpio: Get the cpumask through cpumask_of(cpu)
  ARM: dts: dra76x: m_can: fix order of clocks
  bus: ti-sysc: suppress err msg for timers used as clockevent/source
  MAINTAINERS: Remove myself as LPC32xx maintainers
  arm64: dts: qcom: clear the warnings caused by empty dma-ranges
  arm64: dts: broadcom: clear the warnings caused by empty dma-ranges
  ARM: dts: am437x-l4: fix compatible for cpsw switch dt node
  arm64: dts: rockchip: Reorder LED triggers from mmc devices on rk3399-roc-pc.
  arm64: dts: rockchip: Assign a fixed index to mmc devices on rk3399 boards.
  arm64: dts: rockchip: Remove system-power-controller from pmic on Odroid Go Advance
  arm64: dts: rockchip: fix NanoPi R2S GMAC clock name
  ARM: OMAP2+: Manage MPU state properly for omap_enter_idle_coupled()
  ...
2020-11-27 14:48:03 -08:00
Linus Torvalds
79c0c1f038 Networking fixes for 5.10-rc6, including fixes from the WiFi driver,
and can subtrees.
 
 Current release - regressions:
 
  - gro_cells: reduce number of synchronize_net() calls
 
  - ch_ktls: release a lock before jumping to an error path
 
 Current release - always broken:
 
  - tcp: Allow full IP tos/IPv6 tclass to be reflected in L3 header
 
 Previous release - regressions:
 
  - net/tls: fix missing received data after fast remote close
 
  - vsock/virtio: discard packets only when socket is really closed
 
  - sock: set sk_err to ee_errno on dequeue from errq
 
  - cxgb4: fix the panic caused by non smac rewrite
 
 Previous release - always broken:
 
  - tcp: fix corner cases around setting ECN with BPF selection
         of congestion control
 
  - tcp: fix race condition when creating child sockets from
         syncookies on loopback interface
 
  - usbnet: ipheth: fix connectivity with iOS 14
 
  - tun: honor IOCB_NOWAIT flag
 
  - net/packet: fix packet receive on L3 devices without visible
                hard header
 
  - devlink: Make sure devlink instance and port are in same net
             namespace
 
  - net: openvswitch: fix TTL decrement action netlink message format
 
  - bonding: wait for sysfs kobject destruction before freeing
             struct slave
 
  - net: stmmac: fix upstream patch applied to the wrong context
 
  - bnxt_en: fix return value and unwind in probe error paths
 
 Misc:
 
  - devlink: add extra layer of categorization to the reload stats
             uAPI before it's released
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl/BWHkACgkQMUZtbf5S
 IruVBQ/+KnCXbgYxDGEcKXhQf4wb0n6RQS899aUshfw4XIr63Qbj2r2EnShF7sDm
 vCYsl0DmmNh9UYvt38jMoUi8yCrqzIzofEuS1lE+kREE5+mKeJIOn55z61V+pWNJ
 KaYnqixCwx8o78Fk0BqMENrzMluVMblcQ73OkigrGO0nXjcmG/a5ZFr1EAPpfovs
 GJ6kyZZxV5E0+vJqm05DYebwvqi4/1VoHJx98c4nm7egF7j3K5x6UhNJV82wfqJ4
 OIZsaTpT98GuPC7/wCFs7EkdEgSsDYLqo/hiPI9Yg/DRv6aQXtmTwNkkaTnxPsKl
 FscLOZkslDca1sW9kwClvuXO4JsV+mDoz8G1iRhe1tBTkvL8NHGm/QpOkThQBG4S
 Rf5n4Lv416U7sONEdEV6Rq9ZWAa/hFXr9A3XO8tHZd2LvtchXYgFoUxHp8Aap17R
 ICyBcXKvDSEsV8NGTRe8cbgFxaPiF1iSvYPmuLf9ANgbNJo7WYPNxld8Z7lczqzM
 ReO+kHj71FyY5tbcMuOu15iJp710UYTvFo+e5fMQR9K80Pha98CWVh7hm2vuLf9q
 FQ2v/l8kVev4v42lULXdVq/sbHDUPC+WPjumhL2LSZrUhgwMGKcI5V1KK0CEdprK
 vjqkTyeKv9+nVxnXBVZO1nJNNK0F+zlNqbZtNnOiQAJK22HLDts=
 =1tQm
 -----END PGP SIGNATURE-----

Merge tag 'net-5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes for 5.10-rc6, including fixes from the WiFi driver,
  and CAN subtrees.

  Current release - regressions:

   - gro_cells: reduce number of synchronize_net() calls

   - ch_ktls: release a lock before jumping to an error path

  Current release - always broken:

   - tcp: Allow full IP tos/IPv6 tclass to be reflected in L3 header

  Previous release - regressions:

   - net/tls: fix missing received data after fast remote close

   - vsock/virtio: discard packets only when socket is really closed

   - sock: set sk_err to ee_errno on dequeue from errq

   - cxgb4: fix the panic caused by non smac rewrite

  Previous release - always broken:

   - tcp: fix corner cases around setting ECN with BPF selection of
     congestion control

   - tcp: fix race condition when creating child sockets from syncookies
     on loopback interface

   - usbnet: ipheth: fix connectivity with iOS 14

   - tun: honor IOCB_NOWAIT flag

   - net/packet: fix packet receive on L3 devices without visible hard
     header

   - devlink: Make sure devlink instance and port are in same net
     namespace

   - net: openvswitch: fix TTL decrement action netlink message format

   - bonding: wait for sysfs kobject destruction before freeing struct
     slave

   - net: stmmac: fix upstream patch applied to the wrong context

   - bnxt_en: fix return value and unwind in probe error paths

  Misc:

   - devlink: add extra layer of categorization to the reload stats uAPI
     before it's released"

* tag 'net-5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (68 commits)
  sock: set sk_err to ee_errno on dequeue from errq
  mptcp: fix NULL ptr dereference on bad MPJ
  net: openvswitch: fix TTL decrement action netlink message format
  can: af_can: can_rx_unregister(): remove WARN() statement from list operation sanity check
  can: m_can: m_can_dev_setup(): add support for bosch mcan version 3.3.0
  can: m_can: fix nominal bitiming tseg2 min for version >= 3.1
  can: m_can: m_can_open(): remove IRQF_TRIGGER_FALLING from request_threaded_irq()'s flags
  can: mcp251xfd: mcp251xfd_probe(): bail out if no IRQ was given
  can: gs_usb: fix endianess problem with candleLight firmware
  ch_ktls: lock is not freed
  net/tls: Protect from calling tls_dev_del for TLS RX twice
  devlink: Make sure devlink instance and port are in same net namespace
  devlink: Hold rtnl lock while reading netdev attributes
  ptp: clockmatrix: bug fix for idtcm_strverscmp
  enetc: Let the hardware auto-advance the taprio base-time of 0
  gro_cells: reduce number of synchronize_net() calls
  net: stmmac: fix incorrect merge of patch upstream
  ipv6: addrlabel: fix possible memory leak in ip6addrlbl_net_init
  Documentation: netdev-FAQ: suggest how to post co-dependent series
  ibmvnic: enhance resetting status check during module exit
  ...
2020-11-27 14:38:02 -08:00
Arnd Bergmann
454a079b38 Fixes for omaps for various issues noticed during the -rc cycle:
- Earlier omap4 cpuidle fix was incomplete and needs to use a
   configured idle state instead
 
 - Fix am4 cpsw driver compatible to avoid invalid resource error
   for the legacy driver
 
 - Two kconfig fixes for genpd support that we added for for v5.10
   for proper location of the option and adding missing option
 
 - Fix ti-sysc reset status checking on enabling modules to ignore
   quirky modules with reset status only usable when the quirk is
   activated during reset. Also fix bogus resetdone warning for
   cpsw and modules with no sysst register reset status bit
 
 - Suppress a ti-sysc warning for timers reserved as system timers
 
 - Fix the ordering of clocks for dra7 m_can
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEkgNvrZJU/QSQYIcQG9Q+yVyrpXMFAl/AozMRHHRvbnlAYXRv
 bWlkZS5jb20ACgkQG9Q+yVyrpXOy0xAA1epgUY0nYa8jsNc4Mv7rM0hkQv2Onm3p
 ZH03iAORiK2mTd/C2QSsq2ByhrpTJh4gVvlQ081JrE6Fh7Zec3YBDcKWCHSjGlOx
 IJcf7q+UJ8zxgBHolT36K68CNQB4U4ro+bUHCBa69ujB90LRM9qJQIexnLEH/tlH
 ApBvQ10ARHe2eod528LJn8R5g6LX3GGvN7CTcfOBIbmvg0OnFpJEzxQ2uK0fZj0K
 m6kawCJ6iHvNvVxzRR35DDQHZaWU/ihPlvuIhthhFdF+26TdEO1qHXU5s8qmZLWN
 WDq3zZ7dd5wuXQNtKiaUUhh3+H4VKpJDzrEVVZytDEoIgMKI8L6FROo6UW5wl7vP
 o3WJpBDchZvAAQWtAuGr0KqresWv9RX8zYwkHDXG6MK2WdErB3GKx750nvSRj1X6
 ZLrFyggYWkAmqCO7moqilKIx6lJPKWBqIRrGyYl8vsVHb9odtS9tB3b3Jnsjjm9H
 D6ahtxdtaZX6kr3o+XszZW15jMxdLDVfvvXzQ+wuIk2MwEUrsf11XkGFVLaTHAD3
 s42Np9+d9KXigOtjrm/+gv7AzIprjZBfG02EUXlcSSoUmIwWZfhMn9KnwMgFzMnC
 gDVnLZ4mEXG40r1gQJeEt5QKgcsZaT/NTcTUNb6PwMUNR8h906P4J2MZYFQrsOjJ
 4jtXur7AOmw=
 =KSUM
 -----END PGP SIGNATURE-----

Merge tag 'omap-for-v5.10/fixes-rc5-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes

Fixes for omaps for various issues noticed during the -rc cycle:

- Earlier omap4 cpuidle fix was incomplete and needs to use a
  configured idle state instead

- Fix am4 cpsw driver compatible to avoid invalid resource error
  for the legacy driver

- Two kconfig fixes for genpd support that we added for for v5.10
  for proper location of the option and adding missing option

- Fix ti-sysc reset status checking on enabling modules to ignore
  quirky modules with reset status only usable when the quirk is
  activated during reset. Also fix bogus resetdone warning for
  cpsw and modules with no sysst register reset status bit

- Suppress a ti-sysc warning for timers reserved as system timers

- Fix the ordering of clocks for dra7 m_can

* tag 'omap-for-v5.10/fixes-rc5-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: dra76x: m_can: fix order of clocks
  bus: ti-sysc: suppress err msg for timers used as clockevent/source
  ARM: dts: am437x-l4: fix compatible for cpsw switch dt node
  ARM: OMAP2+: Manage MPU state properly for omap_enter_idle_coupled()
  bus: ti-sysc: Fix bogus resetdone warning on enable for cpsw
  bus: ti-sysc: Fix reset status check for modules with quirks
  ARM: OMAP2+: Fix missing select PM_GENERIC_DOMAINS_OF
  ARM: OMAP2+: Fix location for select PM_GENERIC_DOMAINS

Link: https://lore.kernel.org/r/pull-1606460270-864284@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-11-27 13:18:40 +01:00
Parav Pandit
617b860c18 net/mlx5: Treat host PF vport as other (non eswitch manager) vport
When eswitch manager is running on ECPF, host PF should be treated
as non eswitch manager port, similar to other VF vports.
Fail to do so, results in firmware treating PF's vport as ECPF
vport for eswitch ACL tables.
Non zero check to figure out if a given vport is other vport or not
is not sufficient becase PF vport number = 0 on ECPF.
Hence, create esw acl tables with an attribute of other vport.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:45:03 -08:00
Parav Pandit
8a90f2fc67 net/mlx5: Rename peer_pf to host_pf
To match the hardware spec, rename peer_pf to host_pf.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:45:03 -08:00
Parav Pandit
3b1e58aa83 net/mlx5: Make API mlx5_core_is_ecpf accept const pointer
Subsequent patch implements helper API which has mlx5_core_dev
as const pointer, make its caller API too const *.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:45:03 -08:00
Yishai Hadas
959af5569f net/mlx5: Expose other function ifc bits
Expose other function ifc bits to enable setting HCA caps on behalf of
other function.

In addition, expose vhca_resource_manager bit to control whether the
other function functionality is supported by firmware.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:48 -08:00
Aya Levin
21adf05d45 net/mlx5: Expose IP-in-IP TX and RX capability bits
Expose FW indication that it supports stateless offloads for IP over IP
tunneled packets per direction. In some HW like ConnectX-4 IP-in-IP
support is not symmetric, it supports steering on the inner header but
it doesn't TX-Checksum and TSO. Add IP-in-IP capability per direction to
cover this case as well.

Note: only if both indications are turned on, the global
tunnel_stateless_ip_over_ip is on too.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:48 -08:00
Parav Pandit
349125ba23 net/mlx5: Update the hardware interface definition for vhca state
Update the hardware interface definitions to query and modify vhca
state, related EQE and event code.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:48 -08:00
Parav Pandit
e5dfe6b57e net/mlx5: Avoid exposing driver internal command helpers
mlx5 command init and cleanup routines are internal to mlx5_core driver.
Hence, avoid exporting them and move their definition to mlx5_core
driver's internal file mlx5_core.h

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:48 -08:00
Eran Ben Elisha
59d2ae1db8 net/mlx5: Add ts_cqe_to_dest_cqn related bits
Add a bit in HCA capabilities layout to indicate if ts_cqe_to_dest_cqn is
supported.

In addition, add ts_cqe_to_dest_cqn field to SQ context, for driver to
set the actual CQN.

Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:48 -08:00
Muhammad Sammar
7da3ad6c26 net/mlx5: Add misc4 to mlx5_ifc_fte_match_param_bits
Add misc4 match params to enable matching on prog_sample_fields.

Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:47 -08:00
Chris Mi
3873063088 net/mlx5: Add sampler destination type
The flow sampler object is a new destination type. Add a new member
for the flow destination.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:47 -08:00
Chris Mi
2a29708916 net/mlx5: Add sample offload hardware bits and structures
Hardware introduces flow sampler object for packet sampling.
Add the offload hardware bits and structures.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:47 -08:00
Feng Tang
4df910620b mm: memcg: relayout structure mem_cgroup to avoid cache interference
0day reported one -22.7% regression for will-it-scale page_fault2
case [1] on a 4 sockets 144 CPU platform, and bisected to it to be
caused by Waiman's optimization (commit bd0b230fe1) of saving one
'struct page_counter' space for 'struct mem_cgroup'.

Initially we thought it was due to the cache alignment change introduced
by the patch, but further debug shows that it is due to some hot data
members ('vmstats_local', 'vmstats_percpu', 'vmstats') sit in 2 adjacent
cacheline (2N and 2N+1 cacheline), and when adjacent cache line prefetch
is enabled, it triggers an "extended level" of cache false sharing for
2 adjacent cache lines.

So exchange the 2 member blocks, while keeping mostly the original
cache alignment, which can restore and even enhance the performance,
and save 64 bytes of space for 'struct mem_cgroup' (from 2880 to 2816,
with 0day's default RHEL-8.3 kernel config)

[1]. https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/

Fixes: bd0b230fe1 ("mm/memcg: unify swap and memsw page counters")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-26 09:35:49 -08:00
Yunsheng Lin
8b5536ad12 lockdep: Introduce in_softirq lockdep assert
The current semantic for napi_consume_skb() is that caller need
to provide non-zero budget when calling from NAPI context, and
breaking this semantic will cause hard to debug problem, because
_kfree_skb_defer() need to run in atomic context in order to push
the skb to the particular cpu' napi_alloc_cache atomically.

So add the lockdep_assert_in_softirq() to assert when the running
context is not in_softirq, in_softirq means softirq is serving or
BH is disabled, which has a ambiguous semantics due to the BH
disabled confusion, so add a comment to emphasize that.

And the softirq context can be interrupted by hard IRQ or NMI
context, lockdep_assert_in_softirq() need to assert about hard
IRQ or NMI context too.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 15:08:34 -08:00
KP Singh
403319be5d ima: Implement ima_inode_hash
This is in preparation to add a helper for BPF LSM programs to use
IMA hashes when attached to LSM hooks. There are LSM hooks like
inode_unlink which do not have a struct file * argument and cannot
use the existing ima_file_hash API.

An inode based API is, therefore, useful in LSM based detections like an
executable trying to delete itself which rely on the inode_unlink LSM
hook.

Moreover, the ima_file_hash function does nothing with the struct file
pointer apart from calling file_inode on it and converting it to an
inode.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Link: https://lore.kernel.org/bpf/20201124151210.1081188-2-kpsingh@chromium.org
2020-11-26 00:04:04 +01:00
Ioana Ciornei
6527b93842 net: phy: remove the .did_interrupt() and .ack_interrupt() callback
Now that all the PHY drivers have been migrated to directly implement
the generic .handle_interrupt() callback for a seamless support of
shared IRQs and all the .config_inter() implementations clear any
pending interrupts, we can safely remove the two callbacks.

With this patch, phylib has a proper support for shared IRQs (and not
just for multi-PHY devices. A PHY driver must implement both the
.handle_interrupt() and .config_intr() callbacks for the IRQs to be
actually used.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 11:18:38 -08:00
Björn Töpel
36ccdf8582 net, xsk: Avoid taking multiple skbuff references
Commit 642e450b6b ("xsk: Do not discard packet when NETDEV_TX_BUSY")
addressed the problem that packets were discarded from the Tx AF_XDP
ring, when the driver returned NETDEV_TX_BUSY. Part of the fix was
bumping the skbuff reference count, so that the buffer would not be
freed by dev_direct_xmit(). A reference count larger than one means
that the skbuff is "shared", which is not the case.

If the "shared" skbuff is sent to the generic XDP receive path,
netif_receive_generic_xdp(), and pskb_expand_head() is entered the
BUG_ON(skb_shared(skb)) will trigger.

This patch adds a variant to dev_direct_xmit(), __dev_direct_xmit(),
where a user can select the skbuff free policy. This allows AF_XDP to
avoid bumping the reference count, but still keep the NETDEV_TX_BUSY
behavior.

Fixes: 642e450b6b ("xsk: Do not discard packet when NETDEV_TX_BUSY")
Reported-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20201123175600.146255-1-bjorn.topel@gmail.com
2020-11-24 22:39:56 +01:00
Jakub Kicinski
23c01ed3b0 rxrpc development
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAl+8E4YACgkQ+7dXa6fL
 C2tehQ/8C1O+p1UBmQ4k/eXkYqsIvRQJGXGheSClxvym8tQiut/sZEKsCqucFmO8
 Y2gKEYJB3xYIqbZjjJUFYbL70Q5qG/cXgKn3rCpMjhn/0UAoUSwaGyoURcLwaz2B
 asipDSvbvZk7OwX4ScUTi/b/7AHtO6l16Y6yRzUqneOsSkXYk1+HhK2/4rPDKEJf
 cxocGIjjWs4d1erG2Aqgn3BZ2GsDNb88NJYi5qgc0ywFgRacX7oL0WFeu6pD1Ja+
 8vZLjFYB/bDeLBlvzpnWal99lUiSee8BbUoQn1iGJ3EekUVWkkNNfjhpYRjzIIR5
 GsJfbDHP16PHhLZgxBvNBwkWoqqBw3X/Zn0dkBDZgjlZQvLbeds6QRd6Ag/M2CJI
 D1QtmBDBtr4UZfRFm+5hfpTqf2jeOc4o68WROH3FEdsnKvICkttKucE9LaykokFD
 gQLNmkn2tb1D1yS2Y4iPucV3kYUU1Y6yw++Ck0cE2isBauN0gHtr9UAXZEc57ILc
 hXCFrJTbGkph3TimQDd1QbXZ28eExRBnn+H5RyYHeEidv03d2UO5uFZaJ5b5UGX0
 sE+3aeWLTH3fFTBvMnYPyFjzalSky8V5zOlUltwVCx7PpiDe6P7rKpO+ZXEi4OXP
 mHz7HM6h1iEnnrqEGNz7L39pTqKVbfBxDMvFra07dyKkvVthF1w=
 =TvbK
 -----END PGP SIGNATURE-----

Merge tag 'rxrpc-next-20201123' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Prelude to gssapi support

Here are some patches that do some reorganisation of the security class
handling in rxrpc to allow implementation of the RxGK security class that
will allow AF_RXRPC to use GSSAPI-negotiated tokens and better crypto.  The
RxGK security class is not included in this patchset.

It does the following things:

 (1) Add a keyrings patch to provide the original key description, as
     provided to add_key(), to the payload preparser so that it can
     interpret the content on that basis.  Unfortunately, the rxrpc_s key
     type wasn't written to interpret its payload as anything other than a
     string of bytes comprising a key, but for RxGK, more information is
     required as multiple Kerberos enctypes are supported.

 (2) Remove the rxk5 security class key parsing.  The rxk5 class never got
     rolled out in OpenAFS and got replaced with rxgk.

 (3) Support the creation of rxrpc keys with multiple tokens of different
     types.  If some types are not supported, the ENOPKG error is
     suppressed if at least one other token's type is supported.

 (4) Punt the handling of server keys (rxrpc_s type) to the appropriate
     security class.

 (5) Organise the security bits in the rxrpc_connection struct into a
     union to make it easier to override for other classes.

 (6) Move some bits from core code into rxkad that won't be appropriate to
     rxgk.

* tag 'rxrpc-next-20201123' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  rxrpc: Ask the security class how much space to allow in a packet
  rxrpc: rxkad: Don't use pskb_pull() to advance through the response packet
  rxrpc: Organise connection security to use a union
  rxrpc: Don't reserve security header in Tx DATA skbuff
  rxrpc: Merge prime_packet_security into init_connection_security
  rxrpc: Fix example key name in a comment
  rxrpc: Ignore unknown tokens in key payload unless no known tokens
  rxrpc: Make the parsing of xdr payloads more coherent
  rxrpc: Allow security classes to give more info on server keys
  rxrpc: Don't leak the service-side session key to userspace
  rxrpc: Hand server key parsing off to the security class
  rxrpc: Split the server key type (rxrpc_s) into its own file
  rxrpc: Don't retain the server key in the connection
  rxrpc: Support keys with multiple authentication tokens
  rxrpc: List the held token types in the key description in /proc/keys
  rxrpc: Remove the rxk5 security class as it's now defunct
  keys: Provide the original description to the key preparser
====================

Link: https://lore.kernel.org/r/160616220405.830164.2239716599743995145.stgit@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 12:05:58 -08:00
Amit Sunil Dhamne
acfdd18591 firmware: xilinx: Use hash-table for api feature check
Currently array of fix length PM_API_MAX is used to cache
the pm_api version (valid or invalid). However ATF based
PM APIs values are much higher then PM_API_MAX.
So to include ATF based PM APIs also, use hash-table to
store the pm_api version status.

Signed-off-by: Amit Sunil Dhamne <amit.sunil.dhamne@xilinx.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ravi Patel <ravi.patel@xilinx.com>
Signed-off-by: Rajan Vaja <rajan.vaja@xilinx.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Michal Simek <michal.simek@xilinx.com>
Fixes: f3217d6f2f ("firmware: xilinx: fix out-of-bounds access")
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1606197161-25976-1-git-send-email-rajan.vaja@xilinx.com
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
2020-11-24 15:13:54 +01:00
Eyal Birger
d549699048 net/packet: fix packet receive on L3 devices without visible hard header
In the patchset merged by commit b9fcf0a0d8
("Merge branch 'support-AF_PACKET-for-layer-3-devices'") L3 devices which
did not have header_ops were given one for the purpose of protocol parsing
on af_packet transmit path.

That change made af_packet receive path regard these devices as having a
visible L3 header and therefore aligned incoming skb->data to point to the
skb's mac_header. Some devices, such as ipip, xfrmi, and others, do not
reset their mac_header prior to ingress and therefore their incoming
packets became malformed.

Ideally these devices would reset their mac headers, or af_packet would be
able to rely on dev->hard_header_len being 0 for such cases, but it seems
this is not the case.

Fix by changing af_packet RX ll visibility criteria to include the
existence of a '.create()' header operation, which is used when creating
a device hard header - via dev_hard_header() - by upper layers, and does
not exist in these L3 devices.

As this predicate may be useful in other situations, add it as a common
dev_has_header() helper in netdevice.h.

Fixes: b9fcf0a0d8 ("Merge branch 'support-AF_PACKET-for-layer-3-devices'")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20201121062817.3178900-1-eyal.birger@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 17:29:36 -08:00
Jakub Kicinski
cc69837fca net: don't include ethtool.h from netdevice.h
linux/netdevice.h is included in very many places, touching any
of its dependecies causes large incremental builds.

Drop the linux/ethtool.h include, linux/netdevice.h just needs
a forward declaration of struct ethtool_ops.

Fix all the places which made use of this implicit include.

Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20201120225052.1427503-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 17:27:04 -08:00
Christian Eggers
076d38b88c net: ptp: introduce common defines for PTP message types
Using PTP wide defines will obsolete different driver internal defines
and uses of magic numbers.

Signed-off-by: Christian Eggers <ceggers@arri.de>
Cc: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 13:43:07 -08:00
David Howells
8eb621698f keys: Provide the original description to the key preparser
Provide the proposed description (add key) or the original description
(update/instantiate key) when preparsing a key so that the key type can
validate it against the data.

This is important for rxrpc server keys as we need to check that they have
the right amount of key material present - and it's better to do that when
the key is loaded rather than deep in trying to process a response packet.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
cc: keyrings@vger.kernel.org
2020-11-23 18:09:29 +00:00