Commit graph

8977 commits

Author SHA1 Message Date
Hangbin Liu
129e3c1bab bonding: add new option ns_ip6_target
This patch add a new bonding option ns_ip6_target, which correspond
to the arp_ip_target. With this we set IPv6 targets and send IPv6 NS
request to determine the health of the link.

For other related options like the validation, we still use
arp_validate, and will change to ns_validate later.

Note: the sysfs configuration support was removed based on
https://lore.kernel.org/netdev/8863.1645071997@famine

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21 12:13:45 +00:00
Ingo Molnar
6255b48aeb Linux 5.17-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmISrYgeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGg20IAKDZr7rfSHBopjQV
 Cocw744tom0XuxpvSZpp2GGOOXF+tkswcNNaRIrbGOl1mkyxA7eBZCTMpDeDS9aQ
 wB0D0Gxx8QBAJp4KgB1W7TB+hIGes/rs8Ve+6iO4ulLLdCVWX/q2boI0aZ7QX9O9
 qNi8OsoZQtk6falRvciZFHwV5Av1p2Sy1AW57udQ7DvJ4H98AfKf1u8/z208WWW8
 1ixC+qJxQcUcM9vI+7P9Tt7NbFSKv8SvAmqjFY7P+DxQAsVw6KXoqVXykDzeOv0t
 fUNOE/t0oFZafwtn8h7KBQnwS9lH03+3KkslVZs+iMFyUj/Bar+NVVyKoDhWXtVg
 /PuMhEg=
 =eU1o
 -----END PGP SIGNATURE-----

Merge tag 'v5.17-rc5' into sched/core, to resolve conflicts

New conflicts in sched/core due to the following upstream fixes:

  44585f7bc0 ("psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n")
  a06247c680 ("psi: Fix uaf issue when psi trigger is destroyed while being polled")

Conflicts:
	include/linux/psi_types.h
	kernel/sched/psi.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2022-02-21 11:53:51 +01:00
Amitay Isaacs
a1dc630886 fsi: sbefifo: Implement FSI_SBEFIFO_READ_TIMEOUT_SECONDS ioctl
FSI_SBEFIFO_READ_TIMEOUT_SECONDS ioctl sets the read timeout (in
seconds) for the response received by sbefifo device from sbe.  The
timeout affects only the read operation on current sbefifo device fd.

Certain SBE operations can take long time to complete and the default
timeout of 10 seconds might not be sufficient to start receiving
response from SBE.  In such cases, allow the timeout to be set to the
maximum of 120 seconds.

The kernel does not contain the definition of the various SBE
operations, so we must expose an interface to userspace to set the
timeout for the given operation.

Signed-off-by: Amitay Isaacs <amitay@ozlabs.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20220121053816.82253-3-joel@jms.id.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
2022-02-21 19:38:17 +10:30
Mobashshera Rasool
4b340a5a72 net: ip6mr: add support for passing full packet on wrong mif
This patch adds support for MRT6MSG_WRMIFWHOLE which is used to pass
full packet and real vif id when the incoming interface is wrong.
While the RP and FHR are setting up state we need to be sending the
registers encapsulated with all the data inside otherwise we lose it.
The RP then decapsulates it and forwards it to the interested parties.
Currently with WRONGMIF we can only be sending empty register packets
and will lose that data.
This behaviour can be enabled by using MRT_PIM with
val == MRT6MSG_WRMIFWHOLE. This doesn't prevent MRT6MSG_WRONGMIF from
happening, it happens in addition to it, also it is controlled by the same
throttling parameters as WRONGMIF (i.e. 1 packet per 3 seconds currently).
Both messages are generated to keep backwards compatibily and avoid
breaking someone who was enabling MRT_PIM with val == 4, since any
positive val is accepted and treated the same.

Signed-off-by: Mobashshera Rasool <mobash.rasool.linux@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 16:05:54 +00:00
Jacques de Laval
47f0bd5032 net: Add new protocol attribute to IP addresses
This patch adds a new protocol attribute to IPv4 and IPv6 addresses.
Inspiration was taken from the protocol attribute of routes. User space
applications like iproute2 can set/get the protocol with the Netlink API.

The attribute is stored as an 8-bit unsigned integer.

The protocol attribute is set by kernel for these categories:

- IPv4 and IPv6 loopback addresses
- IPv6 addresses generated from router announcements
- IPv6 link local addresses

User space may pass custom protocols, not defined by the kernel.

Grouping addresses on their origin is useful in scenarios where you want
to distinguish between addresses based on who added them, e.g. kernel
vs. user space.

Tagging addresses with a string label is an existing feature that could be
used as a solution. Unfortunately the max length of a label is
15 characters, and for compatibility reasons the label must be prefixed
with the name of the device followed by a colon. Since device names also
have a max length of 15 characters, only -1 characters is guaranteed to be
available for any origin tag, which is not that much.

A reference implementation of user space setting and getting protocols
is available for iproute2:

9a6ea18bd7

Signed-off-by: Jacques de Laval <Jacques.De.Laval@westermo.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220217150202.80802-1-Jacques.De.Laval@westermo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18 21:20:06 -08:00
Michael Kelley
6de74d1069 hv_utils: Add comment about max VMbus packet size in VSS driver
The VSS driver allocates a VMbus receive buffer significantly
larger than sizeof(hv_vss_msg), with no explanation. To help
prevent future mistakes, add a #define and comment about why
this is done.

No functional change.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1644423070-75125-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-02-18 13:08:18 +00:00
Kees Cook
d4568fc852 media: omap3isp: Use struct_group() for memcpy() region
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields. Wrap the target region
in struct_group(). This additionally fixes a theoretical misalignment
of the copy (since the size of "buf" changes between 64-bit and 32-bit,
but this is likely never built for 64-bit).

FWIW, I think this code is totally broken on 64-bit (which appears to
not be a "real" build configuration): it would either always fail (with
an uninitialized data->buf_size) or would cause corruption in userspace
due to the copy_to_user() in the call path against an uninitialized
data->buf value:

omap3isp_stat_request_statistics_time32(...)
    struct omap3isp_stat_data data64;
    ...
    omap3isp_stat_request_statistics(stat, &data64);

int omap3isp_stat_request_statistics(struct ispstat *stat,
                                     struct omap3isp_stat_data *data)
    ...
    buf = isp_stat_buf_get(stat, data);

static struct ispstat_buffer *isp_stat_buf_get(struct ispstat *stat,
                                               struct omap3isp_stat_data *data)
...
    if (buf->buf_size > data->buf_size) {
            ...
            return ERR_PTR(-EINVAL);
    }
    ...
    rval = copy_to_user(data->buf,
                        buf->virt_addr,
                        buf->buf_size);

Regardless, additionally initialize data64 to be zero-filled to avoid
undefined behavior.

Link: https://lore.kernel.org/lkml/20211215220505.GB21862@embeddedor

Cc: Arnd Bergmann <arnd@arndb.de>
Fixes: 378e3f81cb ("media: omap3isp: support 64-bit version of omap3isp_stat_data")
Cc: stable@vger.kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2022-02-17 09:56:02 +01:00
Masahiro Yamada
2a5c0fdc70
reiserfs_xattr.h: add linux/reiserfs_xattr.h to UAPI compile-test coverage
linux/reiserfs_xattr.h is currently excluded from the UAPI compile-test
because of the error like follows:

    HDRTEST usr/include/linux/reiserfs_xattr.h
  In file included from <command-line>:
  ./usr/include/linux/reiserfs_xattr.h:22:9: error: unknown type name ‘size_t’
     22 |         size_t length;
        |         ^~~~~~

The error can be fixed by replacing size_t with __kernel_size_t.

Then, remove the no-header-test entry from user/include/Makefile.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-17 09:09:38 +01:00
Masahiro Yamada
8b4bca21c2
kexec.h: add linux/kexec.h to UAPI compile-test coverage
linux/kexec.h is currently excluded from the UAPI compile-test because
of the errors like follows:

    HDRTEST usr/include/linux/kexec.h
  In file included from <command-line>:
  ./usr/include/linux/kexec.h:56:9: error: unknown type name ‘size_t’
     56 |         size_t bufsz;
        |         ^~~~~~
  ./usr/include/linux/kexec.h:58:9: error: unknown type name ‘size_t’
     58 |         size_t memsz;
        |         ^~~~~~

The errors can be fixed by replacing size_t with __kernel_size_t.

Then, remove the no-header-test entry from user/include/Makefile.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-17 09:09:38 +01:00
Masahiro Yamada
cbf2820341
fsmap.h: add linux/fsmap.h to UAPI compile-test coverage
linux/fsmap.h is currently excluded from the UAPI compile-test because
of the error like follows:

    HDRTEST usr/include/linux/fsmap.h
  In file included from <command-line>:
  ./usr/include/linux/fsmap.h:72:19: error: unknown type name ‘size_t’
     72 | static __inline__ size_t
        |                   ^~~~~~

The error can be fixed by replacing size_t with __kernel_size_t.

Then, remove the no-header-test entry from user/include/Makefile.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-17 09:09:37 +01:00
Masahiro Yamada
169adc2b6b
android/binder.h: add linux/android/binder(fs).h to UAPI compile-test coverage
linux/android/binder.h and linux/android/binderfs.h are currently
excluded from the UAPI compile-test because of the errors like follows:

    HDRTEST usr/include/linux/android/binder.h
  In file included from <command-line>:
  ./usr/include/linux/android/binder.h:291:9: error: unknown type name ‘pid_t’
    291 |         pid_t           sender_pid;
        |         ^~~~~
  ./usr/include/linux/android/binder.h:292:9: error: unknown type name ‘uid_t’
    292 |         uid_t           sender_euid;
        |         ^~~~~

The errors can be fixed by replacing {pid,uid}_t with __kernel_{pid,uid}_t.

Then, remove the no-header-test entries from user/include/Makefile.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-17 09:09:37 +01:00
Felix Kuehling
79aa036738 drm/amdkfd: Replace zero-length array with flexible-array member
Reference:
https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays

CC: Changcheng Deng <deng.changcheng@zte.com.cn>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-16 17:30:03 -05:00
Ilan Peer
ea05fd3581 cfg80211: Support configuration of station EHT capabilities
Add attributes and some code bits to support userspace passing
in EHT capabilities of stations.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Link: https://lore.kernel.org/r/20220214173004.ecf0b3ff9627.Icb4a5f2ec7b41d9008ac4cfc16c59baeb84793d3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-16 15:43:25 +01:00
Ilan Peer
31846b6578 cfg80211: add NO-EHT flag to regulatory
This may be necessary in some cases, add a flag and propagate
it, just like the NO-HE that already exists.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
[split off from a combined 320/no-EHT patch]
Link: https://lore.kernel.org/r/20220214173004.dbb85a7b86bb.Ifc1e2daac51c1cc5f895ccfb79faf5eaec3950ec@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-16 15:43:14 +01:00
Sriram R
c2b3d7699f nl80211: add support for 320MHz channel limitation
Add support to advertise drivers or regulatory limitations on 320 MHz
channels to userspace.

Signed-off-by: Sriram R <quic_srirrama@quicinc.com>
Co-authored-by: Karthikeyan Periyasamy <quic_periyasia@quicinc.com>
Signed-off-by: Karthikeyan Periyasamy <quic_periyasia@quicinc.com>
Co-authored-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
Link: https://lore.kernel.org/r/1640163883-12696-6-git-send-email-quic_vjakkam@quicinc.com
Link: https://lore.kernel.org/r/20220214163009.175289-3-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-16 15:43:02 +01:00
Veerendranath Jakkam
cfb14110ac nl80211: add EHT MCS support
Add support for reporting and calculating EHT bitrates.

Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
Link: https://lore.kernel.org/r/1640163883-12696-7-git-send-email-quic_vjakkam@quicinc.com
Link: https://lore.kernel.org/r/20220214163009.175289-2-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-16 15:42:52 +01:00
Jia Ding
3743bec612 cfg80211: Add support for EHT 320 MHz channel width
Add 320 MHz support in the channel def and center frequency validation
with compatible check.

Signed-off-by: Jia Ding <quic_jiad@quicinc.com>
Co-authored-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Co-authored-by: Muna Sinada <quic_msinada@quicinc.com>
Signed-off-by: Muna Sinada <quic_msinada@quicinc.com>
Co-authored-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
Link: https://lore.kernel.org/r/1640163883-12696-5-git-send-email-quic_vjakkam@quicinc.com
Link: https://lore.kernel.org/r/20220214163009.175289-1-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-16 15:42:39 +01:00
Ilan Peer
5cd5a8a3e2 cfg80211: Add data structures to capture EHT capabilities
And advertise EHT capabilities to user space when supported.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Link: https://lore.kernel.org/r/20220214173004.6fb70658529f.I2413a37c8f7d2d6d638038a3d95360a3fce0114d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-02-16 15:42:29 +01:00
Catalin Marinas
761b9b366c elf: Introduce the ARM MTE ELF segment type
Memory tags will be dumped in the core file as segments with their own
type. Discussions with the binutils and the generic ABI community
settled on using new definitions in the PT_*PROC space (and to be
documented in the processor-specific ABIs).

Introduce PT_ARM_MEMTAG_MTE as (PT_LOPROC + 0x1). Not included in this
patch since there is no upstream support but the CHERI/BSD community
will also reserve:

  #define PT_ARM_MEMTAG_CHERI    (PT_LOPROC + 0x2)
  #define PT_RISCV_MEMTAG_CHERI  (PT_LOPROC + 0x3)

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Luis Machado <luis.machado@linaro.org>
Link: https://lore.kernel.org/r/20220131165456.2160675-3-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-15 22:53:28 +00:00
Janis Schoetterl-Glausch
5e35d0eb47 KVM: s390: Update api documentation for memop ioctl
Document all currently existing operations, flags and explain under
which circumstances they are available. Document the recently
introduced absolute operations and the storage key protection flag,
as well as the existing SIDA operations.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20220211182215.2730017-10-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-02-14 16:12:57 +01:00
Janis Schoetterl-Glausch
d004079edc KVM: s390: Add capability for storage key extension of MEM_OP IOCTL
Availability of the KVM_CAP_S390_MEM_OP_EXTENSION capability signals that:
* The vcpu MEM_OP IOCTL supports storage key checking.
* The vm MEM_OP IOCTL exists.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Link: https://lore.kernel.org/r/20220211182215.2730017-9-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-02-14 16:12:57 +01:00
Janis Schoetterl-Glausch
ef11c9463a KVM: s390: Add vm IOCTL for key checked guest absolute memory access
Channel I/O honors storage keys and is performed on absolute memory.
For I/O emulation user space therefore needs to be able to do key
checked accesses.
The vm IOCTL supports read/write accesses, as well as checking
if an access would succeed.
Unlike relying on KVM_S390_GET_SKEYS for key checking would,
the vm IOCTL performs the check in lockstep with the read or write,
by, ultimately, mapping the access to move instructions that
support key protection checking with a supplied key.
Fetch and storage protection override are not applicable to absolute
accesses and so are not applied as they are when using the vcpu memop.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Link: https://lore.kernel.org/r/20220211182215.2730017-7-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-02-14 16:12:57 +01:00
Janis Schoetterl-Glausch
e9e9feebcb KVM: s390: Add optional storage key checking to MEMOP IOCTL
User space needs a mechanism to perform key checked accesses when
emulating instructions.

The key can be passed as an additional argument.
Having an additional argument is flexible, as user space can
pass the guest PSW's key, in order to make an access the same way the
CPU would, or pass another key if necessary.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20220211182215.2730017-6-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-02-14 16:12:57 +01:00
Dave Airlie
b9c7babe2c Linux 5.17-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmIJZmoeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGZdoH/04d8zUhM3Fd3ACB
 V/ONtOXmkfP2mEJSjb7cXTN1EM2SlOBdSnSsEw09FtGhjHABjOnLho4J5ixk9TH8
 zNMNI3EMksM2T9KadHwxv8Vvp1LTrWRzMbws8tOCPA0RkOpikJfClC8CzRAyidJ3
 cAbbDH/Jl1GnVZ8bpKmv2auYt+kNVGb0cwJ2W8phCwwkL7sLky5tgYeaGiJEXbJf
 Tfi/3qtFdmYjD8wtYnCfzjnB7suG5nF7rGEnxCIxNi+IA4DieUv2c1KchuoaBfT9
 df364VjKaGT3j+GB07ksQ/8mkwWiRXsCzOXAyMZSZaWjdMD4aAhCTJak5j7/TvGC
 wtgHPww=
 =/CMW
 -----END PGP SIGNATURE-----

Backmerge tag 'v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next

Daniel asked for this for some intel deps, so let's do it now.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-02-14 10:52:27 +10:00
Dave Airlie
123db17ddf Merge tag 'amd-drm-next-5.18-2022-02-11-1' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.18-2022-02-11-1:

amdgpu:
- Clean up of power management code
- Enable freesync video mode by default
- Clean up of RAS code
- Improve VRAM access for debug using SDMA
- Coding style cleanups
- SR-IOV fixes
- More display FP reorg
- TLB flush fixes for Arcuturus, Vega20
- Misc display fixes
- Rework special register access methods for SR-IOV
- DP2 fixes
- DP tunneling fixes
- DSC fixes
- More IP discovery cleanups
- Misc RAS fixes
- Enable both SMU i2c buses where applicable
- s2idle improvements
- DPCS header cleanup
- Add new CAP firmware support for SR-IOV

amdkfd:
- Misc cleanups
- SVM fixes
- CRIU support
- Clean up MQD manager

UAPI:
- Add interface to amdgpu CTX ioctl to request a stable power state for profiling
  https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207
- Add amdkfd support for CRIU
  https://github.com/checkpoint-restore/criu/pull/1709
- Remove old unused amdkfd debugger interface
  Was only implemented for Kaveri and was only ever used by an old HSA tool that was never open sourced

radeon:
- Fix error handling in radeon_driver_open_kms
- UVD suspend fix
- Misc fixes

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220211220706.5803-1-alexander.deucher@amd.com
2022-02-14 10:31:51 +10:00
D. Wythe
f9496b7c1b net/smc: Add global configure for handshake limitation by netlink
Although we can control SMC handshake limitation through socket options,
which means that applications who need it must modify their code. It's
quite troublesome for many existing applications. This patch modifies
the global default value of SMC handshake limitation through netlink,
providing a way to put constraint on handshake without modifies any code
for applications.

Suggested-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-11 11:14:58 +00:00
D. Wythe
a6a6fe27ba net/smc: Dynamic control handshake limitation by socket options
This patch aims to add dynamic control for SMC handshake limitation for
every smc sockets, in production environment, it is possible for the
same applications to handle different service types, and may have
different opinion on SMC handshake limitation.

This patch try socket options to complete it, since we don't have socket
option level for SMC yet, which requires us to implement it at the same
time.

This patch does the following:

- add new socket option level: SOL_SMC.
- add new SMC socket option: SMC_LIMIT_HS.
- provide getter/setter for SMC socket options.

Link: https://lore.kernel.org/all/20f504f961e1a803f85d64229ad84260434203bd.1644323503.git.alibuda@linux.alibaba.com/
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-11 11:14:58 +00:00
Beau Belgrave
7f5a08c79d user_events: Add minimal support for trace_event into ftrace
Minimal support for interacting with dynamic events, trace_event and
ftrace. Core outline of flow between user process, ioctl and trace_event
APIs.

User mode processes that wish to use trace events to get data into
ftrace, perf, eBPF, etc are limited to uprobes today. The user events
features enables an ABI for user mode processes to create and write to
trace events that are isolated from kernel level trace events. This
enables a faster path for tracing from user mode data as well as opens
managed code to participate in trace events, where stub locations are
dynamic.

User processes often want to trace only when it's useful. To enable this
a set of pages are mapped into the user process space that indicate the
current state of the user events that have been registered. User
processes can check if their event is hooked to a trace/probe, and if it
is, emit the event data out via the write() syscall.

Two new files are introduced into tracefs to accomplish this:
user_events_status - This file is mmap'd into participating user mode
processes to indicate event status.

user_events_data - This file is opened and register/delete ioctl's are
issued to create/open/delete trace events that can be used for tracing.

The typical scenario is on process start to mmap user_events_status. Processes
then register the events they plan to use via the REG ioctl. The ioctl reads
and updates the passed in user_reg struct. The status_index of the struct is
used to know the byte in the status page to check for that event. The
write_index of the struct is used to describe that event when writing out to
the fd that was used for the ioctl call. The data must always include this
index first when writing out data for an event. Data can be written either by
write() or by writev().

For example, in memory:
int index;
char data[];

Psuedo code example of typical usage:
struct user_reg reg;

int page_fd = open("user_events_status", O_RDWR);
char *page_data = mmap(NULL, PAGE_SIZE, PROT_READ, MAP_SHARED, page_fd, 0);
close(page_fd);

int data_fd = open("user_events_data", O_RDWR);

reg.size = sizeof(reg);
reg.name_args = (__u64)"test";

ioctl(data_fd, DIAG_IOCSREG, &reg);
int status_id = reg.status_index;
int write_id = reg.write_index;

struct iovec io[2];
io[0].iov_base = &write_id;
io[0].iov_len = sizeof(write_id);
io[1].iov_base = payload;
io[1].iov_len = sizeof(payload);

if (page_data[status_id])
	writev(data_fd, io, 2);

User events are also exposed via the dynamic_events tracefs file for
both create and delete. Current status is exposed via the user_events_status
tracefs file.

Simple example to register a user event via dynamic_events:
	echo u:test >> dynamic_events
	cat dynamic_events
	u:test

If an event is hooked to a probe, the probe hooked shows up:
	echo 1 > events/user_events/test/enable
	cat user_events_status
	1:test # Used by ftrace

	Active: 1
	Busy: 1
	Max: 4096

If an event is not hooked to a probe, no probe status shows up:
	echo 0 > events/user_events/test/enable
	cat user_events_status
	1:test

	Active: 1
	Busy: 0
	Max: 4096

Users can describe the trace event format via the following format:
	name[:FLAG1[,FLAG2...] [field1[;field2...]]

Each field has the following format:
	type name

Example for char array with a size of 20 named msg:
	echo 'u:detailed char[20] msg' >> dynamic_events
	cat dynamic_events
	u:detailed char[20] msg

Data offsets are based on the data written out via write() and will be
updated to reflect the correct offset in the trace_event fields. For dynamic
data it is recommended to use the new __rel_loc data type. This type will be
the same as __data_loc, but the offset is relative to this entry. This allows
user_events to not worry about what common fields are being inserted before
the data.

The above format is valid for both the ioctl and the dynamic_events file.

Link: https://lkml.kernel.org/r/20220118204326.2169-2-beaub@linux.microsoft.com

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-10 22:36:46 -05:00
Jakub Kicinski
5b91c5cc0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-10 17:29:56 -08:00
Linus Torvalds
f1baf68e13 Networking fixes for 5.17-rc4, including fixes from netfilter and can.
Current release - new code bugs:
 
  - sparx5: fix get_stat64 out-of-bound access and crash
 
  - smc: fix netdev ref tracker misuse
 
 Previous releases - regressions:
 
  - eth: ixgbevf: require large buffers for build_skb on 82599VF,
    avoid overflows
 
  - eth: ocelot: fix all IP traffic getting trapped to CPU with PTP
    over IP
 
  - bonding: fix rare link activation misses in 802.3ad mode
 
 Previous releases - always broken:
 
  - tcp: fix tcp sock mem accounting in zero-copy corner cases
 
  - remove the cached dst when uncloning an skb dst and its metadata,
    since we only have one ref it'd lead to an UaF
 
  - netfilter:
    - conntrack: don't refresh sctp entries in closed state
    - conntrack: re-init state for retransmitted syn-ack, avoid
      connection establishment getting stuck with strange stacks
    - ctnetlink: disable helper autoassign, avoid it getting lost
    - nft_payload: don't allow transport header access for fragments
 
  - dsa: fix use of devres for mdio throughout drivers
 
  - eth: amd-xgbe: disable interrupts during pci removal
 
  - eth: dpaa2-eth: unregister netdev before disconnecting the PHY
 
  - eth: ice: fix IPIP and SIT TSO offload
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmIFcW0ACgkQMUZtbf5S
 IrsloQ/+LrzlXgYrWf60DEFT4+AfJ20YxeakB+SpT0cZdvylU6NHF/U4rSHdUJRn
 xZiHfhfKzNWV5miJ2wzOuvufUvoz173dtrdgJJmp6G43qfAFyqiqtxrbVuuTQk0G
 C2i5M66zJ2svSj3EO5dYKV8zeydd14eVeoaJqfN5rjMARTmvpT/ssdzw0LTf0aXp
 87CF/WyeH8NyfQxQwPmbGRxDpnV2RqDJYSNdA4kOtDrnQmoKet32rhE6liHeP2jD
 OFQo70QXEMVyIZEh4wT17lMqA4M1zIEQtgrB0NsmVNU3jQnDI4UQ0aA2gHzwK31S
 glORZWGqTGGrhVy9uQBGKUK29RW8vb0D2ojZpi3zd0htWpIqpfFf6wuMAdUwEZag
 mTlZ1Yi6YgzAQEdSALoXLKps1GiHQzQNYPK6fNxoSOxnmoQTMQL9KxU/HB9d8W+L
 hjuYOGDw9vtGyUpNF7lktCQR/sWjFCmevLk97d2mhofFqfHBJYzFHGyd0GgNVzgl
 o3CMWnyiqTVapLRMTFPnEADEXGWq4DYAjRPkfjDCHeITB9Neg9S85Geth+xwfdi1
 cwSu64xFb7k7hiAbnymB3xg5sBQjXHrIQo03GHVymQE5p6XxIhn+5UD9CYHGJ2TF
 7PGyHeJJOZyWGQ/iSZJbo+BpsVGthIgk+9tTKPeTToej0tic27w=
 =jlhJ
 -----END PGP SIGNATURE-----

Merge tag 'net-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter and can.

Current release - new code bugs:

   - sparx5: fix get_stat64 out-of-bound access and crash

   - smc: fix netdev ref tracker misuse

  Previous releases - regressions:

   - eth: ixgbevf: require large buffers for build_skb on 82599VF, avoid
     overflows

   - eth: ocelot: fix all IP traffic getting trapped to CPU with PTP
     over IP

   - bonding: fix rare link activation misses in 802.3ad mode

  Previous releases - always broken:

   - tcp: fix tcp sock mem accounting in zero-copy corner cases

   - remove the cached dst when uncloning an skb dst and its metadata,
     since we only have one ref it'd lead to an UaF

   - netfilter:
      - conntrack: don't refresh sctp entries in closed state
      - conntrack: re-init state for retransmitted syn-ack, avoid
        connection establishment getting stuck with strange stacks
      - ctnetlink: disable helper autoassign, avoid it getting lost
      - nft_payload: don't allow transport header access for fragments

   - dsa: fix use of devres for mdio throughout drivers

   - eth: amd-xgbe: disable interrupts during pci removal

   - eth: dpaa2-eth: unregister netdev before disconnecting the PHY

   - eth: ice: fix IPIP and SIT TSO offload"

* tag 'net-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (53 commits)
  net: dsa: mv88e6xxx: fix use-after-free in mv88e6xxx_mdios_unregister
  net: mscc: ocelot: fix mutex lock error during ethtool stats read
  ice: Avoid RTNL lock when re-creating auxiliary device
  ice: Fix KASAN error in LAG NETDEV_UNREGISTER handler
  ice: fix IPIP and SIT TSO offload
  ice: fix an error code in ice_cfg_phy_fec()
  net: mpls: Fix GCC 12 warning
  dpaa2-eth: unregister the netdev before disconnecting from the PHY
  skbuff: cleanup double word in comment
  net: macb: Align the dma and coherent dma masks
  mptcp: netlink: process IPv6 addrs in creating listening sockets
  selftests: mptcp: add missing join check
  net: usb: qmi_wwan: Add support for Dell DW5829e
  vlan: move dev_put into vlan_dev_uninit
  vlan: introduce vlan_dev_free_egress_priority
  ax25: fix UAF bugs of net_device caused by rebinding operation
  net: dsa: fix panic when DSA master device unbinds on shutdown
  net: amd-xgbe: disable interrupts during pci removal
  tipc: rate limit warning for received illegal binding update
  net: mdio: aspeed: Add missing MODULE_DEVICE_TABLE
  ...
2022-02-10 16:01:22 -08:00
Jakub Kicinski
4523082982 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

1) Conntrack sets on CHECKSUM_UNNECESSARY for UDP packet with no checksum,
   from Kevin Mitchell.

2) skb->priority support for nfqueue, from Nicolas Dichtel.

3) Remove conntrack extension register API, from Florian Westphal.

4) Move nat destroy hook to nf_nat_hook instead, to remove
   nf_ct_ext_destroy(), also from Florian.

5) Wrap pptp conntrack NAT hooks into single structure, from Florian Westphal.

6) Support for tcp option set to noop for nf_tables, also from Florian.

7) Do not run x_tables comment match from packet path in nf_tables,
   from Florian Westphal.

8) Replace spinlock by cmpxchg() loop to update missed ct event,
   from Florian Westphal.

9) Wrap cttimeout hooks into single structure, from Florian.

10) Add fast nft_cmp expression for up to 16-bytes.

11) Use cb->ctx to store context in ctnetlink dump, instead of using
    cb->args[], from Florian Westphal.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: ctnetlink: use dump structure instead of raw args
  nfqueue: enable to set skb->priority
  netfilter: nft_cmp: optimize comparison for 16-bytes
  netfilter: cttimeout: use option structure
  netfilter: ecache: don't use nf_conn spinlock
  netfilter: nft_compat: suppress comment match
  netfilter: exthdr: add support for tcp option removal
  netfilter: conntrack: pptp: use single option structure
  netfilter: conntrack: remove extension register api
  netfilter: conntrack: handle ->destroy hook via nat_ops instead
  netfilter: conntrack: move extension sizes into core
  netfilter: conntrack: make all extensions 8-byte alignned
  netfilter: nfqueue: enable to get skb->priority
  netfilter: conntrack: mark UDP zero checksum as CHECKSUM_UNNECESSARY
====================

Link: https://lore.kernel.org/r/20220209133616.165104-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09 21:35:08 -08:00
Jakub Kicinski
1127170d45 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-02-09

We've added 126 non-merge commits during the last 16 day(s) which contain
a total of 201 files changed, 4049 insertions(+), 2215 deletions(-).

The main changes are:

1) Add custom BPF allocator for JITs that pack multiple programs into a huge
   page to reduce iTLB pressure, from Song Liu.

2) Add __user tagging support in vmlinux BTF and utilize it from BPF
   verifier when generating loads, from Yonghong Song.

3) Add per-socket fast path check guarding from cgroup/BPF overhead when
   used by only some sockets, from Pavel Begunkov.

4) Continued libbpf deprecation work of APIs/features and removal of their
   usage from samples, selftests, libbpf & bpftool, from Andrii Nakryiko
   and various others.

5) Improve BPF instruction set documentation by adding byte swap
   instructions and cleaning up load/store section, from Christoph Hellwig.

6) Switch BPF preload infra to light skeleton and remove libbpf dependency
   from it, from Alexei Starovoitov.

7) Fix architecture-agnostic macros in libbpf for accessing syscall
   arguments from BPF progs for non-x86 architectures,
   from Ilya Leoshkevich.

8) Rework port members in struct bpf_sk_lookup and struct bpf_sock to be
   of 16-bit field with anonymous zero padding, from Jakub Sitnicki.

9) Add new bpf_copy_from_user_task() helper to read memory from a different
   task than current. Add ability to create sleepable BPF iterator progs,
   from Kenny Yu.

10) Implement XSK batching for ice's zero-copy driver used by AF_XDP and
    utilize TX batching API from XSK buffer pool, from Maciej Fijalkowski.

11) Generate temporary netns names for BPF selftests to avoid naming
    collisions, from Hangbin Liu.

12) Implement bpf_core_types_are_compat() with limited recursion for
    in-kernel usage, from Matteo Croce.

13) Simplify pahole version detection and finally enable CONFIG_DEBUG_INFO_DWARF5
    to be selected with CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor.

14) Misc minor fixes to libbpf and selftests from various folks.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (126 commits)
  selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup
  bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide
  libbpf: Fix compilation warning due to mismatched printf format
  selftests/bpf: Test BPF_KPROBE_SYSCALL macro
  libbpf: Add BPF_KPROBE_SYSCALL macro
  libbpf: Fix accessing the first syscall argument on s390
  libbpf: Fix accessing the first syscall argument on arm64
  libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL
  selftests/bpf: Skip test_bpf_syscall_macro's syscall_arg1 on arm64 and s390
  libbpf: Fix accessing syscall arguments on riscv
  libbpf: Fix riscv register names
  libbpf: Fix accessing syscall arguments on powerpc
  selftests/bpf: Use PT_REGS_SYSCALL_REGS in bpf_syscall_macro
  libbpf: Add PT_REGS_SYSCALL_REGS macro
  selftests/bpf: Fix an endianness issue in bpf_syscall_macro test
  bpf: Fix bpf_prog_pack build HPAGE_PMD_SIZE
  bpf: Fix leftover header->pages in sparc and powerpc code.
  libbpf: Fix signedness bug in btf_dump_array_data()
  selftests/bpf: Do not export subtest as standalone test
  bpf, x86_64: Fail gracefully on bpf_jit_binary_pack_finalize failures
  ...
====================

Link: https://lore.kernel.org/r/20220209210050.8425-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09 18:40:56 -08:00
Menglong Dong
5cad527d5f net: drop_monitor: support drop reason
In the commit c504e5c2f9 ("net: skb: introduce kfree_skb_reason()")
drop reason is introduced to the tracepoint of kfree_skb. Therefore,
drop_monitor is able to report the drop reason to users by netlink.

The drop reasons are reported as string to users, which is exactly
the same as what we do when reporting it to ftrace.

Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220209060838.55513-1-imagedong@tencent.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09 17:25:57 -08:00
Mukul Joshi
5bdd3eb253 drm/amdkfd: Remove unused old debugger implementation
Cleanup the kfd code by removing the unused old debugger
implementation.
The address watch was only ever implemented in the upstream
driver for GFXv7 (Kaveri). The user mode tools runtime using
this API was never open-sourced. Work on the old debugger
prototype that used this API has been discontinued years ago.
Only a small piece of resetting wavefronts is kept and
is moved to kfd_device_queue_manager.c.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-09 16:57:51 -05:00
Jakub Sitnicki
9a69e2b385 bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide
remote_port is another case of a BPF context field documented as a 32-bit
value in network byte order for which the BPF context access converter
generates a load of a zero-padded 16-bit integer in network byte order.

First such case was dst_port in bpf_sock which got addressed in commit
4421a58271 ("bpf: Make dst_port field in struct bpf_sock 16-bit wide").

Loading 4-bytes from the remote_port offset and converting the value with
bpf_ntohl() leads to surprising results, as the expected value is shifted
by 16 bits.

Reduce the confusion by splitting the field in two - a 16-bit field holding
a big-endian integer, and a 16-bit zero-padding anonymous field that
follows it.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220209184333.654927-2-jakub@cloudflare.com
2022-02-09 11:40:45 -08:00
Matt Johnston
63ed1aab3d mctp: Add SIOCMCTP{ALLOC,DROP}TAG ioctls for tag control
This change adds a couple of new ioctls for mctp sockets:
SIOCMCTPALLOCTAG and SIOCMCTPDROPTAG.  These ioctls provide facilities
for explicit allocation / release of tags, overriding the automatic
allocate-on-send/release-on-reply and timeout behaviours. This allows
userspace more control over messages that may not fit a simple
request/response model.

In order to indicate a pre-allocated tag to the sendmsg() syscall, we
introduce a new flag to the struct sockaddr_mctp.smctp_tag value:
MCTP_TAG_PREALLOC.

Additional changes from Jeremy Kerr <jk@codeconstruct.com.au>.

Contains a fix that was:
Reported-by: kernel test robot <lkp@intel.com>

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-09 12:00:11 +00:00
Leon Romanovsky
7c76ecd9c9 xfrm: enforce validity of offload input flags
struct xfrm_user_offload has flags variable that received user input,
but kernel didn't check if valid bits were provided. It caused a situation
where not sanitized input was forwarded directly to the drivers.

For example, XFRM_OFFLOAD_IPV6 define that was exposed, was used by
strongswan, but not implemented in the kernel at all.

As a solution, check and sanitize input flags to forward
XFRM_OFFLOAD_INBOUND to the drivers.

Fixes: d77e38e612 ("xfrm: Add an IPsec hardware offloading API")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-02-09 09:00:40 +01:00
Rajneesh Bhardwaj
692996f2be drm/amdkfd: Bump up KFD API version for CRIU
- Change KFD minor version to 7 for CRIU

Proposed userspace changes:
https://github.com/RadeonOpenCompute/criu

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-07 17:59:53 -05:00
Rajneesh Bhardwaj
3698807094 drm/amdkfd: CRIU Introduce Checkpoint-Restore APIs
Checkpoint-Restore in userspace (CRIU) is a powerful tool that can
snapshot a running process and later restore it on same or a remote
machine but expects the processes that have a device file (e.g. GPU)
associated with them, provide necessary driver support to assist CRIU
and its extensible plugin interface. Thus, In order to support the
Checkpoint-Restore of any ROCm process, the AMD Radeon Open Compute
Kernel driver, needs to provide a set of new APIs that provide
necessary VRAM metadata and its contents to a userspace component
(CRIU plugin) that can store it in form of image files.

This introduces some new ioctls which will be used to checkpoint-Restore
any KFD bound user process. KFD only allows ioctl calls from the same
process that opened the KFD file descriptor. Since these ioctls are
expected to be called from a KFD criu plugin which has elevated ptrace
attached privileges and CAP_CHECKPOINT_RESTORE capabilities attached with
the file descriptors so modify KFD to allow such calls.

(API redesigned by David Yat Sin)
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-07 17:59:03 -05:00
Linus Torvalds
c3bf8a1440 perf/urgent contains 3 fixups:
- Intel/PT: filters could crash the kernel
 
  - Intel: default disable the PMU for SMM, some new-ish EFI firmware has
    started using CPL3 and the PMU CPL filters don't discriminate against
    SMM, meaning that CPL3 (userspace only) events now also count EFI/SMM
    cycles.
 
  - Fixup for perf_event_attr::sig_data
 
 (Peter Zijlstra)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmH/vpgACgkQEsHwGGHe
 VUqIGw/9EWg7Ek89BG9ZZui8EEDAzx3x0s/tyxiz0z18YfvtTnex5I87uJUYpw2s
 hFhxxmGN+rwhcMGQDc0sDLLLxp170Yg0383N6OBBBMWPtNyxMWihBOHQgz8hQzbW
 KtwoiBewmvAycHw0aoOtDMqFZTn5RToONnG9h7yV9rUIGKq75XNh72MBy9sCSE2F
 w8lA3WWVTrv91YTPSMbsrm/tMC6eQCRiJGRMHTapxrWxkVu/H8O42pxJgS6dlo+h
 vw025hXcf0KGBLzwVSHYdZg8jMn7uD2oSMh+wQ+Jy15XjKVWDfF1m3sA5S+zSJsS
 THHtmqni5mF5xn0H7eOK9nYmRXR013zx6weo9miK4SN1pcoJq+PTNdSZOIwBm3Nh
 eUXR/bXFYL0GGuPOk0QHA9AjqbCBPrkiw1nfppbJem2rrZ0uKHyKa8REVcVg/Xzy
 e/nDy8I2y2bnwU9Ugk9BNWBRmn54Q2kb4/egmtLME6oYiqOXumQ4ZB/CmwRaSwxG
 bB99/tBKblrWSA6wcgATkqYFSg4ZJniDxKipnrEYX8ePkGODKHoIQS4EUyjxuPW/
 fO2G4Oe8aO/qYS/yei8XcubyEFaSPyUo+th+ZiPODCt15JKzQCAxeOYxqnEI4I4s
 5afDBmAo47bs9Eem7GRjZOgrDOP88+lISZ1rZidp5paDwWAmL2E=
 =0tH5
 -----END PGP SIGNATURE-----

Merge tag 'perf_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

 - Intel/PT: filters could crash the kernel

 - Intel: default disable the PMU for SMM, some new-ish EFI firmware has
   started using CPL3 and the PMU CPL filters don't discriminate against
   SMM, meaning that CPL3 (userspace only) events now also count EFI/SMM
   cycles.

 - Fixup for perf_event_attr::sig_data

* tag 'perf_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/pt: Fix crash with stop filters in single-range mode
  perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures
  selftests/perf_events: Test modification of perf_event_attr::sig_data
  perf: Copy perf_event_attr::sig_data on modification
  x86/perf: Default set FREEZE_ON_SMI for all
2022-02-06 10:11:14 -08:00
Linus Torvalds
5fdb26213f ARM:
* A couple of fixes when handling an exception while a SError has been delivered
 
 * Workaround for Cortex-A510's single-step erratum
 
 RISCV:
 
 * Make CY, TM, and IR counters accessible in VU mode
 
 * Fix SBI implementation version
 
 x86:
 
 * Report deprecation of x87 features in supported CPUID
 
 * Preparation for fixing an interrupt delivery race on AMD hardware
 
 * Sparse fix
 
 All except POWER and s390:
 
 * Rework guest entry code to correctly mark noinstr areas and fix vtime'
   accounting (for x86, this was already mostly correct but not entirely;
   for ARM, MIPS and RISC-V it wasn't)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmH+E4AUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNujwf+ON/8pBWyMPdjiY5l5SyNLpRup8Su
 zkQoMEDICI7khYUz2bEAjOazFWHmHPsdogAlG82QeJCbFmCqyMb6iX0uWj53BdGP
 P2bOM/tXbulvKBBeiTritkUUNO+hBmmSF+57AOJSW+Enhc7HFwk54cuft6f30r+d
 JRaEOhPOP34hQ+wFQQhZZh72BaZBqgnrYwZDp1TiC0Wh8v7P4Nf+NFtEgba2nsxC
 xfz5PrEhvegtU8Ee9JAF2bAl7851WJq557P2cOpghtUMgh4t6GzCcUOCKIie67oQ
 0Vaf+OieAopdT+QNazSEWO9zxl7eTpWjk2hrwsDrgKHAL/YmuWJuSyEYIA==
 =C0LZ
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - A couple of fixes when handling an exception while a SError has
     been delivered

   - Workaround for Cortex-A510's single-step erratum

  RISC-V:

   - Make CY, TM, and IR counters accessible in VU mode

   - Fix SBI implementation version

  x86:

   - Report deprecation of x87 features in supported CPUID

   - Preparation for fixing an interrupt delivery race on AMD hardware

   - Sparse fix

  All except POWER and s390:

   - Rework guest entry code to correctly mark noinstr areas and fix
     vtime' accounting (for x86, this was already mostly correct but not
     entirely; for ARM, MIPS and RISC-V it wasn't)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Use ERR_PTR_USR() to return -EFAULT as a __user pointer
  KVM: x86: Report deprecated x87 features in supported CPUID
  KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata
  KVM: arm64: Stop handle_exit() from handling HVC twice when an SError occurs
  KVM: arm64: Avoid consuming a stale esr value when SError occur
  RISC-V: KVM: Fix SBI implementation version
  RISC-V: KVM: make CY, TM, and IR counters accessible in VU mode
  kvm/riscv: rework guest entry logic
  kvm/arm64: rework guest entry logic
  kvm/x86: rework guest entry logic
  kvm/mips: rework guest entry logic
  kvm: add guest_state_{enter,exit}_irqoff()
  KVM: x86: Move delivery of non-APICv interrupt into vendor code
  kvm: Move KVM_GET_XSAVE2 IOCTL definition at the end of kvm.h
2022-02-05 09:55:59 -08:00
Paolo Bonzini
7e6a6b400d KVM/arm64 fixes for 5.17, take #2
- A couple of fixes when handling an exception while a SError has been
   delivered
 
 - Workaround for Cortex-A510's single-step[ erratum
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmH9LlcPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDLTcP/3Ry8CzvPubZquMyNdRUFvEg2EcfTa6vtIGW
 Fw7ap2hwPUaXUgJKDihMFIWj3Wf/wPmXw4t2Sr8R/yq8v9kWe+IG1isnT0yQhY3W
 kLXEqc8Mu4Rf8+jvlFHsp5mLENHIswpWAv/EY49ChgZkNmtkKpnPm1qnD89d8bNv
 tUwooDWidQ/7nXdM3z6zygSROJS24+OGTYTWzOQ1KgV3FGaXbqYiCleoPOpRR/Tc
 DQQWF/tVl8bZCqgkGKZCv3aXT0ZUPrQggARJGai78vP0l2sE/Kyaydgq5I7npZja
 2L2U4kDNoPYIVa8A1jvV3Ef3AqNFs6B7+jXWfYIgAcXjCYzDK3cZcxavf/Inq9F1
 3udVGJGSzH1KkGaihW3BVhsqGORRHKCdksJzWRgqf6vGyJhJw0u0D2u1rTWcT+jw
 Nm4KxShp0CX59HSLnVF5sR0Mct3jNNZ7UCCgH7q10wuBqYRfJT32hCo2ZrT7g9oD
 IQ+pa2dVYa3SaKZ4O6T/lSlbLOuuxtvmcEIfxYpPD6m10S5RrxOdsW3MCtiYM5HQ
 24oo2mk6NIu/va0XxhcW+NBMcYtLQD9JUGbkUkpcRy2mgilTi9b4YPp+muYM7plQ
 /S1gj2kGY8vjMg0H+wysjMJyl2huEwSRsZ/UfxCAgW+MYhHLDxhxAnDWc8EcwGgE
 tUzomowB
 =Mbx/
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 5.17, take #2

- A couple of fixes when handling an exception while a SError has been
  delivered

- Workaround for Cortex-A510's single-step[ erratum
2022-02-05 00:58:25 -05:00
Justin Iurman
be847673cf uapi: ioam: Insertion frequency
Add the insertion frequency uapi for IOAM lwtunnels.

Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-04 20:24:45 -08:00
Nicolas Dichtel
8b54136472 netfilter: nfqueue: enable to get skb->priority
This info could be useful to improve traffic analysis.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-04 06:30:27 +01:00
Florian Westphal
d1ca60efc5 netfilter: ctnetlink: disable helper autoassign
When userspace, e.g. conntrackd, inserts an entry with a specified helper,
its possible that the helper is lost immediately after its added:

ctnetlink_create_conntrack
  -> nf_ct_helper_ext_add + assign helper
    -> ctnetlink_setup_nat
      -> ctnetlink_parse_nat_setup
         -> parse_nat_setup -> nfnetlink_parse_nat_setup
	                       -> nf_nat_setup_info
                                 -> nf_conntrack_alter_reply
                                   -> __nf_ct_try_assign_helper

... and __nf_ct_try_assign_helper will zero the helper again.

Set IPS_HELPER bit to bypass auto-assign logic, its unwanted, just like
when helper is assigned via ruleset.

Dropped old 'not strictly necessary' comment, it referred to use of
rcu_assign_pointer() before it got replaced by RCU_INIT_POINTER().

NB: Fixes tag intentionally incorrect, this extends the referenced commit,
but this change won't build without IPS_HELPER introduced there.

Fixes: 6714cf5465 ("netfilter: nf_conntrack: fix explicit helper attachment and NAT")
Reported-by: Pham Thanh Tuyen <phamtyn@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-04 05:39:57 +01:00
Jakub Kicinski
c59400a68c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-03 17:36:16 -08:00
Linus Torvalds
eb2eb5161c Networking fixes for 5.17-rc3, including fixes from bpf, netfilter,
and ieee802154.
 
 Current release - regressions:
 
  - Partially revert "net/smc: Add netlink net namespace support",
    fix uABI breakage
 
  - netfilter:
      - nft_ct: fix use after free when attaching zone template
      - nft_byteorder: track register operations
 
 Previous releases - regressions:
 
  - ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback
 
  - phy: qca8081: fix speeds lower than 2.5Gb/s
 
  - sched: fix use-after-free in tc_new_tfilter()
 
 Previous releases - always broken:
 
  - tcp: fix mem under-charging with zerocopy sendmsg()
 
  - tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data()
 
  - neigh: do not trigger immediate probes on NUD_FAILED from
    neigh_managed_work, avoid a deadlock
 
  - bpf: use VM_MAP instead of VM_ALLOC for ringbuf, avoid KASAN
    false-positives
 
  - netfilter: nft_reject_bridge: fix for missing reply from prerouting
 
  - smc: forward wakeup to smc socket waitqueue after fallback
 
  - ieee802154:
      - return meaningful error codes from the netlink helpers
      - mcr20a: fix lifs/sifs periods
      - at86rf230, ca8210: stop leaking skbs on error paths
 
  - macsec: add missing un-offload call for NETDEV_UNREGISTER of parent
 
  - ax25: add refcount in ax25_dev to avoid UAF bugs
 
  - eth: mlx5e:
      - fix SFP module EEPROM query
      - fix broken SKB allocation in HW-GRO
      - IPsec offload: fix tunnel mode crypto for non-TCP/UDP flows
 
  - eth: amd-xgbe:
      - fix skb data length underflow
      - ensure reset of the tx_timer_active flag, avoid Tx timeouts
 
  - eth: stmmac: fix runtime pm use in stmmac_dvr_remove()
 
  - eth: e1000e: handshake with CSME starts from Alder Lake platforms
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmH8X9UACgkQMUZtbf5S
 IrsxuhAAlAvFHGL6y5Y2gAmhKvVUvCYjiIJBcvk7R66CwYVRxofvlhmxi6GM/Czs
 9SrVSaN4RXu3p3d7UtAl1gAQwHqzLIHH3m2g5dSKVvHZWQgkm/+n74x0aZQ9Fll7
 mWs9uu5fWsQr/qZBnnjoQTvUxRUNVd4trBy7nXGzkNqJL5j0+2TT4BhH4qalhE28
 iPc9YFCyKPdjoWFksteZqD3hAQbXxK/xRRr6xuvFHENlZdEHM6ARftHnJthTG/fY
 32rdn9YUkQ9lNtOBJNMN9yP2z1B7TcxASBqjjk55I7XtT1QAI9/PskszavHC0hOk
 BCSMX779bLNW4+G0wiSKVB4tq4tvswtawq8Hxa6zdU4TKIzfQ84ZL/Nf66GtH+4W
 C0mbZohmyJV9hQFkNT0ZLeihljd7i4BkDttlbK3uz2IL9tHeX3uSo5V7AgS/Xaf6
 frXgbGgjQTaR6IL9AUhfN3GTCx60mzpH/aRpFho8A5xAl3EtHWCJcRhbY/CEhQBR
 zyCndcLcG5mUzbhx/TxlKrrpRCLxqCUG/Tsb2wCh5jMxO1zonW9Hhv4P1ie6EFuI
 h+XiJT2WWObS/KTze9S86WOR0zcqrtRqaOGJlNB+/+K8ClZU8UsDTFXLQ0dqpVZF
 Mvp7VchBzyFFJrrvO8WkkJgLTKdaPJmM9wuWUZb4J6d2MWlmDkE=
 =qKvf
 -----END PGP SIGNATURE-----

Merge tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf, netfilter, and ieee802154.

  Current release - regressions:

   - Partially revert "net/smc: Add netlink net namespace support", fix
     uABI breakage

   - netfilter:
      - nft_ct: fix use after free when attaching zone template
      - nft_byteorder: track register operations

  Previous releases - regressions:

   - ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback

   - phy: qca8081: fix speeds lower than 2.5Gb/s

   - sched: fix use-after-free in tc_new_tfilter()

  Previous releases - always broken:

   - tcp: fix mem under-charging with zerocopy sendmsg()

   - tcp: add missing tcp_skb_can_collapse() test in
     tcp_shift_skb_data()

   - neigh: do not trigger immediate probes on NUD_FAILED from
     neigh_managed_work, avoid a deadlock

   - bpf: use VM_MAP instead of VM_ALLOC for ringbuf, avoid KASAN
     false-positives

   - netfilter: nft_reject_bridge: fix for missing reply from prerouting

   - smc: forward wakeup to smc socket waitqueue after fallback

   - ieee802154:
      - return meaningful error codes from the netlink helpers
      - mcr20a: fix lifs/sifs periods
      - at86rf230, ca8210: stop leaking skbs on error paths

   - macsec: add missing un-offload call for NETDEV_UNREGISTER of parent

   - ax25: add refcount in ax25_dev to avoid UAF bugs

   - eth: mlx5e:
      - fix SFP module EEPROM query
      - fix broken SKB allocation in HW-GRO
      - IPsec offload: fix tunnel mode crypto for non-TCP/UDP flows

   - eth: amd-xgbe:
      - fix skb data length underflow
      - ensure reset of the tx_timer_active flag, avoid Tx timeouts

   - eth: stmmac: fix runtime pm use in stmmac_dvr_remove()

   - eth: e1000e: handshake with CSME starts from Alder Lake platforms"

* tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
  ax25: fix reference count leaks of ax25_dev
  net: stmmac: ensure PTP time register reads are consistent
  net: ipa: request IPA register values be retained
  dt-bindings: net: qcom,ipa: add optional qcom,qmp property
  tools/resolve_btfids: Do not print any commands when building silently
  bpf: Use VM_MAP instead of VM_ALLOC for ringbuf
  net, neigh: Do not trigger immediate probes on NUD_FAILED from neigh_managed_work
  tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data()
  net: sparx5: do not refer to skb after passing it on
  Partially revert "net/smc: Add netlink net namespace support"
  net/mlx5e: Avoid field-overflowing memcpy()
  net/mlx5e: Use struct_group() for memcpy() region
  net/mlx5e: Avoid implicit modify hdr for decap drop rule
  net/mlx5e: IPsec: Fix tunnel mode crypto offload for non TCP/UDP traffic
  net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated traffic
  net/mlx5e: Don't treat small ceil values as unlimited in HTB offload
  net/mlx5: E-Switch, Fix uninitialized variable modact
  net/mlx5e: Fix handling of wrong devices during bond netevent
  net/mlx5e: Fix broken SKB allocation in HW-GRO
  net/mlx5e: Fix wrong calculation of header index in HW_GRO
  ...
2022-02-03 16:54:18 -08:00
Srinivas Pandruvada
e4b1eb24ce thermal: netlink: Add a new event to notify CPU capabilities change
Add a new netlink event to notify change in CPU capabilities in terms of
performance and efficiency.

Firmware may change CPU capabilities as a result of thermal events in the
system or to account for changes in the TDP (thermal design power) level.

This notification type will allow user space to avoid running workloads
on certain CPUs or proactively adjust power limits to avoid future events.

The netlink message consists of a nested attribute
(THERMAL_GENL_ATTR_CPU_CAPABILITY) with three attributes:

 * THERMAL_GENL_ATTR_CPU_CAPABILITY_ID (type u32):
   -- logical CPU number
 * THERMAL_GENL_ATTR_CPU_CAPABILITY_PERFORMANCE (type u32):
   -- Scaled performance from 0-1023
 * THERMAL_GENL_ATTR_CPU_CAPABILITY_EFFICIENCY (type u32):
   -- Scaled efficiency from 0-1023

Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-02-03 19:50:49 +01:00
Dmitry V. Levin
c86d86131a Partially revert "net/smc: Add netlink net namespace support"
The change of sizeof(struct smc_diag_linkinfo) by commit 79d39fc503
("net/smc: Add netlink net namespace support") introduced an ABI
regression: since struct smc_diag_lgrinfo contains an object of
type "struct smc_diag_linkinfo", offset of all subsequent members
of struct smc_diag_lgrinfo was changed by that change.

As result, applications compiled with the old version
of struct smc_diag_linkinfo will receive garbage in
struct smc_diag_lgrinfo.role if the kernel implements
this new version of struct smc_diag_linkinfo.

Fix this regression by reverting the part of commit 79d39fc503 that
changes struct smc_diag_linkinfo.  After all, there is SMC_GEN_NETLINK
interface which is good enough, so there is probably no need to touch
the smc_diag ABI in the first place.

Fixes: 79d39fc503 ("net/smc: Add netlink net namespace support")
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/20220202030904.GA9742@altlinux.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-02 07:42:41 -08:00
Marco Elver
ddecd22878 perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures
Due to the alignment requirements of siginfo_t, as described in
3ddb3fd8cd ("signal, perf: Fix siginfo_t by avoiding u64 on 32-bit
architectures"), siginfo_t::si_perf_data is limited to an unsigned long.

However, perf_event_attr::sig_data is an u64, to avoid having to deal
with compat conversions. Due to being an u64, it may not immediately be
clear to users that sig_data is truncated on 32 bit architectures.

Add a comment to explicitly point this out, and hopefully help some
users save time by not having to deduce themselves what's happening.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://lore.kernel.org/r/20220131103407.1971678-3-elver@google.com
2022-02-02 13:11:40 +01:00