Commit graph

1059881 commits

Author SHA1 Message Date
David S. Miller
c6e03dbe0c Merge branch 'mana-misc'
Dexuan Cui says:

====================
net: mana: some misc patches

Patch 1 is a small fix.

Patch 2 reports OS info to the PF driver.
Before the patch, the req fields were all zeros.

Patch 3 fixes and cleans up the error handling of HWC creation failure.

Patch 4 adds the callbacks for hibernation/kexec. It's based on patch 3.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:21:49 +00:00
Dexuan Cui
635096a86e net: mana: Support hibernation and kexec
Implement the suspend/resume/shutdown callbacks for hibernation/kexec.

Add mana_gd_setup() and mana_gd_cleanup() for some common code, and
use them in the mand_gd_* callbacks.

Reuse mana_probe/remove() for the hibernation path.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:21:49 +00:00
Dexuan Cui
62ea8b77ed net: mana: Improve the HWC error handling
Currently when the HWC creation fails, the error handling is flawed,
e.g. if mana_hwc_create_channel() -> mana_hwc_establish_channel() fails,
the resources acquired in mana_hwc_init_queues() is not released.

Enhance mana_hwc_destroy_channel() to do the proper cleanup work and
call it accordingly.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:21:49 +00:00
Dexuan Cui
3c37f35735 net: mana: Report OS info to the PF driver
The PF driver might use the OS info for statistical purposes.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:21:49 +00:00
Dexuan Cui
6c7ea69653 net: mana: Fix the netdev_err()'s vPort argument in mana_init_port()
Use the correct port index rather than 0.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:21:49 +00:00
David S. Miller
986d2e3da7 Merge branch 'mptcp-selftests'
Mat Martineau says:

====================
mptcp: Some selftest improvements

Here are a couple of selftest changes for MPTCP.

Patch 1 fixes a mistake where the wrong protocol (TCP vs MPTCP) could be
requested on the listening socket in some link failure tests.

Patch 2 refactors the simulataneous flow tests to improve timing
accuracy and give more consistent results.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:19:49 +00:00
Paolo Abeni
b6ab64b074 selftests: mptcp: more stable simult_flows tests
Currently the simult_flows.sh self-tests are not very stable,
especially when running on slow VMs.

The tests measure runtime for transfers on multiple subflows
and check that the time is near the theoretical maximum.

The current test infra introduces a bit of jitter in test
runtime, due to multiple explicit delays. Additionally the
runtime is measured by the shell script wrapper. On a slow
VM, the script overhead is measurable and subject to relevant
jitter.

One solution to make the test more stable would be adding more
slack to the expected time; that could possibly hide real
regressions. Instead move the measurement inside the command
doing the transfer, and drop most unneeded sleeps.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:19:49 +00:00
Geliang Tang
7c909a9804 selftests: mptcp: fix proto type in link_failure tests
In listener_ns, we should pass srv_proto argument to mptcp_connect command,
not cl_proto.

Fixes: 7d1e6f1639 ("selftests: mptcp: add testcase for active-back")
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:19:49 +00:00
Sukadev Bhattiprolu
6b278c0cb3 ibmvnic: delay complete()
If we get CRQ_INIT, we set errno to -EIO and first call complete() to
notify the waiter. Then we try to schedule a FAILOVER reset. If this
occurs while adapter is in PROBING state, ibmvnic_reset() changes the
error code to EAGAIN and returns without scheduling the FAILOVER. The
purpose of setting error code to EAGAIN is to ask the waiter to retry.

But due to the earlier complete() call, the waiter may already have seen
the -EIO response and decided not to retry. This can cause intermittent
failures when bringing up ibmvnic adapters during boot, specially in
in kexec/kdump kernels.

Defer the complete() call until after scheduling the reset.

Also streamline the error code to EAGAIN. Don't see why we need EIO
sometimes. All 3 callers of ibmvnic_reset_init() can handle EAGAIN.

Fixes: 17c8705838 ("ibmvnic: Return error code if init interrupted by transport event")
Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:14:52 +00:00
Sukadev Bhattiprolu
6e20d00158 ibmvnic: Process crqs after enabling interrupts
Soon after registering a CRQ it is possible that we get a fail over or
maybe a CRQ_INIT from the VIOS while interrupts were disabled.

Look for any such CRQs after enabling interrupts.

Otherwise we can intermittently fail to bring up ibmvnic adapters during
boot, specially in kexec/kdump kernels.

Fixes: 032c5e8284 ("Driver for IBM System i/p VNIC protocol")
Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:14:52 +00:00
Sukadev Bhattiprolu
8878e46fcf ibmvnic: don't stop queue in xmit
If adapter's resetting bit is on, discard the packet but don't stop the
transmit queue - instead leave that to the reset code. With this change,
it is possible that we may get several calls to ibmvnic_xmit() that simply
discard packets and return.

But if we stop the queue here, we might end up doing so just after
__ibmvnic_open() started the queues (during a hard/soft reset) and before
the ->resetting bit was cleared. If that happens, there will be no one to
restart queue and transmissions will be blocked indefinitely.

This can cause a TIMEOUT reset and with auto priority failover enabled,
an unnecessary FAILOVER reset to less favored backing device and then a
FAILOVER back to the most favored backing device. If we hit the window
repeatedly, we can get stuck in a loop of TIMEOUT, FAILOVER, FAILOVER
resets leaving the adapter unusable for extended periods of time.

Fixes: 7f5b030830 ("ibmvnic: Free skb's in cases of failure in transmit")
Reported-by: Abdul Haleem <abdhalee@in.ibm.com>
Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:14:52 +00:00
David S. Miller
7be49d242b Merge branch 'SO_MARK-routing'
Jakub Kicinski says:

====================
udp6: allow SO_MARK ctrl msg to affect routing

Looks like SO_MARK from cmsg does not affect routing policy.
This seems accidental.

I opted for net because of the discrepancy between IPv4
and IPv6, but it never worked and doesn't cause crashes..
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:12:48 +00:00
Jakub Kicinski
b0ced8f290 selftests: udp: test for passing SO_MARK as cmsg
Before fix:
|  Case IPv6 rejection returned 0, expected 1
|FAIL - 1/4 cases failed

With the fix:
| OK

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:12:48 +00:00
Jakub Kicinski
42dcfd850e udp6: allow SO_MARK ctrl msg to affect routing
Commit c6af0c227a ("ip: support SO_MARK cmsg")
added propagation of SO_MARK from cmsg to skb->mark.
For IPv4 and raw sockets the mark also affects route
lookup, but in case of IPv6 the flow info is
initialized before cmsg is parsed.

Fixes: c6af0c227a ("ip: support SO_MARK cmsg")
Reported-and-tested-by: Xintong Hu <huxintong@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:12:48 +00:00
Yu Xiao
f7536ffb09 nfp: flower: Allow ipv6gretap interface for offloading
The tunnel_type check only allows for "netif_is_gretap", but for
OVS the port is actually "netif_is_ip6gretap" when setting up GRE
for ipv6, which means offloading request was rejected before.

Therefore, adding "netif_is_ip6gretap" allow ipv6gretap interface
for offloading.

Signed-off-by: Yu Xiao <yu.xiao@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:09:55 +00:00
Marek Behún
c07c6e8eb4 net: dsa: populate supported_interfaces member
Add a new DSA switch operation, phylink_get_interfaces, which should
fill in which PHY_INTERFACE_MODE_* are supported by given port.

Use this before phylink_create() to fill phylinks supported_interfaces
member, allowing phylink to determine which PHY_INTERFACE_MODEs are
supported.

Signed-off-by: Marek Behún <kabel@kernel.org>
[tweaked patch and description to add more complete support -- rmk]
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:06:32 +00:00
David S. Miller
ebed1cf5b8 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2021-10-29

This series contains updates to ice and iavf drivers and virtchnl header
file.

Brett removes vlan_promisc argument from a function call for ice driver.
In the virtchnl header file he removes an unused, reserved define and
converts raw value defines to instead use the BIT macro.

Marcin adds syncing of MAC addresses when creating switchdev VFs to
remove error messages on link up and stops showing buffer information
for port representors to remove duplicated entries being displayed for
ice driver.

Karen introduces a helper to go from pci_dev to iavf_adapter in the
iavf driver.

Przemyslaw fixes an issue where iavf was attempting to free IRQs before
calling disable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:05:20 +00:00
David S. Miller
06f1ecd433 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2021-10-30

Just two minor changes this time:

1) Remove some superfluous header files from xfrm4_tunnel.c
   From Mianhan Liu.

2) Simplify some error checks in xfrm_input().
   From luo penghao.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:01:44 +00:00
David S. Miller
894d084434 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Use array_size() in ebtables, from Gustavo A. R. Silva.

2) Attach IPS_ASSURED to internal UDP stream state, reported by
   Maciej Zenczykowski.

3) Add NFT_META_IFTYPE to match on the interface type either
   from ingress or egress.

4) Generalize pktinfo->tprot_set to flags field.

5) Allow to match on inner headers / payload data.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 12:59:58 +00:00
David S. Miller
2aec919f8d mlx5-updates-2021-10-29
1) Minor trivial refactoring and improvements
 2) Check for unsupported parameters fields in SW steering
 3) Support TC offload for OVS internal port, from Ariel, see below.
 
 Ariel Levkovich says:
 
 =====================
 
 Support HW offload of TC rules involving OVS internal port
 device type as the filter device or the destination
 device.
 
 The support is for flows which explicitly use the internal
 port as source or destination device as well as indirect offload
 for flows performing tunnel set or unset via a tunnel device
 and the internal port is the tunnel overlay device.
 
 Since flows with internal port as source port are added
 as egress rules while redirecting to internal port is done
 as an ingress redirect, the series introduces the necessary
 changes in mlx5_core driver to support the new types of flows
 and actions.
 
 =====================
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmF8X0sACgkQSD+KveBX
 +j5PsQf/RfsE+spW0yJriJQ6Et+o+/CYR+AQYat5MaXjRw8uMz6uBcfXWCIBbYjw
 OwNP4ZagWXIHMkelj2Ap0Qlu4yqkUBy1A0le7HcAzOeje1vc9BObS15w9pJvQ9cp
 br3ZK5VZnQccSfF/LQpSjlGhD9083kETA2uXlCz7vitn8MVaya6ue6GU+wFC4Wnz
 LjOJ4PMXCEfhpA+efD0nD4EK6FJjqvJoVQkxWNmgOW7yg5PcyWXZD/tsDZUI8DGl
 0GlnM6W2H8bC0YhW01cnOsWPU+vtMLCsaF0YKqsLhnWUsaYSD5lXPIHqH6VpucZ7
 LSv/c2U9pBnWkf7UoFyuEeQxAhz1rg==
 =Uxa9
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2021-10-29' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-10-29

1) Minor trivial refactoring and improvements
2) Check for unsupported parameters fields in SW steering
3) Support TC offload for OVS internal port, from Ariel, see below.

Ariel Levkovich says:

=====================

Support HW offload of TC rules involving OVS internal port
device type as the filter device or the destination
device.

The support is for flows which explicitly use the internal
port as source or destination device as well as indirect offload
for flows performing tunnel set or unset via a tunnel device
and the internal port is the tunnel overlay device.

Since flows with internal port as source port are added
as egress rules while redirecting to internal port is done
as an ingress redirect, the series introduces the necessary
changes in mlx5_core driver to support the new types of flows
and actions.

=====================

====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 12:53:24 +00:00
Zhang Mingyu
15c72660fe samples: remove duplicate include in fs-monitor.c
'sys/types.h' included in 'samples/fanotify/fs-monitor.c'
is duplicated.It is also included on 15 line.

Link: https://lore.kernel.org/r/20211101075152.35780-1-zhang.mingyu@zte.com.cn
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Zhang Mingyu <zhang.mingyu@zte.com.cn>
Signed-off-by: Jan Kara <jack@suse.cz>
2021-11-01 13:17:47 +01:00
Kamal Heib
4e446714fb RDMA/qed: Use helper function to set GUIDs
Use addrconf_addr_eui48() helper function to set the GUIDs and remove the
driver specific version.

Link: https://lore.kernel.org/r/20211031170743.81755-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-01 09:17:10 -03:00
Bixuan Cui
bbd5ba8db7 RISC-V: KVM: fix boolreturn.cocci warnings
Fix boolreturn.cocci warnings:
./arch/riscv/kvm/mmu.c:603:9-10: WARNING: return of 0/1 in function
'kvm_age_gfn' with return type bool
./arch/riscv/kvm/mmu.c:582:9-10: WARNING: return of 0/1 in function
'kvm_set_spte_gfn' with return type bool
./arch/riscv/kvm/mmu.c:621:9-10: WARNING: return of 0/1 in function
'kvm_test_age_gfn' with return type bool
./arch/riscv/kvm/mmu.c:568:9-10: WARNING: return of 0/1 in function
'kvm_unmap_gfn_range' with return type bool

Signed-off-by: Bixuan Cui <cuibixuan@linux.alibaba.com>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
2021-11-01 17:35:17 +05:30
ran jianping
7b161d9cab RISC-V: KVM: remove unneeded semicolon
Elimate the following coccinelle check warning:
 ./arch/riscv/kvm/vcpu_sbi.c:169:2-3: Unneeded semicolon
 ./arch/riscv/kvm/vcpu_exit.c:397:2-3: Unneeded semicolon
 ./arch/riscv/kvm/vcpu_exit.c:687:2-3: Unneeded semicolon
 ./arch/riscv/kvm/vcpu_exit.c:645:2-3: Unneeded semicolon
 ./arch/riscv/kvm/vcpu.c:247:2-3: Unneeded semicolon
 ./arch/riscv/kvm/vcpu.c:284:2-3: Unneeded semicolon
 ./arch/riscv/kvm/vcpu_timer.c:123:2-3: Unneeded semicolon
 ./arch/riscv/kvm/vcpu_timer.c:170:2-3: Unneeded semicolon

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ran jianping <ran.jianping@zte.com.cn>
Signed-off-by: Anup Patel <anup.patel@wdc.com>
2021-11-01 17:35:13 +05:30
Jan Kara
b7eccf75c2 samples: Fix warning in fsnotify sample
The fsnotify sample code generates the following warning on powerpc:

samples/fanotify/fs-monitor.c: In function 'handle_notifications':
samples/fanotify/fs-monitor.c:68:36: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 2 has type '__u64' {aka 'long unsigned int'} [-Wformat=]
   68 |    printf("unexpected FAN MARK: %llx\n", event->mask);
      |                                 ~~~^     ~~~~~~~~~~~
      |                                    |          |
      |                                    |          __u64 {aka long unsigned int}
      |                                    long long unsigned int
      |                                 %lx

Fix the problem by explicitely typing the argument to proper type.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jan Kara <jack@suse.cz>
2021-11-01 12:48:01 +01:00
Gabriel Krisman Bertazi
9abeae5d44 docs: Fix formatting of literal sections in fanotify docs
Stephen Rothwell reported the following warning was introduced by commit
c0baf9ac0b ("docs: Document the FAN_FS_ERROR event").

Documentation/admin-guide/filesystem-monitoring.rst:60: WARNING:
 Definition list ends without a blank line; unexpected unindent.

Link: https://lore.kernel.org/r/87y26camhe.fsf@collabora.com
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2021-11-01 12:45:06 +01:00
Gabriel Krisman Bertazi
8fc70b3a14 samples: Make fs-monitor depend on libc and headers
Prevent build errors when headers or libc are not available, such as on
kernel build bots, like the below:

samples/fanotify/fs-monitor.c:7:10: fatal error: errno.h: No such file
or directory
  7 | #include <errno.h>
    |          ^~~~~~~~~

Link: https://lore.kernel.org/r/87fsslasgz.fsf@collabora.com
Suggested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2021-11-01 12:39:53 +01:00
Helge Deller
6e866a4628 parisc: Fix set_fixmap() on PA1.x CPUs
Fix a kernel crash which happens on PA1.x CPUs while initializing the
FTRACE/KPROBE breakpoints.  The PTE table entries for the fixmap area
were not created correctly.

Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: ccfbc68d41 ("parisc: add set_fixmap()/clear_fixmap()")
Cc: stable@vger.kernel.org # v5.2+
2021-11-01 12:00:22 +01:00
Yihao Han
1ae8e91e81 parisc: Use swap() to swap values in setup_bootmem()
Signed-off-by: Yihao Han <hanyihao@vivo.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-01 11:59:49 +01:00
Christophe Leroy
c12ab8dbc4 powerpc/8xx: Fix Oops with STRICT_KERNEL_RWX without DEBUG_RODATA_TEST
Until now, all tests involving CONFIG_STRICT_KERNEL_RWX were done with
DEBUG_RODATA_TEST to check the result. But now that
CONFIG_STRICT_KERNEL_RWX is selected by default, it came without
CONFIG_DEBUG_RODATA_TEST and led to the following Oops

[    6.830908] Freeing unused kernel image (initmem) memory: 352K
[    6.840077] BUG: Unable to handle kernel data access on write at 0xc1285200
[    6.846836] Faulting instruction address: 0xc0004b6c
[    6.851745] Oops: Kernel access of bad area, sig: 11 [#1]
[    6.857075] BE PAGE_SIZE=16K PREEMPT CMPC885
[    6.861348] SAF3000 DIE NOTIFICATION
[    6.864830] CPU: 0 PID: 1 Comm: swapper Not tainted 5.15.0-rc5-s3k-dev-02255-g2747d7b7916f #451
[    6.873429] NIP:  c0004b6c LR: c0004b60 CTR: 00000000
[    6.878419] REGS: c902be60 TRAP: 0300   Not tainted  (5.15.0-rc5-s3k-dev-02255-g2747d7b7916f)
[    6.886852] MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 53000335  XER: 8000ff40
[    6.893564] DAR: c1285200 DSISR: 82000000
[    6.893564] GPR00: 0c000000 c902bf20 c20f4000 08000000 00000001 04001f00 c1800000 00000035
[    6.893564] GPR08: ff0001ff c1280000 00000002 c0004b60 00001000 00000000 c0004b1c 00000000
[    6.893564] GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    6.893564] GPR24: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 c1060000
[    6.932034] NIP [c0004b6c] kernel_init+0x50/0x138
[    6.936682] LR [c0004b60] kernel_init+0x44/0x138
[    6.941245] Call Trace:
[    6.943653] [c902bf20] [c0004b60] kernel_init+0x44/0x138 (unreliable)
[    6.950022] [c902bf30] [c001122c] ret_from_kernel_thread+0x5c/0x64
[    6.956135] Instruction dump:
[    6.959060] 48ffc521 48045469 4800d8cd 3d20c086 89295fa0 2c090000 41820058 480796c9
[    6.966890] 4800e48d 3d20c128 39400002 3fe0c106 <91495200> 3bff8000 4806fa1d 481f7d75
[    6.974902] ---[ end trace 1e397bacba4aa610 ]---

0xc1285200 corresponds to 'system_state' global var that the kernel is trying to set to
SYSTEM_RUNNING. This var is above the RO/RW limit so it shouldn't Oops.

It oopses because the dirty bit is missing.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3d5800b0bbcd7b19761b98f50421358667b45331.1635520232.git.christophe.leroy@csgroup.eu
2021-11-01 21:39:03 +11:00
Arnaldo Carvalho de Melo
875eaa3990 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-01 07:10:30 -03:00
Eli Cohen
540061ac79 vdpa/mlx5: Forward only packets with allowed MAC address
Add rules to forward packets to the net device's TIR only if the
destination MAC is equal to the configured MAC. This is required to
prevent the netdevice from receiving traffic not destined to its
configured MAC.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Link: https://lore.kernel.org/r/20211026175519.87795-9-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2021-11-01 05:26:49 -04:00
Eli Cohen
a007d94004 vdpa/mlx5: Support configuration of MAC
Add code to accept MAC configuration through vdpa tool. The MAC is
written into the config struct and later can be retrieved through
get_config().

Examples:
1. Configure MAC while adding a device:
$ vdpa dev add mgmtdev pci/0000:06:00.2 name vdpa0 mac 00:11:22:33:44:55

2. Show configured params:
$ vdpa dev config show
vdpa0: mac 00:11:22:33:44:55 link down link_announce false max_vq_pairs 8 mtu 1500

Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211026175519.87795-8-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:49 -04:00
Parav Pandit
ef76eb83a1 vdpa/mlx5: Fix clearing of VIRTIO_NET_F_MAC feature bit
Cited patch in the fixes tag clears the features bit during reset.
mlx5 vdpa device feature bits are static decided by device capabilities.
These feature bits (including VIRTIO_NET_F_MAC) are initialized during
device addition time.

Clearing features bit in reset callback cleared the VIRTIO_NET_F_MAC. Due
to this, MAC address provided by the device is not honored.

Fix it by not clearing the static feature bits during reset.

Fixes: 0686082dbf ("vdpa: Add reset callback in vdpa_config_ops")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20211026175519.87795-7-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2021-11-01 05:26:49 -04:00
Parav Pandit
1138b9818e vdpa_sim_net: Enable user to set mac address and mtu
Enable user to set the mac address and mtu so that each vdpa device
can have its own user specified mac address and mtu.

Now that user is enabled to set the mac address, remove the module
parameter for same.

And example of setting mac addr and mtu and view the configuration:
$ vdpa mgmtdev show
vdpasim_net:
  supported_classes net

$ vdpa dev add name bar mgmtdev vdpasim_net mac 00:11:22:33:44:55 mtu 9000

$ vdpa dev config show
bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211026175519.87795-6-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:49 -04:00
Parav Pandit
d8ca2fa5be vdpa: Enable user to set mac and mtu of vdpa device
$ vdpa dev add name bar mgmtdev vdpasim_net mac 00:11:22:33:44:55 mtu 9000

$ vdpa dev config show
bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000

$ vdpa dev config show -jp
{
    "config": {
        "bar": {
            "mac": "00:11:22:33:44:55",
            "link ": "up",
            "link_announce ": false,
            "mtu": 9000,
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>

Link: https://lore.kernel.org/r/20211026175519.87795-5-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-11-01 05:26:49 -04:00
Parav Pandit
960deb33be vdpa: Use kernel coding style for structure comments
As subsequent patch adds new structure field with comment, move the
structure comment to follow kernel coding style.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211026175519.87795-4-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:49 -04:00
Parav Pandit
ad69dd0bf2 vdpa: Introduce query of device config layout
Introduce a command to query a device config layout.

An example query of network vdpa device:

$ vdpa dev add name bar mgmtdev vdpasim_net

$ vdpa dev config show
bar: mac 00:35:09:19:48:05 link up link_announce false mtu 1500

$ vdpa dev config show -jp
{
    "config": {
        "bar": {
            "mac": "00:35:09:19:48:05",
            "link ": "up",
            "link_announce ": false,
            "mtu": 1500,
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211026175519.87795-3-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:49 -04:00
Parav Pandit
6dbb1f1687 vdpa: Introduce and use vdpa device get, set config helpers
Subsequent patches enable get and set configuration either
via management device or via vdpa device' config ops.

This requires synchronization between multiple callers to get and set
config callbacks. Features setting also influence the layout of the
configuration fields endianness.

To avoid exposing synchronization primitives to callers, introduce
helper for setting the configuration and use it.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211026175519.87795-2-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:49 -04:00
Jason Wang
c57911ebfb virtio-scsi: don't let virtio core to validate used buffer length
We never tries to use used length, so the patch prevents the virtio
core from validating used length.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211027022107.14357-5-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:49 -04:00
Jason Wang
a40392edf1 virtio-blk: don't let virtio core to validate used length
We never tries to use used length, so the patch prevents the virtio
core from validating used length.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211027022107.14357-4-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:48 -04:00
Jason Wang
816625c136 virtio-net: don't let virtio core to validate used length
For RX virtuqueue, the used length is validated in all the three paths
(big, small and mergeable). For control vq, we never tries to use used
length. So this patch forbids the core to validate the used length.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211027022107.14357-3-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:48 -04:00
Jason Wang
939779f515 virtio_ring: validate used buffer length
This patch validate the used buffer length provided by the device
before trying to use it. This is done by record the in buffer length
in a new field in desc_state structure during virtqueue_add(), then we
can fail the virtqueue_get_buf() when we find the device is trying to
give us a used buffer length which is greater than the in buffer
length.

Since some drivers have already done the validation by themselves,
this patch tries to makes the core validation optional. For the driver
that doesn't want the validation, it can set the
suppress_used_validation to be true (which could be overridden by
force_used_validation module parameter). To be more efficient, a
dedicate array is used for storing the validate used length, this
helps to eliminate the cache stress if validation is done by the
driver.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211027022107.14357-2-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:48 -04:00
Michael S. Tsirkin
f083937247 virtio_blk: correct types for status handling
virtblk_setup_cmd returns blk_status_t in an int, callers then assign it
back to a blk_status_t variable. blk_status_t is either u32 or (more
typically) u8 so it works, but is inelegant and causes sparse warnings.

Pass the status in blk_status_t in a consistent way.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: b2c5221fd074 ("virtio-blk: avoid preallocating big SGL for data")
Cc: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2021-11-01 05:26:48 -04:00
Michael S. Tsirkin
ead65f7695 virtio_blk: allow 0 as num_request_queues
The default value is 0 meaning "no limit". However if 0
is specified on the command line it is instead silently
converted to 1. Further, the value is already validated
at point of use, there's no point in duplicating code
validating the value when it is set.

Simplify the code while making the behaviour more consistent
by using plain module_param.

Fixes: 1a662cf6cb9a ("virtio-blk: add num_request_queues module parameter")
Cc: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:48 -04:00
Viresh Kumar
dcce162559 i2c: virtio: Add support for zero-length requests
The virtio specification received a new mandatory feature
(VIRTIO_I2C_F_ZERO_LENGTH_REQUEST) for zero length requests. Fail if the
feature isn't offered by the device.

For each read-request, set the VIRTIO_I2C_FLAGS_M_RD flag, as required
by the VIRTIO_I2C_F_ZERO_LENGTH_REQUEST feature.

This allows us to support zero length requests, like SMBUS Quick, where
the buffer need not be sent anymore.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/7c58868cd26d2fc4bd82d0d8b0dfb55636380110.1634808714.git.viresh.kumar@linaro.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jie Deng <jie.deng@intel.com> # once the spec is merged
2021-11-01 05:26:48 -04:00
Ye Guojin
f1aa12f535 virtio-blk: fixup coccinelle warnings
coccicheck complains about the use of snprintf() in sysfs show
functions:
WARNING  use scnprintf or sprintf

Use sysfs_emit instead of scnprintf or sprintf makes more sense.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn>
Link: https://lore.kernel.org/r/20211021065111.1047824-1-ye.guojin@zte.com.cn
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-11-01 05:26:48 -04:00
Jason Wang
ef5c366fea virtio_ring: fix typos in vring_desc_extra
We're actually tracking descriptor address and length instead of the
buffer.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211019070152.8236-7-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:48 -04:00
Jason Wang
080cd7c3ac virtio-pci: harden INTX interrupts
This patch tries to make sure the virtio interrupt handler for INTX
won't be called after a reset and before virtio_device_ready(). We
can't use IRQF_NO_AUTOEN since we're using shared interrupt
(IRQF_SHARED). So this patch tracks the INTX enabling status in a new
intx_soft_enabled variable and toggle it during in
vp_disable/enable_vectors(). The INTX interrupt handler will check
intx_soft_enabled before processing the actual interrupt.

Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211019070152.8236-6-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:48 -04:00
Jason Wang
9e35276a53 virtio_pci: harden MSI-X interrupts
We used to synchronize pending MSI-X irq handlers via
synchronize_irq(), this may not work for the untrusted device which
may keep sending interrupts after reset which may lead unexpected
results. Similarly, we should not enable MSI-X interrupt until the
device is ready. So this patch fixes those two issues by:

1) switching to use disable_irq() to prevent the virtio interrupt
   handlers to be called after the device is reset.
2) using IRQF_NO_AUTOEN and enable the MSI-X irq during .ready()

This can make sure the virtio interrupt handler won't be called before
virtio_device_ready() and after reset.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211019070152.8236-5-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:48 -04:00