Commit graph

1059881 commits

Author SHA1 Message Date
Eric W. Biederman
1a4d21a23c signal/vm86_32: Replace open coded BUG_ON with an actual BUG_ON
The function save_v86_state is only called when userspace was
operating in vm86 mode before entering the kernel.  Not having vm86
state in the task_struct should never happen.  So transform the hand
rolled BUG_ON into an actual BUG_ON to make it clear what is
happening.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: H Peter Anvin <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20211020174406.17889-9-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-25 15:56:29 -05:00
Eric W. Biederman
984bd71fb3 signal/sparc: In setup_tsb_params convert open coded BUG into BUG
The function setup_tsb_params has exactly one caller tsb_grow.  The
function tsb_grow passes in a tsb_bytes value that is between 8192 and
1048576 inclusive, and is guaranteed to be a power of 2.  The function
setup_tsb_params verifies this property with a switch statement and
then prints an error and causes the task to exit if this is not true.

In practice that print statement can never be reached because tsb_grow
never passes in a bad tsb_size.  So if tsb_size ever gets a bad value
that is a kernel bug.

So replace the do_exit which is effectively an open coded version of
BUG() with an actuall call to BUG().  Making it clearer that this
is a case that can never, and should never happen.

Cc: David Miller <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Link: https://lkml.kernel.org/r/20211020174406.17889-8-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-25 15:56:29 -05:00
Eric W. Biederman
83a1f27ad7 signal/powerpc: On swapcontext failure force SIGSEGV
If the register state may be partial and corrupted instead of calling
do_exit, call force_sigsegv(SIGSEGV).  Which properly kills the
process with SIGSEGV and does not let any more userspace code execute,
instead of just killing one thread of the process and potentially
confusing everything.

Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
History-tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: 756f1ae8a44e ("PPC32: Rework signal code and add a swapcontext system call.")
Fixes: 04879b04bf50 ("[PATCH] ppc64: VMX (Altivec) support & signal32 rework, from Ben Herrenschmidt")
Link: https://lkml.kernel.org/r/20211020174406.17889-7-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-25 15:56:29 -05:00
Eric W. Biederman
ce0ee4e6ac signal/sh: Use force_sig(SIGKILL) instead of do_group_exit(SIGKILL)
Today the sh code allocates memory the first time a process uses
the fpu.  If that memory allocation fails, kill the affected task
with force_sig(SIGKILL) rather than do_group_exit(SIGKILL).

Calling do_group_exit from an exception handler can potentially lead
to dead locks as do_group_exit is not designed to be called from
interrupt context.  Instead use force_sig(SIGKILL) to kill the
userspace process.  Sending signals in general and force_sig in
particular has been tested from interrupt context so there should be
no problems.

Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: linux-sh@vger.kernel.org
Fixes: 0ea820cf9b ("sh: Move over to dynamically allocated FPU context.")
Link: https://lkml.kernel.org/r/20211020174406.17889-6-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-25 15:56:29 -05:00
Eric W. Biederman
95bf9d646c signal/mips: Update (_save|_restore)_fp_context to fail with -EFAULT
When an instruction to save or restore a register from the stack fails
in _save_fp_context or _restore_fp_context return with -EFAULT.  This
change was made to r2300_fpu.S[1] but it looks like it got lost with
the introduction of EX2[2].  This is also what the other implementation
of _save_fp_context and _restore_fp_context in r4k_fpu.S does, and
what is needed for the callers to be able to handle the error.

Furthermore calling do_exit(SIGSEGV) from bad_stack is wrong because
it does not terminate the entire process it just terminates a single
thread.

As the changed code was the only caller of arch/mips/kernel/syscall.c:bad_stack
remove the problematic and now unused helper function.

Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Maciej Rozycki <macro@orcam.me.uk>
Cc: linux-mips@vger.kernel.org
[1] 35938a00ba ("MIPS: Fix ISA I FP sigcontext access violation handling")
[2] f92722dc45 ("MIPS: Correct MIPS I FP sigcontext layout")
Cc: stable@vger.kernel.org
Fixes: f92722dc45 ("MIPS: Correct MIPS I FP sigcontext layout")
Acked-by: Maciej W. Rozycki <macro@orcam.me.uk>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Link: https://lkml.kernel.org/r/20211020174406.17889-5-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-25 15:55:35 -05:00
Parav Pandit
d67ab0a8c1 net/mlx5: SF_DEV Add SF device trace points
Add SF device add and delete specific trace points.

echo mlx5:mlx5_sf_dev_add >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_dev_del >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_vhca_event >> /sys/kernel/debug/tracing/set_event

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:21 -07:00
Parav Pandit
b3ccada68b net/mlx5: SF, Add SF trace points
Add support for trace events for SFs to improve debugging.
This covers
(a) port add and free trace points
(b) device level trace points
(c) SF hardware context add, free trace points.
(d) SF function activate/deacticate and state trace points

SF events examples:
echo mlx5:mlx5_sf_add >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_free >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_hwc_alloc >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_hwc_free >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_hwc_deferred_free >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_update_state >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_activate >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_deactivate >> /sys/kernel/debug/tracing/set_event

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:21 -07:00
Shay Drory
5546040619 net/mlx5: Let user configure max_macs param
Currently, max_macs is taking 70Kbytes of memory per function. This
size is not needed in all use cases, and is critical with large scale.
Hence, allow user to configure the number of max_macs.

For example, to reduce the number of max_macs to 1, execute::
$ devlink dev param set pci/0000:00:0b.0 name max_macs value 1 \
              cmode driverinit
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:21 -07:00
Shay Drory
a6cb08daa3 net/mlx5: Let user configure event_eq_size param
Event EQ is an EQ which received the notification of almost all the
events generated by the NIC.
Currently, each event EQ is taking 512KB of memory. This size is not
needed in most use cases, and is critical with large scale. Hence,
allow user to configure the size of the event EQ.

For example to reduce event EQ size to 64, execute::
$ devlink resource set pci/0000:00:0b.0 path /event_eq_size/ size 64
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:20 -07:00
Shay Drory
46ae40b94d net/mlx5: Let user configure io_eq_size param
Currently, each I/O EQ is taking 128KB of memory. This size
is not needed in all use cases, and is critical with large scale.
Hence, allow user to configure the size of I/O EQs.

For example, to reduce I/O EQ size to 64, execute:
$ devlink resource set pci/0000:00:0b.0 path /io_eq_size/ size 64
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:20 -07:00
Vlad Buslov
3518c83fc9 net/mlx5: Bridge, support replacing existing FDB entry
The SWITCHDEV_FDB_ADD_TO_DEVICE is used for both adding new and replacing
existing entry. Implement support for replacing existing FDB entries in
mlx5 offload code.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:20 -07:00
Vlad Buslov
2deda2f1bf net/mlx5: Bridge, extract code to lookup and del/notify entry
Following two patterns in bridge code are used in multiple places where
similar code is duplicated:

- Lookup FDB entry from hashtable by address+vid pair.

- Notify software bridge and then delete existing FDB entry.

In order to improve code quality and prepare for following patch series
that also uses described patterns, extract the codes to dedicated helper
functions.

This commit doesn't change functionality.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:20 -07:00
Aya Levin
5a1023deee net/mlx5: Add periodic update of host time to firmware
Firmware logs its asserts also to non-volatile memory. In order to
reduce drift between the NIC and the host, the driver sets the host
epoch-time to the firmware every hour.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:19 -07:00
Aya Levin
b87ef75cb5 net/mlx5: Print health buffer by log level
Add log macro which gets log level as a parameter. Use the severity
read from the health buffer and the new log macro to log the health buffer
with severity as log level.  Prior to this patch, health buffer was
printed in error log level regardless of its severity. Now the user may
filter dmesg (--level) or change kernel log level to focus on different
severity levels of firmware errors.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:19 -07:00
Aya Levin
cb464ba53c net/mlx5: Extend health buffer dump
Enhance health buffer to include:
 - assert_var5: expose the 6'th assert variable.
 - time: error's time-stamp in seconds (epoch time).
 - rfr: Recovery Flow Requiered. When set, indicates that the error
        cannot be recovered without flow involving reset.
 - severity: error's severity value, ranging from emergency to debug.
Expose them in the health buffer dump (dmesg and devlink fw reporter).

Health buffer in dmesg:
mlx5_core 0000:08:00.0: print_health_info:425:(pid 912): Health issue observed, firmware internal error, severity(3) ERROR:
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[0] 0x08040700
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[1] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[2] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[3] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[4] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[5] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:432:(pid 912): assert_exit_ptr 0x00aaf800
mlx5_core 0000:08:00.0: print_health_info:434:(pid 912): assert_callra 0x00aaf70c
mlx5_core 0000:08:00.0: print_health_info:436:(pid 912): fw_ver 16.32.492
mlx5_core 0000:08:00.0: print_health_info:437:(pid 912): time 1634819758
mlx5_core 0000:08:00.0: print_health_info:438:(pid 912): hw_id 0x0000020d
mlx5_core 0000:08:00.0: print_health_info:439:(pid 912): rfr 0
mlx5_core 0000:08:00.0: print_health_info:440:(pid 912): severity 3 (ERROR)
mlx5_core 0000:08:00.0: print_health_info:441:(pid 912): irisc_index 9
mlx5_core 0000:08:00.0: print_health_info:442:(pid 912): synd 0x1: firmware internal error
mlx5_core 0000:08:00.0: print_health_info:444:(pid 912): ext_synd 0x802b
mlx5_core 0000:08:00.0: print_health_info:445:(pid 912): raw fw_ver 0x102001ec

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:19 -07:00
Avihai Horon
2fdeb4f4c2 net/mlx5: Reduce flow counters bulk query buffer size for SFs
Currently, the flow counters bulk query buffer takes a little more than
512KB of memory, which is aligned to the next power of 2, to 1MB.

The buffer size determines the maximum number of flow counters that can
be queried at a time. Thus, having a bigger buffer can improve
performance for users that need to query many flow counters.

SFs don't use many flow counters and don't need a big buffer. Since this
size is critical with large scale, reduce the size of the bulk query
buffer for SFs.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:18 -07:00
Shay Drory
038e5e4718 net/mlx5: Fix unused function warning of mlx5i_flow_type_mask
The cited commit is causing unused-function warning[1] when
CONFIG_MLX5_EN_RXNFC is not set.
Fix this by moving the function into the ifdef, where it's only used

[1]
warning: ‘mlx5i_flow_type_mask’ defined but not used [-Wunused-function]

Fixes: 9fbe1c25ec ("net/mlx5i: Enable Rx steering for IPoIB via ethtool")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:18 -07:00
Paul Blakey
a64c5edbd2 net/mlx5: Remove unnecessary checks for slow path flag
After previous changes, caller (mlx5e_tc_offload_fdb_rules()) already
checks for the slow path flag, and if set won't call offload/unoffload
sample.

Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:18 -07:00
Jakub Kicinski
537e4d2e6f net/mlx5e: don't write directly to netdev->dev_addr
Use a local buffer and eth_hw_addr_set()

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-25 13:51:17 -07:00
Yongxin Liu
fd1b5beb17 ice: check whether PTP is initialized in ice_ptp_release()
PTP is currently only supported on E810 devices, it is checked
in ice_ptp_init(). However, there is no check in ice_ptp_release().
For other E800 series devices, ice_ptp_release() will be wrongly executed.

Fix the following calltrace.

  INFO: trying to register non-static key.
  The code is fine but needs lockdep annotation, or maybe
  you didn't initialize this object before use?
  turning off the locking correctness validator.
  Workqueue: ice ice_service_task [ice]
  Call Trace:
   dump_stack_lvl+0x5b/0x82
   dump_stack+0x10/0x12
   register_lock_class+0x495/0x4a0
   ? find_held_lock+0x3c/0xb0
   __lock_acquire+0x71/0x1830
   lock_acquire+0x1e6/0x330
   ? ice_ptp_release+0x3c/0x1e0 [ice]
   ? _raw_spin_lock+0x19/0x70
   ? ice_ptp_release+0x3c/0x1e0 [ice]
   _raw_spin_lock+0x38/0x70
   ? ice_ptp_release+0x3c/0x1e0 [ice]
   ice_ptp_release+0x3c/0x1e0 [ice]
   ice_prepare_for_reset+0xcb/0xe0 [ice]
   ice_do_reset+0x38/0x110 [ice]
   ice_service_task+0x138/0xf10 [ice]
   ? __this_cpu_preempt_check+0x13/0x20
   process_one_work+0x26a/0x650
   worker_thread+0x3f/0x3b0
   ? __kthread_parkme+0x51/0xb0
   ? process_one_work+0x650/0x650
   kthread+0x161/0x190
   ? set_kthread_struct+0x40/0x40
   ret_from_fork+0x1f/0x30

Fixes: 4dd0d5c33c ("ice: add lock around Tx timestamp tracker flush")
Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-25 13:44:55 -07:00
Dave Ertman
6a8b357278 ice: Respond to a NETDEV_UNREGISTER event for LAG
When the PF is a member of a link aggregate, and the driver
is removed, the process will hang unless we respond to the
NETDEV_UNREGISTER event that is sent to the event_handler
for LAG.

Add a case statement for the ice_lag_event_handler to unlink
the PF from the link aggregate.

Also remove code that was incorrectly applying a dev_hold to
peer_netdevs that were associated with the ice driver.

Fixes: df006dd4b1 ("ice: Add initial support framework for LAG")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-25 13:44:37 -07:00
Amit Pundir
e091b836a3 Revert "arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target"
This reverts commit 001ce9785c.

This upstream commit broke AOSP (post Android 12 merge) build
on RB5. The device either silently crashes into USB crash mode
after android boot animation or we see a blank blue screen
with following dpu errors in dmesg:

[  T444] hw recovery is not complete for ctl:3
[  T444] [drm:dpu_encoder_phys_vid_prepare_for_kickoff:539] [dpu error]enc31 intf1 ctl 3 reset failure: -22
[  T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu error]vblank timeout
[  T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit done returned -110
[    C7] [drm:dpu_encoder_frame_done_timeout:2127] [dpu error]enc31 frame done timeout
[  T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu error]vblank timeout
[  T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit done returned -110

Fixes: 001ce9785c ("arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target")
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211014135410.4136412-1-dmitry.baryshkov@linaro.org
2021-10-25 14:26:00 -05:00
Bjorn Andersson
c50031f03d firmware: qcom: scm: Don't break compile test on non-ARM platforms
The introduction of __qcom_scm_set_boot_addr_mc() relies on
cpu_logical_map() and MPIDR_AFFINITY_LEVEL() from smp_plat.h, but only
ARM and ARM64 has this include file, so the introduction of this
dependency broke compile testing on e.g. x86_64.

Make the inclusion of smp_plat.h and the affected function depend on
ARM || ARM64 to allow the code to still be compiled.

Fixes: 55845f46df ("firmware: qcom: scm: Add support for MC boot address API")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211025025816.2937465-1-bjorn.andersson@linaro.org
2021-10-25 14:21:34 -05:00
Qu Wenruo
8481dd80ab btrfs: subpage: introduce btrfs_subpage_bitmap_info
Currently we use fixed size u16 bitmap for subpage bitmap.  This is fine
for 4K sectorsize with 64K page size.

But for 4K sectorsize and larger page size, the bitmap is too small,
while for smaller page size like 16K, u16 bitmaps waste too much space.

Here we introduce a new helper structure, btrfs_subpage_bitmap_info, to
record the proper bitmap size, and where each bitmap should start at.

By this, we can later compact all subpage bitmaps into one u32 bitmap.
This patch is the first step.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Qu Wenruo
651fb41927 btrfs: subpage: make btrfs_alloc_subpage() return btrfs_subpage directly
The existing calling convention of btrfs_alloc_subpage() is pretty
awful.  Change it to a more common pattern by returning struct
btrfs_subpage directly and let the caller to determine if the call
succeeded.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Qu Wenruo
fdf250db89 btrfs: subpage: only call btrfs_alloc_subpage() when sectorsize is smaller than PAGE_SIZE
There are two call sites of btrfs_alloc_subpage():

- btrfs_attach_subpage()
  We have ensured sectorsize is smaller than PAGE_SIZE

- alloc_extent_buffer()
  We call btrfs_alloc_subpage() unconditionally.

The alloc_extent_buffer() forces us to check the sectorsize size against
page size inside btrfs_alloc_subpage().

Since the function name, btrfs_alloc_subpage(), already indicates it
should only get called for subpage cases, do the check in
alloc_extent_buffer() and add an ASSERT() in btrfs_alloc_subpage().

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Su Yue
9675ea8c9d btrfs: update comment for fs_devices::seed_list in btrfs_rm_device
Update it since commit 944d3f9fac ("btrfs: switch seed device to
list api") did conversion from fs_devices::seed to fs_devices::seed_list.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Su Yue <l@damenly.su>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Anand Jain
991a3daeda btrfs: drop unnecessary ret in ioctl_quota_rescan_status
There is no need for the variable ret after d66105cfa873 ("btrfs:
allocate btrfs_ioctl_quota_rescan_args on stack"), remove it.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Marcos Paulo de Souza
0e3dd5bce8 btrfs: send: simplify send_create_inode_if_needed
The out label is being overused, we can simply return if the condition
permits.

No functional changes.

Reviewed-by: Su Yue <l@damenly.su>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Nikolay Borisov
f6f39f7a0a btrfs: rename btrfs_alloc_chunk to btrfs_create_chunk
The user facing function used to allocate new chunks is
btrfs_chunk_alloc, unfortunately there is yet another similar sounding
function - btrfs_alloc_chunk. This creates confusion, especially since
the latter function can be considered "private" in the sense that it
implements the first stage of chunk creation and as such is called by
btrfs_chunk_alloc.

To avoid the awkwardness that comes with having similarly named but
distinctly different in their purpose function rename btrfs_alloc_chunk
to btrfs_create_chunk, given that the main purpose of this function is
to orchestrate the whole process of allocating a chunk - reserving space
into devices, deciding on characteristics of the stripe size and
creating the in-memory structures.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-25 21:17:16 +02:00
Daniel Latypov
2ab5d5e67f kunit: tool: continue past invalid utf-8 output
kunit.py currently crashes and fails to parse kernel output if it's not
fully valid utf-8.

This can come from memory corruption or just inadvertently printing
out binary data as strings.

E.g. adding this line into a kunit test
  pr_info("\x80")
will cause this exception
  UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position
  1961: invalid start byte

We can tell Python how to handle errors, see
https://docs.python.org/3/library/codecs.html#error-handlers

Unfortunately, it doesn't seem like there's a way to specify this in
just one location, so we need to repeat ourselves quite a bit.

Specify `errors='backslashreplace'` so we instead:
* print out the offending byte as '\x80'
* try and continue parsing the output.
  * as long as the TAP lines themselves are valid, we're fine.

Fixed spelling/grammar in commit log:
Shuah Khan <<skhan@linuxfoundation.org>

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-25 13:06:45 -06:00
Linus Torvalds
3906fe9bb7 Linux 5.15-rc7 2021-10-25 11:30:31 -07:00
Matthew Wilcox (Oracle)
cb68543239 secretmem: Prevent secretmem_users from wrapping to zero
Commit 110860541f ("mm/secretmem: use refcount_t instead of atomic_t")
attempted to fix the problem of secretmem_users wrapping to zero and
allowing suspend once again.

But it was reverted in commit 87066fdd2e ("Revert 'mm/secretmem: use
refcount_t instead of atomic_t'") because of the problems it caused - a
refcount_t was not semantically the right type to use.

Instead prevent secretmem_users from wrapping to zero by forbidding new
users if the number of users has wrapped from positive to negative.
This stops a long way short of reaching the necessary 4 billion users
where it wraps to zero again, so there's no need to be clever with
special anti-wrap types or checking the return value from atomic_inc().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Jordy Zomer <jordy@pwning.systems>
Cc: Kees Cook <keescook@chromium.org>,
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-25 11:27:31 -07:00
Jakub Kicinski
dcd63d4326 Merge branch 'bluetooth-don-t-write-directly-to-netdev-dev_addr'
Jakub Kicinski says:

====================
bluetooth: don't write directly to netdev->dev_addr

The usual conversions.
====================

Link: https://lore.kernel.org/r/20211022231834.2710245-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25 11:01:33 -07:00
Jakub Kicinski
a1916d3446 bluetooth: use dev_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it go through appropriate helpers.

Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25 11:01:29 -07:00
Jakub Kicinski
08c181f052 bluetooth: use eth_hw_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it go through appropriate helpers.

Convert bluetooth from memcpy(... ETH_ADDR) to eth_hw_addr_set():

  @@
  expression dev, np;
  @@
  - memcpy(dev->dev_addr, np, ETH_ALEN)
  + eth_hw_addr_set(dev, np)

Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25 11:01:24 -07:00
Kamal Heib
e058953c0e RDMA/qedr: Remove unsupported qedr_resize_cq callback
There is no need to return always zero for function which is not
supported, especially since 0 is the wrong return code.

Fixes: a7efd7773e ("qedr: Add support for PD,PKEY and CQ verbs")
Link: https://lore.kernel.org/r/20211025062632.3960-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-25 14:56:55 -03:00
Linus Torvalds
ac8a6eba2a spi: Fix tegra20 build with CONFIG_PM=n once again
Commit efafec27c5 ("spi: Fix tegra20 build with CONFIG_PM=n") already
fixed the build without PM support once.  There was an alternative fix
by Guenter in commit 2bab94090b ("spi: tegra20-slink: Declare runtime
suspend and resume functions conditionally"), and Mark then merged the
two correctly in ffb1e76f4f ("Merge tag 'v5.15-rc2' into spi-5.15").

But for some inexplicable reason, Mark then merged things _again_ in
commit 59c4e190b1 ("Merge tag 'v5.15-rc3' into spi-5.15"), and screwed
things up at that point, and the __maybe_unused attribute on
tegra_slink_runtime_resume() went missing.

Reinstate it, so that alpha (and other architectures without PM support)
builds cleanly again.

Btw, this is another prime example of how random back-merges are not
good.  Just don't do them.  Subsystem developers should not merge my
tree in any normal circumstances.  Both of those merge commits pointed
to above are bad: even the one that got the merge result right doesn't
even mention _why_ it was done, and the one that got it wrong is
obviously broken.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-25 10:46:41 -07:00
Zhu Yanjun
86479f8a3f RDMA/irdma: Remove the unused spin lock in struct irdma_qp_uk
The spin lock in struct irdma_qp_uk is not used. So remove it.

Link: https://lore.kernel.org/r/20211021230612.153812-1-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-25 14:36:58 -03:00
Jakub Kicinski
fd92213e9a RDMA: Constify netdev->dev_addr accesses
netdev->dev_addr will become const soon, make sure drivers propagate the
qualifier.

Link: https://lore.kernel.org/r/20211019182604.1441387-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-25 14:33:09 -03:00
Jakub Kicinski
50693e66fd RDMA/mlx5: Use dev_addr_mod()
Commit 406f42fa0d ("net-next: When a bond have a massive amount of
VLANs...") introduced a rbtree for faster Ethernet address look up. To
maintain netdev->dev_addr in this tree we need to make all the writes to
it got through appropriate helpers.

Link: https://lore.kernel.org/r/20211019182604.1441387-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-25 14:33:09 -03:00
Jakub Kicinski
10f7b9bc85 RDMA/ipoib: Use dev_addr_mod()
Commit 406f42fa0d ("net-next: When a bond have a massive amount of
VLANs...") introduced a rbtree for faster Ethernet address look up. To
maintain netdev->dev_addr in this tree we need to make all the writes to
it got through appropriate helpers.

Link: https://lore.kernel.org/r/20211019182604.1441387-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-25 14:33:08 -03:00
Linus Torvalds
c2b43854aa ARM updates for 5.15:
- Fix clang-related relocation warning in futex code
 - Fix incorrect use of get_kernel_nofault()
 - Fix bad code generation in __get_user_check() when kasan is enabled
 - Ensure TLB function table is correctly aligned
 - Remove duplicated string function definitions in decompressor
 - Fix link-time orphan section warnings
 - Fix old-style function prototype for arch_init_kprobes()
 - Only warn about XIP address when not compile testing
 - Handle BE32 big endian for keystone2 remapping
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAmF2tT8ACgkQ9OeQG+St
 rGRfpxAAhAco8l1Lm5+0zHAozIi4CogLAcg1EigsQEgorrpJJBSQb0PP5VS0BAnU
 Q48KmE4r5WNGouWNwhHALXMX7Vzv72S5XoFf1Df/LImrIP9qUvuqgbr1gfvgt8M0
 Ktc5P1eS4HC9WxrHHAcWsKaO/Uye+M3adNLNl5K50ADywSExa9VpY6I7ak/OfPot
 BlO9bXkk2991yI/Fg+9cqW7ub9WkabayioYWLuCaTtt99+MSNCDmYcZkTUQkLeSQ
 btF0+jW6/+odUXrA8zFx5QvIp8v35uO2w6fAw8FPjrXm0u2copr7JOAb/yVN4hJR
 sSKSrr+kRFZa/TCjUt8t+2fephQU6ppxJI9BlL+lQ3dyX+dyUmMKRjZP1ju8R2wc
 xKv6cMGfXbZu4jBUgpkekXZsvPs+05nr9op/yBDDHsuIuz0wa3n+oSVNaE9sm0az
 d/QUxA9ZeofKxnzWMvo2D4RWOVprCoqASqt4700Z1KXvNWd6kZaBL0HVREOaFdvt
 /AFNh3nVNDxjhED4NPVPFPv+INrY1EtUF6q8QUJmlj+7xcqaqkV7CgC6Ku8Wkbi1
 ELTx7gy0mZlM8wuMbMeNCdW/qQxuR/8JCETtNTF+ZpiTWWQ+uox7lEWi4xzFQNtG
 NDMoHd077Y8WFRJKaSpQHAbdcPlfSv6TDQ0uOkiPL1D6yiBeqvw=
 =cxhy
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:

 - Fix clang-related relocation warning in futex code

 - Fix incorrect use of get_kernel_nofault()

 - Fix bad code generation in __get_user_check() when kasan is enabled

 - Ensure TLB function table is correctly aligned

 - Remove duplicated string function definitions in decompressor

 - Fix link-time orphan section warnings

 - Fix old-style function prototype for arch_init_kprobes()

 - Only warn about XIP address when not compile testing

 - Handle BE32 big endian for keystone2 remapping

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9148/1: handle CONFIG_CPU_ENDIAN_BE32 in arch/arm/kernel/head.S
  ARM: 9141/1: only warn about XIP address when not compile testing
  ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
  ARM: 9138/1: fix link warning with XIP + frame-pointer
  ARM: 9134/1: remove duplicate memcpy() definition
  ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
  ARM: 9132/1: Fix __get_user_check failure with ARM KASAN images
  ARM: 9125/1: fix incorrect use of get_kernel_nofault()
  ARM: 9122/1: select HAVE_FUTEX_CMPXCHG
2021-10-25 10:28:52 -07:00
Jakub Kicinski
a0c8c3372b fddi: defza: add missing pointer type cast
hw_addr is a uint AKA unsigned int. dev_addr_set() takes
a u8 *.

  drivers/net/fddi/defza.c:1383:27: error: passing argument 2 of 'dev_addr_set' from incompatible pointer type [-Werror=incompatible-pointer-types]

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 1e9258c389 ("fddi: defxx,defza: use dev_addr_set()")
Acked-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://lore.kernel.org/r/20211025160000.2803818-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25 10:28:47 -07:00
Shakeel Butt
822bc9bac9 cgroup: no need for cgroup_mutex for /proc/cgroups
On the real systems, the cgroups hierarchies are setup early and just
once by the node controller, so, other than number of cgroups, all
information in /proc/cgroups remain same for the system uptime. Let's
remove the cgroup_mutex usage on reading /proc/cgroups. There is a
chance of inconsistent number of cgroups for co-mounted cgroups while
printing the information from /proc/cgroups but that is not a big
issue. In addition /proc/cgroups is a v1 specific interface, so the
dependency on it should reduce over time.

The main motivation for removing the cgroup_mutex from /proc/cgroups is
to reduce the avenues of its contention. On our fleet, we have observed
buggy application hammering on /proc/cgroups and drastically slowing
down the node controller on the system which have many negative
consequences on other workloads running on the system.

Signed-off-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2021-10-25 07:26:00 -10:00
Shakeel Butt
bb75842141 cgroup: remove cgroup_mutex from cgroupstats_build
The function cgroupstats_build extracts cgroup from the kernfs_node's
priv pointer which is a RCU pointer. So, there is no need to grab
cgroup_mutex. Just get the reference on the cgroup before using and
remove the cgroup_mutex altogether.

Signed-off-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2021-10-25 07:24:03 -10:00
Shakeel Butt
be28816971 cgroup: reduce dependency on cgroup_mutex
Currently cgroup_get_from_path() and cgroup_get_from_id() grab
cgroup_mutex before traversing the default hierarchy to find the
kernfs_node corresponding to the path/id and then extract the linked
cgroup. Since cgroup_mutex is still held, it is guaranteed that the
cgroup will be alive and the reference can be taken on it.

However similar guarantee can be provided without depending on the
cgroup_mutex and potentially reducing avenues of cgroup_mutex contentions.
The kernfs_node's priv pointer is RCU protected pointer and with just
rcu read lock we can grab the reference on the cgroup without
cgroup_mutex. So, remove cgroup_mutex from them.

Signed-off-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2021-10-25 07:23:31 -10:00
Boqun Feng
f9eaaa82b4 workqueue: doc: Call out the non-reentrance conditions
The current doc of workqueue API suggests that work items are
non-reentrant: any work item is guaranteed to be executed by at most one
worker system-wide at any given time. However this is not true, the
following case can cause a work item W executed by two workers at
the same time:

        queue_work_on(0, WQ1, W);
        // after a worker picks up W and clear the pending bit
        queue_work_on(1, WQ2, W);
        // workers on CPU0 and CPU1 will execute W in the same time.

, which means the non-reentrance of a work item is conditional, and
Lai Jiangshan provided a nice summary[1] of the conditions, therefore
use it to describe a work item instance and improve the doc.

[1]: https://lore.kernel.org/lkml/CAJhGHyDudet_xyNk=8xnuO2==o-u06s0E0GZVP4Q67nmQ84Ceg@mail.gmail.com/

Suggested-by: Matthew Wilcox <willy@infradead.org>
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2021-10-25 07:18:40 -10:00
Arnd Bergmann
97ad8c8c71 RDMA/mlx5: fix build error with INFINIBAND_USER_ACCESS=n
The mlx5_ib_fs_add_op_fc/mlx5_ib_fs_remove_op_fc functions are only
available when user access is enabled, without that we run into a link
error:

ERROR: modpost: "mlx5_ib_fs_add_op_fc" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined!
ERROR: modpost: "mlx5_ib_fs_remove_op_fc" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined!

Conditionally compiling the newly added code section makes it build,
though this is probably not a correct fix.

Fixes: a29b934ceb ("RDMA/mlx5: Add modify_op_stat() support")
Link: https://lore.kernel.org/r/20211019061602.3062196-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-25 14:16:05 -03:00
Linus Torvalds
4862649f16 libata fixes for 5.15-rc7
A single fix in this pull request addressing an invalid error code
 return in the sata_mv driver (from Zheyu).
 
 Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCYXX2FwAKCRDdoc3SxdoY
 dh2CAQC4Cha8PiSZvR6QVzJDaznQj3O59wP236x/lO9Y8+AFPwEAgFhyVyuMnipR
 fslV8bA4PsiSIluXHT0ACJhFC25Lmgw=
 =pkuU
 -----END PGP SIGNATURE-----

Merge tag 'libata-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata

Pull libata fix from Damien Le Moal:
 "A single fix in this pull request addressing an invalid error code
  return in the sata_mv driver (from Zheyu)"

* tag 'libata-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: sata_mv: Fix the error handling of mv_chip_id()
2021-10-25 09:57:28 -07:00