Commit graph

1059881 commits

Author SHA1 Message Date
Maor Gottlieb
e465550b38 net/mlx5: Lag, set match mask according to the traffic type bitmap
Set the related bits in the match definer mask according to the
TT mapping.
This mask will be used to create the match definers.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-18 20:18:09 -07:00
Maor Gottlieb
1065e0015d net/mlx5: Lag, set LAG traffic type mapping
Generate a traffic type bitmap that will define which
steering objects we need to create for the steering
based LAG.

Bits in this bitmap are set according to the LAG hash type.
In addition, have a field that indicate if the lag is in encap
mode or not.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-18 20:18:08 -07:00
Maor Gottlieb
3d677735d3 net/mlx5: Lag, move lag files into directory
Downstream patches add another lag related file so it makes
sense to have all the lag files in a dedicated directory.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-18 20:18:08 -07:00
Maor Gottlieb
58a606dba7 net/mlx5: Introduce new uplink destination type
The uplink destination type should be used in rules to steer the
packet to the uplink when the device is in steering based LAG mode.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-18 20:18:08 -07:00
Maor Gottlieb
e7e2519e36 net/mlx5: Add support to create match definer
Introduce new APIs to create and destroy flow matcher
for given format id.

Flow match definer object is used for defining the fields and
mask used for the hash calculation. User should mask the desired
fields like done in the match criteria.

This object is assigned to flow group of type hash. In this flow
group type, packets lookup is done based on the hash result.

This patch also adds the required bits to create such flow group.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-18 20:18:07 -07:00
Maor Gottlieb
425a563acb net/mlx5: Introduce port selection namespace
Add new port selection flow steering namespace. Flow steering rules in
this namespaceare are used to determine the physical port for egress
packets.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-18 20:18:07 -07:00
Maor Gottlieb
4c71ce50d2 net/mlx5: Support partial TTC rules
Add bitmasks to ttc_params to indicate if rule is valid or not.
It will allow to create TTC table with support only in part of the
traffic types.
In later patches which introduce the steering based LAG port selection,
TTC will be created with only part of the rules according to the hash
type.

Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-18 20:18:07 -07:00
Joy Gu
7fb223d0ad scsi: qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()
Commit 8c0eb596ba ("[SCSI] qla2xxx: Fix a memory leak in an error path of
qla2x00_process_els()"), intended to change:

        bsg_job->request->msgcode == FC_BSG_HST_ELS_NOLOGIN

to:

        bsg_job->request->msgcode != FC_BSG_RPT_ELS

but changed it to:

        bsg_job->request->msgcode == FC_BSG_RPT_ELS

instead.

Change the == to a != to avoid leaking the fcport structure or freeing
unallocated memory.

Link: https://lore.kernel.org/r/20211012191834.90306-2-jgu@purestorage.com
Fixes: 8c0eb596ba ("[SCSI] qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Joy Gu <jgu@purestorage.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 23:04:31 -04:00
Zheyu Ma
06634d5b6e scsi: qla2xxx: Return -ENOMEM if kzalloc() fails
The driver probing function should return < 0 for failure, otherwise
kernel will treat value > 0 as success.

Link: https://lore.kernel.org/r/1634522181-31166-1-git-send-email-zheyuma97@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 23:01:10 -04:00
Luis Chamberlain
e9d658c217 scsi: sr: Add error handling support for add_disk()
We never checked for errors on add_disk() as this function returned
void. Now that this is fixed, use the shiny new error handling.

Just put the cdrom kref and have the unwinding be done by
sr_kref_release().

Link: https://lore.kernel.org/r/20211015233028.2167651-3-mcgrof@kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:51:34 -04:00
Luis Chamberlain
2a7a891f4c scsi: sd: Add error handling support for add_disk()
We never checked for errors on add_disk() as this function returned
void. Now that this is fixed, use the shiny new error handling.

As with the error handling for device_add() we follow the same logic and
just put the device so that cleanup is done via the scsi_disk_release().

Link: https://lore.kernel.org/r/20211015233028.2167651-2-mcgrof@kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:51:34 -04:00
Mike Christie
f9793d649c scsi: target: Perform ALUA group changes in one step
When userspace changes the LUN's ALUA group, it will set the LUN's group to
NULL then to the new group. Before the new group is set,
target_alua_state_check() will return 0 and allow the I/O to execute. This
has us skip the NULL stage, and just swap in the new group.

Link: https://lore.kernel.org/r/20210930020422.92578-6-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:36 -04:00
Mike Christie
7324f47d42 scsi: target: Replace lun_tg_pt_gp_lock with rcu in I/O path
We are only holding the lun_tg_pt_gp_lock in target_alua_state_check() to
make sure tg_pt_gp is not freed from under us while we copy the state,
delay, ID values. We can instead use RCU here to access the tg_pt_gp.

With this patch IOPs can increase up to 10% for jobs like:

  fio  --filename=/dev/sdX  --direct=1 --rw=randrw --bs=4k \
    --ioengine=libaio --iodepth=64  --numjobs=N

when there are multiple sessions (running that fio command to each /dev/sdX
or using multipath and there are over 8 paths), or more than 8 queues for
the loop or vhost with multiple threads case and numjobs > 8.

Link: https://lore.kernel.org/r/20210930020422.92578-5-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:36 -04:00
Mike Christie
1283c0d1a3 scsi: target: Fix alua_tg_pt_gps_count tracking
We can't free the tg_pt_gp in core_alua_set_tg_pt_gp_id() because it's
still accessed via configfs. Its release must go through the normal
configfs/refcount process.

The max alua_tg_pt_gps_count check should probably have been done in
core_alua_allocate_tg_pt_gp(), but with the current code userspace could
have created 0x0000ffff + 1 groups, but only set the id for 0x0000ffff.
Then it could have deleted a group with an ID set, and then set the ID for
that extra group and it would work ok.

It's unlikely, but just in case this patch continues to allow that type of
behavior, and just fixes the kfree() while in use bug.

Link: https://lore.kernel.org/r/20210930020422.92578-4-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:36 -04:00
Mike Christie
ed1227e080 scsi: target: Fix ordered tag handling
This patch fixes the following bugs:

1. If there are multiple ordered cmds queued and multiple simple cmds
   completing, target_restart_delayed_cmds() could be called on different
   CPUs and each instance could start a ordered cmd. They could then run in
   different orders than they were queued.

2. target_restart_delayed_cmds() and target_handle_task_attr() can race
   where:

   1. target_handle_task_attr() has passed the simple_cmds == 0 check.

   2. transport_complete_task_attr() then decrements simple_cmds to 0.

   3. transport_complete_task_attr() runs target_restart_delayed_cmds() and
      it does not see any cmds on the delayed_cmd_list.

   4. target_handle_task_attr() adds the cmd to the delayed_cmd_list.

   The cmd will then end up timing out.

3. If we are sent > 1 ordered cmds and simple_cmds == 0, we can execute
   them out of order, because target_handle_task_attr() will hit that
   simple_cmds check first and return false for all ordered cmds sent.

4. We run target_restart_delayed_cmds() after every cmd completion, so if
   there is more than 1 simple cmd running, we start executing ordered cmds
   after that first cmd instead of waiting for all of them to complete.

5. Ordered cmds are not supposed to start until HEAD OF QUEUE and all older
   cmds have completed, and not just simple.

6. It's not a bug but it doesn't make sense to take the delayed_cmd_lock
   for every cmd completion when ordered cmds are almost never used. Just
   replacing that lock with an atomic increases IOPs by up to 10% when
   completions are spread over multiple CPUs and there are multiple
   sessions/ mqs/thread accessing the same device.

This patch moves the queued delayed handling to a per device work to
serialze the cmd executions for each device and adds a new counter to track
HEAD_OF_QUEUE and SIMPLE cmds. We can then check the new counter to
determine when to run the work on the completion path.

Link: https://lore.kernel.org/r/20210930020422.92578-3-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Mike Christie
945a160794 scsi: target: Fix ordered CMD_T_SENT handling
We can race where target_handle_task_attr() has put the cmd on the
delayed_cmd_list. Then target_restart_delayed_cmds() has removed it and set
CMD_T_SENT, but then target_execute_cmd() now clears that bit.

This patch moves the clearing to before we've put the cmd on the list.

Link: https://lore.kernel.org/r/20210930020422.92578-2-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Peter Wang
25d542a853 scsi: ufs: ufs-mediatek: Fix wrong location for ref-clk delay
Fix the location of delay for ref-clk gating and ungating in
ufs_mtk_setup_ref_clk().

Link: https://lore.kernel.org/r/20211016005802.7729-4-stanley.chu@mediatek.com
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Stanley Chu
1eaff502a8 scsi: ufs: ufs-mediatek: Fix build error caused by use of sched_clock()
Add proper header for using sched_clock().

Link: https://lore.kernel.org/r/20211016005802.7729-3-stanley.chu@mediatek.com
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Stanley Chu
fc65e933fb scsi: ufs: ufs-mediatek: Introduce default delay for reference clock
Introduce default delay time for gating or ungating reference clock instead
of ambiguous magic numbers.

The defined value is suitable for all current MediaTek UFS platforms.

Link: https://lore.kernel.org/r/20211016005802.7729-2-stanley.chu@mediatek.com
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Bodo Stroesser
1d2ac7b69d scsi: target: tcmu: Allocate zeroed pages for data area
Tcmu populates the data area (used for communication with userspace) with
pages that are allocated by calling alloc_page(GFP_NOIO).  Therefore
previous content of the allocated pages is exposed to user space. Avoid
this by adding __GFP_ZERO flag.

Zeroing the pages does (nearly) not affect tcmu throughput, because
allocated pages are re-used for the data transfers of later SCSI cmds.

Link: https://lore.kernel.org/r/20211013171606.25197-1-bostroesser@gmail.com
Signed-off-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Varun Prakash
d1e51ea6bf scsi: target: cxgbit: Enable Delayed ACK
Enable Delayed ACK to reduce the number of TCP ACKs.

Link: https://lore.kernel.org/r/1634135109-5044-1-git-send-email-varun@chelsio.com
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Varun Prakash
7f96c7a67e scsi: target: cxgbit: Increase max DataSegmentLength
Current value of max DataSegmentLength is 8K. T5/T6 adapters support
DataSegmentLength upto 16K. Increase max DataSegmentLength.

Link: https://lore.kernel.org/r/1634135087-4996-1-git-send-email-varun@chelsio.com
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Ye Bin
f347c26836 scsi: scsi_debug: Fix out-of-bound read in resp_report_tgtpgs()
The following issue was observed running syzkaller:

BUG: KASAN: slab-out-of-bounds in memcpy include/linux/string.h:377 [inline]
BUG: KASAN: slab-out-of-bounds in sg_copy_buffer+0x150/0x1c0 lib/scatterlist.c:831
Read of size 2132 at addr ffff8880aea95dc8 by task syz-executor.0/9815

CPU: 0 PID: 9815 Comm: syz-executor.0 Not tainted 4.19.202-00874-gfc0fe04215a9 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xe4/0x14a lib/dump_stack.c:118
 print_address_description+0x73/0x280 mm/kasan/report.c:253
 kasan_report_error mm/kasan/report.c:352 [inline]
 kasan_report+0x272/0x370 mm/kasan/report.c:410
 memcpy+0x1f/0x50 mm/kasan/kasan.c:302
 memcpy include/linux/string.h:377 [inline]
 sg_copy_buffer+0x150/0x1c0 lib/scatterlist.c:831
 fill_from_dev_buffer+0x14f/0x340 drivers/scsi/scsi_debug.c:1021
 resp_report_tgtpgs+0x5aa/0x770 drivers/scsi/scsi_debug.c:1772
 schedule_resp+0x464/0x12f0 drivers/scsi/scsi_debug.c:4429
 scsi_debug_queuecommand+0x467/0x1390 drivers/scsi/scsi_debug.c:5835
 scsi_dispatch_cmd+0x3fc/0x9b0 drivers/scsi/scsi_lib.c:1896
 scsi_request_fn+0x1042/0x1810 drivers/scsi/scsi_lib.c:2034
 __blk_run_queue_uncond block/blk-core.c:464 [inline]
 __blk_run_queue+0x1a4/0x380 block/blk-core.c:484
 blk_execute_rq_nowait+0x1c2/0x2d0 block/blk-exec.c:78
 sg_common_write.isra.19+0xd74/0x1dc0 drivers/scsi/sg.c:847
 sg_write.part.23+0x6e0/0xd00 drivers/scsi/sg.c:716
 sg_write+0x64/0xa0 drivers/scsi/sg.c:622
 __vfs_write+0xed/0x690 fs/read_write.c:485
kill_bdev:block_device:00000000e138492c
 vfs_write+0x184/0x4c0 fs/read_write.c:549
 ksys_write+0x107/0x240 fs/read_write.c:599
 do_syscall_64+0xc2/0x560 arch/x86/entry/common.c:293
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

We get 'alen' from command its type is int. If userspace passes a large
length we will get a negative 'alen'.

Switch n, alen, and rlen to u32.

Link: https://lore.kernel.org/r/20211013033913.2551004-3-yebin10@huawei.com
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Ye Bin
4e3ace0051 scsi: scsi_debug: Fix out-of-bound read in resp_readcap16()
The following warning was observed running syzkaller:

[ 3813.830724] sg_write: data in/out 65466/242 bytes for SCSI command 0x9e-- guessing data in;
[ 3813.830724]    program syz-executor not setting count and/or reply_len properly
[ 3813.836956] ==================================================================
[ 3813.839465] BUG: KASAN: stack-out-of-bounds in sg_copy_buffer+0x157/0x1e0
[ 3813.841773] Read of size 4096 at addr ffff8883cf80f540 by task syz-executor/1549
[ 3813.846612] Call Trace:
[ 3813.846995]  dump_stack+0x108/0x15f
[ 3813.847524]  print_address_description+0xa5/0x372
[ 3813.848243]  kasan_report.cold+0x236/0x2a8
[ 3813.849439]  check_memory_region+0x240/0x270
[ 3813.850094]  memcpy+0x30/0x80
[ 3813.850553]  sg_copy_buffer+0x157/0x1e0
[ 3813.853032]  sg_copy_from_buffer+0x13/0x20
[ 3813.853660]  fill_from_dev_buffer+0x135/0x370
[ 3813.854329]  resp_readcap16+0x1ac/0x280
[ 3813.856917]  schedule_resp+0x41f/0x1630
[ 3813.858203]  scsi_debug_queuecommand+0xb32/0x17e0
[ 3813.862699]  scsi_dispatch_cmd+0x330/0x950
[ 3813.863329]  scsi_request_fn+0xd8e/0x1710
[ 3813.863946]  __blk_run_queue+0x10b/0x230
[ 3813.864544]  blk_execute_rq_nowait+0x1d8/0x400
[ 3813.865220]  sg_common_write.isra.0+0xe61/0x2420
[ 3813.871637]  sg_write+0x6c8/0xef0
[ 3813.878853]  __vfs_write+0xe4/0x800
[ 3813.883487]  vfs_write+0x17b/0x530
[ 3813.884008]  ksys_write+0x103/0x270
[ 3813.886268]  __x64_sys_write+0x77/0xc0
[ 3813.886841]  do_syscall_64+0x106/0x360
[ 3813.887415]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

This issue can be reproduced with the following syzkaller log:

r0 = openat(0xffffffffffffff9c, &(0x7f0000000040)='./file0\x00', 0x26e1, 0x0)
r1 = syz_open_procfs(0xffffffffffffffff, &(0x7f0000000000)='fd/3\x00')
open_by_handle_at(r1, &(0x7f00000003c0)=ANY=[@ANYRESHEX], 0x602000)
r2 = syz_open_dev$sg(&(0x7f0000000000), 0x0, 0x40782)
write$binfmt_aout(r2, &(0x7f0000000340)=ANY=[@ANYBLOB="00000000deff000000000000000000000000000000000000000000000000000047f007af9e107a41ec395f1bded7be24277a1501ff6196a83366f4e6362bc0ff2b247f68a972989b094b2da4fb3607fcf611a22dd04310d28c75039d"], 0x126)

In resp_readcap16() we get "int alloc_len" value -1104926854, and then pass
the huge arr_len to fill_from_dev_buffer(), but arr is only 32 bytes. This
leads to OOB in sg_copy_buffer().

To solve this issue, define alloc_len as u32.

Link: https://lore.kernel.org/r/20211013033913.2551004-2-yebin10@huawei.com
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Colin Ian King
8ecfb16c9b scsi: 3w-xxx: Remove redundant initialization of variable retval
The variable retval is being initialized with a value that is never read,
it is being updated immediately afterwards. The assignment is redundant and
can be removed.

Link: https://lore.kernel.org/r/20211013182834.137410-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Addresses-Coverity: ("Unused value")
2021-10-18 22:38:34 -04:00
MichelleJin
b3ef4a0e40 scsi: fcoe: Use netif_is_bond_master() instead of open code
'netdev->priv_flags & IFF_BONDING && netdev->flags & IFF_MASTER' is defined
as netif_is_bond_master() in netdevice.h. Replace it to clean up code.

Link: https://lore.kernel.org/r/20211015142006.540773-1-shjy180909@gmail.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: MichelleJin <shjy180909@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:34 -04:00
Tyrel Datwyler
3319a8ba82 scsi: ibmvscsi: Use GFP_KERNEL with dma_alloc_coherent() in initialize_event_pool()
During driver probe we allocate a dma region for our event pool.
Currently, zero is passed for the gfp_flags parameter. Driver probe
callbacks are run in process context and we hold no locks so we can sleep
here if necessary.

Fix by passing GFP_KERNEL explicitly to dma_alloc_coherent().

Link: https://lore.kernel.org/r/1547089149-20577-1-git-send-email-tyreld@linux.vnet.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:34 -04:00
Dan Carpenter
30e99f05f8 scsi: mpi3mr: Use scnprintf() instead of snprintf()
I intended to move from snprintf() to scnprintf() in the previous patch but
I messed up and did not do that.  The result of my bug is that it this
function could trigger a WARN() if the buffer is too large.

Link: https://lore.kernel.org/r/20211013083005.GA8592@kili
Fixes: 76a4f7cc59 ("scsi: mpi3mr: Clean up mpi3mr_print_ioc_info()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:34 -04:00
Martin Kepplinger
c4da120575 scsi: sd: Print write through due to no caching mode page as warning
For SD cardreaders it is extremely common not to have a cache.
Consequently, the following messages do not point to a real error one could
try to fix but rather describe how the disk works:

  sd 0:0:0:0: [sda] No Caching mode page found
  sd 0:0:0:0: [sda] Assuming drive cache: write through

Print these messages as warnings instead of errors.

Link: https://lore.kernel.org/r/20211013075050.3870354-1-martin.kepplinger@puri.sm
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:34 -04:00
Yonghong Song
223f903e9c bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG
Patch set [1] introduced BTF_KIND_TAG to allow tagging
declarations for struct/union, struct/union field, var, func
and func arguments and these tags will be encoded into
dwarf. They are also encoded to btf by llvm for the bpf target.

After BTF_KIND_TAG is introduced, we intended to use it
for kernel __user attributes. But kernel __user is actually
a type attribute. Upstream and internal discussion showed
it is not a good idea to mix declaration attribute and
type attribute. So we proposed to introduce btf_type_tag
as a type attribute and existing btf_tag renamed to
btf_decl_tag ([2]).

This patch renamed BTF_KIND_TAG to BTF_KIND_DECL_TAG and some
other declarations with *_tag to *_decl_tag to make it clear
the tag is for declaration. In the future, BTF_KIND_TYPE_TAG
might be introduced per [3].

 [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
 [2] https://reviews.llvm.org/D111588
 [3] https://reviews.llvm.org/D111199

Fixes: b5ea834dde ("bpf: Support for new btf kind BTF_KIND_TAG")
Fixes: 5b84bd1036 ("libbpf: Add support for BTF_KIND_TAG")
Fixes: 5c07f2fec0 ("bpftool: Add support for BTF_KIND_TAG")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com
2021-10-18 18:35:36 -07:00
Shai Malin
939a6567f9 qed: Change the TCP common variable - "iscsi_ooo"
Change the TCP common variable - "iscsi_ooo" to "ooo_opq".
This variable is common between all the TCP L5 protocols and not
specific to iSCSI.

Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Link: https://lore.kernel.org/r/20211015124118.29041-2-smalin@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-18 15:58:21 -07:00
Shai Malin
891e861efb qed: Optimize the ll2 ooo flow
Optimize the ll2 TCP out-of-order likely flows:
- Optimize the non-error flows of the ll2 ooo data path.
- Optimize "QED_OOO_RIGHT_BUF" over "QED_OOO_LEFT_BUF".

Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Link: https://lore.kernel.org/r/20211015124118.29041-1-smalin@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-18 15:58:21 -07:00
Gaosheng Cui
d9516f346e audit: return early if the filter rule has a lower priority
It is not necessary for audit_filter_rules() functions to check
audit fileds of the rule with a lower priority, and if we did,
there might be some unintended effects, such as the ctx->ppid
may be changed unexpectedly, so return early if the rule has
a lower priority.

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
[PM: slight tweak to the subject line]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-10-18 18:34:37 -04:00
Gaosheng Cui
6e3ee990c9 audit: fix possible null-pointer dereference in audit_filter_rules
Fix  possible null-pointer dereference in audit_filter_rules.

audit_filter_rules() error: we previously assumed 'ctx' could be null

Cc: stable@vger.kernel.org
Fixes: bf361231c2 ("audit: add saddr_fam filter field")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-10-18 18:27:47 -04:00
Steven Rostedt (VMware)
ed65df63a3 tracing: Have all levels of checks prevent recursion
While writing an email explaining the "bit = 0" logic for a discussion on
making ftrace_test_recursion_trylock() disable preemption, I discovered a
path that makes the "not do the logic if bit is zero" unsafe.

The recursion logic is done in hot paths like the function tracer. Thus,
any code executed causes noticeable overhead. Thus, tricks are done to try
to limit the amount of code executed. This included the recursion testing
logic.

Having recursion testing is important, as there are many paths that can
end up in an infinite recursion cycle when tracing every function in the
kernel. Thus protection is needed to prevent that from happening.

Because it is OK to recurse due to different running context levels (e.g.
an interrupt preempts a trace, and then a trace occurs in the interrupt
handler), a set of bits are used to know which context one is in (normal,
softirq, irq and NMI). If a recursion occurs in the same level, it is
prevented*.

Then there are infrastructure levels of recursion as well. When more than
one callback is attached to the same function to trace, it calls a loop
function to iterate over all the callbacks. Both the callbacks and the
loop function have recursion protection. The callbacks use the
"ftrace_test_recursion_trylock()" which has a "function" set of context
bits to test, and the loop function calls the internal
trace_test_and_set_recursion() directly, with an "internal" set of bits.

If an architecture does not implement all the features supported by ftrace
then the callbacks are never called directly, and the loop function is
called instead, which will implement the features of ftrace.

Since both the loop function and the callbacks do recursion protection, it
was seemed unnecessary to do it in both locations. Thus, a trick was made
to have the internal set of recursion bits at a more significant bit
location than the function bits. Then, if any of the higher bits were set,
the logic of the function bits could be skipped, as any new recursion
would first have to go through the loop function.

This is true for architectures that do not support all the ftrace
features, because all functions being traced must first go through the
loop function before going to the callbacks. But this is not true for
architectures that support all the ftrace features. That's because the
loop function could be called due to two callbacks attached to the same
function, but then a recursion function inside the callback could be
called that does not share any other callback, and it will be called
directly.

i.e.

 traced_function_1: [ more than one callback tracing it ]
   call loop_func

 loop_func:
   trace_recursion set internal bit
   call callback

 callback:
   trace_recursion [ skipped because internal bit is set, return 0 ]
   call traced_function_2

 traced_function_2: [ only traced by above callback ]
   call callback

 callback:
   trace_recursion [ skipped because internal bit is set, return 0 ]
   call traced_function_2

 [ wash, rinse, repeat, BOOM! out of shampoo! ]

Thus, the "bit == 0 skip" trick is not safe, unless the loop function is
call for all functions.

Since we want to encourage architectures to implement all ftrace features,
having them slow down due to this extra logic may encourage the
maintainers to update to the latest ftrace features. And because this
logic is only safe for them, remove it completely.

 [*] There is on layer of recursion that is allowed, and that is to allow
     for the transition between interrupt context (normal -> softirq ->
     irq -> NMI), because a trace may occur before the context update is
     visible to the trace recursion logic.

Link: https://lore.kernel.org/all/609b565a-ed6e-a1da-f025-166691b5d994@linux.alibaba.com/
Link: https://lkml.kernel.org/r/20211018154412.09fcad3c@gandalf.local.home

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Jisheng Zhang <jszhang@kernel.org>
Cc: =?utf-8?b?546L6LSH?= <yun.wang@linux.alibaba.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: stable@vger.kernel.org
Fixes: edc15cafcb ("tracing: Avoid unnecessary multiple recursion checks")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-18 18:12:09 -04:00
Nathan Chancellor
8a64ef042e nfp: bpf: silence bitwise vs. logical OR warning
A new warning in clang points out two places in this driver where
boolean expressions are being used with a bitwise OR instead of a
logical one:

drivers/net/ethernet/netronome/nfp/nfp_asm.c:199:20: error: use of bitwise '|' with boolean operands [-Werror,-Wbitwise-instead-of-logical]
        reg->src_lmextn = swreg_lmextn(lreg) | swreg_lmextn(rreg);
                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                             ||
drivers/net/ethernet/netronome/nfp/nfp_asm.c:199:20: note: cast one or both operands to int to silence this warning
drivers/net/ethernet/netronome/nfp/nfp_asm.c:280:20: error: use of bitwise '|' with boolean operands [-Werror,-Wbitwise-instead-of-logical]
        reg->src_lmextn = swreg_lmextn(lreg) | swreg_lmextn(rreg);
                          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                             ||
drivers/net/ethernet/netronome/nfp/nfp_asm.c:280:20: note: cast one or both operands to int to silence this warning
2 errors generated.

The motivation for the warning is that logical operations short circuit
while bitwise operations do not. In this case, it does not seem like
short circuiting is harmful so implement the suggested fix of changing
to a logical operation to fix the warning.

Link: https://github.com/ClangBuiltLinux/linux/issues/1479
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20211018193101.2340261-1-nathan@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-18 14:50:01 -07:00
Barry Song
ac8e3cef58 PCI/sysfs: Explicitly show first MSI IRQ for 'irq'
The sysfs "irq" file contains the legacy INTx IRQ.  Or, if the device has
MSI enabled, it contains the first MSI IRQ instead.

Previously this file showed the pci_dev.irq value directly.  But we'd
prefer to use pci_dev.irq only for the INTx IRQ and decouple that from any
MSI or MSI-X IRQs.

If the device has MSI enabled, explicitly look up and show the first MSI
IRQ in the sysfs "irq" file.  Otherwise, show the INTx IRQ.

This removes the requirement that msi_capability_init() set pci_dev.irq to
the first MSI IRQ when enabling MSI and pci_msi_shutdown() restore the INTx
IRQ when disabling MSI.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20210825102636.52757-3-21cnbao@gmail.com
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-10-18 16:43:04 -05:00
Rob Clark
5ca6779d2f drm/msm/devfreq: Restrict idle clamping to a618 for now
Until we better understand the stability issues caused by frequent
frequency changes, lets limit them to a618.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Tested-by: Caleb Connolly <caleb.connolly@linaro.org>
Link: https://lore.kernel.org/r/20211018153627.2787882-1-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-10-18 14:31:57 -07:00
Bjorn Andersson
e60af4f855 dt-bindings: msm/dp: Add SC8180x compatibles
The Qualcomm SC8180x has 2 DP controllers and 1 eDP controller, add
compatibles for these to the msm/dp binding.

Reviewed-by: Abhinav Kumar <abhinavk@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211016221843.2167329-7-bjorn.andersson@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-10-18 14:28:30 -07:00
Bjorn Andersson
bb3de286d9 drm/msm/dp: Support up to 3 DP controllers
Based on the removal of the g_dp_display and the movement of the
priv->dp lookup into the DP code it's now possible to have multiple
DP instances.

In line with the other controllers in the MSM driver, introduce a
per-compatible list of base addresses which is used to resolve the
"instance id" for the given DP controller. This instance id is used as
index in the priv->dp[] array.

Then extend the initialization code to initialize struct drm_encoder for
each of the registered priv->dp[] and update the logic for associating
each struct msm_dp with the struct dpu_encoder_virt.

A new enum is introduced to document the connection between the
instances referenced in the dpu_intf_cfg array and the controllers in
the DP driver and sc7180 is updated.

Lastly, bump the number of struct msm_dp instances carries by priv->dp
to 3, the currently known maximum number of controllers found in a
Qualcomm SoC.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20211016221843.2167329-6-bjorn.andersson@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-10-18 14:28:30 -07:00
Bjorn Andersson
4b296d15b3 drm/msm/dp: Allow attaching a drm_panel
eDP panels might need some power sequencing and backlight management,
so make it possible to associate a drm_panel with an eDP instance and
prepare and enable the panel accordingly.

Now that we know which hardware instance is DP and which is eDP,
parser->parse() is passed the connector_type and the parser is limited
to only search for a panel in the eDP case.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20211016221843.2167329-5-bjorn.andersson@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-10-18 14:28:30 -07:00
Bjorn Andersson
269e92d84c drm/msm/dp: Allow specifying connector_type per controller
As the following patches introduced support for multiple DP blocks in a
platform and some of those block might be eDP it becomes useful to be
able to specify the connector type per block.

Although there's only a single block at this point, the array of descs
and the search in dp_display_get_desc() are introduced here to simplify
the next patch, that does introduce support for multiple DP blocks.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20211016221843.2167329-4-bjorn.andersson@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-10-18 14:28:29 -07:00
Bjorn Andersson
167dac97eb drm/msm/dp: Modify prototype of encoder based API
Functions in the DisplayPort code that relates to individual instances
(encoders) are passed both the struct msm_dp and the struct drm_encoder.
But in a situation where multiple DP instances would exist this means
that the caller need to resolve which struct msm_dp relates to the
struct drm_encoder at hand.

Store a reference to the struct msm_dp associated with each
dpu_encoder_virt to allow the particular instance to be associate with
the encoder in the following patch.

Reviewed-by: Abhinav Kumar <abhinavk@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211016221843.2167329-3-bjorn.andersson@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-10-18 14:28:29 -07:00
Bjorn Andersson
d624e50aa3 drm/msm/dp: Remove global g_dp_display variable
As the Qualcomm DisplayPort driver only supports a single instance of
the driver the commonly used struct dp_display is kept in a global
variable. As we introduce additional instances this obviously doesn't
work.

Replace this with a combination of existing references to adjacent
objects and drvdata.

Reviewed-by: Abhinav Kumar <abhinavk@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211016221843.2167329-2-bjorn.andersson@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-10-18 14:28:29 -07:00
Lukas Bulwahn
f616447034 MAINTAINERS: adjust file entry for of_net.c after movement
Commit e330fb1459 ("of: net: move of_net under net/") moves of_net.c
to ./net/core/, but misses to adjust the reference to this file in
MAINTAINERS.

Hence, ./scripts/get_maintainer.pl --self-test=patterns complains:

   warning: no file matches    F:    drivers/of/of_net.c

Adjust the file entry after this file movement.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20211016055815.14397-1-lukas.bulwahn@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-18 14:19:49 -07:00
Barry Song
5e3be666f4 PCI: Document /sys/bus/pci/devices/.../irq
Document /sys/bus/pci/devices/.../irq.

This file contains the IRQ of the INTx interrupt (or zero if the device
doesn't support INTx interrupts).

If the device has enabled MSI (not MSI-X), it contains the first MSI IRQ
instead.  This is a historical mistake because devices may support several
MSI or MSI-X vectors, and this file can't contain them all.  But we
preserve this behavior to avoid breaking userspace.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20210825102636.52757-2-21cnbao@gmail.com
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-18 16:16:28 -05:00
Mateusz Palczewski
898ef1cb1c iavf: Combine init and watchdog state machines
Use single state machine for driver initialization and for service
initialized driver. The init state machine implemented in init_task()
is merged into the watchdog_task(). The init_task() function is
removed.

Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-18 14:07:24 -07:00
Mateusz Palczewski
59756ad694 iavf: Add __IAVF_INIT_FAILED state
This commit adds a new state, __IAVF_INIT_FAILED to the state machine.
From now on initialization functions report errors not by returning an
error value, but by changing the state to indicate that something went
wrong.

Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-18 14:07:24 -07:00
Mateusz Palczewski
45eebd6299 iavf: Refactor iavf state machine tracking
Replace state changes of iavf state machine
with a method that also tracks the previous
state the machine was on.

This change is required for further work with
refactoring init and watchdog state machines.

Tracking of previous state would help us
recover iavf after failure has occurred.

Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-18 14:07:16 -07:00
Eric W. Biederman
15bc01effe ucounts: Fix signal ucount refcounting
In commit fda31c5029 ("signal: avoid double atomic counter
increments for user accounting") Linus made a clever optimization to
how rlimits and the struct user_struct.  Unfortunately that
optimization does not work in the obvious way when moved to nested
rlimits.  The problem is that the last decrement of the per user
namespace per user sigpending counter might also be the last decrement
of the sigpending counter in the parent user namespace as well.  Which
means that simply freeing the leaf ucount in __free_sigqueue is not
enough.

Maintain the optimization and handle the tricky cases by introducing
inc_rlimit_get_ucounts and dec_rlimit_put_ucounts.

By moving the entire optimization into functions that perform all of
the work it becomes possible to ensure that every level is handled
properly.

The new function inc_rlimit_get_ucounts returns 0 on failure to
increment the ucount.  This is different than inc_rlimit_ucounts which
increments the ucounts and returns LONG_MAX if the ucount counter has
exceeded it's maximum or it wrapped (to indicate the counter needs to
decremented).

I wish we had a single user to account all pending signals to across
all of the threads of a process so this complexity was not necessary

Cc: stable@vger.kernel.org
Fixes: d646969055 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
v1: https://lkml.kernel.org/r/87mtnavszx.fsf_-_@disp2133
Link: https://lkml.kernel.org/r/87fssytizw.fsf_-_@disp2133
Reviewed-by: Alexey Gladkov <legion@kernel.org>
Tested-by: Rune Kleveland <rune.kleveland@infomedia.dk>
Tested-by: Yu Zhao <yuzhao@google.com>
Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-10-18 16:02:30 -05:00