Commit graph

1059881 commits

Author SHA1 Message Date
Quinn Tran
8062b742d3 scsi: qla2xxx: edif: Replace list_for_each_safe with list_for_each_entry_safe
This patch is per review comment by Hannes Reinecke from previous
submission to replace list_for_each_safe with list_for_each_entry_safe.
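
For readers unfamiliar with the two iterators, the following is a userspace sketch of the difference. The list helpers are minimal stand-ins modelled on the kernel's <linux/list.h>, and struct demo_node, fill(), drain_raw() and drain_entry() are hypothetical names, not code from the qla2xxx driver. It relies on the GNU __typeof__ extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins modelled on the kernel's <linux/list.h> helpers. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static void list_del(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
}

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Raw-node form: the loop body must call list_entry() itself. */
#define list_for_each_safe(pos, n, head) \
    for (pos = (head)->next, n = pos->next; pos != (head); \
         pos = n, n = pos->next)

/* Entry form: the cursor is already the containing structure. */
#define list_for_each_entry_safe(pos, n, head, member) \
    for (pos = list_entry((head)->next, __typeof__(*pos), member), \
         n = list_entry(pos->member.next, __typeof__(*pos), member); \
         &pos->member != (head); \
         pos = n, n = list_entry(n->member.next, __typeof__(*n), member))

/* Hypothetical element type, for illustration only. */
struct demo_node {
    int id;
    struct list_head list;
};

/* Append n elements; returns 0 on success, -1 on allocation failure. */
static int fill(struct list_head *head, int n)
{
    for (int i = 0; i < n; i++) {
        struct demo_node *node = malloc(sizeof(*node));

        if (!node)
            return -1;
        node->id = i;
        list_add_tail(&node->list, head);
    }
    return 0;
}

/* Free every element with the raw-node iterator; returns the count. */
static int drain_raw(struct list_head *head)
{
    struct list_head *pos, *tmp;
    int freed = 0;

    list_for_each_safe(pos, tmp, head) {
        struct demo_node *node = list_entry(pos, struct demo_node, list);

        list_del(&node->list);
        free(node);
        freed++;
    }
    return freed;
}

/* Same loop with the entry iterator: no explicit list_entry() call. */
static int drain_entry(struct list_head *head)
{
    struct demo_node *node, *tmp;
    int freed = 0;

    list_for_each_entry_safe(node, tmp, head, list) {
        list_del(&node->list);
        free(node);
        freed++;
    }
    return freed;
}
```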

Link: https://lore.kernel.org/r/20211026115412.27691-8-njavali@marvell.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:52:00 -04:00
Quinn Tran
b1af26c245 scsi: qla2xxx: edif: Flush stale events and msgs on session down
On session down, driver will flush all stale messages and doorbell
events. This prevents authentication application from having to process
stale data.

Link: https://lore.kernel.org/r/20211026115412.27691-7-njavali@marvell.com
Fixes: 4de067e5df ("scsi: qla2xxx: edif: Add N2N support for EDIF")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Co-developed-by: Karunakara Merugu <kmerugu@marvell.com>
Signed-off-by: Karunakara Merugu <kmerugu@marvell.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:51:59 -04:00
Quinn Tran
b492d6a488 scsi: qla2xxx: edif: Fix app start delay
Current driver does unnecessary pause for each session to get to certain
state before allowing the app start call to return. In larger environment,
this introduces a long delay.  Originally the delay was meant to
synchronize app and driver. However, with the current implementation the
two sides use various events to synchronize their state.

The same is applied to the authentication failure call.

Link: https://lore.kernel.org/r/20211026115412.27691-6-njavali@marvell.com
Fixes: 4de067e5df ("scsi: qla2xxx: edif: Add N2N support for EDIF")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:51:59 -04:00
Quinn Tran
8e6d5df3cb scsi: qla2xxx: edif: Fix app start fail
On app start, all sessions need to be reset to see if secure connection can
be made. Fix the broken check which prevents that process.

Link: https://lore.kernel.org/r/20211026115412.27691-5-njavali@marvell.com
Fixes: 4de067e5df ("scsi: qla2xxx: edif: Add N2N support for EDIF")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:51:59 -04:00
Quinn Tran
0b7a9fd934 scsi: qla2xxx: Turn off target reset during issue_lip
When user uses issue_lip to do link bounce, driver sends additional target
reset to remote device before resetting the link. The target reset would
affect other paths with active I/Os. This patch will remove the unnecessary
target reset.

Link: https://lore.kernel.org/r/20211026115412.27691-4-njavali@marvell.com
Fixes: 5854771e31 ("[SCSI] qla2xxx: Add ISPFX00 specific bus reset routine")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:51:59 -04:00
Quinn Tran
c98c5daaa2 scsi: qla2xxx: Fix gnl list corruption
Current code does list element deletion and addition in and out of lock
protection. This patch moves deletion behind lock.

list_add double add: new=ffff9130b5eb89f8, prev=ffff9130b5eb89f8,
    next=ffff9130c6a715f0.
 ------------[ cut here ]------------
 kernel BUG at lib/list_debug.c:31!
 invalid opcode: 0000 [#1] SMP PTI
 CPU: 1 PID: 182395 Comm: kworker/1:37 Kdump: loaded Tainted: G W  OE
 --------- -  - 4.18.0-193.el8.x86_64 #1
 Hardware name: HP ProLiant DL160 Gen8, BIOS J03 02/10/2014
 Workqueue: qla2xxx_wq qla2x00_iocb_work_fn [qla2xxx]
 RIP: 0010:__list_add_valid+0x41/0x50
 Code: 85 94 00 00 00 48 39 c7 74 0b 48 39 d7 74 06 b8 01 00 00 00 c3 48 89 f2
 4c 89 c1 48 89 fe 48 c7 c7 60 83 ad 97 e8 4d bd ce ff <0f> 0b 0f 1f 00 66 2e
 0f 1f 84 00 00 00 00 00 48 8b 07 48 8b 57 08
 RSP: 0018:ffffaba306f47d68 EFLAGS: 00010046
 RAX: 0000000000000058 RBX: ffff9130b5eb8800 RCX: 0000000000000006
 RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff9130b7456a00
 RBP: ffff9130c6a70a58 R08: 000000000008d7be R09: 0000000000000001
 R10: 0000000000000000 R11: 0000000000000001 R12: ffff9130c6a715f0
 R13: ffff9130b5eb8824 R14: ffff9130b5eb89f8 R15: ffff9130b5eb89f8
 FS:  0000000000000000(0000) GS:ffff9130b7440000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007efcaaef11a0 CR3: 000000005200a002 CR4: 00000000000606e0
 Call Trace:
  qla24xx_async_gnl+0x113/0x3c0 [qla2xxx]
  ? qla2x00_iocb_work_fn+0x53/0x80 [qla2xxx]
  ? process_one_work+0x1a7/0x3b0
  ? worker_thread+0x30/0x390
  ? create_worker+0x1a0/0x1a0
  ? kthread+0x112/0x130

Link: https://lore.kernel.org/r/20211026115412.27691-3-njavali@marvell.com
Fixes: 726b854870 ("qla2xxx: Add framework for async fabric discovery")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:51:59 -04:00
Quinn Tran
bb2ca6b3f0 scsi: qla2xxx: Relogin during fabric disturbance
For RSCN of type "Area, Domain, or Fabric", which indicate a portion or
entire fabric was disturbed, current driver does not set the scan_need flag
to indicate a session was affected by the disturbance. This in turn can
lead to I/O timeout and delay of relogin. Hence initiate relogin in the
event of fabric disturbance.

Link: https://lore.kernel.org/r/20211026115412.27691-2-njavali@marvell.com
Fixes: 1560bafdff ("scsi: qla2xxx: Use complete switch scan for RSCN events")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:51:59 -04:00
Christophe JAILLET
2c2934c80e scsi: elx: Use 'bitmap_zalloc()' when applicable
'sli4->ext[i].use_map' is a bitmap. Use 'bitmap_zalloc()' to simplify code,
improve the semantic and avoid some open-coded arithmetic in allocator
arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.
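
As a rough userspace illustration of why a dedicated bitmap allocator removes open-coded arithmetic, the sketch below hides the bits-to-longs rounding inside one helper. All sketch_* names are hypothetical; the helpers are only modelled on bitmap_zalloc() and bitmap_free() and use calloc()/free() instead of kernel allocators.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITS_TO_LONGS(n) \
    (((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Zeroed bitmap of nbits bits; the caller never computes a byte count. */
static unsigned long *sketch_bitmap_zalloc(size_t nbits)
{
    return calloc(SKETCH_BITS_TO_LONGS(nbits), sizeof(unsigned long));
}

/* Paired release helper, so allocation and free stay symmetric. */
static void sketch_bitmap_free(unsigned long *map)
{
    free(map);
}

static void sketch_set_bit(unsigned long *map, size_t bit)
{
    map[bit / SKETCH_BITS_PER_LONG] |= 1UL << (bit % SKETCH_BITS_PER_LONG);
}

static int sketch_test_bit(const unsigned long *map, size_t bit)
{
    return (map[bit / SKETCH_BITS_PER_LONG] >>
            (bit % SKETCH_BITS_PER_LONG)) & 1UL;
}
```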

Link: https://lore.kernel.org/r/2a0a83949fb896a0a236dcca94dfdc8486d489f5.1635104793.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:28:33 -04:00
Bart Van Assche
1ea7d80263 scsi: ufs: core: Micro-optimize ufshcd_map_sg()
Replace two cpu_to_le32() calls by a single cpu_to_le64() call.
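
The equivalence behind this kind of change can be checked in userspace. The sketch below assumes the two 32-bit values are the lower and upper halves of one 64-bit address stored at adjacent offsets, lower half first; that layout and all function names are assumptions made for illustration, not taken from the UFS driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in little-endian byte order on any host. */
static void put_le32(uint8_t *dst, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        dst[i] = (uint8_t)(v >> (8 * i));
}

/* Store a 64-bit value in little-endian byte order on any host. */
static void put_le64(uint8_t *dst, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        dst[i] = (uint8_t)(v >> (8 * i));
}

/* Two-store form: lower half first, upper half in the next four bytes. */
static void store_addr_split(uint8_t out[8], uint64_t addr)
{
    put_le32(out, (uint32_t)(addr & 0xffffffffu));
    put_le32(out + 4, (uint32_t)(addr >> 32));
}

/* Single-store form. */
static void store_addr_whole(uint8_t out[8], uint64_t addr)
{
    put_le64(out, addr);
}

/* Returns 1 when both forms produce the same eight bytes. */
static int layouts_match(uint64_t addr)
{
    uint8_t a[8], b[8];

    store_addr_split(a, addr);
    store_addr_whole(b, addr);
    return memcmp(a, b, sizeof(a)) == 0;
}
```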

Additionally, issue a warning if the length of a scatter-gather list
element exceeds what is allowed by the UFSHCI specification.

Link: https://lore.kernel.org/r/20211020214024.2007615-11-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
Bart Van Assche
9a868c8ad3 scsi: ufs: core: Add a compile-time structure size check
Before modifying struct ufshcd_sg_entry, add a compile-time structure size
check.
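
A compile-time size check of this kind can be sketched in plain C11 as follows. struct sg_entry_sketch and its 16-byte layout are hypothetical stand-ins chosen for illustration; the snippet does not reproduce the actual check added to the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte descriptor; field names are illustrative only. */
struct sg_entry_sketch {
    uint32_t base_addr;
    uint32_t upper_addr;
    uint32_t reserved;
    uint32_t size;
};

/* Compilation fails here if a later edit changes the structure size. */
static_assert(sizeof(struct sg_entry_sketch) == 16,
              "sg_entry_sketch must stay 16 bytes");

/* Runtime accessor so callers can report the size that was checked. */
static size_t sg_entry_sketch_size(void)
{
    return sizeof(struct sg_entry_sketch);
}
```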

Link: https://lore.kernel.org/r/20211020214024.2007615-10-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
Bart Van Assche
3ad317a1f9 scsi: ufs: core: Remove three superfluous casts
Casting an int explicitly to u16 when passed as an argument to a function
is not necessary.

Since prd_table and ucd_prdt_ptr both have type struct ufshcd_sg_entry *,
remove the casts from assignments of these two to each other.

This patch does not change any functionality.

Link: https://lore.kernel.org/r/20211020214024.2007615-9-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
Bart Van Assche
7340faae94 scsi: ufs: core: Add debugfs attributes for triggering the UFS EH
Make it easier to test the impact of the UFS error handler on software that
submits SCSI commands to the UFS driver.

Link: https://lore.kernel.org/r/20211020214024.2007615-8-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
Bart Van Assche
e0022c6c29 scsi: ufs: core: Make it easier to add new debugfs attributes
Introduce an array for debugfs attributes to make it easier to add new
debugfs attributes. Change the value of the inode.i_private pointer for
debugfs attributes from a pointer to the HBA data structure to a pointer to
the attribute description for the stats attribute. Store the HBA pointer in
the private data of the parent inode instead.

Link: https://lore.kernel.org/r/20211020214024.2007615-7-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
Bart Van Assche
267a59f6a5 scsi: ufs: core: Export ufshcd_schedule_eh_work()
Make it possible to call ufshcd_schedule_eh_work() from other source files
than ufshcd.c. Additionally, convert a source code comment into a
lockdep_assert_held() call.

Link: https://lore.kernel.org/r/20211020214024.2007615-6-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
Bart Van Assche
4693fad7d6 scsi: ufs: core: Log error handler activity
Kernel logs are hard to comprehend without information about what the UFS
error handler is doing. Hence this patch that logs information about error
handler activity.

Link: https://lore.kernel.org/r/20211020214024.2007615-5-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
Bart Van Assche
957d63e77a scsi: ufs: core: Improve static type checking
Introduce an enumeration type for the overall command status to allow the
compiler to perform more static type checking.

Link: https://lore.kernel.org/r/20211020214024.2007615-4-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:49 -04:00
Bart Van Assche
91bb765cca scsi: ufs: core: Improve source code comments
Make the descriptions above data structures that come from the UFS
specification match the terminology from that specification. This makes it
easier to find these data structures while reading the UFS specification.

Link: https://lore.kernel.org/r/20211020214024.2007615-3-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:49 -04:00
Bart Van Assche
1168252357 scsi: ufs: Revert "Retry aborted SCSI commands instead of completing these successfully"
Commit 73dc3c4ac7 ("scsi: ufs: Retry aborted SCSI commands instead of
completing these successfully") is not necessary. If a SCSI command is
aborted successfully the UFS controller has not modified the command status
and the command status still has the value assigned by
ufshcd_prepare_req_desc_hdr(), namely OCS_INVALID_COMMAND_STATUS. The
function ufshcd_transfer_rsp_status() requeues commands that have an
invalid command status. Hence revert commit 73dc3c4ac7.

Link: https://lore.kernel.org/r/20211020214024.2007615-2-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:49 -04:00
Dmitry Bogdanov
12b6fcd0ea scsi: target: core: Remove from tmr_list during LUN unlink
Currently TMF commands are removed from se_device.dev_tmr_list at the very
end of se_cmd lifecycle. However, se_lun unlinks from se_cmd upon a command
status (response) being queued in transport layer. This means that LUN and
backend device can be deleted in the meantime and a panic will occur:

target_tmr_work()
	cmd->se_tfo->queue_tm_rsp(cmd); // send abort_rsp to a wire
	transport_lun_remove_cmd(cmd) // unlink se_cmd from se_lun
- // - // - // -
<<<--- lun remove
<<<--- core backend device remove
- // - // - // -
qlt_handle_abts_completion()
  tfo->free_mcmd()
    transport_generic_free_cmd()
      target_put_sess_cmd()
        core_tmr_release_req() {
          if (dev) { // backend device, can not be null
            spin_lock_irqsave(&dev->se_tmr_lock, flags); //<<<--- CRASH

Call Trace:
NIP [c000000000e1683c] _raw_spin_lock_irqsave+0x2c/0xc0
LR [c00800000e433338] core_tmr_release_req+0x40/0xa0 [target_core_mod]
Call Trace:
(unreliable)
0x0
target_put_sess_cmd+0x2a0/0x370 [target_core_mod]
transport_generic_free_cmd+0x6c/0x1b0 [target_core_mod]
tcm_qla2xxx_complete_mcmd+0x28/0x50 [tcm_qla2xxx]
process_one_work+0x2c4/0x5c0
worker_thread+0x88/0x690

For the iSCSI protocol this is easily reproduced:

 - Send some SCSI command

 - Send Abort of that command over iSCSI

 - Remove LUN on target

 - Send next iSCSI command to acknowledge the Abort_Response

 - Target panics

There is no need to keep the command in tmr_list until response completion,
so move the removal from tmr_list from the response completion to the
response queueing when the LUN is unlinked.  Move the removal from state
list too as it is a subject to the same race condition.

Link: https://lore.kernel.org/r/20211018135753.15297-1-d.bogdanov@yadro.com
Fixes: c66ac9db8d ("[SCSI] target: Add LIO target core v4.0.0-rc6")
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:15:23 -04:00
Damien Le Moal
9d82464288 doc: Fix typo in request queue sysfs documentation
Fix a typo (are -> as) in the introduction paragraph of
Documentation/block/queue-sysfs.rst.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20211027022223.183838-6-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-26 21:01:48 -06:00
Damien Le Moal
6b3bae2324 doc: document sysfs queue/independent_access_ranges attributes
Update the file Documentation/block/queue-sysfs.rst to add a description
of the device queue sysfs entries related to independent access ranges
(e.g. concurrent positioning ranges for multi-actuator hard-disks).

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20211027022223.183838-5-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-26 21:01:48 -06:00
Damien Le Moal
fe22e1c2f7 libata: support concurrent positioning ranges log
Add support to discover if an ATA device supports the Concurrent
Positioning Ranges data log (address 0x47), indicating that the device
is capable of seeking to multiple different locations in parallel using
multiple actuators serving different LBA ranges.

Also add support to translate the concurrent positioning ranges log
into its equivalent Concurrent Positioning Ranges VPD page B9h in
libata-scsi.c.

The format of the Concurrent Positioning Ranges Log is defined in ACS-5
r9.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20211027022223.183838-4-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-26 21:01:48 -06:00
Damien Le Moal
e815d36548 scsi: sd: add concurrent positioning ranges support
Add the sd_read_cpr() function to the sd scsi disk driver to discover
if a device has multiple concurrent positioning ranges (i.e. multiple
actuators on an HDD). The existence of VPD page B9h indicates if a
device has multiple concurrent positioning ranges. The page content
describes each range supported by the device.

sd_read_cpr() is called from sd_revalidate_disk() and uses the block
layer functions disk_alloc_independent_access_ranges() and
disk_set_independent_access_ranges() to represent the set of actuators
of the device as independent access ranges.

The format of the Concurrent Positioning Ranges VPD page B9h is defined
in section 6.6.6 of SBC-5.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20211027022223.183838-3-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-26 21:01:48 -06:00
Damien Le Moal
a2247f19ee block: Add independent access ranges support
The Concurrent Positioning Ranges VPD page (for SCSI) and data log page
(for ATA) contain parameters describing the set of contiguous LBAs that
can be served independently by a single LUN multi-actuator hard-disk.
Similarly, a logically defined block device composed of multiple disks
can in some cases execute requests directed at different sector ranges
in parallel. A dm-linear device aggregating 2 block devices together is
an example.

This patch implements support for exposing a block device's independent
access ranges to the user through sysfs to allow optimizing device
accesses to increase performance.

To describe the set of independent sector ranges of a device (actuators
of a multi-actuator HDD or table entries of a dm-linear device),
the type struct blk_independent_access_ranges is introduced. This
structure describes the sector ranges using an array of
struct blk_independent_access_range structures. This range structure
defines the start sector and number of sectors of the access range.
The ranges in the array cannot overlap and must contain all sectors
within the device capacity.
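
The constraint stated above (no overlap, full coverage of the device capacity) can be sketched as a small validation routine. struct ia_range_sketch and ia_ranges_valid() are hypothetical userspace names, and the sketch assumes the ranges are supplied in ascending sector order; it is not the block layer's own check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one range: start sector and length. */
struct ia_range_sketch {
    uint64_t sector;
    uint64_t nr_sectors;
};

/*
 * Return 1 if the ranges, given in ascending order, tile [0, capacity)
 * exactly: no gap, no overlap, no empty range. Return 0 otherwise.
 */
static int ia_ranges_valid(const struct ia_range_sketch *r, size_t nr,
                           uint64_t capacity)
{
    uint64_t next = 0;

    for (size_t i = 0; i < nr; i++) {
        if (r[i].nr_sectors == 0)
            return 0;
        if (r[i].sector != next)
            return 0;   /* gap or overlap with the previous range */
        if (r[i].nr_sectors > capacity - next)
            return 0;   /* runs past the end of the device */
        next += r[i].nr_sectors;
    }
    return next == capacity;
}
```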

The function disk_set_independent_access_ranges() allows a device
driver to signal to the block layer that a device has multiple
independent access ranges.  In this case, a struct
blk_independent_access_ranges is attached to the device request queue
by the function disk_set_independent_access_ranges(). The function
disk_alloc_independent_access_ranges() is provided for drivers to
allocate this structure.

struct blk_independent_access_ranges contains kobjects (struct kobject)
to expose to the user through sysfs the set of independent access ranges
supported by a device. When the device is initialized, sysfs
registration of the ranges information is done from blk_register_queue()
using the block layer internal function
disk_register_independent_access_ranges(). If a driver calls
disk_set_independent_access_ranges() for a registered queue, e.g. when a
device is revalidated, disk_set_independent_access_ranges() will execute
disk_register_independent_access_ranges() to update the sysfs attribute
files.  The sysfs file structure created starts from the
independent_access_ranges sub-directory and contains the start sector
and number of sectors of each range, with the information for each range
grouped in numbered sub-directories.

E.g. for a dual actuator HDD, the user sees:

$ tree /sys/block/sdk/queue/independent_access_ranges/
/sys/block/sdk/queue/independent_access_ranges/
|-- 0
|   |-- nr_sectors
|   `-- sector
`-- 1
    |-- nr_sectors
    `-- sector

For a regular device with a single access range, the
independent_access_ranges sysfs directory does not exist.

Device revalidation may lead to changes to this structure and to the
attribute values. When manipulated, the queue sysfs_lock and
sysfs_dir_lock mutexes are held for atomicity, similarly to how the
blk-mq and elevator sysfs queue sub-directories are protected.

The code related to the management of independent access ranges is
added in the new file block/blk-ia-ranges.c.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20211027022223.183838-2-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-26 20:36:47 -06:00
Maor Dickman
8ca9caee85 net/mlx5: Lag, Make mlx5_lag_is_multipath() be static inline
Fix "no previous prototype" W=1 warnings when CONFIG_MLX5_CORE_EN is not set:

  drivers/net/ethernet/mellanox/mlx5/core/lag_mp.h:34:6: error: no previous prototype for ‘mlx5_lag_is_multipath’ [-Werror=missing-prototypes]
     34 | bool mlx5_lag_is_multipath(struct mlx5_core_dev *dev) { return false; }
        |      ^~~~~~~~~~~~~~~~~~~~~
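
The general pattern behind this kind of fix can be sketched as below. DEMO_FEATURE_ENABLED, struct demo_dev and demo_is_multipath() are hypothetical stand-ins for a Kconfig symbol, a driver type and the stubbed function; the snippet only illustrates why a static inline stub avoids the warning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical driver type, for illustration only. */
struct demo_dev { int id; };

#ifdef DEMO_FEATURE_ENABLED
/* Feature built in: the header carries only the prototype. */
bool demo_is_multipath(struct demo_dev *dev);
#else
/*
 * Feature compiled out: a static inline stub has internal linkage, so
 * -Wmissing-prototypes does not apply to it, and every file including
 * the header gets its own copy instead of a duplicate global symbol.
 */
static inline bool demo_is_multipath(struct demo_dev *dev)
{
    (void)dev;
    return false;
}
#endif
```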

Fixes: 14fe2471c6 ("net/mlx5: Lag, change multipath and bonding to be mutually exclusive")
Signed-off-by: Maor Dickman <maord@nvidia.com>
2021-10-26 19:30:42 -07:00
Khalid Manaa
ae3452995b net/mlx5e: Prevent HW-GRO and CQE-COMPRESS features operate together
HW-GRO and CQE-COMPRESS are mutually exclusive; this commit adds this
restriction.

Signed-off-by: Khalid Manaa <khalidm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26 19:30:42 -07:00
Khalid Manaa
83439f3c37 net/mlx5e: Add HW-GRO offload
This commit introduces HW-GRO offload by using the SHAMPO feature
- Add set feature handler for HW-GRO.

Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Signed-off-by: Khalid Manaa <khalidm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26 19:30:41 -07:00
Khalid Manaa
def09e7bbc net/mlx5e: Add HW_GRO statistics
This patch adds HW_GRO counters to RX packets statistics:
 - gro_match_packets: counter of received packets with set match flag.

 - gro_packets: counter of received packets over the HW_GRO feature,
                this counter is increased by one for every received
                HW_GRO cqe.

 - gro_bytes: counter of received bytes over the HW_GRO feature,
              this counter is increased by the received bytes for every
              received HW_GRO cqe.

 - gro_skbs: counter of built HW_GRO skbs,
             increased by one when we flush HW_GRO skb
             (when we call a napi_gro_receive with hw_gro skb).

 - gro_large_hds: counter of received packets with large headers size,
                  in case the packet needs new SKB, the driver will allocate
                  new one and will not use the headers entry to build it.

Signed-off-by: Khalid Manaa <khalidm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26 19:30:41 -07:00
Khalid Manaa
92552d3abd net/mlx5e: HW_GRO cqe handler implementation
This patch updates the SHAMPO CQE handler to support HW_GRO.

Changes in the SHAMPO CQE handler:
- The CQE match and flush fields are used to determine whether to build a
  new skb from the newly received packet or to add the received packet
  data to the existing RQ.hw_gro_skb. These fields are also used to
  determine when to flush the skb.
- At the end of the function mlx5e_poll_rx_cq, the RQ.hw_gro_skb is flushed.

Signed-off-by: Khalid Manaa <khalidm@nvidia.com>
Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26 19:30:41 -07:00
Ben Ben-Ishay
64509b0525 net/mlx5e: Add data path for SHAMPO feature
The header buffer is used to store the headers of the rx packets.
The header buffer size is deduced from the WorkQueue size and the
restriction on the maximum number of packets per WorkQueueElement.
This commit adds the functionality for posting/updating memory for
the header buffer during the posting/updating of WQEs.

Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26 19:30:40 -07:00
Khalid Manaa
f97d5c2a45 net/mlx5e: Add handle SHAMPO cqe support
This patch adds the new CQE SHAMPO fields:
- flush: indicates that we must close the current session and pass the SKB
         to the network stack.

- match: indicates that the current packet matches the opened session,
         the packet will be merged into the current SKB.

- header_size: the size of the packet headers that are written into the headers
               buffer.

- header_entry_index: the entry index in the headers buffer.

- data_offset: the packet's data offset in the WQE.

Also, a new CQE handler is added to handle SHAMPO packets:
- The new handler uses CQE SHAMPO fields to build the SKB.
  The CQE's flush and match fields are not used in this patch; packets are not
  merged in this patch.

Signed-off-by: Khalid Manaa <khalidm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26 19:30:40 -07:00
Ben Ben-Ishay
e5ca8fb08a net/mlx5e: Add control path for SHAMPO feature
This commit introduces the control path infrastructure for SHAMPO feature.

The SHAMPO feature enables packet stitching by splitting packets into
header and payload: the header is placed in a dedicated buffer
and the payload on the RX ring. This allows stitching the data part
of a flow together continuously in the receive buffer.

The SHAMPO feature is implemented as a linked list striding RQ feature.
To support packet splitting and payload stitching:
- Enlarge the ICOSQ and the corresponding CQ to support the header buffer
  memory regions.
- Add support to create linked list striding RQ with SHAMPO feature set
  in the open_rq function.
- Add deallocation function and corresponding calls for SHAMPO header
  buffer.
- Add mlx5e_create_umr_klm_mkey to support KLM mkey for the header
  buffer.
- Rename mlx5e_create_umr_mkey to mlx5e_create_umr_mtt_mkey.

Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26 19:30:40 -07:00
Ben Ben-Ishay
d7b896acbd net/mlx5e: Add support to klm_umr_wqe
This commit adds the needed definitions for using the klm_umr_wqe.
UMR stands for user-mode memory registration; it is a mechanism to alter
the address translation properties of an MKEY by posting a WorkQueueElement
(WQE) on the send queue.
MKEY stands for memory key; MKEYs are used to describe a region in memory that
can later be used by HW.
KLM stands for {Key, Length, MemVa}; a KLM_MKEY is an indirect MKEY that enables
mapping multiple memory spaces with different sizes in a unified MKEY.
klm_umr_wqe is a UMR that is used to update a KLM_MKEY.
The SHAMPO feature uses KLM_MKEY for memory registration of its header buffer.

Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26 19:30:39 -07:00
Khalid Manaa
eaee12f046 net/mlx5e: Rename TIR lro functions to TIR packet merge functions
This series introduces a new packet merge type, therefore rename lro
functions to packet merge to support the new merge type:
- Generalize + rename mlx5e_build_tir_ctx_lro to
  mlx5e_build_tir_ctx_packet_merge.
- Rename mlx5e_modify_tirs_lro to mlx5e_modify_tirs_packet_merge.
- Rename lro bit in mlx5_ifc_modify_tir_bitmask_bits to packet_merge.
- Rename lro_en in mlx5e_params to packet_merge_type type and combine
  packet_merge params into one struct mlx5e_packet_merge_param.

Signed-off-by: Khalid Manaa <khalidm@nvidia.com>
Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26 19:30:39 -07:00
Ben Ben-Ishay
7025329d20 net/mlx5: Add SHAMPO caps, HW bits and enumerations
This commit adds SHAMPO bit to hca_cap and SHAMPO capabilities structure,
SHAMPO related HW spec hardware fields and enumerations.
SHAMPO stands for: split headers and merge payload offload.
SHAMPO new fields:
WQ:
 - headers_mkey: mkey that represents the headers buffer, where the packets
   headers will be written by the HW.

 - shampo_enable: flag to verify if the WQ supports SHAMPO feature.

 - log_reservation_size: the log of the reservation size where the data of
   the packet will be written by the HW.

 - log_max_num_of_packets_per_reservation: log of the maximum number of
   packets that can be written to the same reservation.

 - log_headers_entry_size: log of the header entry size of the headers buffer.

 - log_headers_buffer_entry_num: log of the entries number of the headers buffer.

RQ:
 - shampo_no_match_alignment_granularity: the HW alignment granularity
   in case the received packet doesn't match the current session.

 - shampo_match_criteria_type: the type of match criteria.

 - reservation_timeout: the maximum time that the HW will hold the
   reservation.

mlx5_ifc_shampo_cap_bits, the capabilities of the SHAMPO feature:
 - shampo_log_max_reservation_size: the maximum allowed value of the field
   WQ.log_reservation_size.

 - log_reservation_size: the minimum allowed value of the field
   WQ.log_reservation_size.

 - shampo_min_mss_size: the minimum payload size of packet that can open
   a new session or be merged to a session.

 - shampo_max_log_headers_entry_size: the maximum allowed value of the field
   WQ.log_headers_entry_size

Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26 19:30:39 -07:00
Ben Ben-Ishay
50f477fe99 net/mlx5e: Rename lro_timeout to packet_merge_timeout
TIR stands for transport interface receive, the TIR object is
responsible for performing all transport related operations on
the receive side like packet processing, demultiplexing the packets
to different RQ's, etc.
lro_timeout is a field in the TIR that is used to set the timeout for an lro
session. This series introduces a new packet merge type, therefore rename
lro_timeout to packet_merge_timeout for all packet merge types.

Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26 19:30:38 -07:00
Ben Ben-ishay
54b2b3ecca net: Prevent HW-GRO and LRO features operate together
LRO and HW-GRO are mutually exclusive; this commit adds this restriction
in netdev_fix_features(). HW-GRO is preferred: in case both
HW-GRO and LRO features are requested, LRO is cleared.
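
The preference rule described above can be sketched as a small feature fixup. The DEMO_F_* bit positions and demo_fix_features() are hypothetical and do not match the kernel's NETIF_F_* definitions; the sketch only shows the "HW-GRO wins, LRO is cleared" logic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions; the kernel's NETIF_F_* values differ. */
#define DEMO_F_LRO    (UINT64_C(1) << 0)
#define DEMO_F_GRO_HW (UINT64_C(1) << 1)

/* If both merge features are requested, keep HW-GRO and clear LRO. */
static uint64_t demo_fix_features(uint64_t requested)
{
    if ((requested & DEMO_F_GRO_HW) && (requested & DEMO_F_LRO))
        requested &= ~DEMO_F_LRO;
    return requested;
}
```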

Signed-off-by: Ben Ben-ishay <benishay@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26 19:30:38 -07:00
Tariq Toukan
7529cc7fbd lib: bitmap: Introduce node-aware alloc API
Expose new node-aware API for bitmap allocation:
bitmap_alloc_node() / bitmap_zalloc_node().

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-26 19:30:38 -07:00
Claudiu Beznea
dd742cac34 clk: use clk_core_get_rate_recalc() in clk_rate_get()
If the clock flags contain CLK_GET_RATE_NOCACHE, clk_rate_get() still
returns the cached rate. Use clk_core_get_rate_recalc() instead, which
takes the proper action when the clock flags contain
CLK_GET_RATE_NOCACHE.
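
The difference between the two read paths can be modelled roughly as
below. This is a simplified userspace sketch: struct model_clk and
model_get_rate_recalc() are hypothetical names, and the "hardware" rate
is just a struct field rather than a register read.

```c
#include <stdbool.h>

/* Hypothetical model of a clock that keeps a cached rate. */
struct model_clk {
	bool nocache;               /* stands in for CLK_GET_RATE_NOCACHE */
	unsigned long cached_rate;  /* last rate stored by the framework */
	unsigned long hw_rate;      /* what the hardware currently produces */
};

/* Return the rate, refreshing the cache first when caching is disabled. */
unsigned long model_get_rate_recalc(struct model_clk *clk)
{
	if (clk->nocache)
		clk->cached_rate = clk->hw_rate;

	return clk->cached_rate;
}
```

A reader that returned cached_rate directly would report a stale value
for a clock flagged as non-cacheable.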

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20211011112719.3951784-16-claudiu.beznea@microchip.com
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
[sboyd@kernel.org: Grab prepare lock around operation]
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-26 18:31:23 -07:00
Claudiu Beznea
0b59e619ef clk: at91: sama7g5: set low limit for mck0 at 32KHz
MCK0 could go as low as 32KHz. Set this limit.

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20211011112719.3951784-15-claudiu.beznea@microchip.com
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-26 18:27:43 -07:00
Claudiu Beznea
facb87ad75 clk: at91: sama7g5: remove prescaler part of master clock
On SAMA7G5 the prescaler part of the master clock has been implemented
as a changeable one. Every time the prescaler is changed, the
PMC_SR.MCKRDY bit must be polled. A value of 1 for PMC_SR.MCKRDY means
the prescaler update is done, so the driver polls this bit until it
becomes 1. On SAMA7G5 it has been discovered that under some conditions
PMC_SR.MCKRDY does not rise even though the rate the prescaler provides
is stable. The workaround is to add a timeout when polling for
PMC_SR.MCKRDY. At the moment, for SAMA7G5, the prescaler is removed
from the Linux clock tree, as all the CPU frequencies can be obtained
from the PLL and there is also less overhead when changing frequency
via DVFS.
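
For reference, the timeout workaround mentioned above amounts to
bounded polling. The sketch below is a userspace illustration under
stated assumptions: struct model_status and both helpers are
hypothetical, and the ready bit is simulated by a countdown rather than
read from PMC_SR.

```c
#include <stdbool.h>

/* Hypothetical status source: reports ready after a number of reads. */
struct model_status {
	unsigned int reads_until_ready;
};

/* Simulated read of the ready bit. */
static bool model_read_ready(struct model_status *s)
{
	if (s->reads_until_ready == 0)
		return true;

	s->reads_until_ready--;
	return false;
}

/* Poll at most max_polls times; false means the poll timed out. */
bool model_poll_ready(struct model_status *s, unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if (model_read_ready(s))
			return true;
	}

	return false;
}
```

Without the max_polls bound, a bit that never rises would keep the
caller spinning forever.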

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20211011112719.3951784-14-claudiu.beznea@microchip.com
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-26 18:27:43 -07:00
Claudiu Beznea
7029db09b2 clk: at91: clk-master: add notifier for divider
SAMA7G5 supports DVFS by changing cpuck. On SAMA7G5 mck0 shares the same
parent with cpuck as seen in the following clock tree:

                       +----------> cpuck
                       |
FRAC PLL ---> DIV PLL -+-> DIV ---> mck0

mck0 can run between 32KHz and 200MHz on SAMA7G5. To avoid overclocking
mck0 while FRAC PLL or DIV PLL is changing, this commit implements a
notifier for mck0 which writes a safe divider to the register (the
maximum divider value, which is 5) on PRE_RATE_CHANGE events, so that
changes on the PLL do not overclock mck0, and sets the maximum allowed
rate on POST_RATE_CHANGE events.
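
The notifier behaviour can be sketched as below. This is a simplified
userspace model, not the driver code: all names are hypothetical, and
it assumes that every integer divider from 1 to 5 is valid and that
"maximum allowed rate" means the smallest divider that keeps mck0 at or
below 200MHz.

```c
/* Hypothetical event names mirroring the PRE/POST rate-change events. */
enum model_event { MODEL_PRE_RATE_CHANGE, MODEL_POST_RATE_CHANGE };

#define MODEL_SAFE_DIV 5UL          /* maximum divider, as stated above */
#define MODEL_MCK0_MAX 200000000UL  /* 200MHz upper limit, as stated above */

struct model_mck0 {
	unsigned long div;  /* divider currently programmed */
};

/* Rate seen at mck0 for a given parent rate. */
unsigned long model_mck0_rate(const struct model_mck0 *m, unsigned long parent)
{
	return parent / m->div;
}

/* React to a parent rate change; new_parent is the rate after the change. */
void model_mck0_notify(struct model_mck0 *m, enum model_event ev,
		       unsigned long new_parent)
{
	unsigned long div;

	if (ev == MODEL_PRE_RATE_CHANGE) {
		/* Parent is about to move: fall back to the safe divider. */
		m->div = MODEL_SAFE_DIV;
		return;
	}

	/* Parent is stable again: pick the fastest rate within the limit. */
	for (div = 1; div < MODEL_SAFE_DIV; div++) {
		if (new_parent / div <= MODEL_MCK0_MAX)
			break;
	}
	m->div = div;
}
```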

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20211011112719.3951784-13-claudiu.beznea@microchip.com
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-26 18:27:43 -07:00
Claudiu Beznea
1e229c21a4 clk: at91: clk-sam9x60-pll: add notifier for div part of PLL
SAM9X60's PLL, which is also part of SAMA7G5, is composed of two parts:
a fractional part and a divider. On SAMA7G5 the CPU PLL can be changed
at run-time to implement DVFS. The hardware clock tree on SAMA7G5 for
the CPU PLL is as follows:

                       +---- div1 ----------------> cpuck
                       |
FRAC PLL ---> DIV PLL -+-> prescaler ---> div0 ---> mck0

The div1 block is not implemented in Linux. A bug has been discovered
in the prescaler block in some scenarios, so it will be removed from
Linux in the next commits. Thus, the final clock tree that will be used
in Linux is as follows:

                       +-----------> cpuck
                       |
FRAC PLL ---> DIV PLL -+-> div0 ---> mck0

It has been proposed in [1] not to introduce a new CPUFreq driver but
to overload the proper clock drivers with the proper operations so that
cpufreq-dt can be used. To accomplish this, DIV PLL and div0 implement
clock notifiers which apply safe dividers before FRAC PLL is changed.
The current commit handles only the DIV PLL by adding a notifier that
sets a safe divider on PRE_RATE_CHANGE events. The safe divider is
provided by the clock initialization code (sama7g5.c). div0 is handled
in the next commits (to keep the changes as clean as possible).

[1] https://lore.kernel.org/lkml/20210105104426.4tmgc2l3vyicwedd@vireshk-i7/

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20211011112719.3951784-12-claudiu.beznea@microchip.com
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-26 18:27:43 -07:00
Claudiu Beznea
0ef99f8202 clk: at91: clk-master: fix prescaler logic
When the prescaler value read from the register is MASTER_PRES_MAX, it
means that the input clock is divided by 3. Fix the code to reflect
this.
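
A minimal sketch of the corrected decoding follows. It is illustrative
only: MODEL_PRES_MAX and its value of 7 are assumptions standing in for
MASTER_PRES_MAX, and the sketch assumes that every other prescaler
value divides the input by a power of two.

```c
/* Assumed value; stands in for MASTER_PRES_MAX. */
#define MODEL_PRES_MAX 7U

/* Apply the prescaler field to the parent rate. */
unsigned long model_apply_pres(unsigned long parent_rate, unsigned int pres)
{
	/* The maximum field value is a special case: divide by 3. */
	if (pres == MODEL_PRES_MAX)
		return parent_rate / 3;

	/* Assumed behaviour for all other values: divide by 2^pres. */
	return parent_rate >> pres;
}
```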

Fixes: 7a110b9107 ("clk: at91: clk-master: re-factor master clock")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20211011112719.3951784-11-claudiu.beznea@microchip.com
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-26 18:27:43 -07:00
Claudiu Beznea
a27748adea clk: at91: clk-master: mask mckr against layout->mask
Mask values read from and written to MCKR against layout->mask, as this
mask may differ between PMC versions.
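
The idea can be sketched as below, with hypothetical names throughout:
struct model_layout carries a per-version mask, and both helpers touch
only the bits that the mask selects.

```c
#include <stdint.h>

/* Hypothetical layout: the mask differs from one version to another. */
struct model_layout {
	uint32_t mask;
};

/* Read helper: expose only the bits that the layout defines. */
uint32_t model_read_masked(uint32_t raw, const struct model_layout *layout)
{
	return raw & layout->mask;
}

/* Write helper: change only the bits that the layout defines. */
uint32_t model_write_masked(uint32_t old, uint32_t val,
			    const struct model_layout *layout)
{
	return (old & ~layout->mask) | (val & layout->mask);
}
```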

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20211011112719.3951784-10-claudiu.beznea@microchip.com
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-26 18:27:42 -07:00
Claudiu Beznea
c2910c00fe clk: at91: clk-master: check if div or pres is zero
Check whether div or pres is zero before using it as an argument for
ffs(). If div is zero, ffs() returns 0, and subtracting from that zero
leads to invalid values being written to the registers.
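
The guard can be sketched as follows. This is a userspace illustration
using the POSIX ffs() from <strings.h>; model_encode_div() is a
hypothetical helper that assumes the register field stores the bit
index of the divider.

```c
#include <stdbool.h>
#include <strings.h>

/*
 * Encode a divider as a bit index. Returns false and leaves *field
 * untouched for div == 0, where ffs() returns 0 and "ffs() - 1" would
 * wrap around to an invalid value.
 */
bool model_encode_div(unsigned int div, unsigned int *field)
{
	if (div == 0)
		return false;

	*field = (unsigned int)ffs((int)div) - 1;
	return true;
}
```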

Fixes: 7a110b9107 ("clk: at91: clk-master: re-factor master clock")
Fixes: 75c88143f3 ("clk: at91: clk-master: add master clock support for SAMA7G5")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20211011112719.3951784-9-claudiu.beznea@microchip.com
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-26 18:27:42 -07:00
Claudiu Beznea
f12d028b74 clk: at91: sam9x60-pll: use DIV_ROUND_CLOSEST_ULL
Use DIV_ROUND_CLOSEST_ULL() to avoid any inconsistency between the rate
computed in sam9x60_frac_pll_recalc_rate() and the one computed in
sam9x60_frac_pll_compute_mul_frac().
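
For illustration, round-to-closest unsigned division can be written in
plain C as below. model_div_round_closest() is a hypothetical userspace
helper, not the kernel macro, and it assumes that n + d / 2 does not
overflow 64 bits.

```c
#include <stdint.h>

/* Unsigned division rounded to the nearest integer (halves round up). */
uint64_t model_div_round_closest(uint64_t n, uint64_t d)
{
	return (n + d / 2) / d;
}
```

Plain integer division truncates instead, so two code paths that round
differently can disagree by one unit on the same inputs.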

Fixes: 43b1bb4a9b ("clk: at91: clk-sam9x60-pll: re-factor to support plls with multiple outputs")
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20211011112719.3951784-8-claudiu.beznea@microchip.com
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-26 18:27:42 -07:00
Claudiu Beznea
5df4cd9099 clk: at91: pmc: add sama7g5 to the list of available pmcs
Add SAMA7G5 to the list of available PMCs so that the suspend/resume
code for clocks is used in backup mode.

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20211011112719.3951784-7-claudiu.beznea@microchip.com
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-26 18:27:42 -07:00
Claudiu Beznea
88bdeed3d0 clk: at91: clk-master: improve readability by using local variables
Improve readability in clk_sama7g5_master_set() by using local
variables.

Suggested-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20211011112719.3951784-6-claudiu.beznea@microchip.com
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-26 18:27:42 -07:00
Claudiu Beznea
c553881677 clk: at91: clk-master: add register definition for sama7g5's master clock
SAMA7G5 has 4 master clocks (MCK1..4) which are controlled through the
register at offset 0x30 (relative to PMC). In the last/first phase of
the suspend/resume procedure (which is architecture specific) the
parents of the master clocks are changed (via assembly code) for more
power saving (see file arch/arm/mach-at91/pm_suspend.S, macros
at91_mckx_ps_enable and at91_mckx_ps_restore). Thus the macros
corresponding to the register at offset 0x30 need to be shared between
clk-master.c and pm_suspend.S. Commit ec03f18cc2 ("clk: at91: add
register definition for sama7g5's master clock") introduced the proper
macros but did not adapt clk-master.c as well. Thus, this commit adapts
clk-master.c to use the macros introduced in commit ec03f18cc2 ("clk:
at91: add register definition for sama7g5's master clock").

Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20211011112719.3951784-5-claudiu.beznea@microchip.com
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2021-10-26 18:27:42 -07:00