According to the algorithm described in the comment block at the
beginning of ip_rt_send_redirect, the host should try to send
'ip_rt_redirect_number' ICMP redirect packets with an exponential
backoff and then stop sending them at all assuming that the destination
ignores redirects.
If the device has previously sent some ICMP error packets that are
rate-limited (e.g TTL expired) and continues to receive traffic,
the redirect packets will never be transmitted. This happens since
peer->rate_tokens will be typically greater than 'ip_rt_redirect_number'
and so it will never be reset even if the redirect silence timeout
(ip_rt_redirect_silence) has elapsed without receiving any packet
requiring redirects.
Fix it by using a dedicated counter for the number of ICMP redirect
packets that has been sent by the host
I have not been able to identify a given commit that introduced the
issue since ip_rt_send_redirect implements the same rate-limiting
algorithm from commit 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the user doesn't care about the return value from xa_insert(), then
they should be using xa_store() instead. The point of xa_reserve() is
to get the return value early before taking another lock, so this should
also be __must_check.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
This is a bit larger than normal, as we had not managed to send out
a pull request before traveling for a week without my signing key.
There are multiple code fixes for older bugs, all of which should
get backported into stable kernels:
- tango: one fix for multiplatform configurations broken on other
platforms when tango is enabled
- arm_scmi: device unregistration fix
- iop32x: fix kernel oops from extraneous __init annotation
- pxa: remove a double kfree
- fsl qbman: close an interrupt clearing race
The rest is the usual collection of smaller fixes for device tree
files, on the renesas, allwinner, meson, omap, davinci, qualcomm
and imx platforms. Some of these are for compile-time warnings,
most are for board specific functionality that fails to work
because of incorrect settings.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJcXg9kAAoJEGCrR//JCVInM/UP/1ikwMujrB33oT41l21KFvlw
yrP3ji9Cyr6Ag8WCtgFYDXWw6uNW1eFYov8E4y8UKc16TSWWSvGmmIFM5K3OOtLe
qAJrXTXCTBV2lGiWLIMlYQLAFav7H2CBgMLkRVLek1q7s6rV+hqV5hxfcAs6l2w7
G5Qe8pwuGuZ2qINTs7OdLizd+JAmMeIuPQHGhrZnEupiy+44hHgbrIacXPhwX4Ff
s5MwGON4H3pL1PtVIXlWo5nQwHyF+mkbSzn1RwmKpsQ4wK0vP3LgUURlvc945JNo
zA5C/eCO6xFv7LCvBsuw515eEfI74K/9PDPr7txDz8TePjusPMv5zrYkb+jUhFhm
dELhd8dmh50chXXgHVggRbIjYCpOJeVqm9aeYVvHyKOTNmVohGDc06To/0hFHljw
1kgX4r2hUduTex0wwFks22TfcXr/cQzarXqyV6lRP5K/4IoU8MJCp4QLYXQK7HYY
K9644aSaCTRGfRMbvVXYeykRgilEWT1wG8oREAH+PTWNIb47rqi/ByXitIrLkIWh
Lnefj6bB863E0lPson03sBksylDRaluSeT5lVyjHzJsHwVLt2haqtaI892SZhUy1
/oR60CMkGuRhmwi4ASCCbr20E+sa/LDNUVC6+d/xs9+Bc/54GEKxS11ffthMUoO0
3EpgCZDHno+PMSIRkzPN
=koHS
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes-5.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"This is a bit larger than normal, as we had not managed to send out a
pull request before traveling for a week without my signing key.
There are multiple code fixes for older bugs, all of which should get
backported into stable kernels:
- tango: one fix for multiplatform configurations broken on other
platforms when tango is enabled
- arm_scmi: device unregistration fix
- iop32x: fix kernel oops from extraneous __init annotation
- pxa: remove a double kfree
- fsl qbman: close an interrupt clearing race
The rest is the usual collection of smaller fixes for device tree
files, on the renesas, allwinner, meson, omap, davinci, qualcomm and
imx platforms.
Some of these are for compile-time warnings, most are for board
specific functionality that fails to work because of incorrect
settings"
* tag 'armsoc-fixes-5.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (30 commits)
ARM: tango: Improve ARCH_MULTIPLATFORM compatibility
firmware: arm_scmi: provide the mandatory device release callback
ARM: iop32x/n2100: fix PCI IRQ mapping
arm64: dts: add msm8996 compatible to gicv3
ARM: dts: am335x-shc.dts: fix wrong cd pin level
ARM: dts: n900: fix mmc1 card detect gpio polarity
ARM: dts: omap3-gta04: Fix graph_port warning
ARM: pxa: ssp: unneeded to free devm_ allocated data
ARM: dts: r8a7743: Convert to new LVDS DT bindings
soc: fsl: qbman: avoid race in clearing QMan interrupt
arm64: dts: renesas: r8a77965: Enable DMA for SCIF2
arm64: dts: renesas: r8a7796: Enable DMA for SCIF2
arm64: dts: renesas: r8a774a1: Enable DMA for SCIF2
ARM: dts: da850: fix interrupt numbers for clocksource
dt-bindings: imx8mq: Number clocks consecutively
arm64: dts: meson: Fix mmc cd-gpios polarity
ARM: dts: imx6sx: correct backward compatible of gpt
ARM: dts: imx: replace gpio-key,wakeup with wakeup-source property
ARM: dts: vf610-bk4: fix incorrect #address-cells for dspi3
ARM: dts: meson8m2: mxiii-plus: mark the SD card detection GPIO active-low
...
- Fix kernel oops when attemping kexec_file() with a NULL cmdline
- Fix page table output in debugfs when CONFIG_ARM64_USER_VA_BITS_52=y
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAlxdeK4ACgkQt6xw3ITB
YzTwrwgArV1AHQTG6HG0Xz86zNbjA25XHr1HFTl75Q7SRu8lufxUXJ/JbIfE5v/v
D/a40PDsWQeXMHvZT0C/DyByfbfSSigLAwS/psZxmjT5KDgfQCor01UjtWvh+DS+
U7DnXjLxJFZZ13dJmMPGn3JGmNB5VoKv6w3+9f2rZgECZXdVBQzGPopmjhjmVx5/
ncGLF4ecxQ1+ldIGGTIC5NSYsFPyhcKxYRlQ/sT/d5oN3wIJNixj7yL14jzx3izG
OEsu967q3Ajn5lMs2j3k1ixopeOLWYOsSB27oWczJ/XAB4EGU9bbuHOz4alDpE5X
oDIFuJGrD6s6Ba5E62XtHfp62Iy5pg==
=UCO2
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Two arm64 fixes for -rc6. They resolve a kernel NULL dereference in
kexec and bogus kernel page table dumping when userspace is configured
for 52-bit virtual addressing.
Summary:
- Fix kernel oops when attemping kexec_file() with a NULL cmdline
- Fix page table output in debugfs when ARM64_USER_VA_BITS_52=y"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: kexec_file: handle empty command-line
arm64: ptdump: Don't iterate kernel page tables using PTRS_PER_PXX
Just two fixes, both going to stable.
Our support for split pmd page table lock had a bug which could lead to a crash
on mremap() when using the Radix MMU (Power9 only).
A fix for the PAPR SCM driver (nvdimm) we added last release, which had a bug
where we might mis-handle a hypervisor response leading to us failing to attach
the memory region.
Thanks to:
Aneesh Kumar K.V, Oliver O'Halloran.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJcXXVeAAoJEFHr6jzI4aWA+18P/2EJHmTJ2XgXfQdz7XAEb5YJ
AXUsg47rm1Cx83PNTOpY4uGVEQmvb+a4DkeSIucoISJeGdo3lDIkGluySYZNaT6E
1Z8Tm6v5j9WLSV7CQcx0p3jU2xR/iap4HpDa6IiPjT8/4v4SwJvDkZLnqflwA2Q5
yk8e7gfViWccCD3F+/MyDvOF+t/9PEHP8qd86NtVrUxjx57WN+LehW2D3gi6LdNv
L4L42ZQndGYXjmF4WkoDVLB/AQLFD95XiO0FlP45nqJK/CPBhi6g9knwg1zI7Fgd
2yXJJWfxW52EqhFv9D9hmU/5SqgKb0vXgqrNW07HvqMp5WMWP69+gsvLughBb5SE
KFos9EkVlCkKdBsjdC9nT2p/qxP0MXe8CrGuVSNXAGjw79Je9FM8byYpL45xB5Pm
bqZnvrjktAhgLvyzz48eSNqMmX1c3GfVwQn3WIxBlO+k1Hd5hzndBD4PEeX8WOo1
/sTb8B0VHhYM+cyexUufXE5FwXzMWOgW/7GqcgHClHg5/PgaFKs7ZGVa5y0oRn+A
KWnPjkwQQUNk712AZosEUjffHASNXzkM1ckMm36E8j11OgpplLea3nSoa9vqXZJi
92d5bqDgUZljQLjxeg52CrPOOr6RCDLQ9UdYgt72UOjhKk2f/Aim9jcFnxg71xOC
mfEx7R64crDZFMYZT2Ij
=anwQ
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Just two fixes, both going to stable.
- Our support for split pmd page table lock had a bug which could
lead to a crash on mremap() when using the Radix MMU (Power9 only).
- A fix for the PAPR SCM driver (nvdimm) we added last release, which
had a bug where we might mis-handle a hypervisor response leading
to us failing to attach the memory region.
Thanks to: Aneesh Kumar K.V, Oliver O'Halloran"
* tag 'powerpc-5.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/papr_scm: Use the correct bind address
powerpc/radix: Fix kernel crash with mremap()
When an application aborts the connection by moving QP from RTS to ERROR,
then iw_cxgb4's modify_rc_qp() RTS->ERROR logic sets the
*srqidxp to 0 via t4_set_wq_in_error(&qhp->wq, 0), and aborts the
connection by calling c4iw_ep_disconnect().
c4iw_ep_disconnect() does the following:
1. sends up a close_complete_upcall(ep, -ECONNRESET) to libcxgb4.
2. sends abort request CPL to hw.
But, since the close_complete_upcall() is sent before sending the
ABORT_REQ to hw, libcxgb4 would fail to release the srqidx if the
connection holds one. Because, the srqidx is passed up to libcxgb4 only
after corresponding ABORT_RPL is processed by kernel in abort_rpl().
This patch handle the corner-case by moving the call to
close_complete_upcall() from c4iw_ep_disconnect() to abort_rpl(). So that
libcxgb4 is notified about the -ECONNRESET only after abort_rpl(), and
libcxgb4 can relinquish the srqidx properly.
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
If TP fetches an SRQ buffer but ends up not using it before the connection
is aborted, then it passes the index of that SRQ buffer to the host in
ABORT_REQ_RSS or ABORT_RPL CPL message.
But, if the srqidx field is zero in the received ABORT_RPL or
ABORT_REQ_RSS CPL, then we need to read the tcb.rq_start field to see if
it really did have an RQE cached. This works around a case where HW does
not include the srqidx in the ABORT_RPL/ABORT_REQ_RSS CPL.
The final value of rq_start is the one present in TCB with the
TF_RX_PDU_OUT bit cleared. So, we need to read the TCB, examine the
TF_RX_PDU_OUT (bit 49 of t_flags) in order to determine if there's a rx
PDU feedback event pending.
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch adds the tcb flags and structures needed for querying tcb
information.
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The locking here started out with a single lock that covered everything
and then has lately veered into crazy town.
The fundamental problem is that several places need to iterate over a
linked list, but also need to drop their locks to avoid deadlock during
client callbacks.
xarray's restartable iteration offers a simple solution to the
problem. Once all the lists are xarrays we can drop locks in the places
that need that and rely on xarray to provide consistency and locking for
the data structure.
The resulting simplification is that each of the three lists has a
dedicated rwsem that must be held when working with the list it
covers. One data structure is no longer covered by multiple locks.
The sleeping semaphore is selected because the read side generally needs
to be held over something sleeping, and using RCU reader locking in those
cases is overkill.
In the process this simplifies the entire registration/unregistration flow
to be the expected list of setups and the reversed list of matching
teardowns, and the registration lock 'refcount' can now be revised to be
released after the ULPs are removed, providing a very sane semantic for
this feature.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Now that we have a small ID for each client we can use xarray instead of
linearly searching linked lists for client data. This will give much
faster and scalable client data lookup, and will lets us revise the
locking scheme.
Since xarray can store 'going_down' using a mark just entirely eliminate
the struct ib_client_data and directly store the client_data value in the
xarray. However this does require a special iterator as we must still
iterate over any NULL client_data values.
Also eliminate the client_data_lock in favour of internal xarray locking.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This gives each client a unique ID and will let us move client_data to use
xarray, and revise the locking scheme.
clients have to be add/removed in strict FIFO/LIFO order as they
interdepend. To support this the client_ids are assigned to increase in
FIFO order. The existing linked list is kept to support reverse iteration
until xarray can get a reverse iteration API.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
ida is the proper data structure to hold list of clustered small integers
and then allocate an unused integer. Get rid of the convoluted and limited
open-coded bitmap.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This really has no purpose anymore, refcount can be used to tell if the
device is still registered. Keeping it around just invites mis-use.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Instead of complicated logic about when this memory is freed, always free
it during device release(). All the cache pointers start out as NULL, so
it is safe to call this before the cache is initialized.
This makes for a simpler error unwind flow, and a simpler understanding of
the lifetime of the memory allocations inside the struct ib_device.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Since this only frees memory it should be done during the release
callback. Otherwise there are possible error flows where it might not get
called if registration aborts.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Since another rename could be running in parallel it is safer to check
that the name is not changing inside the lock, where we already know the
device name will not change.
Fixes: d21943dd19 ("RDMA/core: Implement IB device rename function")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
The PD allocations in IB/core allows us to simplify drivers and their
error flows in their .alloc_pd() paths. The changes in .alloc_pd() go hand
in had with relevant update in .dealloc_pd().
We will use this opportunity and convert .dealloc_pd() to don't fail, as
it was suggested a long time ago, failures are not happening as we have
never seen a WARN_ON print.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add new macros to be used in drivers while registering ops structure and
IB/core while calling allocation routines, so drivers won't need to
perform kzalloc/kfree in their paths.
The change in allocation stage allows us to initialize common fields prior
to calling to drivers (e.g. restrack).
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Pull signal fixes from Eric Biederman:
"This contains four small fixes for signal handling. A missing range
check, a regression fix, prioritizing signals we have already started
a signal group exit for, and better detection of synchronous signals.
The confused decision of which signals to handle failed spectacularly
when a timer was pointed at SIGBUS and the stack overflowed. Resulting
in an unkillable process in an infinite loop instead of a SIGSEGV and
core dump"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Better detection of synchronous signals
signal: Always notice exiting tasks
signal: Always attempt to allocate siginfo for SIGSTOP
signal: Make siginmask safe when passed a signal of 0
This is a set of five minor fixes (although, tecnhincally, the aicxxx
fix is for a major problem in that the driver won't load without it,
but I think the fact it's taken us since 4.10 to discover this
indicates that the user base for these things has declined).
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXF3VNSYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishUz2AP9L+n9A
Ma5WutU8gkoNcttX7RJvRmtha9RiwvxRi7cs6QD+OToBDpTbo+kLuzfXz0Gop4Go
qQziEsBm1P9ShCti3K0=
=hptI
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a set of five minor fixes (although, tecnhincally, the aicxxx
fix is for a major problem in that the driver won't load without it,
but I think the fact it's taken us since 4.10 to discover this
indicates that the user base for these things has declined)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: cxlflash: Prevent deadlock when adapter probe fails
Revert "scsi: libfc: Add WARN_ON() when deleting rports"
scsi: sd_zbc: Fix zone information messages
scsi: target: make the pi_prot_format ConfigFS path readable
scsi: aic94xx: fix module loading
Prevent excessive ACPI debug messages from being printed to the
kernel log, which has started to happen after one of the recent
ACPICA commits (Erik Schmauss).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJcXWIGAAoJEILEb/54YlRx6WYP/j/Gu/7CSQNqHV2LHLnLHY1I
xQVPxtH2NGtMCpkzUSxWmB0U4fkDhQVfkqCbnIgjiD54Ndrr0ni7yeDDth2QOBas
d8A8NidVE3cVZB2L1g5FprWeyFjH/Sb8mmkxvSZnS4PL4A7LT9csmWrMiECShaI3
u/ARMVTruIh+R4q+CsG63VKo96qu1ySWqG08IEWMNuPK3bv2CrPEVrPM/TBR/7VD
hy1IJ/PfJhIgIoOBAM9Q6ysZn1Ssmro1UB0g8rjabgTn8RUWm4u5j7DeSCTYQJh9
knirQxzCKsQYdz43sWOegm1MTRAEpZfUc9jqQw557axm8otQNLB3iMMXsnizqKPg
COJxGtSXRBmJzhvavIzSBkGf3T2aEtLp4Bmj00hFZZKaPD6AIDEkR2OpKzRF7pVY
KdNh2OjAbhOy4B/9NS5KPQaL5g8fTh0cR1ZZNfA8Qp2xCLTGwcdrVEuURQXXy2xr
OJyDNSDHzQBWi0usAN1+/uYYKurPwmxW0HtQ8a3z/rUs7O8z6cW9kq/vc6W+XKUD
GJt4XB9TM1cvxrXl6cMvOXUT12zcx/BYUX1FSzRP7xH76q4n5GQ3NwdmMEuDuMgG
MNdd5dQ9NNg5oSW/lcdorPZuwGdraplP4aLjO4z0dC71lyLyT6HWzmdlAlFfTtYY
re6mj6Nww7c7f/LOltWp
=4PuZ
-----END PGP SIGNATURE-----
Merge tag 'acpi-5.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"This prevents excessive ACPI debug messages from being printed to the
kernel log, which has started to happen after one of the recent ACPICA
commits (Erik Schmauss)"
* tag 'acpi-5.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: Set debug output flags independent of ACPICA
When creating many MAD agents in a short period of time, receive packet
processing can be delayed long enough to cause timeouts while new agents
are being added to the atomic notifier chain with IRQs disabled. Notifier
chain registration and unregstration is an O(n) operation. With large
numbers of MAD agents being created and destroyed simultaneously the CPUs
spend too much time with interrupts disabled.
Instead of each MAD agent registering for it's own LSM notification,
maintain a list of agents internally and register once, this registration
already existed for handling the PKeys. This list is write mostly, so a
normal spin lock is used vs a read/write lock. All MAD agents must be
checked, so a single list is used instead of breaking them down per
device.
Notifier calls are done under rcu_read_lock, so there isn't a risk of
similar packet timeouts while checking the MAD agents security settings
when notified.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
If the MAD agents isn't allowed to manage the subnet, or fails to register
for the LSM notifier, the security context is leaked. Free the context in
these cases.
Fixes: 47a2b338fe ("IB/core: Enforce security on management datagrams")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reported-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
If the notifier runs after the security context is freed an access of
freed memory can occur.
Fixes: 47a2b338fe ("IB/core: Enforce security on management datagrams")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Move the call to usnic_ib_device_remove after usnic_ib_ibdev_list_lock has
been released.
Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add a new compatibility string for this product. It's using
at91sam9260-macb layout but has a newer hardware revision: it's safer
to use its own string.
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the compatibility sting documentation for sam9x60 10/100 interface.
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes a line left while adding the correct compatibility string for
sama5d3 10/100 interface. Now use the "atmel,sama5d3-macb" string.
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When IPv6 support was added, the correct tos was not passed to
cxgb_find_route6(). This potentially results in the wrong route entry.
Fixes: 830662f6f0 ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
import_ep() is passed the correct tos, but doesn't use it correctly.
Fixes: ac8e4c69a0 ("cxgb4/iw_cxgb4: TOS support")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
If the parent listening endpoint has a service type set, then use that
when setting up the connection. This allows server-side applications to
mandate the tos for passive side connections via rdma_set_service_type()
on the listening endpoints.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This allows drivers to know the tos was actively set by the application.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
If a user binds to INADDR_ANY and sets the service id, then the
device-specific cm_ids should also use this tos. This allows an app to
do:
rdma_bind_addr(INADDR_ANY)
set_service_type()
rdma_listen()
And connections setup via this listening endpoint will use the correct
tos.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Define new option in 'rdma_set_option' to override calculated QP timeout
when requested to provide QP attributes to modify a QP.
At the same time, pack tos_set to be bitfield.
Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When we probe a SFP module, we expect to be able to call the upstream
device's module_insert() function so that the upstream link can be
configured. However, when the upstream device is delayed, we currently
may end up probing the module before the upstream device is available,
and lose the module_insert() call.
Avoid this by holding off probing the module until the SFP bus is
properly connected to both the SFP socket driver and the upstream
driver.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do some very minor tidy-up, for things like needlessly initing variable and
not leaving whitespace before quote endings.
Originally-from: Xiang Chen <chenxiang66@hisilicon.com>
Originally-from: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For auto-control irq affinity mode, choose the dq to deliver IO according
to the current CPU.
Then it decreases the performance regression that fio and CQ interrupts are
processed on different node.
For user control irq affinity mode, keep it as before.
To realize it, also need to distinguish the usage of dq lock and sas_dev
lock.
We mark as experimental due to ongoing discussion on managed MSI IRQ
during hotplug:
https://marc.info/?l=linux-scsi&m=154876335707751&w=2
We're almost at the point where we can expose multiple queues to the upper
layer for SCSI MQ, but we need to sort out the per-HBA tags performance
issue.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
To support queue mapped to a CPU, it needs to be ensured that issuing an
internal abort is safe, in that it is guaranteed that an internal abort is
processed for a single IO or a device after all the relevant command(s)
which it is attempting to abort have been processed by the controller.
Currently we only deliver commands for any device on a single queue to
solve this problem, as we know that commands issued on the same queue will
be processed in order, and we will not have a scenario where the internal
abort is racing against a command(s) which it is trying to abort.
To enqueue commands on queue mapped to a CPU, choosing a queue for an
command is based on the associated queue for the current CPU, so this is
not safe for internal abort since it would definitely not be guaranteed
that commands for the command devices are issued on the same queue.
To solve this issue, we take a bludgeoning approach, and issue a separate
internal abort on any queue(s) relevant to the command or device, in that
we will be guaranteed that at least one of these internal aborts will be
received last in the controller.
So, for aborting a single command, we can just force the internal abort to
be issued on the same queue as the command which we are trying to abort.
For aborting all commands associated with a device, we issue a separate
internal abort on all relevant queues. Issuing multiple internal aborts in
this fashion would have not side affect.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If sending IOs to many disks from single queue, it is possible that the
queue may be full. To avoid the situation, change queue depth from 512 to
4096 which is the max number of IOs for v3 hw.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add an interface to manually trigger a debugfs dump.
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch adds support for DIX to v3 hw driver.
For this, we build upon support for DIF, most significantly is adding new
DMA map and unmap paths.
Some pre-existing macro precedence issues are also tidied. They were
detected by checkpatch --strict.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Ido Schimmel says:
====================
mlxsw: Implement periodic ERP rehash
Currently, an ERP set is created for each region according to rules
inserted and order of their insertion. However that might lead to
suboptimal ERP sets and possible unnecessary spillage into C-TCAM.
This patchset aims to fix this problem and introduces periodical checking
of used ERP sets and in case a better ERP set is possible for the given
set of rules, it rehashes the region to use the better ERP set.
Patch 1 prepares devlink params infra in order to fix the
init/fini sequences.
Patch 2 implements hints infra in objagg library.
Patch 3 fixes a typo
Patch 4 adds number of root objects directly into objagg stats.
Patches 5-7 do split of multiple structs in Spectrum TCAM code.
Patch 8 introduces initial implementation of ERP rehash logic,
according to objagg hints.
Patch 9 adds hints priv passing trought the layers.
Patch 10 adds multi field into PAGT reg. (new patch)
Patch 11 implements actual region rules migration in TCAM code.
Patch 12 adds a devlink param so user is able to control
rehash interval.
Patch 13 adds couple of tracepoints in order to track
rehash procedures.
Patch 14 adds a simple selftest to test region rehash.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Track the basic codepaths of delta rehash handling,
using mlxsw tracepoints.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As vregion rehash is happening in delayed work, add some visibility to
the process using a few tracepoints.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Expose new driver-specific "acl_region_rehash_interval" devlink param
which would allow user to alter default ACL region rehash interval.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the hints are returned, the migration should be started. For that to
happen, there is a need to create a second physical region in TCAM with
new ERP set by passing the hints and then move chunk by chunk,
entry by entry.
During the transition, two lookups will occur. One in old region and
another in new region. The highest priority rule will be chosen.
In an unlikely case that the migration will fail and also rollback to
original region will fail the vregion will become in bad state.
Everything will work, only no future rehash will be possible. In a
follow-up work, this can be resolved by trying to resume the rollback
in delayed work and repair the vregion.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For Spectrum-2 this allows parallel lookups in multiple regions.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>