Commit graph

824682 commits

Author SHA1 Message Date
Lorenzo Bianconi
c09551c6ff net: ipv4: use a dedicated counter for icmp_v4 redirect packets
According to the algorithm described in the comment block at the
beginning of ip_rt_send_redirect, the host should try to send
'ip_rt_redirect_number' ICMP redirect packets with an exponential
backoff and then stop sending them at all assuming that the destination
ignores redirects.
If the device has previously sent some ICMP error packets that are
rate-limited (e.g. TTL expired) and continues to receive traffic,
the redirect packets will never be transmitted. This happens since
peer->rate_tokens will be typically greater than 'ip_rt_redirect_number'
and so it will never be reset even if the redirect silence timeout
(ip_rt_redirect_silence) has elapsed without receiving any packet
requiring redirects.

Fix it by using a dedicated counter for the number of ICMP redirect
packets that have been sent by the host.

I have not been able to identify a specific commit that introduced the
issue, since ip_rt_send_redirect has implemented the same rate-limiting
algorithm since commit 1da177e4c3 ("Linux-2.6.12-rc2").

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 21:50:15 -08:00
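The counter logic described in c09551c6ff can be sketched in user space as follows. This is a simplified model, not the kernel code: `struct peer_state`, `redirect_allowed()` and the three tunable values are hypothetical stand-ins for the per-peer state and the `ip_rt_redirect_*` settings named in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tunables, loosely modelled on the names in the commit message. */
enum {
    REDIRECT_NUMBER  = 9,    /* stop after this many redirects */
    REDIRECT_LOAD    = 2,    /* base backoff interval, in ticks */
    REDIRECT_SILENCE = 1024  /* quiet period after which state resets */
};

/* Simplified per-peer state; not the kernel's peer structure. */
struct peer_state {
    unsigned long rate_last;    /* tick of the last redirect-related event */
    unsigned long rate_tokens;  /* shared with other rate-limited ICMP errors */
    unsigned int  n_redirects;  /* dedicated redirect counter (the fix) */
};

/* Returns true if a redirect should be sent at tick `now` (now >= rate_last). */
static bool redirect_allowed(struct peer_state *p, unsigned long now)
{
    /* No redirect-worthy traffic for a while: start over. */
    if (now - p->rate_last > REDIRECT_SILENCE)
        p->n_redirects = 0;

    /* The peer ignored enough redirects: stay silent, keep the timer fresh. */
    if (p->n_redirects >= REDIRECT_NUMBER) {
        p->rate_last = now;
        return false;
    }

    /* Exponential backoff: the wait doubles with every redirect sent. */
    if (p->n_redirects == 0 ||
        now - p->rate_last > ((unsigned long)REDIRECT_LOAD << p->n_redirects)) {
        p->rate_last = now;
        p->n_redirects++;
        return true;
    }
    return false;
}
```

Because the decision reads only `n_redirects`, a large `rate_tokens` value left behind by other ICMP errors no longer prevents redirects from being sent.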
Matthew Wilcox
f818b82b80 XArray: Mark xa_insert and xa_reserve as must_check
If the user doesn't care about the return value from xa_insert(), then
they should be using xa_store() instead.  The point of xa_reserve() is
to get the return value early before taking another lock, so this should
also be __must_check.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
2019-02-09 00:00:49 -05:00
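The distinction drawn in f818b82b80 can be illustrated with a toy table. This is only a sketch: `toy_insert()`, `toy_store()` and the error values are hypothetical and are not the XArray API; the `MUST_CHECK` macro uses the `warn_unused_result` attribute available on GCC-compatible compilers.

```c
#include <assert.h>
#include <errno.h>

/* Ask GCC-compatible compilers to warn when the result is ignored. */
#define MUST_CHECK __attribute__((warn_unused_result))

#define SLOTS 8
static void *slots[SLOTS];   /* toy stand-in for an XArray */

/* Insert only if the slot is empty; the caller must look at the result. */
static MUST_CHECK int toy_insert(unsigned int index, void *entry)
{
    if (index >= SLOTS)
        return -EINVAL;
    if (slots[index])
        return -EBUSY;       /* error choice is an assumption of this sketch */
    slots[index] = entry;
    return 0;
}

/* Unconditional store: nothing for the caller to check. */
static void toy_store(unsigned int index, void *entry)
{
    if (index < SLOTS)
        slots[index] = entry;
}
```

A caller that writes `toy_insert(3, p);` as a bare statement gets a compiler warning, which nudges it toward `toy_store()` when the outcome really does not matter.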
Linus Torvalds
46c291e277 ARM: SoC fixes for linux-5.0
This is a bit larger than normal, as we had not managed to send out
 a pull request before traveling for a week without my signing key.
 
 There are multiple code fixes for older bugs, all of which should
 get backported into stable kernels:
 
 - tango: one fix for multiplatform configurations broken on other
   platforms when tango is enabled
 - arm_scmi: device unregistration fix
 - iop32x: fix kernel oops from extraneous __init annotation
 - pxa: remove a double kfree
 - fsl qbman: close an interrupt clearing race
 
 The rest is the usual collection of smaller fixes for device tree
 files, on the renesas, allwinner, meson, omap, davinci, qualcomm
 and imx platforms. Some of these are for compile-time warnings,
 most are for board specific functionality that fails to work
 because of incorrect settings.
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Merge tag 'armsoc-fixes-5.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "This is a bit larger than normal, as we had not managed to send out a
  pull request before traveling for a week without my signing key.

  There are multiple code fixes for older bugs, all of which should get
  backported into stable kernels:

   - tango: one fix for multiplatform configurations broken on other
     platforms when tango is enabled

   - arm_scmi: device unregistration fix

   - iop32x: fix kernel oops from extraneous __init annotation

   - pxa: remove a double kfree

   - fsl qbman: close an interrupt clearing race

  The rest is the usual collection of smaller fixes for device tree
  files, on the renesas, allwinner, meson, omap, davinci, qualcomm and
  imx platforms.

  Some of these are for compile-time warnings, most are for board
  specific functionality that fails to work because of incorrect
  settings"

* tag 'armsoc-fixes-5.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (30 commits)
  ARM: tango: Improve ARCH_MULTIPLATFORM compatibility
  firmware: arm_scmi: provide the mandatory device release callback
  ARM: iop32x/n2100: fix PCI IRQ mapping
  arm64: dts: add msm8996 compatible to gicv3
  ARM: dts: am335x-shc.dts: fix wrong cd pin level
  ARM: dts: n900: fix mmc1 card detect gpio polarity
  ARM: dts: omap3-gta04: Fix graph_port warning
  ARM: pxa: ssp: unneeded to free devm_ allocated data
  ARM: dts: r8a7743: Convert to new LVDS DT bindings
  soc: fsl: qbman: avoid race in clearing QMan interrupt
  arm64: dts: renesas: r8a77965: Enable DMA for SCIF2
  arm64: dts: renesas: r8a7796: Enable DMA for SCIF2
  arm64: dts: renesas: r8a774a1: Enable DMA for SCIF2
  ARM: dts: da850: fix interrupt numbers for clocksource
  dt-bindings: imx8mq: Number clocks consecutively
  arm64: dts: meson: Fix mmc cd-gpios polarity
  ARM: dts: imx6sx: correct backward compatible of gpt
  ARM: dts: imx: replace gpio-key,wakeup with wakeup-source property
  ARM: dts: vf610-bk4: fix incorrect #address-cells for dspi3
  ARM: dts: meson8m2: mxiii-plus: mark the SD card detection GPIO active-low
  ...
2019-02-08 16:23:41 -08:00
Linus Torvalds
5bb513ed83 arm64 fixes for -rc6
- Fix kernel oops when attempting kexec_file() with a NULL cmdline
 
 - Fix page table output in debugfs when CONFIG_ARM64_USER_VA_BITS_52=y

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "Two arm64 fixes for -rc6. They resolve a kernel NULL dereference in
  kexec and bogus kernel page table dumping when userspace is configured
  for 52-bit virtual addressing.

  Summary:

   - Fix kernel oops when attempting kexec_file() with a NULL cmdline

   - Fix page table output in debugfs when ARM64_USER_VA_BITS_52=y"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kexec_file: handle empty command-line
  arm64: ptdump: Don't iterate kernel page tables using PTRS_PER_PXX
2019-02-08 16:21:33 -08:00
Linus Torvalds
820828bffe powerpc fixes for 5.0 #4
Just two fixes, both going to stable.
 
 Our support for split pmd page table lock had a bug which could lead to a crash
 on mremap() when using the Radix MMU (Power9 only).
 
 A fix for the PAPR SCM driver (nvdimm) we added last release, which had a bug
 where we might mis-handle a hypervisor response leading to us failing to attach
 the memory region.
 
 Thanks to:
   Aneesh Kumar K.V, Oliver O'Halloran.

Merge tag 'powerpc-5.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Just two fixes, both going to stable.

   - Our support for split pmd page table lock had a bug which could
     lead to a crash on mremap() when using the Radix MMU (Power9 only).

   - A fix for the PAPR SCM driver (nvdimm) we added last release, which
     had a bug where we might mis-handle a hypervisor response leading
     to us failing to attach the memory region.

  Thanks to: Aneesh Kumar K.V, Oliver O'Halloran"

* tag 'powerpc-5.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/papr_scm: Use the correct bind address
  powerpc/radix: Fix kernel crash with mremap()
2019-02-08 16:04:12 -08:00
Raju Rangoju
f368ff188a iw_cxgb4: fix srqidx leak during connection abort
When an application aborts the connection by moving QP from RTS to ERROR,
then iw_cxgb4's modify_rc_qp() RTS->ERROR logic sets the
*srqidxp to 0 via t4_set_wq_in_error(&qhp->wq, 0), and aborts the
connection by calling c4iw_ep_disconnect().

c4iw_ep_disconnect() does the following:
 1. sends up a close_complete_upcall(ep, -ECONNRESET) to libcxgb4.
 2. sends abort request CPL to hw.

However, since close_complete_upcall() is sent before the ABORT_REQ is
sent to hw, libcxgb4 fails to release the srqidx if the connection holds
one, because the srqidx is passed up to libcxgb4 only after the
corresponding ABORT_RPL is processed by the kernel in abort_rpl().

This patch handles the corner case by moving the call to
close_complete_upcall() from c4iw_ep_disconnect() to abort_rpl(), so that
libcxgb4 is notified about the -ECONNRESET only after abort_rpl() and
can relinquish the srqidx properly.

Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 17:02:05 -07:00
Raju Rangoju
11a27e2121 iw_cxgb4: complete the cached SRQ buffers
If TP fetches an SRQ buffer but ends up not using it before the connection
is aborted, then it passes the index of that SRQ buffer to the host in
ABORT_REQ_RSS or ABORT_RPL CPL message.

But, if the srqidx field is zero in the received ABORT_RPL or
ABORT_REQ_RSS CPL, then we need to read the tcb.rq_start field to see if
it really did have an RQE cached. This works around a case where HW does
not include the srqidx in the ABORT_RPL/ABORT_REQ_RSS CPL.

The final value of rq_start is the one present in TCB with the
TF_RX_PDU_OUT bit cleared. So, we need to read the TCB, examine the
TF_RX_PDU_OUT (bit 49 of t_flags) in order to determine if there's a rx
PDU feedback event pending.

Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 17:02:05 -07:00
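The check described in 11a27e2121 amounts to testing one bit of a 64-bit flags word before trusting rq_start. A minimal sketch follows, assuming the bit position from the message; the macro and function names are hypothetical and the real driver reads these values from hardware messages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit position taken from the commit message; the macro name is hypothetical. */
#define RX_PDU_OUT_BIT 49

/* True if the t_flags word says an rx PDU feedback event is still pending. */
static bool rx_pdu_out_pending(uint64_t t_flags)
{
    return (t_flags >> RX_PDU_OUT_BIT) & 1u;
}

/*
 * Pick the SRQ index to complete. `cpl_srqidx` is what the abort message
 * carried; `rq_start` is the value read back from the TCB. Returns false
 * when rq_start is not final yet and the TCB has to be read again.
 */
static bool resolve_srqidx(uint32_t cpl_srqidx, uint64_t t_flags,
                           uint32_t rq_start, uint32_t *out)
{
    if (cpl_srqidx) {                 /* hardware reported it directly */
        *out = cpl_srqidx;
        return true;
    }
    if (rx_pdu_out_pending(t_flags))  /* value may still change */
        return false;
    *out = rq_start;                  /* final value, possibly 0 */
    return true;
}
```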
Raju Rangoju
e381a1cb65 cxgb4: add tcb flags and tcb rpl struct
This patch adds the tcb flags and structures needed for querying tcb
information.

Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 17:02:05 -07:00
Jason Gunthorpe
921eab1143 RDMA/devices: Re-organize device.c locking
The locking here started out with a single lock that covered everything
and then has lately veered into crazy town.

The fundamental problem is that several places need to iterate over a
linked list, but also need to drop their locks to avoid deadlock during
client callbacks.

xarray's restartable iteration offers a simple solution to the
problem. Once all the lists are xarrays we can drop locks in the places
that need that and rely on xarray to provide consistency and locking for
the data structure.

The resulting simplification is that each of the three lists has a
dedicated rwsem that must be held when working with the list it
covers. One data structure is no longer covered by multiple locks.

The sleeping semaphore is selected because the read side generally needs
to be held over something sleeping, and using RCU reader locking in those
cases is overkill.

In the process this simplifies the entire registration/unregistration flow
to be the expected list of setups and the reversed list of matching
teardowns, and the registration lock 'refcount' can now be revised to be
released after the ULPs are removed, providing a very sane semantic for
this feature.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:56:45 -07:00
Jason Gunthorpe
0df91bb673 RDMA/devices: Use xarray to store the client_data
Now that we have a small ID for each client we can use xarray instead of
linearly searching linked lists for client data. This will give much
faster and more scalable client data lookup, and will let us revise the
locking scheme.

Since xarray can store 'going_down' using a mark, entirely eliminate
the struct ib_client_data and directly store the client_data value in the
xarray. However, this does require a special iterator, as we must still
iterate over any NULL client_data values.

Also eliminate the client_data_lock in favour of internal xarray locking.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:56:45 -07:00
Jason Gunthorpe
e59178d895 RDMA/devices: Use xarray to store the clients
This gives each client a unique ID and will let us move client_data to use
xarray, and revise the locking scheme.

Clients have to be added/removed in strict FIFO/LIFO order as they
interdepend. To support this, the client_ids are assigned to increase in
FIFO order. The existing linked list is kept to support reverse iteration
until xarray can get a reverse iteration API.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
2019-02-08 16:56:45 -07:00
Jason Gunthorpe
3b88afd38e RDMA/device: Use an ida instead of a free page in alloc_name
ida is the proper data structure to hold a list of clustered small integers
and then allocate an unused integer. Get rid of the convoluted and limited
open-coded bitmap.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:56:45 -07:00
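The contract that 3b88afd38e relies on (record the integers already in use, then hand out an unused one) can be shown with a toy pool. This is not the kernel ida API: `struct toy_ida` and its functions are hypothetical, the pool is capped at 64 IDs, and returning the smallest free ID is a choice made for this sketch. Internally it is still a tiny bitmap, which is the detail a real ida hides from callers.

```c
#include <assert.h>
#include <stdint.h>

/* Toy ID pool covering IDs 0..63; a real ida grows on demand. */
struct toy_ida {
    uint64_t used;   /* bit i set means ID i is taken */
};

/* Mark a specific ID as taken, e.g. one parsed from an existing name. */
static void toy_ida_reserve(struct toy_ida *ida, unsigned int id)
{
    if (id < 64)
        ida->used |= UINT64_C(1) << id;
}

/* Return the smallest free ID, or -1 if the pool is exhausted. */
static int toy_ida_alloc(struct toy_ida *ida)
{
    for (unsigned int id = 0; id < 64; id++) {
        if (!(ida->used & (UINT64_C(1) << id))) {
            ida->used |= UINT64_C(1) << id;
            return (int)id;
        }
    }
    return -1;
}

/* Give an ID back to the pool. */
static void toy_ida_free(struct toy_ida *ida, unsigned int id)
{
    if (id < 64)
        ida->used &= ~(UINT64_C(1) << id);
}
```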
Jason Gunthorpe
652432f33c RDMA/device: Get rid of reg_state
This really has no purpose anymore, refcount can be used to tell if the
device is still registered. Keeping it around just invites mis-use.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
2019-02-08 16:56:45 -07:00
Jason Gunthorpe
d45f89d59b RDMA/device: Call ib_cache_release_one() only from ib_device_release()
Instead of complicated logic about when this memory is freed, always free
it during device release(). All the cache pointers start out as NULL, so
it is safe to call this before the cache is initialized.

This makes for a simpler error unwind flow, and a simpler understanding of
the lifetime of the memory allocations inside the struct ib_device.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:56:45 -07:00
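The pattern in d45f89d59b, a single release function that is safe even when setup never ran or failed part-way, can be sketched in user space like this. `struct toy_device` and its helpers are hypothetical; the sketch leans on the fact that `free(NULL)` is a no-op.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device with lazily allocated cache tables. */
struct toy_device {
    int *pkey_cache;
    int *gid_cache;
};

/* Allocation zeroes the struct, so every cache pointer starts out NULL. */
static struct toy_device *toy_device_alloc(void)
{
    return calloc(1, sizeof(struct toy_device));
}

/* May fail part-way; no local unwinding is needed. */
static int toy_cache_setup(struct toy_device *dev, int fail_second)
{
    dev->pkey_cache = malloc(16 * sizeof(int));
    if (!dev->pkey_cache)
        return -1;
    if (fail_second)
        return -1;              /* simulated failure after the first step */
    dev->gid_cache = malloc(16 * sizeof(int));
    return dev->gid_cache ? 0 : -1;
}

/* Single teardown point: free(NULL) is a no-op, so this is always safe. */
static void toy_device_release(struct toy_device *dev)
{
    free(dev->gid_cache);
    free(dev->pkey_cache);
    free(dev);
}
```

Every error path can then simply drop the device and let the release function clean up, instead of tracking which caches were already allocated.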
Jason Gunthorpe
b34b269ad8 RDMA/device: Ensure that security memory is always freed
Since this only frees memory it should be done during the release
callback. Otherwise there are possible error flows where it might not get
called if registration aborts.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:56:45 -07:00
Jason Gunthorpe
e3593b568a RDMA/device: Check that the rename is nop under the lock
Since another rename could be running in parallel it is safer to check
that the name is not changing inside the lock, where we already know the
device name will not change.

Fixes: d21943dd19 ("RDMA/core: Implement IB device rename function")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
2019-02-08 16:56:45 -07:00
Leon Romanovsky
21a428a019 RDMA: Handle PD allocations by IB/core
Handling PD allocations in IB/core allows us to simplify drivers and their
error flows in their .alloc_pd() paths. The changes in .alloc_pd() go hand
in hand with the relevant update in .dealloc_pd().

We use this opportunity to convert .dealloc_pd() so that it cannot fail, as
was suggested a long time ago; failures are not happening, as we have
never seen a WARN_ON print.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:51:04 -07:00
Leon Romanovsky
30471d4b20 RDMA/core: Share driver structure size with core
Add new macros to be used in drivers while registering ops structure and
IB/core while calling allocation routines, so drivers won't need to
perform kzalloc/kfree in their paths.

The change in allocation stage allows us to initialize common fields prior
to calling to drivers (e.g. restrack).

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:50:58 -07:00
Linus Torvalds
6b2912cedc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull signal fixes from Eric Biederman:
 "This contains four small fixes for signal handling. A missing range
  check, a regression fix, prioritizing signals we have already started
  a signal group exit for, and better detection of synchronous signals.

  The confused decision of which signals to handle failed spectacularly
  when a timer was pointed at SIGBUS and the stack overflowed. Resulting
  in an unkillable process in an infinite loop instead of a SIGSEGV and
  core dump"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  signal: Better detection of synchronous signals
  signal: Always notice exiting tasks
  signal: Always attempt to allocate siginfo for SIGSTOP
  signal: Make siginmask safe when passed a signal of 0
2019-02-08 15:39:28 -08:00
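The "missing range check" mentioned in 6b2912cedc concerns testing a signal number against a bitmask. A minimal sketch of a range-checked test follows; `TOY_SIGRTMIN`, `TOY_SIGMASK()` and `toy_sig_in_mask()` are hypothetical names and the kernel's macro may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_SIGRTMIN 32   /* assumed first real-time signal number */

/* Bit for signal `sig` (1-based); valid only for 1 <= sig < TOY_SIGRTMIN. */
#define TOY_SIGMASK(sig) (1UL << ((sig) - 1))

/* Signal 0 and out-of-range values are never reported as in the mask. */
static bool toy_sig_in_mask(int sig, unsigned long mask)
{
    if (sig <= 0 || sig >= TOY_SIGRTMIN)
        return false;   /* avoids shifting by -1, which is undefined */
    return (TOY_SIGMASK(sig) & mask) != 0;
}
```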
Linus Torvalds
3b6e8204a9 SCSI fixes on 20190208
This is a set of five minor fixes (although, technically, the aicxxx
 fix is for a major problem in that the driver won't load without it,
 but I think the fact it's taken us since 4.10 to discover this
 indicates that the user base for these things has declined).
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is a set of five minor fixes (although, technically, the aicxxx
  fix is for a major problem in that the driver won't load without it,
  but I think the fact it's taken us since 4.10 to discover this
  indicates that the user base for these things has declined)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: cxlflash: Prevent deadlock when adapter probe fails
  Revert "scsi: libfc: Add WARN_ON() when deleting rports"
  scsi: sd_zbc: Fix zone information messages
  scsi: target: make the pi_prot_format ConfigFS path readable
  scsi: aic94xx: fix module loading
2019-02-08 15:37:17 -08:00
Linus Torvalds
2e277fa089 IOMMU Fix for Linux v5.0-rc5:
* Intel decided to leave the newly added Scalable Mode Feature
 	  default-disabled for now. The patch here accomplishes that.

Merge tag 'iommu-fixes-v5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU fix from Joerg Roedel:
 "Intel decided to leave the newly added Scalable Mode Feature
  default-disabled for now. The patch here accomplishes that"

* tag 'iommu-fixes-v5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/vt-d: Leave scalable mode default off
2019-02-08 15:34:10 -08:00
Linus Torvalds
70be9ac2b6 pci-v5.0-fixes-4

Merge tag 'pci-v5.0-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fix from Bjorn Helgaas:
 "Work around Synopsys duplicate Device ID (HAPS USB3, NXP i.MX) that
  breaks PCIe on i.MX SoCs (Thinh Nguyen)"

* tag 'pci-v5.0-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Work around Synopsys duplicate Device ID (HAPS USB3, NXP i.MX)
2019-02-08 15:32:10 -08:00
Linus Torvalds
e2dac603d4 ACPI fix for 5.0-rc6
Prevent excessive ACPI debug messages from being printed to the
 kernel log, which has started to happen after one of the recent
 ACPICA commits (Erik Schmauss).

Merge tag 'acpi-5.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "This prevents excessive ACPI debug messages from being printed to the
  kernel log, which has started to happen after one of the recent ACPICA
  commits (Erik Schmauss)"

* tag 'acpi-5.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: Set debug output flags independent of ACPICA
2019-02-08 15:30:02 -08:00
Daniel Jurgens
c66f67414c IB/core: Don't register each MAD agent for LSM notifier
When creating many MAD agents in a short period of time, receive packet
processing can be delayed long enough to cause timeouts while new agents
are being added to the atomic notifier chain with IRQs disabled.  Notifier
chain registration and unregistration is an O(n) operation. With large
numbers of MAD agents being created and destroyed simultaneously the CPUs
spend too much time with interrupts disabled.

Instead of each MAD agent registering for its own LSM notification,
maintain a list of agents internally and register once; this registration
already existed for handling the PKeys. This list is write mostly, so a
normal spin lock is used vs a read/write lock. All MAD agents must be
checked, so a single list is used instead of breaking them down per
device.

Notifier calls are done under rcu_read_lock, so there isn't a risk of
similar packet timeouts while checking the MAD agents' security settings
when notified.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:24:44 -07:00
Daniel Jurgens
805b754d49 IB/core: Eliminate a hole in MAD agent struct
Move the security related fields above the u8s to eliminate a hole in the
struct.

pahole before:
struct ib_mad_agent {
...
u32                        hi_tid;               /*    48     4 */
u32                        flags;                /*    52     4 */
u8                         port_num;             /*    56     1 */
u8                         rmpp_version;         /*    57     1 */

/* XXX 6 bytes hole, try to pack */

/* --- cacheline 1 boundary (64 bytes) --- */
void *                     security;             /*    64     8 */
bool                       smp_allowed;          /*    72     1 */
bool                       lsm_nb_reg;           /*    73     1 */

/* XXX 6 bytes hole, try to pack */

struct notifier_block      lsm_nb;               /*    80    24 */

/* XXX last struct has 4 bytes of padding */

/* size: 104, cachelines: 2, members: 14 */
...
};

pahole after:
struct ib_mad_agent {
...
u32                        hi_tid;               /*    48     4 */
u32                        flags;                /*    52     4 */
void *                     security;             /*    56     8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct notifier_block      lsm_nb;               /*    64    24 */

/* XXX last struct has 4 bytes of padding */

u8                         port_num;             /*    88     1 */
u8                         rmpp_version;         /*    89     1 */
bool                       smp_allowed;          /*    90     1 */
bool                       lsm_nb_reg;           /*    91     1 */

/* size: 96, cachelines: 2, members: 14 */
...
};

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:24:44 -07:00
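The effect shown in the pahole output of 805b754d49 can be reproduced with two small structs that contain only the reordered tail fields. This is a sketch: `struct toy_notifier` is a stand-in for struct notifier_block, the leading members of the real struct are omitted, and exact sizes depend on the target ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct notifier_block: two pointers and an int. */
struct toy_notifier {
    int (*call)(void *data);
    void *next;
    int   priority;
};

/* Tail of the struct in the original field order. */
struct tail_before {
    uint32_t hi_tid;
    uint32_t flags;
    uint8_t  port_num;
    uint8_t  rmpp_version;
    void    *security;           /* pointer alignment leaves a hole before this */
    bool     smp_allowed;
    bool     lsm_nb_reg;
    struct toy_notifier lsm_nb;  /* and another hole before this */
};

/* Same members, pointer-aligned fields first, byte-sized fields last. */
struct tail_after {
    uint32_t hi_tid;
    uint32_t flags;
    void    *security;
    struct toy_notifier lsm_nb;
    uint8_t  port_num;
    uint8_t  rmpp_version;
    bool     smp_allowed;
    bool     lsm_nb_reg;
};

/* Bytes saved by the reordering on the current target. */
static size_t bytes_saved(void)
{
    return sizeof(struct tail_before) - sizeof(struct tail_after);
}
```

On a typical LP64 target `bytes_saved()` returns 8, which matches the change from 104 to 96 bytes reported above.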
Daniel Jurgens
6e88e672b6 IB/core: Fix potential memory leak while creating MAD agents
If the MAD agent isn't allowed to manage the subnet, or fails to register
for the LSM notifier, the security context is leaked. Free the context in
these cases.

Fixes: 47a2b338fe ("IB/core: Enforce security on management datagrams")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reported-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:24:44 -07:00
Daniel Jurgens
d60667fc39 IB/core: Unregister notifier before freeing MAD security
If the notifier runs after the security context is freed, an access of
freed memory can occur.

Fixes: 47a2b338fe ("IB/core: Enforce security on management datagrams")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:24:44 -07:00
Parvi Kaustubhi
0c23660649 IB/usnic: Fix locking when unregistering
Move the call to usnic_ib_device_remove after usnic_ib_ibdev_list_lock has
been released.

Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:21:59 -07:00
Nicolas Ferre
3e3e0cdfca net: macb: add sam9x60-macb compatibility string
Add a new compatibility string for this product. It's using
at91sam9260-macb layout but has a newer hardware revision: it's safer
to use its own string.

Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:19:50 -08:00
Nicolas Ferre
4973a1276c net/macb: bindings doc: add sam9x60 binding
Add the compatibility string documentation for sam9x60 10/100 interface.

Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:19:50 -08:00
Nicolas Ferre
83ef97d1d3 net/macb: bindings doc/trivial: fix documentation for sama5d3 10/100 interface
This removes a line left while adding the correct compatibility string for
sama5d3 10/100 interface. Now use the "atmel,sama5d3-macb" string.

Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:19:50 -08:00
Steve Wise
c8a7eb554a iw_cxgb4: use tos when finding ipv6 routes
When IPv6 support was added, the correct tos was not passed to
cxgb_find_route6(). This potentially results in the wrong route entry.

Fixes: 830662f6f0 ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:18:06 -07:00
Steve Wise
cb3ba0bde8 iw_cxgb4: use tos when importing the endpoint
import_ep() is passed the correct tos, but doesn't use it correctly.

Fixes: ac8e4c69a0 ("cxgb4/iw_cxgb4: TOS support")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:18:06 -07:00
Steve Wise
7235ea227e iw_cxgb4: use listening ep tos when accepting new connections
If the parent listening endpoint has a service type set, then use that
when setting up the connection.  This allows server-side applications to
mandate the tos for passive side connections via rdma_set_service_type()
on the listening endpoints.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:18:06 -07:00
Steve Wise
926ba19b35 RDMA/iwcm: add tos_set bool to iw_cm struct
This allows drivers to know the tos was actively set by the application.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:18:06 -07:00
Steve Wise
9491128f78 RDMA/cma: listening device cm_ids should inherit tos
If a user binds to INADDR_ANY and sets the service id, then the
device-specific cm_ids should also use this tos.  This allows an app to
do:

rdma_bind_addr(INADDR_ANY)
set_service_type()
rdma_listen()

And connections setup via this listening endpoint will use the correct
tos.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:15:40 -07:00
Danit Goldberg
2c1619edef IB/cma: Define option to set ack timeout and pack tos_set
Define new option in 'rdma_set_option' to override calculated QP timeout
when requested to provide QP attributes to modify a QP.

At the same time, pack tos_set to be bitfield.

Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:14:21 -07:00
Russell King
b5bfc21af5 net: sfp: do not probe SFP module before we're attached
When we probe a SFP module, we expect to be able to call the upstream
device's module_insert() function so that the upstream link can be
configured.  However, when the upstream device is delayed, we currently
may end up probing the module before the upstream device is available,
and lose the module_insert() call.

Avoid this by holding off probing the module until the SFP bus is
properly connected to both the SFP socket driver and the upstream
driver.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:11:25 -08:00
John Garry
4a8bec88f7 scsi: hisi_sas: Do some more tidy-up
Do some very minor tidy-up, such as removing needless variable
initialization and whitespace before closing quotes.

Originally-from: Xiang Chen <chenxiang66@hisilicon.com>
Originally-from: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
Xiang Chen
4fefe5bbf5 scsi: hisi_sas: Use pci_irq_get_affinity() for v3 hw as experimental
For auto-control irq affinity mode, choose the dq to deliver IO according
to the current CPU.
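
One way to picture the CPU-to-queue selection is a lookup table built from
the per-queue interrupt affinity masks. The sketch below is a Python model
under stated assumptions, not the driver code: the function names, the
fallback to queue 0, and the 8-CPU/4-queue layout are all hypothetical.

```python
def build_reply_map(irq_affinity, nr_cpus):
    """Map each CPU to the delivery queue whose interrupt it services.

    irq_affinity[q] is the set of CPUs in queue q's interrupt affinity mask.
    CPUs not covered by any mask fall back to queue 0 (an assumption).
    """
    reply_map = [0] * nr_cpus
    for queue, cpus in enumerate(irq_affinity):
        for cpu in cpus:
            reply_map[cpu] = queue
    return reply_map


def pick_queue(reply_map, current_cpu):
    """Choose the delivery queue for an IO submitted on current_cpu."""
    return reply_map[current_cpu]


# Hypothetical layout: 8 CPUs, 4 queues, two CPUs per queue.
affinity = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]
reply_map = build_reply_map(affinity, nr_cpus=8)
```

With this table, an IO submitted on a given CPU is delivered on the queue
whose completion interrupt is serviced by that same CPU.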

This reduces the performance regression seen when fio and the CQ interrupts
are processed on different nodes.

For user control irq affinity mode, keep it as before.

To achieve this, we also need to distinguish the usage of the dq lock and
the sas_dev lock.

We mark as experimental due to ongoing discussion on managed MSI IRQ
during hotplug:
https://marc.info/?l=linux-scsi&m=154876335707751&w=2

We're almost at the point where we can expose multiple queues to the upper
layer for SCSI MQ, but we need to sort out the per-HBA tags performance
issue.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
John Garry
795f25a31b scsi: hisi_sas: Issue internal abort on all relevant queues
To support queues mapped to a CPU, we must ensure that issuing an internal
abort is safe, in that an internal abort for a single IO or a device is
guaranteed to be processed after all the relevant command(s) which it is
attempting to abort have been processed by the controller.

Currently we only deliver commands for any device on a single queue to
solve this problem, as we know that commands issued on the same queue will
be processed in order, and we will not have a scenario where the internal
abort is racing against a command(s) which it is trying to abort.

When commands are enqueued on a queue mapped to a CPU, the queue for a
command is chosen based on the queue associated with the current CPU. This
is not safe for internal abort, since there is no guarantee that all
commands for a device are issued on the same queue.

To solve this issue, we take a bludgeoning approach, and issue a separate
internal abort on any queue(s) relevant to the command or device, in that
we will be guaranteed that at least one of these internal aborts will be
received last in the controller.

So, for aborting a single command, we can just force the internal abort to
be issued on the same queue as the command which we are trying to abort.

For aborting all commands associated with a device, we issue a separate
internal abort on all relevant queues. Issuing multiple internal aborts in
this fashion has no side effects.
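
The queue selection for the two abort cases can be sketched as below. This
is a simplified Python model, not the driver logic: it assumes "relevant"
means "has an outstanding command for the device", and the function names
and sample data are hypothetical.

```python
def queue_for_command_abort(outstanding, tag):
    """Single-command abort: reuse the queue the command was issued on."""
    for queue, _device, cmd_tag in outstanding:
        if cmd_tag == tag:
            return queue
    raise KeyError(tag)


def queues_for_device_abort(outstanding, device):
    """Device abort: one internal abort per queue holding its commands."""
    return sorted({queue for queue, dev, _tag in outstanding if dev == device})


# Hypothetical outstanding commands as (queue, device, tag) tuples.
outstanding = [
    (0, "sda", 1),
    (2, "sda", 2),
    (2, "sdb", 3),
    (3, "sda", 4),
]

single = queue_for_command_abort(outstanding, tag=3)
per_device = queues_for_device_abort(outstanding, "sda")
```

Because each queue is processed in order, the abort placed on a given queue
lands behind every command already issued there, and the last abort to
arrive covers the device as a whole.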

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
Xiang Chen
1273d65f29 scsi: hisi_sas: change queue depth from 512 to 4096
When sending IOs to many disks from a single queue, the queue may become
full. To avoid this, change the queue depth from 512 to 4096, which is the
max number of IOs for v3 hw.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
Luo Jiaxing
7c5e136363 scsi: hisi_sas: Add manual trigger for debugfs dump
Add an interface to manually trigger a debugfs dump.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
Xiang Chen
b3cce125cb scsi: hisi_sas: Add support for DIX feature for v3 hw
This patch adds support for DIX to v3 hw driver.

For this, we build upon support for DIF, most significantly by adding new
DMA map and unmap paths.

Some pre-existing macro precedence issues are also tidied. They were
detected by checkpatch --strict.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:21 -05:00
David S. Miller
3e32675c05 Merge branch 'mlxsw-Implement-periodic-ERP-rehash'
Ido Schimmel says:

====================
mlxsw: Implement periodic ERP rehash

Currently, an ERP set is created for each region according to the rules
inserted and the order of their insertion. However, that might lead to
suboptimal ERP sets and possibly unnecessary spillage into C-TCAM.
This patchset aims to fix this problem by periodically checking the ERP
sets in use; when a better ERP set is possible for the given set of rules,
it rehashes the region to use the better ERP set.

Patch 1 prepares devlink params infra in order to fix the
        init/fini sequences.
Patch 2 implements hints infra in objagg library.
Patch 3 fixes a typo
Patch 4 adds number of root objects directly into objagg stats.
Patches 5-7 split multiple structs in Spectrum TCAM code.
Patch 8 introduces initial implementation of ERP rehash logic,
         according to objagg hints.
Patch 9 adds hints priv passing through the layers.
Patch 10 adds multi field into PAGT reg. (new patch)
Patch 11 implements actual region rules migration in TCAM code.
Patch 12 adds a devlink param so user is able to control
         rehash interval.
Patch 13 adds a couple of tracepoints in order to track
         rehash procedures.
Patch 14 adds a simple selftest to test region rehash.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:02:50 -08:00
Jiri Pirko
c478d3c347 selftests: mlxsw: spectrum-2: Add simple delta rehash test
Track the basic codepaths of delta rehash handling,
using mlxsw tracepoints.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:02:50 -08:00
Jiri Pirko
3985de7260 mlxsw: spectrum_acl: Add couple of vregion rehash tracepoints
As vregion rehash is happening in delayed work, add some visibility to
the process using a few tracepoints.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:02:50 -08:00
Jiri Pirko
98bbf70c1c mlxsw: spectrum: add "acl_region_rehash_interval" devlink param
Expose a new driver-specific "acl_region_rehash_interval" devlink param
which allows the user to alter the default ACL region rehash interval.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:02:50 -08:00
Jiri Pirko
e5e7962ee5 mlxsw: spectrum_acl: Implement region migration according to hints
If hints are returned, the migration should be started. To do so, a
second physical region must be created in TCAM with the new ERP set by
passing the hints, and the rules are then moved chunk by chunk,
entry by entry.

During the transition, two lookups will occur. One in old region and
another in new region. The highest priority rule will be chosen.
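
The dual lookup during the transition can be sketched as follows. This is
an illustrative Python model only: the rule representation, the function
names, and the convention that a larger number means higher priority are
assumptions, not the hardware or driver behaviour.

```python
def lookup(region, packet):
    """Return the highest-priority matching rule in one region, or None."""
    matches = [rule for rule in region if rule["match"](packet)]
    return max(matches, key=lambda rule: rule["priority"], default=None)


def lookup_during_migration(old_region, new_region, packet):
    """Look up both regions and keep the higher-priority result."""
    hits = [
        hit
        for hit in (lookup(old_region, packet), lookup(new_region, packet))
        if hit is not None
    ]
    return max(hits, key=lambda rule: rule["priority"], default=None)


# Hypothetical rules: one not yet migrated, one already moved.
rule_a = {"name": "A", "priority": 10, "match": lambda p: p["dport"] == 80}
rule_b = {"name": "B", "priority": 20, "match": lambda p: True}

old_region = [rule_a]   # still waiting to be moved
new_region = [rule_b]   # already migrated

winner = lookup_during_migration(old_region, new_region, {"dport": 80})
```

Since both regions are consulted, a rule gives the same result whether or
not it has been moved yet, which is what lets the migration proceed entry
by entry.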

In the unlikely case that the migration fails and the rollback to the
original region also fails, the vregion ends up in a bad state.
Everything will still work, but no future rehash will be possible. In a
follow-up work, this can be resolved by trying to resume the rollback
in delayed work and repair the vregion.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:02:50 -08:00
Jiri Pirko
5c661f142c mlxsw: reg: Add multi field to PAGT register
For Spectrum-2 this allows parallel lookups in multiple regions.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:02:50 -08:00