Commit graph

706291 commits

Author SHA1 Message Date
Noam Camus
1112c3b2ce ARC: [plat-eznps] spinlock aware for MTM
This way when we execute "ex" during trying to hold lock we can switch to
other HW thread and utilize the core intead of just spinning on a lock.

We noticed about 10% improvement of execution time with hackbench test.

Signed-off-by: Noam Camus <noamca@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-28 15:17:36 -07:00
Vineet Gupta
c2bdac146b ARC: spinlock: Document the EX based spin_unlock
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-28 15:17:36 -07:00
Noam Camus
ab1e8660c1 ARC: [plat-eznps] disabled stall counter due to a HW bug
This counter represents threshold for consecutive stall which
would trigger HW threads scheduling. However when enabled, low
threshhold values cause performance degradation and in the
worst case even livelock.

So disable it by resorting to HW reset value

Signed-off-by: Noam Camus <noamca@mellanox.com>
Reviewed-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: fixed changelog]
2017-08-28 15:17:36 -07:00
Noam Camus
30b7af252e ARC: [plat-eznps] Fix TLB Errata
Due to a HW bug in NPS400 we get from time to time false TLB miss.
Workaround this by validating each miss.

Signed-off-by: Noam Camus <noamca@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-28 15:17:36 -07:00
Noam Camus
9e9395525b ARC: [plat-eznps] typo fix at Kconfig
Signed-off-by: Noam Camus <noamca@mellanox.com>
Reviewed-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-28 15:17:36 -07:00
Liav Rehana
9405530469 ARC: typos fix in kernel/entry-compact.S
Signed-off-by: Liav Rehana <liavr@mellanox.com>
Signed-off-by: Noam Camus <noamca@mellanox.com>
Reviewed-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-28 15:17:36 -07:00
Liav Rehana
ddf720f86e ARC: typo fix in mm/fault.c
Signed-off-by: Liav Rehana <liavr@mellanox.com>
Signed-off-by: Noam Camus <noamca@mellanox.com>
Reviewed-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-08-28 15:17:36 -07:00
David Ahern
a8e3bb347d net: Add comment that early_demux can change via sysctl
Twice patches trying to constify inet{6}_protocol have been reverted:
39294c3df2 ("Revert "ipv6: constify inet6_protocol structures"") to
revert 3a3a4e3054 and then 03157937fe ("Revert "ipv4: make
net_protocol const"") to revert aa8db499ea.

Add a comment that the structures can not be const because the
early_demux field can change based on a sysctl.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-28 15:17:29 -07:00
Rafael J. Wysocki
7aa7a0360a PM: docs: Delete the obsolete states.txt document
The Documentation/power/states.txt document is now redundant and
sonewhat outdated, so delete it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-08-29 00:16:06 +02:00
Rafael J. Wysocki
cba630b6f5 Merge branch 'pm-sleep' into pm-docs 2017-08-29 00:15:47 +02:00
Rafael J. Wysocki
0c0b6b7bc4 PM: docs: Describe high-level PM strategies and sleep states
Reorganize the power management part of admin-guide by adding a
description of major power management strategies supported by the
kernel (system-wide and working-state power management) to it and
dividing the rest of the material into the system-wide PM and
working-state PM chapters.

On top of that, add a description of system sleep states to the
system-wide PM chapter.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
2017-08-29 00:15:32 +02:00
Willem de Bruijn
cc8737a5fe xen-netback: update ubuf_info initialization to anonymous union
The xen driver initializes struct ubuf_info fields using designated
initializers. I recently moved these fields inside a nested anonymous
struct inside an anonymous union. I had missed this use case.

This breaks compilation of xen-netback with older compilers.
>From kbuild bot with gcc-4.4.7:

   drivers/net//xen-netback/interface.c: In function
   'xenvif_init_queue':
   >> drivers/net//xen-netback/interface.c:554: error: unknown field 'ctx' specified in initializer
   >> drivers/net//xen-netback/interface.c:554: warning: missing braces around initializer
      drivers/net//xen-netback/interface.c:554: warning: (near initialization for '(anonymous).<anonymous>')
   >> drivers/net//xen-netback/interface.c:554: warning: initialization makes integer from pointer without a cast
   >> drivers/net//xen-netback/interface.c:555: error: unknown field 'desc' specified in initializer

Add double braces around the designated initializers to match their
nested position in the struct. After this, compilation succeeds again.

Fixes: 4ab6c99d99 ("sock: MSG_ZEROCOPY notification coalescing")
Reported-by: kbuild bot <lpk@intel.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-28 15:11:50 -07:00
David S. Miller
d5a915b978 Merge branch 'gre-add-collect_md-mode-for-ERSPAN-tunnel'
William Tu says:

====================
gre: add collect_md mode for ERSPAN tunnel

This patch series provide collect_md mode for ERSPAN tunnel.  The fist patch
refactors the existing gre_fb_xmit function by exacting the route cache
portion into a new function called prepare_fb_xmit.  The second patch
introduces the collect_md mode for ERSPAN tunnel, by calling the
prepare_fb_xmit function and adding ERSPAN specific logic.  The final patch
adds the test case using bpf_skb_{set,get}_tunnel_{key,opt}.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-28 15:04:52 -07:00
William Tu
ef88f89c83 samples/bpf: extend test_tunnel_bpf.sh with ERSPAN
Extend existing tests for vxlan, gre, geneve, ipip to
include ERSPAN tunnel.

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-28 15:04:52 -07:00
William Tu
1a66a836da gre: add collect_md mode to ERSPAN tunnel
Similar to gre, vxlan, geneve, ipip tunnels, allow ERSPAN tunnels to
operate in 'collect metadata' mode.  bpf_skb_[gs]et_tunnel_key() helpers
can make use of it right away.  OVS can use it as well in the future.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-28 15:04:52 -07:00
William Tu
862a03c35e gre: refactor the gre_fb_xmit
The patch refactors the gre_fb_xmit function, by creating
prepare_fb_xmit function for later ERSPAN collect_md mode patch.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-28 15:04:52 -07:00
Oza Pawandeep
39b7a4ff93 PCI: iproc: Work around Stingray CRS defects
Configuration Request Retry Status ("CRS") completions are a required part
of PCIe.  A PCIe device may respond to config a request with a CRS
completion to indicate that it needs more time to initialize.  A Root Port
that receives a CRS completion may automatically retry the request, or it
may treat the request as a failed transaction.  For a failed read, it will
likely synthesize all 1's data, i.e., 0xffffffff, to complete the read to
the CPU.

CRS Software Visibility ("CRS SV") is an optional feature.  Per PCIe r3.1,
sec 2.3.2, if supported and enabled, a Root Port that receives a CRS
completion for a config read of the Vendor ID will synthesize 0x0001 data
(an invalid Vendor ID) instead of retrying or failing the transaction.  The
0x0001 data makes the CRS completion visible to software, so it can perform
other tasks while waiting for the device.

The iProc "Stingray" PCIe controller does not support CRS completions
correctly.  From the Stingray PCIe Controller spec:

  4.7.3.3. Retry Status On Configuration Cycle

  Endpoints are allowed to generate retry status on configuration cycles.
  In this case, the RC needs to re-issue the request. The IP does not
  handle this because the number of configuration cycles needed will
  probably be less than the total number of non-posted operations needed.

  When a retry status is received on the User RX interface for a
  configuration request that was sent on the User TX interface, it will be
  indicated with a completion with the CMPL_STATUS field set to 2=CRS, and
  the user will have to find the address and data values and send a new
  transaction on the User TX interface.  When the internal configuration
  space returns a retry status during a configuration cycle (user_cscfg =
  1) on the Command/Status interface, the pcie_cscrs will assert with the
  pcie_csack signal to indicate the CRS status.

  When the CRS Software Visibility Enable register in the Root Control
  register is enabled, the IP will return the data value to 0x0001 for the
  Vendor ID value and 0xffff  (all 1’s) for the rest of the data in the
  request for reads of offset 0 that return with CRS status.  This is true
  for both the User RX Interface and for the Command/Status interface.
  When CRS Software Visibility is enabled, the CMPL_STATUS field of the
  completion on the User RX Interface will not be 2=CRS and the pcie_cscrs
  signal will not assert on the Command/Status interface.

The Stingray hardware never reissues configuration requests when it
receives CRS completions.  Contrary to what sec 4.7.3.3 above says, when it
receives a CRS completion, it synthesizes 0xffff0001 data regardless of the
address of the read or the value of the CRS SV enable bit.

This is broken in two ways:

  1) When CRS SV is disabled, the Root Port should never synthesize the
  0x0001 value.  If it receives a CRS completion, it should fail the
  transaction and synthesize all 1's data.

  2) When CRS SV is enabled, the Root Port should only synthesize 0x0001
  data if it receives a CRS completion for a read of the Vendor ID.  If it
  receives a CRS completion for any other read, it should fail the
  transaction and synthesize all 1's data.

This breaks pci_flr_wait(), which reads the Command register and expects to
see all 1's data if the read fails because of CRS completions.  On
Stingray, it sees the incorrect 0xffff0001 data instead.

It also breaks config registers that contain the 0xffff0001 value.  If we
read such a register, software can't distinguish a CRS completion from the
actual value read from the device.

On Stingray, if we read 0xffff0001 data, assume this indicates a CRS
completion and retry the read for 500ms.  If we time out, return all 1's
(0xffffffff) data.  Note that this corrupts registers that happen to
contain 0xffff0001.

Stingray advertises CRS SV support in its Root Capabilities register, and
the CRS SV enable bit is writable (even though the hardware ignores it).
Mask out PCI_EXP_RTCAP_CRSVIS so software doesn't try to use CRS SV.

Signed-off-by: Oza Pawandeep <oza.oza@broadcom.com>
[bhelgaas: changelog, add probe-time warning about corruption, don't
advertise CRS SV support, remove duplicate pci_generic_config_read32(),
fix alignment based on patch from Arnd Bergmann <arnd@arndb.de>]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-08-28 16:43:30 -05:00
Oza Pawandeep
d005045bcf PCI: iproc: Factor out memory-mapped config access address calculation
Factor out the address calculation for memory-mapped config accesses as a
separate function.  No functional change intended.

Signed-off-by: Oza Pawandeep <oza.oza@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-08-28 16:43:24 -05:00
Arvind Yadav
0c3014f22d selinux: constify nf_hook_ops
nf_hook_ops are not supposed to change at runtime. nf_register_net_hooks
and nf_unregister_net_hooks are working with const nf_hook_ops.
So mark the non-const nf_hook_ops structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-08-28 17:33:19 -04:00
David Ahern
03157937fe Revert "ipv4: make net_protocol const"
This reverts commit aa8db499ea.

Early demux structs can not be made const. Doing so results in:
[   84.967355] BUG: unable to handle kernel paging request at ffffffff81684b10
[   84.969272] IP: proc_configure_early_demux+0x1e/0x3d
[   84.970544] PGD 1a0a067
[   84.970546] P4D 1a0a067
[   84.971212] PUD 1a0b063
[   84.971733] PMD 80000000016001e1

[   84.972669] Oops: 0003 [#1] SMP
[   84.973065] Modules linked in: ip6table_filter ip6_tables veth vrf
[   84.973833] CPU: 0 PID: 955 Comm: sysctl Not tainted 4.13.0-rc6+ #22
[   84.974612] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[   84.975855] task: ffff88003854ce00 task.stack: ffffc900005a4000
[   84.976580] RIP: 0010:proc_configure_early_demux+0x1e/0x3d
[   84.977253] RSP: 0018:ffffc900005a7dd0 EFLAGS: 00010246
[   84.977891] RAX: ffffffff81684b10 RBX: 0000000000000001 RCX: 0000000000000000
[   84.978759] RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000000
[   84.979628] RBP: ffffc900005a7dd0 R08: 0000000000000000 R09: 0000000000000000
[   84.980501] R10: 0000000000000001 R11: 0000000000000008 R12: 0000000000000001
[   84.981373] R13: ffffffffffffffea R14: ffffffff81a9b4c0 R15: 0000000000000002
[   84.982249] FS:  00007feb237b7700(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[   84.983231] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   84.983941] CR2: ffffffff81684b10 CR3: 0000000038492000 CR4: 00000000000406f0
[   84.984817] Call Trace:
[   84.985133]  proc_tcp_early_demux+0x29/0x30

I think this is the second time such a patch has been reverted.

Cc: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-28 14:30:46 -07:00
Bhumika Goyal
dfbde55249 nbd: make device_attribute const
Make this const as is is only passed as an argument to the
function device_create_file and device_remove_file and the corresponding
arguments are of type const.
Done using Coccinelle

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-28 15:21:27 -06:00
Jens Axboe
b3c3051220 null_blk: use available 'dev' in nullb_device_power_store()
We already have this pointer, no need to use to_nullb_device()
again.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-28 15:06:31 -06:00
Shaohua Li
060fd198a3 block/nullb: delete unnecessary memory free
Commit 2984c86(nullb: factor disk parameters) has a typo. The
nullb_device allocation/free is done outside of null_add_dev. The commit
accidentally frees the nullb_device in error code path.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-28 15:06:17 -06:00
Jason Ekstrand
3e6fb72d6c drm/syncobj: Add a syncobj_array_find helper
The wait ioctl has a bunch of code to read an syncobj handle array from
userspace and turn it into an array of syncobj pointers.  We're about to
add two new IOCTLs which will need to work with arrays of syncobj
handles so let's make some helpers.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:28:23 +10:00
Jason Ekstrand
e7aca5031a drm/syncobj: Allow wait for submit and signal behavior (v5)
Vulkan VkFence semantics require that the application be able to perform
a CPU wait on work which may not yet have been submitted.  This is
perfectly safe because the CPU wait has a timeout which will get
triggered eventually if no work is ever submitted.  This behavior is
advantageous for multi-threaded workloads because, so long as all of the
threads agree on what fences to use up-front, you don't have the extra
cross-thread synchronization cost of thread A telling thread B that it
has submitted its dependent work and thread B is now free to wait.

Within a single process, this can be implemented in the userspace driver
by doing exactly the same kind of tracking the app would have to do
using posix condition variables or similar.  However, in order for this
to work cross-process (as is required by VK_KHR_external_fence), we need
to handle this in the kernel.

This commit adds a WAIT_FOR_SUBMIT flag to DRM_IOCTL_SYNCOBJ_WAIT which
instructs the IOCTL to wait for the syncobj to have a non-null fence and
then wait on the fence.  Combined with DRM_IOCTL_SYNCOBJ_RESET, you can
easily get the Vulkan behavior.

v2:
 - Fix a bug in the invalid syncobj error path
 - Unify the wait-all and wait-any cases
v3:
 - Unify the timeout == 0 case a bit with the timeout > 0 case
 - Use wait_event_interruptible_timeout
v4:
 - Use proxy fence
v5:
 - Revert to a combination of v2 and v3
 - Don't use proxy fences
 - Don't use wait_event_interruptible_timeout because it just adds an
   extra layer of callbacks

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:28:17 +10:00
Jason Ekstrand
1fc08218ed drm/syncobj: Add a CREATE_SIGNALED flag
This requests that the driver create the sync object such that it
already has a signaled dma_fence attached.  Because we don't need
anything in particular (just something signaled), we use a dummy null
fence.  This is useful for Vulkan which has a similar flag that can be
passed to vkCreateFence.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:27:41 +10:00
Jason Ekstrand
9c19fb10a5 drm/syncobj: Add a callback mechanism for replace_fence (v3)
It is useful in certain circumstances to know when the fence is replaced
in a syncobj.  Specifically, it may be useful to know when the fence
goes from NULL to something valid.  This does make syncobj_replace_fence
a little more expensive because it has to take a lock but, in the common
case where there is no callback list, it spends a very short amount of
time inside the lock.

v2:
 - Don't lock in drm_syncobj_fence_get.  We only really need to lock
   around fence_replace to make the callback work.
v3:
 - Fix the cb_list comment to make kbuild happy

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:26:42 +10:00
Dave Airlie
5e60a10eae drm/syncobj: add sync obj wait interface. (v8)
This interface will allow sync object to be used to back
Vulkan fences. This API is pretty much the vulkan fence waiting
API, and I've ported the code from amdgpu.

v2: accept relative timeout, pass remaining time back
to userspace.
v3: return to absolute timeouts.
v4: absolute zero = poll,
    rewrite any/all code to have same operation for arrays
    return -EINVAL for 0 fences.
v4.1: fixup fences allocation check, use u64_to_user_ptr
v5: move to sec/nsec, and use timespec64 for calcs.
v6: use -ETIME and drop the out status flag. (-ETIME
is suggested by ickle, I can feel a shed painting)
v7: talked to Daniel/Arnd, use ktime and ns everywhere.
v8: be more careful in the timeout calculations
    use uint32_t for counter variables so we don't overflow
    graciously handle -ENOINT being returned from dma_fence_wait_timeout

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:26:32 +10:00
Jason Ekstrand
afca4216b8 i915: Use drm_syncobj_fence_get
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:20:31 +10:00
Jason Ekstrand
309a5482fa drm/syncobj: Add a race-free drm_syncobj_fence_get helper (v2)
The atomic exchange operation in drm_syncobj_replace_fence is sufficient
for the case where it races with itself.  However, if you have a race
between a replace_fence and dma_fence_get(syncobj->fence), you may end
up with the entire replace_fence happening between the point in time
where the one thread gets the syncobj->fence pointer and when it calls
dma_fence_get() on it.  If this happens, then the reference may be
dropped before we get a chance to get a new one.  The new helper uses
dma_fence_get_rcu_safe to get rid of the race.

This is also needed because it allows us to do a bit more than just get
a reference in drm_syncobj_fence_get should we wish to do so.

v2:
 - RCU isn't that scary
 - Call rcu_read_lock/unlock
 - Don't rename fence to _fence
 - Make the helper static inline

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:20:30 +10:00
Jason Ekstrand
afaf592378 drm/syncobj: Rename fence_get to find_fence
The function has far more in common with drm_syncobj_find than with
any in the get/put functions.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-29 06:17:37 +10:00
Martin K. Petersen
07fbd32a6b nvme: honor RTD3 Entry Latency for shutdowns
If an NVMe controller reports RTD3 Entry Latency larger than
shutdown_timeout, up to a maximum of 60 seconds, use that value to set
the shutdown timer. Otherwise fall back to the module parameter which
defaults to 5 seconds.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
[hch: removed do_div, made transition time local scope]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:44 +03:00
Jan H. Schönherr
5228b3280b nvme: fix uninitialized prp2 value on small transfers
The value of iod->first_dma ends up as prp2 in NVMe commands. In case
there is not enough data to cross a page boundary, iod->first_dma is
never initialized and contains random data.

Comply with the NVMe specification and fill in 0 in that case.

Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:44 +03:00
Max Gurtovoy
a7b7c7a105 nvme-rdma: Use unlikely macro in the fast path
This patch slightly improves performance (mainly for small block sizes).

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:43 +03:00
Martin Wilck
17c39d053a nvmet: use memcpy_and_pad for identify sn/fr
This changes the earlier patch "nvmet: don't report 0-bytes
in serial number" to use the memcpy_and_pad() helper introduced
in a previous patch.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Sagi Grimberg <sagi@grimbeg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:42 +03:00
Martin Wilck
01f33c336e string.h: add memcpy_and_pad()
This helper function is useful for the nvme subsystem, and maybe
others.

Note: the warnings reported by the kbuild test robot for this patch
are actually generated by the use of CONFIG_PROFILE_ALL_BRANCHES
together with __FORTIFY_INLINE.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Sagi Grimberg <sagi@grimbeg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:41 +03:00
James Smart
48fa362b6c nvmet-fc: simplify sg list handling
The existing nvmet_fc sg list handling has 2 faults:
a) the request between LLDD and transport has too large of an sg
   list (256 elements), which is normally 256k (64 elements).
b) sglist handling doesn't optimize on the fact that each element
   is a page.

This patch removes the static sg list in the request and uses the
dynamic list already present in the nvmet_fc transport. It also
simplies the handling of the sg list on multiple sequences to
take advantage of the per-page divisions.

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:41 +03:00
James Smart
5533d42480 nvme-fc: Reattach to localports on re-registration
If the LLDD resets or detaches from an fc port, the LLDD will
deregister all remoteports seen by the fc port and deregister the
localport associated with the fc port. The teardown of the localport
structure will be held off due to reference counting until all the
remoteports are removed (and they are held off until all
controllers/associations to terminated). Currently, if the fc port
is reinit/reattached and registered again as a localport it is
treated as an independent entity from the prior localport and all
prior remoteports and controllers cannot be revived. They are
created as new and separate entities.

This patch changes the localport registration to look at the known
localports that are waiting to be torndown. If they are the same port
based on wwn's, the local port is transitioned out of the teardown
state.  This allows the remote ports and controller connections to
be reestablished and resumed as long as the localport can also be
reregistered within the timeout windows.

The patch adds a new routine nvme_fc_attach_to_unreg_lport() with
the functionality and moves the lport get/put routines to avoid
forward references.

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:40 +03:00
Max Gurtovoy
60b43f627a nvme: rename AMS symbolic constants to fit specification
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:39 +03:00
Max Gurtovoy
ad4e05b24c nvme: add symbolic constants for CC identifiers
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:38 +03:00
Sagi Grimberg
caaa15c509 nvme: fix identify namespace logging
Use ctrl->device and lose the func name.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:38 +03:00
Guan Junxiong
9b483da15d nvme-fabrics: log a warning if hostid is invalid
This helps users to quickly spot the reason of why connection fails
if the hostid is not compliant with the uuid format.

Signed-off-by: Guan Junxiong <guanjunxiong@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:37 +03:00
Sagi Grimberg
09fdc23b29 nvme-rdma: call ops->reg_read64 instead of nvmf_reg_read64
To make the nvme_rdma_configure_admin_queue generic in preparation of
moving it to common code.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:36 +03:00
Sagi Grimberg
370ae6e450 nvme-rdma: cleanup error path in controller reset
No need to queue an extra work to indirect controller removal, just call the
ctrl remove routine.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:36 +03:00
Sagi Grimberg
68e16fcfaf nvme-rdma: introduce nvme_rdma_start_queue
This should pair with nvme_rdma_stop_queue.  While this is not a complete
inverse, it still pairs up pretty well because in fabrics we don't have a
disconnect capsule (yet) but we simply teardown the transport association.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:35 +03:00
Sagi Grimberg
41e8cfa117 nvme-rdma: rename nvme_rdma_init_queue to nvme_rdma_alloc_queue
Give it a name symmetric to nvme_rdma_free_queue. Also pass in the ctrl
sqsize+1 and not the opts queue_size.  And suppress a superflous
failure message.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:34 +03:00
Sagi Grimberg
148b4e7ff3 nvme-rdma: stop queues instead of simply flipping their state
If we move the queues from LIVE state, we might as well stop them (drain
for rdma).  Do it after we stop the request queues to prevent a stray
request sneaking in .queue_rq after we stop the queue.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:33 +03:00
Sagi Grimberg
a57bd54122 nvme-rdma: introduce configure/destroy io queues
Make a symmetrical handling with admin queue.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:33 +03:00
Sagi Grimberg
31fdf18401 nvme-rdma: reuse configure/destroy_admin_queue
No need to open-code it.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:32 +03:00
Sagi Grimberg
3f02fffb74 nvme-rdma: don't free tagset on resets
We're not supposed to do that.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-08-28 23:00:27 +03:00