Commit graph

737480 commits

Author SHA1 Message Date
Wei Yongjun
a31e795a3b net: dsa: lan9303: Fix error return code in lan9303_check_device()
Fix to return error code -ENODEV from the chip not found error handling
case instead of 0 (ret has been overwritten to 0 by lan9303_read()), as
done elsewhere in this function.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:24:54 -05:00
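
The bug pattern described in this commit can be illustrated with a minimal sketch. It is not the driver code: fake_read_chip_id() and check_device() are hypothetical names, and the register layout is an assumption made only for the example.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical register read: stores a value in *val, returns 0 on success. */
static int fake_read_chip_id(uint32_t *val)
{
	*val = 0x12340000;	/* pretend a chip with ID 0x1234 answered */
	return 0;
}

/* Returns 0 if the expected chip is found, a negative errno otherwise. */
static int check_device(uint32_t expected_id)
{
	uint32_t reg;
	int ret;

	ret = fake_read_chip_id(&reg);
	if (ret)
		return ret;	/* propagate the read failure */

	if ((reg >> 16) != expected_id) {
		/*
		 * ret is 0 here because the read succeeded, so "return ret"
		 * would report success. Return an explicit error instead.
		 */
		return -ENODEV;
	}

	return 0;
}
```

With this sketch, check_device(0x9303) returns -ENODEV because the fake chip reports a different ID.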
Paolo Valente
f0ba5ea2fe block, bfq: increase threshold to deem I/O as random
If two processes do I/O close to each other, i.e., are cooperating
processes in BFQ (and CFQ's) nomenclature, then BFQ merges their
associated bfq_queues, so as to get sequential I/O from the union of
the I/O requests of the processes, and thus reach a higher
throughput. A merged queue is then split if its I/O stops being
sequential. In this respect, BFQ deems the I/O of a bfq_queue as
(mostly) sequential only if fewer than 4 I/O requests are random, out
of the last 32 requests inserted into the queue.

Unfortunately, extensive testing (with the interleaved_io benchmark of
the S suite [1], and with real applications spawning cooperating
processes) has clearly shown that, with such a low threshold, only a
rather low I/O throughput may be reached when several cooperating
processes do I/O. In particular, the outcome of each test run was
bimodal: if queue merging occurred and was stable during the test,
then the throughput was close to the peak rate of the storage device,
otherwise the throughput was arbitrarily low (usually around 1/10 of
the peak rate with a rotational device). The probability of an
unlucky outcome grew with the number of cooperating processes: it was
already significant with 5 processes, and close to one with 7 or more
processes.

The cause of the low throughput in the unlucky runs was that the
merged queues containing the I/O of these cooperating processes were
soon split, because they contained more random I/O requests than those
tolerated by the 4/32 threshold, even though
- that I/O would still have allowed the storage device to reach
  peak or almost peak throughput;
- in contrast, the I/O of these processes, if served individually
  (from separate queues), yielded a rather low throughput.

So we repeated our tests with increasing values of the threshold,
until we found the minimum value (19) for which we obtained maximum
throughput, reliably, with at least up to 9 cooperating
processes. Then we checked that the use of that higher threshold value
did not cause any regression for any other benchmark in the suite [1].
This commit raises the threshold to such a higher value.

[1] https://github.com/Algodev-github/S

Signed-off-by: Angelo Ruocco <angeloruocco90@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 09:23:57 -07:00
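
The "N random requests out of the last 32" test described above can be sketched with a 32-bit history word. This is an illustration under assumptions, not the BFQ code: the struct, the function names, and the exact comparison (strictly more than the threshold) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue state: one bit per request, 1 = random. */
struct queue_history {
	uint32_t seek_history;	/* last 32 requests, newest in bit 0 */
};

/* Count the set bits of a 32-bit word. */
static unsigned int count_bits32(uint32_t w)
{
	unsigned int n = 0;

	while (w) {
		w &= w - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Record whether the newest request was random; the oldest bit falls off. */
static void record_request(struct queue_history *q, bool is_random)
{
	q->seek_history = (q->seek_history << 1) | (is_random ? 1u : 0u);
}

/*
 * Assumed rule: the queue counts as random if more than "threshold"
 * of the last 32 requests were random.
 */
static bool queue_is_random(const struct queue_history *q,
			    unsigned int threshold)
{
	return count_bits32(q->seek_history) > threshold;
}
```

Under this sketch, a queue with 10 random requests among its last 32 is treated as random with a threshold of 4 but not with a threshold of 19, which is the kind of merged queue the commit wants to keep intact.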
Damien Le Moal
8dc8146f9c deadline-iosched: Introduce zone locking support
Introduce zone write locking to avoid write request reordering with
zoned block devices. This is achieved using a finer selection of the
next request to dispatch:
1) Any non-write request is always allowed to proceed.
2) Any write to a conventional zone is always allowed to proceed.
3) For a write to a sequential zone, the zone lock is first checked.
   a) If the zone is not locked, the write is allowed to proceed after
      its target zone is locked.
   b) If the zone is locked, the write request is skipped and the next
      request in the dispatch queue tested (back to step 1).

For a write request that has locked its target zone, the zone is
unlocked either when the request completes and the method
deadline_request_completed() is called, or when the request is requeued
using the method deadline_add_request().

Requests targeting a locked zone are always left in the scheduler queue
to preserve the initial write order. If no write request can be
dispatched, allow reads to be dispatched even if the write batch is not
done.

If the device used is not a zoned block device, or if zoned block device
support is disabled, this patch does not modify deadline behavior.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 09:22:17 -07:00
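
The selection rules 1) to 3b) above can be sketched as a small user-space model. All types and names here (struct request, struct zoned_dev, pick_request(), unlock_zone()) are hypothetical and only mirror the steps listed in the commit message; the scheduler's real data structures are not shown in the source.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_ZONES 8	/* hypothetical device with 8 zones */

struct request {
	bool is_write;
	unsigned int zone;	/* target zone number */
};

struct zoned_dev {
	bool zone_is_seq[NR_ZONES];	/* true for sequential zones */
	bool zone_wlocked[NR_ZONES];	/* true while a write is in flight */
};

/*
 * Return the index of the first dispatchable request, or -1 if none.
 * A write to a sequential zone locks that zone before being dispatched.
 */
static int pick_request(struct zoned_dev *dev, const struct request *rq,
			size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!rq[i].is_write)
			return (int)i;		/* 1) non-write request */
		if (!dev->zone_is_seq[rq[i].zone])
			return (int)i;		/* 2) conventional zone */
		if (!dev->zone_wlocked[rq[i].zone]) {
			dev->zone_wlocked[rq[i].zone] = true;
			return (int)i;		/* 3a) lock, then dispatch */
		}
		/* 3b) zone is locked: skip and test the next request */
	}
	return -1;
}

/* Called when a write that locked its zone completes or is requeued. */
static void unlock_zone(struct zoned_dev *dev, unsigned int zone)
{
	dev->zone_wlocked[zone] = false;
}
```

Skipped writes stay in the queue in their original order, so the next write to a zone is only picked once the previous one has released the lock.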
Damien Le Moal
c117bac701 deadline-iosched: Introduce dispatch helpers
Avoid directly referencing the next_rq and fifo_list arrays using the
helper functions deadline_next_request() and deadline_fifo_request() to
facilitate changes in the dispatch request selection in
deadline_dispatch_requests() for zoned block devices.

While at it, also remove the unnecessary forward declaration of the
function deadline_move_request().

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 09:22:17 -07:00
Damien Le Moal
5700f69178 mq-deadline: Introduce zone locking support
Introduce zone write locking to avoid write request reordering with
zoned block devices. This is achieved using a finer selection of the
next request to dispatch:
1) Any non-write request is always allowed to proceed.
2) Any write to a conventional zone is always allowed to proceed.
3) For a write to a sequential zone, the zone lock is first checked.
   a) If the zone is not locked, the write is allowed to proceed after
      its target zone is locked.
   b) If the zone is locked, the write request is skipped and the next
      request in the dispatch queue tested (back to step 1).

For a write request that has locked its target zone, the zone is
unlocked either when the request completes with a call to the method
deadline_request_completed() or when the request is requeued using
dd_insert_request().

Requests targeting a locked zone are always left in the scheduler queue
to preserve the LBA ordering for write requests. If no write request
can be dispatched, allow reads to be dispatched even if the write batch
is not done.

If the device used is not a zoned block device, or if zoned block device
support is disabled, this patch does not modify mq-deadline behavior.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 09:22:17 -07:00
Damien Le Moal
bf09ce56f0 mq-deadline: Introduce dispatch helpers
Avoid directly referencing the next_rq and fifo_list arrays using the
helper functions deadline_next_request() and deadline_fifo_request() to
facilitate changes in the dispatch request selection in
__dd_dispatch_request() for zoned block devices.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 09:22:17 -07:00
Christoph Hellwig
6cc77e9cb0 block: introduce zoned block devices zone write locking
Components relying only on the request_queue structure for accessing
block devices (e.g. I/O schedulers) have a limited knowledge of the
device characteristics. In particular, the device capacity cannot be
easily discovered, which for a zoned block device also results in the
inability to easily know the number of zones of the device (the zone
size is indicated by the chunk_sectors field of the queue limits).

Introduce the nr_zones field to the request_queue structure to simplify
access to this information. Also, add the bitmap seq_zone_bitmap which
indicates which zones of the device are sequential zones (write
preferred or write required) and the bitmap seq_zones_wlock which
indicates if a zone is write locked, that is, if a write request
targeting a zone was dispatched to the device. These fields are
initialized by the low level block device driver (sd.c for ZBC/ZAC
disks). They are not initialized by stacking drivers (device mappers)
handling zoned block devices (e.g. dm-linear).

Using this, I/O schedulers can introduce zone write locking to control
request dispatching to a zoned block device and avoid write request
reordering by limiting to at most a single write request per zone
outside of the scheduler at any time.

Based on previous patches from Damien Le Moal.

Signed-off-by: Christoph Hellwig <hch@lst.de>
[Damien]
* Fixed comments and indentation in blkdev.h
* Changed helper functions
* Fixed this commit message
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 09:22:17 -07:00
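
A rough sketch of how two per-zone bitmaps and the zone size can be combined into a write lock follows. The struct layout, field names, and helpers are hypothetical stand-ins and do not reproduce the request_queue fields; the sketch is also not thread-safe, whereas a real implementation would need an atomic test-and-set.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MAX_ZONES 1024	/* hypothetical upper bound for this sketch */
#define BITMAP_WORDS ((MAX_ZONES + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct zone_info {
	unsigned int nr_zones;
	uint64_t chunk_sectors;			/* zone size in sectors */
	unsigned long seq_zones[BITMAP_WORDS];	/* 1 = sequential zone */
	unsigned long wlocked[BITMAP_WORDS];	/* 1 = zone is write locked */
};

static bool bitmap_test(const unsigned long *map, unsigned int bit)
{
	return (map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}

static void bitmap_set(unsigned long *map, unsigned int bit)
{
	map[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

/* Zone number of a sector, assuming all zones have the same size. */
static unsigned int zone_number(const struct zone_info *zi, uint64_t sector)
{
	return (unsigned int)(sector / zi->chunk_sectors);
}

/* Returns true if a write to this sector may be dispatched now. */
static bool zone_try_write_lock(struct zone_info *zi, uint64_t sector)
{
	unsigned int zno = zone_number(zi, sector);

	if (!bitmap_test(zi->seq_zones, zno))
		return true;	/* conventional zone: no locking needed */
	if (bitmap_test(zi->wlocked, zno))
		return false;	/* another write is already outstanding */
	bitmap_set(zi->wlocked, zno);
	return true;
}
```

Because the lock allows a single outstanding write per sequential zone, the device never sees two writes to the same zone that it could reorder.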
David S. Miller
7892ea23c9 Merge branch 'dsa-Move-padding-into-Broadcom-tagger'
Florian Fainelli says:

====================
net: dsa: Move padding into Broadcom tagger

This patch series moves the padding of short packets to where it belongs
within the DSA Broadcom tagger code. I just found myself doing this for
a third driver, which was a clear indication this was wrong and did not
scale.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:21:32 -05:00
Florian Fainelli
c979da77b3 net: bgmac: Remove short packet padding for DSA
DSA now correctly pads short packets within net/dsa/tag_brcm.c such that
it is no longer necessary to do this within bgmac.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:21:31 -05:00
Florian Fainelli
398aff64d5 net: systemport: Remove short packet padding
Short packet padding added to the driver is only necessary when using
Broadcom tags, but since this is now taken care of by net/dsa/tag_brcm.c,
we are guaranteed to be given correctly padded packets.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:21:31 -05:00
Florian Fainelli
bf08c34086 net: dsa: Move padding into Broadcom tagger
Instead of having the different master network device drivers
potentially used by DSA/Broadcom tags each pad short packets, move the
padding necessary for the switches to accept short packets where it
makes most sense: within tag_brcm.c. This avoids multiplying the number
of similar commits to e.g. bgmac, bcmsysport, etc.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:21:31 -05:00
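
The padding step can be pictured with the following user-space sketch. It does not use the kernel's socket buffer API; pad_for_tag() is a hypothetical helper, and the 4-byte tag length is an assumption made for the example (60 bytes is the minimum Ethernet frame length without the FCS).

```c
#include <stddef.h>
#include <string.h>

#define MIN_FRAME_LEN 60	/* minimum Ethernet frame length without FCS */
#define TAG_LEN 4		/* assumed switch tag length */

/*
 * Zero-pad the frame in buf so that it is still at least MIN_FRAME_LEN
 * bytes long after the switch strips the tag. Returns the new length,
 * or 0 if the buffer capacity is too small to hold the padded frame.
 */
static size_t pad_for_tag(unsigned char *buf, size_t len, size_t cap)
{
	size_t want = MIN_FRAME_LEN + TAG_LEN;

	if (len >= want)
		return len;	/* already long enough */
	if (cap < want)
		return 0;	/* cannot pad in place */
	memset(buf + len, 0, want - len);
	return want;
}
```

Doing this once in the tagger means each master driver can transmit the frame as given, without its own padding logic.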
Fugang Duan
d1616f07e8 net: fec: free/restore resource in related probe error pathes
Fixes in probe error path:
- Restore dev_id before failed_ioremap path.
  Fixes: ("net: fec: restore dev_id in the cases of probe error")
- Call of_node_put(phy_node) before failed_phy path.
  Fixes: ("net: fec: Support phys probed from devicetree and fixed-link")

Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:19:11 -05:00
Quentin Monnet
33c30a8b68 net: sched: fix tcf_block_get_ext() in case CONFIG_NET_CLS is not set
The definition of functions tcf_block_get() and tcf_block_get_ext()
depends on CONFIG_NET_CLS being set. When those functions gained extack
support, only one version of the declaration of those functions was
updated. Function tcf_block_get() was later fixed with commit
3c1490913f ("net: sch: api: fix tcf_block_get").

Change arguments of tcf_block_get_ext() for the case when CONFIG_NET_CLS
is not set.

Fixes: 8d1a77f974 ("net: sch: api: add extack support in tcf_block_get")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:18:27 -05:00
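
The class of bug fixed here, a stub whose argument list drifts away from the real declaration, can be sketched generically. CONFIG_FEATURE, block_get_ext(), and the opaque struct names are hypothetical; only the pattern is taken from the commit message.

```c
#include <stddef.h>

struct block;	/* opaque, hypothetical types */
struct extack;

/* Leave CONFIG_FEATURE undefined to build the stub variant. */
#ifdef CONFIG_FEATURE
int block_get_ext(struct block **p_block, int flags, struct extack *extack);
#else
/* Stub: must take exactly the same arguments as the declaration above. */
static inline int block_get_ext(struct block **p_block, int flags,
				struct extack *extack)
{
	(void)p_block;
	(void)flags;
	(void)extack;
	return 0;
}
#endif

/* A caller that compiles unchanged whether or not CONFIG_FEATURE is set. */
static int caller(void)
{
	struct block *blk = NULL;

	return block_get_ext(&blk, 0, NULL);
}
```

If only one of the two variants gains a new parameter, callers break in exactly one configuration, which is easy to miss without building both.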
Arnd Bergmann
8c11fcc212 mvebu dt64 for 4.16 (part 2)

Merge tag 'mvebu-dt64-4.16-2' of git://git.infradead.org/linux-mvebu into next/dt

Pull "mvebu dt64 for 4.16 (part 2)" from Gregory CLEMENT:

The main change here is the series of commits doing the Armada 7K/8K
CP110 DT de-duplication; it includes the de-duplication itself and
small fixes in the device tree files.

Besides that, there are 2 other patches:
 - One adding the crypto support for Armada 37xx SoCs
 - Another adding Ethernet aliases on A7K/A8K base boards

* tag 'mvebu-dt64-4.16-2' of git://git.infradead.org/linux-mvebu:
  arm64: dts: marvell: add Ethernet aliases
  arm64: dts: marvell: replace cpm by cp0, cps by cp1
  arm64: dts: marvell: de-duplicate CP110 description
  arm64: dts: marvell: use aliases for SPI busses on Armada 7K/8K
  arm64: dts: marvell: use mvebu-icu.h where possible
  arm64: dts: marvell: fix compatible string list for Armada CP110 slave NAND
  arm64: dts: marvell: fix typos in comment describing the NAND controller
  arm64: dts: marvell: use lower case for unit address and reg property
  arm64: dts: marvell: fix watchdog unit address in Armada AP806
  arm64: dts: marvell: armada-37xx: add a crypto node
  ARM64: dts: marvell: armada-cp110: Fix clock resources for various node
  ARM: dts: kirkwood: fix pin-muxing of MPP7 on OpenBlocks A7
2018-01-05 17:17:25 +01:00
Soheil Hassas Yeganeh
0a38806f31 net: revert "Update RFS target at poll for tcp/udp"
In multi-threaded processes, one common architecture is to have
one (or a small number of) threads polling sockets, and a
considerably larger pool of threads reading from and writing to the
sockets. When we set RPS core on tcp_poll() or udp_poll() we essentially
steer all packets of all the polled FDs to one (or a small number of)
cores, creating a bottleneck and/or RPS misprediction.

Another common architecture is to shard FDs among threads pinned
to cores. In such a setting, setting RPS core in tcp_poll() and
udp_poll() is redundant because the RFS core is correctly
set in recvmsg and sendmsg.

Thus, revert the following commit:
c3f1dbaf6e ("net: Update RFS target at poll for tcp/udp").

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:14:57 -05:00
Soheil Hassas Yeganeh
e3f2c4a3db ip: do not set RFS core on error queue reads
We should only record RPS on normal reads and writes.
In single threaded processes, all calls record the same state. In
multi-threaded processes where a separate thread processes
errors, the RFS table mispredicts.

Note that, when CONFIG_RPS is disabled, sock_rps_record_flow
is a noop and no branch is added as a result of this patch.

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:14:56 -05:00
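
The idea of skipping the flow record on error-queue reads can be sketched as follows. Everything here is a hypothetical model: FLAG_ERRQUEUE stands in for the receive flag that marks an error-queue read, and struct flow_state, record_flow(), and on_recvmsg() are illustrative names, not kernel functions.

```c
#include <stdbool.h>

#define FLAG_ERRQUEUE 0x2000	/* hypothetical stand-in for the errqueue flag */

struct flow_state {
	int recorded_cpu;	/* CPU the flow is steered to, -1 if unset */
};

static void record_flow(struct flow_state *fs, int cpu)
{
	fs->recorded_cpu = cpu;
}

/* Returns true if the steering state was updated by this read. */
static bool on_recvmsg(struct flow_state *fs, int flags, int cpu)
{
	if (flags & FLAG_ERRQUEUE)
		return false;	/* error-queue read: leave steering alone */
	record_flow(fs, cpu);
	return true;
}
```

In this model a separate error-handling thread can drain the error queue without pulling the flow away from the CPU where the data is actually read.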
Arnd Bergmann
c503f594d6 Freescale arm64 device tree updates for 4.16:

Merge tag 'imx-dt64-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into next/dt

Pull "Freescale arm64 device tree updates for 4.16" from Shawn Guo:

 - LS1088A updates: add device support for DCFG, qoriq-mc, and USB.
 - Add power monitor device INA220 for ls208xa-rdb board.

* tag 'imx-dt64-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  arm64: dts: ls208xa: add power monitor chip node
  arm64: dts: ls1088a: Add USB support
  arm64: dts: ls1088a: add fsl-mc hardware resource manager node
  arm64: dts: ls1088a: Added dcfg node in ls1088a dtsi
2018-01-05 17:11:29 +01:00
Ming Lei
454be724f6 block: drain queue before waiting for q_usage_counter becoming zero
Commit 055f6e18e0 ("block: Make q_usage_counter also track legacy
requests") made .q_usage_counter track legacy requests as well, but it
never runs and drains the legacy queue before waiting for this counter
to become zero. As a result, I/O hangs in the test that pulls a disk
during I/O.

This patch fixes the issue by draining requests before waiting for
q_usage_counter to become zero. Both Mauricio and chenxiang reported
this issue and observed that it can be fixed by this patch.

Link: https://marc.info/?l=linux-block&m=151192424731797&w=2
Fixes: 055f6e18e08f ("block: Make q_usage_counter also track legacy requests")
Cc: Wen Xiong <wenxiong@us.ibm.com>
Tested-by: "chenxiang (M)" <chenxiang66@hisilicon.com>
Tested-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 09:09:48 -07:00
Arnd Bergmann
7c179f9dff i.MX device tree changes for 4.16:

Merge tag 'imx-dt-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into next/dt

Pull "i.MX device tree changes for 4.16" from Shawn Guo:

 - A few random updates for vf610-zii board: correct switch EEPROM size,
   enable edma1, correct GPIO expander interrupt, add PHYs for switch2
   device.
 - LS1021A device tree updates: add reboot and QSPI device nodes, label
   USB controllers, specify interrupt-affinity for PMU, fix TMR_FIPER1
   setting, enable esdhc device, add Moxa UC-8410A board support.
 - A bunch of patches from Fabio: fix reg - unit address mismatches,
   remove leading zero in unit address, move regulators out of
   simple-bus, move nodes with no reg property out of bus, remove extra
   clock cell, add missing phy-cells to usb-nop-xceiv, etc.
 - A couple series from Hummingboard developers: re-organise device tree
   files for better handling various board versions, and then add the
   new hummingboard2 board support on top of that.
 - Disable AC'97 input pins pad and add support for powering off for
   imx6qdl-udoo board.
 - Convert from fbdev to drm bindings for imx6sx-sdb and imx6sl-evk
   board.
 - Add device tree for Variscite DART-MX6 SoM and Carrier-board support.
 - Add new board support of TS-4600 and TS-7970 from Technologic
   Systems.
 - A series from Stefan to update imx7-colibri device tree and then add
   new version of Toradex Colibri iMX7D board with eMMC support.
 - Other random updates on various board support.

* tag 'imx-dt-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: (126 commits)
  ARM: dts: imx7s: Avoid using label in unit address and reg
  ARM: dts: imx51-zii-rdu1: Add missing #phy-cells to usb-nop-xceiv
  ARM: dts: imx6qdl-hummingboard2: Remove leading zero in unit address
  ARM: dts: ls1021a: add support for Moxa UC-8410A open platform
  ARM: dts: imx51-babbage: Fix the 26MHz clock modelling
  ARM: dts: vf610-zii-dev-rev-b: add PHYs for switch2
  ARM: dts: vf610-zii-dev-rev-b: fix interrupt for GPIO expander
  ARM: dts: vf610-zii-dev: enable edma1
  ARM: dts: ls1021a-twr: Remove extra clock cell
  ARM: dts: ls1021a-qds: Remove extra clock cell
  ARM: dts: imx53: add srtc node
  dt-bindings: imx-gpcv2: Fix the unit address
  ARM: imx: dts: Use lower case for bindings notation
  ARM: dts: imx6q-h100: use usdhc2 VSELECT
  ARM: dts: imx6sx: Add support for PCI power domain
  ARM: dts: imx6sx: Fix PCI non-prefetchable memory range
  ARM: dts: imx6qdl-hummingboard2: rename regulators to match schematic
  ARM: dts: imx6qdl-hummingboard2: add v1.5 som with eMMC
  ARM: dts: imx6qdl-hummingboard2: add v1.5 som without eMMC
  ARM: dts: imx6qdl-hummingboard2: add PWM3 support
  ...
2018-01-05 17:07:32 +01:00
Arnd Bergmann
b55eb1ae91 ASPEED device tree updates for 4.16

Merge tag 'aspeed-4.16-devicetree' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/joel/aspeed into next/dt

Pull "ASPEED device tree updates for 4.16" from Joel Stanley:

Clock driver support:

 Rework all platforms to use proper clock bindings. Linux should now boot
 upstream kernels on ast2400 and ast2500 platforms without out of tree
 patches.

New systems:

 Witherspoon: OpenPower Power9 server manufactured by IBM that uses the ASPEED ast2500
 Zaius: OpenPower Power9 server manufactured by Invatech that uses the ASPEED ast2500
 Q71L: Intel Xeon server manufactured by Qanta that uses the ASPEED ast2400

 We also see updates to the Palmetto and Romulus systems to bring them in
 line with the functionality of those above.

 The systems take advantage of recently added drivers for LPC Snoop
 device and the PWM/Tachometer fan controller.

OpenBMC flash layout:

 The flash layout used by OpenBMC systems is added and the device trees
 now use it.

* tag 'aspeed-4.16-devicetree' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/joel/aspeed:
  ARM: dts: aspeed-evb: Add unit name to memory node
  ARM: dts: aspeed-plametto: Add flash layout and fix memory node
  ARM: dts: aspeed-romulus: Update Romulus system
  ARM: dts: aspeed: Add Qanta Q71L BMC machine
  ARM: dts: aspeed: Add Ingrasys Zaius BMC machine
  ARM: dts: aspeed: Add Witherspoon BMC machine
  ARM: dts: aspeed: Sort ASPEED entries in makefile
  ARM: dts: Add OpenBMC flash layout
  ARM: dts: aspeed: Update license headers
  ARM: dts: aspeed: Remove skeleton.dtsi
  ARM: dts: aspeed: Add LPC Snoop device
  ARM: dts: aspeed: Add PWM and tachometer node
  ARM: dts: aspeed: Add clock phandle to GPIO
  ARM: dts: aspeed: Add flash controller clocks
  ARM: dts: aspeed: Add watchdog clocks
  ARM: dts: aspeed: Add MAC clocks
  ARM: dts: aspeed: Add proper clock references
  ARM: dts: aspeed: Add LPC and child devices
  dt-bindings: gpio: Add ASPEED constants
  dt-bindings: clock: Add ASPEED constants

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-01-05 17:06:04 +01:00
David S. Miller
2e40b823f0 Merge branch 'l2tp-remove-configurable-offset-parameters'
James Chapman says:

====================
l2tp: remove configurable offset parameters

This patch series removes all code to support a configurable offset in
transmitted l2tp packets. Code to handle this is incomplete and buggy
and has been this way for years. If anyone tried to configure an
offset, it would be ignored for L2TPv2 tunnels, or for L2TPv3 tunnels,
could result in L2TPv3 packets being transmitted which are not
compliant with L2TPv3 RFC3931. This patch series removes the support
for configurable offsets.

No known userspace l2tp daemon configures an offset. However,
iproute2's "ip l2tp" command has an offset parameter and if set, the
value is passed to the kernel. This is the most likely use case where
offsets might be configured, e.g.

   ip l2tp add tunnel local 1.1.1.1 remote 1.1.1.2 tunnel_id 1 \
       peer_tunnel_id 2 encap ip
   ip l2tp add session name l2tp0 tunnel_id 1 session_id 1 \
       peer_session_id 2 offset 8

The above would result in packets being transmitted to 1.1.1.2 with 8
bytes padding between the L2TPv3 header and the payload. The peer
would need to be configured with the same offset value. However, the
packets are not compliant with the L2TPv3 RFC, hence I think it's
unlikely that offset is being used. With this patch series applied,
the offset would not be configured. The peer would need to be modified to
remove its offset setting too.

iproute2 should be modified to remove or ignore the ip l2tp offset
parameter.

This issue was discovered when reviewing a patch series from
lorenzo.bianconi@redhat.com which adds another netlink attribute to
configure the expected offset in received L2TPv3 packets. This change
is reverted by this series because offsets do not exist in L2TPv3
packets. These commits are:

  commit f15bc54eee ("l2tp: add peer_offset parameter")
  commit 820da53575 ("l2tp: fix missing print session offset info")

In more detail:

The L2TPv2 protocol supports a variable offset from the L2TPv2 header
to the payload to give the sender implementation some flexibility for
data alignment when adding L2TP headers on to payloads. The offset
value is indicated by an optional field in the L2TP header.  Our L2TP
implementation already detects the presence of the optional offset in
received packets and skips those bytes when parsing packets. All
transmitted L2TPv2 packets are always transmitted with no offset.

L2TPv3 has no optional offset field in the L2TPv3 packet
header. Instead, L2TPv3 defines optional fields in a "Layer-2 Specific
Sublayer". At the time when the original L2TP code was written, there
was talk at IETF of offset being implemented in a new Layer-2 Specific
Sublayer. A L2TP_ATTR_OFFSET netlink attribute was added so that this
offset could be configured and the intention was to allow it to be
also used to set the tx offset for L2TPv2. However, no L2TPv3 offset
was ever specified and the L2TP_ATTR_OFFSET parameter was forgotten
about.

Setting L2TP_ATTR_OFFSET results in L2TPv3 packets being transmitted
with the specified number of bytes padding between L2TPv3 header and
payload. This is not compliant with L2TPv3 RFC3931. So this change
removes the configurable offset altogether while retaining
L2TP_ATTR_OFFSET in the API for backwards compatibility. If
L2TP_ATTR_OFFSET is given, its value is now silently ignored.
====================

Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
Tested-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:03:42 -05:00
James Chapman
4887d8933a l2tp: add comment in API header that L2TP_ATTR_OFFSET is not used
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:03:26 -05:00
James Chapman
900631ee6a l2tp: remove configurable payload offset
If L2TP_ATTR_OFFSET is set to a non-zero value in L2TPv3 tunnels, it
results in L2TPv3 packets being transmitted which might not be
compliant with the L2TPv3 RFC. This patch has l2tp ignore the offset
setting and send all packets with no offset.

In more detail:

L2TPv2 supports a variable offset from the L2TPv2 header to the
payload. The offset value is indicated by an optional field in the
L2TP header.  Our L2TP implementation already detects the presence of
the optional offset and skips that many bytes when handling received
data packets. All transmitted packets are always transmitted with
no offset.

L2TPv3 has no optional offset field in the L2TPv3 packet
header. Instead, L2TPv3 defines optional fields in a "Layer-2 Specific
Sublayer". At the time when the original L2TP code was written, there
was talk at IETF of offset being implemented in a new Layer-2 Specific
Sublayer. A L2TP_ATTR_OFFSET netlink attribute was added so that this
offset could be configured and the intention was to allow it to be
also used to set the tx offset for L2TPv2. However, no L2TPv3 offset
was ever specified and the L2TP_ATTR_OFFSET parameter was forgotten
about.

Setting L2TP_ATTR_OFFSET results in L2TPv3 packets being transmitted
with the specified number of bytes padding between L2TPv3 header and
payload. This is not compliant with L2TPv3 RFC3931. This change
removes the configurable offset altogether while retaining
L2TP_ATTR_OFFSET for backwards compatibility. Any L2TP_ATTR_OFFSET
value is ignored.

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:03:19 -05:00
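
How a receiver can skip the optional L2TPv2 offset is sketched below. The flag bit positions and field order follow the L2TPv2 data header in RFC 2661; the function name and the macro names are hypothetical, and this is not the kernel's parsing code.

```c
#include <stddef.h>
#include <stdint.h>

/* L2TPv2 header flag bits, as laid out in RFC 2661. */
#define HDR_FLAG_L 0x4000	/* Length field present */
#define HDR_FLAG_S 0x0800	/* Ns and Nr fields present */
#define HDR_FLAG_O 0x0200	/* Offset Size field present */

static uint16_t get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the byte offset of the payload in an L2TPv2 data packet,
 * or -1 if the packet is too short for the header it announces.
 */
static long l2tpv2_payload_offset(const uint8_t *pkt, size_t len)
{
	size_t pos = 2;		/* flags and version */
	uint16_t flags;

	if (len < 2)
		return -1;
	flags = get_be16(pkt);

	if (flags & HDR_FLAG_L)
		pos += 2;	/* Length */
	pos += 4;		/* Tunnel ID and Session ID */
	if (flags & HDR_FLAG_S)
		pos += 4;	/* Ns and Nr */
	if (flags & HDR_FLAG_O) {
		if (len < pos + 2)
			return -1;
		/* skip the Offset Size field and the pad it announces */
		pos += 2 + (size_t)get_be16(pkt + pos);
	}
	return len < pos ? -1 : (long)pos;
}
```

The receive side therefore needs no configuration: the offset, if any, is read from each packet, which is why a configurable transmit offset adds nothing for L2TPv2 and has no counterpart in the L2TPv3 header.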
James Chapman
de3b58bc35 l2tp: revert "l2tp: fix missing print session offset info"
Revert commit 820da53575 ("l2tp: fix missing print session offset
info").  The peer_offset parameter is removed.

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:03:13 -05:00
Bart Van Assche
882d4171a8 pktcdvd: Fix a recently introduced NULL pointer dereference
Call bdev_get_queue(bdev) after bdev->bd_disk has been initialized
instead of just before that pointer has been initialized. This patch
prevents the following command

pktsetup 1 /dev/sr0

from triggering the following kernel crash:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000548
IP: pkt_setup_dev+0x2db/0x670 [pktcdvd]
CPU: 2 PID: 724 Comm: pktsetup Not tainted 4.15.0-rc4-dbg+ #1
Call Trace:
 pkt_ctl_ioctl+0xce/0x1c0 [pktcdvd]
 do_vfs_ioctl+0x8e/0x670
 SyS_ioctl+0x3c/0x70
 entry_SYSCALL_64_fastpath+0x23/0x9a

Reported-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Fixes: commit ca18d6f769 ("block: Make most scsi_req_init() calls implicit")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Tested-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Cc: <stable@vger.kernel.org> # v4.13
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 09:03:04 -07:00
James Chapman
863def15b9 l2tp: revert "l2tp: add peer_offset parameter"
Revert commit f15bc54eee ("l2tp: add peer_offset parameter"). This
is removed because it is adding another configurable offset and
configurable offsets are being removed.

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 11:03:04 -05:00
Bart Van Assche
5a0ec388ef pktcdvd: Fix pkt_setup_dev() error path
Commit 523e1d399c ("block: make gendisk hold a reference to its queue")
modified add_disk() and disk_release() but did not update any of the
error paths that trigger a put_disk() call after disk->queue has been
assigned. That introduced the following behavior in the pktcdvd driver
if pkt_new_dev() fails:

Kernel BUG at 00000000e98fd882 [verbose debug info unavailable]

Since disk_release() calls blk_put_queue() anyway if disk->queue != NULL,
fix this by removing the blk_cleanup_queue() call from the pkt_setup_dev()
error path.

Fixes: commit 523e1d399c ("block: make gendisk hold a reference to its queue")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Cc: <stable@vger.kernel.org> # v3.2
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 09:03:03 -07:00
Yan Markman
474c588558 arm64: dts: marvell: add Ethernet aliases
This patch adds Ethernet aliases in the Marvell Armada 7040 DB, 8040 DB
and 8040 mcbin device trees so that the bootloader sets up the MAC
addresses correctly.

Signed-off-by: Yan Markman <ymarkman@marvell.com>
[Antoine: commit message, small fixes]
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2018-01-05 17:02:45 +01:00
Thomas Petazzoni
91f1be92eb arm64: dts: marvell: replace cpm by cp0, cps by cp1
In preparation for the introduction of more than 2 CPs in upcoming
SoCs, it makes sense to move away from the "CP master" (cpm) and "CP
slave" (cps) naming, and use instead cp0/cp1.

This commit is the result of:

 sed 's%cpm%cp0%g' arch/arm64/boot/dts/marvell/*
 sed 's%cps%cp1%g' arch/arm64/boot/dts/marvell/*

So it is a purely mechanical change.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Suggested-by: Hanna Hawa <hannah@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2018-01-05 17:02:43 +01:00
Thomas Petazzoni
72a3713fad arm64: dts: marvell: de-duplicate CP110 description
One concept of Marvell Armada 7K/8K SoCs is that they are made of HW
blocks composed of a variety of IPs (network, PCIe, SATA, XOR, SPI,
I2C, etc.), and those HW blocks can be duplicated several times within
a given SoC. The Armada 7K SoC has a single CP110 (so no duplication),
while the Armada 8K SoC has two CP110s. In the future, SoCs with more
than 2 CP110s will be introduced.

In current kernel versions, the master CP110 is described in
armada-cp110-master.dtsi and the slave CP110 is described in
armada-cp110-slave.dtsi. Those files are basically exactly the same,
since they describe the same hardware. They only have a few
differences:

 - Base address of the registers is different for the "config-space"

 - Base addresses of the PCIe registers, MEM, CONF and IO areas are
   different

 - Labels (and phandles pointing to them) of the nodes are different
   ("cpm" prefix in the master CP, "cps" prefix in the slave CP)

This duplication issue has been discussed at the DT workshop [1] in
Prague last October, and we presented on this topic [2]. The solution
of using the C pre-processor to avoid this duplication has been
validated by the people present in this DT workshop, and this patch
simply implements what has been presented.

We handle differences between the master CP and slave CP description
using the C pre-processor, by defining a set of macros with different
values each time armada-cp110.dtsi is included to instantiate either
the master or the slave CP110.

There are a few aspects that deserve additional explanations:

 - PCIe needs to be handled separately because it is not part of the
   config-space {...} node, since it has registers outside of the
   range covered by config-space {...}.

 - We need to define CP110_BASE, CP110_PCIEx_BASE without 0x, because
   they are used for the unit address part of some DT nodes. But since
   they are also used for the "reg" property of the same nodes, we
   have an ADDRESSIFY() macro that prepends 0x to those values.
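
The token-pasting idea behind ADDRESSIFY() can be sketched in plain C as
follows. This is an illustration only: the helper macro names (PASTER,
EVALUATOR, STRINGIFY), the variable names and the CP110_BASE value used
here are assumptions for the example, and the string merely stands in
for the unit address part of a DT node name.

```c
#include <stdint.h>

/* Two-level expansion so that macro arguments are expanded before pasting. */
#define PASTER(x, y) x##y
#define EVALUATOR(x, y) PASTER(x, y)
#define ADDRESSIFY(addr) EVALUATOR(0x, addr)

/* Two-level stringification, standing in for the unit address of a node. */
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

/* Illustrative value, deliberately written without the 0x prefix. */
#define CP110_BASE f2000000

/* Unit address use: the bare hex digits are needed. */
static const char cp110_node_name[] = "config-space@" STRINGIFY(CP110_BASE);

/* "reg" use: the same digits, now with 0x prepended by the pre-processor. */
static const uint32_t cp110_reg_base = ADDRESSIFY(CP110_BASE);
```

Defining the base once without the prefix keeps a single source of truth
for both spellings of the address.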

We compared the resulting .dtb for armada-8040-db.dtb before and after
this patch is applied, and the result is exactly the same, except for
a few differences:

 - the SDHCI controller that was only described in the master CP110 is
   now also described in the slave CP110. Even though the SDHCI
   controller from the slave CP110 is indeed not usable (as it isn't
   wired to the outside world) it is technically part of the silicon,
   and therefore it is reasonable to also describe it to be part of
   the slave CP110. In addition, if we wanted to get this correct for
   the SDHCI controller, we should also do it for the NAND controller,
   for which the situation is even more complicated: in a single CP110
   configuration (Armada 7K), the usable NAND controller is in the
   master CP110, while in a dual CP110 configuration (Armada 8K), the
   usable NAND controller is in the slave CP110. Since that would add
   a lot of additional complexity for no good reason, and since the IP
   blocks are in fact really present in both CPs, we simply describe
   them in both CPs at the DT level.

 - the cp110-master and cp110-slave nodes are now named cpm and
   cps. We could have kept cp110-master and cp110-slave, but that
   would have required adding another CP110_xyz define, which didn't
   seem very useful.

Note that this commit also gets rid of the armada-cp110-master.dtsi
and armada-cp110-slave.dtsi files, as future SoCs will have more than
2 CPs. Instead, we instantiate the CPs directly from the SoC-specific
.dtsi files, i.e. armada-70x0.dtsi and armada-80x0.dtsi.

[1] https://elinux.org/Device_tree_kernel_summit_2017_etherpad
[2] https://elinux.org/images/1/14/DTWorkshop2017-duplicate-data.pdf

[gregory.clement@free-electrons.com: add back the "ARM64: dts: marvell:
Fix clock resources for various node" commit]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2018-01-05 17:02:41 +01:00
Thomas Petazzoni
e2a393c699 arm64: dts: marvell: use aliases for SPI busses on Armada 7K/8K
We are currently using the cell-index DT property to assign SPI bus
numbers. This property is specific to the spi-orion driver, and
requires each SPI controller to have a unique ID defined in the Device
Tree.

As we are about to merge armada-cp110-master.dtsi and
armada-cp110-slave.dtsi into a single file, those cell-index
properties that differ between the master CP110 and the slave CP110
are a difference that would have to be handled.

In order to avoid this, we switch to using the "aliases" DT node to
assign a unique number to each SPI controller. This is more generic,
and directly handled by the SPI core.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2018-01-05 17:02:40 +01:00
Thomas Petazzoni
af9ad5bcd9 arm64: dts: marvell: use mvebu-icu.h where possible
Back when the ICU Device Tree binding was introduced, we could not use
mvebu-icu.h from the Device Tree files, because the DT files and
mvebu-icu.h were following different merge routes towards Linus'
tree. Now that both have been merged, we can switch the Marvell Armada
CP110 Device Tree files to use the mvebu-icu.h header instead of
duplicating the ICU_GRP_NSR definition.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2018-01-05 17:02:39 +01:00
Thomas Petazzoni
4003e96a7b arm64: dts: marvell: fix compatible string list for Armada CP110 slave NAND
The Armada CP110 slave NAND controller Device Tree description lists
the compatible strings in the wrong order: marvell,armada-8k-nand
should come first. This commit aligns the slave CP110 description
with the master CP110 description in that respect.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2018-01-05 17:02:39 +01:00
Thomas Petazzoni
ab8637ed30 arm64: dts: marvell: fix typos in comment describing the NAND controller
Fix the same typo duplicated in both the master and slave versions of
the armada-cp110-*.dtsi files: s/limiation/limitation/.

[gregory.clement@free-electrons.com: add the commit log]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2018-01-05 17:02:38 +01:00
Thomas Petazzoni
123c27c89c arm64: dts: marvell: use lower case for unit address and reg property
This fixes the following DTC warning:

  <stdout>: Warning (simple_bus_reg): Node /ap806/config-space@f0000000/thermal@6f808C simple-bus unit address format error, expected "6f808c"
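
For illustration, the kind of comparison behind this warning can be
approximated as below. This is a sketch under assumptions, not dtc's
actual code: the function name unit_address_matches() is made up for
the example, and it only models the rule that the text after '@' must
equal the first "reg" address printed as lower-case hex without a 0x
prefix.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical check: does the unit address in node_name match reg_base
 * when reg_base is formatted as lower-case hex?
 */
static bool unit_address_matches(const char *node_name, uint64_t reg_base)
{
	char expected[17]; /* up to 16 hex digits plus the terminator */
	const char *at = strchr(node_name, '@');

	if (!at)
		return false;
	snprintf(expected, sizeof(expected), "%" PRIx64, reg_base);
	/* The comparison is case-sensitive, so "6f808C" != "6f808c". */
	return strcmp(at + 1, expected) == 0;
}
```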

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2018-01-05 17:02:37 +01:00
Thomas Petazzoni
d3ce06b4db arm64: dts: marvell: fix watchdog unit address in Armada AP806
This fixes the following DTC warning:

  Warning (simple_bus_reg): Node /ap806/config-space@f0000000/watchdog@600000 simple-bus unit address format error, expected "610000"

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2018-01-05 17:02:36 +01:00
Antoine Tenart
e2707a288c arm64: dts: marvell: armada-37xx: add a crypto node
This patch adds a crypto node describing the EIP97 engine found in
Armada 37xx SoCs. The cryptographic engine is enabled by default.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2018-01-05 17:02:36 +01:00
Gregory CLEMENT
42a4a26bb4 Merge branch 'mvebu/fixes' into HEAD
2018-01-05 17:02:27 +01:00
Arnd Bergmann
74bd5d56bf net/mlx5e: hide an unused variable
The uplink_rpriv variable was added at the start of the function but
only used inside of an #ifdef:

drivers/net/ethernet/mellanox/mlx5/core/en_tc.c: In function 'mlx5e_route_lookup_ipv6':
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:1549:25: error: unused variable 'uplink_rpriv' [-Werror=unused-variable]

This moves the declaration into that #ifdef as well.

Fixes: 5ed99fb421 ("net/mlx5e: Move ethernet representors data into separate struct")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-05 10:55:34 -05:00
Gregory CLEMENT
e3af9f7c6e ARM64: dts: marvell: armada-cp110: Fix clock resources for various node
On the CP modules found on Armada 7K/8K, many IP blocks also need a
"functional" clock (from the bus). This patch adds them, which fixes
some issues that hang the kernel:

If the Ethernet and sdhci drivers are built as modules and sdhci is
loaded first, then the kernel hangs.

Fixes: bb16ea1742 ("mmc: sdhci-xenon: Fix clock resource by adding an
optional bus clock")
Cc: stable@vger.kernel.org
Reported-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2018-01-05 16:54:40 +01:00
Matias Bjørling
8b7bc84988 lightnvm: pblk: refactor pblk_ppa_comp function
Shorten function to simply return the value of the if statement.
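
A minimal sketch of this kind of simplification, using a made-up
struct example_addr and made-up function names rather than the real
pblk types:

```c
#include <stdint.h>

/* Hypothetical stand-in for an address type with one comparable field. */
struct example_addr {
	uint64_t ppa;
};

/* Before: an if statement that only selects between 1 and 0. */
static inline int example_addr_comp_long(struct example_addr l,
					 struct example_addr r)
{
	if (l.ppa == r.ppa)
		return 1;

	return 0;
}

/* After: return the value of the comparison directly. */
static inline int example_addr_comp(struct example_addr l,
				    struct example_addr r)
{
	return l.ppa == r.ppa;
}
```

Both versions return the same values, since a C equality expression
already evaluates to 1 or 0.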

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00
Javier González
998ba62973 lightnvm: pblk: add iostat support
Since pblk registers its own block device, the iostat accounting is
not automatically done for us. Therefore, add the necessary
accounting logic to satisfy the iostat interface.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00
Javier González
30d82a8631 lightnvm: pblk: print instance name on instance info
Add the instance name to the information printed out on target creation.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00
Javier González
c6847e4e35 lightnvm: pblk: free write buffer on init failure
Refactor the way we free the write buffer to ensure that all entries get
freed in case of an error on the init sequence.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00
Javier González
cc4f5ba1fb lightnvm: pblk: ensure kthread alloc. before kicking it
When creating the write thread, ensure that the kthread has been created
before initializing the timer responsible for kicking it. Otherwise, if
the kthread creation fails or gets killed from user space, we risk
kicking an empty thread structure.

Also, since the kthread creation can be interrupted from user space,
adapt the error path to not report an error when this happens, since it
is intentional that the instance creation is aborted.

Signed-off-by: Javier González <javier@cnexlabs.com>
Updated source to reflect the new timer_setup API.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00
Javier González
8f554597e0 lightnvm: pblk: do not log recovery read errors
On scan recovery, reads can fail. This happens because the first page
for each line is read in order to determine if the line has been used
(and thus needs to be recovered), or not. This can lead to "empty page"
read errors.

Since these errors are normal, do not log them, as they are confusing
when reviewing the logs.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00
Javier González
5d201f0720 lightnvm: pblk: ignore high ecc errors on recovery
On recovery, do not stop L2P recovery if reads report high ECC error
as the data is still available.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00
Javier González
e53927393b lightnvm: set target over-provision on create ioctl
Allow setting the over-provision percentage on target creation. If the
value is not provided, fall back to the default value set by the
target.

In pblk, set the default OP to 11% of the total size of the device.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00
Javier González
a7689938ef lightnvm: pblk: use exact free block counter in RL
Until now, pblk's rate-limiter has used a heuristic to reserve space for
GC I/O given that the over-provision area was fixed.

In preparation for allowing the over-provision area to be defined on
target creation, define a dedicated free_block counter in the
rate-limiter to track the number of blocks being used for user data.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00
Hans Holmberg
aed49e195a lightnvm: pblk: remove pblk_gc_stop
pblk_gc_stop just sets pblk->gc->gc_active to zero, ignoring
the flush parameter. This is plain confusing, so remove the
function and set the gc active flag at the call points instead.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05 08:50:12 -07:00