Commit graph

56841 commits

Author SHA1 Message Date
Eric Dumazet
9a568de481 tcp: switch TCP TS option (RFC 7323) to 1ms clock
TCP Timestamps option is defined in RFC 7323

Traditionally on linux, it has been tied to the internal
'jiffies' variable, because it had been a cheap and good enough
generator.

For TCP flows on the Internet, 1 ms resolution would be much better
than 4ms or 10ms (HZ=250 or HZ=100 respectively)

For TCP flows in the DC, Google has used usec resolution for more
than two years with great success [1]

Receive size autotuning (DRS) is indeed more precise and converges
faster to optimal window size.

This patch converts tp->tcp_mstamp to a plain u64 value storing
a 1 usec TCP clock.

This choice will allow us to upstream the 1 usec TS option as
discussed in IETF 97.

[1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17 16:06:01 -04:00
CQ Tang
15060aba71 iommu/vt-d: Helper function to query if a pasid has any active users
A driver would need to know if there are any active references to a
a PASID before cleaning up its resources. This function helps check
if there are any active users of a PASID before it can perform any
recovery on that device.

To: Joerg Roedel <joro@8bytes.org>
To: linux-kernel@vger.kernel.org
To: David Woodhouse <dwmw2@infradead.org>
Cc: Jean-Phillipe Brucker <jean-philippe.brucker@arm.com>
Cc: iommu@lists.linux-foundation.org

Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-05-17 14:57:56 +02:00
Baolin Wang
7d21114dc6 usb: phy: Introduce one extcon device into usb phy
Usually usb phy need register one extcon device to get the connection
notifications. It will remove some duplicate code if the extcon device
is registered using common code instead of each phy driver having its
own related extcon APIs. So we add one pointer of extcon device into
usb phy structure, and some other helper functions to register extcon.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-05-17 14:15:28 +03:00
Mats Karrman
a86c309e71 usb: typec: Don't prevent using constant typec_mode_desc initializers
In some situations, e.g. when registering alternate modes for local typec
ports, it may be handy to use constant mode descriptors. Allow this by
changing the mode descriptor arguments of typec_port_register_altmode()
et.al. to using const pointers.

Signed-off-by: Mats Karrman <mats.dev.list@gmail.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-17 12:20:53 +02:00
Matthias Kaehlcke
3ffad468cf regulator: Allow for asymmetric settling times
Some regulators have different settling times for voltage increases and
decreases. To avoid a time penalty on the faster transition allow for
different settings for up- and downward transitions.

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Acked-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2017-05-17 10:49:25 +01:00
Kuninori Morimoto
ac1e6958d3 of_graph: add of_graph_get_endpoint_count()
OF graph want to count its endpoint number, same as
of_get_child_count(). This patch adds of_graph_get_endpoint_count()

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
2017-05-17 10:21:16 +01:00
Kuninori Morimoto
0ef472a973 of_graph: add of_graph_get_port_parent()
Linux kernel already has of_graph_get_remote_port_parent(),
but, sometimes we want to get own port parent.
This patch adds of_graph_get_port_parent()

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
2017-05-17 10:21:12 +01:00
Kuninori Morimoto
4c9c3d595f of_graph: add of_graph_get_remote_endpoint()
It should use same method to get same result.
To getting remote-endpoint node,
let's use of_graph_get_remote_endpoint()

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
2017-05-17 10:21:08 +01:00
J. Bruce Fields
9512a16b0e nfsd: Revert "nfsd: check for oversized NFSv2/v3 arguments"
This reverts commit 51f5677777 "nfsd: check for oversized NFSv2/v3
arguments", which breaks support for NFSv3 ACLs.

That patch was actually an earlier draft of a fix for the problem that
was eventually fixed by e6838a29ec "nfsd: check for oversized NFSv2/v3
arguments".  But somehow I accidentally left this earlier draft in the
branch that was part of my 2.12 pull request.

Reported-by: Eryu Guan <eguan@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-05-16 16:16:30 -04:00
Andrew Lunn
1b86f702f8 net: phy: Remove residual magic from PHY drivers
commit fa8cddaf90 ("net phylib: Remove unnecessary condition check in phy")
removed the only place where the PHY flag PHY_HAS_MAGICANEG was
checked. But it left the flag being set in the drivers. Remove the flag.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16 15:58:18 -04:00
Eric Dumazet
218af599fa tcp: internal implementation for pacing
BBR congestion control depends on pacing, and pacing is
currently handled by sch_fq packet scheduler for performance reasons,
and also because implemening pacing with FQ was convenient to truly
avoid bursts.

However there are many cases where this packet scheduler constraint
is not practical.
- Many linux hosts are not focusing on handling thousands of TCP
  flows in the most efficient way.
- Some routers use fq_codel or other AQM, but still would like
  to use BBR for the few TCP flows they initiate/terminate.

This patch implements an automatic fallback to internal pacing.

Pacing is requested either by BBR or use of SO_MAX_PACING_RATE option.

If sch_fq happens to be in the egress path, pacing is delegated to
the qdisc, otherwise pacing is done by TCP itself.

One advantage of pacing from TCP stack is to get more precise rtt
estimations, and less work done from TX completion, since TCP Small
queue limits are not generally hit. Setups with single TX queue but
many cpus might even benefit from this.

Note that unlike sch_fq, we do not take into account header sizes.
Taking care of these headers would add additional complexity for
no practical differences in behavior.

Some performance numbers using 800 TCP_STREAM flows rate limited to
~48 Mbit per second on 40Gbit NIC.

If MQ+pfifo_fast is used on the NIC :

$ sar -n DEV 1 5 | grep eth
14:48:44         eth0 725743.00 2932134.00  46776.76 4335184.68      0.00      0.00      1.00
14:48:45         eth0 725349.00 2932112.00  46751.86 4335158.90      0.00      0.00      0.00
14:48:46         eth0 725101.00 2931153.00  46735.07 4333748.63      0.00      0.00      0.00
14:48:47         eth0 725099.00 2931161.00  46735.11 4333760.44      0.00      0.00      1.00
14:48:48         eth0 725160.00 2931731.00  46738.88 4334606.07      0.00      0.00      0.00
Average:         eth0 725290.40 2931658.20  46747.54 4334491.74      0.00      0.00      0.40
$ vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 4  0      0 259825920  45644 2708324    0    0    21     2  247   98  0  0 100  0  0
 4  0      0 259823744  45644 2708356    0    0     0     0 2400825 159843  0 19 81  0  0
 0  0      0 259824208  45644 2708072    0    0     0     0 2407351 159929  0 19 81  0  0
 1  0      0 259824592  45644 2708128    0    0     0     0 2405183 160386  0 19 80  0  0
 1  0      0 259824272  45644 2707868    0    0     0    32 2396361 158037  0 19 81  0  0

Now use MQ+FQ :

lpaa23:~# echo fq >/proc/sys/net/core/default_qdisc
lpaa23:~# tc qdisc replace dev eth0 root mq

$ sar -n DEV 1 5 | grep eth
14:49:57         eth0 678614.00 2727930.00  43739.13 4033279.14      0.00      0.00      0.00
14:49:58         eth0 677620.00 2723971.00  43674.69 4027429.62      0.00      0.00      1.00
14:49:59         eth0 676396.00 2719050.00  43596.83 4020125.02      0.00      0.00      0.00
14:50:00         eth0 675197.00 2714173.00  43518.62 4012938.90      0.00      0.00      1.00
14:50:01         eth0 676388.00 2719063.00  43595.47 4020171.64      0.00      0.00      0.00
Average:         eth0 676843.00 2720837.40  43624.95 4022788.86      0.00      0.00      0.40
$ vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 259832240  46008 2710912    0    0    21     2  223  192  0  1 99  0  0
 1  0      0 259832896  46008 2710744    0    0     0     0 1702206 198078  0 17 82  0  0
 0  0      0 259830272  46008 2710596    0    0     0     0 1696340 197756  1 17 83  0  0
 4  0      0 259829168  46024 2710584    0    0    16     0 1688472 197158  1 17 82  0  0
 3  0      0 259830224  46024 2710408    0    0     0     0 1692450 197212  0 18 82  0  0

As expected, number of interrupts per second is very different.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Van Jacobson <vanj@google.com>
Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16 15:43:31 -04:00
Paolo Abeni
2276f58ac5 udp: use a separate rx queue for packet reception
under udp flood the sk_receive_queue spinlock is heavily contended.
This patch try to reduce the contention on such lock adding a
second receive queue to the udp sockets; recvmsg() looks first
in such queue and, only if empty, tries to fetch the data from
sk_receive_queue. The latter is spliced into the newly added
queue every time the receive path has to acquire the
sk_receive_queue lock.

The accounting of forward allocated memory is still protected with
the sk_receive_queue lock, so udp_rmem_release() needs to acquire
both locks when the forward deficit is flushed.

On specific scenarios we can end up acquiring and releasing the
sk_receive_queue lock multiple times; that will be covered by
the next patch

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16 15:41:29 -04:00
Paolo Abeni
65101aeca5 net/sock: factor out dequeue/peek with offset code
And update __sk_queue_drop_skb() to work on the specified queue.
This will help the udp protocol to use an additional private
rx queue in a later patch.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16 15:41:29 -04:00
Srinivas Pandruvada
138bc7969c iio: hid-sensor-hub: Implement batch mode
HID sensor hubs using Integrated Senor Hub (ISH) has added capability to
support batch mode. This allows host processor to go to sleep for extended
duration, while the sensor hub is storing samples in its internal buffers.

'Commit f4f4673b75 ("iio: add support for hardware fifo")' implements
feature in IIO core to implement such feature. This feature is used in
bmc150-accel-core.c to implement batch mode. This implementation allows
software device buffer watermark to be used as a hint to adjust hardware
FIFO.

But HID sensor hubs don't allow to change internal buffer size of FIFOs.
Instead an additional usage id to set "maximum report latency" is defined.
This allows host to go to sleep upto this latency period without getting
any report. Since there is no ABI to set this latency, a new attribute
"hwfifo_timeout" is added so that user mode can specify a latency.

This change checks presence of usage id to get/set maximum report latency
and if present, it will expose hwfifo_timeout.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2017-05-16 19:44:01 +01:00
Mauro Carvalho Chehab
9bb9a39ce5 ata: update references for libata documentation
The libata documentation is now using ReST. Update references
to it to point to the new place.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2017-05-16 11:25:59 -04:00
Okash Khawaja
12e84c71b7 tty: export tty_open_by_driver
This exports tty_open_by_driver so that it can be called from other
places inside the kernel. The checks for null file pointer are based on
Alan Cox's patch here:
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1215095.html.
Description below is quoted from it:

"[RFC] tty_port: allow a port to be opened with a tty that has no file handle

    Let us create tty objects entirely in kernel space. Untested proposal to
    show why all the ideas around rewriting half the uart stack are not needed.

    With this a kernel created non file backed tty object could be used to handle
    data, and set terminal modes. Not all ldiscs can cope with this as N_TTY in
    particular has to work back to the fs/tty layer.

    The tty_port code is however otherwise clean of file handles as far as I can
    tell as is the low level tty port write path used by the ldisc, the
    configuration low level interfaces and most of the ldiscs.

    Currently you don't have any exposure to see tty hangups because those are
    built around the file layer. However a) it's a fixed port so you probably
    don't care about that b) if you do we can add a callback and c) you almost
    certainly don't want the userspace tear down/rebuild behaviour anyway.

    This should however be sufficient if we wanted for example to enumerate all
    the bluetooth bound fixed ports via ACPI and make them directly available.

    It doesn't deal with the case of a user opening a port that's also kernel
    opened and that would need some locking out (so it returned EBUSY if bound
    to a kernel device of some kind). That needs resolving along with how you
    "up" or "down" your new bluetooth device, or enumerate it while providing
    the existing tty API to avoid regressions (and to debug)."

The exported funtion is used later in this patch set to gain access to tty_struct.

[changed export symbol level - gkh]

Signed-off-by: Okash Khawaja <okash.khawaja@gmail.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-16 14:04:49 +02:00
Mauro Carvalho Chehab
e1b4fc7add fs: update location of filesystems documentation
The filesystem documentation was moved from DocBook to
Documentation/filesystems/. Update it at the sources.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-05-16 08:44:22 -03:00
Mauro Carvalho Chehab
19285f3c46 ata: update references for libata documentation
The libata documentation is now using ReST. Update references
to it to point to the new place.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-05-16 08:44:20 -03:00
Mauro Carvalho Chehab
b6f6c29454 mtd: adjust kernel-docs to avoid Sphinx/kerneldoc warnings
./drivers/mtd/nand/nand_bbt.c:1: warning: no structured comments found
./include/linux/mtd/nand.h:785: ERROR: Unexpected indentation.
./drivers/mtd/nand/nand_base.c:449: WARNING: Definition list ends without a blank line; unexpected unindent.
./drivers/mtd/nand/nand_base.c:1161: ERROR: Unexpected indentation.
./drivers/mtd/nand/nand_base.c:1162: WARNING: Block quote ends without a blank line; unexpected unindent.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-05-16 08:44:18 -03:00
Mauro Carvalho Chehab
d651983dde net: fix some identation issues at kernel-doc markups
Sphinx is very pedantic with regards to identation and
escape sequences:

  ./include/net/sock.h:1967: ERROR: Unexpected indentation.
  ./include/net/sock.h:1969: ERROR: Unexpected indentation.
  ./include/net/sock.h:1970: WARNING: Block quote ends without a blank line; unexpected unindent.
  ./include/net/sock.h:1971: WARNING: Block quote ends without a blank line; unexpected unindent.
  ./include/net/sock.h:2268: WARNING: Inline emphasis start-string without end-string.
  ./net/core/sock.c:2686: ERROR: Unexpected indentation.
  ./net/core/sock.c:2687: WARNING: Block quote ends without a blank line; unexpected unindent.
  ./net/core/datagram.c:182: WARNING: Inline emphasis start-string without end-string.
  ./include/linux/netdevice.h:1444: ERROR: Unexpected indentation.
  ./drivers/net/phy/phy.c:381: ERROR: Unexpected indentation.
  ./drivers/net/phy/phy.c:382: WARNING: Block quote ends without a blank line; unexpected unindent.

- Fix spacing where needed;
- Properly escape constants;
- Use a literal block for a race description.

No functional changes.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-05-16 08:44:14 -03:00
Mauro Carvalho Chehab
771b00a84b net: skbuff.h: properly escape a macro name on kernel-doc
The "%" escape code of kernel-doc only handle letters. It
doesn't handle special chars. So, use the ``literal``
notation. That fixes this warning:

./include/linux/skbuff.h:2695: WARNING: Inline literal start-string without end-string.

No functional changes.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-05-16 08:44:13 -03:00
Mauro Carvalho Chehab
7b4ff1adb5 mutex, futex: adjust kernel-doc markups to generate ReST
There are a few issues on some kernel-doc markups that was
causing troubles with kernel-doc output on ReST format:

./kernel/futex.c:492: WARNING: Inline emphasis start-string without end-string.
./kernel/futex.c:1264: WARNING: Block quote ends without a blank line; unexpected unindent.
./kernel/futex.c:1721: WARNING: Block quote ends without a blank line; unexpected unindent.
./kernel/futex.c:2338: WARNING: Block quote ends without a blank line; unexpected unindent.
./kernel/futex.c:2426: WARNING: Block quote ends without a blank line; unexpected unindent.
./kernel/futex.c:2899: WARNING: Block quote ends without a blank line; unexpected unindent.
./kernel/futex.c:2972: WARNING: Block quote ends without a blank line; unexpected unindent.

Fix them.

No functional changes.

Acked-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-05-16 08:43:25 -03:00
Gao Feng
c953d63548 ebtables: arpreply: Add the standard target sanity check
The info->target comes from userspace and it would be used directly.
So we need to add the sanity check to make sure it is a valid standard
target, although the ebtables tool has already checked it. Kernel needs
to validate anything coming from userspace.

If the target is set as an evil value, it would break the ebtables
and cause a panic. Because the non-standard target is treated as one
offset.

Now add one helper function ebt_invalid_target, and we would replace
the macro INVALID_TARGET later.

Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-05-16 10:24:27 +02:00
Anup Patel
b5dceda1f7 lib/raid6: Add log-of-2 table for RAID6 HW requiring disk position
The raid6_gfexp table represents {2}^n values for 0 <= n < 256. The
Linux async_tx framework pass values from raid6_gfexp as coefficients
for each source to prep_dma_pq() callback of DMA channel with PQ
capability. This creates problem for RAID6 offload engines (such as
Broadcom SBA) which take disk position (i.e. log of {2}) instead of
multiplicative cofficients from raid6_gfexp table.

This patch adds raid6_gflog table having log-of-2 value for any given
x such that 0 <= x < 256. For any given disk coefficient x, the
corresponding disk position is given by raid6_gflog[x]. The RAID6
offload engine driver can use this newly added raid6_gflog table to
get disk position from multiplicative coefficient.

Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Acked-by: Shaohua Li <shli@fb.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2017-05-16 10:01:57 +05:30
Linus Torvalds
a95cfad947 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Track alignment in BPF verifier so that legitimate programs won't be
    rejected on !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS architectures.

 2) Make tail calls work properly in arm64 BPF JIT, from Deniel
    Borkmann.

 3) Make the configuration and semantics Generic XDP make more sense and
    don't allow both generic XDP and a driver specific instance to be
    active at the same time. Also from Daniel.

 4) Don't crash on resume in xen-netfront, from Vitaly Kuznetsov.

 5) Fix use-after-free in VRF driver, from Gao Feng.

 6) Use netdev_alloc_skb_ip_align() to avoid unaligned IP headers in
    qca_spi driver, from Stefan Wahren.

 7) Always run cleanup routines in BPF samples when we get SIGTERM, from
    Andy Gospodarek.

 8) The mdio phy code should bring PHYs out of reset using the shared
    GPIO lines before invoking bus->reset(). From Florian Fainelli.

 9) Some USB descriptor access endian fixes in various drivers from
    Johan Hovold.

10) Handle PAUSE advertisements properly in mlx5 driver, from Gal
    Pressman.

11) Fix reversed test in mlx5e_setup_tc(), from Saeed Mahameed.

12) Cure netdev leak in AF_PACKET when using timestamping via control
    messages. From Douglas Caetano dos Santos.

13) netcp doesn't support HWTSTAMP_FILTER_ALl, reject it. From Miroslav
    Lichvar.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
  ldmvsw: stop the clean timer at beginning of remove
  ldmvsw: unregistering netdev before disable hardware
  net: netcp: fix check of requested timestamping filter
  ipv6: avoid dad-failures for addresses with NODAD
  qed: Fix uninitialized data in aRFS infrastructure
  mdio: mux: fix device_node_continue.cocci warnings
  net/packet: fix missing net_device reference release
  net/mlx4_core: Use min3 to select number of MSI-X vectors
  macvlan: Fix performance issues with vlan tagged packets
  net: stmmac: use correct pointer when printing normal descriptor ring
  net/mlx5: Use underlay QPN from the root name space
  net/mlx5e: IPoIB, Only support regular RQ for now
  net/mlx5e: Fix setup TC ndo
  net/mlx5e: Fix ethtool pause support and advertise reporting
  net/mlx5e: Use the correct pause values for ethtool advertising
  vmxnet3: ensure that adapter is in proper state during force_close
  sfc: revert changes to NIC revision numbers
  net: ch9200: add missing USB-descriptor endianness conversions
  net: irda: irda-usb: fix firmware name on big-endian hosts
  net: dsa: mv88e6xxx: add default case to switch
  ...
2017-05-15 15:50:49 -07:00
Cyrille Pitchen
fe488a5e48 mtd: spi-nor: introduce Octo SPI protocols
This patch starts adding support to Octo SPI protocols (SPI x-y-8).

Op codes for Fast Read and/or Page Program operations using Octo SPI
protocols are not known yet (no JEDEC specification has defined them yet)
but we'd rather introduce the Octo SPI protocols now so it's done as it
should be.

Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Reviewed-by: Marek Vasut <marek.vasut@gmail.com>
2017-05-15 21:56:17 +02:00
Cyrille Pitchen
15f5533152 mtd: spi-nor: introduce Double Transfer Rate (DTR) SPI protocols
This patch introduces support to Double Transfer Rate (DTR) SPI protocols.
DTR is used only for Fast Read operations.

According to manufacturer datasheets, whatever the number of I/O lines
used during instruction (x) and address/mode/dummy (y) clock cycles, DTR
is used only during data (z) clock cycles of SPI x-y-z protocols.

Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Reviewed-by: Marek Vasut <marek.vasut@gmail.com>
2017-05-15 21:56:17 +02:00
Cyrille Pitchen
cfc5604c48 mtd: spi-nor: introduce SPI 1-2-2 and SPI 1-4-4 protocols
This patch changes the prototype of spi_nor_scan(): its 3rd parameter
is replaced by a 'struct spi_nor_hwcaps' pointer, which tells the spi-nor
framework about the actual hardware capabilities supported by the SPI
controller and its driver.

Besides, this patch also introduces a new 'struct spi_nor_flash_parameter'
telling the spi-nor framework about the hardware capabilities supported by
the SPI flash memory and the associated settings required to use those
hardware caps.

Then, to improve the readability of spi_nor_scan(), the discovery of the
memory settings and the memory initialization are now split into two
dedicated functions.

1 - spi_nor_init_params()

The spi_nor_init_params() function is responsible for initializing the
'struct spi_nor_flash_parameter'. Currently this structure is filled with
legacy values but further patches will allow to override some parameter
values dynamically, for instance by reading the JESD216 Serial Flash
Discoverable Parameter (SFDP) tables from the SPI memory.
The spi_nor_init_params() function only deals with the hardware
capabilities of the SPI flash memory: especially it doesn't care about
the hardware capabilities supported by the SPI controller.

2 - spi_nor_setup()

The second function is called once the 'struct spi_nor_flash_parameter'
has been initialized by spi_nor_init_params().
With both 'struct spi_nor_flash_parameter' and 'struct spi_nor_hwcaps',
the new argument of spi_nor_scan(), spi_nor_setup() computes the best
match between hardware caps supported by both the (Q)SPI memory and
controller hence selecting the relevant settings for (Fast) Read and Page
Program operations.

Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Reviewed-by: Marek Vasut <marek.vasut@gmail.com>
2017-05-15 21:56:17 +02:00
Christoph Hellwig
e9679189e3 sunrpc: mark all struct svc_version instances as const
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-05-15 17:42:31 +02:00
Christoph Hellwig
860bda29b9 sunrpc: mark all struct svc_procinfo instances as const
struct svc_procinfo contains function pointers, and marking it as
constant avoids it being able to be used as an attach vector for
code injections.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-15 17:42:31 +02:00
Christoph Hellwig
7fd38af9ca sunrpc: move pc_count out of struct svc_procinfo
pc_count is the only writeable memeber of struct svc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.

This patch moves it into out out struct svc_procinfo, and into a
separate writable array that is pointed to by struct svc_version.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-15 17:42:30 +02:00
Christoph Hellwig
35f297e537 sunrpc: remove kxdrproc_t
Remove the now unused typedef.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-15 17:42:26 +02:00
Christoph Hellwig
63f8de3795 sunrpc: properly type pc_encode callbacks
Drop the resp argument as it can trivially be derived from the rqstp
argument.  With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-05-15 17:42:25 +02:00
Christoph Hellwig
026fec7e7c sunrpc: properly type pc_decode callbacks
Drop the argp argument as it can trivially be derived from the rqstp
argument.  With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-15 17:42:24 +02:00
Christoph Hellwig
8537488b5a sunrpc: properly type pc_release callbacks
Drop the p and resp arguments as they are always NULL or can trivially
be derived from the rqstp argument.  With that all functions now have the
same prototype, and we can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-15 17:42:23 +02:00
Christoph Hellwig
a6beb73272 sunrpc: properly type pc_func callbacks
Drop the argp and resp arguments as they can trivially be derived from
the rqstp argument.  With that all functions now have the same prototype,
and we can remove the unsafe casting to svc_procfunc as well as the
svc_procfunc typedef itself.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-15 17:42:23 +02:00
Christoph Hellwig
499b498810 sunrpc: mark all struct rpc_procinfo instances as const
struct rpc_procinfo contains function pointers, and marking it as
constant avoids it being able to be used as an attach vector for
code injections.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-05-15 17:42:20 +02:00
Christoph Hellwig
1c5876ddbd sunrpc: move p_count out of struct rpc_procinfo
p_count is the only writeable memeber of struct rpc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.

This patch moves it into out out struct rpc_procinfo, and into a
separate writable array that is pointed to by struct rpc_version and
indexed by p_statidx.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-15 17:42:18 +02:00
Christoph Hellwig
73c8dc133a sunrpc: properly type argument to kxdrdproc_t
Pass struct rpc_request as the first argument instead of an untyped blob.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-05-15 17:42:12 +02:00
Christoph Hellwig
c512f36b58 sunrpc: properly type argument to kxdreproc_t
Pass struct rpc_request as the first argument instead of an untyped blob,
and mark the data object as const.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
2017-05-15 17:42:07 +02:00
Andy Shevchenko
14bebd01c5 dmaengine: dw: Remove AVR32 bits from the driver
AVR32 is gone. Now it's time to clean up the driver by removing
leftovers that was used by AVR32 related code.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-05-15 17:07:30 +02:00
Thomas Petazzoni
cc0f51ec11 mtd: nand: export nand_{read,write}_page_raw()
The nand_read_page_raw() and nand_write_page_raw() functions might be
re-used by vendor-specific implementations of the read_page/write_page
functions. Instead of having vendor-specific code duplicate this code,
it is much better to export those functions and allow them to be
re-used.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-05-15 13:18:28 +02:00
Thomas Petazzoni
785818fa83 mtd: nand: add core support for on-die ECC
A number of NAND flashes have a capability called "on-die ECC" where the
NAND chip itself is capable of detecting and correcting errors.

Linux already has support for using the ECC implementation of the NAND
controller, or a software based ECC implementation, but not for using
the ECC implementation of the NAND controller. However, such an
implementation is sometimes useful in situations where the NAND
controller provides ECC algorithms that are not strong enough for the
NAND chip used on the system. A typical case is a NAND chip that
requires a 4-bit ECC, while the NAND controller only provides a 1-bit
ECC algorithm.

This commit introduces the support for the NAND_ECC_ON_DIE ECC mode:

 - Parsing of the "on-die" value for the "nand-ecc-mode" Device Tree
   property

 - Handling NAND_ECC_ON_DIE case in nand_scan_tail(). The idea is that
   the vendor specific code for the NAND chip must implement
   ->read_page() and ->write_page(). It may optionally provide its own
   ->read_page_raw() and ->write_page_raw() as well. For OOB operation,
   we assume the standard operations are good enough, but they can be
   overridden by the vendor specific code if needed.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-05-15 13:18:26 +02:00
Willem de Bruijn
324318f024 netfilter: xtables: zero padding in data_to_user
When looking up an iptables rule, the iptables binary compares the
aligned match and target data (XT_ALIGN). In some cases this can
exceed the actual data size to include padding bytes.

Before commit f77bc5b23f ("iptables: use match, target and data
copy_to_user helpers") the malloc()ed bytes were overwritten by the
kernel with kzalloced contents, zeroing the padding and making the
comparison succeed. After this patch, the kernel copies and clears
only data, leaving the padding bytes undefined.

Extend the clear operation from data size to aligned data size to
include the padding bytes, if any.

Padding bytes can be observed in both match and target, and the bug
triggered, by issuing a rule with match icmp and target ACCEPT:

  iptables -t mangle -A INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT
  iptables -t mangle -D INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT

Fixes: f77bc5b23f ("iptables: use match, target and data copy_to_user helpers")
Reported-by: Paul Moore <pmoore@redhat.com>
Reported-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-05-15 12:51:38 +02:00
Peter Zijlstra
c743f0a5c5 sched/fair, cpumask: Export for_each_cpu_wrap()
More users for for_each_cpu_wrap() have appeared. Promote the construct
to generic cpumask interface.

The implementation is slightly modified to reduce arguments.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Lauro Ramos Venancio <lvenanci@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: lwang@redhat.com
Link: http://lkml.kernel.org/r/20170414122005.o35me2h5nowqkxbv@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15 10:15:23 +02:00
Peter Zijlstra
2e44b7ddf8 sched/clock: Use late_initcall() instead of sched_init_smp()
Core2 marks its TSC unstable in ACPI Processor Idle, which is probed
after sched_init_smp(). Luckily it appears both acpi_processor and
intel_idle (which has a similar check) are mandatory built-in.

This means we can delay switching to stable until after these drivers
have ran (if they were modules, this would be impossible).

Delay the stable switch to late_initcall() to allow these drivers to
mark TSC unstable and avoid difficult stable->unstable transitions.

Reported-by: Lofstedt, Marta <marta.lofstedt@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15 10:15:21 +02:00
Peter Zijlstra
ac1e843f09 sched/clock: Remove unused argument to sched_clock_idle_wakeup_event()
The argument to sched_clock_idle_wakeup_event() has not been used in a
long time. Remove it.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15 10:15:18 +02:00
Peter Zijlstra
b421b22b00 x86/tsc, sched/clock, clocksource: Use clocksource watchdog to provide stable sync points
Currently we keep sched_clock_tick() active for stable TSC in order to
keep the per-CPU state semi up-to-date. The (obvious) problem is that
by the time we detect TSC is borked, our per-CPU state is also borked.

So hook into the clocksource watchdog and call a method after we've
found it to still be stable.

There's the obvious race where the TSC goes wonky between finding it
stable and us running the callback, but closing that is too much work
and not really worth it, since we're already detecting TSC wobbles
after the fact, so we cannot, per definition, fully avoid funny clock
values.

And since the watchdog runs less often than the tick, this is also an
optimization.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-15 10:15:18 +02:00
Ilan Tayari
e29341fb3a net/mlx5: FPGA, Add basic support for Innova
Mellanox Innova is a NIC with ConnectX and an FPGA on the same
board. The FPGA is a bump-on-the-wire and thus affects operation of
the mlx5_core driver on the ConnectX ASIC.

Add basic support for Innova in mlx5_core.

This allows using the Innova card as a regular NIC, by detecting
the FPGA capability bit, and verifying its load state before
initializing ConnectX interfaces.

Also detect FPGA fatal runtime failures and enter error state if
they ever happen.

All new FPGA-related logic is placed in its own subdirectory 'fpga',
which may be built by selecting CONFIG_MLX5_FPGA.
This prepares for further support of various Innova features in later
patchsets.
Additional details about hardware architecture will be provided as
more features get submitted.

Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-05-14 14:24:17 +03:00
Ilan Tayari
0179720d6b net/mlx5: Introduce trigger_health_work function
Introduce new function for entering bad-health state.

This function will be called from FPGA-related logic in a later patch from
asynchronous event (IRQ) context, for that we change the spin lock to an
IRQ-safe one.

Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-05-14 14:22:10 +03:00