Commit graph

791450 commits

Author SHA1 Message Date
Alexander Shishkin
39f10239df stm class: p_sys-t: Add support for CLOCKSYNC packets
This adds support for CLOCKSYNC SyS-T packets, that establish correlation
between the transport clock (STP timestamps) and SyS-T timestamps. These
packets are sent periodically to allow the decoder to keep both time
sources in sync.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-11 12:12:54 +02:00
Alexander Shishkin
d69d5e8311 stm class: Add MIPI SyS-T protocol support
This adds support for MIPI SyS-T protocol as specified in an open
standard [1]. In addition to marking message boundaries, it also
supports tagging messages with the source UUID, to provide better
distinction between trace sources, including payload length and
timestamp in the message's metadata.

This driver adds attributes to STP policy nodes to control/configure
these metadata features.

[1] https://www.mipi.org/specifications/sys-t

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-11 12:12:54 +02:00
Alexander Shishkin
24c7bcb6a7 stm class: Switch over to the protocol driver
Now that the default framing protocol is factored out into its own driver,
switch over to using the driver for writing data. To that end, make the
policy code require a valid protocol name (or absence thereof, which is
equivalent to "p_basic").

Also, to make transition easier, make stm class request "p_basic" module
at initialization time.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-11 12:12:54 +02:00
Alexander Shishkin
a02509f301 stm class: Factor out default framing protocol
The STP framing pattern that the stm class implicitly applies to the
data payload is, in fact, a protocol. This patch moves the relevant code
out of the stm core into its own driver module.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-11 12:12:54 +02:00
Alexander Shishkin
d279a38020 stm class: Add a helper for writing data packets
Add a helper to write a sequence of bytes as STP data packets. This
is used by protocol drivers to output their metadata, as well as the
actual data payload.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-11 12:12:54 +02:00
Alexander Shishkin
c7fd62bc69 stm class: Introduce framing protocol drivers
At the moment, the stm class applies a certain STP framing pattern to
the data as it is written to the underlying STM device. In order to
allow different framing patterns (aka protocols), this patch introduces
the concept of STP protocol drivers, defines data structures and APIs
for the protocol drivers to use.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-11 12:12:54 +02:00
Alexander Shishkin
e967b8bdd4 stm class: Clean up stp_configfs_init
Minor code shortening, no functional changes.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-11 12:12:54 +02:00
Alexander Shishkin
25e3c0062a stm class: Clarify configfs root type/operations names
The current naming of stp-policy root type and group ops is confusing,
rename them for better readability.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-11 12:12:54 +02:00
Alexander Shishkin
cb6102bd99 stm class: Rework policy node fallback
Currently, if no matching policy node can be found for a trace source,
we'll try to use "default" policy node, then, if that doesn't exist,
we'll pick the first node, in order of creation. If that also fails,
we'll allocate M/C range from the beginning of the device's M/C range.

This makes it difficult to know which node (if any) was used in any
particular case.

In order to make things more deterministic, the new order is as follows:
  * if they supply ID string, use that and nothing else,
  * if they are a task, use their task name (comm),
  * use "default", if it exists,
  * return failure, to let them know there is no suitable rule.

This should provide enough convenience with the "default" catch-all node,
while not leaving *everything* to chance. As a side effect, this relaxes
the requirement of using ioctl() for identification with the possibility of
using task names as policy nodes.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-11 12:12:54 +02:00
Pablo Neira Ayuso
d701d81172 netfilter: nft_compat: do not dump private area
Zero pad private area, otherwise we expose private kernel pointer to
userspace. This patch also zeroes the tail area after the ->matchsize
and ->targetsize that results from XT_ALIGN().

Fixes: 0ca743a559 ("netfilter: nf_tables: add compatibility layer for x_tables")
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-11 11:29:53 +02:00
Taehee Yoo
18c0ab8736 netfilter: xt_TEE: add missing code to get interface index in checkentry.
checkentry(tee_tg_check) should initialize priv->oif from dev if possible.
But only netdevice notifier handler can set that.
Hence priv->oif is always -1 until notifier handler is called.

Fixes: 9e2f6c5d78 ("netfilter: Rework xt_TEE netdevice notifier")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-11 11:29:14 +02:00
Taehee Yoo
f24d2d4f95 netfilter: xt_TEE: fix wrong interface selection
TEE netdevice notifier handler checks only interface name. however
each netns can have same interface name. hence other netns's interface
could be selected.

test commands:
   %ip netns add vm1
   %iptables -I INPUT -p icmp -j TEE --gateway 192.168.1.1 --oif enp2s0
   %ip link set enp2s0 netns vm1

Above rule is in the root netns. but that rule could get enp2s0
ifindex of vm1 by notifier handler.

After this patch, TEE rule is added to the per-netns list.

Fixes: 9e2f6c5d78 ("netfilter: Rework xt_TEE netdevice notifier")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-11 11:29:14 +02:00
Fernando Fernandez Mancera
4a3e71b7b7 netfilter: nft_osf: usage from output path is not valid
The nft_osf extension, like xt_osf, is not supported from the output
path.

Fixes: b96af92d6e ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf")
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-11 11:29:14 +02:00
Pablo Neira Ayuso
3b18d5eba4 netfilter: nft_set_rbtree: allow loose matching of closing element in interval
Allow to find closest matching for the right side of an interval (end
flag set on) so we allow lookups in inner ranges, eg. 10-20 in 5-25.

Fixes: ba0e4d9917 ("netfilter: nf_tables: get set elements via netlink")
Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-11 11:29:14 +02:00
Ingo Molnar
0c373344b5 sched/completions/Documentation: Clean up the document some more
Refresh the document:

 - Remove unnecessary liguistic complexity and improve the clarity of the text

 - Improve the explanations all around

 - Remove unnecessary and stale version info

 - Fix whitespace noise

 - Make pseudo-code match kernel style

 - Fix minor syntax errors in pseudo-code

 - Use consistent denotation

 - Mark multi-CPU sequences more explicitly

 - Unbreak line breaks

 - Use quotes to refer to 'struct completion'

 - Use 'IRQ context' and 'IRQs' consistently

 - Improve grammar

 - etc.

Cc: John Garry <john.garry@huawei.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicholas Mc Guire <hofrat@osadl.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: corbet@lwn.net
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/1539183392-239389-1-git-send-email-john.garry@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-11 10:36:23 +02:00
John Garry
7b6abce7e1 sched/completions/Documentation: Fix a couple of punctuation nits
This patch fixes a couple of punctuation nits which can make the document
more correct and readable.

Also missing "()" are added to some function references for consistency.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: corbet@lwn.net
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/1539183392-239389-1-git-send-email-john.garry@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-11 10:36:22 +02:00
Anders Roxell
ef4ab8447a selftests: bpf: install script with_addr.sh
When test_flow_dissector.sh runs it complains that it can't find script
with_addr.sh:

./test_flow_dissector.sh: line 81: ./with_addr.sh: No such file or
directory

Rework so that with_addr.sh gets installed, add it to
TEST_PROGS_EXTENDED variable.

Fixes: 50b3ed57de ("selftests/bpf: test bpf flow dissection")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-11 10:35:45 +02:00
Anders Roxell
d3c72d7a20 selftests: bpf: add config fragment LWTUNNEL
When test_lwt_seg6local.sh was added commit c99a84eac0
("selftests/bpf: test for seg6local End.BPF action") config fragment
wasn't added, and without CONFIG_LWTUNNEL enabled we see this:
Error: CONFIG_LWTUNNEL is not enabled in this kernel.
selftests: test_lwt_seg6local [FAILED]

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-11 10:33:17 +02:00
Jiri Olsa
c85061657e bpftool: Allow add linker flags via EXTRA_LDFLAGS variable
Adding EXTRA_LDFLAGS allowing user to specify extra flags
for LD_FLAGS variable. Also adding LDFLAGS to build command
line.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-11 10:24:53 +02:00
Jiri Olsa
0ef6bf39f0 bpftool: Allow to add compiler flags via EXTRA_CFLAGS variable
Adding EXTRA_CFLAGS allowing user to specify extra flags
for CFLAGS variable.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-11 10:24:52 +02:00
Björn Töpel
cee271678d xsk: do not call synchronize_net() under RCU read lock
The XSKMAP update and delete functions called synchronize_net(), which
can sleep. It is not allowed to sleep during an RCU read section.

Instead we need to make sure that the sock sk_destruct (xsk_destruct)
function is asynchronously called after an RCU grace period. Setting
the SOCK_RCU_FREE flag for XDP sockets takes care of this.

Fixes: fbfc504a24 ("bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-11 10:19:01 +02:00
David S. Miller
b7ec45a868 Merge branch 'hns3-next'
Salil Mehta says:

====================
Adds support of RSS to HNS3 Driver for Rev 2(=0x21) H/W

This patch-set mainly adds new additions related to RSS for the new
hardware Revision 0x21. It also adds support to use RSS hash value
provided by the hardware along with descriptor.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:59:08 -07:00
Peng Li
232fc64b6e net: hns3: Add HW RSS hash information to RX skb
Drivers should call skb_set_hash to set the hash and its type
in an skbuff.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:59:08 -07:00
Jian Shen
d97b307213 net: hns3: Add RSS tuples support for VF
This patch adds RSS tuple support for VF in revision
0x21.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:59:08 -07:00
Jian Shen
374ad29176 net: hns3: Add RSS general configuration support for VF
This patch adds RSS key, hash algorithm configuration support
for VF in revision 0x21.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:59:08 -07:00
Jian Shen
775501a1aa net: hns3: Add new RSS hash algorithm support for PF
This patch adds ETH_RSS_HASH_XOR hash algorithm supports, which
is supported by hw revision 0x21.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:59:08 -07:00
Giacinto Cifelli
4f7617705b qmi_wwan: Added support for Gemalto's Cinterion ALASxx WWAN interface
Added support for Gemalto's Cinterion ALASxx WWAN interfaces
by adding QMI_FIXED_INTF with Cinterion's VID and PID.

Signed-off-by: Giacinto Cifelli <gciofono@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:57:23 -07:00
Parthasarathy Bhuvaragan
e7eb058238 tipc: queue socket protocol error messages into socket receive buffer
In tipc_sk_filter_rcv(), when we detect protocol messages with error we
call tipc_sk_conn_proto_rcv() and let it reset the connection and notify
the socket by calling sk->sk_state_change().

However, tipc_sk_filter_rcv() may have been called from the function
tipc_backlog_rcv(), in which case the socket lock is held and the socket
already awake. This means that the sk_state_change() call is ignored and
the error notification lost. Now the receive queue will remain empty and
the socket sleeps forever.

In this commit, we convert the protocol message into a connection abort
message and enqueue it into the socket's receive queue. By this addition
to the above state change we cover all conditions.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:56:07 -07:00
Jon Maloy
047491ea33 tipc: set link tolerance correctly in broadcast link
In the patch referred to below we added link tolerance as an additional
criteria for declaring broadcast transmission "stale" and resetting the
affected links.

However, the 'tolerance' field of the broadcast link is never set, and
remains at zero. This renders the whole commit without the intended
improving effect, but luckily also with no negative effect.

In this commit we add the missing initialization.

Fixes: a4dc70d46c ("tipc: extend link reset criteria for stale packet retransmission")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:56:07 -07:00
Wei Yongjun
9047fa5d32 phy: phy-ocelot-serdes: fix return value check in serdes_probe()
In case of error, the function syscon_node_to_regmap() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().

Fixes: 51f6b410fc ("phy: add driver for Microsemi Ocelot SerDes muxing")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:54:26 -07:00
David S. Miller
302d20e57a Merge branch 'net-dsa-bcm_sf2-Couple-of-fixes'
Florian Fainelli says:

====================
net: dsa: bcm_sf2: Couple of fixes

Here are two fixes for the bcm_sf2 driver that were found during testing
unbind and analysing another issue during system suspend/resume.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:53:03 -07:00
Florian Fainelli
abd01ba2f7 net: dsa: bcm_sf2: Call setup during switch resume
There is no reason to open code what the switch setup function does, in
fact, because we just issued a switch reset, we would make all the
register get their default values, including for instance, having unused
port be enabled again and wasting power and leading to an inappropriate
switch core clock being selected.

Fixes: 8cfa94984c ("net: dsa: bcm_sf2: add suspend/resume callbacks")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:53:03 -07:00
Florian Fainelli
448765e1cf net: dsa: bcm_sf2: Fix unbind ordering
The order in which we release resources is unfortunately leading to bus
errors while dismantling the port. This is because we set
priv->wol_ports_mask to 0 to tell bcm_sf2_sw_suspend() that it is now
permissible to clock gate the switch. Later on, when dsa_slave_destroy()
comes in from dsa_unregister_switch() and calls
dsa_switch_ops::port_disable, we perform the same dismantling again, and
this time we hit registers that are clock gated.

Make sure that dsa_unregister_switch() is the first thing that happens,
which takes care of releasing all user visible resources, then proceed
with clock gating hardware. We still need to set priv->wol_ports_mask to
0 to make sure that an enabled port properly gets disabled in case it
was previously used as part of Wake-on-LAN.

Fixes: d9338023fb ("net: dsa: bcm_sf2: Make it a real platform device driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:53:02 -07:00
Eric Dumazet
f98ebd47fd net: sched: avoid writing on noop_qdisc
While noop_qdisc.gso_skb and noop_qdisc.skb_bad_txq are not used
in other places, it seems not correct to overwrite their fields
in dev_init_scheduler_queue().

noop_qdisc is essentially a shared and read-only object, even if
it is not marked as const because of some implementation detail.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:49:16 -07:00
David Ahern
d8a66aa254 net/mpls: Implement handler for strict data checking on dumps
Without CONFIG_INET enabled compiles fail with:

net/mpls/af_mpls.o: In function `mpls_dump_routes':
af_mpls.c:(.text+0xed0): undefined reference to `ip_valid_fib_dump_req'

The preference is for MPLS to use the same handler as ipv4 and ipv6
to allow consistency when doing a dump for AF_UNSPEC which walks
all address families invoking the route dump handler. If INET is
disabled then fallback to an MPLS version which can be tighter on
the data checks.

Fixes: e8ba330ac0 ("rtnetlink: Update fib dumps for strict data checking")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:46:17 -07:00
David S. Miller
28b6bfebdd Merge branch 'net-ipv4-fixes-for-PMTU-when-link-MTU-changes'
Sabrina Dubroca says:

====================
net: ipv4: fixes for PMTU when link MTU changes

The first patch adapts the changes that commit e9fa1495d7 ("ipv6:
Reflect MTU changes on PMTU of exceptions for MTU-less routes") did in
IPv6 to IPv4: lower PMTU when the first hop's MTU drops below it, and
raise PMTU when the first hop was limiting PMTU discovery and its MTU
is increased.

The second patch fixes bugs introduced in commit d52e5a7e7c ("ipv4:
lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu") that
only appear once the first patch is applied.

Selftests for these cases were introduced in net-next commit
e44e428f59 ("selftests: pmtu: add basic IPv4 and IPv6 PMTU tests")

v2: add cover letter, and fix a few small things in patch 1
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:44:47 -07:00
Sabrina Dubroca
28d35bcdd3 net: ipv4: don't let PMTU updates increase route MTU
When an MTU update with PMTU smaller than net.ipv4.route.min_pmtu is
received, we must clamp its value. However, we can receive a PMTU
exception with PMTU < old_mtu < ip_rt_min_pmtu, which would lead to an
increase in PMTU.

To fix this, take the smallest of the old MTU and ip_rt_min_pmtu.

Before this patch, in case of an update, the exception's MTU would
always change. Now, an exception can have only its lock flag updated,
but not the MTU, so we need to add a check on locking to the following
"is this exception getting updated, or close to expiring?" test.

Fixes: d52e5a7e7c ("ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:44:46 -07:00
Sabrina Dubroca
af7d6cce53 net: ipv4: update fnhe_pmtu when first hop's MTU changes
Since commit 5aad1de5ea ("ipv4: use separate genid for next hop
exceptions"), exceptions get deprecated separately from cached
routes. In particular, administrative changes don't clear PMTU anymore.

As Stefano described in commit e9fa1495d7 ("ipv6: Reflect MTU changes
on PMTU of exceptions for MTU-less routes"), the PMTU discovered before
the local MTU change can become stale:
 - if the local MTU is now lower than the PMTU, that PMTU is now
   incorrect
 - if the local MTU was the lowest value in the path, and is increased,
   we might discover a higher PMTU

Similarly to what commit e9fa1495d7 did for IPv6, update PMTU in those
cases.

If the exception was locked, the discovered PMTU was smaller than the
minimal accepted PMTU. In that case, if the new local MTU is smaller
than the current PMTU, let PMTU discovery figure out if locking of the
exception is still needed.

To do this, we need to know the old link MTU in the NETDEV_CHANGEMTU
notifier. By the time the notifier is called, dev->mtu has been
changed. This patch adds the old MTU as additional information in the
notifier structure, and a new call_netdevice_notifiers_u32() function.

Fixes: 5aad1de5ea ("ipv4: use separate genid for next hop exceptions")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:44:46 -07:00
Mike Rapoport
7abab7b9b4 net/ipv6: stop leaking percpu memory in fib6 info
The fib6_info_alloc() function allocates percpu memory to hold per CPU
pointers to rt6_info, but this memory is never freed. Fix it.

Fixes: a64efe142f ("net/ipv6: introduce fib6_info struct and helpers")
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:42:54 -07:00
David S. Miller
83b59b46c8 Merge branch 'fore200e-DMA-cleanups-and-fixes'
Christoph Hellwig says:

====================
fore200e DMA cleanups and fixes

The fore200e driver came up during some dma-related audits, so
here is the fallout.  Compile tested (x86 & sparc) only.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:38:50 -07:00
Christoph Hellwig
1d9d8be917 fore200e: check for dma mapping failures
The driver was lacking any handling for failures from the DMA mapping
routines.  With an iommu or swiotlb this can be fatal.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:38:50 -07:00
Christoph Hellwig
0e21b2258a fore200e: don't use GFP_DMA
The driver properly uses the DMA mapping API, so it should not
pointlessly dip into the GFP_DMA pool, which is only 16MB on x86.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:38:50 -07:00
Christoph Hellwig
1335d6fd65 fore200e: devirtualize dma alloc calls
There is no need for an indirection before calling the dma alloc
routines now that we store a struct device in struct fore200e.

Also remove the pointless GFP_ATOMIC for the sbus case, and fix the
up the error handling by removing the 0 dma_addr test - some iommus
can return 0 as a perfectly valid bus address.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:38:50 -07:00
Christoph Hellwig
f3fadcb564 fore200e: devirtualize dma mapping calls
There is no need for an indirection before calling the dma mapping
routines now that we store a struct device in struct fore200e.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:38:50 -07:00
Christoph Hellwig
8b08adbd87 fore200e: remove the align_size field of struct chunk
There is no need for this field, as the only user of it can just use
the local size variable instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:38:50 -07:00
Christoph Hellwig
aff9d262fb fore200e: store a struct device in struct fore200e
This can be used much better than the untyped void pointer containing
either a PCI or platform device.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:38:50 -07:00
Christoph Hellwig
0efe552389 fore200e: simplify fore200e_bus usage
There is no need to have a global array of the ops, instead PCI and sbus
can have their own instances assigned in *_probe.  Also switch to C99
initializers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:38:50 -07:00
Wang Li
4b035271fe net: tun: remove useless codes of tun_automq_select_queue
Because the function __skb_get_hash_symmetric always returns non-zero.

Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: Wang Li <wangli39@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:35:22 -07:00
Jason Wang
0c465be183 virtio_net: ethtool tx napi configuration
Implement ethtool .set_coalesce (-C) and .get_coalesce (-c) handlers.
Interrupt moderation is currently not supported, so these accept and
display the default settings of 0 usec and 1 frame.

Toggle tx napi through setting tx-frames. So as to not interfere
with possible future interrupt moderation, value 1 means tx napi while
value 0 means not.

Only allow the switching when device is down for simplicity.

Link: https://patchwork.ozlabs.org/patch/948149/
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:34:38 -07:00
David S. Miller
1a21cc507b Merge branch 'nfp-flower-speed-up-stats-update-loop'
Jakub Kicinski says:

====================
nfp: flower: speed up stats update loop

This set from Pieter improves performance of processing FW stats
update notifications.  The FW seems to send those at relatively
high rate (roughly ten per second per flow), therefore if we want
to approach the million flows mark we have to be very careful
about our data structures.

We tried rhashtable for stat updates, but according to our experiments
rhashtable lookup on a u32 takes roughly 60ns on an Xeon E5-2670 v3.
Which translate to a hard limit of 16M lookups per second on this CPU,
and, according to perf record jhash and memcmp account for 60% of CPU
usage on the core handling the updates.

Given that our statistic IDs are already array indices, and considering
each statistic is only 24B in size, we decided to forego the use
of hashtables and use a directly indexed array.  The CPU savings are
considerable.

With the recent improvements in TC core and with our own bottlenecks
out of the way Pieter removes the artificial limit of 128 flows, and
allows the driver to install as many flows as FW supports.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 22:32:45 -07:00