Commit graph

781771 commits

Author SHA1 Message Date
Ewan D. Milne
20c4515a1a qed: fix spelling mistake "successffuly" -> "successfully"
Trivial fix to spelling mistake in qed_probe message.

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 00:02:05 -07:00
Ivan Khoronzhuk
d0c694fc7b net: ethernet: ti: cpts: break cycle once late ts is matched
The late ts queue can contain a bunch of skbs while hi rate testing,
no need to check all of them if timestamp is already matched.
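
The early exit described above can be pictured with a small userspace model. This is only a sketch: the queue layout, the `key` field, and the `match_late_ts` name are hypothetical and are not taken from the driver.

```python
from typing import List, Optional, Tuple


def match_late_ts(queue: List[dict], event_key: int) -> Tuple[Optional[dict], int]:
    """Return (matched entry or None, number of entries inspected)."""
    inspected = 0
    for index, skb in enumerate(queue):
        inspected += 1
        if skb["key"] == event_key:
            # Match found: take the entry off the queue and stop scanning.
            return queue.pop(index), inspected
    return None, inspected


# Hypothetical queue of late skbs, identified only by a matching key.
late_queue = [{"key": k} for k in (7, 3, 9, 4, 1)]
matched, inspected = match_late_ts(late_queue, 3)
print(matched, inspected, len(late_queue))
```

Only two of the five queued entries are inspected here, which is the saving the commit is after when the queue is long.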

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 00:00:07 -07:00
Petr Machata
4280129838 selftests: forwarding: mirror_gre_nh: Unset rp_filter on host VRF
The mirrored packets arrive at $h3 encapsulated in GRE/IPv4, with IP
address from 192.0.2.128/28 network. However the interface is configured
as a member of 192.0.2.160/28 and there's no route directing traffic
from the former network through that interface. Correspondingly, the RP
filter on the VRF rejects it.

Therefore turn off the VRF's RP filter.
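
To see why the filter rejects the traffic, the two networks from the message can be compared directly. The sketch below is a simplified userspace model of strict reverse-path filtering that looks only at directly connected networks; the source host 192.0.2.129 and the function name are assumptions chosen for illustration.

```python
import ipaddress

# Networks from the commit message; the concrete source host is an
# assumption picked from inside 192.0.2.128/28.
tunnel_net = ipaddress.ip_network("192.0.2.128/28")
iface_net = ipaddress.ip_network("192.0.2.160/28")
src = ipaddress.ip_address("192.0.2.129")


def strict_rp_filter_accepts(src_addr, iface_networks):
    """Simplified model: accept only if the source address belongs to a
    network reachable through the receiving interface."""
    return any(src_addr in net for net in iface_networks)


print(src in tunnel_net)                            # True
print(strict_rp_filter_accepts(src, [iface_net]))   # False
print(tunnel_net.overlaps(iface_net))               # False
```

With no route for the former network through the interface, the model rejects the packet, which matches the behaviour the commit works around.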

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:59:27 -07:00
Rodrigo Siqueira
3a0709928b drm/vkms: Add vblank events simulated by hrtimers
This commit adds regular vblank events simulated through hrtimers, which
is a feature required by VKMS to mimic real hardware. Additionally,
sending the vblank event after a pageflip is kept in the atomic_flush
function.

Changes since V1:
 - Compute the vblank timer interval per interruption
 Ville Syrjälä and Daniel Vetter:
 - Removes hardcoded vblank interval to get it from user space

Changes since V2:
 Chris Wilson
 - Removes unnecessary algorithm to compute the next period
 Daniel Vetter:
 - Uses drm_calc_timestamping_constants to get the vblank interval
   instead of calculating it manually
 - Adds disable_vblank helper that turns off crtc
 - Simplifies implementation by using drm_crtc_arm_vblank_event
 - Replaces the code in atomic_begin to atomic_flush
 - Removes unnecessary field in vkms_output

Changes since V3:
 Daniel Vetter:
 - Squash "drm/vkms: Add atomic helpers functions" into the commit that
   handles vblank events simulated by hrtimers

Signed-off-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/7709bba40782ec06332d57fff337797b272581fc.1531359228.git.rodrigosiqueiramelo@gmail.com
2018-07-12 08:48:48 +02:00
Rodrigo Siqueira
d16489307a drm/vkms: Add connectors helpers
This patch adds the struct drm_connector_helper_funcs with some
necessary hooks. Additionally, it also adds some missing hooks at
drm_connector_funcs.

Changes since V1:
- None
Change since V2:
 Daniel Vetter:
 - Remove vkms_conn_mode_valid

Signed-off-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/c8ee28b889234e866ef18bce4216385661c48041.1531359228.git.rodrigosiqueiramelo@gmail.com
2018-07-12 08:48:42 +02:00
Eames Trinh
657cd71e8e drm: gma500: Changed __attribute__((packed)) to __packed
Signed-off-by: Eames Trinh <eamestrinh@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180710130021.4499-1-eamestrinh@gmail.com
2018-07-12 08:48:25 +02:00
Rodrigo Siqueira
559e50fd34 drm/vkms: Add dumb operations
VKMS currently does not handle dumb data, and as a consequence, it does
not provide mechanisms for handling gem. This commit adds the necessary
support for gem object/handler and the dumb functions.

Changes since V1:
 Daniel Vetter:
 - Add dumb buffer support to the same patchset
Changes since V2:
 Haneen:
 - Add missing gem_free_object_unlocked callback to fix the warning
   "Memory manager not clean during takedown"

Signed-off-by: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/70b7becc91c6a323dbc15cb5fc912cbdfe4ef7d9.1531359228.git.rodrigosiqueiramelo@gmail.com
2018-07-12 08:47:44 +02:00
Greg Kroah-Hartman
c82705c54f FSI fixes and updates:
 - Reported build fixes
 - Add configuration of send/echo delayus
 - Object lifetime fix
 - Re-arrange some definitions in preparation for adding the CF master

Merge tag 'fsi-updates-2018-07-12' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/linux-fsi into char-misc-next

Ben writes:

FSI fixes and updates:

 - Reported build fixes
 - Add configuration of send/echo delayus
 - Object lifetime fix
 - Re-arrange some definitions in preparation for adding the CF master
2018-07-12 08:42:09 +02:00
Keith Busch
b6e44b4c74 nvme-pci: fix memory leak on probe failure
The nvme driver specific structures need to be initialized prior to
enabling the generic controller so we can unwind on failure without
using the reference counting callbacks so that 'probe' and 'remove'
can be symmetric.

The newly added iod_mempool is the only resource that was being
allocated out of order, and a failure there would leak the generic
controller memory. This patch just moves that allocation above the
controller initialization.

Fixes: 943e942e62 ("nvme-pci: limit max IO size and segments to avoid high order allocations")
Reported-by: Weiping Zhang <zwp10758@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-12 08:23:56 +02:00
Russell King
576cd32082 sfp: fix module initialisation with netdev already up
It has been observed that with a particular order of initialisation,
the netdev can be up, but the SFP module still has its TX_DISABLE
signal asserted.  This occurs when the network device is brought up
before the SFP kernel module has been inserted by userspace.

This occurs because the sfp-bus layer does not hear about the change in
network device state, and so assumes that it is still down.  Set
netdev->sfp when the upstream is registered to work around this problem.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:11:34 -07:00
Russell King
f20a4c46b9 sfp: ensure we clean up properly on bus registration failure
We fail to correctly clean up after a bus registration failure, which
can lead to an incorrect assumption about the registration state of
the upstream or sfp cage.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:11:34 -07:00
David S. Miller
d90a5215c8 Merge branch 'mlxsw-ERSPAN-Take-LACP-state-into-consideration'
Ido Schimmel says:

====================
mlxsw: ERSPAN: Take LACP state into consideration

Petr says:

When offloading mirror-to-gretap, mlxsw needs to preroute the path that
the encapsulated packet will take. That path may include a LAG device
above a front panel port. So far, mlxsw resolved the path to the first
up front panel slave of the LAG interface, but that only reflects
administrative state of the port. It neglects to consider whether the
port actually has a carrier, and what the LACP state is. This patch set
aims to address these problems.

Patch #1 publishes team_port_get_rcu().

Then in patch #2, a new function is introduced,
mlxsw_sp_port_dev_check(). That returns, for a given netdevice that is a
slave of a LAG device, whether that device is "txable", i.e. whether the
LAG master would send traffic through it. Since there's no good place to
put LAG-wide helpers, introduce a new header include/net/lag.h.

Finally in patch #3, fix the slave selection logic to take into
consideration whether a given slave has a carrier and whether it is
txable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:10:20 -07:00
Petr Machata
b5de82f3df mlxsw: spectrum_span: Change LAG lower selection
When offloading mirror-to-gretap, mlxsw needs to preroute the path that
the encapsulated packet will take. That path may include a LAG device
above a front panel port. So far, mlxsw resolved the path to the first
up front panel slave of the LAG interface, but that only reflects
administrative state of the port. It neglects to consider whether the
port actually has a carrier, and what the LACP state is.

So instead of checking upness of the device, check carrier state and
txability.
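
The difference between the old and new selection rules can be sketched in a few lines. This is a userspace model only: the `Port` class, its fields, and the port names are hypothetical stand-ins for LAG slave netdevices, not mlxsw code.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Port:
    # Hypothetical stand-in for a front panel port enslaved to a LAG.
    name: str
    admin_up: bool
    carrier: bool
    txable: bool


def pick_old(ports: Sequence[Port]) -> Optional[Port]:
    """Previous rule as described: first administratively up slave."""
    return next((p for p in ports if p.admin_up), None)


def pick_new(ports: Sequence[Port]) -> Optional[Port]:
    """New rule as described: first slave with carrier that is txable."""
    return next((p for p in ports if p.carrier and p.txable), None)


ports = [
    Port("swp1", admin_up=True, carrier=False, txable=False),
    Port("swp2", admin_up=True, carrier=True, txable=False),
    Port("swp3", admin_up=True, carrier=True, txable=True),
]
print(pick_old(ports).name)  # swp1
print(pick_new(ports).name)  # swp3
```

In this example the old rule picks a port with no carrier, while the new rule skips ahead to the first port that can actually transmit LAG traffic.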

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:10:19 -07:00
Petr Machata
eeed992b77 net: Add lag.h, net_lag_port_dev_txable()
LAG devices (team or bond) recognize for each one of their slave devices
whether LAG traffic is going to be sent through that device. Bond calls
such devices "active", team calls them "txable". When this state
changes, a NETDEV_CHANGELOWERSTATE notification is distributed, together
with a netdev_notifier_changelowerstate_info structure that for LAG
devices includes a tx_enabled flag that refers to the new state. The
notification thus makes it possible to react to the changes in txability
in drivers.

However there's no way to query txability from the outside on demand.
That is problematic namely for mlxsw, which when resolving ERSPAN packet
path, may encounter a LAG device, and needs to determine which of the
slaves it should choose.

To that end, introduce a new function, net_lag_port_dev_txable(), which
determines whether a given slave device is "active" or
"txable" (depending on the flavor of the LAG device). That function then
dispatches to per-LAG-flavor helpers, bond_is_active_slave_dev() resp.
team_port_dev_txable().
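
The dispatch described above might look roughly like the following userspace model. The function names come from the commit message, but the `SlaveDev` class, its fields, and the way the flavor is detected are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class SlaveDev:
    # Hypothetical model of a LAG slave; real netdevices carry this
    # state in bond- or team-specific private data.
    flavor: str                 # "bond" or "team"
    bond_active: bool = False
    team_txable: bool = False


def bond_is_active_slave_dev(dev: SlaveDev) -> bool:
    return dev.bond_active


def team_port_dev_txable(dev: SlaveDev) -> bool:
    return dev.team_txable


def net_lag_port_dev_txable(dev: SlaveDev) -> bool:
    """Dispatch to the per-flavor helper."""
    if dev.flavor == "team":
        return team_port_dev_txable(dev)
    return bond_is_active_slave_dev(dev)


print(net_lag_port_dev_txable(SlaveDev("team", team_txable=True)))   # True
print(net_lag_port_dev_txable(SlaveDev("bond", bond_active=False)))  # False
```

Callers such as a driver resolving a mirror path then need only the one entry point, whichever LAG flavor sits above the port.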

Because there currently is no good place where net_lag_port_dev_txable()
should be added, introduce a new header file, lag.h, which should from
now on hold any logic common to both team and bond. (But keep
netif_is_lag_master() together with the rest of netif_is_*_master()
functions).

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:10:19 -07:00
Petr Machata
3443b00e07 team: Publish team_port_get_rcu()
A follow-up patch adds a new entry point, team_port_dev_txable(). Making
it an ordinary exported function would mean that any module that may
need the service in one of the supported configurations also
unconditionally needs to pull in the team module, whether or not the
user actually intends to create team interfaces.

To prevent that, team_port_dev_txable() is defined in if_team.h, and
therefore all dependencies of that function also need to be
publicly-visible.

Therefore move team_port_get_rcu() from team.c to if_team.h.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:10:19 -07:00
Travis Brown
80fd2d6ca5 macvlan: Change status when lower device goes down
Today macvlan ignores the notification when a lower device goes
administratively down, preventing the lack of connectivity from
bubbling up.

Processing NETDEV_DOWN results in a macvlan state of LOWERLAYERDOWN
with NO-CARRIER which should be easy to interpret in userspace.

2: lower: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
3: macvlan@lower: <NO-CARRIER,BROADCAST,MULTICAST,UP,M-DOWN> mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000

Signed-off-by: Suresh Krishnan <skrishnan@arista.com>
Signed-off-by: Travis Brown <travisb@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:07:22 -07:00
David S. Miller
0e97c4fb18 Merge branch 'tipc-make-link-protocol-more-resilient'
Jon Maloy says:

====================
tipc: make link protocol more resilient

These two commits make the link protocol more resilient to
infrastructures with frequent packet duplication and long delays.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:06:14 -07:00
Jon Maloy
7ea817f4e8 tipc: check session number before accepting link protocol messages
In some virtual environments we observe significantly more packet
reordering and longer delays than we have traditionally been used to.

This makes stricter checks on the session number of incoming link
protocol messages necessary; until now it has only been validated for
RESET messages.

Since the other two message types, ACTIVATE and STATE messages also
carry this number, it is easy to extend the validation check to those
messages.

We also introduce a flag indicating if a link has a valid peer session
number or not. This eliminates the mixing of 32- and 16-bit arithmetic
we are currently using to achieve this.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:06:14 -07:00
Jon Maloy
9012de5089 tipc: add sequence number check for link STATE messages
Some switch infrastructures produce huge amounts of packet duplicates.
This becomes a problem if those messages are STATE/NACK protocol
messages, causing unnecessary retransmissions of already accepted
packets.

We now introduce a unique sequence number per STATE protocol message
so that duplicates can be identified and ignored. This will also be
useful when tracing such cases, and to avert replay attacks when TIPC
is encrypted.

For compatibility reasons we have to introduce a new capability flag
TIPC_LINK_PROTO_SEQNO to handle this new feature.
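
A duplicate filter of this kind can be sketched as follows. The 16-bit width, the wraparound comparison, and the `StateReceiver` class are assumptions for illustration and are not taken from the TIPC code.

```python
SEQ_BITS = 16              # assumed sequence number width
SEQ_MOD = 1 << SEQ_BITS


def is_newer(seq: int, last: int) -> bool:
    """True if seq is ahead of last, allowing for wraparound."""
    diff = (seq - last) % SEQ_MOD
    return 0 < diff < SEQ_MOD // 2


class StateReceiver:
    """Hypothetical receiver that ignores duplicate or stale STATE messages."""

    def __init__(self):
        self.last = None

    def accept(self, seq: int) -> bool:
        if self.last is not None and not is_newer(seq, self.last):
            return False  # duplicate or old message: ignore it
        self.last = seq
        return True


rx = StateReceiver()
results = [rx.accept(s) for s in (65534, 65535, 65535, 0, 65534, 1)]
print(results)  # [True, True, False, True, False, True]
```

The repeated 65535 and the stale 65534 are dropped, while the wrap from 65535 to 0 is still accepted as progress.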

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:06:14 -07:00
David S. Miller
e32f55f373 Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
L2 Fwd Offload & 10GbE Intel Driver Updates 2018-07-09

This patch series is meant to allow support for the L2 forward offload, aka
MACVLAN offload without the need for using ndo_select_queue.

The existing solution currently requires that we use ndo_select_queue in
the transmit path if we want to associate specific Tx queues with a given
MACVLAN interface. In order to get away from this we need to repurpose the
tc_to_txq array and XPS pointer for the MACVLAN interface and use those as
a means of accessing the queues on the lower device. As a result we cannot
offload a device that is configured as multiqueue, however it doesn't
really make sense to configure a macvlan interface as being multiqueue
anyway since it doesn't really have a qdisc of its own in the first place.

The big changes in this set are:
  Allow lower device to update tc_to_txq and XPS map of offloaded MACVLAN
  Disable XPS for single queue devices
  Replace accel_priv with sb_dev in ndo_select_queue
  Add sb_dev parameter to fallback function for ndo_select_queue
  Consolidated ndo_select_queue functions that appeared to be duplicates
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:03:32 -07:00
Deepti Raghavan
4929c9428a tcp: expose both send and receive intervals for rate sample
Congestion control algorithms, which access the rate sample
through the tcp_cong_control function, only have access to the maximum
of the send and receive interval, for cases where the acknowledgment
rate may be inaccurate due to ACK compression or decimation. Algorithms
may want to use send rates and receive rates as separate signals.
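
A small numeric sketch shows why the two intervals can be useful separately. The function and parameter names below are hypothetical, and the figures are made up for illustration.

```python
def rate_pps(delivered: int, interval_us: int) -> float:
    """Packets per second over an interval given in microseconds."""
    return delivered * 1_000_000 / interval_us


# Hypothetical sample: 10 packets sent over 10 ms, but their ACKs arrive
# compressed into a 2 ms burst.
delivered, snd_us, rcv_us = 10, 10_000, 2_000

combined = rate_pps(delivered, max(snd_us, rcv_us))  # what was visible before
send_rate = rate_pps(delivered, snd_us)
recv_rate = rate_pps(delivered, rcv_us)
print(combined, send_rate, recv_rate)  # 1000.0 1000.0 5000.0
```

Taking the maximum interval hides the fact that the ACK-side estimate is five times the send-side one; with both intervals exposed, an algorithm can treat them as separate signals.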

Signed-off-by: Deepti Raghavan <deeptir@mit.edu>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:01:56 -07:00
Vlad Buslov
e0479b670d net: sched: fix unprotected access to rcu cookie pointer
Fix action attribute size calculation function to take rcu read lock and
access act_cookie pointer with rcu dereference.

Fixes: eec94fdb04 ("net: sched: use rcu for action cookie update")
Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 23:01:02 -07:00
David S. Miller
2368957ab5 Merge branch 'cxgb4-move-stats-fetched-from-firmware-to-debugfs'
Rahul Lakkireddy says:

====================
cxgb4: move stats fetched from firmware to debugfs

Some stats are fetched via slow firmware mailbox, which can cause
packet drops under heavy load. So, this series removes these stats
from ethtool -S and exposes them via debugfs.

Patch 1 removes stats fetched via firmware from ethtool -S.
Patch 2 exposes stats removed in Patch 1 via debugfs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:59:39 -07:00
Rahul Lakkireddy
31e5f5c3e9 cxgb4: expose stats fetched from firmware via debugfs
Expose stats obtained from firmware via debugfs. These stats can't
be part of ethtool -S because the slow firmware mailbox can cause
packet drops under heavy load.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:59:38 -07:00
Rahul Lakkireddy
b351b16d8a cxgb4: remove stats fetched from firmware
When running ethtool -S, some stats are requested from firmware.
Since getting these stats via firmware mailbox is slow, some packets
get dropped under heavy load while running ethtool -S.

So, remove these stats from ethtool -S.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:59:38 -07:00
Antoine Tenart
b32b088181 net: mvpp2: explicitly include linux/interrupt.h
The Marvell PPv2 driver uses interrupts and tasklets but does not
explicitly include linux/interrupt.h, relying on implicit includes. This
header in particular is included only by chance, through a long and
illogical chain of inclusions. Fix this so we do not get future build
breaks.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:56:52 -07:00
Jan Dakinevich
fb8ed3af74 cnic: use kvzalloc to allocate memory for csk_tbl
Size of csk_tbl is about 58K, which means 3rd order page allocation.
kvzalloc provides a fallback if no high order memory is available.

Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:55:52 -07:00
Colin Ian King
3c546728df wimax/i2400m: remove redundant variables ack_status, bcf and protocol
Variables ack_status, bcf and protocol are being assigned but are
never used hence they are redundant and can be removed.

Also declare ack_type as unsigned int rather than unsigned to clean
up a checkpatch warning.

Cleans up clang warnings:
warning: variable 'ack_status' set but not used [-Wunused-but-set-variable]
warning: variable 'bcf' set but not used [-Wunused-but-set-variable]
warning: variable 'protocol' set but not used [-Wunused-but-set-variable]

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:54:25 -07:00
Vlad Buslov
01e866bf07 net: sched: act_ife: fix memory leak in ife init
Free params if tcf_idr_check_alloc() returned error.

Fixes: 0190c1d452 ("net: sched: atomically check-allocate action")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:53:00 -07:00
Arjun Vynipadath
8dce04f1fd cxgb4: specify IQTYPE in fw_iq_cmd
congestion argument passed to t4_sge_alloc_rxq() is used
to differentiate between nic/ofld queues.

Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:52:10 -07:00
David S. Miller
57cd07fbf7 Merge branch 'net-ipv6-addr_gen_mode-fixes'
Sabrina Dubroca says:

====================
net/ipv6: addr_gen_mode fixes

This series fixes bugs in handling of the addr_gen_mode option, mainly
related to the sysctl. A minor netlink issue was also present in the
initial commit introducing the option on a per-netdevice basis.

v2: add patch 4, requested by David Ahern during review of v1
    add patch 5, missing documentation for the sysctl
    patches 1, 2, 3 are unchanged
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:50:46 -07:00
Sabrina Dubroca
f168db5e25 Documentation: ip-sysctl.txt: document addr_gen_mode
addr_gen_mode was introduced without documentation; add it now.

Fixes: d35a00b8e3 ("net/ipv6: allow sysctl to change link-local address generation mode")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:50:45 -07:00
Sabrina Dubroca
f24c5987dd net/ipv6: propagate net.ipv6.conf.all.addr_gen_mode to devices
This aligns the addr_gen_mode sysctl with the expected behavior of the
"all" variant.

Fixes: d35a00b8e3 ("net/ipv6: allow sysctl to change link-local address generation mode")
Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:50:45 -07:00
Sabrina Dubroca
bdd72f4133 net/ipv6: reserve room for IFLA_INET6_ADDR_GEN_MODE
inet6_ifla6_size() is called to check how much space is needed by
inet6_fill_link_af() and inet6_fill_ifinfo(), both of which include
the IFLA_INET6_ADDR_GEN_MODE attribute. Reserve some room for it.

Fixes: bc91b0f07a ("ipv6: addrconf: implement address generation modes")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:50:45 -07:00
Sabrina Dubroca
70c30d76e5 net/ipv6: don't reinitialize ndev->cnf.addr_gen_mode on new inet6_dev
The value has already been copied from this netns's devconf_dflt, it
shouldn't be reset to the global kernel default.

Fixes: d35a00b8e3 ("net/ipv6: allow sysctl to change link-local address generation mode")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:50:45 -07:00
Sabrina Dubroca
c6dbf7aaa4 net/ipv6: fix addrconf_sysctl_addr_gen_mode
addrconf_sysctl_addr_gen_mode() has multiple problems. First, it ignores
the errors returned by proc_dointvec().

addrconf_sysctl_addr_gen_mode() calls proc_dointvec() directly, which
writes the value to memory, and then checks if it's valid and may return
EINVAL. If a bad value is given, the value displayed when reading
net.ipv6.conf.foo.addr_gen_mode next time will be invalid. In case the
value provided by the user was valid, addrconf_dev_config() won't be
called since idev->cnf.addr_gen_mode has already been updated.

Fix this in the usual way we deal with values that need to be checked
after the proc_do*() helper has returned: define a local ctl_table and
storage, call proc_dointvec() on that temporary area, then check and
store.
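
The parse, check, then store pattern can be modelled in userspace as below. This is a sketch only: the dictionary standing in for the device configuration, the function name, and the set of valid modes are assumptions, not the kernel implementation.

```python
import errno

VALID_MODES = {0, 1, 2, 3}  # assumed set of valid modes, for illustration


def write_addr_gen_mode(cnf: dict, text: str) -> int:
    """Parse into a temporary, validate it, and only then store it."""
    try:
        new_val = int(text)  # temporary storage, not the live setting
    except ValueError:
        return -errno.EINVAL
    if new_val not in VALID_MODES:
        return -errno.EINVAL  # live value is left untouched
    cnf["addr_gen_mode"] = new_val
    return 0


cnf = {"addr_gen_mode": 0}
print(write_addr_gen_mode(cnf, "7"), cnf["addr_gen_mode"])
print(write_addr_gen_mode(cnf, "1"), cnf["addr_gen_mode"])
```

Because the write lands in a temporary first, a rejected value never becomes visible to a later read, and a valid one can still be compared against the old setting before any reconfiguration is triggered.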

addrconf_sysctl_addr_gen_mode() also writes the new value to the global
ipv6_devconf_dflt, when we're writing to some netns's default, so that
new netns will inherit the value that was set by the change occurring in
any netns. That doesn't make any sense, so let's drop this assignment.

Finally, since addr_gen_mode is a __u32, switch to proc_douintvec().

Fixes: d35a00b8e3 ("net/ipv6: allow sysctl to change link-local address generation mode")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:50:45 -07:00
Jianbo Liu
5e9a0fe492 net/sched: flower: Fix null pointer dereference when run tc vlan command
Zahari issued a tc vlan command without setting vlan_ethtype, which
crashes the kernel. To avoid this, we must check that
tb[TCA_FLOWER_KEY_VLAN_ETH_TYPE] is not null before using it.
Also, we don't need to dump vlan_ethtype or cvlan_ethtype in this case.

Fixes: d64efd0926 ('net/sched: flower: Add supprt for matching on QinQ vlan headers')
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reported-by: Zahari Doychev <zahari.doychev@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:48:13 -07:00
Carlos Maiolino
efe8032773 xfs: Initialize variables in xfs_alloc_get_rec before using them
Make sure we initialize *bno and *len before jumping to the out_bad_rec
label; otherwise we risk calling xfs_warn() with uninitialized variables.

Coverity: 100898
Coverity: 1437081
Coverity: 1437129
Coverity: 1437191
Coverity: 1437201
Coverity: 1437212
Coverity: 1437341
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11 22:26:36 -07:00
Eric Sandeen
a4722a643f xfs: remove unused iolock arg from xfs_break_dax_layouts
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11 22:26:36 -07:00
Brian Foster
bb00b6f1e2 xfs: kill __xfs_buf_submit_common()
Now that there is only one caller, fold the common submission helper
into __xfs_buf_submit().

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11 22:26:35 -07:00
Brian Foster
6af88cda00 xfs: combine [a]sync buffer submission apis
The buffer I/O submission path consists of separate function calls
per type. The buffer I/O type is already controlled via buffer
state (XBF_ASYNC), however, so there is no real need for separate
submission functions.

Combine the buffer submission functions into a single function that
processes the buffer appropriately based on XBF_ASYNC. Retain an
internal helper with a conditional wait parameter to continue to
support batched !XBF_ASYNC submission/completion required by delwri
queues.
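
The shape of the combined interface can be sketched with a toy model. Everything below is hypothetical: the flag value, the `Buf` class, and the function names only mirror the structure described above and say nothing about the real XFS code.

```python
XBF_ASYNC = 1 << 0  # hypothetical flag value for the sketch


class Buf:
    def __init__(self, flags: int = 0):
        self.flags = flags
        self.submitted = False
        self.waited = False


def _submit(buf: Buf, wait: bool) -> None:
    """Internal helper: submit, and optionally wait for completion."""
    buf.submitted = True
    if wait:
        buf.waited = True  # stand-in for blocking on I/O completion


def submit(buf: Buf) -> None:
    """Single entry point: wait only when the buffer is not async."""
    _submit(buf, wait=not (buf.flags & XBF_ASYNC))


def submit_batch(bufs) -> None:
    """Batched sync submission: issue all I/O first, then wait on each."""
    for buf in bufs:
        _submit(buf, wait=False)
    for buf in bufs:
        buf.waited = True


sync_buf, async_buf = Buf(), Buf(XBF_ASYNC)
submit(sync_buf)
submit(async_buf)
print(sync_buf.waited, async_buf.waited)  # True False
```

The internal helper keeps its wait parameter so that a batch of synchronous buffers can all be issued before any of them is waited on.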

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11 22:26:35 -07:00
Brian Foster
e339dd8d8b xfs: use sync buffer I/O for sync delwri queue submission
If a delwri queue occurs of a buffer that sits on a delwri queue
wait list, the queue sets _XBF_DELWRI_Q without changing the state
of ->b_list. This occurs, for example, if another thread beats the
current delwri waiter thread to the buffer lock after I/O
completion. Once the waiter acquires the lock, it removes the buffer
from the wait list and leaves a buffer with _XBF_DELWRI_Q set but
not populated on a list. This results in a lost buffer submission
and in turn can result in assert failures due to _XBF_DELWRI_Q being
set on buffer reclaim or filesystem lockups if the buffer happens to
cover an item in the AIL.

This problem has been reproduced by repeated iterations of xfs/305
on high CPU count (28xcpu) systems with limited memory (~1GB). Dirty
dquot reclaim races with an xfsaild push of a separate dquot backed
by the same buffer such that the buffer sits on the reclaim wait
list at the time xfsaild attempts to queue it. Since the latter
dquot has been flush locked but the underlying buffer not submitted
for I/O, the dquot pins the AIL and causes the filesystem to
livelock.

This race is essentially made possible by the buffer lock cycle
involved with waiting on a synchronous delwri queue submission.
Close the race by using synchronous buffer I/O for respective delwri
queue submission. This means the buffer remains locked across the
I/O and so is inaccessible from other contexts while in the
intermediate wait list state. The sync buffer I/O wait mechanism is
factored into a helper such that sync delwri buffer submission and
serialization are batched operations.

Designed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11 22:26:34 -07:00
Brian Foster
eaebb515f1 xfs: refactor buffer submission into a common helper
Sync and async buffer submission both do generally similar things
with a couple odd exceptions. Refactor the core buffer submission
code into a common helper to isolate buffer submission from
completion handling of synchronous buffer I/O.

This patch does not change behavior. It is a step towards support
for using synchronous buffer I/O via synchronous delwri queue
submission.

Designed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11 22:26:34 -07:00
Brian Foster
5fdd97944e xfs: remove xfs_defer_init() firstblock param
All but one caller of xfs_defer_init() passes in the ->t_firstblock
of the associated transaction. The one outlier is
xlog_recover_process_intents(), which simply passes a dummy value
because a valid pointer is required. This firstblock variable can
simply be removed.

At this point we could remove the xfs_defer_init() firstblock
parameter and initialize ->t_firstblock directly. Even that is not
necessary, however, because ->t_firstblock is automatically
reinitialized in the new transaction on a transaction roll. Since
xfs_defer_init() should never occur more than once on a particular
transaction (since the corresponding finish will roll it), replace
the reinit from xfs_defer_init() with an assert that verifies the
transaction has a NULLFSBLOCK firstblock.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11 22:26:33 -07:00
Brian Foster
9c3bf5da80 xfs: use ->t_firstblock in inode inactivate
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11 22:26:32 -07:00
Brian Foster
f537538921 xfs: use ->t_firstblock in extent swap
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11 22:26:32 -07:00
Brian Foster
381d592848 xfs: use ->t_firstblock in reflink cow block cancel
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11 22:26:31 -07:00
Brian Foster
fb91f4b5d6 xfs: replace no-op firstblock init with ->t_firstblock
xfs_refcount_recover_cow_leftovers() has no need for a firstblock
variable and so passes an unrelated xfs_fsblock_t to
xfs_defer_init() to avoid declaring one. Replace this no-op
initialization with ->t_firstblock. This will be optimized away by
the removal of the xfs_defer_init() firstblock param.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11 22:26:31 -07:00
Brian Foster
058529c5f5 xfs: use ->t_firstblock in dq alloc
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11 22:26:30 -07:00
Brian Foster
64396ff2c2 xfs: remove xfs_alloc_arg firstblock field
The xfs_alloc_arg.firstblock field is used to control the starting
agno for an allocation. The structure already carries a pointer to
the transaction, which carries the current firstblock value.

Remove the field and access ->t_firstblock directly in the
allocation code.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-07-11 22:26:30 -07:00