Commit graph

767030 commits

Author SHA1 Message Date
Linus Torvalds
d8aed8415b Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns updates from Eric Biederman:
 "This is the last couple of vfs bits to enable root in a user namespace
  to mount and manipulate a filesystem with backing store (AKA not a
  virtual filesystem like proc, but a filesystem where the unprivileged
  user controls the content). The target filesystem for this work is
  fuse, and Miklos should be sending you the pull request for the fuse
  bits this merge window.

  The two key patches are "evm: Don't update hmacs in user ns mounts"
  and "vfs: Don't allow changing the link count of an inode with an
  invalid uid or gid". Those close small gaps in the vfs that would be a
  problem if an unprivileged fuse filesystem is mounted.

  The rest of the changes are things that are now safe to allow a root
  user in a user namespace to do with a filesystem they have mounted.
  The most interesting development is that remount is now safe"

* 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  fs: Allow CAP_SYS_ADMIN in s_user_ns to freeze and thaw filesystems
  capabilities: Allow privileged user in s_user_ns to set security.* xattrs
  fs: Allow superblock owner to access do_remount_sb()
  fs: Allow superblock owner to replace invalid owners of inodes
  vfs: Allow userns root to call mknod on owned filesystems.
  vfs: Don't allow changing the link count of an inode with an invalid uid or gid
  evm: Don't update hmacs in user ns mounts
2018-06-04 15:21:19 -07:00
Darrick J. Wong
924cade4df xfs: don't return garbage buffers in xfs_da3_node_read
If we're reading a node in a dir/attr btree and the buffer comes off the
disk with a magic number we don't recognize, don't ASSERT and don't set
a garbage buffer type (0 also triggers ASSERTs).  Instead, report the
corruption, release the buffer, and return -EFSCORRUPTED because that's
what the dabtree is -- corrupt.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04 14:45:30 -07:00
Darrick J. Wong
1f5c071d19 xfs: don't ASSERT on short form btree root pointer of zero
Don't ASSERT if the short form btree root pointer is zero.  Now that we
use xfs_verify_agbno to check all short form btree pointers, we'll let
that log the error and pass it to the upper layers.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04 14:45:30 -07:00
Darrick J. Wong
eeee0d6a9b xfs: btree lookup shouldn't ASSERT on empty btree nodes
If a btree lookup encounters an empty btree node or an empty btree leaf
on a multi-level btree, that's evidence of a corrupt on-disk btree.
Therefore, we should return -EFSCORRUPTED to the upper levels, not an
ASSERT failure.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04 14:45:30 -07:00
Darrick J. Wong
a37f7b127e xfs: xfs_alloc_get_rec should return EFSCORRUPTED for obvious bnobt corruption
Return -EFSCORRUPTED when the bnobt/cntbt return obviously corrupt
values, rather than letting them bounce around in the internal code.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04 14:45:30 -07:00
Darrick J. Wong
b3986010ce xfs: remove redundant ASSERT on insufficient bestfree length in _leaf_addname
In xfs_dir2_leaf_addname we ASSERT if the length of the unused space
described by bestfree[0] is less the amount of space we wish to consume.
Immediately after it is a call to xfs_dir2_data_use_free where the
offset parameter is offset of the unused space and the length parameter
is the amount of space we wish to consume.  Both values (and the unused
space pointer) are passed into xfs_dir2_data_check_free, which also
validates that the region of unused space is big enough to cover the
space we wish to consume.  This is effectively the same check that the
ASSERT covers, and since a check failure results in a corruption message
being logged we can remove the ASSERT.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04 14:45:29 -07:00
Darrick J. Wong
17ba2cc7b5 xfs: don't assert when reporting on-disk corruption while loading btree
Don't bother ASSERTing when we're already going to log and return the
corruption status.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04 14:45:29 -07:00
Darrick J. Wong
aaacdd257f xfs: don't forbid setting dax flag on directories if device doesn't dax
On a directory, the DAX flag is merely a hint that files created in the
directory should have the DAX flag set at creation time.  We don't care
if the underlying device supports DAX or not because directory metadata
are always cached in DRAM.  We don't care if new files get the flag even
if the device doesn't support DAX because we always check for DAX
support before setting the VFS flag (S_DAX).

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-06-04 14:45:29 -07:00
Linus Torvalds
325520142b some smb3 fixes for stable, as well as addition of ftrace hooks for cifs.ko, and improvements in compounding and smbdirect (RDMA)
-----BEGIN PGP SIGNATURE-----
 
 iQHHBAABCgAxFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAlsUs3QTHHNtZnJlbmNo
 QGdtYWlsLmNvbQAKCRCKLL1wB3JPUWFIDACpB/nxQT0l+SMfXe+7W5Txy6x1hasi
 fFtcQd03dSzTwGN6ANPmsszktu5LVU1htifwSDIMj9x6MLE+VRCFhuWEszs6N/WD
 kQO3DQmuyB3w+s8rkN2V+mjamDuf0jTiq2uLsTvicUsyEmJtJgKJdcmB+bDxXTcE
 AQbh5qjMtQIl4IyI6cqxIRm/6laeVyGurJKYGp2G8UcyLg6ddv/vp5mXDIdoCnrv
 /gvzi/A7a7UwTEZolFa2vNT/KTLofG8fstdtfqlSa9qTWZwjhoFPKVd3eQPJV3Fj
 SjTpwbnM4q0MSY7XijPMkSpnBniaF3e5bf3Cp6QmtQeU+FmfkoYOy7EUlZJ1ar15
 xNjLEl/EEfy1C3hIwsdoVB8iLFPhFlAMONUWGjs2cyiTRMZx2OP4VgbyN8pcZUcJ
 mGNbBx2SQDJo8iR579dRh2gl6Gy/i5DNOClyIPfEpzGTURrzoV0C9EAnoweyK1qh
 wxvdA7SfPRi2rw9a4Kg+PJxWnd8oNRyhzUw=
 =EXH7
 -----END PGP SIGNATURE-----

Merge tag '4.18-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs updates from Steve French:

 - smb3 fixes for stable

 - addition of ftrace hooks for cifs.ko

 - improvements in compounding and smbdirect (rdma)

* tag '4.18-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: (38 commits)
  CIFS: Add support for direct pages in wdata
  CIFS: Use offset when reading pages
  CIFS: Add support for direct pages in rdata
  cifs: update multiplex loop to handle compounded responses
  cifs: remove header_preamble_size where it is always 0
  cifs: remove struct smb2_hdr
  CIFS: 511c54a2f6 adds a check for session expiry, status STATUS_NETWORK_SESSION_EXPIRED, however the server can also respond with STATUS_USER_SESSION_DELETED in cases where the session has been idle for some time and the server reaps the session to recover resources.
  cifs: change smb2_get_data_area_len to take a smb2_sync_hdr as argument
  cifs: update smb2_calc_size to use smb2_sync_hdr instead of smb2_hdr
  cifs: remove struct smb2_oplock_break_rsp
  cifs: remove rfc1002 header from all SMB2 response structures
  smb3: on reconnect set PreviousSessionId field
  smb3: Add posix create context for smb3.11 posix mounts
  smb3: add tracepoints for smb2/smb3 open
  cifs: add debug output to show nocase mount option
  smb3: add define for id for posix create context and corresponding struct
  cifs: update smb2_check_message to handle PDUs without a 4 byte length header
  smb3: allow "posix" mount option to enable new SMB311 protocol extensions
  smb3: add support for posix negotiate context
  cifs: allow disabling less secure legacy dialects
  ...
2018-06-04 14:42:46 -07:00
Linus Torvalds
1e43938bfb We've got 9 more patches for this merge window.
1. Andreas Gruenbacher contributed a patch to remove sd_jheightsize to
    greatly simplify some code.
 2. Andreas fixed some comments.
 3. Andreas fixed a glock recursion bug when allocation errors occur.
 4. Andreas improved the hole_size function so it returns the entire hole
    rather than figuring it out piecemeal.
 5. Andreas cleaned up gfs2_stuffed_write_end to remove a lot of redundancy.
 6. Andreas clarified code with regard to the way ordered writes are processed.
 7. Andreas did a bunch of improvements and cleanups of the iomap code to
    pave the way for iomap writes, which is a future patch set.
 8. I fixed a bug where block reservations can run off the end of a bitmap.
 9. I added Andreas to the MAINTAINERS file.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbFXdOAAoJENeLYdPf93o7hcMH/RmU2Pd3QTGxB2IOtnO/NUIg
 K+aOXIa6bVjGyNgx9C/to0tzXQ+E/MQIrhpaDd4Tlas4gsdTeOYtqZCYa7tfJFCz
 bNDcH+ix9f6hiFz5rMv7tCT9K2K3I3gnmjTgXak9OMQXsUqpB/QFs4ZSESdgChHl
 K+06R6tLfBMNGysWW3OIwA7Fb5i7tkWbxbyidI/GHVxZurRAIosvic7EOJT6oypr
 /N9PrTmHg+L8Kp/c7Cpge5hebfC+funXbcPk0vmpYAz9Aue8uerUEEzrI1QyPenQ
 +41xGG55R/qxkmYa998JvETAeMGeQjOYUOmNT7/QnZOn21fK1Q0OJWvt+teYjmE=
 =/I7+
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-4.18.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 updates from Bob Peterson:
 "We've got nine more patches for this merge window.

   - remove sd_jheightsize to greatly simplify some code (Andreas
     Gruenbacher)

   - fix some comments (Andreas)

   - fix a glock recursion bug when allocation errors occur (Andreas)

   - improve the hole_size function so it returns the entire hole rather
     than figuring it out piecemeal (Andreas)

   - clean up gfs2_stuffed_write_end to remove a lot of redundancy
     (Andreas)

   - clarify code with regard to the way ordered writes are processed
     (Andreas)

   - a bunch of improvements and cleanups of the iomap code to pave the
     way for iomap writes, which is a future patch set (Andreas)

   - fix a bug where block reservations can run off the end of a bitmap
     (Bob Peterson)

   - add Andreas to the MAINTAINERS file (Bob Peterson)"

* tag 'gfs2-4.18.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  MAINTAINERS: Add Andreas Gruenbacher as a maintainer for gfs2
  gfs2: Iomap cleanups and improvements
  gfs2: Remove ordered write mode handling from gfs2_trans_add_data
  gfs2: gfs2_stuffed_write_end cleanup
  gfs2: hole_size improvement
  GFS2: gfs2_free_extlen can return an extent that is too long
  GFS2: Fix allocation error bug with recursive rgrp glocking
  gfs2: Update find_metapath comment
  gfs2: Remove sdp->sd_jheightsize
2018-06-04 14:36:38 -07:00
David S. Miller
d67b66b45a Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2018-06-04

This series contains a smorgasbord of updates to documentation, e1000e,
igb, ixgbe, ixgbevf and i40e.

Benjamin Poirier fixes a potential kernel crash due to NULL pointer
dereference in e1000e.

Jeff updates the kernel documentation for e100 and e1000 to correct
default values and URLs which were incorrect in the documentation.  Also
took the time to update these to the new reStructured text format for
kernel documentation.

Joanna Yurdal fixes a missing PTP transmit timestamp by ensuring that
TSICR gets cleared when ICR is cleared.

Sergey updates igb to reset all the transmit queues at one time so that
we only have to wait once for all the queues to be reset.

Alex fixes ixgbevf so that malicious driver detection (MDD) can co-exist
with XDP.

Emil and Tony extend the RTNL lock to ensure we get the most up-to-date
values for the bits and avoid a possible race condition when going down.

YueHaibing from Huawei introduces a helper function in ixgbe for
operation reads to simplify the code a bit more.

Daniel Borkmann adds support for XDP meta data when using build SKB
for i40e.

Shannon Nelson provides twp fixes for the IPSec code in ixgbe, first is
to make sure we do not try to offload the decryption of any incoming
packet that is destined for the management engine.  The other fix is to
resolve a cast problem introduced by a sparse cleanup patch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:35:35 -04:00
Linus Torvalds
8a4631144b dlm for 4.18
These three commits fix and clean up the flags dlm was
 using on its SCTP sockets.  The result improves the
 performance and fixes some bad connection delays.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbFXmBAAoJEDgbc8f8gGmqDYsP/12AlXOdUJmhyK3/GwYGmayx
 +mnUkQ/PacfsjXIFt1HHCM9LcSCk9gTES8suoFHkP5wfMMqKz6VoVG6k2sXn2jMS
 Ig+xCCFsTIR1n8ZB7ZAhQFkYgOGsKujDgLQ/7dYBos331zdZSD8CqTYTcti5twdE
 C4uERBR2RLDHGTzgAVEWSE/wc3QK8Yw8O9Yh1tebBFRyMU/tQl53HlviMozhJGlZ
 vg6Mm8eVE9OSUBRp4LzG34xYj7YD05j5cJBkQWVjTlg+giNCZnlLWQOv0rbiy2+M
 IXf/TmYKqbtdc4ATMrJStIdJSjErZeuDtWvKEtpaW/w2TjiwOTS56KG/W4kC4Wjs
 5MuHVFI3jRmgrL1iCwIotqu+BkE1dlHP81DbLKRy53lvit+zEWoh3G1V9rUSGDcd
 tEs36BwJERpmiTw2roOzajdLyi6Z2Jlv/GJCXUyQ2YXBV8WvTKeQbb+VSc6rCz1E
 WBZ/vOIbA8JibXVuMHw6iPwy99M5lk/6q8Sa1CS9GiYmGfM3fDesKpYGnKIu77AH
 4pnm6OgPihU+dZratHIJMddNAyorkcaYuOq0TkLiTJZas/MpGKuKHncEARUfcHno
 e1RSz1yPuN0OEJoVJBiWfe8nSAV/6vXvEPlWTMfwADfpXTYxTphqRW3VvC4N1Cpi
 gNiAKXvdIjRpGjInPeI3
 =jiYf
 -----END PGP SIGNATURE-----

Merge tag 'dlm-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm updates from David Teigland:
 "These three commits fix and clean up the flags dlm was using on its
  SCTP sockets. This improves performance and fixes some bad connection
  delays"

* tag 'dlm-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  dlm: remove O_NONBLOCK flag in sctp_connect_to_sock
  dlm: make sctp_connect_to_sock() return in specified time
  dlm: fix a clerical error when set SCTP_NODELAY
2018-06-04 14:34:06 -07:00
Chao Yu
dfa742803f f2fs: fix to clear FI_VOLATILE_FILE correctly
Thread A			Thread B
- f2fs_release_file
 - clear_inode_flag(FI_VOLATILE_FILE)
				- wb_writeback
				 - writeback_sb_inodes
				  - __writeback_single_inode
				   - do_writepages
				    - f2fs_write_data_pages
				     - __write_data_page
				     all volatile file's pages
				     are writebacked to storage
 - set_inode_flag(FI_DROP_CACHE)
 - filemap_fdatawrite

There is a hole that mm can flush all dirty pages of volatile file as
inode is not tagged with both FI_VOLATILE_FILE and FI_DROP_CACHE flags,
we should never writeback the page #0 and also it's unneeded to writeback
other pages.

This patch adjusts to relocate clear_inode_flag(FI_VOLATILE_FILE), so that
FI_VOLATILE_FILE flag can be remained before all dirty pages were dropped
to avoid issue.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-06-04 14:33:27 -07:00
Chao Yu
c29fd0c0e2 f2fs: let sync node IO interrupt async one
Although mixed sync/async IOs can have continuous LBA, as they have
different IO priority, block IO scheduler will add them into different
queues and commit them separately, result in splited IOs which causes
wrose performance.

This patch gives high priority to synchronous IO of nodes, means that
once synchronous flow starts, it can interrupt asynchronous writeback
flow of system flusher, so more big IOs can be expected.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-06-04 14:33:20 -07:00
Xi Wang
f0b964e5e4 net: hns: Fix the process of adding broadcast addresses to tcam
If the multicast mask value in device tree is configured not all
0xff, the broadcast mac will be lost from tcam table after the
execution of command 'ifconfig up'. The address is appended by
hns_ae_start, but will be clear later by hns_nic_set_rx_mode
called in dev_open process.

This patch fixed it by not use the multicast mask when add a
broadcast address.

Fixes: b5996f11ea ("net: add Hisilicon Network Subsystem basic ethernet support")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:32:56 -04:00
Chao Yu
aae764ece6 f2fs: don't change wbc->sync_mode
We should never falsify wbc->sync_mode passed from mm, otherwise
mm can trigger writeback with wrong IO priority.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-06-04 14:32:44 -07:00
Vlad Buslov
0e3990356d net: sched: return error code when tcf proto is not found
If requested tcf proto is not found, get and del filter netlink protocol
handlers output error message to extack, but do not return actual error
code. Add check to return ENOENT when result of tp find function is NULL
pointer.

Fixes: c431f89b18 ("net: sched: split tc_ctl_tfilter into three handlers")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:31:44 -04:00
Chao Yu
a1f72ac2c0 f2fs: fix to update mtime correctly
If we change system time to the past, get_mtime() will return a
overflowed time, and SIT_I(sbi)->max_mtime will be udpated
incorrectly, this patch fixes the two issues.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-06-04 14:31:11 -07:00
Dan Carpenter
25ea66544b team: use netdev_features_t instead of u32
This code was introduced in 2011 around the same time that we made
netdev_features_t a u64 type.  These days a u32 is not big enough to
hold all the potential features.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:30:04 -04:00
Dan Carpenter
a746407af1 net_failover: Use netdev_features_t instead of u32
The features mask needs to be a netdev_features_t (u64) because a u32
is not big enough.

Fixes: cfc80d9a11 ("net: Introduce net_failover driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:30:04 -04:00
Mike Marciniszyn
d9a6ce68a0 IB/hfi1: Fix comment on default hdr entry size
The comment for the default header queue entry size is incorrect.

Correct the comment and fix the resulting S_IRUGO warning that shows
up in the widened patch context.

Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-04 15:29:48 -06:00
Linus Torvalds
704996566f for-4.18-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlsVV6wACgkQxWXV+ddt
 WDsaOhAAnrG3o/Bkn4loifYjiWQoArswwVJ2m5MvG14oPcn/c6wCUlE6mHdEsT8l
 WgltSRSd/3hJ1BcuWjhcR8RSLBRSaPcw4FAHikAdOWMrLXSKfA+quT/upXD00iDS
 +bnHoknc5vvk+Ow0KyGODAeMYfJm8gGB43BeVDr/mgrVRrwYY9r14s+6LX897NOT
 qObdxBjtedua0uIYZX5673siFaSbvJNQyGn6iKO9Ww+7bkWjgKO8qY8/3/gEaJn1
 q1303Oo65VOXQQKZed+NjPNakpqRmABaN79qTwWMnUI4qYBKi651a7sr5HljDpmc
 szY3uF/E7AOfJSDeTBdglFR1eQgv20ceeIx7FN0n6iUbuIVH9P85mBF6f76Hivpq
 fA39uPYxWwWq8RxmC09yqoTkuNS0zAQhov4n8XO0q/81Gfek0RpR+LMK15W02Ur8
 +5Pazkgz+xRYRPhpExe75pbM5N31KFJdf+4wpXeZoXKGTeiiCl/SZV3NzyUtemGg
 eN7ltHD/Y9iNpx7sgHn+2veXwn6psDad8ORxIlRyKDeot2pNZ4lnekPbqMC/IwAz
 hF3ZpcEo8v2Q2kqwCqw5evq4NSOvfG0cEuIqAKir1AgTn2nPH99qRSR1He9kluap
 GBPTPXNgBrhhdsh5kltI6CJcGE46bHsLG4LI5WvRs+wOtyaG5Ww=
 =jIvf
 -----END PGP SIGNATURE-----

Merge tag 'for-4.18-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "User visible features:

   - added support for the ioctl FS_IOC_FSGETXATTR, per-inode flags,
     successor of GET/SETFLAGS; now supports only existing flags:
     append, immutable, noatime, nodump, sync

   - 3 new unprivileged ioctls to allow users to enumerate subvolumes

   - dedupe syscall implementation does not restrict the range to 16MiB,
     though it still splits the whole range to 16MiB chunks

   - on user demand, rmdir() is able to delete an empty subvolume,
     export the capability in sysfs

   - fix inode number types in tracepoints, other cleanups

   - send: improved speed when dealing with a large removed directory,
     measurements show decrease from 2000 minutes to 2 minutes on a
     directory with 2 million entries

   - pre-commit check of superblock to detect a mysterious in-memory
     corruption

   - log message updates

  Other changes:

   - orphan inode cleanup improved, does no keep long-standing
     reservations that could lead up to early ENOSPC in some cases

   - slight improvement of handling snapshotted NOCOW files by avoiding
     some unnecessary tree searches

   - avoid OOM when dealing with many unmergeable small extents at flush
     time

   - speedup conversion of free space tree representations from/to
     bitmap/tree

   - code refactoring, deletion, cleanups:
      + delayed refs
      + delayed iput
      + redundant argument removals
      + memory barrier cleanups
      + remove a redundant mutex supposedly excluding several ioctls to
        run in parallel

   - new tracepoints for blockgroup manipulation

   - more sanity checks of compressed headers"

* tag 'for-4.18-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (183 commits)
  btrfs: Add unprivileged version of ino_lookup ioctl
  btrfs: Add unprivileged ioctl which returns subvolume's ROOT_REF
  btrfs: Add unprivileged ioctl which returns subvolume information
  Btrfs: clean up error handling in btrfs_truncate()
  btrfs: Factor out write portion of btrfs_get_blocks_direct
  btrfs: Factor out read portion of btrfs_get_blocks_direct
  btrfs: return ENOMEM if path allocation fails in btrfs_cross_ref_exist
  btrfs: raid56: Remove VLA usage
  btrfs: return error value if create_io_em failed in cow_file_range
  btrfs: drop useless member qgroup_reserved of btrfs_pending_snapshot
  btrfs: drop unused parameter qgroup_reserved
  btrfs: balance dirty metadata pages in btrfs_finish_ordered_io
  btrfs: lift some btrfs_cross_ref_exist checks in nocow path
  btrfs: Remove fs_info argument from btrfs_uuid_tree_rem
  btrfs: Remove fs_info argument from btrfs_uuid_tree_add
  Btrfs: remove unused check of skip_locking
  Btrfs: remove always true check in unlock_up
  Btrfs: grab write lock directly if write_lock_level is the max level
  Btrfs: move get root out of btrfs_search_slot to a helper
  Btrfs: use more straightforward extent_buffer_uptodate check
  ...
2018-06-04 14:29:13 -07:00
YueHaibing
ff2e351e19 qed: use dma_zalloc_coherent instead of allocator/memset
Use dma_zalloc_coherent instead of dma_alloc_coherent
followed by memset 0.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:28:43 -04:00
Mikulas Patocka
2026d35741 branch-check: fix long->int truncation when profiling branches
The function __builtin_expect returns long type (see the gcc
documentation), and so do macros likely and unlikely. Unfortunatelly, when
CONFIG_PROFILE_ANNOTATED_BRANCHES is selected, the macros likely and
unlikely expand to __branch_check__ and __branch_check__ truncates the
long type to int. This unintended truncation may cause bugs in various
kernel code (we found a bug in dm-writecache because of it), so it's
better to fix __branch_check__ to return long.

Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1805300818140.24812@file01.intranet.prod.int.rdu2.redhat.com

Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 1f0d69a9fc ("tracing: profile likely and unlikely annotations")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-06-04 17:28:20 -04:00
Vasyl Gomonovych
a9235b544a ring-buffer: Fix typo in comment
Fix typo of the word 'been'

Link: http://lkml.kernel.org/r/20180518203130.2011-1-gomonovych@gmail.com

Signed-off-by: Vasyl Gomonovych <gomonovych@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-06-04 17:28:20 -04:00
Steven Rostedt (VMware)
6167c205ca ring-buffer: Fix a bunch of typos in comments
An anonymous source sent me a bunch of typo fixes in the comments of
ring_buffer.c file. That source did not want to be associated to this patch
because they don't want to be known as "one of those" commiters (you know who
you are!). They gave me permission to sign this off in my own name.

Suggested-by: One-of-those-commiters@YouKnowWhoYouAre.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-06-04 17:28:19 -04:00
Steven Rostedt (VMware)
33697bd486 tracing/selftest: Add test to test simple snapshot trigger for trace_marker
Several complex trigger tests were added for trace_marker, but not a simple
one. This could be used to help diagnose a problem with the code by giving a
reference between how complex a trigger is that fails.

Suggested-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-06-04 17:28:01 -04:00
YueHaibing
1f55c2865c wan/fsl_ucc_hdlc: use dma_zalloc_coherent instead of allocator/memset
Use dma_zalloc_coherent instead of dma_alloc_coherent
followed by memset 0.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:27:59 -04:00
Linus Torvalds
e3a44fd7e6 affs-for-4.18-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlsVSBYACgkQxWXV+ddt
 WDswpQ//aeSKpmGtrhEl0kh2zt+0yyg9aBO74eN8xd27ADYcF4W3vDH91YNgz/HT
 y0Lu+vO0Y1ztKUpf7ho1fZq2By5wNvq/7XCBOYIFquXxSm1jW4TxdIzblEGvaz1N
 vBfb3wwlgvL6cVmjZjlQSW5bDwdX8yeSVPRfu/aWH2SAi/vjSmcNBTOl8cX3rXNE
 ae2bILFFOJGTzQbMPe8ORD5PuL0n3DCs+3EI7FVay5RAekM9+nglnk/mVX7xZAON
 Y2a2QEUNI2MOlwA2s7wfjWIJqp7XZzCzkop62p8omYX5jA5CtLZtBKGrlwrBu1tw
 nZ1XlTQC7R95HBc3Y7XHE2kiXjQ/SPEcQp5fXUzjfZPVWuLTrq0r9OgFxcyCjAGx
 yaS8V6WMdcloFY1NxG3SIe+4OrxCrnbd9BpUmsr9nTQd2elaV5W1I3a5Wqtvx2eF
 B5d8I/ErmuIGRvyIBJVj5bmXqbaN+F2n+KSNvGQS4GNAV4jV21mnlj04Cjava5ma
 6yzwZu1xLCVSPyjFYrxDVV//qq4jfs7Ou2Eo6W7Bwk5K6fOm6TQzwH3yUA6gMUip
 SdC0tNy2j5fTS6EETsxYMNGY74mNgmroWJZ/7n5nrDBkV3LZFLfZFevTE+wenzip
 IhyTL9tpWxrchPHsA3GdRsxSXKDYArqFL9D09W3NDtpjjkxvKs0=
 =slnj
 -----END PGP SIGNATURE-----

Merge tag 'affs-for-4.18-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull affs fix from David Sterba:
 "A potential memory leak fix for AFFS"

* tag 'affs-for-4.18-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  affs: fix potential memory leak when parsing option 'prefix'
2018-06-04 14:27:09 -07:00
Kaike Wan
ed71e86a8d IB/hfi1: Rename exp_lock to exp_mutex
The mutex exp_lock in struct hfi1_ctxtdata is used to protect all
Expected TID data of a user context. This patch renames it to exp_mutex
to better reflect its identity and prepare for upcoming patches.

Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-04 15:25:27 -06:00
David S. Miller
828da43224 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2018-06-04

Here's one last bluetooth-next pull request for the 4.18 kernel:

 - New USB device IDs for Realtek 8822BE and 8723DE
 - reset/resume fix for Dell Inspiron 5565
 - Fix HCI_UART_INIT_PENDING flag behavior
 - Fix patching behavior for some ATH3012 models
 - A few other minor cleanups & fixes

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:22:17 -04:00
Olivier Gayot
bb38ccce88 docs: networking: fix minor typos in various documentation files
This patch fixes some typos/misspelling errors in the
Documentation/networking files.

Signed-off-by: Olivier Gayot <olivier.gayot@sigexec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:21:28 -04:00
Maciej Żenczykowski
f396922d86 net: do not allow changing SO_REUSEADDR/SO_REUSEPORT on bound sockets
It is not safe to do so because such sockets are already in the
hash tables and changing these options can result in invalidating
the tb->fastreuse(port) caching.

This can have later far reaching consequences wrt. bind conflict checks
which rely on these caches (for optimization purposes).

Not to mention that you can currently end up with two identical
non-reuseport listening sockets bound to the same local ip:port
by clearing reuseport on them after they've already both been bound.

There is unfortunately no EISBOUND error or anything similar,
and EISCONN seems to be misleading for a bound-but-not-connected
socket, so use EUCLEAN 'Structure needs cleaning' which AFAICT
is the closest you can get to meaning 'socket in bad state'.
(although perhaps EINVAL wouldn't be a bad choice either?)

This does unfortunately run the risk of breaking buggy
userspace programs...

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Change-Id: I77c2b3429b2fdf42671eee0fa7a8ba721c94963b
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:14:22 -04:00
Maciej Żenczykowski
79e9fed460 net-tcp: extend tcp_tw_reuse sysctl to enable loopback only optimization
This changes the /proc/sys/net/ipv4/tcp_tw_reuse from a boolean
to an integer.

It now takes the values 0, 1 and 2, where 0 and 1 behave as before,
while 2 enables timewait socket reuse only for sockets that we can
prove are loopback connections:
  ie. bound to 'lo' interface or where one of source or destination
  IPs is 127.0.0.0/8, ::ffff:127.0.0.0/104 or ::1.

This enables quicker reuse of ephemeral ports for loopback connections
- where tcp_tw_reuse is 100% safe from a protocol perspective
(this assumes no artificially induced packet loss on 'lo').

This also makes estblishing many loopback connections *much* faster
(allocating ports out of the first half of the ephemeral port range
is significantly faster, then allocating from the second half)

Without this change in a 32K ephemeral port space my sample program
(it just establishes and closes [::1]:ephemeral -> [::1]:server_port
connections in a tight loop) fails after 32765 connections in 24 seconds.
With it enabled 50000 connections only take 4.7 seconds.

This is particularly problematic for IPv6 where we only have one local
address and cannot play tricks with varying source IP from 127.0.0.0/8
pool.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Wei Wang <weiwan@google.com>
Change-Id: I0377961749979d0301b7b62871a32a4b34b654e1
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:13:35 -04:00
Yuval Bason
39dbc646fd qed: Add srq core support for RoCE and iWARP
This patch adds support for configuring SRQ and provides the necessary
APIs for rdma upper layer driver (qedr) to enable the SRQ feature.

Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Yuval Bason <yuval.bason@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:09:54 -04:00
David S. Miller
7a9ee41b83 Merge branch 'bnx2-warnings'
Varsha Rao says:

====================
net: bnx2: Fix checkpatch and clang warnings

This patchset fixes NULL comparison and extra parentheses, checkpatch
and clang warnings.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:07:28 -04:00
Varsha Rao
b8aac410b7 net: ethernet: bnx2: Replace NULL comparison
This patch fixes the checkpatch issue of NULL comparison. Replace x == NULL
with !x, by using the following coccinelle script:

@disable is_null@
expression e;
@@
-e==NULL
+!e

Signed-off-by: Varsha Rao <rvarsha016@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:07:27 -04:00
Varsha Rao
6dc5aa2123 net: ethernet: bnx2: Remove extra parentheses
The following coccinelle script removes extra parentheses to fix the
clang warning of extraneous parentheses.

@disable paren@
identifier i;
expression e;
statement s;
@@
if (
-(i == e)
+i == e
 )
s

Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Varsha Rao <rvarsha016@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:07:27 -04:00
YueHaibing
13ce3bc9c1 net: gemini: fix spelling mistake: "it" -> "is"
Trivial fix to spelling mistake in gemini dev_warn message

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:05:39 -04:00
Paul Blakey
f6521c587a cls_flower: Fix comparing of old filter mask with new filter
We incorrectly compare the mask and the result is that we can't modify
an already existing rule.

Fix that by comparing correctly.

Fixes: 05cd271fd6 ("cls_flower: Support multiple masks per priority")
Reported-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:03:37 -04:00
Paul Blakey
de9dc650f0 cls_flower: Fix missing free of rhashtable
When destroying the instance, destroy the head rhashtable.

Fixes: 05cd271fd6 ("cls_flower: Support multiple masks per priority")
Reported-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:03:37 -04:00
Palmer Dabbelt
32c81bced3
RISC-V: Preliminary Perf Support
The RISC-V ISA defines a core set of performance counters that must
exist on all processors along with a standard way to add more
performance counters.

This patch set adds preliminary perf support for RISC-V systems.  Long
term we'll move to model where all PMUs can be built into the kernel at
the same time, detected at runtime (possibly via device tree), and
provided to userspace.  Since we currently only support the ISA-mandated
performance counters there's no need to detect anything right now.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-06-04 14:03:18 -07:00
Alan Kao
0d431558d7
perf: riscv: Add Document for Future Porting Guide
Reviewed-by: Alex Solomatnikov <sols@sifive.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Greentime Hu <greentime@andestech.com>
Signed-off-by: Alan Kao <alankao@andestech.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-06-04 14:02:11 -07:00
Randy Dunlap
1f4c741321 net: skbuff.h: drop unneeded <linux/slab.h>
<linux/skbuff.h> does not use nor need <linux/slab.h>, so drop this
header file from skbuff.h.

<linux/skbuff.h> is currently #included in around 1200 C source and
header files, making it the 31st most-used header file.

Build tested [allmodconfig] on 20 arch-es.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04 17:02:06 -04:00
Alan Kao
178e9fc47a
perf: riscv: preliminary RISC-V support
This patch provide a basic PMU, riscv_base_pmu, which supports two
general hardware event, instructions and cycles.  Furthermore, this
PMU serves as a reference implementation to ease the portings in
the future.

riscv_base_pmu should be able to run on any RISC-V machine that
conforms to the Priv-Spec.  Note that the latest qemu model hasn't
fully support a proper behavior of Priv-Spec 1.10 yet, but work
around should be easy with very small fixes.  Please check
https://github.com/riscv/riscv-qemu/pull/115 for future updates.

Cc: Nick Hu <nickhu@andestech.com>
Cc: Greentime Hu <greentime@andestech.com>
Signed-off-by: Alan Kao <alankao@andestech.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-06-04 14:02:01 -07:00
Mike Marciniszyn
dc2b2a917c IB/hfi1: Add bypass register defines and replace blind constants
These registers were not added in the 16B work.

Add them and replace blind constants with the correct defines.

Fixes: 72c07e2b67 ("IB/hfi1: Add support to receive 16B bypass packets")
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-04 14:59:21 -06:00
Kaike Wan
5465f11083 IB/hfi1: Remove unused variable
The variable extended_psn was not used any more.

Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-04 14:59:21 -06:00
Linus Torvalds
408afb8d78 Merge branch 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull aio updates from Al Viro:
 "Majority of AIO stuff this cycle. aio-fsync and aio-poll, mostly.

  The only thing I'm holding back for a day or so is Adam's aio ioprio -
  his last-minute fixup is trivial (missing stub in !CONFIG_BLOCK case),
  but let it sit in -next for decency sake..."

* 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
  aio: sanitize the limit checking in io_submit(2)
  aio: fold do_io_submit() into callers
  aio: shift copyin of iocb into io_submit_one()
  aio_read_events_ring(): make a bit more readable
  aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way
  aio: take list removal to (some) callers of aio_complete()
  aio: add missing break for the IOCB_CMD_FDSYNC case
  random: convert to ->poll_mask
  timerfd: convert to ->poll_mask
  eventfd: switch to ->poll_mask
  pipe: convert to ->poll_mask
  crypto: af_alg: convert to ->poll_mask
  net/rxrpc: convert to ->poll_mask
  net/iucv: convert to ->poll_mask
  net/phonet: convert to ->poll_mask
  net/nfc: convert to ->poll_mask
  net/caif: convert to ->poll_mask
  net/bluetooth: convert to ->poll_mask
  net/sctp: convert to ->poll_mask
  net/tipc: convert to ->poll_mask
  ...
2018-06-04 13:57:43 -07:00
Palmer Dabbelt
9c5217640e
MAINTAINERS: Update Albert's email, he's back at Berkeley
When I was adding a MAINTAINERS entry for SiFive's drivers I realized
that Albert's email is out of date -- he's gone back to Berkeley, so his
SiFive email is technically defunct.  This patch updates his entry to a
current email address, hosted at Berkeley.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-06-04 13:46:33 -07:00
Palmer Dabbelt
3ed45d7ffa
MAINTAINERS: Add myself as a maintainer for SiFive's drivers
There aren't actually any files in the tree that match these patterns
right now, but we've just started submitting our drivers so I thought it
would be good to make sure there's at least someone at SiFive who's
listed as maintaining them.  I'm leaving the RISC-V lists on here
because:

* As of today, all the RISC-V ASICs that people can actually buy are
  from SiFive -- though hopefully there'll be more soon!
* The RTL for many of our devices is open source, so I anticipate these
  devices might make they way chips from other vendors.
* We may standardize some of these devices as part of a RISC-V
  specification at some point in the future.

I'm a bit swamped right now so I might not be the most active maintainer
of these drivers, but I think it'd be good to make sure someone who has
hardware access gets CC'd on updates to our drivers just as a sanity
check.  Hopefully that's an OK way to handle this.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-06-04 13:46:31 -07:00