Commit graph

75139 commits

Author SHA1 Message Date
Alexey Dobriyan
2a4a4082cd cpumask: nicer for_each_cpumask_and() signature
Mask arguments can be swapped without changing anything.  Make arguments
names reflect that:

	#define for_each_cpu_and(cpu, mask1, mask2)

Link: http://lkml.kernel.org/r/20190724183350.GA15041@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25 17:51:40 -07:00
Sai Praneeth Prakhya
8495f7e673 fork: improve error message for corrupted page tables
When a user process exits, the kernel cleans up the mm_struct of the user
process and during cleanup, check_mm() checks the page tables of the user
process for corruption (E.g: unexpected page flags set/cleared).  For
corrupted page tables, the error message printed by check_mm() isn't very
clear as it prints the loop index instead of page table type (E.g:
Resident file mapping pages vs Resident shared memory pages).  The loop
index in check_mm() is used to index rss_stat[] which represents
individual memory type stats.  Hence, instead of printing index, print
memory type, thereby improving error message.

Without patch:
--------------
[  204.836425] mm/pgtable-generic.c:29: bad p4d 0000000089eb4e92(800000025f941467)
[  204.836544] BUG: Bad rss-counter state mm:00000000f75895ea idx:0 val:2
[  204.836615] BUG: Bad rss-counter state mm:00000000f75895ea idx:1 val:5
[  204.836685] BUG: non-zero pgtables_bytes on freeing mm: 20480

With patch:
-----------
[   69.815453] mm/pgtable-generic.c:29: bad p4d 0000000084653642(800000025ca37467)
[   69.815872] BUG: Bad rss-counter state mm:00000000014a6c03 type:MM_FILEPAGES val:2
[   69.815962] BUG: Bad rss-counter state mm:00000000014a6c03 type:MM_ANONPAGES val:5
[   69.816050] BUG: non-zero pgtables_bytes on freeing mm: 20480

Also, change print function (from printk(KERN_ALERT, ..) to pr_alert()) so
that it matches the other print statement.

Link: http://lkml.kernel.org/r/da75b5153f617f4c5739c08ee6ebeb3d19db0fbc.1565123758.git.sai.praneeth.prakhya@intel.com
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25 17:51:40 -07:00
Stephen Boyd
091cb0994e lib/hexdump: make print_hex_dump_bytes() a nop on !DEBUG builds
I'm seeing a bunch of debug prints from a user of print_hex_dump_bytes()
in my kernel logs, but I don't have CONFIG_DYNAMIC_DEBUG enabled nor do I
have DEBUG defined in my build.  The problem is that
print_hex_dump_bytes() calls a wrapper function in lib/hexdump.c that
calls print_hex_dump() with KERN_DEBUG level.  There are three cases to
consider here

  1. CONFIG_DYNAMIC_DEBUG=y  --> call dynamic_hex_dum()
  2. CONFIG_DYNAMIC_DEBUG=n && DEBUG --> call print_hex_dump()
  3. CONFIG_DYNAMIC_DEBUG=n && !DEBUG --> stub it out

Right now, that last case isn't detected and we still call
print_hex_dump() from the stub wrapper.

Let's make print_hex_dump_bytes() only call print_hex_dump_debug() so that
it works properly in all cases.

Case #1, print_hex_dump_debug() calls dynamic_hex_dump() and we get same
behavior.  Case #2, print_hex_dump_debug() calls print_hex_dump() with
KERN_DEBUG and we get the same behavior.  Case #3, print_hex_dump_debug()
is a nop, changing behavior to what we want, i.e.  print nothing.

Link: http://lkml.kernel.org/r/20190816235624.115280-1-swboyd@chromium.org
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25 17:51:39 -07:00
Joe Perches
917cda2790 kernel-doc: core-api: include string.h into core-api
core-api should show all the various string functions including the newly
added stracpy and stracpy_pad.

Miscellanea:

o Update the Returns: value for strscpy
o fix a defect with %NUL)

[joe@perches.com: correct return of -E2BIG descriptions]
  Link: http://lkml.kernel.org/r/29f998b4c1a9d69fbeae70500ba0daa4b340c546.1563889130.git.joe@perches.com
Link: http://lkml.kernel.org/r/224a6ebf39955f4107c0c376d66155d970e46733.1563841972.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Stephen Kitt <steve@sk2.org>
Cc: Nitin Gote <nitin.r.gote@intel.com>
Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25 17:51:39 -07:00
Michel Lespinasse
6d2052d188 augmented rbtree: rework the RB_DECLARE_CALLBACKS macro definition
Change the definition of the RBCOMPUTE function.  The propagate callback
repeatedly calls RBCOMPUTE as it moves from leaf to root.  it wants to
stop recomputing once the augmented subtree information doesn't change.
This was previously checked using the == operator, but that only works
when the augmented subtree information is a scalar field.  This commit
modifies the RBCOMPUTE function so that it now sets the augmented subtree
information instead of returning it, and returns a boolean value
indicating if the propagate callback should stop.

The motivation for this change is that I want to introduce augmented
rbtree uses where the augmented data for the subtree is a struct instead
of a scalar.

Link: http://lkml.kernel.org/r/20190703040156.56953-4-walken@google.com
Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25 17:51:39 -07:00
Michel Lespinasse
315cc066b8 augmented rbtree: add new RB_DECLARE_CALLBACKS_MAX macro
Add RB_DECLARE_CALLBACKS_MAX, which generates augmented rbtree callbacks
for the case where the augmented value is a scalar whose definition
follows a max(f(node)) pattern.  This actually covers all present uses of
RB_DECLARE_CALLBACKS, and saves some (source) code duplication in the
various RBCOMPUTE function definitions.

[walken@google.com: fix mm/vmalloc.c]
  Link: http://lkml.kernel.org/r/CANN689FXgK13wDYNh1zKxdipeTuALG4eKvKpsdZqKFJ-rvtGiQ@mail.gmail.com
[walken@google.com: re-add check to check_augmented()]
  Link: http://lkml.kernel.org/r/20190727022027.GA86863@google.com
Link: http://lkml.kernel.org/r/20190703040156.56953-3-walken@google.com
Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25 17:51:39 -07:00
Michel Lespinasse
444b8a83f1 augmented rbtree: add comments for RB_DECLARE_CALLBACKS macro
Patch series "make RB_DECLARE_CALLBACKS more generic", v3.

These changes are intended to make the RB_DECLARE_CALLBACKS macro more
generic (allowing the aubmented subtree information to be a struct instead
of a scalar).

I have verified the compiled lib/interval_tree.o and mm/mmap.o files to
check that they didn't change.  This held as expected for interval_tree.o;
mmap.o did have some changes which could be reverted by marking
__vma_link_rb as noinline.  I did not add such a change to the patchset; I
felt it was reasonable enough to leave the inlining decision up to the
compiler.

This patch (of 3):

Add a short comment summarizing the arguments to RB_DECLARE_CALLBACKS.
The arguments are also now capitalized.  This copies the style of the
INTERVAL_TREE_DEFINE macro.

No functional changes in this commit, only comments and capitalization.

Link: http://lkml.kernel.org/r/20190703040156.56953-2-walken@google.com
Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25 17:51:39 -07:00
Linus Torvalds
f41def3971 The highlights are:
- automatic recovery of a blacklisted filesystem session (Zheng Yan).
   This is disabled by default and can be enabled by mounting with the
   new "recover_session=clean" option.
 
 - serialize buffered reads and O_DIRECT writes (Jeff Layton).  Care is
   taken to avoid serializing O_DIRECT reads and writes with each other,
   this is based on the exclusion scheme from NFS.
 
 - handle large osdmaps better in the face of fragmented memory (myself)
 
 - don't limit what security.* xattrs can be get or set (Jeff Layton).
   We were overly restrictive here, unnecessarily preventing things like
   file capability sets stored in security.capability from working.
 
 - allow copy_file_range() within the same inode and across different
   filesystems within the same cluster (Luis Henriques)
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAl2LoD8THGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzixRYB/9H5S4fif8Pn9eiXkIJiT9UZR/o7S1k
 ikfQNPeDxlBLKnoZXpDp2HqCu1/YuCcJ0zpZzPGGrKECZb7r+NaayxhmEXAZ+Vsg
 YwsO3eNHBbb58pe9T4oiHp19sflwcTOeNwg8wlvmvrgfBupFz2pU8Xm72EdFyoYm
 tP0QNTOCAuQK3pJcgozaptAO1TzBL3LomyVM0YzAKcumgMg47zALpaSLWJLGtDLM
 5+5WLvcVfBGLVv60h4B62ldS39eBxqTsFodcRMUaqAsnhLK70HVfKlwR3GgtZggr
 PDqbsuIfw/O3b65U2XDKZt1P9dyG3OE/ucueduXUxJPYNGmooEE+PpE+
 =DRVP
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.4-rc1' of git://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "The highlights are:

   - automatic recovery of a blacklisted filesystem session (Zheng Yan).
     This is disabled by default and can be enabled by mounting with the
     new "recover_session=clean" option.

   - serialize buffered reads and O_DIRECT writes (Jeff Layton). Care is
     taken to avoid serializing O_DIRECT reads and writes with each
     other, this is based on the exclusion scheme from NFS.

   - handle large osdmaps better in the face of fragmented memory
     (myself)

   - don't limit what security.* xattrs can be get or set (Jeff Layton).
     We were overly restrictive here, unnecessarily preventing things
     like file capability sets stored in security.capability from
     working.

   - allow copy_file_range() within the same inode and across different
     filesystems within the same cluster (Luis Henriques)"

* tag 'ceph-for-5.4-rc1' of git://github.com/ceph/ceph-client: (41 commits)
  ceph: call ceph_mdsc_destroy from destroy_fs_client
  libceph: use ceph_kvmalloc() for osdmap arrays
  libceph: avoid a __vmalloc() deadlock in ceph_kvmalloc()
  ceph: allow object copies across different filesystems in the same cluster
  ceph: include ceph_debug.h in cache.c
  ceph: move static keyword to the front of declarations
  rbd: pull rbd_img_request_create() dout out into the callers
  ceph: reconnect connection if session hang in opening state
  libceph: drop unused con parameter of calc_target()
  ceph: use release_pages() directly
  rbd: fix response length parameter for encoded strings
  ceph: allow arbitrary security.* xattrs
  ceph: only set CEPH_I_SEC_INITED if we got a MAC label
  ceph: turn ceph_security_invalidate_secctx into static inline
  ceph: add buffered/direct exclusionary locking for reads and writes
  libceph: handle OSD op ceph_pagelist_append() errors
  ceph: don't return a value from void function
  ceph: don't freeze during write page faults
  ceph: update the mtime when truncating up
  ceph: fix indentation in __get_snap_name()
  ...
2019-09-25 10:21:13 -07:00
Linus Torvalds
7b1373dd6e fuse update for 5.4
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCXYod6wAKCRDh3BK/laaZ
 PG3fAP9WXuvUeYh3X7ThQPa2D33VCIMJRd6t+1TVSSc/H8P3dAD/ehN5HIWjnmzz
 iZFc3zDtO9UCJUe23IZomblxOQbu6Qk=
 =I0S2
 -----END PGP SIGNATURE-----

Merge tag 'fuse-update-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse updates from Miklos Szeredi:

 - Continue separating the transport (user/kernel communication) and the
   filesystem layers of fuse. Getting rid of most layering violations
   will allow for easier cleanup and optimization later on.

 - Prepare for the addition of the virtio-fs filesystem. The actual
   filesystem will be introduced by a separate pull request.

 - Convert to new mount API.

 - Various fixes, optimizations and cleanups.

* tag 'fuse-update-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (55 commits)
  fuse: Make fuse_args_to_req static
  fuse: fix memleak in cuse_channel_open
  fuse: fix beyond-end-of-page access in fuse_parse_cache()
  fuse: unexport fuse_put_request
  fuse: kmemcg account fs data
  fuse: on 64-bit store time in d_fsdata directly
  fuse: fix missing unlock_page in fuse_writepage()
  fuse: reserve byteswapped init opcodes
  fuse: allow skipping control interface and forced unmount
  fuse: dissociate DESTROY from fuseblk
  fuse: delete dentry if timeout is zero
  fuse: separate fuse device allocation and installation in fuse_conn
  fuse: add fuse_iqueue_ops callbacks
  fuse: extract fuse_fill_super_common()
  fuse: export fuse_dequeue_forget() function
  fuse: export fuse_get_unique()
  fuse: export fuse_send_init_request()
  fuse: export fuse_len_args()
  fuse: export fuse_end_request()
  fuse: fix request limit
  ...
2019-09-25 09:55:59 -07:00
Linus Torvalds
4ef5b13a29 New code for 5.4:
- Report both io errors and short io results to the directio endio
   handler.
 - Allow directio callers to pass an ops structure to iomap_dio_rw.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl2EQBwACgkQ+H93GTRK
 tOvZ0w//R00Dya3pZmO+HQX/Kovo/AOKCct/gPaLvRf4DVvWZ3ZH5lPtDUSKCHBV
 Pw+potTuabwtp//mZrQ84ZMoQYOmxBmxchquJ2L49qWT/mZ1BDqBA2sxQOCehffO
 Dlv715eOPVhkVHOEAHv+BsatxlFDkuKgGcwUqxE4LNHWGrvmjm6rBWPCnxQMTwX9
 BhGo/0WG6D0leWFSMoUIpcD4Oj302/O43RX4lJsyl0Jd5pKJD/RFtaKqwIeE+pj9
 36MnoQYtT5e9BTm1/zzHRVpGgLDDWTY+IClgZNlZprU/Em+TtOGMWitWzob3wWcU
 QHj5bJ9NiWBT6QJ192i0+I0thSTwW/peOAJrkBOW3YhBkH3UIKSzaPiwCBoSEUKS
 fuIOJkRHFW7HTjKSw6wUCJJ6ShD+rNPOoaJ9Ivzz7x2HGEqb1al3EIumu6oDJgyq
 BdnLEecN/JSxPnakpSGAnvlCjZiYbJ95E0JfapzMy7HYpMXfbzZZ4JO9Ld0hQflR
 sMrGwL82tqE9ish1yRr+2nNEY5PqVJqV0XXU3dZz3HVknNuAvqrqXDXgIZT142zS
 s9vGAW+2XphyUKHH9pImQKw142n2xiZK+Pd2AuiO4I06E0EkAvN/n9CN7oJ4Cr4F
 z9rQKNXNx8OSjFtryYe6+IzQ0qQtl2OP7o88K91jYArKP17D8ds=
 =zkFp
 -----END PGP SIGNATURE-----

Merge tag 'iomap-5.4-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull iomap updates from Darrick Wong:
 "After last week's failed pull request attempt, I scuttled everything
  in the branch except for the directio endio api changes, which were
  trivial. Everything else will simply have to wait for the next cycle.

  Summary:

   - Report both io errors and short io results to the directio endio
     handler.

   - Allow directio callers to pass an ops structure to iomap_dio_rw"

* tag 'iomap-5.4-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  iomap: move the iomap_dio_rw ->end_io callback into a structure
  iomap: split size and error for iomap_dio_rw ->end_io
2019-09-25 09:01:43 -07:00
Mathieu Desnoyers
227a4aadc7 sched/membarrier: Fix p->mm->membarrier_state racy load
The membarrier_state field is located within the mm_struct, which
is not guaranteed to exist when used from runqueue-lock-free iteration
on runqueues by the membarrier system call.

Copy the membarrier_state from the mm_struct into the scheduler runqueue
when the scheduler switches between mm.

When registering membarrier for mm, after setting the registration bit
in the mm membarrier state, issue a synchronize_rcu() to ensure the
scheduler observes the change. In order to take care of the case
where a runqueue keeps executing the target mm without swapping to
other mm, iterate over each runqueue and issue an IPI to copy the
membarrier_state from the mm_struct into each runqueue which have the
same mm which state has just been modified.

Move the mm membarrier_state field closer to pgd in mm_struct to use
a cache line already touched by the scheduler switch_mm.

The membarrier_execve() (now membarrier_exec_mmap) hook now needs to
clear the runqueue's membarrier state in addition to clear the mm
membarrier state, so move its implementation into the scheduler
membarrier code so it can access the runqueue structure.

Add memory barrier in membarrier_exec_mmap() prior to clearing
the membarrier state, ensuring memory accesses executed prior to exec
are not reordered with the stores clearing the membarrier state.

As suggested by Linus, move all membarrier.c RCU read-side locks outside
of the for each cpu loops.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190919173705.2181-5-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-25 17:42:30 +02:00
Mathieu Desnoyers
2840cf02fa sched/membarrier: Call sync_core only before usermode for same mm
When the prev and next task's mm change, switch_mm() provides the core
serializing guarantees before returning to usermode. The only case
where an explicit core serialization is needed is when the scheduler
keeps the same mm for prev and next.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190919173705.2181-4-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-25 17:42:30 +02:00
Eric W. Biederman
154abafc68 tasks, sched/core: With a grace period after finish_task_switch(), remove unnecessary code
Remove work arounds that were written before there was a grace period
after tasks left the runqueue in finish_task_switch().

In particular now that there tasks exiting the runqueue exprience
a RCU grace period none of the work performed by task_rcu_dereference()
excpet the rcu_dereference() is necessary so replace task_rcu_dereference()
with rcu_dereference().

Remove the code in rcuwait_wait_event() that checks to ensure the current
task has not exited.  It is no longer necessary as it is guaranteed
that any running task will experience a RCU grace period after it
leaves the run queueue.

Remove the comment in rcuwait_wake_up() as it is no longer relevant.

Ref: 8f95c90ceb ("sched/wait, RCU: Introduce rcuwait machinery")
Ref: 150593bf86 ("sched/api: Introduce task_rcu_dereference() and try_get_task_struct()")
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/87lfurdpk9.fsf_-_@x220.int.ebiederm.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-25 17:42:29 +02:00
Eric W. Biederman
3fbd7ee285 tasks: Add a count of task RCU users
Add a count of the number of RCU users (currently 1) of the task
struct so that we can later add the scheduler case and get rid of the
very subtle task_rcu_dereference(), and just use rcu_dereference().

As suggested by Oleg have the count overlap rcu_head so that no
additional space in task_struct is required.

Inspired-by: Linus Torvalds <torvalds@linux-foundation.org>
Inspired-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/87woebdplt.fsf_-_@x220.int.ebiederm.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-25 17:42:29 +02:00
Linus Walleij
cb063a83ca thermal: db8500: Finalize device tree conversion
At some point there was an attempt to convert the DB8500
thermal sensor to device tree: a probe path was added
and the device tree was augmented for the Snowball board.
The switchover was never completed: instead the thermal
devices came from from the PRCMU MFD device and the probe
on the Snowball was confused as another set of configuration
appeared from the device tree.

Move over to a device-tree only approach, as we fixed up
the device trees.

Cc: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2019-09-24 22:54:49 -07:00
Linus Torvalds
351c8a09b0 Merge branch 'i2c/for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:

 - new driver for ICY, an Amiga Zorro card :)

 - axxia driver gained slave mode support, NXP driver gained ACPI

 - the slave EEPROM backend gained 16 bit address support

 - and lots of regular driver updates and reworks

* 'i2c/for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (52 commits)
  i2c: tegra: Move suspend handling to NOIRQ phase
  i2c: imx: ACPI support for NXP i2c controller
  i2c: uniphier(-f): remove all dev_dbg()
  i2c: uniphier(-f): use devm_platform_ioremap_resource()
  i2c: slave-eeprom: Add comment about address handling
  i2c: exynos5: Remove IRQF_ONESHOT
  i2c: stm32f7: Make structure stm32f7_i2c_algo constant
  i2c: cht-wc: drop check because i2c_unregister_device() is NULL safe
  i2c-eeprom_slave: Add support for more eeprom models
  i2c: fsi: Add of_put_node() before break
  i2c: synquacer: Make synquacer_i2c_ops constant
  i2c: hix5hd2: Remove IRQF_ONESHOT
  i2c: i801: Use iTCO version 6 in Cannon Lake PCH and beyond
  watchdog: iTCO: Add support for Cannon Lake PCH iTCO
  i2c: iproc: Make bcm_iproc_i2c_quirks constant
  i2c: iproc: Add full name of devicetree node to adapter name
  i2c: piix4: Add ACPI support
  i2c: piix4: Fix probing of reserved ports on AMD Family 16h Model 30h
  i2c: ocores: use request_any_context_irq() to register IRQ handler
  i2c: designware: Fix optional reset error handling
  ...
2019-09-24 16:48:02 -07:00
Linus Torvalds
2e959dd87a for-5.4/post-2019-09-24
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl2J8xQQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpujgD/94s9GGKN8JShxCpT0YNuWyyFF5gNlaimQU
 RSGAwnv2YUgEGNSUOPpcaj5FAYhTfYzbqoHlE+jytA2U5KXTOhc5Z85QV+TY4HPs
 I03xczYuYD/uX0QuF00zU2+6eV3lETELPiBARbfEQdHfm72iwurweHzlh4dfhbxW
 P7UA/cKixXWF2CH9wg5347Ll93nD24f2pi8BUyLJi/xpdlaRrN11Ii8AzNlRmq52
 VRxURuogl98W89F6EV2VhPGFgUEYHY2Ot7II2OqqV+jmjHDQW9y5hximzINOqkxs
 bQwo5J+WrDSPoqwl8+db2k7QQjAl1XKDAHmCwz+7J/BoOgZj8/M1FMBwzita+5x+
 UqxEYe7k+2G3w2zuhBrq03BypU8pwqFep/QI0cCCPaHs4J5QnkVOScEqd6iV/C3T
 FPvMvqDf7MrElghj4Qa2IZlh/CgqmLG5NUEz8E40cXkdiP+E+eK9ZY2Uwx2XhBrm
 7Gl+SpG5DxWqqJeRNVWjFwM4p5L+01NtwDbTjZ1rsf+mCW5cNsy/L9B4UpPz4HxW
 coAs0y/Ce+ZhCopIXZ4jLDBoTG9yoVg8EcyfaHKD2Zz0mUFxa2xm+LvXKeT49qqx
 xuodpKD3fiuM7h9Xgv+cDsmn8Rr8gSeXEGV7qzpudmkxbp6IVg/yG5hC/dM921GR
 EVrRtUIwdw==
 =aAPP
 -----END PGP SIGNATURE-----

Merge tag 'for-5.4/post-2019-09-24' of git://git.kernel.dk/linux-block

Pull more block updates from Jens Axboe:
 "Some later additions that weren't quite done for the first pull
  request, and also a few fixes that have arrived since.

  This contains:

   - Kill silly pktcdvd warning on attempting to register a non-scsi
     passthrough device (me)

   - Use symbolic constants for the block t10 protection types, and
     switch to handling it in core rather than in the drivers (Max)

   - libahci platform missing node put fix (Nishka)

   - Small series of fixes for BFQ (Paolo)

   - Fix possible nbd crash (Xiubo)"

* tag 'for-5.4/post-2019-09-24' of git://git.kernel.dk/linux-block:
  block: drop device references in bsg_queue_rq()
  block: t10-pi: fix -Wswitch warning
  pktcdvd: remove warning on attempting to register non-passthrough dev
  ata: libahci_platform: Add of_node_put() before loop exit
  nbd: fix possible page fault for nbd disk
  nbd: rename the runtime flags as NBD_RT_ prefixed
  block, bfq: push up injection only after setting service time
  block, bfq: increase update frequency of inject limit
  block, bfq: reduce upper bound for inject limit to max_rq_in_driver+1
  block, bfq: update inject limit only after injection occurred
  block: centralize PI remapping logic to the block layer
  block: use symbolic constants for t10_pi type
2019-09-24 16:31:50 -07:00
Linus Torvalds
9c9fa97a8e Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - a few hot fixes

 - ocfs2 updates

 - almost all of -mm (slab-generic, slab, slub, kmemleak, kasan,
   cleanups, debug, pagecache, memcg, gup, pagemap, memory-hotplug,
   sparsemem, vmalloc, initialization, z3fold, compaction, mempolicy,
   oom-kill, hugetlb, migration, thp, mmap, madvise, shmem, zswap,
   zsmalloc)

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (132 commits)
  mm/zsmalloc.c: fix a -Wunused-function warning
  zswap: do not map same object twice
  zswap: use movable memory if zpool support allocate movable memory
  zpool: add malloc_support_movable to zpool_driver
  shmem: fix obsolete comment in shmem_getpage_gfp()
  mm/madvise: reduce code duplication in error handling paths
  mm: mmap: increase sockets maximum memory size pgoff for 32bits
  mm/mmap.c: refine find_vma_prev() with rb_last()
  riscv: make mmap allocation top-down by default
  mips: use generic mmap top-down layout and brk randomization
  mips: replace arch specific way to determine 32bit task with generic version
  mips: adjust brk randomization offset to fit generic version
  mips: use STACK_TOP when computing mmap base address
  mips: properly account for stack randomization and stack guard gap
  arm: use generic mmap top-down layout and brk randomization
  arm: use STACK_TOP when computing mmap base address
  arm: properly account for stack randomization and stack guard gap
  arm64, mm: make randomization selected by generic topdown mmap layout
  arm64, mm: move generic mmap layout functions to mm
  arm64: consider stack randomization for mmap base only when necessary
  ...
2019-09-24 16:10:23 -07:00
Hui Zhu
c165f25d23 zpool: add malloc_support_movable to zpool_driver
As a zpool_driver, zsmalloc can allocate movable memory because it support
migate pages.  But zbud and z3fold cannot allocate movable memory.

Add malloc_support_movable to zpool_driver.  If a zpool_driver support
allocate movable memory, set it to true.  And add
zpool_malloc_support_movable check malloc_support_movable to make sure if
a zpool support allocate movable memory.

Link: http://lkml.kernel.org/r/20190605100630.13293-1-teawaterz@linux.alibaba.com
Signed-off-by: Hui Zhu <teawaterz@linux.alibaba.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:12 -07:00
Alexandre Ghiti
649775be63 mm, fs: move randomize_stack_top from fs to mm
Patch series "Provide generic top-down mmap layout functions", v6.

This series introduces generic functions to make top-down mmap layout
easily accessible to architectures, in particular riscv which was the
initial goal of this series.  The generic implementation was taken from
arm64 and used successively by arm, mips and finally riscv.

Note that in addition the series fixes 2 issues:

- stack randomization was taken into account even if not necessary.

- [1] fixed an issue with mmap base which did not take into account
  randomization but did not report it to arm and mips, so by moving arm64
  into a generic library, this problem is now fixed for both
  architectures.

This work is an effort to factorize architecture functions to avoid code
duplication and oversights as in [1].

[1]: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1429066.html

This patch (of 14):

This preparatory commit moves this function so that further introduction
of generic topdown mmap layout is contained only in mm/util.c.

Link: http://lkml.kernel.org/r/20190730055113.23635-2-alex@ghiti.fr
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Song Liu
27e1f82731 khugepaged: enable collapse pmd for pte-mapped THP
khugepaged needs exclusive mmap_sem to access page table.  When it fails
to lock mmap_sem, the page will fault in as pte-mapped THP.  As the page
is already a THP, khugepaged will not handle this pmd again.

This patch enables the khugepaged to retry collapse the page table.

struct mm_slot (in khugepaged.c) is extended with an array, containing
addresses of pte-mapped THPs.  We use array here for simplicity.  We can
easily replace it with more advanced data structures when needed.

In khugepaged_scan_mm_slot(), if the mm contains pte-mapped THP, we try to
collapse the page table.

Since collapse may happen at an later time, some pages may already fault
in.  collapse_pte_mapped_thp() is added to properly handle these pages.
collapse_pte_mapped_thp() also double checks whether all ptes in this pmd
are mapping to the same THP.  This is necessary because some subpage of
the THP may be replaced, for example by uprobe.  In such cases, it is not
possible to collapse the pmd.

[kirill.shutemov@linux.intel.com: add comments for retract_page_tables()]
  Link: http://lkml.kernel.org/r/20190816145443.6ard3iilytc6jlgv@box
Link: http://lkml.kernel.org/r/20190815164525.1848545-6-songliubraving@fb.com
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Song Liu
bfe7b00de6 mm, thp: introduce FOLL_SPLIT_PMD
Introduce a new foll_flag: FOLL_SPLIT_PMD.  As the name says
FOLL_SPLIT_PMD splits huge pmd for given mm_struct, the underlining huge
page stays as-is.

FOLL_SPLIT_PMD is useful for cases where we need to use regular pages, but
would switch back to huge page and huge pmd on.  One of such example is
uprobe.  The following patches use FOLL_SPLIT_PMD in uprobe.

Link: http://lkml.kernel.org/r/20190815164525.1848545-4-songliubraving@fb.com
Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Song Liu
010c164a5f mm: move memcmp_pages() and pages_identical()
Patch series "THP aware uprobe", v13.

This patchset makes uprobe aware of THPs.

Currently, when uprobe is attached to text on THP, the page is split by
FOLL_SPLIT.  As a result, uprobe eliminates the performance benefit of
THP.

This set makes uprobe THP-aware.  Instead of FOLL_SPLIT, we introduces
FOLL_SPLIT_PMD, which only split PMD for uprobe.

After all uprobes within the THP are removed, the PTE-mapped pages are
regrouped as huge PMD.

This set (plus a few THP patches) is also available at

   https://github.com/liu-song-6/linux/tree/uprobe-thp

This patch (of 6):

Move memcmp_pages() to mm/util.c and pages_identical() to mm.h, so that we
can use them in other files.

Link: http://lkml.kernel.org/r/20190815164525.1848545-2-songliubraving@fb.com
Signed-off-by: Song Liu <songliubraving@fb.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <matthew.wilcox@oracle.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Yang Shi
87eaceb3fa mm: thp: make deferred split shrinker memcg aware
Currently THP deferred split shrinker is not memcg aware, this may cause
premature OOM with some configuration.  For example the below test would
run into premature OOM easily:

$ cgcreate -g memory:thp
$ echo 4G > /sys/fs/cgroup/memory/thp/memory/limit_in_bytes
$ cgexec -g memory:thp transhuge-stress 4000

transhuge-stress comes from kernel selftest.

It is easy to hit OOM, but there are still a lot THP on the deferred split
queue, memcg direct reclaim can't touch them since the deferred split
shrinker is not memcg aware.

Convert deferred split shrinker memcg aware by introducing per memcg
deferred split queue.  The THP should be on either per node or per memcg
deferred split queue if it belongs to a memcg.  When the page is
immigrated to the other memcg, it will be immigrated to the target memcg's
deferred split queue too.

Reuse the second tail page's deferred_list for per memcg list since the
same THP can't be on multiple deferred split queues.

[yang.shi@linux.alibaba.com: simplify deferred split queue dereference per Kirill Tkhai]
  Link: http://lkml.kernel.org/r/1566496227-84952-5-git-send-email-yang.shi@linux.alibaba.com
Link: http://lkml.kernel.org/r/1565144277-36240-5-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Yang Shi
0a432dcbeb mm: shrinker: make shrinker not depend on memcg kmem
Currently shrinker is just allocated and can work when memcg kmem is
enabled.  But, THP deferred split shrinker is not slab shrinker, it
doesn't make too much sense to have such shrinker depend on memcg kmem.
It should be able to reclaim THP even though memcg kmem is disabled.

Introduce a new shrinker flag, SHRINKER_NONSLAB, for non-slab shrinker.
When memcg kmem is disabled, just such shrinkers can be called in
shrinking memcg slab.

[yang.shi@linux.alibaba.com: add comment]
  Link: http://lkml.kernel.org/r/1566496227-84952-4-git-send-email-yang.shi@linux.alibaba.com
Link: http://lkml.kernel.org/r/1565144277-36240-4-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Yang Shi
364c1eebe4 mm: thp: extract split_queue_* into a struct
Patch series "Make deferred split shrinker memcg aware", v6.

Currently THP deferred split shrinker is not memcg aware, this may cause
premature OOM with some configuration.  For example the below test would
run into premature OOM easily:

$ cgcreate -g memory:thp
$ echo 4G > /sys/fs/cgroup/memory/thp/memory/limit_in_bytes
$ cgexec -g memory:thp transhuge-stress 4000

transhuge-stress comes from kernel selftest.

It is easy to hit OOM, but there are still a lot THP on the deferred split
queue, memcg direct reclaim can't touch them since the deferred split
shrinker is not memcg aware.

Convert deferred split shrinker memcg aware by introducing per memcg
deferred split queue.  The THP should be on either per node or per memcg
deferred split queue if it belongs to a memcg.  When the page is
immigrated to the other memcg, it will be immigrated to the target memcg's
deferred split queue too.

Reuse the second tail page's deferred_list for per memcg list since the
same THP can't be on multiple deferred split queues.

Make deferred split shrinker not depend on memcg kmem since it is not
slab.  It doesn't make sense to not shrink THP even though memcg kmem is
disabled.

With the above change the test demonstrated above doesn't trigger OOM even
though with cgroup.memory=nokmem.

This patch (of 4):

Put split_queue, split_queue_lock and split_queue_len into a struct in
order to reduce code duplication when we convert deferred_split to memcg
aware in the later patches.

Link: http://lkml.kernel.org/r/1565144277-36240-2-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Suggested-by: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Song Liu
09d91cda0e mm,thp: avoid writes to file with THP in pagecache
In previous patch, an application could put part of its text section in
THP via madvise().  These THPs will be protected from writes when the
application is still running (TXTBSY).  However, after the application
exits, the file is available for writes.

This patch avoids writes to file THP by dropping page cache for the file
when the file is open for write.  A new counter nr_thps is added to struct
address_space.  In do_dentry_open(), if the file is open for write and
nr_thps is non-zero, we drop page cache for the whole file.

Link: http://lkml.kernel.org/r/20190801184244.3169074-8-songliubraving@fb.com
Signed-off-by: Song Liu <songliubraving@fb.com>
Reported-by: kbuild test robot <lkp@intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Song Liu
60fbf0ab5d mm,thp: stats for file backed THP
In preparation for non-shmem THP, this patch adds a few stats and exposes
them in /proc/meminfo, /sys/bus/node/devices/<node>/meminfo, and
/proc/<pid>/task/<tid>/smaps.

This patch is mostly a rewrite of Kirill A.  Shutemov's earlier version:
https://lkml.kernel.org/r/20170126115819.58875-5-kirill.shutemov@linux.intel.com/

Link: http://lkml.kernel.org/r/20190801184244.3169074-5-songliubraving@fb.com
Signed-off-by: Song Liu <songliubraving@fb.com>
Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Vlastimil Babka
4943308556 mm, compaction: raise compaction priority after it withdrawns
Mike Kravetz reports that "hugetlb allocations could stall for minutes or
hours when should_compact_retry() would return true more often then it
should.  Specifically, this was in the case where compact_result was
COMPACT_DEFERRED and COMPACT_PARTIAL_SKIPPED and no progress was being
made."

The problem is that the compaction_withdrawn() test in
should_compact_retry() includes compaction outcomes that are only possible
on low compaction priority, and results in a retry without increasing the
priority.  This may result in furter reclaim, and more incomplete
compaction attempts.

With this patch, compaction priority is raised when possible, or
should_compact_retry() returns false.

The COMPACT_SKIPPED result doesn't really fit together with the other
outcomes in compaction_withdrawn(), as that's a result caused by
insufficient order-0 pages, not due to low compaction priority.  With this
patch, it is moved to a new compaction_needs_reclaim() function, and for
that outcome we keep the current logic of retrying if it looks like
reclaim will be able to help.

Link: http://lkml.kernel.org/r/20190806014744.15446-4-mike.kravetz@oracle.com
Reported-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Tested-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:10 -07:00
Pengfei Li
688fcbfc06 mm/vmalloc: modify struct vmap_area to reduce its size
Objective
---------

The current implementation of struct vmap_area wasted space.

After applying this commit, sizeof(struct vmap_area) has been
reduced from 11 words to 8 words.

Description
-----------

1) Pack "subtree_max_size", "vm" and "purge_list".  This is no problem
   because

A) "subtree_max_size" is only used when vmap_area is in "free" tree

B) "vm" is only used when vmap_area is in "busy" tree

C) "purge_list" is only used when vmap_area is in vmap_purge_list

2) Eliminate "flags".

;Since only one flag VM_VM_AREA is being used, and the same thing can be
done by judging whether "vm" is NULL, then the "flags" can be eliminated.

Link: http://lkml.kernel.org/r/20190716152656.12255-3-lpf.vector@gmail.com
Signed-off-by: Pengfei Li <lpf.vector@gmail.com>
Suggested-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:10 -07:00
David Hildenbrand
b6c88d3b9d drivers/base/memory.c: don't store end_section_nr in memory blocks
Each memory block spans the same amount of sections/pages/bytes.  The size
is determined before the first memory block is created.  No need to store
what we can easily calculate - and the calculations even look simpler now.

Michal brought up the idea of variable-sized memory blocks.  However, if
we ever implement something like this, we will need an API compatibility
switch and reworks at various places (most code assumes a fixed memory
block size).  So let's cleanup what we have right now.

While at it, fix the variable naming in register_mem_sect_under_node() -
we no longer talk about a single section.

Link: http://lkml.kernel.org/r/20190809110200.2746-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:09 -07:00
David Hildenbrand
902ce63b33 driver/base/memory.c: validate memory block size early
Let's validate the memory block size early, when initializing the memory
device infrastructure.  Fail hard in case the value is not suitable.

As nobody checks the return value of memory_dev_init(), turn it into a
void function and fail with a panic in all scenarios instead.  Otherwise,
we'll crash later during boot when core/drivers expect that the memory
device infrastructure (including memory_block_size_bytes()) works as
expected.

I think long term, we should move the whole memory block size
configuration (set_memory_block_size_order() and
memory_block_size_bytes()) into drivers/base/memory.c.

Link: http://lkml.kernel.org/r/20190806090142.22709-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:09 -07:00
Nicholas Piggin
13224794cb mm: remove quicklist page table caches
Patch series "mm: remove quicklist page table caches".

A while ago Nicholas proposed to remove quicklist page table caches [1].

I've rebased his patch on the curren upstream and switched ia64 and sh to
use generic versions of PTE allocation.

[1] https://lore.kernel.org/linux-mm/20190711030339.20892-1-npiggin@gmail.com

This patch (of 3):

Remove page table allocator "quicklists".  These have been around for a
long time, but have not got much traction in the last decade and are only
used on ia64 and sh architectures.

The numbers in the initial commit look interesting but probably don't
apply anymore.  If anybody wants to resurrect this it's in the git
history, but it's unhelpful to have this code and divergent allocator
behaviour for minor archs.

Also it might be better to instead make more general improvements to page
allocator if this is still so slow.

Link: http://lkml.kernel.org/r/1565250728-21721-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:09 -07:00
akpm@linux-foundation.org
2d15eb31b5 mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
[11~From: John Hubbard <jhubbard@nvidia.com>
Subject: mm/gup: add make_dirty arg to put_user_pages_dirty_lock()

Patch series "mm/gup: add make_dirty arg to put_user_pages_dirty_lock()",
v3.

There are about 50+ patches in my tree [2], and I'll be sending out the
remaining ones in a few more groups:

* The block/bio related changes (Jerome mostly wrote those, but I've had
  to move stuff around extensively, and add a little code)

* mm/ changes

* other subsystem patches

* an RFC that shows the current state of the tracking patch set.  That
  can only be applied after all call sites are converted, but it's good to
  get an early look at it.

This is part a tree-wide conversion, as described in fc1d8e7cca ("mm:
introduce put_user_page*(), placeholder versions").

This patch (of 3):

Provide more capable variation of put_user_pages_dirty_lock(), and delete
put_user_pages_dirty().  This is based on the following:

1.  Lots of call sites become simpler if a bool is passed into
   put_user_page*(), instead of making the call site choose which
   put_user_page*() variant to call.

2.  Christoph Hellwig's observation that set_page_dirty_lock() is
   usually correct, and set_page_dirty() is usually a bug, or at least
   questionable, within a put_user_page*() calling chain.

This leads to the following API choices:

    * put_user_pages_dirty_lock(page, npages, make_dirty)

    * There is no put_user_pages_dirty(). You have to
      hand code that, in the rare case that it's
      required.

[jhubbard@nvidia.com: remove unused variable in siw_free_plist()]
  Link: http://lkml.kernel.org/r/20190729074306.10368-1-jhubbard@nvidia.com
Link: http://lkml.kernel.org/r/20190724044537.10458-2-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Matthew Wilcox (Oracle)
4101196b19 mm: page cache: store only head pages in i_pages
Transparent Huge Pages are currently stored in i_pages as pointers to
consecutive subpages.  This patch changes that to storing consecutive
pointers to the head page in preparation for storing huge pages more
efficiently in i_pages.

Large parts of this are "inspired" by Kirill's patch
https://lore.kernel.org/lkml/20170126115819.58875-2-kirill.shutemov@linux.intel.com/

Kirill and Huang Ying contributed several fixes.

[willy@infradead.org: use compound_nr, squish uninit-var warning]
Link: http://lkml.kernel.org/r/20190731210400.7419-1-willy@infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Acked-by: Jan Kara <jack@suse.cz>
Reviewed-by: Kirill Shutemov <kirill@shutemov.name>
Reviewed-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Tested-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Tested-by: Qian Cai <cai@lca.pw>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Vlastimil Babka
37389167a2 mm, page_owner: keep owner info when freeing the page
For debugging purposes it might be useful to keep the owner info even
after page has been freed, and include it in e.g.  dump_page() when
detecting a bad page state.  For that, change the PAGE_EXT_OWNER flag
meaning to "page owner info has been set at least once" and add new
PAGE_EXT_OWNER_ACTIVE for tracking whether page is supposed to be
currently tracked allocated or free.  Adjust dump_page() accordingly,
distinguishing free and allocated pages.  In the page_owner debugfs file,
keep printing only allocated pages so that existing scripts are not
confused, and also because free pages are irrelevant for the memory
statistics or leak detection that's the typical use case of the file,
anyway.

Link: http://lkml.kernel.org/r/20190820131828.22684-4-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Matthew Wilcox (Oracle)
d8c6546b1a mm: introduce compound_nr()
Replace 1 << compound_order(page) with compound_nr(page).  Minor
improvements in readability.

Link: http://lkml.kernel.org/r/20190721104612.19120-4-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Matthew Wilcox (Oracle)
94ad933810 mm: introduce page_shift()
Replace PAGE_SHIFT + compound_order(page) with the new page_shift()
function.  Minor improvements in readability.

[akpm@linux-foundation.org: fix build in tce_page_is_contained()]
  Link: http://lkml.kernel.org/r/201907241853.yNQTrJWd%25lkp@intel.com
Link: http://lkml.kernel.org/r/20190721104612.19120-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Matthew Wilcox (Oracle)
a50b854e07 mm: introduce page_size()
Patch series "Make working with compound pages easier", v2.

These three patches add three helpers and convert the appropriate
places to use them.

This patch (of 3):

It's unnecessarily hard to find out the size of a potentially huge page.
Replace 'PAGE_SIZE << compound_order(page)' with page_size(page).

Link: http://lkml.kernel.org/r/20190721104612.19120-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Waiman Long
9adeaa2269 mm, slab: move memcg_cache_params structure to mm/slab.h
The memcg_cache_params structure is only embedded into the kmem_cache of
slab and slub allocators as defined in slab_def.h and slub_def.h and used
internally by mm code.  There is no needed to expose it in a public
header.  So move it from include/linux/slab.h to mm/slab.h.  It is just a
refactoring patch with no code change.

In fact both the slub_def.h and slab_def.h should be moved into the mm
directory as well, but that will probably cause many merge conflicts.

Link: http://lkml.kernel.org/r/20190718180827.18758-1-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Joseph Qi
963abb9aeb jbd2: remove jbd2_journal_inode_add_[write|wait]
Since ext4/ocfs2 are using jbd2_inode dirty range scoping APIs now,
jbd2_journal_inode_add_[write|wait] are not used any more, remove them.

Link: http://lkml.kernel.org/r/1562977611-8412-2-git-send-email-joseph.qi@linux.alibaba.com
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Acked-by: Changwei Ge <chge@linux.alibaba.com>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:07 -07:00
Arnd Bergmann
710ec38b0f mm: add dummy can_do_mlock() helper
On kernels without CONFIG_MMU, we get a link error for the siw driver:

drivers/infiniband/sw/siw/siw_mem.o: In function `siw_umem_get':
siw_mem.c:(.text+0x4c8): undefined reference to `can_do_mlock'

This is probably not the only driver that needs the function and could
otherwise build correctly without CONFIG_MMU, so add a dummy variant that
always returns false.

Link: http://lkml.kernel.org/r/20190909204201.931830-1-arnd@arndb.de
Fixes: 2251334dca ("rdma/siw: application buffer management")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Bernard Metzler <bmt@zurich.ibm.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:06 -07:00
Linus Torvalds
af5a7e99cc - First round of vmbus hibernation support from Dexuan Cui.
- Removal of dependencies on PAGE_SIZE by Maya Nakamura.
 - Moving the hyper-v tools/ code into the tools build system by Andy
 Shevchenko.
 - hyper-v balloon cleanups by Dexuan Cui.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE4n5dijQDou9mhzu83qZv95d3LNwFAl2JanAACgkQ3qZv95d3
 LNwf2Q//Rtmclmnk+lXn1BhEEiXtzliSY7wcjpRR87WCzTIj6p2y//R2PuweQr+b
 dAlXKd6reK/c2Q/FnaQ5Gf1daNWvMh/39viMaesGZcvoSWlT60gDPdnelj6Z8sO0
 gWDRQV5d4AHIs2garl2zTzO+TS9/8Ot/YD0gVpX4wNAy7j9ZeEvNVoanGvQ88Et3
 pSQFKTNLVPOLlMOchm6HAhwBo5k6Y1LB3/RE/qqcX1sR8/CLp4DT0VhsMVA1DZXV
 hb3a0tEzn8fJxifx8/iguZr84SetXA/qTKKWDG59xAU2kijJrLyb3KXRE92GOzlA
 HzwOlnX0vWpTTthEzaLlvOgFKybTNBGMEQQJKmpI2PucC0iaHmYVH2dDxhBb2gX5
 uJGGr4arHjMDQYfppCVy/VXE5hCpKE29L/7kl+DsElM6NkgyJAfK7Crpuxs8KMME
 HwHi5UwTSvaKv1XKilWIDy4PpuzvGx5ftPMyBqgEH/aLK9aP1N+folCTUc01qCFU
 vz/Yjrs/p/U7T9P4rDCXMb+IPiCpr1puBsC/z0RJvsKUdKrzDzpXPLU8Wagv6UxS
 iHpZRR/ArUYByRp3N42+PR8i9uqrcOxtNgzphnRsBo3lzOAphVaQY0tPQkBPSMp2
 SQI2NP1G74l3WdszeeHi446v6S40ichN/FYsDuiGCs9YJY78mMs=
 =Dk9i
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull Hyper-V updates from Sasha Levin:

 - first round of vmbus hibernation support (Dexuan Cui)

 - remove dependencies on PAGE_SIZE (Maya Nakamura)

 - move the hyper-v tools/ code into the tools build system (Andy
   Shevchenko)

 - hyper-v balloon cleanups (Dexuan Cui)

* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: Resume after fixing up old primary channels
  Drivers: hv: vmbus: Suspend after cleaning up hv_sock and sub channels
  Drivers: hv: vmbus: Clean up hv_sock channels by force upon suspend
  Drivers: hv: vmbus: Suspend/resume the vmbus itself for hibernation
  Drivers: hv: vmbus: Ignore the offers when resuming from hibernation
  Drivers: hv: vmbus: Implement suspend/resume for VSC drivers for hibernation
  Drivers: hv: vmbus: Add a helper function is_sub_channel()
  Drivers: hv: vmbus: Suspend/resume the synic for hibernation
  Drivers: hv: vmbus: Break out synic enable and disable operations
  HID: hv: Remove dependencies on PAGE_SIZE for ring buffer
  Tools: hv: move to tools buildsystem
  hv_balloon: Reorganize the probe function
  hv_balloon: Use a static page for the balloon_up send buffer
2019-09-24 12:36:31 -07:00
Aneesh Kumar K.V
cf387d9644 libnvdimm/altmap: Track namespace boundaries in altmap
With PFN_MODE_PMEM namespace, the memmap area is allocated from the device
area. Some architectures map the memmap area with large page size. On
architectures like ppc64, 16MB page for memap mapping can map 262144 pfns.
This maps a namespace size of 16G.

When populating memmap region with 16MB page from the device area,
make sure the allocated space is not used to map resources outside this
namespace. Such usage of device area will prevent a namespace destroy.

Add resource end pnf in altmap and use that to check if the memmap area
allocation can map pfn outside the namespace. On ppc64 in such case we fallback
to allocation from memory.

This fix kernel crash reported below:

[  132.034989] WARNING: CPU: 13 PID: 13719 at mm/memremap.c:133 devm_memremap_pages_release+0x2d8/0x2e0
[  133.464754] BUG: Unable to handle kernel data access at 0xc00c00010b204000
[  133.464760] Faulting instruction address: 0xc00000000007580c
[  133.464766] Oops: Kernel access of bad area, sig: 11 [#1]
[  133.464771] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
.....
[  133.464901] NIP [c00000000007580c] vmemmap_free+0x2ac/0x3d0
[  133.464906] LR [c0000000000757f8] vmemmap_free+0x298/0x3d0
[  133.464910] Call Trace:
[  133.464914] [c000007cbfd0f7b0] [c0000000000757f8] vmemmap_free+0x298/0x3d0 (unreliable)
[  133.464921] [c000007cbfd0f8d0] [c000000000370a44] section_deactivate+0x1a4/0x240
[  133.464928] [c000007cbfd0f980] [c000000000386270] __remove_pages+0x3a0/0x590
[  133.464935] [c000007cbfd0fa50] [c000000000074158] arch_remove_memory+0x88/0x160
[  133.464942] [c000007cbfd0fae0] [c0000000003be8c0] devm_memremap_pages_release+0x150/0x2e0
[  133.464949] [c000007cbfd0fb70] [c000000000738ea0] devm_action_release+0x30/0x50
[  133.464955] [c000007cbfd0fb90] [c00000000073a5a4] release_nodes+0x344/0x400
[  133.464961] [c000007cbfd0fc40] [c00000000073378c] device_release_driver_internal+0x15c/0x250
[  133.464968] [c000007cbfd0fc80] [c00000000072fd14] unbind_store+0x104/0x110
[  133.464973] [c000007cbfd0fcd0] [c00000000072ee24] drv_attr_store+0x44/0x70
[  133.464981] [c000007cbfd0fcf0] [c0000000004a32bc] sysfs_kf_write+0x6c/0xa0
[  133.464987] [c000007cbfd0fd10] [c0000000004a1dfc] kernfs_fop_write+0x17c/0x250
[  133.464993] [c000007cbfd0fd60] [c0000000003c348c] __vfs_write+0x3c/0x70
[  133.464999] [c000007cbfd0fd80] [c0000000003c75d0] vfs_write+0xd0/0x250

djbw: Aneesh notes that this crash can likely be triggered in any kernel that
supports 'papr_scm', so flagging that commit for -stable consideration.

Fixes: b5beae5e22 ("powerpc/pseries: Add driver for PAPR SCM regions")
Cc: <stable@vger.kernel.org>
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Pankaj Gupta <pagupta@redhat.com>
Tested-by: Santosh Sivaraj <santosh@fossix.org>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Link: https://lore.kernel.org/r/20190910062826.10041-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-09-24 10:24:12 -07:00
Aneesh Kumar K.V
f537669978 libnvdimm/dax: Pick the right alignment default when creating dax devices
Allow arch to provide the supported alignments and use hugepage alignment only
if we support hugepage. Right now we depend on compile time configs whereas this
patch switch this to runtime discovery.

Architectures like ppc64 can have THP enabled in code, but then can have
hugepage size disabled by the hypervisor. This allows us to create dax devices
with PAGE_SIZE alignment in this case.

Existing dax namespace with alignment larger than PAGE_SIZE will fail to
initialize in this specific case. We still allow fsdax namespace initialization.

With respect to identifying whether to enable hugepage fault for a dax device,
if THP is enabled during compile, we default to taking hugepage fault and in dax
fault handler if we find the fault size > alignment we retry with PAGE_SIZE
fault size.

This also addresses the below failure scenario on ppc64

ndctl create-namespace --mode=devdax  | grep align
 "align":16777216,
 "align":16777216

cat /sys/devices/ndbus0/region0/dax0.0/supported_alignments
 65536 16777216

daxio.static-debug  -z -o /dev/dax0.0
  Bus error (core dumped)

  $ dmesg | tail
   lpar: Failed hash pte insert with error -4
   hash-mmu: mm: Hashing failure ! EA=0x7fff17000000 access=0x8000000000000006 current=daxio
   hash-mmu:     trap=0x300 vsid=0x22cb7a3 ssize=1 base psize=2 psize 10 pte=0xc000000501002b86
   daxio[3860]: bus error (7) at 7fff17000000 nip 7fff973c007c lr 7fff973bff34 code 2 in libpmem.so.1.0.0[7fff973b0000+20000]
   daxio[3860]: code: 792945e4 7d494b78 e95f0098 7d494b78 f93f00a0 4800012c e93f0088 f93f0120
   daxio[3860]: code: e93f00a0 f93f0128 e93f0120 e95f0128 <f9490000> e93f0088 39290008 f93f0110

The failure was due to guest kernel using wrong page size.

The namespaces created with 16M alignment will appear as below on a config with
16M page size disabled.

$ ndctl list -Ni
[
  {
    "dev":"namespace0.1",
    "mode":"fsdax",
    "map":"dev",
    "size":5351931904,
    "uuid":"fc6e9667-461a-4718-82b4-69b24570bddb",
    "align":16777216,
    "blockdev":"pmem0.1",
    "supported_alignments":[
      65536
    ]
  },
  {
    "dev":"namespace0.0",
    "mode":"fsdax",    <==== devdax 16M alignment marked disabled.
    "map":"mem",
    "size":5368709120,
    "uuid":"a4bdf81a-f2ee-4bc6-91db-7b87eddd0484",
    "state":"disabled"
  }
]

Cc: linux-mm@kvack.org
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/20190905154603.10349-8-aneesh.kumar@linux.ibm.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-09-24 10:23:41 -07:00
Vitaly Kuznetsov
e1572f1d08 cpu/SMT: create and export cpu_smt_possible()
KVM needs to know if SMT is theoretically possible, this means it is
supported and not forcefully disabled ('nosmt=force'). Create and
export cpu_smt_possible() answering this question.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-24 13:37:28 +02:00
Yevgeny Kliteynik
d32d7c52e0 net/mlx5: DR, Fix SW steering HW bits and definitions
Fix wrong reserved bits offsets.

Fixes: 97b5484ed6 ("net/mlx5: Add HW bits and definitions required for SW steering")
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-24 12:38:06 +03:00
Linus Torvalds
4c07e2ddab - New Drivers
- Add support for Merrifield Basin Cove PMIC
 
  - New Device Support
    - Add support for Intel Tiger Lake to Intel LPSS PCI
    - Add support for Intel Sky Lake to Intel LPSS PCI
    - Add support for ST-Ericsson DB8520 to DB8500 PRCMU
 
  - New Functionality
    - Add RTC and PWRC support to MT6323
 
  - Fix-ups
    - Clean-up include files; davinci_voicecodec, asic3, sm501, mt6397
    - Ignore return values from debugfs_create*(); ab3100-*, ab8500-debugfs, aat2870-core
    - Device Tree changes; rn5t618, mt6397
    - Use new I2C API; tps80031, 88pm860x-core, ab3100-core, bcm590xx,
                       da9150-core, max14577, max77693, max77843, max8907,
                       max8925-i2c, max8997, max8998, palmas, twl-core,
    - Remove obsolete code; da9063, jz4740-adc
    - Simplify semantics; timberdale, htc-i2cpld
    - Add 'fall-through' tags; omap-usb-host, db8500-prcmu
    - Remove superfluous prints; ab8500-debugfs, db8500-prcmu, fsl-imx25-tsadc,
                                 intel_soc_pmic_bxtwc, qcom_rpm, sm501
    - Trivial rename/whitespace/typo fixes; mt6397-core, MAINTAINERS
    - Reorganise code structure; mt6397-*
    - Improve code consistency; intel-lpss
    - Use MODULE_SOFTDEP() helper; intel-lpss
    - Use DEFINE_RES_*() helpers; mt6397-core
 
  - Bug Fixes
    - Clean-up resources; max77620
    - Prevent input events being dropped on resume; intel-lpss-pci
    - Prevent sleeping in IRQ context; ezx-pcap
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEdrbJNaO+IJqU8IdIUa+KL4f8d2EFAl2JUDEACgkQUa+KL4f8
 d2HKPQ//TLMO2Ed7AdhajgpYQVRQ2wM5VGOQLaTj/sO0WTjk3DfPdmp13ZZh9UTN
 evfYiVJI2ofpcliK6Tg8X27oaNg/YK+ff562lyFzMNIQMJ7hpzjvyaAfGhMz4y7I
 tN8RgMNHSU5luywJvJ43OH6K5C+ckg+kPp4Ywvnh3Hwqm3y7QJ9/CshTON69BOCf
 kYvE9dvAqubbTl80aMOqNbl/p6+h/C9sFhvGsC7LnWdE9vev/SE1kYmjm1ha5Z1N
 hRbnAw1nXidV1mB6oIhzvsx3MgpvBjvi6UXg8TCWuZTUlwMplgAfsDKWb6J9czxB
 lhP0W5/wuJPIkdZPRSEGVJ6MCsxpF5GcGknWTf1dL2hsx0kh1JsT/AUQeHR02NpQ
 pnGF7YDgQugHnkHqr+JfbswlIunh5U3ZxspvHQWTxzaVQvQ0eGk91Kqr5zKhdHf9
 z5e1j/VtjzAsYqGkamAjXCXPES1PIvwqKT6qLrERAoFBgigwP/i4nxSlTJGvqTVY
 M3+RH7I7IWfGCuJ227PWpkvFgzX7N+okKxujOWo+Fd4DH6lhMDstnrIoVZxpEfE+
 cQLo5soG7mJtk9meGFTrxkfT3IqcVWMG0vCXeaureNS1N+OjtZLRDN7ZuG2d8zFx
 nRCi+tJoAZe34DenfiSe+VcczlOOtqR2JpZKAdwW3ZT3Kcfl9U4=
 =03Zw
 -----END PGP SIGNATURE-----

Merge tag 'mfd-next-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
 "New Drivers:
   - Add support for Merrifield Basin Cove PMIC

  New Device Support:
   - Add support for Intel Tiger Lake to Intel LPSS PCI
   - Add support for Intel Sky Lake to Intel LPSS PCI
   - Add support for ST-Ericsson DB8520 to DB8500 PRCMU

  New Functionality:
   - Add RTC and PWRC support to MT6323

  Fix-ups:
   - Clean-up include files; davinci_voicecodec, asic3, sm501, mt6397
   - Ignore return values from debugfs_create*(); ab3100-*, ab8500-debugfs, aat2870-core
   - Device Tree changes; rn5t618, mt6397
   - Use new I2C API; tps80031, 88pm860x-core, ab3100-core, bcm590xx,
                      da9150-core, max14577, max77693, max77843, max8907,
                      max8925-i2c, max8997, max8998, palmas, twl-core,
   - Remove obsolete code; da9063, jz4740-adc
   - Simplify semantics; timberdale, htc-i2cpld
   - Add 'fall-through' tags; omap-usb-host, db8500-prcmu
   - Remove superfluous prints; ab8500-debugfs, db8500-prcmu, fsl-imx25-tsadc,
                                intel_soc_pmic_bxtwc, qcom_rpm, sm501
   - Trivial rename/whitespace/typo fixes; mt6397-core, MAINTAINERS
   - Reorganise code structure; mt6397-*
   - Improve code consistency; intel-lpss
   - Use MODULE_SOFTDEP() helper; intel-lpss
   - Use DEFINE_RES_*() helpers; mt6397-core

  Bug Fixes:
   - Clean-up resources; max77620
   - Prevent input events being dropped on resume; intel-lpss-pci
   - Prevent sleeping in IRQ context; ezx-pcap"

* tag 'mfd-next-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (48 commits)
  mfd: mt6323: Add MT6323 RTC and PWRC
  mfd: mt6323: Replace boilerplate resource code with DEFINE_RES_* macros
  mfd: mt6397: Add mutex include
  dt-bindings: mfd: mediatek: Add MT6323 Power Controller
  dt-bindings: mfd: mediatek: Update RTC to include MT6323
  dt-bindings: mfd: mediatek: mt6397: Change to relative paths
  mfd: db8500-prcmu: Support the higher DB8520 ARMSS
  mfd: intel-lpss: Use MODULE_SOFTDEP() instead of implicit request
  mfd: htc-i2cpld: Drop check because i2c_unregister_device() is NULL safe
  mfd: sm501: Include the GPIO driver header
  mfd: intel-lpss: Add Intel Skylake ACPI IDs
  mfd: intel-lpss: Consistently use GENMASK()
  mfd: Add support for Merrifield Basin Cove PMIC
  mfd: ezx-pcap: Replace mutex_lock with spin_lock
  mfd: asic3: Include the right header
  MAINTAINERS: altera-sysmgr: Fix typo in a filepath
  mfd: mt6397: Extract IRQ related code from core driver
  mfd: mt6397: Rename macros to something more readable
  mfd: Remove dev_err() usage after platform_get_irq()
  mfd: db8500-prcmu: Mark expected switch fall-throughs
  ...
2019-09-23 19:37:49 -07:00
Linus Torvalds
d0b3cfee33 - Core Frameworks
- Obtain scale type through sysfs
 
  - New Functionality
    - Provide Device Tree functionality; rave-sp-backlight
    - Calculate if scale type is (non-)linear; pwm_bl
 
  - Fix-ups
    - Simplify code; lm3630a_bl
    - Trivial rename/whitespace/typo fixes; lms283gf05
    - Remove superfluous NULL check; tosa_lcd
    - Fix power state initialisation; gpio_backlight
    - List supported file; MAINTAINERS
 
  - Bug Fixes
    - Kconfig - default to not building unless requested; {LED,BACKLIGHT}_CLASS_DEVICE
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEdrbJNaO+IJqU8IdIUa+KL4f8d2EFAl2JTwMACgkQUa+KL4f8
 d2EC7g/+Pliuvx8dgW+MPX8K6cnap+m7xV8Z9OyyEYHroz+MX7TsBGmJeChoOz9K
 RoFCap5VCtEEyiobDbe8HKpxivY88CdweegYxxBx+Uo9qJRogDv93kqVz3BmHxBm
 v2ShTY+8r2nf6YNN30FXZJSM4WkFqAg2MvM6LXP3R4rPSQ/u2Xp6L5VnzVETl23b
 /ZsP71axvwDNpi437ekUSoraUl9GZnHaINc7/0R/SnRPvZ2aakko7IRNYU0z+NQ+
 unImiBpFjuC1UXDaebd2Bt9ayrM47QkBGLu6LgOTUSg9tFlosvF6Z3TDLo3L2ptD
 Td5SWuTXALFaOv1xmMaylcsA/36D/KRhwnmQMVn/Ld5c/M+l4vOm4YDNgS1G8JY4
 i7KzoD7Z2yhyI3OHYoBGXI8+fmj2qQs+VGThtdhb96LlaHVUibfSlA0i5OjeMKO+
 ddiTy9ksiH+pBcn0YPvHJRNOI0uVmqBMSPkJ1Bdwz6HS58QcL3LvBzHtMVNoGkCK
 iWPPn90hveZmbXAxFIb8a2KRjFz531sReSZh+9dIZ/zOOF5OdyOotMk1zacWCUrt
 Jre1paR4W1VPXf4cE3IcpE6dJeA9DN9niz+9bsMgbr6cYG6xhB7xELrTUCXZn5CR
 C9tNMWuiqUlv0NZHb7CsifQW1bwzw8hTPhDk9VpBb0cFCucLYV0=
 =DSnC
 -----END PGP SIGNATURE-----

Merge tag 'backlight-next-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight

Pull backlight updates from Lee Jones:
 "Core Frameworks
   - Obtain scale type through sysfs

  New Functionality:
   - Provide Device Tree functionality in rave-sp-backlight
   - Calculate if scale type is (non-)linear in pwm_bl

  Fix-ups:
   - Simplify code in lm3630a_bl
   - Trivial rename/whitespace/typo fixes in lms283gf05
   - Remove superfluous NULL check in tosa_lcd
   - Fix power state initialisation in gpio_backlight
   - List supported file in MAINTAINERS

  Bug Fixes:
   - Kconfig - default to not building unless requested in
     {LED,BACKLIGHT}_CLASS_DEVICE"

* tag 'backlight-next-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  backlight: pwm_bl: Set scale type for brightness curves specified in the DT
  backlight: pwm_bl: Set scale type for CIE 1931 curves
  backlight: Expose brightness curve type through sysfs
  MAINTAINERS: Add entry for stable backlight sysfs ABI documentation
  backlight: gpio-backlight: Correct initial power state handling
  video: backlight: tosa_lcd: drop check because i2c_unregister_device() is NULL safe
  video: backlight: Drop default m for {LCD,BACKLIGHT_CLASS_DEVICE}
  backlight: lms283gf05: Fix a typo in the description passed to 'devm_gpio_request_one()'
  backlight: lm3630a: Switch to use fwnode_property_count_uXX()
  backlight: rave-sp: Leave initial state and register with correct device
2019-09-23 19:33:56 -07:00
Linus Torvalds
299d14d4c3 pci-v5.4-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl2JNVAUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vyTOA/9EZeyS7J+ZcOwihWz5vNijf0kfpKp
 /jZ9VF9nHjsL9Pw3/Fzha605Ssrtwcqge8g/sze9f0g/pxZk99lLHokE6dEOurEA
 GyKpNNMdiBol4YZMCsSoYji0MpwW0uMCuASPMiEwv2LxZ72A2Tu1RbgYLU+n4m1T
 fQldDTxsUMXc/OH/8SL8QDEh6o8qyDRhmSXFAOv8RGqN8N3iUwVwhQobKpwpmEvx
 ddzqWMS8f91qkhIKO7fgc9P4NI/7yI7kkF+wcdwtfiMO8Qkr4IdcdF7qwNVAtpKA
 A+sMRi59i2XxDTqRFx+wXXMa+rt+Pf1pucv77SO74xXWwpuXSxLVDYjULP1YQugK
 FTBo4SNmico/ts+n5cgm+CGMq2P2E29VYeqkI1Un6eDDvQnQlBgQdpdcBoadJ0rW
 y31OInjhRJC1ZK5bATKfCMbmB+VQxFsbyeUA7PBlrALyAmXZfw30iNxX9iHBhWqc
 myPNVEJJGp0cWTxGxMAU9MhelzeQxDAd+Eb44J5gv51bx0w9yqmZHECSDrOVdtYi
 HpOyI7E3Cb8m23BOHvCdB/v8igaYMZl08LUUJqu1S9mFclYyYVuOOIB04Yc2Qrx1
 3PHtT8TC47FbWuzKwo12RflzoAiNShJGw+tNKo6T1jC+r5jdbKWWtTnsoRqbSfaG
 rG5RJpB7EuQSP1Y=
 =/xB3
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Consolidate _HPP/_HPX stuff in pci-acpi.c and simplify it
     (Krzysztof Wilczynski)

   - Fix incorrect PCIe device types and remove dev->has_secondary_link
     to simplify code that deals with upstream/downstream ports (Mika
     Westerberg)

   - After suspend, restore Resizable BAR size bits correctly for 1MB
     BARs (Sumit Saxena)

   - Enable PCI_MSI_IRQ_DOMAIN support for RISC-V (Wesley Terpstra)

  Virtualization:

   - Add ACS quirks for iProc PAXB (Abhinav Ratna), Amazon Annapurna
     Labs (Ali Saidi)

   - Move sysfs SR-IOV functions to iov.c (Kelsey Skunberg)

   - Remove group write permissions from sysfs sriov_numvfs,
     sriov_drivers_autoprobe (Kelsey Skunberg)

  Hotplug:

   - Simplify pciehp indicator control (Denis Efremov)

  Peer-to-peer DMA:

   - Allow P2P DMA between root ports for whitelisted bridges (Logan
     Gunthorpe)

   - Whitelist some Intel host bridges for P2P DMA (Logan Gunthorpe)

   - DMA map P2P DMA requests that traverse host bridge (Logan
     Gunthorpe)

  Amazon Annapurna Labs host bridge driver:

   - Add DT binding and controller driver (Jonathan Chocron)

  Hyper-V host bridge driver:

   - Fix hv_pci_dev->pci_slot use-after-free (Dexuan Cui)

   - Fix PCI domain number collisions (Haiyang Zhang)

   - Use instance ID bytes 4 & 5 as PCI domain numbers (Haiyang Zhang)

   - Fix build errors on non-SYSFS config (Randy Dunlap)

  i.MX6 host bridge driver:

   - Limit DBI register length (Stefan Agner)

  Intel VMD host bridge driver:

   - Fix config addressing issues (Jon Derrick)

  Layerscape host bridge driver:

   - Add bar_fixed_64bit property to endpoint driver (Xiaowei Bao)

   - Add CONFIG_PCI_LAYERSCAPE_EP to build EP/RC drivers separately
     (Xiaowei Bao)

  Mediatek host bridge driver:

   - Add MT7629 controller support (Jianjun Wang)

  Mobiveil host bridge driver:

   - Fix CPU base address setup (Hou Zhiqiang)

   - Make "num-lanes" property optional (Hou Zhiqiang)

  Tegra host bridge driver:

   - Fix OF node reference leak (Nishka Dasgupta)

   - Disable MSI for root ports to work around design problem (Vidya
     Sagar)

   - Add Tegra194 DT binding and controller support (Vidya Sagar)

   - Add support for sideband pins and slot regulators (Vidya Sagar)

   - Add PIPE2UPHY support (Vidya Sagar)

  Misc:

   - Remove unused pci_block_cfg_access() et al (Kelsey Skunberg)

   - Unexport pci_bus_get(), etc (Kelsey Skunberg)

   - Hide PM, VC, link speed, ATS, ECRC, PTM constants and interfaces in
     the PCI core (Kelsey Skunberg)

   - Clean up sysfs DEVICE_ATTR() usage (Kelsey Skunberg)

   - Mark expected switch fall-through (Gustavo A. R. Silva)

   - Propagate errors for optional regulators and PHYs (Thierry Reding)

   - Fix kernel command line resource_alignment parameter issues (Logan
     Gunthorpe)"

* tag 'pci-v5.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (112 commits)
  PCI: Add pci_irq_vector() and other stubs when !CONFIG_PCI
  arm64: tegra: Add PCIe slot supply information in p2972-0000 platform
  arm64: tegra: Add configuration for PCIe C5 sideband signals
  PCI: tegra: Add support to enable slot regulators
  PCI: tegra: Add support to configure sideband pins
  PCI: vmd: Fix shadow offsets to reflect spec changes
  PCI: vmd: Fix config addressing when using bus offsets
  PCI: dwc: Add validation that PCIe core is set to correct mode
  PCI: dwc: al: Add Amazon Annapurna Labs PCIe controller driver
  dt-bindings: PCI: Add Amazon's Annapurna Labs PCIe host bridge binding
  PCI: Add quirk to disable MSI-X support for Amazon's Annapurna Labs Root Port
  PCI/VPD: Prevent VPD access for Amazon's Annapurna Labs Root Port
  PCI: Add ACS quirk for Amazon Annapurna Labs root ports
  PCI: Add Amazon's Annapurna Labs vendor ID
  MAINTAINERS: Add PCI native host/endpoint controllers designated reviewer
  PCI: hv: Use bytes 4 and 5 from instance ID as the PCI domain numbers
  dt-bindings: PCI: tegra: Add PCIe slot supplies regulator entries
  dt-bindings: PCI: tegra: Add sideband pins configuration entries
  PCI: tegra: Add Tegra194 PCIe support
  PCI: Get rid of dev->has_secondary_link flag
  ...
2019-09-23 19:16:01 -07:00