Commit graph

781637 commits

Author SHA1 Message Date
Julia Lawall
ce1d6f22fa Input: elan_i2c_smbus - cast sizeof to int for comparison
Comparing an int to a size, which is unsigned, causes the int to become
unsigned, giving the wrong result.  i2c_smbus_read_block_data can return the
result of i2c_smbus_xfer, whih can return a negative error code.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
int x;
expression e,e1;
identifier f;
@@

*x = f(...);
... when != x = e1
    when != if (x < 0 || ...) { ... return ...; }
*x < sizeof(e)
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2018-08-01 16:05:55 -07:00
Christoph Hellwig
1572497cb0 kconfig: include common Kconfig files from top-level Kconfig
Instead of duplicating the source statements in every architecture just
do it once in the toplevel Kconfig file.

Note that with this the inclusion of arch/$(SRCARCH/Kconfig moves out of
the top-level Kconfig into arch/Kconfig so that don't violate ordering
constraits while keeping a sensible menu structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-08-02 08:03:23 +09:00
Christoph Hellwig
17c46a6aff kconfig: remove duplicate SWAP symbol defintions
microblaze and nios2 define their own always n SWAP symbols.  Remove those
and let the generic defintion do the right thing by adding a new symbol
to disable swap entirely.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-08-02 08:03:23 +09:00
Christoph Hellwig
9bea18010f um: create a proper drivers Kconfig
Merge arch/um/Kconfig.char and arch/um/Kconfig.net into a new
arch/um/drivers/Kconfig.  This fits the way Kconfig files are placed
elsewhere in the kernel tree.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-08-02 08:03:23 +09:00
Christoph Hellwig
f163977d21 um: cleanup Kconfig files
We can handle all not architecture specific UM configuration directly in
the newly added arch/um/Kconfig.  Do so by merging the Kconfig.common,
Kconfig.rest and Kconfig.um files into arch/um/Kconfig, and move the main
UML menu as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-08-02 08:03:23 +09:00
Christoph Hellwig
79b05c1f31 um: stop abusing KBUILD_KCONFIG
Instead create a arch/um/Kconfig file that just includes the actual
per-arch Kconfig file.  Note that we use HEADER_ARCH to find the
per-arch Kconfig file as that variable already includes the
normalization from i386 or x86_64 to x86.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-08-02 08:03:23 +09:00
Andy Shevchenko
c42b65e363 bitmap: Add bitmap_alloc(), bitmap_zalloc() and bitmap_free()
A lot of code become ugly because of open coding allocations for bitmaps.

Introduce three helpers to allow users be more clear of intention
and keep their code neat.

Note, due to multiple circular dependencies we may not provide
the helpers as inliners. For now we keep them exported and, perhaps,
at some point in the future we will sort out header inclusion and
inheritance.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2018-08-01 15:49:40 -07:00
Andy Shevchenko
e64e4018d5 md: Avoid namespace collision with bitmap API
bitmap API (include/linux/bitmap.h) has 'bitmap' prefix for its methods.

On the other hand MD bitmap API is special case.
Adding 'md' prefix to it to avoid name space collision.

No functional changes intended.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Shaohua Li <shli@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2018-08-01 15:49:39 -07:00
Andy Shevchenko
5cc9cdf631 dm: Avoid namespace collision with bitmap API
bitmap API (include/linux/bitmap.h) has 'bitmap' prefix for its methods.

On the other hand DM bitmap API is special case.
Adding 'dm' prefix to it to avoid potential name space collision.

No functional changes intended.

Suggested-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2018-08-01 15:49:38 -07:00
Huang Rui
df36b2fb83 drm/ttm: clean up non-x86 definitions on ttm_tt
All non-x86 definitions are moved to ttm_set_memory header, so remove it from
ttm_tt.c.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-01 17:23:56 -05:00
Huang Rui
dceb219fc6 drm/ttm: Add ttm_set_pages_wc and ttm_set_pages_uc helper
These two helpers will be used on set page caching.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-01 17:23:05 -05:00
Huang Rui
fe710322b8 drm/ttm: fix missed conversion of set_pages_array_uc
This patch fixed the error when do not configure CONFIG_X86, otherwise, below
error will be encountered.

All errors (new ones prefixed by >>):

   drivers/gpu/drm/ttm/ttm_page_alloc_dma.c: In function 'ttm_set_pages_caching':
>> drivers/gpu/drm/ttm/ttm_page_alloc_dma.c:272:7: error: implicit declaration of function 'set_pages_array_uc'; did you mean
+'ttm_set_pages_array_uc'? [-Werror=implicit-function-declaration]
      r = set_pages_array_uc(pages, cpages);
          ^~~~~~~~~~~~~~~~~~
          ttm_set_pages_array_uc
   cc1: some warnings being treated as errors

Reported-by: kbuild test robot <lkp@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-01 17:22:20 -05:00
Linus Torvalds
6b47037682 Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fix from Russell King:
 "Just a single fix this time around for recent binutils causing build
  problems when generating Thumb-2 code"

* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8781/1: Fix Thumb-2 syscall return for binutils 2.29+
2018-08-01 15:01:26 -07:00
Denis Drozdov
75da96067a IB/IPoIB: Set ah valid flag in multicast send flow
The change of ipoib_ah data structure with adding "valid" flag and
checks of ah->valid in ipoib_start_xmit affected multicast packet flow.

Since the multicast flow doesn't invoke path_rec_start, "ah->valid" flag
remains unset, so that ipoib_start_xmit end up with neigh_refresh_path
instead of sending the packet using neigh.

"ah->valid" has to be set in multicast send flow. As a result IPoIB
starts sending packets via neigh immediately and eliminates 60sec delay
of neigh keep alive interval.

The typical example of this issue are two sequential arpings:

arping 11.134.208.9 -> got response (mcast_send)
arping 11.134.208.9 -> no response  (ah->valid = 0)

Fixes: fa9391dbad ("RDMA/ipoib: Update paths on CLIENT_REREG/SM_CHANGE events")
Signed-off-by: Denis Drozdov <denisd@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-01 15:23:03 -06:00
Dan Carpenter
19da44cd33 pinctrl: freescale: off by one in imx1_pinconf_group_dbg_show()
The info->groups[] array is allocated in imx1_pinctrl_parse_dt().  It
has info->ngroups elements.  Thus the > here should be >= to prevent
reading one element beyond the end of the array.

Cc: stable@vger.kernel.org
Fixes: 30612cd900 ("pinctrl: imx1 core driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Uwe Kleine-König <u.kleine-könig@pengutronix.de>
Acked-by: Dong Aisheng <Aisheng.dong@nxp.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2018-08-01 23:01:07 +02:00
Jason Gunthorpe
0f50d88a6e IB/uverbs: Allow all DESTROY commands to succeed after disassociate
The disassociate function was broken by design because it failed all
commands. This prevents userspace from calling destroy on a uobject after
it has detected a device fatal error and thus reclaiming the resources in
userspace is prevented.

This fix is now straightforward, when anything destroys a uobject that is
not the user the object remains on the IDR with a NULL context and object
pointer. All lookup locking modes other than DESTROY will fail. When the
user ultimately calls the destroy function it is simply dropped from the
IDR while any related information is returned.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-01 14:55:48 -06:00
Jason Gunthorpe
a9b66d6453 IB/uverbs: Do not block disassociate during write()
Now that all the callbacks are safe to run concurrently with
disassociation this test can be eliminated. The ufile core infrastructure
becomes entirely self contained and is not sensitive to disassociation.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-01 14:55:48 -06:00
Jason Gunthorpe
e83f0ecdc4 IB/uverbs: Do not pass struct ib_device to the ioctl methods
This does the same as the patch before, except for ioctl. The rules are
the same, but for the ioctl methods the core code handles setting up the
uobject.

- Retrieve the ib_dev from the uobject->context->device. This is
  safe under ioctl as the core has already done rdma_alloc_begin_uobject
  and so CREATE calls are entirely protected by the rwsem.
- Retrieve the ib_dev from uobject->object
- Call ib_uverbs_get_ucontext()

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-01 14:55:48 -06:00
Jason Gunthorpe
bbd51e881f IB/uverbs: Do not pass struct ib_device to the write based methods
This is a step to get rid of the global check for disassociation. In this
model, the ib_dev is not proven to be valid by the core code and cannot be
provided to the method. Instead, every method decides if it is able to
run after disassociation and obtains the ib_dev using one of three
different approaches:

- Call srcu_dereference on the udevice's ib_dev. As before, this means
  the method cannot be called after disassociation begins.
  (eg alloc ucontext)
- Retrieve the ib_dev from the ucontext, via ib_uverbs_get_ucontext()
- Retrieve the ib_dev from the uobject->object after checking
  under SRCU if disassociation has started (eg uobj_get)

Largely, the code is all ready for this, the main work is to provide a
ib_dev after calling uobj_alloc(). The few other places simply use
ib_uverbs_get_ucontext() to get the ib_dev.

This flexibility will let the next patches allow destroy to operate
after disassociation.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-01 14:55:48 -06:00
Jason Gunthorpe
cc2e14e680 IB/uverbs: Lower the test for ongoing disassociation
Commands that are reading/writing to objects can test for an ongoing
disassociation during their initial call to rdma_lookup_get_uobject.  This
directly prevents all of these commands from conflicting with an ongoing
disassociation.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-01 14:55:48 -06:00
Jason Gunthorpe
1e857e65d4 IB/uverbs: Allow uobject allocation to work concurrently with disassociate
After all the recent structural changes this is now straightforward, hold
the hw_destroy_rwsem across the entire uobject creation. We already take
this semaphore on the success path, so holding it a bit longer is not
going to change the performance.

After this change none of the create callbacks require the
disassociate_srcu lock to be correct.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-01 14:55:48 -06:00
Jason Gunthorpe
7452a3c745 IB/uverbs: Allow RDMA_REMOVE_DESTROY to work concurrently with disassociate
After all the recent structural changes this is now straightfoward, hoist
the hw_destroy_rwsem up out of rdma_destroy_explicit and wrap it around
the uobject write lock as well as the destroy.

This is necessary as obtaining a write lock concurrently with
uverbs_destroy_ufile_hw() will cause malfunction.

After this change none of the destroy callbacks require the
disassociate_srcu lock to be correct.

This requires introducing a new lookup mode, UVERBS_LOOKUP_DESTROY as the
IOCTL interface needs to hold an unlocked kref until all command
verification is completed.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-01 14:55:48 -06:00
Jason Gunthorpe
9867f5c669 IB/uverbs: Convert 'bool exclusive' into an enum
This is more readable, and future patches will need a 3rd lookup type.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-01 14:55:48 -06:00
Jason Gunthorpe
87ad80abc7 IB/uverbs: Consolidate uobject destruction
There are several flows that can destroy a uobject and each one is
minimized and sprinkled throughout the code base, making it difficult to
understand and very hard to modify the destroy path.

Consolidate all of these into uverbs_destroy_uobject() and call it in all
cases where a uobject has to be destroyed.

This makes one change to the lifecycle, during any abort (eg when
alloc_commit is not called) we always call out to alloc_abort, even if
remove_commit needs to be called to delete a HW object.

This also renames RDMA_REMOVE_DURING_CLEANUP to RDMA_REMOVE_ABORT to
clarify its actual usage and revises some of the comments to reflect what
the life cycle is for the type implementation.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-01 14:55:48 -06:00
Jason Gunthorpe
32ed5c00ac IB/uverbs: Make the write path destroy methods use the same flow as ioctl
The ridiculous dance with uobj_remove_commit() is not needed, the write
path can follow the same flow as ioctl - lock and destroy the HW object
then use the data left over in the uobject to form the response to
userspace.

Two helpers are introduced to make this flow straightforward for the
caller.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-01 14:55:48 -06:00
Jason Gunthorpe
aa72c9a5f9 IB/uverbs: Remove rdma_explicit_destroy() from the ioctl methods
The core code will destroy the HW object on behalf of the method, if the
method provides an implementation it must simply copy data from the stub
uobj into the response. Destroy methods cannot touch the HW object.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-01 14:55:37 -06:00
Takashi Iwai
93ce1b1296 ALSA: seq: Drop unused 64bit division macros
The old ugly macros remained in the code without usage.
Rip them off.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-08-01 22:54:37 +02:00
Takashi Iwai
04702e8d00 ALSA: seq: Use no intrruptible mutex_lock
All usages of mutex in ALSA sequencer core would take too long, hence
we don't have to care about the user interruption that makes things
complicated.  Let's replace them with simpler mutex_lock().

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-08-01 22:54:36 +02:00
Takashi Iwai
00976ad527 ALSA: seq: Fix leftovers at probe error path
The sequencer core module doesn't call some destructors in the error
path of the init code, which may leave some resources.

This patch mainly fix these leaks by calling the destructors
appropriately at alsa_seq_init().  Also the patch brings a few
cleanups along with it, namely:

- Expand the old "if ((err = xxx) < 0)" coding style
- Get rid of empty seq_queue_init() and its caller
- Change snd_seq_info_done() to void

Last but not least, a couple of functions lose __exit annotation since
they are called also in alsa_seq_init().

No functional changes but minor code cleanups.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-08-01 22:54:36 +02:00
Takashi Iwai
fc4bfd9a35 ALSA: seq: Remove dead codes
There are a few functions that have been commented out for ages.
And also there are functions that do nothing but placeholders.
Let's kill them.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-08-01 22:54:35 +02:00
Takashi Iwai
ef965ad5a7 ALSA: seq: Minor cleanup of MIDI event parser helpers
snd_midi_event_encode_byte() can never fail, and it can return rather
true/false.  Change the return type to bool, adjust the argument to
receive a MIDI byte as unsigned char, and adjust the comment
accordingly.  This allows callers to drop error checks, which
simplifies the code.

Meanwhile, snd_midi_event_encode() helper is used only in seq_midi.c,
and it can be better folded into it.  This will reduce the total
amount of lines in the end.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-08-01 22:54:35 +02:00
Vincent Bernat
db57dc7c7a net: don't declare IPv6 non-local bind helper if CONFIG_IPV6 undefined
Fixes: 83ba464515 ("net: add helpers checking if socket can be bound to nonlocal address")
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01 13:45:31 -07:00
Linus Torvalds
8b11ec1b5f mm: do not initialize TLB stack vma's with vma_init()
Commit 2c4541e24c ("mm: use vma_init() to initialize VMAs on stack and
data segments") tried to initialize various left-over ad-hoc vma's
"properly", but actually made things worse for the temporary vma's used
for TLB flushing.

vma_init() doesn't actually initialize all of the vma, just a few
fields, so doing something like

   -       struct vm_area_struct vma = { .vm_mm = tlb->mm, };
   +       struct vm_area_struct vma;
   +
   +       vma_init(&vma, tlb->mm);

was actually very bad: instead of having a nicely initialized vma with
every field but "vm_mm" zeroed, you'd have an entirely uninitialized vma
with only a couple of fields initialized.  And they weren't even fields
that the code in question mostly cared about.

The flush_tlb_range() function takes a "struct vma" rather than a
"struct mm_struct", because a few architectures actually care about what
kind of range it is - being able to only do an ITLB flush if it's a
range that doesn't have data accesses enabled, for example.  And all the
normal users already have the vma for doing the range invalidation.

But a few people want to call flush_tlb_range() with a range they just
made up, so they also end up using a made-up vma.  x86 just has a
special "flush_tlb_mm_range()" function for this, but other
architectures (arm and ia64) do the "use fake vma" thing instead, and
thus got caught up in the vma_init() changes.

At the same time, the TLB flushing code really doesn't care about most
other fields in the vma, so vma_init() is just unnecessary and
pointless.

This fixes things by having an explicit "this is just an initializer for
the TLB flush" initializer macro, which is used by the arm/arm64/ia64
people who mis-use this interface with just a dummy vma.

Fixes: 2c4541e24c ("mm: use vma_init() to initialize VMAs on stack and data segments")
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-01 13:43:38 -07:00
Paul Burton
48ae93fdd1
MIPS: Delete unused code in linux32.c
The A() & AA() macros have been unused since commit 05e4396651
("[MIPS] Use SYSVIPC_COMPAT to fix various problems on N32"), which
switched to the more standard compat_ptr().

RLIM_INFINITY32, RESOURCE32() & struct rlimit32 have been present but
unused since the beginning of the git era.

Remove the dead code.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/20108/
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
2018-08-01 13:20:27 -07:00
Paul Burton
3a1c0fc592
MIPS: Remove unused sys_32_mmap2
The sys_32_mmap2 function has been unused since we started using syscall
wrappers in commit dbda6ac089 ("MIPS: CVE-2009-0029: Enable syscall
wrappers."), and is indeed identical to the sys_mips_mmap2 function that
replaced it in sys32_call_table.

Remove the dead code.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/20107/
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
2018-08-01 13:20:21 -07:00
Paul Burton
96a68b14db
MIPS: Remove nabi_no_regargs
Our sigreturn functions make use of a macro named nabi_no_regargs to
declare 8 dummy arguments to a function, forcing the compiler to expect
a pt_regs structure on the stack rather than in argument registers. This
is an ugly hack which unnecessarily causes these sigreturn functions to
need to care about the calling convention of the ABI the kernel is built
for. Although this is abstracted via nabi_no_regargs, it's still ugly &
unnecessary.

Remove nabi_no_regargs & the struct pt_regs argument from sigreturn
functions, and instead use current_pt_regs() to find the struct pt_regs
on the stack, which works cleanly regardless of ABI.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/20106/
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
2018-08-01 13:20:15 -07:00
Steven Rostedt (VMware)
ec57350883 tracing: Make tracer_tracing_is_on() return bool
There's code that expects tracer_tracing_is_on() to be either true or false,
not some random number. Currently, it should only return one or zero, but
just in case, change its return value to bool, to enforce it.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-01 16:08:57 -04:00
Steven Rostedt (VMware)
978defee11 tracing: Do a WARN_ON() if start_thread() in hwlat is called when thread exists
The start function of the hwlat tracer should never be called when the hwlat
thread already exists. If it is called, do a WARN_ON().

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-01 16:06:02 -04:00
Erica Bugden
82fbc8c48a ftrace: Add missing check for existing hwlat thread
The hwlat tracer uses a kernel thread to measure latencies. The function
that creates this kernel thread, start_kthread(), can be called when the
tracer is initialized and when the tracer is explicitly enabled.
start_kthread() does not check if there is an existing hwlat kernel
thread and will create a new one each time it is called.

This causes the reference to the previous thread to be lost. Without the
thread reference, the old kernel thread becomes unstoppable and
continues to use CPU time even after the hwlat tracer has been disabled.
This problem can be observed when a system is booted with tracing
enabled and the hwlat tracer is configured like this:

	echo hwlat > current_tracer; echo 1 > tracing_on

Add the missing check for an existing kernel thread in start_kthread()
to prevent this problem. This function and the rest of the hwlat kernel
thread setup and teardown are already serialized because they are called
through the tracer core code with trace_type_lock held.

[
 Note, this only fixes the symptom. The real fix was not to call
 this function when tracing_on was already one. But this still makes
 the code more robust, so we'll add it.
]

Link: http://lkml.kernel.org/r/1533120354-22923-1-git-send-email-erica.bugden@linutronix.de

Signed-off-by: Erica Bugden <erica.bugden@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-01 16:04:24 -04:00
Steven Rostedt (VMware)
f143641bfe tracing: Do not call start/stop() functions when tracing_on does not change
Currently, when one echo's in 1 into tracing_on, the current tracer's
"start()" function is executed, even if tracing_on was already one. This can
lead to strange side effects. One being that if the hwlat tracer is enabled,
and someone does "echo 1 > tracing_on" into tracing_on, the hwlat tracer's
start() function is called again which will recreate another kernel thread,
and make it unable to remove the old one.

Link: http://lkml.kernel.org/r/1533120354-22923-1-git-send-email-erica.bugden@linutronix.de

Cc: stable@vger.kernel.org
Fixes: 2df8f8a6a8 ("tracing: Fix regression with irqsoff tracer and tracing_on file")
Reported-by: Erica Bugden <erica.bugden@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-01 16:01:02 -04:00
Hugh Dickins
53406ed1bc mm: delete historical BUG from zap_pmd_range()
Delete the old VM_BUG_ON_VMA() from zap_pmd_range(), which asserted
that mmap_sem must be held when splitting an "anonymous" vma there.
Whether that's still strictly true nowadays is not entirely clear,
but the danger of sometimes crashing on the BUG is now fairly clear.

Even with the new stricter rules for anonymous vma marking, the
condition it checks for can possible trigger. Commit 44960f2a7b
("staging: ashmem: Fix SIGBUS crash when traversing mmaped ashmem
pages") is good, and originally I thought it was safe from that
VM_BUG_ON_VMA(), because the /dev/ashmem fd exposed to the user is
disconnected from the vm_file in the vma, and madvise(,,MADV_REMOVE)
insists on VM_SHARED.

But after I read John's earlier mail, drawing attention to the
vfs_fallocate() in there: I may be wrong, and I don't know if Android
has THP in the config anyway, but it looks to me like an
unmap_mapping_range() from ashmem's vfs_fallocate() could hit precisely
the VM_BUG_ON_VMA(), once it's vma_is_anonymous().

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-01 12:23:45 -07:00
Arnaldo Carvalho de Melo
b912885ab7 perf trace: Do not require --no-syscalls to suppress strace like output
So far the --syscalls option was the default, requiring explicit
--no-syscalls when wanting to process just some other event, invert that
and assume it only when no other event was specified, allowing its
explicit enablement when wanting to see all syscalls together with some
other event:

E.g:

The existing default is maintained for a single workload:

  # perf trace sleep 1
<SNIP>
     0.264 ( 0.003 ms): sleep/12762 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7f62cbf04000
     0.271 ( 0.001 ms): sleep/12762 close(fd: 3) = 0
     0.295 (1000.130 ms): sleep/12762 nanosleep(rqtp: 0x7ffd15194fd0) = 0
  1000.469 ( 0.006 ms): sleep/12762 close(fd: 1) = 0
  1000.480 ( 0.004 ms): sleep/12762 close(fd: 2) = 0
  1000.502 (         ): sleep/12762 exit_group()
  #

For a pid:

  # pidof ssh
  7826 3961 3226 2628 2493
  # perf trace -p 3961
         ? (         ):  ... [continued]: select()) = 1
     0.023 ( 0.005 ms): clock_gettime(which_clock: BOOTTIME, tp: 0x7ffcc8fce870               ) = 0
     0.036 ( 0.009 ms): read(fd: 5</dev/pts/7>, buf: 0x7ffcc8fca7b0, count: 16384             ) = 3
     0.060 ( 0.004 ms): getpid(                                                               ) = 3961 (ssh)
     0.079 ( 0.004 ms): clock_gettime(which_clock: BOOTTIME, tp: 0x7ffcc8fce8e0               ) = 0
     0.088 ( 0.003 ms): clock_gettime(which_clock: BOOTTIME, tp: 0x7ffcc8fce7c0               ) = 0
<SNIP>

For system wide, threads, cgroups, user, etc when no event is specified,
the existing behaviour is maintained, i.e. --syscalls is selected.

When some event is specified, then --no-syscalls doesn't need to be
specified:

  # perf trace -e tcp:tcp_probe ssh localhost
     0.000 tcp:tcp_probe:src=[::1]:22 dest=[::1]:39074 mark=0 length=53 snd_nxt=0xb67ce8f7 snd_una=0xb67ce8f7 snd_cwnd=10 ssthresh=2147483647 snd_wnd=43776 srtt=18 rcv_wnd=43690
     0.010 tcp:tcp_probe:src=[::1]:39074 dest=[::1]:22 mark=0 length=32 snd_nxt=0xa8f9ef38 snd_una=0xa8f9ef23 snd_cwnd=10 ssthresh=2147483647 snd_wnd=43690 srtt=31 rcv_wnd=43776
     4.525 tcp:tcp_probe:src=[::1]:22 dest=[::1]:39074 mark=0 length=1240 snd_nxt=0xb67ce90c snd_una=0xb67ce90c snd_cwnd=10 ssthresh=2147483647 snd_wnd=43776 srtt=18 rcv_wnd=43776
     7.242 tcp:tcp_probe:src=[::1]:22 dest=[::1]:39074 mark=0 length=80 snd_nxt=0xb67ced44 snd_una=0xb67ce90c snd_cwnd=10 ssthresh=2147483647 snd_wnd=43776 srtt=18 rcv_wnd=174720
  The authenticity of host 'localhost (::1)' can't be established.
  ECDSA key fingerprint is SHA256:TKZS58923458203490asekfjaklskljmkjfgPMBfHzY.
  ECDSA key fingerprint is MD5:d8:29:54:40:71:fa:b8:44:89:52:64:8a:35:42:d0:e8.
  Are you sure you want to continue connecting (yes/no)?
^C
  #

To get the previous behaviour just use --syscalls and get all syscalls formatted
strace like + the specified extra events:

  # trace -e sched:*switch --syscalls sleep 1
  <SNIP>
     0.160 ( 0.003 ms): sleep/12877 mprotect(start: 0x7fdfe2361000, len: 4096, prot: READ) = 0
     0.164 ( 0.009 ms): sleep/12877 munmap(addr: 0x7fdfe2345000, len: 113155) = 0
     0.211 ( 0.001 ms): sleep/12877 brk() = 0x55d3ce68e000
     0.212 ( 0.002 ms): sleep/12877 brk(brk: 0x55d3ce6af000) = 0x55d3ce6af000
     0.215 ( 0.001 ms): sleep/12877 brk() = 0x55d3ce6af000
     0.219 ( 0.004 ms): sleep/12877 open(filename: 0xe1f07c00, flags: CLOEXEC) = 3
     0.225 ( 0.001 ms): sleep/12877 fstat(fd: 3, statbuf: 0x7fdfe2138aa0) = 0
     0.227 ( 0.003 ms): sleep/12877 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7fdfdb1b8000
     0.234 ( 0.001 ms): sleep/12877 close(fd: 3) = 0
     0.257 (         ): sleep/12877 nanosleep(rqtp: 0x7fffb36b6020) ...
     0.260 (         ): sched:sched_switch:prev_comm=sleep prev_pid=12877 prev_prio=120 prev_state=D ==> next_comm=swapper/3 next_pid=0 next_prio=120
     0.257 (1000.134 ms): sleep/12877  ... [continued]: nanosleep()) = 0
  1000.428 ( 0.006 ms): sleep/12877 close(fd: 1) = 0
  1000.440 ( 0.004 ms): sleep/12877 close(fd: 2) = 0
  1000.461 (         ): sleep/12877 exit_group()
  #

When specifiying just some syscalls, the behaviour doesn't change, i.e.:

  # trace -e nanosleep -e sched:*switch sleep 1
     0.000 (         ): sleep/14974 nanosleep(rqtp: 0x7ffc344ba9c0                                        ) ...
     0.007 (         ): sched:sched_switch:prev_comm=sleep prev_pid=14974 prev_prio=120 prev_state=D ==> next_comm=swapper/2 next_pid=0 next_prio=120
     0.000 (1000.139 ms): sleep/14974  ... [continued]: nanosleep()) = 0
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-om2fulll97ytnxv40ler8jkf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-08-01 16:20:28 -03:00
Chao Yu
82cf4f132e f2fs: fix to active page in lru list for read path
If config CONFIG_F2FS_FAULT_INJECTION is on, for both read or write path
we will call find_lock_page() to get the page, but for read path, it
missed to passing FGP_ACCESSED to allocator to active the page in LRU
list, result in being reclaimed in advance incorrectly, fix it.

Reported-by: Xianrong Zhou <zhouxianrong@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-01 11:52:36 -07:00
Chao Yu
18767e6263 f2fs: don't keep meta pages used for block migration
For migration of encrypted inode's block, we load data of encrypted block
into meta inode's page cache, after checkpoint, those all intermediate
pages should be clean, and no one will read them again, so let's just
release them for more memory.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-01 11:52:36 -07:00
Chao Yu
4ddc1b28aa f2fs: fix to restrict mount condition when without CONFIG_QUOTA
Like quota_ino feature, we need to reject mounting RDWR with image
which enables project_quota feature when there is no CONFIG_QUOTA
be set in kernel.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-01 11:52:36 -07:00
Sheng Yong
00960c2cd8 f2fs: quota: do not mount as RDWR without QUOTA if quota feature enabled
If quota feature is enabled, quota is on by default. However, if
CONFIG_QUOTA is not built in kernel, dquot entries will not get updated,
which leads to quota inconsistency.

Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-01 11:52:36 -07:00
Sheng Yong
76cf05d79c f2fs: quota: fix incorrect comments
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-01 11:52:36 -07:00
Sheng Yong
955ac6e523 f2fs: quota: decrease the lock granularity of statfs_project
According to fs/quota/dquot.c, `dq_data_lock' protects mem_dqinfo
structures and modifications of dquot pointers in the inode, and
`dquot->dq_dqb_lock' protects data from dq_dqb.

We should use dquot->dq_dqb_lock in statfs_project instead of
dq_dat_lock.

Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-01 11:52:36 -07:00
Yunlong Song
970e348d98 f2fs: add proc entry to show victim_secmap bitmap
This patch adds a new proc entry to show victim_secmap information in
more detail, which is very helpful to know the get_victim candidate
status clearly, and helpful to debug problems (e.g., some sections can
not gc all of its blocks, since some blocks belong to atomic file,
leaving victim_secmap with section bit setting, in extrem case, this
will lead all bytes of victim_secmap setting with 0xff).

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-01 11:52:36 -07:00
Chao Yu
fd8c8caf7e f2fs: let checkpoint flush dnode page of regular
Fsyncer will wait on all dnode pages of regular writeback before flushing,
if there are async dnode pages blocked by IO scheduler, it may decrease
fsync's performance.

In this patch, we choose to let f2fs_balance_fs_bg() to trigger checkpoint
to flush these dnode pages of regular, so async IO of dnode page can be
elimitnated, making fsyncer only need to wait for sync IO.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-01 11:52:36 -07:00