Commit graph

56841 commits

Author SHA1 Message Date
Jiri Kosina
59cc1f61f0 net: sched: convert qdisc linked list to hashtable
Convert the per-device linked list into a hashtable. The primary
motivation for this change is that currently, we're not tracking all the
qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup
performed over the linked list by qdisc_match_from_root() is rather
expensive.

The ultimate goal is to get rid of hidden qdiscs completely, which will
bring much more determinism in user experience.

Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:19:02 -07:00
Tero Kristo
23a34f9d03 regulator: tps65218: do not disable DCDC3 during poweroff on broken PMICs
Some versions of tps65218 do not seem to support poweroff modes properly
if DCDC3 regulator is shut-down. Thus, keep it enabled even during
poweroff if the version info matches the broken silicon revision.

Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Keerthy <j-keerthy@ti.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
2016-08-10 18:21:55 +01:00
Tero Kristo
f11fa1796a mfd: tps65218: add version check to the PMIC probe
Version information will be needed to handle some error cases under the
regulator driver, so store the information once during MFD probe.

Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Keerthy <j-keerthy@ti.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
2016-08-10 18:21:55 +01:00
Tejun Heo
4c737b41de cgroup: make cgroup_path() and friends behave in the style of strlcpy()
cgroup_path() and friends used to format the path from the end and
thus the resulting path usually didn't start at the start of the
passed in buffer.  Also, when the buffer was too small, the partial
result was truncated from the head rather than tail and there was no
way to tell how long the full path would be.  These make the functions
less robust and more awkward to use.

With recent updates to kernfs_path(), cgroup_path() and friends can be
made to behave in strlcpy() style.

* cgroup_path(), cgroup_path_ns[_locked]() and task_cgroup_path() now
  always return the length of the full path.  If buffer is too small,
  it contains nul terminated truncated output.

* All users updated accordingly.

v2: cgroup_path() usage in kernel/sched/debug.c converted.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: Peter Zijlstra <peterz@infradead.org>
2016-08-10 11:23:44 -04:00
Tejun Heo
bb09c8634b kernfs: remove kernfs_path_len()
It doesn't have any in-kernel user and the same result can be obtained
from kernfs_path(@kn, NULL, 0).  Remove it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
2016-08-10 11:23:44 -04:00
Tejun Heo
3abb1d90f5 kernfs: make kernfs_path*() behave in the style of strlcpy()
kernfs_path*() functions always return the length of the full path but
the path content is undefined if the length is larger than the
provided buffer.  This makes its behavior different from strlcpy() and
requires error handling in all its users even when they don't care
about truncation.  In addition, the implementation can actully be
simplified by making it behave properly in strlcpy() style.

* Update kernfs_path_from_node_locked() to always fill up the buffer
  with path.  If the buffer is not large enough, the output is
  truncated and terminated.

* kernfs_path() no longer needs error handling.  Make it a simple
  inline wrapper around kernfs_path_from_node().

* sysfs_warn_dup()'s use of kernfs_path() doesn't need error handling.
  Updated accordingly.

* cgroup_path()'s use of kernfs_path() updated to retain the old
  behavior.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2016-08-10 11:23:44 -04:00
Tejun Heo
0e0b2afdf6 kernfs: add dummy implementation of kernfs_path_from_node()
The dummy version of kernfs_path_from_node() was missing.  This
currently doesn't break anything.  Let's add it for consistency and to
ease adding wrappers around it.

v2: Removed stray ';' which was causing build failures.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-10 11:23:43 -04:00
Vegard Nossum
d1c6d149cf sched/debug: Make the "Preemption disabled at ..." message more useful
This message is currently really useless since it always prints a value
that comes from the printk() we just did, e.g.:

    BUG: sleeping function called from invalid context at mm/slab.h:388
    in_atomic(): 0, irqs_disabled(): 0, pid: 31996, name: trinity-c1
    Preemption disabled at:[<ffffffff8119db33>] down_trylock+0x13/0x80

    BUG: sleeping function called from invalid context at include/linux/freezer.h:56
    in_atomic(): 0, irqs_disabled(): 0, pid: 31996, name: trinity-c1
    Preemption disabled at:[<ffffffff811aaa37>] console_unlock+0x2f7/0x930

Here, both down_trylock() and console_unlock() is somewhere in the
printk() path.

We should save the value before calling printk() and use the saved value
instead. That immediately reveals the offending callsite:

    BUG: sleeping function called from invalid context at mm/slab.h:388
    in_atomic(): 0, irqs_disabled(): 0, pid: 14971, name: trinity-c2
    Preemption disabled at:[<ffffffff819bcd46>] rhashtable_walk_start+0x46/0x150

Bug report:

  http://marc.info/?l=linux-netdev&m=146925979821849&w=2

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russel <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10 16:07:20 +02:00
Peter Zijlstra
80127a3968 locking/percpu-rwsem: Optimize readers and reduce global impact
Currently the percpu-rwsem switches to (global) atomic ops while a
writer is waiting; which could be quite a while and slows down
releasing the readers.

This patch cures this problem by ordering the reader-state vs
reader-count (see the comments in __percpu_down_read() and
percpu_down_write()). This changes a global atomic op into a full
memory barrier, which doesn't have the global cacheline contention.

This also enables using the percpu-rwsem with rcu_sync disabled in order
to bias the implementation differently, reducing the writer latency by
adding some cost to readers.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
[ Fixed modular build. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10 14:34:01 +02:00
Morten Rasmussen
bd425d4bfc sched/core: Fix power to capacity renaming in comment
It is seems that this one escaped Nico's renaming of cpu_power to
cpu_capacity a while back.

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: linux-kernel@vger.kernel.org
Cc: mgalbraith@suse.de
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1466615004-3503-2-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10 14:03:32 +02:00
Peter Zijlstra
e48c178814 perf/core: Optimize perf_pmu_sched_task()
For perf record -b, which requires the pmu::sched_task callback the
current code is rather expensive:

     7.68%  sched-pipe  [kernel.vmlinux]    [k] perf_pmu_sched_task
     5.95%  sched-pipe  [kernel.vmlinux]    [k] __switch_to
     5.20%  sched-pipe  [kernel.vmlinux]    [k] __intel_pmu_disable_all
     3.95%  sched-pipe  perf                [.] worker_thread

The problem is that it will iterate all registered PMUs, most of which
will not have anything to do. Avoid this by keeping an explicit list
of PMUs that have requested the callback.

The perf_sched_cb_{inc,dec}() functions already takes the required pmu
argument, and now that these functions are no longer called from NMI
context we can use them to manage a list.

With this patch applied the function doesn't show up in the top 4
anymore (it dropped to 18th place).

     6.67%  sched-pipe  [kernel.vmlinux]    [k] __switch_to
     6.18%  sched-pipe  [kernel.vmlinux]    [k] __intel_pmu_disable_all
     3.92%  sched-pipe  [kernel.vmlinux]    [k] switch_mm_irqs_off
     3.71%  sched-pipe  perf                [.] worker_thread

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10 13:13:28 +02:00
David Carrillo-Cisneros
db4a835601 perf/core: Set cgroup in CPU contexts for new cgroup events
There's a perf stat bug easy to observer on a machine with only one cgroup:

  $ perf stat -e cycles -I 1000 -C 0 -G /
  #          time             counts unit events
      1.000161699      <not counted>      cycles                    /
      2.000355591      <not counted>      cycles                    /
      3.000565154      <not counted>      cycles                    /
      4.000951350      <not counted>      cycles                    /

We'd expect some output there.

The underlying problem is that there is an optimization in
perf_cgroup_sched_{in,out}() that skips the switch of cgroup events
if the old and new cgroups in a task switch are the same.

This optimization interacts with the current code in two ways
that cause a CPU context's cgroup (cpuctx->cgrp) to be NULL even if a
cgroup event matches the current task. These are:

  1. On creation of the first cgroup event in a CPU: In current code,
  cpuctx->cpu is only set in perf_cgroup_sched_in, but due to the
  aforesaid optimization, perf_cgroup_sched_in will run until the next
  cgroup switches in that CPU. This may happen late or never happen,
  depending on system's number of cgroups, CPU load, etc.

  2. On deletion of the last cgroup event in a cpuctx: In list_del_event,
  cpuctx->cgrp is set NULL. Any new cgroup event will not be sched in
  because cpuctx->cgrp == NULL until a cgroup switch occurs and
  perf_cgroup_sched_in is executed (updating cpuctx->cgrp).

This patch fixes both problems by setting cpuctx->cgrp in list_add_event,
mirroring what list_del_event does when removing a cgroup event from CPU
context, as introduced in:

  commit 68cacd2916 ("perf_events: Fix stale ->cgrp pointer in update_cgrp_time_from_cpuctx()")

With this patch, cpuctx->cgrp is always set/clear when installing/removing
the first/last cgroup event in/from the CPU context. With cpuctx->cgrp
correctly set, event_filter_match works as intended when events are
sched in/out.

After the fix, the output is as expected:

  $ perf stat -e cycles -I 1000 -a -G /
  #         time             counts unit events
     1.004699159          627342882      cycles                    /
     2.007397156          615272690      cycles                    /
     3.010019057          616726074      cycles                    /

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1470124092-113192-1-git-send-email-davidcc@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-10 13:05:52 +02:00
Patrice Chotard
6bb9f0d933 mfd: Add STMPE1600 support
STMPE1600 is a 16-bit port expander.
Datasheet is available here :
http://www2.st.com/content/st_com/en/products/interfaces-and-transceivers/
i-o-expanders-and-level-translators/i-o-expanders/stmpe1600.html

Signed-off-by: Amelie DELAUNAY <amelie.delaunay@st.com>
Signed-off-by: Patrice Chotard <patrice.chotard@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-08-10 09:25:18 +01:00
Patrice Chotard
897ac6674c mfd: stmpe: Rework registers access
this update allows to use registers map as following :
regs[reg_index + offset] instead of
regs[reg_index] + offset

This makes code clearer and will facilitate the addition of STMPE1600
on which LSB and MSB registers are respectively located at addr and addr + 1.
Despite for all others STMPE variant, LSB and MSB registers are respectively
located in reverse order at addr + 1 and addr.

For variant which have 3 registers's bank, we use LSB,CSB and MSB indexes
which contains respectively LSB (or LOW), CSB (or MID) and MSB (or HIGH)
register addresses (STMPE1801/STMPE24xx).
For variant which have 2 registers's bank, we use LSB and CSB indexes only.
In this case the CSB index contains the MSB regs address (STMPE 1601).

Signed-off-by: Patrice Chotard <patrice.chotard@st.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-08-10 09:24:39 +01:00
Patrice Chotard
0f4be8cf63 mfd: stmpe: Add STMPE_IDX_SYS_CTRL/2 enum
As STMPE1801/1601/24xx has a SYS_CTRL register and
STMPE1601/2403 has even a SYS_CTRL2 register, add
STMPE_IDX_SYS_CTRL/2 and update driver code accordingly

This update prepares the ground for not yet supported STMPE1600
which share similar REG_SYS_CTRL register.

Signed-off-by: Patrice Chotard <patrice.chotard@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-08-10 09:23:25 +01:00
Linus Torvalds
a0cba2179e Revert "printk: create pr_<level> functions"
This reverts commit 874f9c7da9.

Geert Uytterhoeven reports:
 "This change seems to have an (unintendent?) side-effect.

  Before, pr_*() calls without a trailing newline characters would be
  printed with a newline character appended, both on the console and in
  the output of the dmesg command.

  After this commit, no new line character is appended, and the output
  of the next pr_*() call of the same type may be appended, like in:

    - Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000
    - Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)
    + Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)"

Joe Perches says:
 "No, that is not intentional.

  The newline handling code inside vprintk_emit is a bit involved and
  for now I suggest a revert until this has all the same behavior as
  earlier"

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Requested-by: Joe Perches <joe@perches.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-09 10:48:18 -07:00
Andre Przywara
fd837b08d9 KVM: arm64: ITS: return 1 on successful MSI injection
According to the KVM API documentation a successful MSI injection
should return a value > 0 on success.
Return possible errors in vgic_its_trigger_msi() and report a
successful injection back to userland, while also reporting the
case where the MSI could not be delivered due to the guest not
having the LPI mapped, for instance.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-08-09 16:43:23 +02:00
Stephen Boyd
1ebe88d38d usb: ulpi: Automatically set driver::owner with ulpi_driver_register()
Let's follow other driver registration functions and
automatically set the driver's owner member to THIS_MODULE when
ulpi_driver_register() is called. This allows ulpi driver writers
to forget about this boiler plate detail and avoids common bugs
in the process.

Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-09 16:06:49 +02:00
Gary R Hook
4b394a232d crypto: ccp - Let a v5 CCP provide the same function as v3
Enable equivalent function on a v5 CCP. Add support for a
version 5 CCP which enables AES/XTS/SHA services. Also,
more work on the data structures to virtualize
functionality.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-08-09 18:47:16 +08:00
Marc Zyngier
f3b0946d62 genirq/msi: Make sure PCI MSIs are activated early
Bharat Kumar Gogada reported issues with the generic MSI code, where the
end-point ended up with garbage in its MSI configuration (both for the vector
and the message).

It turns out that the two MSI paths in the kernel are doing slightly different
things:

generic MSI: disable MSI -> allocate MSI -> enable MSI -> setup EP
PCI MSI: disable MSI -> allocate MSI -> setup EP -> enable MSI

And it turns out that end-points are allowed to latch the content of the MSI
configuration registers as soon as MSIs are enabled.  In Bharat's case, the
end-point ends up using whatever was there already, which is not what you
want.

In order to make things converge, we introduce a new MSI domain flag
(MSI_FLAG_ACTIVATE_EARLY) that is unconditionally set for PCI/MSI. When set,
this flag forces the programming of the end-point as soon as the MSIs are
allocated.

A consequence of this is that we have an extra activate in irq_startup, but
that should be without much consequence.

tglx: 

 - Several people reported a VMWare regression with PCI/MSI-X passthrough. It
   turns out that the patch also cures that issue.

 - We need to have a look at the MSI disable interrupt path, where we write
   the msg to all zeros without disabling MSI in the PCI device. Is that
   correct?

Fixes: 52f518a3a7 "x86/MSI: Use hierarchical irqdomains to manage MSI interrupts"
Reported-and-tested-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
Reported-and-tested-by: Foster Snowhill <forst@forstwoof.ru>
Reported-by: Matthias Prager <linux@matthiasprager.de>
Reported-by: Jason Taylor <jason.taylor@simplivity.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1468426713-31431-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-08-09 09:19:32 +02:00
Sudarsana Reddy Kalluru
59bcb7972f qed: Add dcbx app support for IEEE Selection Field.
MFW now supports the Selection field for IEEE mode. Add driver changes to
use the newer MFW masks to read/write the port-id value.

Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 22:22:20 -07:00
Mickaël Salaün
a4f4528a31 module: Fully remove the kernel_module_from_file hook
Remove remaining kernel_module_from_file hook left by commit
a1db742094 ("module: replace copy_module_from_fd with kernel version")

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-08-09 10:58:57 +10:00
Vivek Goyal
2602625b7e security, overlayfs: Provide hook to correctly label newly created files
During a new file creation we need to make sure new file is created with the
right label. New file is created in upper/ so effectively file should get
label as if task had created file in upper/.

We switched to mounter's creds for actual file creation. Also if there is a
whiteout present, then file will be created in work/ dir first and then
renamed in upper. In none of the cases file will be labeled as we want it to
be.

This patch introduces a new hook dentry_create_files_as(), which determines
the label/context dentry will get if it had been created by task in upper
and modify passed set of creds appropriately. Caller makes use of these new
creds for file creation.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: fix whitespace issues found with checkpatch.pl]
[PM: changes to use stat->mode in ovl_create_or_link()]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:46:46 -04:00
Vivek Goyal
121ab822ef security,overlayfs: Provide security hook for copy up of xattrs for overlay file
Provide a security hook which is called when xattrs of a file are being
copied up. This hook is called once for each xattr and LSM can return
0 if the security module wants the xattr to be copied up, 1 if the
security module wants the xattr to be discarded on the copy, -EOPNOTSUPP
if the security module does not handle/manage the xattr, or a -errno
upon an error.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: whitespace cleanup for checkpatch.pl]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:42:13 -04:00
Vivek Goyal
d8ad8b4961 security, overlayfs: provide copy up security hook for unioned files
Provide a security hook to label new file correctly when a file is copied
up from lower layer to upper layer of a overlay/union mount.

This hook can prepare a new set of creds which are suitable for new file
creation during copy up. Caller will use new creds to create file and then
revert back to old creds and release new creds.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: whitespace cleanup to appease checkpatch.pl]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-08 20:06:53 -04:00
Linus Torvalds
1eccfa090e Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user
bounds checking for most architectures on SLAB and SLUB.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJXl9tlAAoJEIly9N/cbcAm5BoP/ikTtDp2bFw1sn92yHTnIWzl
 O+dcKVAeRgjfnSvPfb1JITpaM58exQSaDsPBeR0DbVzU1zDdhLcwHHiQupFh98Ka
 vBZthbrlL/u4NB26enEEW0iyA32BsxYBMnIu0z5ux9RbZflmQwGQ0c0rvy3dJ7/b
 FzB5ayVST5y/a0m6/sImeeExh78GU9rsMb1XmJRMwlJAy6miDz/F9TP0LnuW6PhG
 J5XC99ygNJS1pQBLACRsrZw6ImgBxXnWCok6tWPMxFfD+rJBU2//wqS+HozyMWHL
 iYP7+ytVo/ZVok4114X/V4Oof3a6wqgpBuYrivJ228QO+UsLYbYLo6sZ8kRK7VFm
 9GgHo/8rWB1T9lBbSaa7UL5r0dVNNLjFGS42vwV+YlgUMQ1A35VRojO0jUnJSIQU
 Ug1IxKmylLd0nEcwD8/l3DXeQABsfL8GsoKW0OtdTZtW4RND4gzq34LK6t7hvayF
 kUkLg1OLNdUJwOi16M/rhugwYFZIMfoxQtjkRXKWN4RZ2QgSHnx2lhqNmRGPAXBG
 uy21wlzUTfLTqTpoeOyHzJwyF2qf2y4nsziBMhvmlrUvIzW1LIrYUKCNT4HR8Sh5
 lC2WMGYuIqaiu+NOF3v6CgvKd9UW+mxMRyPEybH8mEgfm+FLZlWABiBjIUpSEZuB
 JFfuMv1zlljj/okIQRg8
 =USIR
 -----END PGP SIGNATURE-----

Merge tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull usercopy protection from Kees Cook:
 "Tbhis implements HARDENED_USERCOPY verification of copy_to_user and
  copy_from_user bounds checking for most architectures on SLAB and
  SLUB"

* tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  mm: SLUB hardened usercopy support
  mm: SLAB hardened usercopy support
  s390/uaccess: Enable hardened usercopy
  sparc/uaccess: Enable hardened usercopy
  powerpc/uaccess: Enable hardened usercopy
  ia64/uaccess: Enable hardened usercopy
  arm64/uaccess: Enable hardened usercopy
  ARM: uaccess: Enable hardened usercopy
  x86/uaccess: Enable hardened usercopy
  mm: Hardened usercopy
  mm: Implement stack frame object validation
  mm: Add is_migrate_cma_page
2016-08-08 14:48:14 -07:00
Daniel Borkmann
479ffcccef bpf: fix checksum fixups on bpf_skb_store_bytes
bpf_skb_store_bytes() invocations above L2 header need BPF_F_RECOMPUTE_CSUM
flag for updates, so that CHECKSUM_COMPLETE will be fixed up along the way.
Where we ran into an issue with bpf_skb_store_bytes() is when we did a
single-byte update on the IPv6 hoplimit despite using BPF_F_RECOMPUTE_CSUM
flag; simple ping via ICMPv6 triggered a hw csum failure as a result. The
underlying issue has been tracked down to a buffer alignment issue.

Meaning, that csum_partial() computations via skb_postpull_rcsum() and
skb_postpush_rcsum() pair invoked had a wrong result since they operated on
an odd address for the hoplimit, while other computations were done on an
even address. This mix doesn't work as-is with skb_postpull_rcsum(),
skb_postpush_rcsum() pair as it always expects at least half-word alignment
of input buffers, which is normally the case. Thus, instead of these helpers
using csum_sub() and (implicitly) csum_add(), we need to use csum_block_sub(),
csum_block_add(), respectively. For unaligned offsets, they rotate the sum
to align it to a half-word boundary again, otherwise they work the same as
csum_sub() and csum_add().

Adding __skb_postpull_rcsum(), __skb_postpush_rcsum() variants that take the
offset as an input and adapting bpf_skb_store_bytes() to them fixes the hw
csum failures again. The skb_postpull_rcsum(), skb_postpush_rcsum() helpers
use a 0 constant for offset so that the compiler optimizes the offset & 1
test away and generates the same code as with csum_sub()/_add().

Fixes: 608cd71a9c ("tc: bpf: generalize pedit action")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 13:11:43 -07:00
Linus Torvalds
1bd4403d86 unsafe_[get|put]_user: change interface to use a error target label
When I initially added the unsafe_[get|put]_user() helpers in commit
5b24a7a2aa ("Add 'unsafe' user access functions for batched
accesses"), I made the mistake of modeling the interface on our
traditional __[get|put]_user() functions, which return zero on success,
or -EFAULT on failure.

That interface is fairly easy to use, but it's actually fairly nasty for
good code generation, since it essentially forces the caller to check
the error value for each access.

In particular, since the error handling is already internally
implemented with an exception handler, and we already use "asm goto" for
various other things, we could fairly easily make the error cases just
jump directly to an error label instead, and avoid the need for explicit
checking after each operation.

So switch the interface to pass in an error label, rather than checking
the error value in the caller.  Best do it now before we start growing
more users (the signal handling code in particular would be a good place
to use the new interface).

So rather than

	if (unsafe_get_user(x, ptr))
		... handle error ..

the interface is now

	unsafe_get_user(x, ptr, label);

where an error during the user mode fetch will now just cause a jump to
'label' in the caller.

Right now the actual _implementation_ of this all still ends up being a
"if (err) goto label", and does not take advantage of any exception
label tricks, but for "unsafe_put_user()" in particular it should be
fairly straightforward to convert to using the exception table model.

Note that "unsafe_get_user()" is much harder to convert to a clever
exception table model, because current versions of gcc do not allow the
use of "asm goto" (for the exception) with output values (for the actual
value to be fetched).  But that is hopefully not a limitation in the
long term.

[ Also note that it might be a good idea to switch unsafe_get_user() to
  actually _return_ the value it fetches from user space, but this
  commit only changes the error handling semantics ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-08 13:02:01 -07:00
Phil Sutter
dca3f53c02 sctp: Export struct sctp_info to userspace
This is required to correctly interpret INET_DIAG_INFO messages exported
by sctp_diag module.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 12:51:58 -07:00
Eric W. Biederman
703286608a netns: Add a limit on the number of net namespaces
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-08-08 14:42:04 -05:00
Eric W. Biederman
d08311dd6f cgroupns: Add a limit on the number of cgroup namespaces
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-08-08 14:42:03 -05:00
Eric W. Biederman
aba3566163 ipcns: Add a limit on the number of ipc namespaces
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-08-08 14:42:03 -05:00
Eric W. Biederman
f7af3d1c03 utsns: Add a limit on the number of uts namespaces
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-08-08 14:42:02 -05:00
Eric W. Biederman
f333c700c6 pidns: Add a limit on the number of pid namespaces
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-08-08 14:42:01 -05:00
Eric W. Biederman
25f9c0817c userns: Generalize the user namespace count into ucount
The same kind of recursive sane default limit and policy
countrol that has been implemented for the user namespace
is desirable for the other namespaces, so generalize
the user namespace refernce count into a ucount.

Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-08-08 14:41:52 -05:00
Eric W. Biederman
f6b2db1a3e userns: Make the count of user namespaces per user
Add a structure that is per user and per user ns and use it to hold
the count of user namespaces.  This makes prevents one user from
creating denying service to another user by creating the maximum
number of user namespaces.

Rename the sysctl export of the maximum count from
/proc/sys/userns/max_user_namespaces to /proc/sys/user/max_user_namespaces
to reflect that the count is now per user.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-08-08 14:40:30 -05:00
Eric W. Biederman
b376c3e1b6 userns: Add a limit on the number of user namespaces
Export the export the maximum number of user namespaces as
/proc/sys/userns/max_user_namespaces.

Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-08-08 13:41:24 -05:00
Eric W. Biederman
dbec28460a userns: Add per user namespace sysctls.
Limit per userns sysctls to only be opened for write by a holder
of CAP_SYS_RESOURCE.

Add all of the necessary boilerplate for having per user namespace
sysctls.

Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-08-08 13:18:58 -05:00
Eric W. Biederman
b032132c3c userns: Free user namespaces in process context
Add the necessary boiler plate to move freeing of user namespaces into
work queue and thus into process context where things can sleep.

This is a necessary precursor to per user namespace sysctls.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-08-08 09:17:18 -05:00
Eric W. Biederman
13bcc6a285 sysctl: Stop implicitly passing current into sysctl_table_root.lookup
Passing nsproxy into sysctl_table_root.lookup was a premature
optimization in attempt to avoid depending on current.  The
directory /proc/self/sys has not appeared and if and when
it does this code will need to be reviewed closely and reworked
anyway.  So remove the premature optimization.

Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-08-08 09:17:16 -05:00
Chen-Yu Tsai
585083c539 mfd: ac100: Add driver for X-Powers AC100 audio codec / RTC combo IC
The AC100 is a multifunction device with an audio codec subsystem and
an RTC subsystem. These two subsystems share a common register space
and host interface.

Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-08-08 12:53:26 +01:00
Dave Jiang
f067025bc6 dmaengine: add support to provide error result from a DMA transation
Adding a new callback that will provide the error result for a transaction.
The result is allocated on the stack and the callback should create a copy
if it wishes to retain the information after exiting. The result parameter
is now defined and takes over the dummy void pointer we placed in the
helper functions previously. dmaengine drivers should start converting
to the new "callback_result" callback in order to receive transaction
results.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-08-08 08:11:42 +05:30
Chanwoo Choi
7fe95fb889 extcon: Add new EXTCON_CHG_WPT for Wireless Power Transfer device
This patchs add the new EXTCON_CHG_WPT for Wireless Power Transfer[1].
The Wireless Power Transfer is the transmission of electronical energy
from a power source. The EXTCON_CHG_WPT has the EXTCON_TYPE_CHG.

[1] https://en.wikipedia.org/wiki/Wireless_power_transfer

Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2016-08-08 11:17:23 +09:00
Chanwoo Choi
9c0595d688 extcon: Add new EXTCON_DISP_HMD for Head-mounted Display device
This patch adds the new EXTCON_DISP_HMD id for Head-mounted Display[1] device.
The HMD device is usually for USB connector type So, the HMD connector
has the two extcon types of both EXTCON_TYPE_DISP and EXTCON_TYPE_USB.

[1] https://en.wikipedia.org/wiki/Head-mounted_display

Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2016-08-08 11:17:23 +09:00
Chris Zhong
2164188d57 extcon: Add EXTCON_DISP_DP and the property for USB Type-C
Add EXTCON_DISP_DP for the Display external connector. For Type-C
connector the DisplayPort can work as an Alternate Mode(VESA DisplayPort
Alt Mode on USB Type-C Standard). The Type-C support both normal
and flipped orientation, so add a property to extcon.

Signed-off-by: Chris Zhong <zyw@rock-chips.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Tested-by: Chris Zhong <zyw@rock-chips.com>
Tested-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
2016-08-08 11:17:23 +09:00
Chanwoo Choi
ab11af049f extcon: Add the synchronization extcon APIs to support the notification
This patch adds the synchronization extcon APIs to support the notifications
for both state and property. When extcon_*_sync() functions is called,
the extcon informs the information from extcon provider to extcon client.

The extcon driver may need to change the both state and multiple properties
at the same time. After setting the data of a external connector,
the extcon send the notification to client driver with the extcon_*_sync().

The list of new extcon APIs as following:
- extcon_sync() : Send the notification for each external connector to
		synchronize the information between extcon provider driver
		and extcon client driver.
- extcon_set_state_sync() : Set the state of external connector with noti.
- extcon_set_property_sync() : Set the property of external connector with noti.

For example,
case 1, change the state of external connector and synchronized the data.
	extcon_set_state_sync(edev, EXTCON_USB, 1);

case 2, change both the state and property of external connector
	and synchronized the data.
	extcon_set_state(edev, EXTCON_USB, 1);
	extcon_set_property(edev, EXTCON_USB, EXTCON_PROP_USB_VBUS 1);
	extcon_sync(edev, EXTCON_USB);

case 3, change the property of external connector and synchronized the data.
	extcon_set_property(edev, EXTCON_USB, EXTCON_PROP_USB_VBUS, 0);
	extcon_sync(edev, EXTCON_USB);

case 4, change the property of external connector and synchronized the data.
	extcon_set_property_sync(edev, EXTCON_USB, EXTCON_PROP_USB_VBUS, 0);

Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Tested-by: Chris Zhong <zyw@rock-chips.com>
Tested-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
2016-08-08 11:17:22 +09:00
Chanwoo Choi
575c2b867e extcon: Rename the extcon_set/get_state() to maintain the function naming pattern
This patch just renames the existing extcon_get/set_cable_state_()
as following because of maintaining the function naming pattern
like as extcon APIs for property.
- extcon_set_cable_state_() -> extcon_set_state()
- extcon_get_cable_state_() -> extcon_get_state()

But, this patch remains the old extcon_set/get_cable_state_() functions
to prevent the build break. After altering new APIs, remove the old APIs.

Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Tested-by: Chris Zhong <zyw@rock-chips.com>
Tested-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
2016-08-08 11:17:22 +09:00
Chanwoo Choi
7f2a0a1699 extcon: Add the support for the capability of each property
This patch adds the support of the property capability setting. This function
decides the supported properties of each external connector on extcon provider
driver.

Ths list of new extcon APIs to get/set the capability of property as following:
- int extcon_get_property_capability(struct extcon_dev *edev,
					unsigned int id, unsigned int prop);
- int extcon_set_property_capability(struct extcon_dev *edev,
					unsigned int id, unsigned int prop);

Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Tested-by: Chris Zhong <zyw@rock-chips.com>
Tested-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
2016-08-08 11:17:22 +09:00
Chanwoo Choi
792e7e9e5d extcon: Add the support for extcon property according to extcon type
This patch support the extcon property for the external connector
because each external connector might have the property according to
the H/W design and the specific characteristics.

- EXTCON_PROP_USB_[property name]
- EXTCON_PROP_CHG_[property name]
- EXTCON_PROP_JACK_[property name]
- EXTCON_PROP_DISP_[property name]

Add the new extcon APIs to get/set the property value as following:
- int extcon_get_property(struct extcon_dev *edev, unsigned int id,
			unsigned int prop,
			union extcon_property_value *prop_val)
- int extcon_set_property(struct extcon_dev *edev, unsigned int id,
			unsigned int prop,
			union extcon_property_value prop_val)

Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Tested-by: Chris Zhong <zyw@rock-chips.com>
Tested-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
2016-08-08 11:17:22 +09:00
Chanwoo Choi
55e4e2f129 extcon: Add the extcon_type to gather each connector into five category
This patch adds the new extcon type to group the each connecotr
into following five category. This type would be used to handle
the connectors as a group unit instead of a connector unit.
- EXTCON_TYPE_USB  : USB connector
- EXTCON_TYPE_CHG  : Charger connector
- EXTCON_TYPE_JACK : Jack connector
- EXTCON_TYPE_DISP : Display connector
- EXTCON_TYPE_MISC : Miscellaneous connector

Also, each external connector is possible to belong to one more extcon type.
In caes of EXTCON_CHG_USB_SDP, it have the EXTCON_TYPE_CHG and EXTCON_TYPE_USB.

Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Tested-by: Chris Zhong <zyw@rock-chips.com>
Tested-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
2016-08-08 11:17:20 +09:00