Commit graph

737148 commits

Author SHA1 Message Date
Dan Williams
edfbae53da x86/spectre: Report get_user mitigation for spectre_v1
Reflect the presence of get_user(), __get_user(), and 'syscall' protections
in sysfs. The expectation is that new and better tooling will allow the
kernel to grow more usages of array_index_nospec(), for now, only claim
mitigation for __user pointer de-references.

Reported-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: kernel-hardening@lists.openwall.com
Cc: gregkh@linuxfoundation.org
Cc: torvalds@linux-foundation.org
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727420158.33451.11658324346540434635.stgit@dwillia2-desk3.amr.corp.intel.com
2018-01-30 21:54:32 +01:00
Dan Williams
259d8c1e98 nl80211: Sanitize array index in parse_txq_params
Wireless drivers rely on parse_txq_params to validate that txq_params->ac
is less than NL80211_NUM_ACS by the time the low-level driver's ->conf_tx()
handler is called. Use a new helper, array_index_nospec(), to sanitize
txq_params->ac with respect to speculation. I.e. ensure that any
speculation into ->conf_tx() handlers is done with a value of
txq_params->ac that is within the bounds of [0, NL80211_NUM_ACS).

Reported-by: Christian Lamparter <chunkeey@gmail.com>
Reported-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Cc: linux-arch@vger.kernel.org
Cc: kernel-hardening@lists.openwall.com
Cc: gregkh@linuxfoundation.org
Cc: linux-wireless@vger.kernel.org
Cc: torvalds@linux-foundation.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727419584.33451.7700736761686184303.stgit@dwillia2-desk3.amr.corp.intel.com
2018-01-30 21:54:32 +01:00
Dan Williams
56c30ba7b3 vfs, fdtable: Prevent bounds-check bypass via speculative execution
'fd' is a user controlled value that is used as a data dependency to
read from the 'fdt->fd' array.  In order to avoid potential leaks of
kernel memory values, block speculative execution of the instruction
stream that could issue reads based on an invalid 'file *' returned from
__fcheck_files.

Co-developed-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: kernel-hardening@lists.openwall.com
Cc: gregkh@linuxfoundation.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: torvalds@linux-foundation.org
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727418500.33451.17392199002892248656.stgit@dwillia2-desk3.amr.corp.intel.com
2018-01-30 21:54:32 +01:00
Dan Williams
2fbd7af5af x86/syscall: Sanitize syscall table de-references under speculation
The syscall table base is a user controlled function pointer in kernel
space. Use array_index_nospec() to prevent any out of bounds speculation.

While retpoline prevents speculating into a userspace directed target it
does not stop the pointer de-reference, the concern is leaking memory
relative to the syscall table base, by observing instruction cache
behavior.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: kernel-hardening@lists.openwall.com
Cc: gregkh@linuxfoundation.org
Cc: Andy Lutomirski <luto@kernel.org>
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727417984.33451.1216731042505722161.stgit@dwillia2-desk3.amr.corp.intel.com
2018-01-30 21:54:31 +01:00
Dan Williams
c7f631cb07 x86/get_user: Use pointer masking to limit speculation
Quoting Linus:

    I do think that it would be a good idea to very expressly document
    the fact that it's not that the user access itself is unsafe. I do
    agree that things like "get_user()" want to be protected, but not
    because of any direct bugs or problems with get_user() and friends,
    but simply because get_user() is an excellent source of a pointer
    that is obviously controlled from a potentially attacking user
    space. So it's a prime candidate for then finding _subsequent_
    accesses that can then be used to perturb the cache.

Unlike the __get_user() case get_user() includes the address limit check
near the pointer de-reference. With that locality the speculation can be
mitigated with pointer narrowing rather than a barrier, i.e.
array_index_nospec(). Where the narrowing is performed by:

	cmp %limit, %ptr
	sbb %mask, %mask
	and %mask, %ptr

With respect to speculation the value of %ptr is either less than %limit
or NULL.

Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Kees Cook <keescook@chromium.org>
Cc: kernel-hardening@lists.openwall.com
Cc: gregkh@linuxfoundation.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: torvalds@linux-foundation.org
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727417469.33451.11804043010080838495.stgit@dwillia2-desk3.amr.corp.intel.com
2018-01-30 21:54:31 +01:00
Dan Williams
304ec1b050 x86/uaccess: Use __uaccess_begin_nospec() and uaccess_try_nospec
Quoting Linus:

    I do think that it would be a good idea to very expressly document
    the fact that it's not that the user access itself is unsafe. I do
    agree that things like "get_user()" want to be protected, but not
    because of any direct bugs or problems with get_user() and friends,
    but simply because get_user() is an excellent source of a pointer
    that is obviously controlled from a potentially attacking user
    space. So it's a prime candidate for then finding _subsequent_
    accesses that can then be used to perturb the cache.

__uaccess_begin_nospec() covers __get_user() and copy_from_iter() where the
limit check is far away from the user pointer de-reference. In those cases
a barrier_nospec() prevents speculation with a potential pointer to
privileged memory. uaccess_try_nospec covers get_user_try.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Kees Cook <keescook@chromium.org>
Cc: kernel-hardening@lists.openwall.com
Cc: gregkh@linuxfoundation.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727416953.33451.10508284228526170604.stgit@dwillia2-desk3.amr.corp.intel.com
2018-01-30 21:54:31 +01:00
Dan Williams
b5c4ae4f35 x86/usercopy: Replace open coded stac/clac with __uaccess_{begin, end}
In preparation for converting some __uaccess_begin() instances to
__uacess_begin_nospec(), make sure all 'from user' uaccess paths are
using the _begin(), _end() helpers rather than open-coded stac() and
clac().

No functional changes.

Suggested-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: kernel-hardening@lists.openwall.com
Cc: gregkh@linuxfoundation.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: torvalds@linux-foundation.org
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727416438.33451.17309465232057176966.stgit@dwillia2-desk3.amr.corp.intel.com
2018-01-30 21:54:30 +01:00
Dan Williams
b3bbfb3fb5 x86: Introduce __uaccess_begin_nospec() and uaccess_try_nospec
For __get_user() paths, do not allow the kernel to speculate on the value
of a user controlled pointer. In addition to the 'stac' instruction for
Supervisor Mode Access Protection (SMAP), a barrier_nospec() causes the
access_ok() result to resolve in the pipeline before the CPU might take any
speculative action on the pointer value. Given the cost of 'stac' the
speculation barrier is placed after 'stac' to hopefully overlap the cost of
disabling SMAP with the cost of flushing the instruction pipeline.

Since __get_user is a major kernel interface that deals with user
controlled pointers, the __uaccess_begin_nospec() mechanism will prevent
speculative execution past an access_ok() permission check. While
speculative execution past access_ok() is not enough to lead to a kernel
memory leak, it is a necessary precondition.

To be clear, __uaccess_begin_nospec() is addressing a class of potential
problems near __get_user() usages.

Note, that while the barrier_nospec() in __uaccess_begin_nospec() is used
to protect __get_user(), pointer masking similar to array_index_nospec()
will be used for get_user() since it incorporates a bounds check near the
usage.

uaccess_try_nospec provides the same mechanism for get_user_try.

No functional changes.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Andi Kleen <ak@linux.intel.com>
Suggested-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: kernel-hardening@lists.openwall.com
Cc: gregkh@linuxfoundation.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727415922.33451.5796614273104346583.stgit@dwillia2-desk3.amr.corp.intel.com
2018-01-30 21:54:30 +01:00
Dan Williams
b3d7ad85b8 x86: Introduce barrier_nospec
Rename the open coded form of this instruction sequence from
rdtsc_ordered() into a generic barrier primitive, barrier_nospec().

One of the mitigations for Spectre variant1 vulnerabilities is to fence
speculative execution after successfully validating a bounds check. I.e.
force the result of a bounds check to resolve in the instruction pipeline
to ensure speculative execution honors that result before potentially
operating on out-of-bounds data.

No functional changes.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Andi Kleen <ak@linux.intel.com>
Suggested-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: kernel-hardening@lists.openwall.com
Cc: gregkh@linuxfoundation.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727415361.33451.9049453007262764675.stgit@dwillia2-desk3.amr.corp.intel.com
2018-01-30 21:54:29 +01:00
Dan Williams
babdde2698 x86: Implement array_index_mask_nospec
array_index_nospec() uses a mask to sanitize user controllable array
indexes, i.e. generate a 0 mask if 'index' >= 'size', and a ~0 mask
otherwise. While the default array_index_mask_nospec() handles the
carry-bit from the (index - size) result in software.

The x86 array_index_mask_nospec() does the same, but the carry-bit is
handled in the processor CF flag without conditional instructions in the
control flow.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: kernel-hardening@lists.openwall.com
Cc: gregkh@linuxfoundation.org
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727414808.33451.1873237130672785331.stgit@dwillia2-desk3.amr.corp.intel.com
2018-01-30 21:54:29 +01:00
Dan Williams
f380420330 array_index_nospec: Sanitize speculative array de-references
array_index_nospec() is proposed as a generic mechanism to mitigate
against Spectre-variant-1 attacks, i.e. an attack that bypasses boundary
checks via speculative execution. The array_index_nospec()
implementation is expected to be safe for current generation CPUs across
multiple architectures (ARM, x86).

Based on an original implementation by Linus Torvalds, tweaked to remove
speculative flows by Alexei Starovoitov, and tweaked again by Linus to
introduce an x86 assembly implementation for the mask generation.

Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org>
Co-developed-by: Alexei Starovoitov <ast@kernel.org>
Suggested-by: Cyril Novikov <cnovikov@lynx.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: kernel-hardening@lists.openwall.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: gregkh@linuxfoundation.org
Cc: torvalds@linux-foundation.org
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727414229.33451.18411580953862676575.stgit@dwillia2-desk3.amr.corp.intel.com
2018-01-30 21:54:29 +01:00
Mark Rutland
f84a56f73d Documentation: Document array_index_nospec
Document the rationale and usage of the new array_index_nospec() helper.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: linux-arch@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: gregkh@linuxfoundation.org
Cc: kernel-hardening@lists.openwall.com
Cc: torvalds@linux-foundation.org
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727413645.33451.15878817161436755393.stgit@dwillia2-desk3.amr.corp.intel.com
2018-01-30 21:54:28 +01:00
Chengguang Xu
16515a6d54 ceph: improving efficiency of syncfs
write_inode() could be called variety of reasons, in the case of syncfs(2)
there is no need to wait for flush getting completed in write_inode(),
->sync_fs is for guaranteeing flush completion for all inodes at that point.

Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-01-30 21:10:15 +01:00
Linus Torvalds
af8c5e2d60 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Implement frequency/CPU invariance and OPP selection for
     SCHED_DEADLINE (Juri Lelli)

   - Tweak the task migration logic for better multi-tasking
     workload scalability (Mel Gorman)

   - Misc cleanups, fixes and improvements"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/deadline: Make bandwidth enforcement scale-invariant
  sched/cpufreq: Move arch_scale_{freq,cpu}_capacity() outside of #ifdef CONFIG_SMP
  sched/cpufreq: Remove arch_scale_freq_capacity()'s 'sd' parameter
  sched/cpufreq: Always consider all CPUs when deciding next freq
  sched/cpufreq: Split utilization signals
  sched/cpufreq: Change the worker kthread to SCHED_DEADLINE
  sched/deadline: Move CPU frequency selection triggering points
  sched/cpufreq: Use the DEADLINE utilization signal
  sched/deadline: Implement "runtime overrun signal" support
  sched/fair: Only immediately migrate tasks due to interrupts if prev and target CPUs share cache
  sched/fair: Correct obsolete comment about cpufreq_update_util()
  sched/fair: Remove impossible condition from find_idlest_group_cpu()
  sched/cpufreq: Don't pass flags to sugov_set_iowait_boost()
  sched/cpufreq: Initialize sg_cpu->flags to 0
  sched/fair: Consider RT/IRQ pressure in capacity_spare_wake()
  sched/fair: Use 'unsigned long' for utilization, consistently
  sched/core: Rework and clarify prepare_lock_switch()
  sched/fair: Remove unused 'curr' parameter from wakeup_gran
  sched/headers: Constify object_is_on_stack()
2018-01-30 11:55:56 -08:00
Linus Torvalds
a1c75e17e7 Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 RAS updates from Ingo Molnar:

 - various AMD SMCA error parsing/reporting improvements (Yazen Ghannam)

 - extend Intel CMCI error reporting to more cases (Xie XiuQi)

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/MCE: Make correctable error detection look at the Deferred bit
  x86/MCE: Report only DRAM ECC as memory errors on AMD systems
  x86/MCE/AMD: Define a function to get SMCA bank type
  x86/mce/AMD: Don't set DEF_INT_TYPE in MSR_CU_DEF_ERR on SMCA systems
  x86/MCE: Extend table to report action optional errors through CMCI too
2018-01-30 11:48:44 -08:00
Linus Torvalds
d8b91dde38 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Kernel side changes:

   - Clean up the x86 instruction decoder (Masami Hiramatsu)

   - Add new uprobes optimization for PUSH instructions on x86 (Yonghong
     Song)

   - Add MSR_IA32_THERM_STATUS to the MSR events (Stephane Eranian)

   - Fix misc bugs, update documentation, plus various cleanups (Jiri
     Olsa)

  There's a large number of tooling side improvements:

   - Intel-PT/BTS improvements (Adrian Hunter)

   - Numerous 'perf trace' improvements (Arnaldo Carvalho de Melo)

   - Introduce an errno code to string facility (Hendrik Brueckner)

   - Various build system improvements (Jiri Olsa)

   - Add support for CoreSight trace decoding by making the perf tools
     use the external openCSD (Mathieu Poirier, Tor Jeremiassen)

   - Add ARM Statistical Profiling Extensions (SPE) support (Kim
     Phillips)

   - libtraceevent updates (Steven Rostedt)

   - Intel vendor event JSON updates (Andi Kleen)

   - Introduce 'perf report --mmaps' and 'perf report --tasks' to show
     info present in 'perf.data' (Jiri Olsa, Arnaldo Carvalho de Melo)

   - Add infrastructure to record first and last sample time to the
     perf.data file header, so that when processing all samples in a
     'perf record' session, such as when doing build-id processing, or
     when specifically requesting that that info be recorded, use that
     in 'perf report --time', that also got support for percent slices
     in addition to absolute ones.

     I.e. now it is possible to ask for the samples in the 10%-20% time
     slice of a perf.data file (Jin Yao)

   - Allow system wide 'perf stat --per-thread', sorting the result (Jin
     Yao)

     E.g.:

      [root@jouet ~]# perf stat --per-thread --metrics IPC
      ^C
       Performance counter stats for 'system wide':

                  make-22229  23,012,094,032  inst_retired.any   #  0.8 IPC
                   cc1-22419     692,027,497  inst_retired.any   #  0.8 IPC
                   gcc-22418     328,231,855  inst_retired.any   #  0.9 IPC
                   cc1-22509     220,853,647  inst_retired.any   #  0.8 IPC
                   gcc-22486     199,874,810  inst_retired.any   #  1.0 IPC
                    as-22466     177,896,365  inst_retired.any   #  0.9 IPC
                   cc1-22465     150,732,374  inst_retired.any   #  0.8 IPC
                   gcc-22508     112,555,593  inst_retired.any   #  0.9 IPC
                   cc1-22487     108,964,079  inst_retired.any   #  0.7 IPC
       qemu-system-x86-2697       21,330,550  inst_retired.any   #  0.3 IPC
       systemd-journal-551        20,642,951  inst_retired.any   #  0.4 IPC
       docker-containe-17651       9,552,892  inst_retired.any   #  0.5 IPC
       dockerd-current-9809        7,528,586  inst_retired.any   #  0.5 IPC
                  make-22153  12,504,194,380  inst_retired.any   #  0.8 IPC
               python2-22429  12,081,290,954  inst_retired.any   #  0.8 IPC
      <SNIP>
               python2-22429  15,026,328,103  cpu_clk_unhalted.thread
                   cc1-22419     826,660,193  cpu_clk_unhalted.thread
                   gcc-22418     365,321,295  cpu_clk_unhalted.thread
                   cc1-22509     279,169,362  cpu_clk_unhalted.thread
                   gcc-22486     210,156,950  cpu_clk_unhalted.thread
      <SNIP>

           5.638075538 seconds time elapsed

     [root@jouet ~]#

   - Improve shell auto-completion of perf events (Jin Yao)

   - 'perf probe' improvements (Masami Hiramatsu)

   - Improve PMU infrastructure to support amp64's ThunderX2
     implementation defined core events (Ganapatrao Kulkarni)

   - Various annotation related improvements and fixes (Thomas Richter)

   - Clarify usage of 'overwrite' and 'backward' in the evlist/mmap
     code, removing the 'overwrite' parameter from several functions as
     it was always used it as 'false' (Wang Nan)

   - Fix/improve 'perf record' reverse recording support (Wang Nan)

   - Improve command line options documentation (Sihyeon Jang)

   - Optimize sample parsing for ordering events, where we don't need to
     parse all the PERF_SAMPLE_ bits, just the ones leading to the
     timestamp needed to reorder events (Jiri Olsa)

   - Generalize the annotation code to support other source information
     besides objdump/DWARF obtained ones, starting with python scripts,
     that will is slated to be merged soon (Jiri Olsa)

   - ... and a lot more that I failed to list, see the shortlog and
     changelog for details"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits)
  perf trace beauty flock: Move to separate object file
  perf evlist: Remove fcntl.h from evlist.h
  perf trace beauty futex: Beautify FUTEX_BITSET_MATCH_ANY
  perf trace: Do not print from time delta for interrupted syscall lines
  perf trace: Add --print-sample
  perf bpf: Remove misplaced __maybe_unused attribute
  MAINTAINERS: Adding entry for CoreSight trace decoding
  perf tools: Add mechanic to synthesise CoreSight trace packets
  perf tools: Add full support for CoreSight trace decoding
  pert tools: Add queue management functionality
  perf tools: Add functionality to communicate with the openCSD decoder
  perf tools: Add support for decoding CoreSight trace data
  perf tools: Add decoder mechanic to support dumping trace data
  perf tools: Add processing of coresight metadata
  perf tools: Add initial entry point for decoder CoreSight traces
  perf tools: Integrating the CoreSight decoding library
  perf vendor events intel: Update IvyTown files to V20
  perf vendor events intel: Update IvyBridge files to V20
  perf vendor events intel: Update BroadwellDE events to V7
  perf vendor events intel: Update SkylakeX events to V1.06
  ...
2018-01-30 11:15:14 -08:00
Linus Torvalds
5e7481a25e Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes relate to making lock_is_held() et al (and external
  wrappers of them) work on const data types - this requires const
  propagation through the depths of lockdep.

  This removes a number of ugly type hacks the external helpers used"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lockdep: Convert some users to const
  lockdep: Make lockdep checking constant
  lockdep: Assign lock keys on registration
2018-01-30 10:44:56 -08:00
Linus Torvalds
b8dbf73086 Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
 "The biggest change in this cycle was the addition of ARM CPER error
  decoding when printing EFI errors into the kernel log.

  There are also misc smaller updates: documentation update, cleanups
  and an EFI memory map permissions quirk"

* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Clarify that reset attack mitigation needs appropriate userspace
  efi: Parse ARM error information value
  efi: Move ARM CPER code to new file
  efi: Use PTR_ERR_OR_ZERO()
  arm64/efi: Ignore EFI_MEMORY_XP attribute if RP and/or WP are set
  efi/capsule-loader: Fix pr_err() string to end with newline
2018-01-30 10:42:39 -08:00
Linus Torvalds
d772794637 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main RCU changes in this cycle were:

   - Updates to use cond_resched() instead of cond_resched_rcu_qs()
     where feasible (currently everywhere except in kernel/rcu and in
     kernel/torture.c). Also a couple of fixes to avoid sending IPIs to
     offline CPUs.

   - Updates to simplify RCU's dyntick-idle handling.

   - Updates to remove almost all uses of smp_read_barrier_depends() and
     read_barrier_depends().

   - Torture-test updates.

   - Miscellaneous fixes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
  torture: Save a line in stutter_wait(): while -> for
  torture: Eliminate torture_runnable and perf_runnable
  torture: Make stutter less vulnerable to compilers and races
  locking/locktorture: Fix num reader/writer corner cases
  locking/locktorture: Fix rwsem reader_delay
  torture: Place all torture-test modules in one MAINTAINERS group
  rcutorture/kvm-build.sh: Skip build directory check
  rcutorture: Simplify functions.sh include path
  rcutorture: Simplify logging
  rcutorture/kvm-recheck-*: Improve result directory readability check
  rcutorture/kvm.sh: Support execution from any directory
  rcutorture/kvm.sh: Use consistent help text for --qemu-args
  rcutorture/kvm.sh: Remove unused variable, `alldone`
  rcutorture: Remove unused script, config2frag.sh
  rcutorture/configinit: Fix build directory error message
  rcutorture: Preempt RCU-preempt readers more vigorously
  torture: Reduce #ifdefs for preempt_schedule()
  rcu: Remove have_rcu_nocb_mask from tree_plugin.h
  rcu: Add comment giving debug strategy for double call_rcu()
  tracing, rcu: Hide trace event rcu_nocb_wake when not used
  ...
2018-01-30 10:15:30 -08:00
Linus Torvalds
c1488798ad Merge branch 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull STRICT_DEVMEM default from Ingo Molnar:
 "Make CONFIG_STRICT_DEVMEM default-y on x86 and arm64 as well, to
  follow the distro status quo"

* 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Kconfig: Make STRICT_DEVMEM default-y on x86 and arm64
2018-01-30 10:11:26 -08:00
Andi Shyti
7546db025f Input: mms114 - use BIT() macro instead of explicit shifting
Signed-off-by: Andi Shyti <andi.shyti@samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2018-01-30 10:08:16 -08:00
Andi Shyti
a22e88fd7d Input: mms114 - replace mdelay with msleep
200 milliseconds is a very long time to keep the CPU busy looping.
Use msleep instead.

Signed-off-by: Andi Shyti <andi.shyti@samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2018-01-30 10:08:12 -08:00
Florian Schmidt
03eac8b221 Documentation: Fix 'file_mapped' -> 'mapped_file'
There is no entry file_mapped in the memory.stat file. This looks like a
simple word flip that's gone unnoticed since 2010 (dc10e281f5,
memcg: update documentation).

Signed-off-by: Florian Schmidt <florian.schmidt@neclab.eu>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-01-30 09:52:11 -08:00
Andreas Gruenbacher
af38816e48 gfs2: Add a few missing newlines in messages
Some of the info, warning, and error messages are missing their trailing
newline.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-30 10:32:30 -07:00
Rob Herring
3a6fbcb2e2 xtensa: remove arch specific early DT functions
Now that the DT core code handles bootmem arches, we can remove the xtensa
specific early_init_dt_alloc_memory_arch function. The common
early_init_dt_add_memory_arch can be used too now that xtensa switched to
memblock.

Cc: Chris Zankel <chris@zankel.net>
Cc: linux-xtensa@linux-xtensa.org
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2018-01-30 11:17:37 -06:00
Rob Herring
b75e250a90 x86: remove arch specific early_init_dt_alloc_memory_arch
Now that the DT core code handles bootmem arches, we can remove the x86
specific early_init_dt_alloc_memory_arch function.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
2018-01-30 11:17:37 -06:00
Rob Herring
f7a6e01c8c nios2: remove arch specific early_init_dt_alloc_memory_arch
Now that the DT core code handles bootmem arches, we can remove the nios2
specific early_init_dt_alloc_memory_arch function.

Cc: Ley Foon Tan <lftan@altera.com>
Cc: nios2-dev@lists.rocketboards.org
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Rob Herring <robh@kernel.org>
2018-01-30 11:17:12 -06:00
Rob Herring
a518a04eb0 mips: remove arch specific early_init_dt_alloc_memory_arch
Now that the DT core code handles bootmem arches, we can remove the MIPS
specific early_init_dt_alloc_memory_arch function.

Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Rob Herring <robh@kernel.org>
2018-01-30 11:15:55 -06:00
Rob Herring
ac94f18f66 metag: remove arch specific early DT functions
Now that the DT core code handles bootmem arches, we can remove the metag
specific early_init_dt_alloc_memory_arch function. As the default
early_init_dt_add_memory_arch just does a WARN, we can remove it too.

Cc: James Hogan <jhogan@kernel.org>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
2018-01-30 11:15:54 -06:00
Rob Herring
0fbc0b67a8 cris: remove arch specific early DT functions
Now that the DT core code handles bootmem arches, we can remove the cris
specific early_init_dt_alloc_memory_arch function. As the default
early_init_dt_add_memory_arch just does a WARN, we can just remove the
entire devicetree.c file.

Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: linux-cris-kernel@axis.com
Signed-off-by: Rob Herring <robh@kernel.org>
2018-01-30 11:15:54 -06:00
Abhi Das
957a7acd46 gfs2: Remove inode from ordered write list in gfs2_write_inode()
The vfs clears the I_DIRTY inode flag before calling gfs2_write_inode()
having queued any data that needed to be written to disk.
This is a good time to remove such inodes from our ordered write list
so they don't hang around for long periods of time.

Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2018-01-30 10:00:27 -07:00
Radim Krčmář
92ea2b3381 KVM: s390: Fixes and features for 4.16 part 2
- exitless interrupts for emulated devices (Michael Mueller)
 - cleanup of cpuflag handling (David Hildenbrand)
 - kvm stat counter improvements (Christian Borntraeger)
 - vsie improvements (David Hildenbrand)
 - mm cleanup (Janosch Frank)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJaayoSAAoJEBF7vIC1phx8T8EP/1xGy6ZZBdVCAT00u3GWZ9eh
 M5m1thSUuhKuHdJHaN1ORrBPNhR5l+Bvf5VMi5LkWORUpQc4jOYz1BX1qrryvWXk
 Bt6363v1rhbInk7uKv4E0q9i3Ei6dSfoT0YByihqiPjkaeJyG830Ez2IUFFdGUQQ
 ulyRtKXJ8Sk9L3LhO9uLHFtU9CZ+2CpiEEM6q3TCNzduU7LC9NHdl/bx4uKqgz4h
 9l/i+P3CXFglBpFDL+JTD72myBbbm78bQAXDoJWSTm9EKolpUZaTP2xpCrrG+A7f
 RRPzJoYOtxEgDTnNcjH16OX2TGXpPL0Q2cTl4vJihaZW9KrTPjYHQY5BIf4EQzhU
 Kd2p1yN/aYQFsLA5cofpkC4HPBKeiocg0HAu6byo9N+uLTjnxoZUkekfq03UXVan
 xwENUYknFmuI0SLnz3f2fHqWKXsdrron+gzldtBUBtTd8EOvh7c7/TNE+IrpTgRo
 HvC4KddOp30filp8GBxXU79i2qFzTyvjGwxzuiMo29n0R0i8tzdCokywsvAgWHvX
 KC8ukcmwsGq1lJB32rZ5RkacBCDS/iSaD0xg7iseHoYYpUPRuWxzoTVV134LGh5Q
 IX42NGvS8mA2I5byd3R9DtDY5eQYSN7bRGsNWSKJCKRSX9zEHpIKTQXT8N/aXvxe
 PQoxcEau3AuzbQs9KjFj
 =YJ0E
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux

KVM: s390: Fixes and features for 4.16 part 2

- exitless interrupts for emulated devices (Michael Mueller)
- cleanup of cpuflag handling (David Hildenbrand)
- kvm stat counter improvements (Christian Borntraeger)
- vsie improvements (David Hildenbrand)
- mm cleanup (Janosch Frank)
2018-01-30 17:42:40 +01:00
Jason Gunthorpe
e7996a9a77 Linux 4.15
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJabj6pAAoJEHm+PkMAQRiGs8cIAJQFkCWnbz86e3vG4DuWhyA8
 CMGHCQdUOxxFGa/ixhIiuetbC0x+JVHAjV2FwVYbAQfaZB3pfw2iR1ncQxpAP1AI
 oLU9vBEqTmwKMPc9CM5rRfnLFWpGcGwUNzgPdxD5yYqGDtcM8K840mF6NdkYe5AN
 xU8rv1wlcFPF4A5pvHCH0pvVmK4VxlVFk/2H67TFdxBs4PyJOnSBnf+bcGWgsKO6
 hC8XIVtcKCH2GfFxt5d0Vgc5QXJEpX1zn2mtCa1MwYRjN2plgYfD84ha0xE7J0B0
 oqV/wnjKXDsmrgVpncr3txd4+zKJFNkdNRE4eLAIupHo2XHTG4HvDJ5dBY2NhGU=
 =sOml
 -----END PGP SIGNATURE-----

Merge tag v4.15 of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

To resolve conflicts in:
 drivers/infiniband/hw/mlx5/main.c
 drivers/infiniband/hw/mlx5/qp.c

From patches merged into the -rc cycle. The conflict resolution matches
what linux-next has been carrying.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-30 09:30:00 -07:00
Andy Lutomirski
37a8f7c383 x86/asm: Move 'status' from thread_struct to thread_info
The TS_COMPAT bit is very hot and is accessed from code paths that mostly
also touch thread_info::flags.  Move it into struct thread_info to improve
cache locality.

The only reason it was in thread_struct is that there was a brief period
during which arch-specific fields were not allowed in struct thread_info.

Linus suggested further changing:

  ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);

to:

  if (unlikely(ti->status & (TS_COMPAT|TS_I386_REGS_POKED)))
          ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);

on the theory that frequently dirtying the cacheline even in pure 64-bit
code that never needs to modify status hurts performance.  That could be a
reasonable followup patch, but I suspect it matters less on top of this
patch.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kernel Hardening <kernel-hardening@lists.openwall.com>
Link: https://lkml.kernel.org/r/03148bcc1b217100e6e8ecf6a5468c45cf4304b6.1517164461.git.luto@kernel.org
2018-01-30 15:30:36 +01:00
Andy Lutomirski
d1f7732009 x86/entry/64: Push extra regs right away
With the fast path removed there is no point in splitting the push of the
normal and the extra register set. Just push the extra regs right away.

[ tglx: Split out from 'x86/entry/64: Remove the SYSCALL64 fast path' ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kernel Hardening <kernel-hardening@lists.openwall.com>
Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.1517164461.git.luto@kernel.org
2018-01-30 15:30:36 +01:00
Andy Lutomirski
21d375b6b3 x86/entry/64: Remove the SYSCALL64 fast path
The SYCALLL64 fast path was a nice, if small, optimization back in the good
old days when syscalls were actually reasonably fast.  Now there is PTI to
slow everything down, and indirect branches are verboten, making everything
messier.  The retpoline code in the fast path is particularly nasty.

Just get rid of the fast path. The slow path is barely slower.

[ tglx: Split out the 'push all extra regs' part ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kernel Hardening <kernel-hardening@lists.openwall.com>
Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.1517164461.git.luto@kernel.org
2018-01-30 15:30:36 +01:00
Dou Liyang
9471eee918 x86/spectre: Check CONFIG_RETPOLINE in command line parser
The spectre_v2 option 'auto' does not check whether CONFIG_RETPOLINE is
enabled. As a consequence it fails to emit the appropriate warning and sets
feature flags which have no effect at all.

Add the missing IS_ENABLED() check.

Fixes: da28512156 ("x86/spectre: Add boot time option to select Spectre v2 mitigation")
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: Tomohiro" <misono.tomohiro@jp.fujitsu.com>
Cc: dave.hansen@intel.com
Cc: bp@alien8.de
Cc: arjan@linux.intel.com
Cc: dwmw@amazon.co.uk
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/f5892721-7528-3647-08fb-f8d10e65ad87@cn.fujitsu.com
2018-01-30 15:30:35 +01:00
William Grant
55f49fcb87 x86/mm: Fix overlap of i386 CPU_ENTRY_AREA with FIX_BTMAP
Since commit 92a0f81d89 ("x86/cpu_entry_area: Move it out of the
fixmap"), i386's CPU_ENTRY_AREA has been mapped to the memory area just
below FIXADDR_START. But already immediately before FIXADDR_START is the
FIX_BTMAP area, which means that early_ioremap can collide with the entry
area.

It's especially bad on PAE where FIX_BTMAP_BEGIN gets aligned to exactly
match CPU_ENTRY_AREA_BASE, so the first early_ioremap slot clobbers the
IDT and causes interrupts during early boot to reset the system.

The overlap wasn't a problem before the CPU entry area was introduced,
as the fixmap has classically been preceded by the pkmap or vmalloc
areas, neither of which is used until early_ioremap is out of the
picture.

Relocate CPU_ENTRY_AREA to below FIX_BTMAP, not just below the permanent
fixmap area.

Fixes: commit 92a0f81d89 ("x86/cpu_entry_area: Move it out of the fixmap")
Signed-off-by: William Grant <william.grant@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/7041d181-a019-e8b9-4e4e-48215f841e2c@canonical.com
2018-01-30 15:30:35 +01:00
James Hogan
91e6dd8284 ipmr: Fix ptrdiff_t print formatting
ipmr_vif_seq_show() prints the difference between two pointers with the
format string %2zd (z for size_t), however the correct format string is
%2td instead (t for ptrdiff_t).

The same bug in ip6mr_vif_seq_show() was already fixed long ago by
commit d430a227d2 ("bogus format in ip6mr").

Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-30 09:20:25 -05:00
Dmitry Mastykin
02e389e63e pinctrl: mcp23s08: fix irq setup order
When using mcp23s08 module with gpio-keys, often (50% of boots)
it fails to get irq numbers with message:
"gpio-keys keys: Unable to get irq number for GPIO 0, error -6".
Seems that irqs must be setup before devm_gpiochip_add_data().

Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Mastykin <mastichi@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2018-01-30 15:17:14 +01:00
Josh Poimboeuf
830c1e3d16 objtool: Warn on stripped section symbol
With the following fix:

  2a0098d706 ("objtool: Fix seg fault with gold linker")

... a seg fault was avoided, but the original seg fault condition in
objtool wasn't fixed.  Replace the seg fault with an error message.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/dc4585a70d6b975c99fc51d1957ccdde7bd52f3a.1517284349.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-30 15:09:23 +01:00
Josh Poimboeuf
17bc33914b objtool: Add support for alternatives at the end of a section
Now that the previous patch gave objtool the ability to read retpoline
alternatives, it shows a new warning:

  arch/x86/entry/entry_64.o: warning: objtool: .entry_trampoline: don't know how to handle alternatives at end of section

This is due to the JMP_NOSPEC in entry_SYSCALL_64_trampoline().

Previously, objtool ignored this situation because it wasn't needed, and
it would have required a bit of extra code.  Now that this case exists,
add proper support for it.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/2a30a3c2158af47d891a76e69bb1ef347e0443fd.1517284349.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-30 15:09:17 +01:00
Josh Poimboeuf
a845c7cf4b objtool: Improve retpoline alternative handling
Currently objtool requires all retpolines to be:

  a) patched in with alternatives; and

  b) annotated with ANNOTATE_NOSPEC_ALTERNATIVE.

If you forget to do both of the above, objtool segfaults trying to
dereference a NULL 'insn->call_dest' pointer.

Avoid that situation and print a more helpful error message:

  quirks.o: warning: objtool: efi_delete_dummy_variable()+0x99: unsupported intra-function call
  quirks.o: warning: objtool: If this is a retpoline, please patch it in with alternatives and annotate it with ANNOTATE_NOSPEC_ALTERNATIVE.

Future improvements can be made to make objtool smarter with respect to
retpolines, but this is a good incremental improvement for now.

Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/819e50b6d9c2e1a22e34c1a636c0b2057cc8c6e5.1517284349.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-30 15:09:14 +01:00
Ingo Molnar
7e86548e2c Linux 4.15
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJabj6pAAoJEHm+PkMAQRiGs8cIAJQFkCWnbz86e3vG4DuWhyA8
 CMGHCQdUOxxFGa/ixhIiuetbC0x+JVHAjV2FwVYbAQfaZB3pfw2iR1ncQxpAP1AI
 oLU9vBEqTmwKMPc9CM5rRfnLFWpGcGwUNzgPdxD5yYqGDtcM8K840mF6NdkYe5AN
 xU8rv1wlcFPF4A5pvHCH0pvVmK4VxlVFk/2H67TFdxBs4PyJOnSBnf+bcGWgsKO6
 hC8XIVtcKCH2GfFxt5d0Vgc5QXJEpX1zn2mtCa1MwYRjN2plgYfD84ha0xE7J0B0
 oqV/wnjKXDsmrgVpncr3txd4+zKJFNkdNRE4eLAIupHo2XHTG4HvDJ5dBY2NhGU=
 =sOml
 -----END PGP SIGNATURE-----

Merge tag 'v4.15' into x86/pti, to be able to merge dependent changes

Time has come to switch PTI development over to a v4.15 base - we'll still
try to make sure that all PTI fixes backport cleanly to v4.14 and earlier.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-30 15:08:27 +01:00
Daniel Mentz
a1dfb4c48c media: v4l2-compat-ioctl32.c: refactor compat ioctl32 logic
The 32-bit compat v4l2 ioctl handling is implemented based on its 64-bit
equivalent. It converts 32-bit data structures into its 64-bit
equivalents and needs to provide the data to the 64-bit ioctl in user
space memory which is commonly allocated using
compat_alloc_user_space().

However, due to how that function is implemented, it can only be called
a single time for every syscall invocation.

Supposedly to avoid this limitation, the existing code uses a mix of
memory from the kernel stack and memory allocated through
compat_alloc_user_space().

Under normal circumstances, this would not work, because the 64-bit
ioctl expects all pointers to point to user space memory. As a
workaround, set_fs(KERNEL_DS) is called to temporarily disable this
extra safety check and allow kernel pointers. However, this might
introduce a security vulnerability: The result of the 32-bit to 64-bit
conversion is writeable by user space because the output buffer has been
allocated via compat_alloc_user_space(). A malicious user space process
could then manipulate pointers inside this output buffer, and due to the
previous set_fs(KERNEL_DS) call, functions like get_user() or put_user()
no longer prevent kernel memory access.

The new approach is to pre-calculate the total amount of user space
memory that is needed, allocate it using compat_alloc_user_space() and
then divide up the allocated memory to accommodate all data structures
that need to be converted.

An alternative approach would have been to retain the union type karg
that they allocated on the kernel stack in do_video_ioctl(), copy all
data from user space into karg and then back to user space. However, we
decided against this approach because it does not align with other
compat syscall implementations. Instead, we tried to replicate the
get_user/put_user pairs as found in other places in the kernel:

    if (get_user(clipcount, &up->clipcount) ||
        put_user(clipcount, &kp->clipcount)) return -EFAULT;

Notes from hans.verkuil@cisco.com:

This patch was taken from:
    97b733953c

Clearly nobody could be bothered to upstream this patch or at minimum
tell us :-( We only heard about this a week ago.

This patch was rebased and cleaned up. Compared to the original I
also swapped the order of the convert_in_user arguments so that they
matched copy_in_user. It was hard to review otherwise. I also replaced
the ALLOC_USER_SPACE/ALLOC_AND_GET by a normal function.

Fixes: 6b5a9492ca ("v4l: introduce string control support.")

Signed-off-by: Daniel Mentz <danielmentz@google.com>
Co-developed-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Cc: <stable@vger.kernel.org>      # for v4.15 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2018-01-30 07:40:41 -05:00
Hans Verkuil
d83a8243aa media: v4l2-compat-ioctl32.c: don't copy back the result for certain errors
Some ioctls need to copy back the result even if the ioctl returned
an error. However, don't do this for the error code -ENOTTY.
It makes no sense in that cases.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: <stable@vger.kernel.org>      # for v4.15 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2018-01-30 07:35:16 -05:00
Hans Verkuil
169f24ca68 media: v4l2-compat-ioctl32.c: drop pr_info for unknown buffer type
There is nothing wrong with using an unknown buffer type. So
stop spamming the kernel log whenever this happens. The kernel
will just return -EINVAL to signal this.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: <stable@vger.kernel.org>      # for v4.15 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2018-01-30 07:34:09 -05:00
Hans Verkuil
a751be5b14 media: v4l2-compat-ioctl32.c: copy clip list in put_v4l2_window32
put_v4l2_window32() didn't copy back the clip list to userspace.
Drivers can update the clip rectangles, so this should be done.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: <stable@vger.kernel.org>      # for v4.15 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2018-01-30 07:33:14 -05:00
Hans Verkuil
b8c601e8af media: v4l2-compat-ioctl32.c: fix ctrl_is_pointer
ctrl_is_pointer just hardcoded two known string controls, but that
caused problems when using e.g. custom controls that use a pointer
for the payload.

Reimplement this function: it now finds the v4l2_ctrl (if the driver
uses the control framework) or it calls vidioc_query_ext_ctrl (if the
driver implements that directly).

In both cases it can now check if the control is a pointer control
or not.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: <stable@vger.kernel.org>      # for v4.15 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2018-01-30 07:31:33 -05:00
Hans Verkuil
8ed5a59dcb media: v4l2-compat-ioctl32.c: copy m.userptr in put_v4l2_plane32
The struct v4l2_plane32 should set m.userptr as well. The same
happens in v4l2_buffer32 and v4l2-compliance tests for this.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: <stable@vger.kernel.org>      # for v4.15 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2018-01-30 07:30:22 -05:00