Commit graph

62682 commits

Author SHA1 Message Date
Randy Dunlap
404376af78 Documentation: kernel-api: add bitmap operations from linux/bitmap.h
Add <linux/bitmap.h> to kernel-api Bitmap Operations section.
Fix kernel-doc nitpicks in <linux/bitmap.h>.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-09-26 14:42:25 -06:00
Daniel Borkmann
de8f3a83b0 bpf: add meta pointer for direct access
This work enables generic transfer of metadata from XDP into skb. The
basic idea is that we can make use of the fact that the resulting skb
must be linear and already comes with a larger headroom for supporting
bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work
on a similar principle and introduce a small helper bpf_xdp_adjust_meta()
for adjusting a new pointer called xdp->data_meta. Thus, the packet has
a flexible and programmable room for meta data, followed by the actual
packet data. struct xdp_buff is therefore laid out that we first point
to data_hard_start, then data_meta directly prepended to data followed
by data_end marking the end of packet. bpf_xdp_adjust_head() takes into
account whether we have meta data already prepended and if so, memmove()s
this along with the given offset provided there's enough room.

xdp->data_meta is optional and programs are not required to use it. The
rationale is that when we process the packet in XDP (e.g. as DoS filter),
we can push further meta data along with it for the XDP_PASS case, and
give the guarantee that a clsact ingress BPF program on the same device
can pick this up for further post-processing. Since we work with skb
there, we can also set skb->mark, skb->priority or other skb meta data
out of BPF, thus having this scratch space generic and programmable
allows for more flexibility than defining a direct 1:1 transfer of
potentially new XDP members into skb (it's also more efficient as we
don't need to initialize/handle each of such new members). The facility
also works together with GRO aggregation. The scratch space at the head
of the packet can be multiple of 4 byte up to 32 byte large. Drivers not
yet supporting xdp->data_meta can simply be set up with xdp->data_meta
as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out,
such that the subsequent match against xdp->data for later access is
guaranteed to fail.

The verifier treats xdp->data_meta/xdp->data the same way as we treat
xdp->data/xdp->data_end pointer comparisons. The requirement for doing
the compare against xdp->data is that it hasn't been modified from it's
original address we got from ctx access. It may have a range marking
already from prior successful xdp->data/xdp->data_end pointer comparisons
though.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:44 -07:00
Daniel Borkmann
6aaae2b6c4 bpf: rename bpf_compute_data_end into bpf_compute_data_pointers
Just do the rename into bpf_compute_data_pointers() as we'll add
one more pointer here to recompute.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:44 -07:00
Michal Kalderon
1e99c49701 qed: iWARP - Add check for errors on a SYN packet
A SYN packet which arrives with errors from FW should be dropped.
This required adding an additional field to the ll2
rx completion data.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 11:22:03 -07:00
Shaohua Li
0b508bc926 block: fix a build error
The code is only for blkcg not for all cgroups

Fixes: d4478e92d6 ("block/loop: make loop cgroup aware")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-26 12:07:24 -06:00
Shaohua Li
902ec5b6de block: make blkcg aware of kthread stored original cgroup info
bio_blkcg is the only API to get cgroup info for a bio right now. If
bio_blkcg finds current task is a kthread and has original blkcg
associated, it will use the css instead of associating the bio to
current task. This makes it possible that kthread dispatches bios on
behalf of other threads.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-26 07:41:22 -06:00
Shaohua Li
af551fb3be blkcg: delete unused APIs
Nobody uses the APIs right now.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-26 07:41:22 -06:00
Shaohua Li
05e3db95eb kthread: add a mechanism to store cgroup info
kthread usually runs jobs on behalf of other threads. The jobs should be
charged to cgroup of original threads. But the jobs run in a kthread,
where we lose the cgroup context of original threads. The patch adds a
machanism to record cgroup info of original threads in kthread context.
Later we can retrieve the cgroup info and attach the cgroup info to jobs.

Since this mechanism is only required by kthread, we store the cgroup
info in kthread data instead of generic task_struct.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-26 07:41:22 -06:00
Corentin Labbe
003278e431 nfs_common: convert int to bool
Since __state_in_grace return only true/false, make it return bool
instead of int.
Same change for the two user of it, locks_in_grace/opens_in_grace

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-09-26 09:25:38 -04:00
Linus Torvalds
19240e6b2a Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - Two sets of NVMe pull requests from Christoph:
      - Fixes for the Fibre Channel host/target to fix spec compliance
      - Allow a zero keep alive timeout
      - Make the debug printk for broken SGLs work better
      - Fix queue zeroing during initialization
      - Set of RDMA and FC fixes
      - Target div-by-zero fix

 - bsg double-free fix.

 - ndb unknown ioctl fix from Josef.

 - Buffered vs O_DIRECT page cache inconsistency fix. Has been floating
   around for a long time, well reviewed. From Lukas.

 - brd overflow fix from Mikulas.

 - Fix for a loop regression in this merge window, where using a union
   for two members of the loop_cmd turned out to be a really bad idea.
   From Omar.

 - Fix for an iostat regression fix in this series, using the wrong API
   to get at the block queue. From Shaohua.

 - Fix for a potential blktrace delection deadlock. From Waiman.

* 'for-linus' of git://git.kernel.dk/linux-block: (30 commits)
  nvme-fcloop: fix port deletes and callbacks
  nvmet-fc: sync header templates with comments
  nvmet-fc: ensure target queue id within range.
  nvmet-fc: on port remove call put outside lock
  nvme-rdma: don't fully stop the controller in error recovery
  nvme-rdma: give up reconnect if state change fails
  nvme-core: Use nvme_wq to queue async events and fw activation
  nvme: fix sqhd reference when admin queue connect fails
  block: fix a crash caused by wrong API
  fs: Fix page cache inconsistency when mixing buffered and AIO DIO
  nvmet: implement valid sqhd values in completions
  nvme-fabrics: Allow 0 as KATO value
  nvme: allow timed-out ios to retry
  nvme: stop aer posting if controller state not live
  nvme-pci: Print invalid SGL only once
  nvme-pci: initialize queue memory before interrupts
  nvmet-fc: fix failing max io queue connections
  nvme-fc: use transport-specific sgl format
  nvme: add transport SGL definitions
  nvme.h: remove FC transport-specific error values
  ...
2017-09-25 15:46:04 -07:00
Rafael J. Wysocki
a380f2edef PM / core: Drop legacy class suspend/resume operations
There are no classes using the legacy suspend/resume operations in
the tree any more, so drop these operations and update the code
referring to them accordingly.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-25 23:58:37 +02:00
Peter Zijlstra
1db49484f2 smp/hotplug: Hotplug state fail injection
Add a sysfs file to one-time fail a specific state. This can be used
to test the state rollback code paths.

Something like this (hotplug-up.sh):

  #!/bin/bash

  echo 0 > /debug/sched_debug
  echo 1 > /debug/tracing/events/cpuhp/enable

  ALL_STATES=`cat /sys/devices/system/cpu/hotplug/states | cut -d':' -f1`
  STATES=${1:-$ALL_STATES}

  for state in $STATES
  do
	  echo 0 > /sys/devices/system/cpu/cpu1/online
	  echo 0 > /debug/tracing/trace
	  echo Fail state: $state
	  echo $state > /sys/devices/system/cpu/cpu1/hotplug/fail
	  cat /sys/devices/system/cpu/cpu1/hotplug/fail
	  echo 1 > /sys/devices/system/cpu/cpu1/online

	  cat /debug/tracing/trace > hotfail-${state}.trace

	  sleep 1
  done

Can be used to test for all possible rollback (barring multi-instance)
scenarios on CPU-up, CPU-down is a trivial modification of the above.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.972581715@infradead.org
2017-09-25 22:11:44 +02:00
Peter Zijlstra
fac1c20402 smp/hotplug: Add state diagram
Add a state diagram to clarify when which states are ran where.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.661598270@infradead.org
2017-09-25 22:11:41 +02:00
Thomas Gleixner
4c3711d7fb timekeeping: Provide NMI safe access to clock realtime
The configurable printk timestamping wants access to clock realtime. Right
now there is no ktime_get_real_fast_ns() accessor because reading the
monotonic base and the realtime offset cannot be done atomically. Contrary
to boot time this offset can change during runtime and cause half updated
readouts.

struct tk_read_base was fully packed when the fast timekeeper access was
implemented. commit ceea5e3771 ("time: Fix clock->read(clock) race around
clocksource changes") removed the 'read' function pointer from the
structure, but of course left the comment stale.

So now the structure can fit a new 64bit member w/o violating the cache
line constraints.

Add real_base to tk_read_base and update it in the fast timekeeper update
sequence.

Implement an accessor which follows the same scheme as the accessor to
clock monotonic, but uses the new real_base to access clock real time.

The runtime overhead for updating real_base is minimal as it just adds two
cache hot values and stores them into an already dirtied cache line along
with the other fast timekeeper updates.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Peter Zijlstra <peterz@infradead,org>
Link: https://lkml.kernel.org/r/1505757060-2004-3-git-send-email-prarit@redhat.com
2017-09-25 21:05:59 +02:00
Stefan Wahren
88bbe85dcd irqchip: bcm2836: Move SMP startup code to arch/arm (v2)
In order to easily provide SMP for BCM2837 on 32-bit and 64-bit
the SMP startup code was placed in irq-bcm2836. That's not the
right approach. So move this code where it belongs.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: 41f4988cc2 ("irqchip/bcm2836: Add SMP support for the 2836")
Tested-by: Eric Anholt <eric@anholt.net>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2017-09-25 11:52:26 -07:00
Danilo Krummrich
1d66af8190 clk: bcm2835: remove remains from stub clk driver
This commit removes the fixed clocks introduced as a stub clock driver
added with commit 75fabc3f64 ("ARM: bcm2835: add stub clock driver").
Originally they were used to drive the AMBA bus and PL011 uart driver.
Now these clocks are derived by the CPRMAN clock driver and configured
in DT.

Additionally, get rid of init_machine function in bcm2835 board file
as there's nothing to do any longer.

Signed-off-by: Danilo Krummrich <danilokrummrich@dk-develop.de>
Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
2017-09-25 11:52:24 -07:00
James Smart
6b71f9e1e8 nvmet-fc: sync header templates with comments
Comments were incorrect:
- defer_rcv was in host port template. moved to target port template
- Added Mandatory statements for target port template items

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-25 12:42:11 -06:00
Thomas Gleixner
2f75d9e1c9 genirq: Implement bitmap matrix allocator
Implement the infrastructure for a simple bitmap based allocator, which
will replace the x86 vector allocator. It's in the core code as other
architectures might be able to reuse/extend it. For now it only implements
allocations for single CPUs, but it's simple to add multi CPU allocation
support if required.

The concept is rather simple:

 Global information:
 	system_vector bitmap
	global accounting

 PerCPU information:
 	allocation bitmap
	managed allocation bitmap
	local accounting

The system vector bitmap is used to exclude vectors system wide from the
allocation space.

The allocation bitmap is used to keep track of per cpu used vectors.

The managed allocation bitmap is used to reserve vectors for managed
interrupts.

When a regular (non managed) interrupt allocation happens then the
following rule applies:

      tmpmap = system_map | alloc_map | managed_map
      find_zero_bit(tmpmap)

Oring the bitmaps together gives the real available space. The same rule
applies for reserving a managed interrupt vector. But contrary to the
regular interrupts the reservation only marks the bit in the managed map
and therefor excludes it from the regular allocations. The managed map is
only cleaned out when the a managed interrupt is completely released and it
stays alive accross CPU offline/online operations.

For managed interrupt allocations the rule is:

      tmpmap = managed_map & ~alloc_map
      find_first_bit(tmpmap)

This returns the first bit which is in the managed map, but not yet
allocated in the allocation map. The allocation marks it in the allocation
map and hands it back to the caller for use.

The rest of the code are helper functions to handle the various
requirements and the accounting which are necessary to replace the x86
vector allocation code. The result is a single patch as the evolution of
this infrastructure cannot be represented in bits and pieces.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213153.185437174@linutronix.de
2017-09-25 20:38:26 +02:00
Thomas Gleixner
22d0b12f35 genirq/irqdomain: Add force reactivation flag to irq domains
Allow irqdomains to tell the core code, that after early activation the
interrupt needs to be reactivated at request_irq() time.

This allows reservation of vectors at early activation time and actual
vector assignment at request_irq() time.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213153.106242536@linutronix.de
2017-09-25 20:38:25 +02:00
Thomas Gleixner
42e1cc2dc5 genirq/irqdomain: Propagate early activation
Propagate the early activation mode to the irqdomain activate()
callbacks. This is required for the upcoming reservation, late vector
assignment scheme, so that the early activation call can act accordingly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213153.028353660@linutronix.de
2017-09-25 20:38:25 +02:00
Thomas Gleixner
bb9b428a5c genirq/irqdomain: Allow irq_domain_activate_irq() to fail
Allow irq_domain_activate_irq() to fail. This is required to support a
reservation and late vector assignment scheme.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.933882227@linutronix.de
2017-09-25 20:38:24 +02:00
Thomas Gleixner
7249164346 genirq/irqdomain: Update irq_domain_ops.activate() signature
The irq_domain_ops.activate() callback has no return value and no way to
tell the function that the activation is early.

The upcoming changes to support a reservation scheme which allows to assign
interrupt vectors on x86 only when the interrupt is actually requested
requires:

  - A return value, so activation can fail at request_irq() time
  
  - Information that the activate invocation is early, i.e. before
    request_irq().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.848490816@linutronix.de
2017-09-25 20:38:24 +02:00
Thomas Gleixner
457f6d3507 genirq: Make state consistent for !IRQ_DOMAIN_HIERARCHY
In the !IRQ_DOMAIN_HIERARCHY cas the activation stubs are not
setting/clearing the activation status bits. This is not a problem at the
moment, but upcoming changes require a correct status.

Add the set/clear incovations to the stub functions and move them to the
core internal header to avoid duplication and visibility outside the core.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.591985591@linutronix.de
2017-09-25 20:38:23 +02:00
Thomas Gleixner
c3e7239a7f irqdomain/debugfs: Provide domain specific debug callback
Some interrupt domains like the X86 vector domain has special requirements
for debugging, like showing the vector usage on the CPUs.

Add a callback to the irqdomain ops which can be filled in by domains which
require it and add conditional invocations to the irqdomain and the per irq
debug files.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.512937505@linutronix.de
2017-09-25 20:38:23 +02:00
Thomas Gleixner
07557ccb8c genirq/msi: Capture device name for debugfs
For debugging the allocation of unused or potentially leaked interrupt
descriptor it's helpful to have some information about the site which
allocated them. In case of MSI this is simple because the caller hands the
device struct pointer into the domain allocation function.

Duplicate the device name and show it in the debugfs entry of the interrupt
descriptor.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.433038426@linutronix.de
2017-09-25 20:38:22 +02:00
Geert Uytterhoeven
fe59493240 PCI: Add dummy pci_acs_enabled() for CONFIG_PCI=n build
If CONFIG_PCI=n and gcc (e.g. 4.1.2) decides not to inline
get_pci_function_alias_group(), the build fails with:

  drivers/iommu/iommu.o: In function `get_pci_function_alias_group':
  iommu.c:(.text+0xfdc): undefined reference to `pci_acs_enabled'

Due to the various dummies for PCI calls in the CONFIG_PCI=n case,
pci_acs_enabled() never called, but not all versions of gcc are smart
enough to realize that.

While explicitly marking get_pci_function_alias_group() inline would fix
the build, this would inflate the code for the CONFIG_PCI=y case, as
get_pci_function_alias_group() is a not-so-small function called from two
places.

Hence fix the issue by introducing a dummy for pci_acs_enabled() instead.

Fixes: 0ae349a0f3 ("iommu/qcom: Add qcom_iommu")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2017-09-25 11:08:20 -05:00
Tejun Heo
041cd640b2 cgroup: Implement cgroup2 basic CPU usage accounting
In cgroup1, while cpuacct isn't actually controlling any resources, it
is a separate controller due to combination of two factors -
1. enabling cpu controller has significant side effects, and 2. we
have to pick one of the hierarchies to account CPU usages on.  cpuacct
controller is effectively used to designate a hierarchy to track CPU
usages on.

cgroup2's unified hierarchy removes the second reason and we can
account basic CPU usages by default.  While we can use cpuacct for
this purpose, both its interface and implementation leave a lot to be
desired - it collects and exposes two sources of truth which don't
agree with each other and some of the exposed statistics don't make
much sense.  Also, it propagates all the way up the hierarchy on each
accounting event which is unnecessary.

This patch adds basic resource accounting mechanism to cgroup2's
unified hierarchy and accounts CPU usages using it.

* All accountings are done per-cpu and don't propagate immediately.
  It just bumps the per-cgroup per-cpu counters and links to the
  parent's updated list if not already on it.

* On a read, the per-cpu counters are collected into the global ones
  and then propagated upwards.  Only the per-cpu counters which have
  changed since the last read are propagated.

* CPU usage stats are collected and shown in "cgroup.stat" with "cpu."
  prefix.  Total usage is collected from scheduling events.  User/sys
  breakdown is sourced from tick sampling and adjusted to the usage
  using cputime_adjust().

This keeps the accounting side hot path O(1) and per-cpu and the read
side O(nr_updated_since_last_read).

v2: Minor changes and documentation updates as suggested by Waiman and
    Roman.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Roman Gushchin <guro@fb.com>
2017-09-25 08:12:05 -07:00
Tejun Heo
d2cc5ed694 cpuacct: Introduce cgroup_account_cputime[_field]()
Introduce cgroup_account_cputime[_field]() which wrap cpuacct_charge()
and cgroup_account_field().  This doesn't introduce any functional
changes and will be used to add cgroup basic resource accounting.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
2017-09-25 08:12:04 -07:00
Tejun Heo
cfb766da54 sched/cputime: Expose cputime_adjust()
Will be used by basic cgroup resource stat reporting later.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
2017-09-25 08:12:04 -07:00
James Smart
d85cf20749 nvme: add transport SGL definitions
Add transport SGL defintions from NVMe TP 4008, required for
the final NVMe-FC standard.

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-25 08:56:05 -06:00
James Smart
c98cb3bd88 nvme.h: remove FC transport-specific error values
The NVM express group recinded the reserved range for the transport.
Remove the FC-centric values that had been defined.

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-25 08:56:05 -06:00
Waiman Long
5acb3cc2c2 blktrace: Fix potential deadlock between delete & sysfs ops
The lockdep code had reported the following unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(s_active#228);
                               lock(&bdev->bd_mutex/1);
                               lock(s_active#228);
  lock(&bdev->bd_mutex);

 *** DEADLOCK ***

The deadlock may happen when one task (CPU1) is trying to delete a
partition in a block device and another task (CPU0) is accessing
tracing sysfs file (e.g. /sys/block/dm-1/trace/act_mask) in that
partition.

The s_active isn't an actual lock. It is a reference count (kn->count)
on the sysfs (kernfs) file. Removal of a sysfs file, however, require
a wait until all the references are gone. The reference count is
treated like a rwsem using lockdep instrumentation code.

The fact that a thread is in the sysfs callback method or in the
ioctl call means there is a reference to the opended sysfs or device
file. That should prevent the underlying block structure from being
removed.

Instead of using bd_mutex in the block_device structure, a new
blk_trace_mutex is now added to the request_queue structure to protect
access to the blk_trace structure.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Fix typo in patch subject line, and prune a comment detailing how
the code used to work.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-09-25 08:56:05 -06:00
Eric Biggers
237bbd29f7 KEYS: prevent creating a different user's keyrings
It was possible for an unprivileged user to create the user and user
session keyrings for another user.  For example:

    sudo -u '#3000' sh -c 'keyctl add keyring _uid.4000 "" @u
                           keyctl add keyring _uid_ses.4000 "" @u
                           sleep 15' &
    sleep 1
    sudo -u '#4000' keyctl describe @u
    sudo -u '#4000' keyctl describe @us

This is problematic because these "fake" keyrings won't have the right
permissions.  In particular, the user who created them first will own
them and will have full access to them via the possessor permissions,
which can be used to compromise the security of a user's keys:

    -4: alswrv-----v------------  3000     0 keyring: _uid.4000
    -5: alswrv-----v------------  3000     0 keyring: _uid_ses.4000

Fix it by marking user and user session keyrings with a flag
KEY_FLAG_UID_KEYRING.  Then, when searching for a user or user session
keyring by name, skip all keyrings that don't have the flag set.

Fixes: 69664cf16a ("keys: don't generate user and user session keyrings unless they're accessed")
Cc: <stable@vger.kernel.org>	[v2.6.26+]
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2017-09-25 15:19:57 +01:00
Greg Kroah-Hartman
069f0e0c06 Round one of new device support, features and cleanup for IIO in the 4.15 cycle.
Note there is a misc driver drop in here given we have support
 in IIO and the feeling is no one will care.
 
 A large part of this series is a boiler plate removal series avoiding
 the need to explicitly provide THIS_MODULE in various locations.
 It's very dull but touches all drivers.
 
 New device support
 * ad5446
   - add ids to support compatible parts DAC081S101, DAC101S101,
     DAC121S101.
   - add the dac7512 id and drop the misc driver as feeling is no
     one is using it (was introduced for a board that is long obsolete)
 * mt6577
   - add bindings for mt2712 which is fully compatible with other
     supported parts.
 * st_pressure
   - add support for LPS33HW and LPS35HW with bindings (ids mostly).
 
 New features
 * ccs811
   - Add support for the data ready trigger.
 * mma8452
   - remove artifical restriction on supporting multiple event types
     at the same time.
 * tcs3472
   - support out of threshold events
 
 Core and tree wide cleanup
 * Use macro magic to remove the need to provide THIS_MODULE as part of
   struct iio_info or struct iio_trigger_ops.  This is similar to
   work done in a number of other subsystems (e.g. i2c, spi).
 
   All drivers are fixed and then the fields in these structures are
   removed.
 
   This will cause build failures for out of tree drivers and any
   new drivers that cross with this work going into the kernel.
 
   Note mostly done with a coccinelle patch, included in the series
   on the mailing list but not merged as the fields no longer exist
   in the structures so the any hold outs will cause a build failure.
 
 Cleanups
 * ads1015
   - avoid writing config register when it doesn't change.
   - add 10% to conversion wait time as it seems it is sometimes
     a little small.
 * ade7753
   - replace use of core mlock with a local lock.  This is part of a
     long term effort to make the use of mlock opaque and single
     purpose.
 * ade7759
   - expand the use of buf_lock to cover previous mlock cases.  This
     is a slightly nicer solution to the same issue as in ade7753.
 * cros_ec
   - drop an unused variable
 * inv_mpu6050
   - add a missing break in a switch for consistency - not actual
     bug,
   - make some local arrays static to save on object code size.
 * max5481
   - drop manual setting of the spi module owner as handled by the
     spi core.
 * max5487
   - drop manual setting of the spi module owner as handled by the
     spi core.
 * max9611
   - drop explicit setting of the i2c module owner as handled by
     the i2c core.
 * mcp320x
   - speed up reads on single channel devices,
   - drop unused of_device_id data elements,
   - document the struct mcp320x,
   - improve binding docs to reflect restrictions on spi setup and
     to make it explicit that the reference regulator is needed.
 * mma8452
   - symbolic to octal permissions,
   - unsigned to unsigned int.
 * st_lsm6dsx
   - avoid setting odr values multiple times,
   - drop config of LIR as it is only ever set to the existing
     defaults,
   - drop rounding configuration as it only ever matches the defaults.
 * ti-ads8688
   - drop manual setting of the spi module owner as handled by the
     spi core.
 * tsl2x7x
   - constify the i2c_device_id,
   - cleanup limit checks to avoid static checker warnings (and generally
     have nicer code).
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAlnH4XkRHGppYzIzQGtl
 cm5lbC5vcmcACgkQVIU0mcT0Fogkaw//e3z+4TQT2Hn+550lBYUV8pBR5emDiSe3
 0QTG+ZS7Kh+fPYENLCXtW9ttZicmUSqkTQFvlMTjAxHyj9XzL7+BXS9UlNgVLsqX
 pn9KprPj31lrXpJOXMSgcdiqWMZLtCvprAJgnwfZt1GevS3WbCMmnnoaBuJV61jp
 w0VD+forukTGF7b0OMGB0d5mwtYS0bJYqXRRMPD+2bNeM4hqt5YM3+wHSqP35t3l
 MoaqKlbx7ZtKDF4zIa59nKNP7Ky7IByWogLZRlJ/vD/uKrACckPT22+KT8rX2TwA
 NpZb1Oy/KZBTl5D9iRjZADq4NaRJENFXJiG6GkjoGjrQhUqHaCinHWpLioqLGlRq
 qCPL2nRjqm4Qr7E8sxlwR1Ajlg0utBMY7Oflym/XJMMLF/ZE6HSrzyrxuVMG2EjV
 T7SVIncbfg6kyr/r4kKsAT3BUMV+TdO4sXm+JgphZBUqZLp0nFHnmjP7Rm2j2uWq
 +yLrSuF25RijrRj3sp28zg9dFWlRwRvZvcAx8kEGm1kMjMWr+Q10xTK9o/5LlFEw
 57sUm6qgmigPK8sahDtcdLIwaCPVvAYvJ0e4Mfw5UsPSlZmHmM1mLwjpwiXBZ5ig
 oxnJmTXsn5RcOGiW/mg0VCH26NkBx7H0fsRqQeq9wkxHLrm75vXroIn7YqRIg+Ad
 /Itu6x6fOIg=
 =ik5C
 -----END PGP SIGNATURE-----

Merge tag 'iio-for-4.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next

Jonathan writes:

Round one of new device support, features and cleanup for IIO in the 4.15 cycle.

Note there is a misc driver drop in here given we have support
in IIO and the feeling is no one will care.

A large part of this series is a boiler plate removal series avoiding
the need to explicitly provide THIS_MODULE in various locations.
It's very dull but touches all drivers.

New device support
* ad5446
  - add ids to support compatible parts DAC081S101, DAC101S101,
    DAC121S101.
  - add the dac7512 id and drop the misc driver as feeling is no
    one is using it (was introduced for a board that is long obsolete)
* mt6577
  - add bindings for mt2712 which is fully compatible with other
    supported parts.
* st_pressure
  - add support for LPS33HW and LPS35HW with bindings (ids mostly).

New features
* ccs811
  - Add support for the data ready trigger.
* mma8452
  - remove artifical restriction on supporting multiple event types
    at the same time.
* tcs3472
  - support out of threshold events

Core and tree wide cleanup
* Use macro magic to remove the need to provide THIS_MODULE as part of
  struct iio_info or struct iio_trigger_ops.  This is similar to
  work done in a number of other subsystems (e.g. i2c, spi).

  All drivers are fixed and then the fields in these structures are
  removed.

  This will cause build failures for out of tree drivers and any
  new drivers that cross with this work going into the kernel.

  Note mostly done with a coccinelle patch, included in the series
  on the mailing list but not merged as the fields no longer exist
  in the structures so the any hold outs will cause a build failure.

Cleanups
* ads1015
  - avoid writing config register when it doesn't change.
  - add 10% to conversion wait time as it seems it is sometimes
    a little small.
* ade7753
  - replace use of core mlock with a local lock.  This is part of a
    long term effort to make the use of mlock opaque and single
    purpose.
* ade7759
  - expand the use of buf_lock to cover previous mlock cases.  This
    is a slightly nicer solution to the same issue as in ade7753.
* cros_ec
  - drop an unused variable
* inv_mpu6050
  - add a missing break in a switch for consistency - not actual
    bug,
  - make some local arrays static to save on object code size.
* max5481
  - drop manual setting of the spi module owner as handled by the
    spi core.
* max5487
  - drop manual setting of the spi module owner as handled by the
    spi core.
* max9611
  - drop explicit setting of the i2c module owner as handled by
    the i2c core.
* mcp320x
  - speed up reads on single channel devices,
  - drop unused of_device_id data elements,
  - document the struct mcp320x,
  - improve binding docs to reflect restrictions on spi setup and
    to make it explicit that the reference regulator is needed.
* mma8452
  - symbolic to octal permissions,
  - unsigned to unsigned int.
* st_lsm6dsx
  - avoid setting odr values multiple times,
  - drop config of LIR as it is only ever set to the existing
    defaults,
  - drop rounding configuration as it only ever matches the defaults.
* ti-ads8688
  - drop manual setting of the spi module owner as handled by the
    spi core.
* tsl2x7x
  - constify the i2c_device_id,
  - cleanup limit checks to avoid static checker warnings (and generally
    have nicer code).
2017-09-25 12:56:37 +02:00
Greg Kroah-Hartman
b2e312061c First round of IIO fixes for the 4.14 cycle
Note this includes fixes from recent merge window.  As such the tree
 is based on top of a prior staging/staging-next tree.
 
 * iio core
   - return and error for a failed read_reg debugfs call rather than
     eating the error.
 * ad7192
   - Use the dedicated reset function in the ad_sigma_delta library
     instead of an spi transfer with the data on the stack which
     could cause problems with DMA.
 * ad7793
   - Implement a dedicate reset function in the ad_sigma_delta library
     and use it to correctly reset this part.
 * bme280
   - ctrl_reg write must occur after any register writes
   for updates to take effect.
 * mcp320x
   - negative voltage readout was broken.
   - Fix an oops on module unload due to spi_set_drvdata not being called
     in probe.
 * st_magn
   - Fix the data ready line configuration for the lis3mdl. It is not
     configurable so the st_magn core was assuming it didn't exist
     and so wasn't consuming interrupts resulting in an unhandled
     interrupt.
 * stm32-adc
   - off by one error on max channels checking.
 * stm32-timer
   - preset should not be buffered - reorganising register writes avoids
   this.
   - fix a corner case in which write preset goes wrong when a timer is
   used first as a trigger then as a counter with preset. Odd case but
   you never know.
 * ti-ads1015
   - Fix setting of comparator polarity by fixing bitfield definition.
 * twl4030
   - Error path handling fix to cleanup in event of regulator
     registration failure.
   - Disable the vusb3v1 regulator correctly in error handling
   - Don't paper over a regulator enable failure.
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAlnH2PkRHGppYzIzQGtl
 cm5lbC5vcmcACgkQVIU0mcT0FoiNFw//XKT/CRq7R8Dg5vy4FT7NRw/SpSVH9u8M
 JsIISyAoh0jfOFJFkVv2Ll+Zs+SkbXJOPMUreaFJMnftSCkBdteT9D5kz5cYiUkO
 sFn58YEEqKCqQVJBEi0yM00HMxbpNKiNHXzo9WC8ZiUVRJqSigq1coznSqaZngRd
 yf4a9Z33jiH/NXalqy5Vb4CSn76UWpvvYJ11tuNxXqOldkyU4sm8bvsPWnBg1qG3
 YIuVZs83Rv1HW0t6ktYf3WwC2rBhKEM8st2fUCLkU+4+foDuqQBiFdWJzTcxe6Ru
 8za+q3ngOo7rRkLS8cSTsh8hbfKwIV0e3rQvsN3zqodqjPilw9uH3eybMYzLK/0I
 j9o12jaD1Ow7zQfiQnNnyJqDLddEt+f6GcrI/iYQPlTIQIhqDii4YInDyCrF4xbr
 i/8saOr5n4PsdHRovTUl2rrgiX+Jix6OmNLEtUdp1HWrc0XwbZ5L/1g6pXFL98QZ
 Kq8MwjNu+6x+66Hga/1fll19hZ/VMYGqxFtoP+6S28TS1bO9dXm6XNl5/U5Nh5mN
 J255UFCdFkBmJO9nP7tmbDscMCAbhc6AOHKxjUebuoNnyhyV+20W9S/gpzTIilnO
 Gv9D0z29zITc8Hpfhl2YKC6F00BuRAdPFJo05rSE/KoWX3bkPCiXFTfQ+aPwB72h
 bYT4oJeUoQg=
 =CcjQ
 -----END PGP SIGNATURE-----

Merge tag 'iio-fixes-for-4.14a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus

Jonathan writes:

First round of IIO fixes for the 4.14 cycle

Note this includes fixes from recent merge window.  As such the tree
is based on top of a prior staging/staging-next tree.

* iio core
  - return and error for a failed read_reg debugfs call rather than
    eating the error.
* ad7192
  - Use the dedicated reset function in the ad_sigma_delta library
    instead of an spi transfer with the data on the stack which
    could cause problems with DMA.
* ad7793
  - Implement a dedicate reset function in the ad_sigma_delta library
    and use it to correctly reset this part.
* bme280
  - ctrl_reg write must occur after any register writes
  for updates to take effect.
* mcp320x
  - negative voltage readout was broken.
  - Fix an oops on module unload due to spi_set_drvdata not being called
    in probe.
* st_magn
  - Fix the data ready line configuration for the lis3mdl. It is not
    configurable so the st_magn core was assuming it didn't exist
    and so wasn't consuming interrupts resulting in an unhandled
    interrupt.
* stm32-adc
  - off by one error on max channels checking.
* stm32-timer
  - preset should not be buffered - reorganising register writes avoids
  this.
  - fix a corner case in which write preset goes wrong when a timer is
  used first as a trigger then as a counter with preset. Odd case but
  you never know.
* ti-ads1015
  - Fix setting of comparator polarity by fixing bitfield definition.
* twl4030
  - Error path handling fix to cleanup in event of regulator
    registration failure.
  - Disable the vusb3v1 regulator correctly in error handling
  - Don't paper over a regulator enable failure.
2017-09-25 10:58:22 +02:00
Linus Torvalds
6e7f253801 DeviceTree fixes for 4.14:
- Fix build for !OF providing empty of_find_device_by_node
 
 - Fix Abracon vendor prefix
 
 - Sync dtx_diff include paths (again)
 
 - A stm32h7 clock binding doc fix
 -----BEGIN PGP SIGNATURE-----
 
 iQItBAABCAAXBQJZyDBuEBxyb2JoQGtlcm5lbC5vcmcACgkQ+vtdtY28YcPmCRAA
 gkNwWbuAwT4VcfOwHYsEaEU77xbM0y1fGbE8dDFp5AtfDV12gJYZlROsEhYvD/aH
 7g47adO/t9tbMW2NA/0d3TAHlTv5TarGJqxkV36FkC2P3hkw56B+0Een7g4KUXm4
 QLPhgaWdcrD6SnHy9BA+X+aNnqI3Ti4IV1QEM6ul4g5E6tGXVMDKzZ3uLMdGh37s
 9UMxy49GoWraga04TAT1ENvlrsY0sH494MoMh+ZJYzYPAJOla3GE8dI1mrsGyjGe
 chWxUBSUa46Lcq/jLM1y3i4S2x4v8fkBTyTIHjklseT5r6T/4KfLl+oXrR6RvuRz
 tBalOtBj85/ihLu/qle0KChugGalF4qytP6WhFrT2911tmjMLHLSB3Y0TAT4d95t
 ZcGmLIezS6duUVL2Is9CauuLvRklQ0osCcMVgh1nmpAWGJ2ROlfBpafSJveqN3+0
 3tyQU8XuRW4VyQbsARgFZ+g1gmbqBMI3nTcdcg2qith3EMRrMo4/mU5kh1EfSXyQ
 hUShqnOLMLYY38KjX7gNKy8J0XAx90oR9chyxoq3fwcqkubLngprZkk00+JPdjVq
 BQwfn10bwxAg+qSMy/5eUPCz0tG94ROJ0PPG4QiSLSvvZH8kSIymCJoq8sGdta3E
 nDwtkyKSgZoqF4VAdy4OhVv1aRqCwlrLQBEc/qBs/Is=
 =6bcc
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-fixes-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull DeviceTree fixes from Rob Herring:

 - fix build for !OF providing empty of_find_device_by_node

 - fix Abracon vendor prefix

 - sync dtx_diff include paths (again)

 - a stm32h7 clock binding doc fix

* tag 'devicetree-fixes-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: clk: stm32h7: fix clock-cell size
  scripts/dtc: dtx_diff - 2nd update of include dts paths to match build
  dt-bindings: fix vendor prefix for Abracon
  of: provide inline helper for of_find_device_by_node
2017-09-24 16:04:12 -07:00
Linus Torvalds
43d368a18f Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
 "Three irqchip driver fixes, and an affinity mask helper function bug
  fix affecting x86"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "genirq: Restrict effective affinity to interrupts actually using it"
  irqchip.mips-gic: Fix shared interrupt mask writes
  irqchip/gic-v4: Fix building with ancient gcc
  irqchip/gic-v3: Iterate over possible CPUs by for_each_possible_cpu()
2017-09-24 11:57:07 -07:00
Linus Torvalds
a4306434b7 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull address-limit checking fixes from Ingo Molnar:
 "This fixes a number of bugs in the address-limit (USER_DS) checks that
  got introduced in the merge window, (mostly) affecting the ARM and
  ARM64 platforms"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  arm64/syscalls: Move address limit check in loop
  arm/syscalls: Optimize address limit check
  Revert "arm/syscalls: Check address limit on user-mode return"
  syscalls: Use CHECK_DATA_CORRUPTION for addr_limit_user_check
2017-09-24 11:53:13 -07:00
Dragos Bogdan
7fc10de8d4 iio: ad_sigma_delta: Implement a dedicated reset function
Since most of the SD ADCs have the option of reseting the serial
interface by sending a number of SCLKs with CS = 0 and DIN = 1,
a dedicated function that can do this is usefull.

Needed for the patch:  iio: ad7793: Fix the serial interface reset
Signed-off-by: Dragos Bogdan <dragos.bogdan@analog.com>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2017-09-24 16:58:21 +01:00
David S. Miller
1f8d31d189 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-09-23 10:16:53 -07:00
Linus Torvalds
71aa60f67f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix NAPI poll list corruption in enic driver, from Christian
    Lamparter.

 2) Fix route use after free, from Eric Dumazet.

 3) Fix regression in reuseaddr handling, from Josef Bacik.

 4) Assert the size of control messages in compat handling since we copy
    it in from userspace twice. From Meng Xu.

 5) SMC layer bug fixes (missing RCU locking, bad refcounting, etc.)
    from Ursula Braun.

 6) Fix races in AF_PACKET fanout handling, from Willem de Bruijn.

 7) Don't use ARRAY_SIZE on spinlock array which might have zero
    entries, from Geert Uytterhoeven.

 8) Fix miscomputation of checksum in ipv6 udp code, from Subash Abhinov
    Kasiviswanathan.

 9) Push the ipv6 header properly in ipv6 GRE tunnel driver, from Xin
    Long.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
  inet: fix improper empty comparison
  net: use inet6_rcv_saddr to compare sockets
  net: set tb->fast_sk_family
  net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
  MAINTAINERS: update git tree locations for ieee802154 subsystem
  net: prevent dst uses after free
  net: phy: Fix truncation of large IRQ numbers in phy_attached_print()
  net/smc: no close wait in case of process shut down
  net/smc: introduce a delay
  net/smc: terminate link group if out-of-sync is received
  net/smc: longer delay for client link group removal
  net/smc: adapt send request completion notification
  net/smc: adjust net_device refcount
  net/smc: take RCU read lock for routing cache lookup
  net/smc: add receive timeout check
  net/smc: add missing dev_put
  net: stmmac: Cocci spatch "of_table"
  lan78xx: Use default values loaded from EEPROM/OTP after reset
  lan78xx: Allow EEPROM write for less than MAX_EEPROM_SIZE
  lan78xx: Fix for eeprom read/write when device auto suspend
  ...
2017-09-23 05:41:27 -10:00
Gao Feng
242c1a28eb net: Remove useless function skb_header_release
There is no one which would invokes the function skb_header_release.
So just remove it now.

Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-22 20:43:13 -07:00
Linus Torvalds
6876eb3720 Power management fixes for v4.14-rc2
- Fix a regression in cpufreq on systems using DT as the source of
    CPU configuration information where two different code paths
    attempt to create the cpufreq-dt device object (there can be only
    one) and fix up the "compatible" matching for some TI platforms
    on top of that (Viresh Kumar, Dave Gerlach).
 
  - Fix an initialization time memory leak in cpuidle on ARM which
    occurs if the cpuidle driver initialization fails (Stefan Wahren).
 
  - Fix a PM core function that checks whether or not there are any
    system suspend/resume callbacks for a device, but forgets to
    check legacy callbacks which then may be skipped incorrectly
    and the system may crash and/or the device may become unusable
    after a suspend-resume cycle (Rafael Wysocki).
 
  - Fix request type validation for latency tolerance PM QoS requests
    which may lead to unexpected behavior (Jan Schönherr).
 
  - Fix a broken link to PM documentation from a header file and a
    typo in a PM document (Geert Uytterhoeven, Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJZxYXOAAoJEILEb/54YlRxgkQP/1tQha6hykoryGscv1XGFPsc
 VP7tpsf00gjMcUnbuRFbgD+jF60SAWtt48sIACUOqtxSXRTGdPHS8zggLo9pa0+A
 A0Y8QlFXYHlk6up2iUyNOE0C2+UfD1pnh1fGfDtNLKob+hDhrrmRI2foMRVij9Cw
 0+szDkUENcVwLcCzyWgv7uCdrG823kzBS0gCOJT4cpYid+s7lfFEwRZMz7m2Q/2Y
 s0MMELdhBaaMAZyGIh28B2D+bAS1QTTxLr++nSiiIejVplraWlUYRToGlQMpZbJn
 yPjKld5Zc6+VJ4sPl3oRw3wnfo4Rndwgx0VDGDdne1e0S7pQtIjNRbyGTbl2n6yS
 +nckLpW767l9Mt9TVQzQBgnalUUZX78sqHqa3Ca9j8QXFcG0Khn5aewq76+9KVdv
 uSAxUmthvErJUGa6GTkhBGywZxKtuIdA/eg23XDN/P2PSVW8hj+z4Fn8DylXSEo7
 H3KMOJgSRWDFvAaYsusNEdQwy6+UWN9OiMoZYMVL6XcDYyFZrAaXtaZIb7MjOL5i
 U9cv3QCU4CWtsbKoQb3qvQZudUa92MIHtLWILTuLT/lWPIPGiUGLbIYRE+hzRzkj
 26J21Ex8veYuGEJT+YP6GBqtywX9aKuec2RZVSfqtAK/x/zXxwjFP6T3JHVril3T
 /mxhLZIg/aJwIeJQ3JTG
 =Cbgd
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix a cpufreq regression introduced by recent changes related to
  the generic DT driver, an initialization time memory leak in cpuidle
  on ARM, a PM core bug that may cause system suspend/resume to fail on
  some systems, a request type validation issue in the PM QoS framework
  and two documentation-related issues.

  Specifics:

   - Fix a regression in cpufreq on systems using DT as the source of
     CPU configuration information where two different code paths
     attempt to create the cpufreq-dt device object (there can be only
     one) and fix up the "compatible" matching for some TI platforms on
     top of that (Viresh Kumar, Dave Gerlach).

   - Fix an initialization time memory leak in cpuidle on ARM which
     occurs if the cpuidle driver initialization fails (Stefan Wahren).

   - Fix a PM core function that checks whether or not there are any
     system suspend/resume callbacks for a device, but forgets to check
     legacy callbacks which then may be skipped incorrectly and the
     system may crash and/or the device may become unusable after a
     suspend-resume cycle (Rafael Wysocki).

   - Fix request type validation for latency tolerance PM QoS requests
     which may lead to unexpected behavior (Jan Schönherr).

   - Fix a broken link to PM documentation from a header file and a typo
     in a PM document (Geert Uytterhoeven, Rafael Wysocki)"

* tag 'pm-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: ti-cpufreq: Support additional am43xx platforms
  ARM: cpuidle: Avoid memleak if init fail
  cpufreq: dt-platdev: Add some missing platforms to the blacklist
  PM: core: Fix device_pm_check_callbacks()
  PM: docs: Drop an excess character from devices.rst
  PM / QoS: Use the correct variable to check the QoS request type
  driver core: Fix link to device power management documentation
2017-09-22 17:28:59 -10:00
Linus Torvalds
d32e5f44a5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:

 - fixes for two long standing issues (lock up and a crash) in force
   feedback handling in uinput driver

 - tweak to firmware update timing in Elan I2C touchpad driver.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: elan_i2c - extend Flash-Write delay
  Input: uinput - avoid crash when sending FF request to device going away
  Input: uinput - avoid FF flush when destroying device
2017-09-22 17:23:41 -10:00
Linus Torvalds
c0a3a64e72 Major additions:
- sysctl and seccomp operation to discover available actions. (tyhicks)
 - new per-filter configurable logging infrastructure and sysctl. (tyhicks)
 - SECCOMP_RET_LOG to log allowed syscalls. (tyhicks)
 - SECCOMP_RET_KILL_PROCESS as the new strictest possible action.
 - self-tests for new behaviors.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJZxVbTAAoJEIly9N/cbcAmvIAQALR9aVQQXjma4lLhZxwTsLtG
 rJm8t/o4y/2aBV8vzpFbMPT5gfN/PAkHJpCoxVPssx0k4PH2M7HjpnR6E1OC+erg
 RNom3uNdNqZeFlDpdX1qriYiCTB9p6rHe0DPwgG9iGqgDxsJ+G3W+x1sMZ1C+A0M
 shxA3fwt+Qpivo8Zq44xjMFjK+Zeor9V3yPc51QoZktWHlM16ID3HvHVnUtzqAUb
 nTWF6ZlmZlJ/lp4Dq8/55lytVcXPo240G3H0Odai+SNFakK6p5UO//BRBV209bmb
 05jpAOH6uym1sxVz00TQXCtDqOEzs2mQgomtTSShHg8SrLFX7nFkEFtAVA6tEri2
 FqDYce9KX7ZtOYiq83C7pnpAFCouc0z31dQl9USHiAiexXklwBIX+OsVv98omWGi
 pW43uLE2ovY0cpOsN50xI4mnxiGh6MhFcdbor2VLRJwLIFSw3XjjgNCCLyK4AJxs
 N514252qi70c9cWyAHYDLy077yTVxu3JUlsVQKtRTMfoFUq6bX1jPXVXE8qkVrui
 bc/Ay54pPrUwM854IpQ9ZBOuMfs6I5opocGIsBvMaND45U4o2B0ANCsxhuZ0zEtM
 E55DhK5OgjukNemQmlWK2foDckYdtkJXCj2yMBNQady0Uynr2BWZ6VDBP7vFcnRB
 UihRlFZRZleu8383uHsc
 =sKeC
 -----END PGP SIGNATURE-----

Merge tag 'seccomp-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull seccomp updates from Kees Cook:
 "Major additions:

   - sysctl and seccomp operation to discover available actions
     (tyhicks)

   - new per-filter configurable logging infrastructure and sysctl
     (tyhicks)

   - SECCOMP_RET_LOG to log allowed syscalls (tyhicks)

   - SECCOMP_RET_KILL_PROCESS as the new strictest possible action

   - self-tests for new behaviors"

[ This is the seccomp part of the security pull request during the merge
  window that was nixed due to unrelated problems   - Linus ]

* tag 'seccomp-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  samples: Unrename SECCOMP_RET_KILL
  selftests/seccomp: Test thread vs process killing
  seccomp: Implement SECCOMP_RET_KILL_PROCESS action
  seccomp: Introduce SECCOMP_RET_KILL_PROCESS
  seccomp: Rename SECCOMP_RET_KILL to SECCOMP_RET_KILL_THREAD
  seccomp: Action to log before allowing
  seccomp: Filter flag to log all actions except SECCOMP_RET_ALLOW
  seccomp: Selftest for detection of filter flag support
  seccomp: Sysctl to configure actions that are allowed to be logged
  seccomp: Operation for checking if an action is available
  seccomp: Sysctl to display available actions
  seccomp: Provide matching filter for introspection
  selftests/seccomp: Refactor RET_ERRNO tests
  selftests/seccomp: Add simple seccomp overhead benchmark
  selftests/seccomp: Add tests for basic ptrace actions
2017-09-22 16:16:41 -10:00
Linus Walleij
a9a1d2a782 pinctrl/gpio: Unify namespace for cross-calls
The pinctrl_request_gpio() and pinctrl_free_gpio() break the nice
namespacing in the other cross-calls like pinctrl_gpio_foo().
Just rename them and all references so we have one namespace
with all cross-calls under pinctrl_gpio_*().

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-09-22 11:02:10 +02:00
Dmitry Torokhov
9c4089e87a Immutable branch between MFD, Input and RTC due for the v3.14 merge window
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJZkp3BAAoJEFGvii+H/Hdhtl8QAITZ/6R+aRQC8eFilvo9/9qv
 HTJVvEOZBa5afupmxQlNwuzFW5Zdou21Cl3Y0360hbRE0L3uOFGPBS4iBNIc9FLj
 cMFw4yuSVI98rLXZlmfSrRS/ZZuMbih0uu9/By7WDbgPOv86C8SrLt4DJln0ujjg
 4GmXbEzJlrULDvQQofaXtfVpfjTORjeiscgYVF4wv7rPYu9+4+wn3aXBi51I2gLb
 pkztJ0eAhyIAVxgPGldfa+1y4B1R3edJMfpFztHMVBaIg1aOkqcNC8Cs2MZpqxml
 jI3QpYFDw3q0iQ3UF8G1omcmGOItv2Jr2BGvJuyTApAe0mS47jlI+2djWlEWgHEd
 P3qhInsYX85xL4AYRUxnYxW7qmTq99DffvjHa2YzHMNw1c2q29ops+sVxvrsuHyw
 WBjBzKOXdzbDPdiVuaexP1jP/TPxw+V9l8hgkuqavONKwIN3aLeIRET1fZvO3eTG
 TeQgP37wJSC9uK4mSzcx32K9q1+XCYZCyPU9XZS+08Hq7Lxi3chqdfhRX6AUBCxW
 DiR1lOU1uIK8uZJHOpUDWuOiWIOWeWmkR82CXo88WivWiPW0MHXrfPPZMfxsOAa9
 /PAbobKpSXPZpD3mh1lm7x8f1mhgevZ9jlYVTu5LcVMnNSsLQP0r3p5ytGj7sMuz
 ltd2v7liFJ4CWOwpQPZu
 =j+Q1
 -----END PGP SIGNATURE-----

Merge tag 'ib-mfd-input-rtc-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into next

Merge "Immutable branch between MFD, Input and RTC due for the v3.14
merge window" to have dm355evm_msp.h header moved into right place.
2017-09-21 16:41:15 -07:00
Dmitry Torokhov
95a0c7c2d6 Immutable branch between MFD and many other subsystems due for the v4.14 merge window
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJZrVgvAAoJEFGvii+H/Hdhw9AP/04kewyz1WwobD+0SZsujELP
 768c92WNmMqux8OsRVUpjqcCsepi+W8ppM2tmV9jJ+rwb9SeT+cOxnaUPrlHMuK+
 zQyc4F4PRlyxfrFZ0tr/VrTHvhsmdmhEux34zMrdKggeShwHAkEQhNUFTEo3efKs
 J32H2BuDTcbnbiqz1Lg00NzIFOEhvpSsplgUQtz7NnG1y8T9U0kLupoXkNquIAj9
 SAP9LTNyUlPqlQ0Ku0S77Zr8R9K202T9xi1RGGoscYiM3421WJA/+4S9RTqfAVje
 atAOfS+nNnaxkeBYJT/wZ71zINdbhj0NKsGa/aah4hGIpbvuwouWPy+8PqyugKYy
 M6uBpjo1uk1gu+kYruzNXYmKLH+F8W8bTMNiovJ2bx4qP08FlB/4X+BCL9Hy00/Z
 btOz1cBTEjY2aUND84b2qZLkmGbH4VTGFS3TAr0TqsM2hQH8ThxP2f+tM7Hseupl
 SvaahUYXiqTNexErLQD/Oya6QKZgoJvUmboGGO65BQmdXeHXoA3hZBltp7+aEBb4
 IYG3eWwY5Shj3jpB16jDAioC43B4hHiLUWDGJtquFuscXJr5WkfeKMBg7PrY46Rh
 reHsYAMhLVmOUBe77NyAEyVjQcBtpHlpkETvsCjM3tP/GqWlHLR+1jhech3Ip3LA
 X7ODA7pC9iGSY2ePTiCj
 =nb+l
 -----END PGP SIGNATURE-----

Merge tag 'ib-mfd-many-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd into next

Merge "Immutable branch between MFD and many other subsystems due for
the v4.14 merge window" to get the TWL headers moved to the right place.
2017-09-21 16:38:09 -07:00
Dmitry Torokhov
e8b95728f7 Input: uinput - avoid FF flush when destroying device
Normally, when input device supporting force feedback effects is being
destroyed, we try to "flush" currently playing effects, so that the
physical device does not continue vibrating (or executing other effects).
Unfortunately this does not work well for uinput as flushing of the effects
deadlocks with the destroy action:

- if device is being destroyed because the file descriptor is being closed,
  then there is noone to even service FF requests;

- if device is being destroyed because userspace sent UI_DEV_DESTROY,
  while theoretically it could be possible to service FF requests,
  userspace is unlikely to do so (they'd need to make sure FF handling
  happens on a separate thread) even if kernel solves the issue with FF
  ioctls deadlocking with UI_DEV_DESTROY ioctl on udev->mutex.

To avoid lockups like the one below, let's install a custom input device
flush handler, and avoid trying to flush force feedback effects when we
destroying the device, and instead rely on uinput to shut off the device
properly.

NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
...
 <<EOE>>  [<ffffffff817a0307>] _raw_spin_lock_irqsave+0x37/0x40
 [<ffffffff810e633d>] complete+0x1d/0x50
 [<ffffffffa00ba08c>] uinput_request_done+0x3c/0x40 [uinput]
 [<ffffffffa00ba587>] uinput_request_submit.part.7+0x47/0xb0 [uinput]
 [<ffffffffa00bb62b>] uinput_dev_erase_effect+0x5b/0x76 [uinput]
 [<ffffffff815d91ad>] erase_effect+0xad/0xf0
 [<ffffffff815d929d>] flush_effects+0x4d/0x90
 [<ffffffff815d4cc0>] input_flush_device+0x40/0x60
 [<ffffffff815daf1c>] evdev_cleanup+0xac/0xc0
 [<ffffffff815daf5b>] evdev_disconnect+0x2b/0x60
 [<ffffffff815d74ac>] __input_unregister_device+0xac/0x150
 [<ffffffff815d75f7>] input_unregister_device+0x47/0x70
 [<ffffffffa00bac45>] uinput_destroy_device+0xb5/0xc0 [uinput]
 [<ffffffffa00bb2de>] uinput_ioctl_handler.isra.9+0x65e/0x740 [uinput]
 [<ffffffff811231ab>] ? do_futex+0x12b/0xad0
 [<ffffffffa00bb3f8>] uinput_ioctl+0x18/0x20 [uinput]
 [<ffffffff81241248>] do_vfs_ioctl+0x298/0x480
 [<ffffffff81337553>] ? security_file_ioctl+0x43/0x60
 [<ffffffff812414a9>] SyS_ioctl+0x79/0x90
 [<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71

Reported-by: Rodrigo Rivas Costa <rodrigorivascosta@gmail.com>
Reported-by: Clément VUCHENER <clement.vuchener@gmail.com>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=193741
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-09-21 16:31:22 -07:00
Paolo Abeni
6e617de84e net: avoid a full fib lookup when rp_filter is disabled.
Since commit 1dced6a854 ("ipv4: Restore accept_local behaviour
in fib_validate_source()") a full fib lookup is needed even if
the rp_filter is disabled, if accept_local is false - which is
the default.

What we really need in the above scenario is just checking
that the source IP address is not local, and in most case we
can do that is a cheaper way looking up the ifaddr hash table.

This commit adds a helper for such lookup, and uses it to
validate the src address when rp_filter is disabled and no
'local' routes are created by the user space in the relevant
namespace.

A new ipv4 netns flag is added to account for such routes.
We need that to preserve the same behavior we had before this
patch.

It also drops the checks to bail early from __fib_validate_source,
added by the commit 1dced6a854 ("ipv4: Restore accept_local
behaviour in fib_validate_source()") they do not give any
measurable performance improvement: if we do the lookup with are
on a slower path.

This improves UDP performances for unconnected sockets
when rp_filter is disabled by 5% and also gives small but
measurable performance improvement for TCP flood scenarios.

v1 -> v2:
 - use the ifaddr lookup helper in __ip_dev_find(), as suggested
   by Eric
 - fall-back to full lookup if custom local routes are present

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-21 15:15:22 -07:00