Commit graph

62682 commits

Author SHA1 Message Date
Wolfram Sang
4ee045f4e9 Merge branch 'for-wolfram' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio into i2c/for-4.15
Refactor i2c-gpio and its users to use gpiod. Done by GPIO maintainer
LinusW.
2017-11-01 23:45:46 +01:00
Kees Cook
00ed87da35 timer: Add parenthesis around timer_setup() macro arguments
In the case where expressions are passed as macro arguments, the LOCKDEP
version of the timer macros need enclosing parenthesis.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20171101143250.GA65266@beast
2017-11-01 19:05:05 +01:00
James Smart
ac7fe82b6f nvme-fc: add a dev_loss_tmo field to the remoteport
Add a dev_loss_tmo value, paralleling the SCSI FC transport, for device
connectivity loss.

The transport initializes the value in the nvme_fc_register_remoteport()
call. If the value is not set, a default of 60s is set.

Add a new routine to the api, nvme_fc_set_remoteport_devloss() routine,
which allows the lldd to dynamically update the value on an existing
remoteport.

Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-11-01 16:34:36 +01:00
Ming Lei
b347689ffb blk-mq-sched: improve dispatching from sw queue
SCSI devices use host-wide tagset, and the shared driver tag space is
often quite big. However, there is also a queue depth for each lun(
.cmd_per_lun), which is often small, for example, on both lpfc and
qla2xxx, .cmd_per_lun is just 3.

So lots of requests may stay in sw queue, and we always flush all
belonging to same hw queue and dispatch them all to driver.
Unfortunately it is easy to cause queue busy because of the small
.cmd_per_lun.  Once these requests are flushed out, they have to stay in
hctx->dispatch, and no bio merge can happen on these requests, and
sequential IO performance is harmed.

This patch introduces blk_mq_dequeue_from_ctx for dequeuing a request
from a sw queue, so that we can dispatch them in scheduler's way. We can
then avoid dequeueing too many requests from sw queue, since we don't
flush ->dispatch completely.

This patch improves dispatching from sw queue by using the .get_budget
and .put_budget callbacks.

Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-01 08:20:02 -06:00
Ming Lei
de14829740 blk-mq: introduce .get_budget and .put_budget in blk_mq_ops
For SCSI devices, there is often a per-request-queue depth, which needs
to be respected before queuing one request.

Currently blk-mq always dequeues the request first, then calls
.queue_rq() to dispatch the request to lld. One obvious issue with this
approach is that I/O merging may not be successful, because when the
per-request-queue depth can't be respected, .queue_rq() has to return
BLK_STS_RESOURCE, and then this request has to stay in hctx->dispatch
list. This means it never gets a chance to be merged with other IO.

This patch introduces .get_budget and .put_budget callback in blk_mq_ops,
then we can try to get reserved budget first before dequeuing request.
If the budget for queueing I/O can't be satisfied, we don't need to
dequeue request at all. Hence the request can be left in the IO
scheduler queue, for more merging opportunities.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-01 08:20:02 -06:00
Ming Lei
7930d0a00f sbitmap: introduce __sbitmap_for_each_set()
For blk-mq, we need to be able to iterate software queues starting
from any queue in a round robin fashion, so introduce this helper.

Reviewed-by: Omar Sandoval <osandov@fb.com>
Cc: Omar Sandoval <osandov@fb.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-01 08:20:02 -06:00
Egil Hjelmeland
e9292f2c03 net: dsa: lan9303: Add STP ALR entry on port 0
STP BPDUs arriving on user ports must sent to CPU port only,
for processing by the SW bridge.

Add an ALR entry with STP state override to fix that.

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01 21:30:24 +09:00
Baolin Wang
8698b93647
regmap: Add hardware spinlock support
On some platforms, when reading or writing some special registers through
regmap, we should acquire one hardware spinlock to synchronize between
the multiple subsystems. Thus this patch adds the hardware spinlock
support for regmap.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
2017-11-01 10:06:29 +00:00
Matthias Kaehlcke
a9903f04e0 sched/sysctl: Fix attributes of some extern declarations
The definition of sysctl_sched_migration_cost, sysctl_sched_nr_migrate
and sysctl_sched_time_avg includes the attribute const_debug. This
attribute is not part of the extern declaration of these variables in
include/linux/sched/sysctl.h, while it is in kernel/sched/sched.h,
and as a result Clang generates warnings like this:

  kernel/sched/sched.h:1618:33: warning: section attribute is specified on redeclared variable [-Wsection]
  extern const_debug unsigned int sysctl_sched_time_avg;
                                ^
  ./include/linux/sched/sysctl.h:42:21: note: previous declaration is here
  extern unsigned int sysctl_sched_time_avg;

The header only declares the variables when CONFIG_SCHED_DEBUG is defined,
therefore it is not necessary to duplicate the definition of const_debug.
Instead we can use the attribute __read_mostly, which is the expansion of
const_debug when CONFIG_SCHED_DEBUG=y is set.

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shile Zhang <shile.zhang@nokia.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171030180816.170850-1-mka@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-01 09:36:17 +01:00
Alexei Starovoitov
638f5b90d4 bpf: reduce verifier memory consumption
the verifier got progressively smarter over time and size of its internal
state grew as well. Time to reduce the memory consumption.

Before:
sizeof(struct bpf_verifier_state) = 6520
After:
sizeof(struct bpf_verifier_state) = 896

It's done by observing that majority of BPF programs use little to
no stack whereas verifier kept all of 512 stack slots ready always.
Instead dynamically reallocate struct verifier state when stack
access is detected.
Runtime difference before vs after is within a noise.
The number of processed instructions stays the same.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01 11:41:18 +09:00
Christian Brauner
6397fac491 userns: bump idmap limits to 340
There are quite some use cases where users run into the current limit for
{g,u}id mappings. Consider a user requesting us to map everything but 999, and
1001 for a given range of 1000000000 with a sub{g,u}id layout of:

some-user:100000:1000000000
some-user:999:1
some-user:1000:1
some-user:1001:1
some-user:1002:1

This translates to:

MAPPING-TYPE | CONTAINER |    HOST |     RANGE |
-------------|-----------|---------|-----------|
         uid |       999 |     999 |         1 |
         uid |      1001 |    1001 |         1 |
         uid |         0 | 1000000 |       999 |
         uid |      1000 | 1001000 |         1 |
         uid |      1002 | 1001002 | 999998998 |
------------------------------------------------
         gid |       999 |     999 |         1 |
         gid |      1001 |    1001 |         1 |
         gid |         0 | 1000000 |       999 |
         gid |      1000 | 1001000 |         1 |
         gid |      1002 | 1001002 | 999998998 |

which is already the current limit.

As discussed at LPC simply bumping the number of limits is not going to work
since this would mean that struct uid_gid_map won't fit into a single cache-line
anymore thereby regressing performance for the base-cases. The same problem
seems to arise when using a single pointer. So the idea is to use

struct uid_gid_extent {
	u32 first;
	u32 lower_first;
	u32 count;
};

struct uid_gid_map { /* 64 bytes -- 1 cache line */
	u32 nr_extents;
	union {
		struct uid_gid_extent extent[UID_GID_MAP_MAX_BASE_EXTENTS];
		struct {
			struct uid_gid_extent *forward;
			struct uid_gid_extent *reverse;
		};
	};
};

For the base cases we will only use the struct uid_gid_extent extent member. If
we go over UID_GID_MAP_MAX_BASE_EXTENTS mappings we perform a single 4k
kmalloc() which means we can have a maximum of 340 mappings
(340 * size(struct uid_gid_extent) = 4080). For the latter case we use two
pointers "forward" and "reverse". The forward pointer points to an array sorted
by "first" and the reverse pointer points to an array sorted by "lower_first".
We can then perform binary search on those arrays.

Performance Testing:
When Eric introduced the extent-based struct uid_gid_map approach he measured
the performanc impact of his idmap changes:

> My benchmark consisted of going to single user mode where nothing else was
> running. On an ext4 filesystem opening 1,000,000 files and looping through all
> of the files 1000 times and calling fstat on the individuals files. This was
> to ensure I was benchmarking stat times where the inodes were in the kernels
> cache, but the inode values were not in the processors cache. My results:

> v3.4-rc1:         ~= 156ns (unmodified v3.4-rc1 with user namespace support disabled)
> v3.4-rc1-userns-: ~= 155ns (v3.4-rc1 with my user namespace patches and user namespace support disabled)
> v3.4-rc1-userns+: ~= 164ns (v3.4-rc1 with my user namespace patches and user namespace support enabled)

I used an identical approach on my laptop. Here's a thorough description of what
I did. I built a 4.14.0-rc4 mainline kernel with my new idmap patches applied. I
booted into single user mode and used an ext4 filesystem to open/create
1,000,000 files. Then I looped through all of the files calling fstat() on each
of them 1000 times and calculated the mean fstat() time for a single file. (The
test program can be found below.)

Here are the results. For fun, I compared the first version of my patch which
scaled linearly with the new version of the patch:

|   # MAPPINGS |   PATCH-V1 | PATCH-NEW |
|--------------|------------|-----------|
|   0 mappings |     158 ns |   158 ns  |
|   1 mappings |     164 ns |   157 ns  |
|   2 mappings |     170 ns |   158 ns  |
|   3 mappings |     175 ns |   161 ns  |
|   5 mappings |     187 ns |   165 ns  |
|  10 mappings |     218 ns |   199 ns  |
|  50 mappings |     528 ns |   218 ns  |
| 100 mappings |     980 ns |   229 ns  |
| 200 mappings |    1880 ns |   239 ns  |
| 300 mappings |    2760 ns |   240 ns  |
| 340 mappings | not tested |   248 ns  |

Here's the test program I used. I asked Eric what he did and this is a more
"advanced" implementation of the idea. It's pretty straight-forward:

 #define __GNU_SOURCE
 #define __STDC_FORMAT_MACROS
 #include <errno.h>
 #include <dirent.h>
 #include <fcntl.h>
 #include <inttypes.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 #include <sys/stat.h>
 #include <sys/time.h>
 #include <sys/types.h>

 int main(int argc, char *argv[])
 {
 	int ret;
 	size_t i, k;
 	int fd[1000000];
 	int times[1000];
 	char pathname[4096];
 	struct stat st;
 	struct timeval t1, t2;
 	uint64_t time_in_mcs;
 	uint64_t sum = 0;

 	if (argc != 2) {
 		fprintf(stderr, "Please specify a directory where to create "
 				"the test files\n");
 		exit(EXIT_FAILURE);
 	}

 	for (i = 0; i < sizeof(fd) / sizeof(fd[0]); i++) {
 		sprintf(pathname, "%s/idmap_test_%zu", argv[1], i);
 		fd[i]= open(pathname, O_RDWR | O_CREAT, S_IXUSR | S_IXGRP | S_IXOTH);
 		if (fd[i] < 0) {
 			ssize_t j;
 			for (j = i; j >= 0; j--)
 				close(fd[j]);
 			exit(EXIT_FAILURE);
 		}
 	}

 	for (k = 0; k < 1000; k++) {
 		ret = gettimeofday(&t1, NULL);
 		if (ret < 0)
 			goto close_all;

 		for (i = 0; i < sizeof(fd) / sizeof(fd[0]); i++) {
 			ret = fstat(fd[i], &st);
 			if (ret < 0)
 				goto close_all;
 		}

 		ret = gettimeofday(&t2, NULL);
 		if (ret < 0)
 			goto close_all;

 		time_in_mcs = (1000000 * t2.tv_sec + t2.tv_usec) -
 			      (1000000 * t1.tv_sec + t1.tv_usec);
 		printf("Total time in micro seconds:       %" PRIu64 "\n",
 		       time_in_mcs);
 		printf("Total time in nanoseconds:         %" PRIu64 "\n",
 		       time_in_mcs * 1000);
 		printf("Time per file in nanoseconds:      %" PRIu64 "\n",
 		       (time_in_mcs * 1000) / 1000000);
 		times[k] = (time_in_mcs * 1000) / 1000000;
 	}

 close_all:
 	for (i = 0; i < sizeof(fd) / sizeof(fd[0]); i++)
 		close(fd[i]);

 	if (ret < 0)
 		exit(EXIT_FAILURE);

 	for (k = 0; k < 1000; k++) {
 		sum += times[k];
 	}

 	printf("Mean time per file in nanoseconds: %" PRIu64 "\n", sum / 1000);

 	exit(EXIT_SUCCESS);;
 }

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
CC: Serge Hallyn <serge@hallyn.com>
CC: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-10-31 17:23:04 -05:00
Christian Brauner
aa4bf44dc8 userns: use union in {g,u}idmap struct
- Add a struct containing two pointer to extents and wrap both the static extent
  array and the struct into a union. This is done in preparation for bumping the
  {g,u}idmap limits for user namespaces.
- Add brackets around anonymous union when using designated initializers to
  initialize members in order to please gcc <= 4.4.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-10-31 17:22:58 -05:00
Thomas Gleixner
fb56d689fb Merge branch 'fortglx/4.15/time' of https://git.linaro.org/people/john.stultz/linux into timers/core
Pull timekeeping updates from John Stultz:

 - More y2038 work from Arnd Bergmann

 - A new mechanism to allow RTC drivers to specify the resolution of the
   RTC so the suspend/resume code can make informed decisions whether to
   inject the suspended time or not in case of fast suspend/resume cycles.
2017-10-31 23:17:28 +01:00
Rafael J. Wysocki
d5919dcc34 Revert "PM / QoS: Fix device resume latency PM QoS"
This reverts commit 0cc2b4e5a0 (PM / QoS: Fix device resume latency PM
QoS) as it introduced regressions on multiple systems and the fix-up
in commit 2a9a86d5c8 (PM / QoS: Fix default runtime_pm device resume
latency) does not address all of them.

The original problem that commit 0cc2b4e5a0 was attempting to fix
will be addressed later.

Fixes: 0cc2b4e5a0 (PM / QoS: Fix device resume latency PM QoS)
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-31 18:35:40 +01:00
Rafael J. Wysocki
5ba257249e Revert "PM / QoS: Fix default runtime_pm device resume latency"
This reverts commit 2a9a86d5c8 (PM / QoS: Fix default runtime_pm
device resume latency) as the commit it depends on is going to be
reverted.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-31 18:24:38 +01:00
Elena Reshetova
ab97f87325 fsnotify: convert fsnotify_mark.refcnt from atomic_t to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable fsnotify_mark.refcnt is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2017-10-31 17:54:56 +01:00
Miklos Szeredi
6685df3125 fanotify: clean up CONFIG_FANOTIFY_ACCESS_PERMISSIONS ifdefs
The only negative from this patch should be an addition of 32bytes to
'struct fsnotify_group' if CONFIG_FANOTIFY_ACCESS_PERMISSIONS is not
defined.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2017-10-31 17:54:56 +01:00
Elena Reshetova
7761daa6a1 fsnotify: convert fsnotify_group.refcnt from atomic_t to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable fsnotify_group.refcnt is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2017-10-31 17:54:56 +01:00
Kees Cook
ece1996a21 module: Do not paper over type mismatches in module_param_call()
The module_param_call() macro was explicitly casting the .set and
.get function prototypes away. This can lead to hard-to-find type
mismatches. Now that all the function prototypes have been fixed
tree-wide, we can drop these casts, and use named initializers too.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2017-10-31 15:30:47 +01:00
Kees Cook
b2f270e874 module: Prepare to convert all module_param_call() prototypes
After actually converting all module_param_call() function prototypes, we
no longer need to do a tricky sizeof(func(thing)) type-check. Remove it.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2017-10-31 15:30:32 +01:00
Bob Peterson
3974320ca6 GFS2: Implement iomap for block_map
This patch implements iomap for block mapping, and switches the
block_map function to use it under the covers.

The additional IOMAP_F_BOUNDARY iomap flag indicates when iomap has
reached a "metadata boundary" and fetching the next mapping is likely to
incur an additional I/O.  This flag is used for setting the bh buffer
boundary flag.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2017-10-31 14:26:33 +01:00
Stephen Hemminger
6981fbf378 Drivers: hv: vmbus: Expose per-channel interrupts and events counters
When investigating performance, it is useful to be able to look at
the number of host and guest events per-channel. This is equivalent
to per-device interrupt statistics.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-31 13:40:29 +01:00
James Ban
707ce9eac5
regulator: da9211: update for supporting da9223/4/5
This is update for supporting additional devices da9223/4/5.
Only device strings is added because only package type is different.

Signed-off-by: James Ban <James.Ban..opensource@diasemi.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2017-10-31 11:01:14 +00:00
Avaneesh Kumar Dwivedi
d82bd35997 firmware: scm: Add new SCM call API for switching memory ownership
Two different processors on a SOC need to switch memory ownership
during load/unload. To enable this, second level memory map table
need to be updated, which is done by secure layer.
This patch adds the interface for making secure monitor call for
memory ownership switching request.

Acked-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Avaneesh Kumar Dwivedi <akdwived@codeaurora.org>
[bjorn: Minor style and kerneldoc updates]
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2017-10-30 18:37:07 -07:00
Arnd Bergmann
6546911ed3 time: Move old timekeeping interfaces to timekeeping32.h
The interfaces based on 'struct timespec' and 'unsigned long' seconds
are no longer recommended for new code, and we are trying to migrate to
ktime_t based interfaces and other y2038-safe variants.

This moves all the legacy interfaces from linux/timekeeping.h into a
new timekeeping32.h to better document this.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-10-30 15:17:20 -07:00
Arnd Bergmann
abc8f96e3e time: Move time_t conversion helpers to time32.h
On 64-bit architectures, the timespec64 based helpers in linux/time.h
are defined as macros pointing to their timespec based counterparts.
This made sense when they were first introduced, but as we are migrating
away from timespec in general, it's much less intuitive now.

This changes the macros to work in the exact opposite way: we always
provide the timespec64 based helpers and define the old interfaces as
macros for them. Now we can move those macros into linux/time32.h, which
already contains the respective helpers for 32-bit architectures.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-10-30 15:17:19 -07:00
Arnd Bergmann
5dbf20127f time: Move time_t based interfaces to time32.h
Interfaces based on 'struct timespec' or 'struct timeval' should no
longer be used for new code, which can use either ktime_t or 'struct
timespec64' instead.

To make this a little clearer, this moves the various helpers into a new
time32.h header. For the moment, this gets included by the normal time.h,
but we may be able to separate it entirely when most users of time32.h
are gone.

Individual helpers in the new file can get removed once they become unused
in the future.

Since the contents of time32.h look a lot like what's in time64.h, I'm
reordering them during the move to make them more similar, and to allow
a follow-up patch to redirect the 'timespec' based functions to thei
'timespec64' based counterparts on 64-bit architectures later.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[jstultz: Whitespace & checkpatch fixups]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-10-30 15:16:38 -07:00
Arnd Bergmann
85bf19e7df time: Remove unused functions
The (slow but) ongoing work on conversion from timespec to timespec64
has led some timespec based helper functions to become unused.

No new code should use them, so we can remove the functions entirely.
I'm planning to obsolete additional interfaces next and remove
more of these.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-10-30 15:14:18 -07:00
Arnd Bergmann
e0956dcc4b timekeeping: Consolidate timekeeping_inject_offset code
The code to check the adjtimex() or clock_adjtime() arguments is spread
out across multiple files for presumably only historic reasons. As a
preparatation for a rework to get rid of the use of 'struct timeval'
and 'struct timespec' in there, this moves all the portions into
kernel/time/timekeeping.c and marks them as 'static'.

The warp_clock() function here is not as closely related as the others,
but I feel it still makes sense to move it here in order to consolidate
all callers of timekeeping_inject_offset().

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[jstultz: Whitespace fixup]
Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-10-30 15:13:35 -07:00
Jason Gunthorpe
0f295b0650 rtc: Allow rtc drivers to specify the tv_nsec value for ntp
ntp is currently hardwired to try and call the rtc set when wall clock
tv_nsec is 0.5 seconds. This historical behaviour works well with certain
PC RTCs, but is not universal to all rtc hardware.

Change how this works by introducing the driver specific concept of
set_offset_nsec, the delay between current wall clock time and the target
time to set (with a 0 tv_nsecs).

For x86-style CMOS set_offset_nsec should be -0.5 s which causes the last
second to be written 0.5 s after it has started.

For compat with the old rtc_set_ntp_time, the value is defaulted to
+ 0.5 s, which causes the next second to be written 0.5s before it starts,
as things were before this patch.

Testing shows many non-x86 RTCs would like set_offset_nsec ~= 0,
so ultimately each RTC driver should set the set_offset_nsec according
to its needs, and non x86 architectures should stop using
update_persistent_clock64 in order to access this feature.
Future patches will revise the drivers as needed.

Since CMOS and RTC now have very different handling they are split
into two dedicated code paths, sharing the support code, and ifdefs
are replaced with IS_ENABLED.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2017-10-30 15:03:24 -07:00
David S. Miller
e1ea2f9856 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several conflicts here.

NFP driver bug fix adding nfp_netdev_is_nfp_repr() check to
nfp_fl_output() needed some adjustments because the code block is in
an else block now.

Parallel additions to net/pkt_cls.h and net/sch_generic.h

A bug fix in __tcp_retransmit_skb() conflicted with some of
the rbtree changes in net-next.

The tc action RCU callback fixes in 'net' had some overlap with some
of the recent tcf_block reworking.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-30 21:09:24 +09:00
Wolfram Sang
6186d06c51 mmc: parse new binding for eMMC fixed driver type
Parse the new binding and store it in the host struct after doing some
sanity checks. The code is designed to support fixed SD driver type if
we ever need that.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-10-30 11:50:38 +01:00
rui_feng
563be8b603 mmc: rtsx: fix tuning fail on gen3 PCI-Express
On gen3 PCI-Express we should send command one by one.
If sending many commands in one packet will lead to a failure.

Signed-off-by: rui_feng <rui_feng@realsil.com.cn>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-10-30 11:45:53 +01:00
Adrian Hunter
72a5af554d mmc: core: Add support for handling CQE requests
Add core support for handling CQE requests, including starting, completing
and recovering.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-10-30 11:45:50 +01:00
Adrian Hunter
6c0cedd1ef mmc: core: Introduce host claiming by context
Currently the host can be claimed by a task.  Change this so that the host
can be claimed by a context that may or may not be a task.  This provides
for the host to be claimed by a block driver queue to support blk-mq, while
maintaining compatibility with the existing use of mmc_claim_host().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-10-30 11:45:49 +01:00
Wolfram Sang
79ea73b05a mmc: sdhci-pci: remove outdated declaration
The function was removed half a year ago, so this declaration can go,
too.

Fixes: 51ced59cc0 ("mmc: sdhci-pci: Use ACPI DSM to get driver strength for some Intel devices")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-10-30 11:40:08 +01:00
Rafael J. Wysocki
82a1faa949 Pull Request from PM/devfreq to Linux-PM
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZ8ZgsAAoJEBOhurvWBoCKjngP/i2JpmeALCNBAOFR9OS4nrWJ
 /iZcQ6N0QG2hv9rzp/JUiAfkjlZ+ZvN2J6ms4Qm4XsWl9LLe/tMwxHTu2eVwYraU
 25KD1QmFT1mCb6bkq0861Jm4ahLn7Df7R5nAUwZXE0nrE29IsTRXYmnt4kXyDUrO
 W8tfMem+RfYdGV6VMdxCiR+0o4qUuUBlieh+HSP2imP8yfT2dIi7hnJnEBSs8Eev
 5ebqJgWcPf7Wf4NQD9OX/xvA1TdX8jEce2i0c2t6N/KfmhH6aYfismgj3UcgKU4G
 KyErWg71JnIinMmr+a9ok5ic0YJ7NGdS9sxaKMKmbCzsDEjMi9zCqnjUB+fpWoDs
 jNipjU2upKelWtgrcV5PArKwMU5IjV4y7cdAyxCQ0DFAycfzr1VjVXpj854cce2F
 85H8NNHW+kkVvE+ZOCeTl2XUXzjcyvlBZHAJH5W3r8C17lF8cBraRfV1bBgV/FfL
 aRuzTWLcwELjvBj1vqVZhw21BEmPq6085EAPQ4t5bGbh7nztFl+vUeFWPCtrlPlw
 fyXNhRPPBHlrHdTckX45cV58sbsAV5v7/w6pOhgel9TWprFXHfWDYEdUGaJxDq6D
 wTjDcSSqhr8pIfQ+FpATgNk0bidD1xF0Pk6BPZdOtV9eK5C5NAQ7yE72PW2ThGYb
 wFNozatFrt+UnmXVHQx6
 =kKeC
 -----END PGP SIGNATURE-----

Merge tag 'pullreq_20171026' of https://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq into pm-devfreq

Pull devfreq changes for v4.15 from MyungJoo Ham.

* tag 'pullreq_20171026' of https://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq:
  PM / devfreq: Define the constant governor name
  PM / devfreq: Remove unneeded conditional statement
  PM / devfreq: Show the all available frequencies
  PM / devfreq: Change return type of devfreq_set_freq_table()
  PM / devfreq: Use the available min/max frequency
  Revert "PM / devfreq: Add show_one macro to delete the duplicate code"
  PM / devfreq: Set min/max_freq when adding the devfreq device
2017-10-30 11:11:21 +01:00
Tero Kristo
2a9a86d5c8 PM / QoS: Fix default runtime_pm device resume latency
The recent change to the PM QoS framework to introduce a proper
no constraint value overlooked to handle the devices which don't
implement PM QoS OPS.  Runtime PM is one of the more severely
impacted subsystems, failing every attempt to runtime suspend
a device.  This leads into some nasty second level issues like
probe failures and increased power consumption among other
things.

Fix this by adding a proper return value for devices that don't
implement PM QoS.

Fixes: 0cc2b4e5a0 (PM / QoS: Fix device resume latency PM QoS)
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-30 10:54:57 +01:00
Baoquan He
15670bfe19 x86/mm/64: Rename the register_page_bootmem_memmap() 'size' parameter to 'nr_pages'
register_page_bootmem_memmap()'s 3rd 'size' parameter is named
in a somewhat misleading fashion - rename it to 'nr_pages' which
makes the units of it much clearer.

Meanwhile rename the existing local variable 'nr_pages' to
'nr_pmd_pages', a more expressive name, to avoid conflict with
new function parameter 'nr_pages'.

(Also clean up the unnecessary parentheses in which get_order() is called.)

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Link: http://lkml.kernel.org/r/1509154238-23250-1-git-send-email-bhe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-30 10:30:23 +01:00
Ingo Molnar
e17bae3266 Linux 4.14-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZ9kEFAAoJEHm+PkMAQRiGw6wH/0j197qyGd0hkVFMJO6LAgN3
 KQWS4nZ5BkVDocwv0RVnUJTtXqU1eozFgdVEtSoaFXpzlHGuptR2Tau9efDCJ7w3
 /utZxqvhGebZd2T+j+/o/LE8BRQxhADBNJq2D/o0WNt8ecxuG0GIkhkEYt/o3z1v
 /sxlwVwzXB7Dc/h1WcgGJG7cS6L9KzzAzGAS/iNvdFrPOygHBv8c0MxVZIiBIeeK
 1nZdyvbyM8uenSyG+prGt9ENrqXZxxfwUxIchi2V7A9m1WmD5zijNkf1JCWji/O+
 UsA1auxna7MwoxjxqZuGm4MlKOwZ+8xutk4JGgc+aP/ulndJbJYu+4op/3vaFBM=
 =Mhx+
 -----END PGP SIGNATURE-----

Merge tag 'v4.14-rc7' into x86/mm, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-30 10:30:09 +01:00
Linus Walleij
f926dfc112 gpio: Make it possible for consumers to enforce open drain
Some busses, like I2C, strictly need to have the line handled
as open drain, i.e. not actively driven high. For this reason
the i2c-gpio.c bit-banged I2C driver is reimplementing open
drain handling outside of gpiolib.

This is not very optimal. Instead make it possible for a
consumer to explcitly express that the line must be handled
as open drain instead of allowing local hacks papering over
this issue.

The descriptor tables, whether DT, ACPI or board files, should
of course have flagged these lines as open drain. E.g.:
enum gpio_lookup_flags GPIO_OPEN_DRAIN for a board file, or
gpios = <&foo 42 GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN>; in a
device tree using <dt-bindings/gpio/gpio.h>

But more often than not, these descriptors are wrong. So
we need to make it possible for consumers to enforce this
open drain behaviour.

We now have two new enumerated GPIO descriptor config flags:
GPIOD_OUT_LOW_OPEN_DRAIN and GPIOD_OUT_HIGH_OPEN_DRAIN
that will set up the lined enforced as open drain as output
low or high, using open drain (if the driver supports it)
or using open drain emulation (setting the line as input
to drive it high) from the gpiolib core.

Cc: linux-gpio@vger.kernel.org
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-10-30 08:42:31 +01:00
Linus Walleij
b2e6355559 i2c: gpio: Convert to use descriptors
This converts the GPIO-based I2C-driver to using GPIO
descriptors instead of the old global numberspace-based
GPIO interface. We:

- Convert the driver to unconditionally grab two GPIOs
  from the device by index 0 (SDA) and 1 (SCL) which
  will work fine with device tree and descriptor tables.
  The existing device trees will continue to work just
  like before, but without any roundtrip through the
  global numberspace.

- Brutally convert all boardfiles still passing global
  GPIOs by registering descriptor tables associated with
  the devices instead so this driver does not need to keep
  supporting passing any GPIO numbers as platform data.

There is no stepwise approach as elegant as this, I
strongly prefer this big hammer over any antsteps for this
conversion. This way the old GPIO numbers go away and
NEVER COME BACK.

Special conversion for the different boards utilizing
I2C-GPIO:

- EP93xx (arch/arm/mach-ep93xx): pretty straight forward as
  all boards were using the same two GPIO lines, just define
  these two in a lookup table for "i2c-gpio" and register
  these along with the device. None of them define any
  other platform data so just pass NULL as platform data.
  This platform selects GPIOLIB so all should be smooth.
  The pins appear on a gpiochip for bank "G" as pins 1 (SDA)
  and 0 (SCL).

- IXP4 (arch/arm/mach-ixp4): descriptor tables have to
  be registered for each board separately. They all use
  "IXP4XX_GPIO_CHIP" so it is pretty straight forward.
  Most board define no other platform data than SCL/SDA
  so they can drop the #include of <linux/i2c-gpio.h> and
  assign NULL to platform data.

  The "goramo_mlr" (Goramo Multilink Router) board is a bit
  worrisome: it implements its own I2C bit-banging in the
  board file, and optionally registers an I2C serial port,
  but claims the same GPIO lines for itself in the board file.
  This is not going to work: there will be competition for the
  GPIO lines, so delete the optional extra I2C bus instead, no
  I2C devices are registered on it anyway, there are just hints
  that it may contain an EEPROM that may be accessed from
  userspace. This needs to be fixed up properly by the serial
  clock using I2C emulation so drop a note in the code.

- KS8695 board acs5k (arch/arm/mach-ks8695/board-acs5.c)
  has some platform data in addition to the pins so it needs to
  be kept around sans GPIO lines. Its GPIO chip is named
  "KS8695" and the arch selects GPIOLIB.

- PXA boards (arch/arm/mach-pxa/*) use some of the platform
  data so it needs to be preserved here. The viper board even
  registers two GPIO I2Cs. The gpiochip is named "gpio-pxa" and
  the arch selects GPIOLIB.

- SA1100 Simpad (arch/arm/mach-sa1100/simpad.c) defines a GPIO
  I2C bus, and the arch selects GPIOLIB.

- Blackfin boards (arch/blackfin/bf533 etc) for these I assume
  their I2C GPIOs refer to the local gpiochip defined in
  arch/blackfin/kernel/bfin_gpio.c names "BFIN-GPIO".
  The arch selects GPIOLIB. The boards get spiked with
  IF_ENABLED(I2C_GPIO) but that is a side effect of it
  being like that already (I would just have Kconfig select
  I2C_GPIO and get rid of them all.) I also delete any
  platform data set to 0 as it will get that value anyway
  from static declartions of platform data.

- The MIPS selects GPIOLIB and the Alchemy machine is using
  two local GPIO chips, one of them has a GPIO I2C. We need
  to adjust the local offset from the global number space here.
  The ATH79 has a proper GPIO driver in drivers/gpio/gpio-ath79.c
  and AFAICT the chip is named "ath79-gpio" and the PB44
  PCF857x expander spawns from this on GPIO 1 and 0. The latter
  board only use the platform data to specify pins so it can be
  cut altogether after this.

- The MFD Silicon Motion SM501 is a special case. It dynamically
  spawns an I2C bus off the MFD using sm501_create_subdev().
  We use an approach to dynamically create a machine descriptor
  table and attach this to the "SM501-LOW" or "SM501-HIGH"
  gpiochip. We use chip-local offsets to grab the right lines.
  We can get rid of two local static inline helpers as part
  of this refactoring.

Cc: Steven Miao <realmz6@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: Ben Dooks <ben.dooks@codethink.co.uk>
Cc: Heiko Schocher <hs@denx.de>
Acked-by: Wu, Aaron <Aaron.Wu@analog.com>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-10-30 08:42:21 +01:00
Linus Walleij
ef7a612415 hwmon: (gpio-fan) Localize platform data
There is not a single user of the platform data header in
<linux/gpio-fan.h>. We can conclude that all current users are
probing from the device tree, so start simplifying the code by
pulling the header into the driver.

Convert "unsigned" to "unsigned int" in the process to make
checkpatch happy.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-10-29 18:36:03 -07:00
Linus Walleij
186731145f hwmon: (sht15) Root out platform data
After finding out there are active users of this sensor I noticed:

- It has a single PXA27x board file using the platform data
- The platform data is only used to carry two GPIO pins, all other
  fields are unused
- The driver does not use GPIO descriptors but the legacy GPIO
  API

I saw we can swiftly fix this by:

- Killing off the platform data entirely
- Define a GPIO descriptor lookup table in the board file
- Use the standard devm_gpiod_get() to grab the GPIO descriptors
  from either the device tree or the board file table.

This compiles, but needs testing.

Cc: arm@kernel.org
Cc: Marco Franchi <marco.franchi@nxp.com>
Cc: Davide Hug <d@videhug.ch>
Cc: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Marco Franchi <marco.franchi@nxp.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-10-29 18:36:03 -07:00
Linus Torvalds
19e12196da Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix route leak in xfrm_bundle_create().

 2) In mac80211, validate user rate mask before configuring it. From
    Johannes Berg.

 3) Properly enforce memory limits in fair queueing code, from Toke
    Hoiland-Jorgensen.

 4) Fix lockdep splat in inet_csk_route_req(), from Eric Dumazet.

 5) Fix TSO header allocation and management in mvpp2 driver, from Yan
    Markman.

 6) Don't take socket lock in BH handler in strparser code, from Tom
    Herbert.

 7) Don't show sockets from other namespaces in AF_UNIX code, from
    Andrei Vagin.

 8) Fix double free in error path of tap_open(), from Girish Moodalbail.

 9) Fix TX map failure path in igb and ixgbe, from Jean-Philippe Brucker
    and Alexander Duyck.

10) Fix DCB mode programming in stmmac driver, from Jose Abreu.

11) Fix err_count handling in various tunnels (ipip, ip6_gre). From Xin
    Long.

12) Properly align SKB head before building SKB in tuntap, from Jason
    Wang.

13) Avoid matching qdiscs with a zero handle during lookups, from Cong
    Wang.

14) Fix various endianness bugs in sctp, from Xin Long.

15) Fix tc filter callback races and add selftests which trigger the
    problem, from Cong Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
  selftests: Introduce a new test case to tc testsuite
  selftests: Introduce a new script to generate tc batch file
  net_sched: fix call_rcu() race on act_sample module removal
  net_sched: add rtnl assertion to tcf_exts_destroy()
  net_sched: use tcf_queue_work() in tcindex filter
  net_sched: use tcf_queue_work() in rsvp filter
  net_sched: use tcf_queue_work() in route filter
  net_sched: use tcf_queue_work() in u32 filter
  net_sched: use tcf_queue_work() in matchall filter
  net_sched: use tcf_queue_work() in fw filter
  net_sched: use tcf_queue_work() in flower filter
  net_sched: use tcf_queue_work() in flow filter
  net_sched: use tcf_queue_work() in cgroup filter
  net_sched: use tcf_queue_work() in bpf filter
  net_sched: use tcf_queue_work() in basic filter
  net_sched: introduce a workqueue for RCU callbacks of tc filter
  sctp: fix some type cast warnings introduced since very beginning
  sctp: fix a type cast warnings that causes a_rwnd gets the wrong value
  sctp: fix some type cast warnings introduced by transport rhashtable
  sctp: fix some type cast warnings introduced by stream reconf
  ...
2017-10-29 08:11:49 -07:00
Xin Long
978aa04741 sctp: fix some type cast warnings introduced since very beginning
These warnings were found by running 'make C=2 M=net/sctp/'.
They are there since very beginning.

Note after this patch, there still one warning left in
sctp_outq_flush():
  sctp_chunk_fail(chunk, SCTP_ERROR_INV_STRM)

Since it has been moved to sctp_stream_outq_migrate on net-next,
to avoid the extra job when merging net-next to net, I will post
the fix for it after the merging is done.

Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 18:03:24 +09:00
Xin Long
1da4fc97cb sctp: fix some type cast warnings introduced by stream reconf
These warnings were found by running 'make C=2 M=net/sctp/'.

They are introduced by not aware of Endian when coding stream
reconf patches.

Since commit c0d8bab6ae ("sctp: add get and set sockopt for
reconf_enable") enabled stream reconf feature for users, the
Fixes tag below would use it.

Fixes: c0d8bab6ae ("sctp: add get and set sockopt for reconf_enable")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 18:03:24 +09:00
Phil Reid
69d17246ab i2c: i2c-smbus: add of_i2c_setup_smbus_alert
This commit adds of_i2c_setup_smbus_alert which allows the smbalert
driver to be attached to an i2c adapter via the device tree.

Signed-off-by: Phil Reid <preid@electromag.com.au>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-10-28 23:42:47 +02:00
Phil Reid
9b9f2b8bc2 i2c: i2c-smbus: Use threaded irq for smbalert
Prior to this commit the smbalert_irq was handling in the hard irq
context. This change switch to using a thread irq which avoids the need
for the work thread. Using threaded irq also removes the need for the
edge_triggered flag as the enabling / disabling of the hard irq for level
triggered interrupts will be handled by the irq core.

Without this change have an irq connected to something like an i2c gpio
resulted in a null ptr deferences. Specifically handle_nested_irq calls
the threaded irq handler.

There are currently 3 in tree drivers affected by this change.

i2c-parport driver calls i2c_handle_smbus_alert in a hard irq context.
This driver use edge trigger interrupts which skip the enable / disable
calls. But it still need to handle the smbus transaction on a thread. So
the work thread is kept for this driver.

i2c-parport-light & i2c-thunderx-pcidrv provide the irq number in the
setup which will result in the thread irq being used.

i2c-parport-light is edge trigger so the enable / disable call was
skipped as well.

i2c-thunderx-pcidrv is getting the edge / level trigger setting from of
data and was setting the flag as required. However the irq core should
handle this automatically.

Signed-off-by: Phil Reid <preid@electromag.com.au>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-10-28 23:42:26 +02:00
Girish Moodalbail
dea6e19f4e tap: reference to KVA of an unloaded module causes kernel panic
The commit 9a393b5d59 ("tap: tap as an independent module") created a
separate tap module that implements tap functionality and exports
interfaces that will be used by macvtap and ipvtap modules to create
create respective tap devices.

However, that patch introduced a regression wherein the modules macvtap
and ipvtap can be removed (through modprobe -r) while there are
applications using the respective /dev/tapX devices. These applications
cause kernel to hold reference to /dev/tapX through 'struct cdev
macvtap_cdev' and 'struct cdev ipvtap_dev' defined in macvtap and ipvtap
modules respectively. So,  when the application is later closed the
kernel panics because we are referencing KVA that is present in the
unloaded modules.

----------8<------- Example ----------8<----------
$ sudo ip li add name mv0 link enp7s0 type macvtap
$ sudo ip li show mv0 |grep mv0| awk -e '{print $1 $2}'
  14:mv0@enp7s0:
$ cat /dev/tap14 &
$ lsmod |egrep -i 'tap|vlan'
macvtap                16384  0
macvlan                24576  1 macvtap
tap                    24576  3 macvtap
$ sudo modprobe -r macvtap
$ fg
cat /dev/tap14
^C

<...system panics...>
BUG: unable to handle kernel paging request at ffffffffa038c500
IP: cdev_put+0xf/0x30
----------8<-----------------8<----------

The fix is to set cdev.owner to the module that creates the tap device
(either macvtap or ipvtap). With this set, the operations (in
fs/char_dev.c) on char device holds and releases the module through
cdev_get() and cdev_put() and will not allow the module to unload
prematurely.

Fixes: 9a393b5d59 (tap: tap as an independent module)
Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 19:17:21 +09:00